Global metrics for comparing prokaryotic genomes
Density of RNA genes.
global: total bases in RNA genes/genome size.
local: compare regions of high RNA gene density between multiple genomes.
Density of sRNA.
global: total bases in sRNA/genome size.
local: compare regions of high sRNA density between multiple genomes.
Density of repeat regions.
global: total bases in repeats/genome size.
local: compare regions of high repeat density between multiple genomes.
Density of coding regions (genes).
global: total bases in genes/genome size.
local: compare regions of high gene density between multiple genomes.
Density of promoters.
global: total bases in promoters/genome size.
local: compare regions of high promoter density between multiple genomes.
Density of pseudogenes.
global: total bases in pseudogenes/genome size.
local: compare regions of high pseudogene density between multiple genomes.
Density of transposons.
global: total bases in transposons/genome size.
local: compare regions of high transposon density between multiple genomes.
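The density metrics above all share one form, so a single sketch covers them. The example below assumes features are given as half-open, 0-based (start, end) intervals on a linear replicon; the function names, the window and step parameters, and the choice to merge overlapping features are illustrative assumptions, not part of the original notes.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), 0-based (assumed convention)


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def global_density(intervals: List[Interval], genome_size: int) -> float:
    """Global metric: total bases covered by the feature / genome size."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_size


def local_density(
    intervals: List[Interval], genome_size: int, window: int, step: int
) -> List[Tuple[int, float]]:
    """Local metric: feature density in sliding windows along the genome.

    Returns (window_start, density) pairs. The genome is treated as linear;
    a circular replicon would need wrap-around windows (not handled here).
    """
    merged = merge_intervals(intervals)
    profile = []
    for w_start in range(0, genome_size, step):
        w_end = min(w_start + window, genome_size)
        covered = sum(
            max(0, min(end, w_end) - max(start, w_start)) for start, end in merged
        )
        profile.append((w_start, covered / (w_end - w_start)))
    return profile
```

Windows with high values in the local profile are candidate high-density regions; comparing them between genomes would still require some mapping between genome coordinates, which this sketch does not attempt.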
GC content.
Genome size.
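GC content and genome size can both be read directly from the sequence. A minimal sketch, assuming the genome is available as a plain string and that ambiguous bases such as N should be left out of the GC denominator (that exclusion is an assumption, not something the notes specify):

```python
def genome_size(sequence: str) -> int:
    """Genome size in bases, counting every character in the sequence."""
    return len(sequence)


def gc_content(sequence: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / acgt
```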
Functional enrichment correlation score.
Gene Ontology (GO) calculation of statistically enriched functions.
1. build a profile of enriched functions for each genome and compare it to all other genomes.
2. choose one or a few functions and calculate enrichment for all genomes.
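One common way to score enrichment is a one-sided hypergeometric test, sketched below with the standard library only. The notes do not say which test or background is intended, so this is an assumption: the background is taken to be the pooled genes of all compared genomes, and the function and parameter names are hypothetical. No multiple-testing correction is applied.

```python
from math import comb
from typing import Dict


def hypergeom_enrichment_p(
    pop_size: int, pop_hits: int, sample_size: int, sample_hits: int
) -> float:
    """P(X >= sample_hits) when drawing sample_size genes without replacement
    from pop_size genes, of which pop_hits carry the GO term."""
    upper = min(pop_hits, sample_size)
    total = comb(pop_size, sample_size)
    tail = sum(
        comb(pop_hits, k) * comb(pop_size - pop_hits, sample_size - k)
        for k in range(sample_hits, upper + 1)
    )
    return tail / total


def enrichment_profile(
    genome_counts: Dict[str, int],
    background_counts: Dict[str, int],
    genome_genes: int,
    background_genes: int,
) -> Dict[str, float]:
    """Option 1: p-value per GO term for one genome against the background."""
    return {
        term: hypergeom_enrichment_p(
            background_genes, background_counts[term], genome_genes, hits
        )
        for term, hits in genome_counts.items()
    }
```

For option 2, the same `hypergeom_enrichment_p` call can be repeated for one chosen term across every genome.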
Entropy density function for amino acids (AAs).
plot on parallel axes, with one axis per genome? think more about this.
Average nucleotide identity (ANI).
global: ?? this is a pairwise comparison (ANI of all conserved genes between two organisms). Need some way to convert this to a global parameter.
Average amino acid identity (AAI).
global: ?? this is a pairwise comparison (AAI of all conserved genes between two organisms). Need some way to convert this to a global parameter.
Horizontal gene transfer (HGT vine width).
global: ?? this is a pairwise comparison. Need some way to convert this to a global parameter.
Genome conservation (phylogenetic distances between genomes).
global: ?? this is a pairwise comparison. Need some way to convert this to a global parameter.
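The pairwise metrics above (ANI, AAI, HGT, phylogenetic distance) leave open how to get one value per genome. One possible reduction, shown only as an illustration and not as a chosen method, is to summarize each genome by the mean of its pairwise values against every other genome in the set. The dictionary layout and function name are assumptions, and the result depends on which genomes are included.

```python
from statistics import mean
from typing import Dict, Iterable, Tuple


def per_genome_mean(
    pairwise: Dict[Tuple[str, str], float], genomes: Iterable[str]
) -> Dict[str, float]:
    """Collapse a pairwise metric into one value per genome.

    pairwise holds one entry per unordered pair, keyed (genome_a, genome_b).
    Genomes with no pairwise values get NaN.
    """
    summary = {}
    for genome in genomes:
        values = [
            value
            for (a, b), value in pairwise.items()
            if a != b and genome in (a, b)
        ]
        summary[genome] = mean(values) if values else float("nan")
    return summary
```

Other reductions (median, minimum, or similarity to a fixed reference genome) fit the same shape by swapping out `mean`.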
Regulatory index (how regulated is a genome?).
global: number of regulator binding sites (quasi-palindromic sequences?)/genome size.
local: compare regions of high regulator binding site density between multiple genomes.
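A rough sketch of the global regulatory index follows, taking the quasi-palindrome idea literally: a site counts if it equals its own reverse complement up to a few mismatched base pairs. The site length, the mismatch threshold, and the function names are all assumptions for illustration; real regulator binding sites would need a better model.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def mismatched_pairs(site: str) -> int:
    """Number of base pairs that break reverse-complement symmetry.

    Position i is compared with the complement of position n-1-i. For odd
    lengths the middle base is ignored. Ambiguous bases count as mismatches.
    """
    n = len(site)
    return sum(COMPLEMENT.get(site[i]) != site[n - 1 - i] for i in range(n // 2))


def count_quasi_palindromes(
    sequence: str, length: int = 8, max_mismatches: int = 1
) -> int:
    """Count windows of the given length that are palindromic up to
    max_mismatches broken pairs. Overlapping windows are counted separately."""
    seq = sequence.upper()
    return sum(
        mismatched_pairs(seq[i : i + length]) <= max_mismatches
        for i in range(len(seq) - length + 1)
    )


def regulatory_index(
    sequence: str, length: int = 8, max_mismatches: int = 1
) -> float:
    """Global metric: number of candidate sites / genome size."""
    return count_quasi_palindromes(sequence, length, max_mismatches) / len(sequence)
```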
Entropy Distance Ratio.
This would have to be relative to some entropy distance profile:
1. From a target replicon
2. From a profile over all bacteria