Phylogenomics is the intersection of the fields of evolution and genomics. The term has been used in several ways to refer to analyses that combine genome data with evolutionary reconstruction. It is a group of techniques within the larger fields of phylogenetics and genomics. Phylogenomics draws its information from comparisons of entire genomes, or at least large portions of genomes, whereas traditional phylogenetics compares and analyzes the sequences of single genes or a small number of genes, along with many other types of data. Three major areas fall under phylogenomics:
- Prediction of gene function
- Establishment and clarification of evolutionary relationships
- Prediction and retracing of lateral gene transfer
Prediction of Gene Function
When Jonathan Eisen originally coined the term phylogenomics, it applied to the prediction of gene function. Before phylogenomic techniques were used, gene function was predicted primarily by comparing a gene's sequence with the sequences of genes of known function. When several genes with similar sequences but differing functions are involved, this method alone cannot reliably determine function. A specific example is presented in the paper "Gastrogenomic delights: a movable feast". Sequence similarity alone had been used to predict that Helicobacter pylori can repair mismatched DNA. This prediction rested on the fact that the organism has a gene whose sequence is highly similar to genes from other species in the "MutS" gene family, many of which are known to be involved in mismatch repair. However, Eisen noted that H. pylori lacks other genes thought to be essential for this function (specifically, members of the MutL family). Eisen suggested a resolution to this apparent discrepancy: phylogenetic trees of genes in the MutS family revealed that the gene found in H. pylori was not in the same subfamily as those known to be involved in mismatch repair. Furthermore, he suggested that this "phylogenomic" approach could be used as a general method for predicting the functions of genes. This approach was formally described in 1998. For reviews of this aspect of phylogenomics, see Brown D, Sjölander K. Functional classification using phylogenomic inference.
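The contrast between similarity-based and tree-based prediction can be illustrated with a minimal Python sketch. It is not Eisen's published procedure: the gene names, similarity scores, tree shape, and the function labels are all hypothetical toy values, and the rule used here (take the function shared by annotated genes in the smallest clade that contains the query) is a simplifying assumption.

```python
def leaves(tree):
    """Return the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def clades_containing(tree, query):
    """Return the clades that contain `query`, ordered from the root down to the leaf."""
    if isinstance(tree, str):
        return [tree] if tree == query else []
    for child in tree:
        path = clades_containing(child, query)
        if path:
            return [tree] + path
    return []


def predict_function(tree, query, known):
    """Predict a function from the smallest clade holding the query and annotated genes.

    Returns None if the query is absent, no relative is annotated,
    or the annotated relatives in that clade disagree.
    """
    for clade in reversed(clades_containing(tree, query)):
        functions = {known[g] for g in leaves(clade) if g in known and g != query}
        if len(functions) == 1:
            return functions.pop()
        if len(functions) > 1:
            return None
    return None


# Hypothetical gene family with two subfamilies (toy data, not real MutS genes).
tree = ((("repair_1", "repair_2"), "repair_3"), (("other_1", "query"), "other_2"))
known = {
    "repair_1": "mismatch repair",
    "repair_2": "mismatch repair",
    "repair_3": "mismatch repair",
    "other_1": "other role",
    "other_2": "other role",
}
# Hypothetical similarity scores: the single best hit is a repair gene.
similarity = {"repair_1": 0.62, "other_1": 0.58, "repair_2": 0.55}

best_hit = max(similarity, key=similarity.get)
similarity_prediction = known[best_hit]            # follows the top hit
tree_prediction = predict_function(tree, "query", known)  # follows the subfamily
```

In this toy case the top similarity hit suggests mismatch repair, while the position of the query in the tree places it in the other subfamily, which mirrors the kind of discrepancy described above.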
Prediction and Retracing Lateral Gene Transfer
Traditional phylogenetic techniques have difficulty distinguishing genes that are similar because of lateral gene transfer from genes that are similar because the organisms share an ancestor. When large numbers of genes or entire genomes are compared among many species, the genes acquired through lateral gene transfer become more evident. Using these methods, researchers were able to identify over 2,000 metabolic enzymes that various eukaryotic parasites obtained through lateral gene transfer.
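One simple way to picture this screening is to look, gene family by gene family, at which organisms a focal species groups with. The Python sketch below is only an illustration under strong assumptions: the species names, group labels, and gene trees are hypothetical, each tree has one gene per species, and a family is flagged whenever the focal species' sister group lies entirely outside its own taxonomic group. Real analyses rely on many more taxa and on statistical support for the branches.

```python
def leaves(tree):
    """Return the leaf names of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return [tree]
    names = []
    for child in tree:
        names.extend(leaves(child))
    return names


def sister_leaves(tree, target):
    """Return the leaves of the sister group(s) of `target`, or None if it is absent."""
    if isinstance(tree, str):
        return None
    for i, child in enumerate(tree):
        if child == target:
            siblings = []
            for j, other in enumerate(tree):
                if j != i:
                    siblings.extend(leaves(other))
            return siblings
        found = sister_leaves(child, target)
        if found is not None:
            return found
    return None


def candidate_transfers(gene_trees, focal, group):
    """Name the gene families whose tree places `focal` next to a foreign group only."""
    flagged = []
    for family, tree in gene_trees.items():
        sisters = sister_leaves(tree, focal)
        if sisters and all(group[s] != group[focal] for s in sisters):
            flagged.append(family)
    return flagged


# Hypothetical taxonomy and gene trees (toy data).
group = {
    "parasite": "eukaryote",
    "yeast": "eukaryote",
    "plant": "eukaryote",
    "bacterium_1": "bacteria",
    "bacterium_2": "bacteria",
}
gene_trees = {
    # The parasite gene groups with another eukaryote, as vertical descent predicts.
    "enzyme_A": (("parasite", "yeast"), ("plant", ("bacterium_1", "bacterium_2"))),
    # The parasite gene nests among bacterial genes: a candidate transfer.
    "enzyme_B": (("yeast", "plant"), (("parasite", "bacterium_1"), "bacterium_2")),
}

flagged = candidate_transfers(gene_trees, "parasite", group)
```

A single odd tree proves little on its own; the point of the genome-scale comparison is that such families stand out against the many families that follow the expected pattern.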
Establishment of Evolutionary Relationships
Traditional single-gene studies are effective in establishing phylogenetic trees among closely related organisms, but have drawbacks when comparing more distantly related organisms or microorganisms. This is because of lateral gene transfer, convergence, and varying rates of evolution for different genes. When entire genomes are used in these comparisons, the anomalies created by these factors are overwhelmed by the pattern of evolution indicated by the majority of the data. Through phylogenomics, it has been found that most photosynthetic eukaryotes are linked and possibly share a single ancestor. Researchers compared 135 genes from 65 different species of photosynthetic organisms, including plants, chromalveolates, rhizarians, haptophytes and cryptomonads. This grouping has been referred to as the Plants+HC+SAR megagroup. Using this method, it is theoretically possible to create fully resolved phylogenetic trees, and divergence times can be estimated more accurately. In practice, however, this is not always the case: when data are insufficient, the same dataset can support different trees depending on the method used to analyze it.
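The idea that the majority signal outweighs a few anomalous genes can be sketched with a majority-rule count over gene trees. This is an illustrative substitute, not the method of the studies cited here, which typically analyze large concatenated alignments; the taxa A to D and the five toy trees are hypothetical.

```python
from collections import Counter


def collect_clades(tree, found):
    """Add the leaf set of every internal node to `found`; return the tree's leaf set."""
    if isinstance(tree, str):
        return frozenset([tree])
    members = frozenset().union(*(collect_clades(child, found) for child in tree))
    found.add(members)
    return members


def majority_clades(gene_trees):
    """Return the clades present in more than half of the gene trees."""
    counts = Counter()
    for tree in gene_trees:
        found = set()
        all_taxa = collect_clades(tree, found)
        found.discard(all_taxa)  # the clade of all taxa is trivially present
        counts.update(found)
    return {clade for clade, n in counts.items() if n > len(gene_trees) / 2}


# Hypothetical gene trees over four taxa (toy data).
gene_trees = [
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),   # conflicting gene, e.g. affected by lateral transfer
    ((("A", "B"), "C"), "D"),   # partly conflicting gene, e.g. unequal rates
]

supported = majority_clades(gene_trees)
```

Here the groupings A with B and C with D survive because most genes agree on them, while the groupings proposed by the two discordant genes fall below the majority threshold. With fewer genes, or with conflicts that are systematic rather than scattered, no grouping may reach a majority, which is one way to see why insufficient data can leave a tree unresolved.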
References
- Overview of the First Phylogenomics Conference. BioMed Central.
- Pennisi, Elizabeth (27 June 2008). "Building the Tree of Life, Genome by Genome". Science 320 (5884): 1716–1717. doi:10.1126/science.320.5884.1716. PMID 18583591.
- Eisen JA, Kaiser D, Myers RL. 1997. Gastrogenomic delights: a movable feast. Nat Med. 3(10):1076-8. PMID 9334711
- Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Hickey EK, Berg DE, Gocayne JD, Utterback TR, Peterson JD, Kelley JM, Cotton MD, Weidman JM, Fujii C, Bowman C, Watthey L, Wallin E, Hayes WS, Borodovsky M, Karp PD, Smith HO, Fraser CM, Venter JC. 1997. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388(6642):539-47.
- Eisen JA. 1998. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Res. 8(3):163-7. PMID 9521918
- Brown D, Sjölander K. Functional classification using phylogenomic inference. PLoS Comput Biol. 2006 Jun 30;2(6):e77. PMID 16846248
- Sjölander K. Phylogenomic inference of protein molecular function: advances and challenges. Bioinformatics. 2004 Jan 22;20(2):170-9. doi:10.1093/bioinformatics/bth021 PMID 14734307
- Whitaker JW, McConkey GA, Westhead DR. The transferome of metabolic genes explored: analysis of the horizontal transfer of enzyme encoding genes in unicellular eukaryotes. Genome Biology. 2009. 10. R36 PMID 19368726
- Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet. 2005 6(5):361-75.
- Philippe H, Snell EA, Bapteste E, Lopez P, Holland PW, Casane D. Phylogenomics of eukaryotes: impact of missing data on large alignments. Mol Biol Evol. 2004 Sep;21(9):1740-52.
- Jeffroy O, Brinkmann H, Delsuc F, Philippe H. Phylogenomics: the beginning of incongruence? Trends Genet. 2006 Apr;22(4):225-31. PMID 16490279
- Burki, Fabien; Shalchian-Tabrizi, Kamran; Pawlowski, Jan (23 August 2008). "Phylogenomics reveals a new 'megagroup' including most photosynthetic eukaryotes". Biology Letters 4 (4): 366–369. doi:10.1098/rsbl.2008.0224. PMC 2610160. PMID 18522922.
- Dos Reis, M.; Inoue, J.; Hasegawa, M.; Asher, R. J.; Donoghue, P. C. J.; Yang, Z. (2012). "Phylogenomic datasets provide both precision and accuracy in estimating the timescale of placental mammal phylogeny". Proceedings of the Royal Society B: Biological Sciences 279 (1742): 3491. doi:10.1098/rspb.2012.0683.
- Philippe, Hervé; Delsuc, Frédéric; Brinkmann, Henner; Lartillot, Nicolas (2005). "Phylogenomics". Annual Review of Ecology, Evolution, and Systematics 36: 541–562. doi:10.1146/annurev.ecolsys.35.112202.130205.