Comparative genomics is the study of how genome structure and function relate across different biological species or strains. It uses the signatures left by selection to understand the function of genomic elements and the evolutionary processes that act on genomes. Although the field is still young, it promises insights into many aspects of the evolution of modern species. The sheer amount of information in modern genomes (3.2 gigabases in the case of humans) means that the methods of comparative genomics must be automated. Gene finding is an important application of comparative genomics, as is the discovery of new non-coding functional elements of the genome.
Comparative genomics exploits both similarities and differences in the proteins, RNA, and regulatory regions of different organisms to infer how selection has acted upon these elements. Elements responsible for similarities between species should be conserved through time (stabilizing, or purifying, selection), while elements responsible for differences among species should be divergent (positive selection). Finally, elements that are unimportant to the evolutionary success of the organism will be unconserved (they evolve neutrally).
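The reasoning above can be illustrated with a minimal Python sketch that scores each column of a multiple sequence alignment by how uniform it is across species. Everything here is an assumption made for illustration: the function names `column_conservation` and `windowed_mean`, the toy alignment, and the use of simple identity as the score. Real analyses generally rely on phylogeny-aware substitution models rather than raw identity.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, for each alignment column, the fraction of sequences
    that carry the most common non-gap character in that column."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        counts = Counter(base for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(column))
    return scores


def windowed_mean(scores, window):
    """Average the per-column scores over a sliding window."""
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy alignment of one region from four species.
toy_alignment = [
    "ATGCCGTAAC",
    "ATGCCGTTAC",
    "ATGCCGAGAC",
    "ATGCCGCCAC",
]
scores = column_conservation(toy_alignment)
smoothed = windowed_mean(scores, 2)
```

High scores mark candidate conserved elements. A low score on its own cannot separate positive selection from neutral divergence; telling those apart needs additional evidence, such as a comparison of substitution rates against a neutral expectation.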
One important goal of the field is to identify the mechanisms of eukaryotic genome evolution. This is often complicated, however, by the multiplicity of events that have taken place throughout the history of individual lineages, leaving only distorted and superimposed traces in the genome of each living organism. For this reason, comparative genomic studies of small model organisms (for example, Caenorhabditis elegans and the closely related Caenorhabditis briggsae) are of great importance in advancing our understanding of general mechanisms of evolution.
Having come a long way from its initial use in finding functional proteins, comparative genomics now concentrates on finding regulatory regions and siRNA molecules. It has been discovered that distantly related species often share long conserved stretches of DNA that do not appear to code for any protein (see conserved non-coding sequence). One such ultra-conserved region, which was stable from chicken to chimpanzee, has undergone a sudden burst of change in the human lineage and is active in the developing brain of the human embryo.
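As a rough illustration of how long conserved stretches might be located, the sketch below scans a gapped multiple alignment for runs of columns in which every species carries the same base. The function name `conserved_runs`, the toy sequences, and the length threshold are hypothetical and chosen only for this example. It assumes the sequences are already aligned to equal length, and it is not the method used in the studies cited here.

```python
def conserved_runs(alignment, min_length):
    """Return (start, end) half-open column ranges in which every sequence
    carries the same non-gap character, for runs of at least min_length."""
    runs = []
    start = None
    columns = list(zip(*alignment))
    for i, column in enumerate(columns):
        identical = column[0] != "-" and all(b == column[0] for b in column)
        if identical and start is None:
            start = i  # a new perfectly conserved run begins here
        elif not identical and start is not None:
            if i - start >= min_length:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(columns) - start >= min_length:
        runs.append((start, len(columns)))
    return runs


# Hypothetical toy alignment of three species.
toy_alignment = [
    "ACGTACGTTTGACCA",
    "ACGTACGATTGACCA",
    "ACGTACGCTTGAC-A",
]
runs = conserved_runs(toy_alignment, 5)
```

With real genomes the threshold would be far larger than in this toy case, and the choice of species determines how surprising a perfectly conserved run is.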
Computational approaches to genome comparison have become a common research topic in computer science. A public collection of case studies and demonstrations is growing, ranging from whole-genome comparisons to gene expression analysis. This has brought in ideas from other areas, including systems and control, information theory, string analysis, and data mining. It is anticipated that computational approaches will become and remain a standard topic for research and teaching, and that courses will begin training students to be fluent in both computer science and genomics.
References 
- Stein LD, et al. (2003). "The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics". PLoS Biology 1 (2): e45. doi:10.1371/journal.pbio.0000045. PMC 261899. PMID 14624247.
- "Newly Sequenced Worm a Boon for Worm Biologists". PLoS Biology 1 (2): e44. 2003. doi:10.1371/journal.pbio.0000044.
- Bejerano, Gill; Michael Pheasant, Igor Makunin, Stuart Stephen, W James Kent, John S Mattick, David Haussler (2004-05-28). "Ultraconserved elements in the human genome". Science 304 (5675): 1321–1325. doi:10.1126/science.1098119. ISSN 1095-9203. PMID 15131266.
- Pollard, Katherine S.; Sofie R. Salama, Nelle Lambert, Marie-Alexandra Lambot, Sandra Coppens, Jakob S. Pedersen, Sol Katzman, Bryan King, Courtney Onodera, Adam Siepel, Andrew D. Kern, Colette Dehay, Haller Igel, Manuel Ares, Pierre Vanderhaeghen, David Haussler (2006). "An RNA gene expressed during cortical development evolved rapidly in humans". Nature 443 (7108): 167–172. doi:10.1038/nature05113. ISSN 0028-0836. PMID 16915236. Retrieved 2012-01-13.
- Cristianini N and Hahn M (2006). Introduction to Computational Genomics. Cambridge University Press. ISBN 0-521-67191-4.
- Via, Allegra; Javier De Las Rivas, Teresa K. Attwood, David Landsman, Michelle D. Brazas, Jack A. M. Leunissen, Anna Tramontano, Maria Victoria Schneider (2011-10-27). "Ten Simple Rules for Developing a Short Bioinformatics Training Course". PLoS Comput Biol 7 (10): e1002245. doi:10.1371/journal.pcbi.1002245. Retrieved 2011-12-03.
Further reading 
- Bergman NH, ed. (2007). Comparative Genomics: Volumes 1 and 2. Totowa (NJ): Humana Press. ISBN 978-193411-537-4. PMID 21250292.
- Kellis M, Patterson N, Endrizzi M, Birren B, Lander E (2003-05-15). "Sequencing and comparison of yeast species to identify genes and regulatory elements". Nature 423 (6937): 241–254. doi:10.1038/nature01644. PMID 12748633.
- Cliften P, Sudarsanam P, Desikan A (2003-07-04). "Finding functional features in Saccharomyces genomes by phylogenetic footprinting". Science 301 (5629): 71–76. doi:10.1126/science.1084337. PMID 12775844.
- Hardison RC (2003). "Comparative genomics". PLoS Biology 1 (2): e58. doi:10.1371/journal.pbio.0000058. PMC 261895. PMID 14624258.
- Boffelli D, McAuliffe J, Ovcharenko D, Lewis KD, Ovcharenko I, Pachter L, Rubin EM (2003). "Phylogenetic shadowing of primate sequences to find functional regions of the human genome". Science 299 (5611): 1391–1394. doi:10.1126/science.1081331. PMID 12610304.
- Dujon B, et al. (2004-07-01). "Genome evolution in yeasts". Nature 430 (6995): 35–44. doi:10.1038/nature02579. PMID 15229592.
- Filipski A, Kumar S (2005). "Comparative genomics in eukaryotes". In T.R. Gregory. The Evolution of the Genome. San Diego: Elsevier. pp. 521–583.
- Gregory TR, DeSalle R (2005). "Comparative genomics in prokaryotes". In T.R. Gregory. The Evolution of the Genome. San Diego: Elsevier. pp. 585–675.
- Xie X, Lu J, Kulbokas EJ, Golub T, Mootha V, Lindblad-Toh K, Lander E, Kellis M (2005). "Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals". Nature 434 (7031). doi:10.1038/nature03441. PMC 2923337. PMID 15735639.
- Champ PC, Binnewies TT, Nielsen N, Zinman G, Kiil K, Wu H, Bohlin J, Ussery DW (2006). "Genome update: purine strand bias in 280 bacterial chromosomes". Microbiology 152 (3): 579–583. doi:10.1099/mic.0.28637-0.
- Kumar L, Breakspear A, Kistler A, Ma L-J, Xie X (2010). "Systematic discovery of regulatory motifs in Fusarium graminearum by comparing four Fusarium genomes". BMC Genomics 11: 208. doi:10.1186/1471-2164-11-208. PMC 2853525. PMID 20346147.
External links 
- Genomes OnLine Database (GOLD)
- Genome News Network
- JCVI Comprehensive Microbial Resource
- Pathema: A Clade Specific Bioinformatics Resource Center
- CBS Genome Atlas Database
- The UCSC Genome Browser
- The U.S. National Human Genome Research Institute
- Ensembl The Ensembl Genome Browser
- Genolevures, comparative genomics of the Hemiascomycetous yeasts
- Phylogenetically Inferred Groups (PhIGs), a method that incorporates phylogenetic signals when building gene clusters for use in comparative genomics.
- Metazome, a resource for the phylogenomic exploration and analysis of Metazoan gene families.
- IMG The Integrated Microbial Genomes system, for comparative genome analysis by the DOE-JGI.
- Dcode.org Dcode.org Comparative Genomics Center.
- SUPERFAMILY Protein annotations for all completely sequenced organisms
- Comparative Genomics
- Blastology and Open Source: Needs and Deeds