Proteogenomics is an emerging field of biological research at the intersection of proteomics and genomics. While this intersection is large and can be defined in multiple ways, the term proteogenomics commonly refers to studies that use proteomic information, often derived from mass spectrometry, to improve gene annotations.
Proteogenomics has been applied to improve the gene annotations of various organisms. The term proteogenomics was first used in this context by a Harvard team in 2004, although related research had been accumulating over the preceding decade. Since then, the approach has been extended to many other species, including Arabidopsis thaliana, humans, multiple species of Shewanella bacteria, and chicken.
Besides improving gene annotations, proteogenomic studies can also provide valuable information about the presence of programmed frameshifts, N-terminal methionine excision, signal peptides, proteolysis and other posttranslational modifications.
The main idea behind the proteogenomic approach is to identify peptides in a biological sample using mass spectrometry by searching the six-frame translation of the genome sequence (three reading frames on each strand), rather than a database of annotated protein sequences. This enables identification of protein regions that are absent from or incorrectly represented in current gene annotations, and thus allows those annotations to be improved.
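The six-frame translation can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an unambiguous DNA sequence containing only A, C, G, and T. The function names (`six_frame_translation`, `find_peptide`) are hypothetical, and the plain substring lookup only stands in for the spectrum-to-peptide matching that a real mass spectrometry search engine performs.

```python
# Minimal sketch of a six-frame translation (standard genetic code assumed).
# All names here are illustrative, not taken from any published tool.

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code; '*' = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(genome):
    """Map (strand, offset) to the translation of that reading frame."""
    genome = genome.upper()
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[(strand, offset)] = translate(seq[offset:])
    return frames


def find_peptide(peptide, genome):
    """List the reading frames whose translation contains the peptide."""
    return [
        frame
        for frame, protein in six_frame_translation(genome).items()
        if peptide in protein
    ]


if __name__ == "__main__":
    toy_genome = "ATGGCCAAATAG"  # made-up sequence for illustration only
    for frame, protein in six_frame_translation(toy_genome).items():
        print(frame, protein)
    print(find_peptide("MAK", toy_genome))
```

A peptide that matches one of the six frames but lies outside every annotated gene, or in a different frame from the annotated one, is the kind of evidence that can prompt a revision of the annotation.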
Comparative proteogenomics is a branch of proteogenomics that compares proteomic data from multiple related species concurrently and exploits the homology between their proteins to improve annotations with higher statistical confidence.
- Gupta N., Tanner S., Jaitly N., Adkins J. N., Lipton M., Edwards R., Romine M., Osterman A., Bafna V., Smith R. D., et al. (2007) Whole proteome analysis of post-translational modifications: Applications of mass-spectrometry for proteogenomic annotation. Genome Res. 17, 1362–1377.
- Ansong C., Purvine S. O., Adkins J. N., Lipton M. S., Smith R. D. (2008) Proteogenomics: needs and roles to be filled by proteomics in genome annotation. Brief. Funct. Genomics Proteomics 7, 50–62.
- Jaffe J. D., Berg H. C., Church G. M. (2004) Proteogenomic mapping as a complementary method to perform genome annotation. Proteomics 4, 59–77.
- Shevchenko A., Jensen O. N., Podtelejnikov A. V., Sagliocco F., Wilm M., Vorm O., Mortensen P., Shevchenko A., Boucherie H., Mann M. (1996) Linking genome and proteome by mass spectrometry: large-scale identification of yeast proteins from two dimensional gels. Proc. Natl. Acad. Sci. U.S.A. 93, 14440–14445.
- Castellana N. E., Payne S. H., Shen Z., Stanke M., Bafna V., Briggs S. P. (2008) Discovery and revision of Arabidopsis genes by proteogenomics. Proc. Natl. Acad. Sci. U.S.A. 105, 21034–21038.
- Tanner S., Shen Z., Ng J., Florea L., Guigo R., Briggs S. P., Bafna V. (2007) Improving gene annotation using peptide mass spectrometry. Genome Res. 17, 231–239.
- Gupta N., Benhamida J., Bhargava V., Goodman D., Kain E., Kerman I., Nguyen N., Ollikainen N., Rodriguez J., Wang J., et al. (2008) Comparative proteogenomics: Combining mass spectrometry and comparative genomics to analyze multiple genomes. Genome Res. 18, 1133–1142.
- McCarthy F. M., Cooksey A. M., Wang N., Bridges S. M., Pharr G. T., Burgess S. C. (2006) Modeling a whole organ using proteomics: the avian bursa of Fabricius. Proteomics 6(9), 2759–2771.
- Gallien S., Perrodou E., Carapito C., Deshayes C., Reyrat J. M., Van Dorsselaer A., Poch O., Schaeffer C., Lecompte O. (2009) Ortho-proteogenomics: multiple proteomes investigation through orthology and a new MS-based protocol. Genome Res. 19, 128–135.