In biochemistry, a hypothetical protein is a protein whose existence has been predicted but for which there is no experimental evidence that it is expressed in vivo. Genome sequencing has produced numerous predicted open reading frames to which functions cannot be readily assigned. These proteins, either orphan or conserved hypothetical proteins, make up about 20% to 40% of the proteins encoded in each newly sequenced genome. Even when techniques such as microarray analysis and mass spectrometry provide enough evidence that the gene product is expressed, it is difficult to assign a function to it, given its lack of sequence identity to proteins with annotated biochemical function.
Most protein sequences are now inferred from computational analysis of genomic DNA sequence, and the "hypothetical protein" label is generated by gene prediction software during genome analysis. When the bioinformatic tool used for gene identification finds a large open reading frame without a characterised homologue in the protein database, it returns "hypothetical protein" as an annotation remark.
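The annotation step described above can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the method of any particular gene finder: it scans only the forward strand, treats any ATG-to-stop stretch above an assumed length cutoff (`MIN_CODONS`) as a "large" open reading frame, and stands in for the database search with a hypothetical lookup table (`known_homologues`).

```python
# Toy illustration only: real gene prediction software uses statistical
# models and both strands, not just ORF length on the forward strand.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 100  # assumed cutoff for what counts as a "large" ORF


def find_orfs(dna, min_codons=MIN_CODONS):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF here runs from an ATG to the next in-frame stop codon and must
    contain at least min_codons codons (stop codon not counted).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def annotate(orf_id, known_homologues):
    """Return a product name for an ORF.

    known_homologues is a hypothetical dict mapping ORF identifiers to the
    product name of a characterised homologue; it stands in for a real
    database search. ORFs without an entry get the fallback remark.
    """
    return known_homologues.get(orf_id, "hypothetical protein")
```

Calling `annotate("orf1", {})` returns the fallback remark because the lookup table holds no characterised homologue for that identifier.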
The function of a hypothetical protein can be predicted by domain homology searches, with varying levels of confidence. Conserved domains present in a hypothetical protein can be compared with known family domains, allowing the protein to be classified into a particular protein family even though it has not been investigated in vivo. Function can also be predicted by homology modelling, in which the sequence of the hypothetical protein is aligned with that of a protein whose three-dimensional structure is known. If a structure can be modelled in this way, the capability of the hypothetical protein to function can be assessed computationally.
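As a rough sketch of how a domain search result might be turned into a family assignment with a confidence level, the Python example below picks the best-scoring domain hit by E-value. The `DomainHit` record, the function name `classify`, and the two E-value thresholds are illustrative assumptions, not values taken from any specific tool or database.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """Hypothetical record for one conserved-domain match."""
    family: str    # name of the known protein family
    evalue: float  # expectation value of the match (lower is better)


def classify(hits, high=1e-10, low=1e-3):
    """Assign a protein family from domain hits.

    The thresholds are assumed for illustration: hits with an E-value above
    `low` are ignored, and a best hit at or below `high` is reported with
    "high" confidence, otherwise "low".
    Returns (family, confidence), or (None, "none") when no hit is usable.
    """
    usable = [h for h in hits if h.evalue <= low]
    if not usable:
        return None, "none"
    best = min(usable, key=lambda h: h.evalue)
    confidence = "high" if best.evalue <= high else "low"
    return best.family, confidence
```

A protein with no usable hit stays unclassified, which corresponds to it remaining a hypothetical protein without a family assignment.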
Further approaches to assigning function to hypothetical proteins include determining their three-dimensional structures through structural genomics initiatives, understanding the nature and mode of prosthetic group or metal ion binding, identifying fold similarity with proteins of known function, and annotating possible catalytic and regulatory sites. Structure prediction combined with biochemical function assessment, by screening against various substrates, is another promising approach to functional annotation.