Orphan genes (in bacteria, also called ORFan genes) are genes without homologues in the genomes of other organisms. Estimates of the percentage of genes that are orphans vary enormously between species and between studies; 10-30% is a commonly cited figure. At least in bacteria, the percentage of orphan genes correlates with neither organism complexity nor genome length.
Orphan genes are a comparatively unexplored area of biology and a potentially rich source of discovery. A neuropharmacologist whose team studies orphan genes said, “The challenge is to figure out what it is doing. That's what really drives us.” Orphan genes are unusual compared with the lineage-specific genes that are frequently studied in evolution. The commonly accepted model of gene evolution relies heavily on duplication, rearrangement, and mutation of existing genes under common descent. This model accounts well for lineage-specific genes but less well for orphan genes, so understanding orphan genes may help refine it, which helps explain why evolutionary biologists find them interesting. Nevertheless, orphan genes have received relatively little attention compared with lineage-specific genes. One possible reason is that the only biological role currently attributed to orphan genes is that they help organisms adapt. This role is supported by the large variation in gene content among the genomes of different bacterial species. The study of lineage-specific genes has generated more interest in the scientific community because their causes and effects are easier to determine.
To be considered an orphan, a gene must encode a protein that lacks homology to any predicted peptide from the genomes of related species. Orphans are a subset of taxonomically-restricted genes (TRGs), which are unique to a specific taxonomic level (e.g. plant-specific). Orphans, by contrast, are usually considered unique to a very narrow taxon, generally a species.
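The distinction between orphans and broader TRGs can be illustrated with a minimal sketch. It assumes that homology searches have already been run and that, for each gene, the list of species with a detectable homologue is known. The lineage table, the function names `restriction_level` and `classify`, and the example inputs are all hypothetical and greatly simplified; they are not taken from any published pipeline.

```python
# Simplified, illustrative lineages, listed from the narrowest taxon
# (the species itself) to the broadest. Real taxonomies have many more ranks.
LINEAGES = {
    "Arabidopsis thaliana": ["Arabidopsis thaliana", "Brassicaceae", "Viridiplantae", "Eukaryota"],
    "Brassica rapa": ["Brassica rapa", "Brassicaceae", "Viridiplantae", "Eukaryota"],
    "Oryza sativa": ["Oryza sativa", "Poaceae", "Viridiplantae", "Eukaryota"],
    "Homo sapiens": ["Homo sapiens", "Hominidae", "Metazoa", "Eukaryota"],
}


def restriction_level(focal, species_with_homologues):
    """Return the narrowest taxon in the focal species' lineage that
    contains every species in which a homologue was detected."""
    others = set(species_with_homologues) - {focal}
    for taxon in LINEAGES[focal]:
        # With no other species, the first (narrowest) taxon is returned.
        if all(taxon in LINEAGES[species] for species in others):
            return taxon
    return None  # lineages share no taxon in this table


def classify(focal, species_with_homologues):
    """Label a gene as an orphan or as restricted to a broader taxon."""
    level = restriction_level(focal, species_with_homologues)
    if level == focal:
        return "orphan"
    return f"taxonomically restricted ({level})"


print(classify("Arabidopsis thaliana", []))
print(classify("Arabidopsis thaliana", ["Brassica rapa"]))
print(classify("Arabidopsis thaliana", ["Brassica rapa", "Oryza sativa"]))
```

In this sketch a gene with no detected homologue outside its own species is labelled an orphan, while a gene shared only with Brassica rapa is labelled as restricted to Brassicaceae. Any such label depends on which genomes are available for comparison and on the sensitivity of the homology search.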
History of orphan genes
Orphan genes were first discovered when the yeast genome-sequencing project began in 1996. Orphan genes accounted for an estimated 26% of the yeast genome, but it was believed that these genes could be classified once more genomes were sequenced. Because an estimated 1 to 20 million animal species exist, most of them unsequenced, the finding was largely set aside for some time. However, the cumulative number of orphan genes in sequenced genomes did not level off as time passed. In 2002, analyses of the Schizosaccharomyces pombe and Saccharomyces cerevisiae genomes found that 14 percent and 19 percent, respectively, of the protein-encoding genes were unique to that species. Unfortunately for the study of orphan genes, researchers were more interested in the similar gene sequences than in the unknown regions.
It was not until 2003 that orphan genes were directly addressed. In a study of Caenorhabditis briggsae and related species, researchers compared over 2000 genes. They proposed that these genes must be evolving too quickly for their homologues to be detected and are consequently sites of very rapid evolution. In 2005, Wilson examined 122 bacterial species to test whether the large number of orphan genes found in many species was genuine. The study concluded that the orphan genes were genuine and played a role in bacterial adaptation. The definition of taxonomically-restricted genes was introduced into the literature to make orphan genes seem less “mysterious.”
In 2009, a study explored “the dark matter of protein space” by analyzing the 2,200 domains of unknown function and concluded that they facilitated the evolution of novel functions. This was important because orphan genes were recognized to have a purpose at the level of proteins.
In 2011, a comprehensive genome-wide study of the extent and evolutionary origins of orphan genes in plants was conducted in the model plant Arabidopsis thaliana.
How to identify orphan genes
Genes can be tentatively classified as orphans if no similar predicted proteins can be found in nearby species.
The most common method used to estimate similarity is the Basic Local Alignment Search Tool (BLAST). BLAST has been described as biology's Google: it allows nucleotide or protein sequences to be rapidly searched against large sequence databases. Simulations suggest that under certain conditions BLAST is suitable for detecting distant relatives of a gene. However, genes that are short and evolve rapidly can easily be missed by BLAST.
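The screening step described above can be sketched as a simple filter over BLAST results. The example assumes that the proteins of the focal species have already been searched against the proteomes of other species only, and that the hits are available in BLAST's standard 12-column tabular format, in which the query identifier is the first column and the E-value is the eleventh. The function name `find_orphan_candidates`, the E-value cutoff of 1e-3, and the sample rows are illustrative assumptions, not a recommended protocol.

```python
def find_orphan_candidates(query_ids, blast_rows, evalue_cutoff=1e-3):
    """Return the query IDs that have no hit at or below the E-value cutoff.

    blast_rows: lines of tab-separated BLAST output with the 12 standard
    columns (qseqid, sseqid, pident, length, mismatch, gapopen, qstart,
    qend, sstart, send, evalue, bitscore).
    """
    with_hits = set()
    for row in blast_rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        qseqid, evalue = fields[0], float(fields[10])
        if evalue <= evalue_cutoff:
            with_hits.add(qseqid)
    # Queries with no acceptable hit are tentative orphans.
    return sorted(set(query_ids) - with_hits)


# Made-up data for illustration only.
QUERIES = ["geneA", "geneB", "geneC"]
BLAST_ROWS = [
    "geneA\tsp2_prot17\t78.5\t200\t43\t0\t1\t200\t5\t204\t1e-90\t310",
    "geneB\tsp2_prot03\t31.0\t40\t27\t1\t10\t49\t80\t119\t2.5\t18.1",
]

print(find_orphan_candidates(QUERIES, BLAST_ROWS))  # ['geneB', 'geneC']
```

Here geneA has a strong hit and is excluded, geneB has only a weak hit above the cutoff, and geneC has no hit at all, so the last two are reported as candidates. The result is tentative: a stricter or looser cutoff changes the list, and short, fast-evolving genes may be flagged even when true homologues exist.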
Where do orphan genes come from?
According to the Carvunis Model, novel orphan genes continually arise from "random", non-coding sequence. These novel genes may be sufficiently beneficial to be swept to fixation by selection. More likely, they fade back into the non-genic background. That young genes are more likely to go extinct (become pseudogenes) has recently been confirmed in Drosophila.
Orphan genes tend to be very short (~6 times shorter than mature genes), weakly expressed, tissue specific, and simple in codon usage and amino acid composition. Orphan genes mostly encode intrinsically disordered proteins.
Some researchers have proposed that orphan genes drive morphological specification because they allow organisms to "adapt to constantly changing ecological conditions." Such genes add to the variation within a population that can help it survive in its environment, which may be especially helpful if the population recently experienced a bottleneck.
- Khalturin, K; Hemmrich, G; Fraune, S; Augustin, R; Bosch, TC (2009). "More than just orphans: are taxonomically-restricted genes important in evolution?". Trends in Genetics 25 (9): 404–413. doi:10.1016/j.tig.2009.07.006.
- Fukuchi, S.; Nishikawa, K. (2004). "Estimation of the number of authentic orphan genes in bacterial genomes". DNA Research 11 (4): 219–231.
- Jaroszewski, L.; Li, Z.; Krishna, S. S.; Bakolitsa, C.; Wooley, J.; Deacon, A. M.; Godzik, A. (2009). "Exploration of uncharted regions of the protein universe". PLOS Biology 7 (9): 1–15. doi:10.1371/journal.pbio.1000205.
- Maugh, T. H. II (16 September 1999). "Science File; Homes for orphan genes; Scientists are making progress in decoding the human blueprint, but the function of many genes remains a mystery. Using sophisticated techniques, researchers are solving some of the puzzles, and getting closer to new treatments for such ailments as high blood pressure". Los Angeles Times: 2.
- Toll-Riera, M.; Bosch, N.; Bellora, N.; Castelo, R.; Armengol, L.; Estivill, X.; Alba, M. M. (2009). "Origin of primate orphan genes: a comparative genomics approach". Molecular Biology and Evolution 26 (3): 603–612. doi:10.1093/molbev/msn281.
- Tautz, D.; Domazet-Lošo, T. (2011). "The evolutionary origin of orphan genes". Nature Reviews Genetics 12: 692–702. doi:10.1038/nrg3053. PMID 21878963.
- Wissler, L.; Gadau, J.; Simola, D. F.; Helmkampf, M.; Bornberg-Bauer, E. (2013). "Mechanisms and Dynamics of Orphan Gene Emergence in Insect Genomes". Genome Biology and Evolution 5 (2): 439–455. doi:10.1093/gbe/evt009.
- Carroll, S. B.; Grenier, J.; Weatherbee, S. (2009). From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Oxford: Blackwell Publishing.
- Wilson, G. A.; Bertrand, N.; Patel, Y.; Hughes, J. B.; Feil, E. J.; Field, D. (2005). "Orphans as taxonomically restricted and ecologically important genes". Microbiology 151 (8): 2499–2501. doi:10.1099/mic.0.28146-0.
- Donoghue, M.T.A; Keshavaiah, C.; Swamidatta, S.H.; Spillane, C. (2011). "Evolutionary origins of Brassicaceae specific genes in Arabidopsis thaliana". BMC Evolutionary Biology 11 (1): 47. doi:10.1186/1471-2148-11-47.
- Altschul, S. (1 September 1997). "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs". Nucleic Acids Research 25 (17): 3389–3402. doi:10.1093/nar/25.17.3389. PMC 146917. PMID 9254694.
- "NCBI BLAST homepage".
- Alba, M.; Castresana, J. (2007). "On homology searches by protein BLAST and the characterization of the age of genes". BMC Evolutionary Biology 7: 53.
- Moyers, B. A.; Zhang, J. (13 October 2014). "Phylostratigraphic Bias Creates Spurious Patterns of Genome Evolution". Molecular Biology and Evolution 32 (1): 258–267. doi:10.1093/molbev/msu286.
- Carvunis, Anne-Ruxandra; Rolland, Thomas; Wapinski, Ilan; Calderwood, Michael A.; Yildirim, Muhammed A.; Simonis, Nicolas; Charloteaux, Benoit; Hidalgo, César A.; Barbette, Justin; Santhanam, Balaji; Brar, Gloria A.; Weissman, Jonathan S.; Regev, Aviv; Thierry-Mieg, Nicolas; Cusick, Michael E.; Vidal, Marc (24 June 2012). "Proto-genes and de novo gene birth". Nature 487 (7407): 370–374. doi:10.1038/nature11184.
- Palmieri, Nicola; Kosiol, Carolin; Schlötterer, Christian (19 February 2014). "The life cycle of orphan genes". eLife 3. doi:10.7554/eLife.01311.
- Arendsee, Zebulun W.; Li, Ling; Wurtele, Eve Syrkin (November 2014). "Coming of age: orphan genes in plants". Trends in Plant Science 19 (11): 698–708. doi:10.1016/j.tplants.2014.07.003.
- Mukherjee, S.; Panda, A.; Ghosh, T.C. (March 2015). "Elucidating evolutionary features and functional implications of orphan genes in Leishmania major". Infection Genetics and Evolution. doi:10.1016/j.meegid.2015.03.031.