Retrotransposons (also called transposons via RNA intermediates) are genetic elements that can amplify themselves in a genome and are ubiquitous components of the DNA of many eukaryotic organisms. They are a subclass of transposon. They are particularly abundant in plants, where they are often a principal component of nuclear DNA. In maize, 49-78% of the genome is made up of retrotransposons. In wheat, about 90% of the genome consists of repeated sequences and 68% of transposable elements. In mammals, almost half the genome (45% to 48%) is transposons or remnants of transposons. Around 42% of the human genome is made up of retrotransposons, while DNA transposons account for about 2-3%.
The replicative mode of transposition used by retrotransposons, by means of an RNA intermediate, rapidly increases element copy number and can thereby increase genome size. Like DNA transposable elements (class II transposons), retrotransposons can induce mutations by inserting near or within genes. Furthermore, retrotransposon-induced mutations are relatively stable: because transposition is replicative, the element at the insertion site is retained rather than excised.
Retrotransposons are transcribed into RNA, which is then copied back into DNA that may integrate into the genome. This second step, the synthesis of DNA from RNA, may be carried out by a reverse transcriptase encoded by the retrotransposon itself. Transposition and survival of retrotransposons within the host genome are possibly regulated by both retrotransposon- and host-encoded factors, which limits deleterious effects on host and retrotransposon alike in a relationship that has existed for many millions of years. How retrotransposons and their host genomes have co-evolved mechanisms to regulate transposition, insertion specificity, and mutational outcomes in order to optimize each other's survival is still poorly understood.
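The effect of replicative transposition on copy number and genome size can be illustrated with a deliberately idealized model. The function name, parameters, and numbers below are hypothetical and chosen only for illustration; the model ignores deletions, truncated insertions, and any host regulation.

```python
def transpose(genome_bp, element_bp, copies, events, replicative):
    """Apply `events` transposition events and return (copies, genome_bp).

    Idealized assumption: every event inserts one full-length element.
    """
    for _ in range(events):
        if replicative:
            # Copy-and-paste (class I): the donor copy stays in place and
            # a new copy is inserted, so both counts grow.
            copies += 1
            genome_bp += element_bp
        # Cut-and-paste (class II): the element is excised and reinserted
        # elsewhere, so copy number and genome length are unchanged.
    return copies, genome_bp


# Hypothetical starting point: one 6,000 bp element in a 1 Mb genome.
retro = transpose(1_000_000, 6_000, 1, 100, replicative=True)
dna = transpose(1_000_000, 6_000, 1, 100, replicative=False)
print("copy-and-paste:", retro)
print("cut-and-paste: ", dna)
```

In this toy model, one hundred replicative events add one hundred copies and 600 kb of sequence, while the same number of cut-and-paste events leaves both values untouched.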
Most retrotransposons are very old and, through accumulated mutations, are no longer able to retrotranspose.
Types of retrotransposons
LTR retrotransposons have direct long terminal repeats (LTRs) that range from ~100 bp to over 5 kb in size. LTR retrotransposons are further sub-classified into the Ty1-copia-like (Pseudoviridae), Ty3-gypsy-like (Metaviridae), and Pao-BEL-like groups based on both their degree of sequence similarity and the order of encoded gene products. The Ty1-copia and Ty3-gypsy groups of retrotransposons are commonly found in high copy number (up to a few million copies per haploid nucleus) in animal, fungal, protist, and plant genomes. Pao-BEL-like elements have so far only been found in animals. About 8% of the human genome and approximately 10% of the mouse genome are composed of LTR retrotransposons.
LTR retrotransposons are also widely distributed in plants, including both gymnosperms and angiosperms.
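The defining feature described above, a direct repeat at both ends of the element, can be sketched as a simple string check. This is a minimal illustration that assumes the two repeats are still identical; real LTRs accumulate mutations, so practical tools tolerate mismatches. The function name and the toy sequence are hypothetical.

```python
def terminal_direct_repeat(seq, min_len=4):
    """Return the longest string that is both a prefix and a suffix of
    `seq` without the two copies overlapping, or '' if none is found.

    Assumption: exact matching only (no mismatches or indels).
    """
    seq = seq.upper()
    max_len = len(seq) // 2  # the two repeats must not overlap
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:k]
    return ""


# Toy element: a 9 bp repeat flanking a 10 bp internal region.
element = "TGTTGGAAT" + "ACGTACGGCC" + "TGTTGGAAT"
print(terminal_direct_repeat(element))
```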
Non-LTR retrotransposons (also called non-LTR retroposons) are widespread in eukaryotic genomes and consist of two sub-types, long interspersed elements (LINEs) and short interspersed elements (SINEs). They can be found in high copy numbers (up to 250,000) in plant species. LINEs possess two ORFs, which encode all the functions needed for retrotransposition. These functions include reverse transcriptase and endonuclease activities, in addition to a nucleic acid-binding property needed to form a ribonucleoprotein particle. SINEs, on the other hand, co-opt the LINE machinery and function as nonautonomous retroelements.
Long Interspersed Elements are a group of genetic elements that are found in large numbers in eukaryotic genomes. They are transcribed into RNA from an RNA polymerase II promoter that resides inside the LINE. LINEs code for the enzyme reverse transcriptase, and many LINEs also code for an endonuclease (e.g. RNase H). The reverse transcriptase has a higher specificity for the LINE RNA than for other RNA, and makes a DNA copy of the RNA that can be integrated into the genome at a new site. The endonuclease encoded by non-LTR retroposons may be of the AP (apurinic/apyrimidinic) type or the REL (restriction endonuclease-like) type. Elements of the R2 group have a REL-type endonuclease, which shows site specificity in insertion.
The 5' UTR contains the promoter sequence, while the 3' UTR contains a polyadenylation signal (AATAAA) and a poly-A tail. Because LINEs (and other class I transposons, e.g. LTR retrotransposons and SINEs) move by copying themselves rather than by a cut-and-paste-like mechanism, as class II transposons do, they enlarge the genome. The human genome, for example, contains about 500,000 LINEs, which together make up roughly 17% of the genome. Of these, approximately 7,000 are full-length, a small subset of which are capable of retrotransposition.
Specific LINE-1 retroposons in the human genome have been found to be actively transcribed; the associated LINE-1 RNAs are tightly bound to nucleosomes and are essential for establishing the local chromatin environment.
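The figures quoted above imply that most LINE copies are short fragments, which a few lines of arithmetic make explicit. The copy numbers and genome fraction come from the text; the haploid genome size of 3.1 Gb is an assumption added here, and the function name is hypothetical.

```python
GENOME_BP = 3.1e9           # assumed haploid human genome size (not from the text)
LINE_COPIES = 500_000       # from the text
LINE_FRACTION = 0.17        # from the text
FULL_LENGTH_COPIES = 7_000  # from the text


def mean_copy_length(genome_bp, fraction, copies):
    """Average length per copy implied by a genome fraction and copy count."""
    return genome_bp * fraction / copies


mean_len = mean_copy_length(GENOME_BP, LINE_FRACTION, LINE_COPIES)
full_length_share = FULL_LENGTH_COPIES / LINE_COPIES

print(f"mean LINE copy length: about {mean_len:.0f} bp")
print(f"share of copies that are full-length: {full_length_share:.1%}")
```

Under these assumptions the average copy is roughly 1 kb, well below the approximately 6 kb usually cited for a full-length human LINE-1 (a figure not given in the text), and only about 1.4% of copies are full-length. This is consistent with the statement that most retrotransposon copies are old and no longer able to retrotranspose.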
Short Interspersed Elements are short DNA sequences (<500 bases) that represent reverse-transcribed RNA molecules originally transcribed by RNA polymerase III into tRNA, 5S ribosomal RNA, and other small nuclear RNAs. SINEs do not encode a functional reverse transcriptase protein and rely on other mobile elements for transposition. The most common SINEs in primates are called Alu sequences. Alu elements are approximately 350 base pairs long, do not contain any coding sequences, and can be recognized by the restriction enzyme AluI (hence the name). With about 1,500,000 copies, SINEs make up about 11% of the human genome. While historically viewed as "junk DNA", recent research suggests that, in some rare cases, both LINEs and SINEs have been incorporated into novel genes and have thereby evolved new functionality. The distribution of these elements has been implicated in some genetic diseases and cancers. Although sequence analysis of human Alu subfamilies shows the existence of mosaic (recombinant) elements, experimental evidence for how they form is lacking. In the primitive eukaryote Entamoeba histolytica, frequent exchange of sequence during retrotransposition has been reported; this results in a mosaic pattern in its SINE sequences.
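Recognition of Alu elements by AluI can be illustrated with a simple site search. The recognition sequence AGCT is not stated in the text and is supplied here as background; the function name and test sequence are hypothetical. Because AGCT is palindromic, scanning one strand is enough.

```python
ALUI_SITE = "AGCT"  # AluI recognition sequence (background fact, not from the text)


def find_alui_sites(seq):
    """Return the 0-based start position of every AGCT occurrence in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(ALUI_SITE)
    while start != -1:
        positions.append(start)
        # Continue searching one base further along the sequence.
        start = seq.find(ALUI_SITE, start + 1)
    return positions


print(find_alui_sites("GGAGCTTTAGCTA"))
```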
Composite SINE transposons
Two SINEs may act in concert to flank and mobilize an intervening single-copy DNA sequence. This was reported for a 710 bp DNA sequence upstream of the bovine beta globin gene. The DNA arrangement forms a composite transposon whose presence has been confirmed by the complete bovine genomic sequence, in which the mobilized sequence may be found on bovine chromosome 15 in contig NW_001493315.1 nucleotides #1085432–1086142 and the originating sequence may be found on bovine chromosome 2 in contig NW_001501789.2 nucleotides #1096679–1097389. It is likely that similar composite transposons exist in other bovine genomic regions and in other mammalian genomes. They could be detected with suitable algorithms.
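One possible first step for such an algorithm is sketched below: given SINE annotations, report neighbouring pairs that flank a short intervening sequence. This is only a candidate filter under stated assumptions, not a published method. The `Sine` record, the function name, the gap thresholds, and the coordinates in the example are all hypothetical, and a real search would also need to show that the flanked sequence occurs elsewhere in the genome.

```python
from typing import List, NamedTuple, Tuple


class Sine(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def composite_candidates(
    sines: List[Sine], min_gap: int = 50, max_gap: int = 2000
) -> List[Tuple[Sine, Sine, int]]:
    """Return (left, right, gap) for adjacent SINEs on the same chromosome
    whose intervening sequence length lies within [min_gap, max_gap].

    The thresholds are illustrative assumptions, not values from the text.
    """
    ordered = sorted(sines, key=lambda s: (s.chrom, s.start))
    hits = []
    for left, right in zip(ordered, ordered[1:]):
        if left.chrom != right.chrom:
            continue
        gap = right.start - left.end - 1  # bases strictly between the two SINEs
        if min_gap <= gap <= max_gap:
            hits.append((left, right, gap))
    return hits


# Made-up annotations: only the first pair flanks a short (710 bp) sequence.
annotations = [
    Sine("chr15", 1000, 1300),
    Sine("chr15", 2011, 2300),
    Sine("chr15", 9000, 9300),
    Sine("chr2", 500, 800),
]
for left, right, gap in composite_candidates(annotations):
    print(left, right, gap)
```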
See also
- Copy-number variation
- Endogenous retrovirus
- Genomic organization
- Interspersed repeat
- Retrotransposon markers, a powerful method of reconstructing phylogenies.