Junk DNA

From Wikipedia, the free encyclopedia

Junk DNA is a synonym for nonfunctional DNA, that is, DNA that has no relevant biological function.[1][2] Most organisms have some junk DNA in their genomes (mostly pseudogenes and fragments of transposons and viruses), but it is possible that some organisms have substantial amounts of junk DNA.[3]

All protein-coding regions of genes are generally considered functional elements of genomes. Several non-protein-coding regions are generally considered functional as well: genes for ribosomal RNA and tRNA, regulatory sequences controlling the expression of those genes, origins of replication (in all species), and centromeres, telomeres, and scaffold attachment regions (in eukaryotes).

It is difficult to determine whether other regions of the genome are functional or nonfunctional, and there is considerable controversy over which criteria should be used to identify function. Many scientists take an evolutionary view of the genome and prefer criteria based on whether DNA sequences are preserved by natural selection.[4][5][6] Other scientists dispute this view or have different interpretations of the data.[7][8][9]

The history of junk DNA

The term "junk DNA" was used in the 1960s,[2][10][11] but it only became widely known in 1972 through a paper by Susumu Ohno.[12] Ohno noted that the mutational load from deleterious mutations places an upper limit on the number of functional loci that can be expected given a typical mutation rate. He hypothesized that mammalian genomes could not have more than 30,000 loci under selection before the "cost" from the mutational load would cause an inescapable decline in fitness, and eventually extinction.[12] Similar calculations focusing on nucleotides rather than gene loci reach a similar conclusion: given mutation rates, genome size and population size, the functional portion of the human genome can only be maintained up to approximately 15%.[13] The presence of junk DNA also explained the observation that even closely related species can have widely (orders-of-magnitude) different genome sizes (the C-value paradox).[1]
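The mutational-load reasoning above can be illustrated with a minimal sketch. It assumes a simple textbook model with multiplicative fitness effects, in which mean fitness relative to a mutation-free genome is e^(-U) at mutation-selection balance, where U is the expected number of new deleterious mutations per genome per generation. This is not the calculation used by Ohno or by the later nucleotide-based work, and every parameter value below (genome size, mutation rate, fraction of mutations in functional DNA that are deleterious) is an illustrative assumption.

```python
import math


def deleterious_mutation_rate(genome_size_bp, mutation_rate_per_bp,
                              functional_fraction, deleterious_fraction):
    """Expected new deleterious mutations per genome per generation (U).

    Only mutations that land in functional DNA can be deleterious, so U
    scales linearly with the functional fraction of the genome.
    """
    return (genome_size_bp * mutation_rate_per_bp
            * functional_fraction * deleterious_fraction)


def mean_relative_fitness(u):
    """Mean fitness relative to a mutation-free genome, exp(-U).

    Assumes multiplicative fitness effects at mutation-selection balance.
    """
    return math.exp(-u)


def mutational_load(u):
    """Mutational load L = 1 - exp(-U)."""
    return 1.0 - mean_relative_fitness(u)


if __name__ == "__main__":
    # All values below are assumptions chosen for illustration only.
    GENOME_SIZE_BP = 6.2e9        # assumed diploid genome size
    MUTATION_RATE = 1.2e-8        # assumed mutations per bp per generation
    DELETERIOUS_FRACTION = 0.4    # assumed share of functional-site mutations that are harmful

    for functional_fraction in (0.05, 0.15, 0.80):
        u = deleterious_mutation_rate(GENOME_SIZE_BP, MUTATION_RATE,
                                      functional_fraction, DELETERIOUS_FRACTION)
        # 1 / mean fitness is a rough proxy for the reproductive excess
        # needed to keep the population size constant.
        excess = 1.0 / mean_relative_fitness(u)
        print(f"functional fraction {functional_fraction:.0%}: "
              f"U = {u:.2f}, load = {mutational_load(u):.4f}, "
              f"offspring needed per individual ~ {excess:.3g}")
```

Under these assumptions the required reproductive excess grows exponentially with the functional fraction, which is the qualitative point of the argument: a genome that is mostly functional would demand an implausibly large number of offspring per individual.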


The term "junk DNA" is contentious and different exact definitions (and associated methods) yield widely different estimates of its prevalence.[6] Some authors assert that the term occurs mainly in popular science and is no longer used in serious research articles.[15] It has also been pointed out that the term "junk" can imply that its accumulation is disadvantageous, whereas the majority of non-functional sequence is likely merely neutral.[16] Strong reactions to the term "junk DNA" have also led some to recommend more neutral terminology, such as "nonfunctional DNA."[1]

Measurement and estimates

Different methodologies, which rest on different implicit definitions, yield different estimates of the non-functional fraction of the genome.[6]

For example, 20% of human genomic DNA shows no detectable biochemical activity,[17] but comparative genomics methods estimate a nonfunctional fraction of 85-92%.[18][9][19] Consequently, different definitions of junk DNA yield different proportions. Each method has limitations: genetic approaches may miss functional elements that do not manifest physically on the organism; evolutionary approaches have difficulty producing accurate multispecies sequence alignments, since genomes of even closely related species vary considerably; and biochemical signatures do not always signify a function.[9] Ultimately, genetic, evolutionary, and biochemical approaches can all be used in a complementary way to identify regions that may be functional in human biology and disease.[9]

Biochemical activity

Detectable biochemical activity (e.g. transcription, transcription factor association, chromatin structure, and histone modification) was observed for at least 80% of human genomic DNA by the Encyclopedia of DNA Elements (ENCODE) project.[17] This is an upper estimate of the functional portion of the human genome, since biochemical activity does not necessarily imply biological function or selective advantage.[20][1][21][2][22] For example, transcription factor binding sites are short and can be found by chance over the whole genome,[23] and 70% of transcribed sequences are present at less than one transcript per cell and so may be spurious background transcription.[9]
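A back-of-the-envelope sketch shows why short binding motifs are expected to occur by chance. It assumes a random genome with independent, equally frequent bases and an exact match to a single motif sequence; real genomes and real transcription factor motifs are more complex, and the genome length and motif lengths used below are illustrative assumptions.

```python
def expected_chance_matches(genome_length, motif_length, both_strands=True):
    """Expected number of exact motif matches in a random genome.

    Assumes each base is drawn independently with probability 1/4, so a
    given motif of length k matches at any start position with
    probability (1/4)**k. Palindromic motifs would be double counted when
    both strands are scanned; that refinement is ignored here.
    """
    start_positions = genome_length - motif_length + 1
    match_probability = 0.25 ** motif_length
    strands = 2 if both_strands else 1
    return start_positions * match_probability * strands


if __name__ == "__main__":
    HAPLOID_GENOME_BP = 3_100_000_000  # assumed approximate genome length
    for k in (6, 8, 10, 12):
        hits = expected_chance_matches(HAPLOID_GENOME_BP, k)
        print(f"{k}-bp motif: about {hits:,.0f} chance matches expected")
```

Under these assumptions, even a fully specified 8-bp motif is expected to match tens of thousands of positions by chance alone, so the presence of a motif match is weak evidence of function on its own.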

Genetic function

Contributing to the debate is the lack of consensus on what constitutes a "functional" element in the genome: geneticists, evolutionary biologists, and molecular biologists employ different approaches and definitions of "function",[9] often without making clear in the literature what they mean.[24] Because of this ambiguity in terminology, there are different schools of thought on the matter.[14]

However, widespread transcription and splicing in the human genome have been discussed as another indicator of genetic function, in addition to genomic conservation, which may miss poorly conserved functional sequences.[9] In addition, much of the apparent junk DNA is involved in epigenetic regulation and appears to be necessary for the development of complex organisms.[25][26][27]

Some critics have argued that functionality can only be assessed in reference to an appropriate null hypothesis. In this case, the null hypothesis would be that these parts of the genome are non-functional and have properties, be it on the basis of conservation or biochemical activity, that would be expected of such regions based on our general understanding of molecular evolution and biochemistry. According to these critics, until a region in question has been shown to have additional features, beyond what is expected of the null hypothesis, it should provisionally be labelled as non-functional.[28]
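The null-hypothesis approach described above can be sketched as a simple comparison against a null distribution. The example below is hypothetical: the score stands in for any measure of conservation or biochemical activity, the null scores are assumed to come from regions presumed to be non-functional, and the function names and the 0.05 threshold are illustrative choices rather than anything prescribed by the cited critics.

```python
import random


def empirical_p_value(candidate_score, null_scores):
    """Share of null scores at least as high as the candidate score.

    The +1 in numerator and denominator keeps the estimate above zero
    when no null score reaches the candidate score.
    """
    at_least_as_high = sum(1 for s in null_scores if s >= candidate_score)
    return (at_least_as_high + 1) / (len(null_scores) + 1)


def provisional_label(candidate_score, null_scores, alpha=0.05):
    """Label a region non-functional unless it clearly exceeds the null."""
    p = empirical_p_value(candidate_score, null_scores)
    if p < alpha:
        return "candidate functional"
    return "provisionally non-functional"


if __name__ == "__main__":
    # Synthetic null scores, generated only to make the sketch runnable.
    rng = random.Random(0)
    null_scores = [rng.gauss(0.0, 1.0) for _ in range(999)]

    for score in (0.5, 4.0):
        print(score, provisional_label(score, null_scores))
```

The design mirrors the argument in the text: the default label is non-functional, and a region is flagged only when its score goes beyond what the null distribution would be expected to produce.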

Evolutionary impact

One indication that a genomic region is functional is that its sequence has been maintained by purifying selection (that is, mutating the sequence away is deleterious to the organism). Estimates of the functionally constrained fraction of the human genome based on evolutionary conservation, using comparative genomics, range between 8 and 15%.[18][9][19] This may still be an underestimate when lineage-specific constraints are included. However, others have argued against relying solely on estimates from comparative genomics because of its limited scope: non-coding DNA has been found to be involved in epigenetic activity and complex networks of genetic interactions, and is explored in evolutionary developmental biology.[25][9][26][27]

Biologically functional sequences may also have different evolutionary impacts on the sequence itself or on the organism in which it is found. Much of the DNA in large genomes originates from selfish amplification of transposable elements. Some of this sequence has a biological function (transposition and self-replication in the host genome) but does not provide a selective advantage to the host organism.[29]

An additional complication is that the large body of nonfunctional background transcripts produced by nonfunctional sequences can evolve into functional elements de novo.[30][31] Therefore a sequence fitting a strict definition of junk, as having no biological function and no fitness effect, can still have long-term evolutionary significance.[32][33]

References


  1. ^ a b c d Eddy SR (November 2012). "The C-value paradox, junk DNA and ENCODE". Current Biology. 22 (21): R898–R899. doi:10.1016/j.cub.2012.10.002. PMID 23137679. S2CID 28289437.
  2. ^ a b c Palazzo AF, Gregory TR (May 2014). "The case for junk DNA". PLOS Genetics. 10 (5): e1004351. doi:10.1371/journal.pgen.1004351. PMC 4014423. PMID 24809441.
  3. ^ Gil R, Latorre A (2012). "Factors behind junk DNA in bacteria". Genes. 3: 634–650. doi:10.3390/genes3040634.
  4. ^ Ohno S (1972). "An argument for the genetic simplicity of man and other mammals". Journal of Human Evolution. 1: 651–662. doi:10.1016/0047-2484(72)90011-5.
  5. ^ Morange, Michel (2014). "Genome as a Multipurpose Structure Built by Evolution". Perspectives in Biology and Medicine. 57: 162–171. doi:10.1353/pbm.2014.0008.
  6. ^ a b c d Palazzo, A F; Kejiou, N S (2022). "Non-Darwinian Molecular Biology". Front. Genet. 13: 831068. doi:10.3389/fgene.2022.831068. PMC 8888898. PMID 35251134.
  7. ^ Germain PL, Ratti E, Boem F (2014). "Junk or functional DNA? ENCODE and the function controversy". Biology & Philosophy. 29: 807–821. doi:10.1007/s10539-014-9441-3.
  8. ^ Mattick, John S (2023). "RNA out of the mist". Trends in Genetics. 39: 187–207. doi:10.1016/j.tig.2022.11.001.
  9. ^ a b c d e f g h i j Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, et al. (April 2014). "Defining functional DNA elements in the human genome". Proceedings of the National Academy of Sciences of the United States of America. 111 (17): 6131–6138. Bibcode:2014PNAS..111.6131K. doi:10.1073/pnas.1318948111. PMC 4035993. PMID 24753594.
  10. ^ Ehret CF, De Haller G (October 1963). "Origin, development, and maturation of organelles and organelle systems of the cell surface in Paramecium". Journal of Ultrastructure Research. 23: SUPPL6:1–SUPPL642. doi:10.1016/S0022-5320(63)80088-X. PMID 14073743.
  11. ^ Gregory TR, ed. (2005). The Evolution of the Genome. Elsevier. pp. 29–31. ISBN 978-0-12-301463-4.
  12. ^ a b Ohno, S (1972). "So much 'junk' DNA in our genome". Brookhaven Symposia in Biology. 23: 366–70. OCLC 101819442. PMID 5065367.
  13. ^ Graur, D (2017). "An Upper Limit on the Functional Fraction of the Human Genome". Genome Biol. Evol. 9 (7): 1880–1885. doi:10.1093/gbe/evx121. PMC 5570035. PMID 28854598.
  14. ^ a b Doolittle, W. Ford (December 2018). "We simply cannot go on being so vague about 'function'". Genome Biology. 19 (1): 223. doi:10.1186/s13059-018-1600-4. PMC 6299606. PMID 30563541.
  15. ^ Khajavinia A, Makalowski W (May 2007). "What is "junk" DNA, and what is it worth?". Scientific American. 296 (5): 104. doi:10.1038/scientificamerican0507-104. PMID 17503549.
  16. ^ Brenner, Sydney (September 1998). "Refuge of spandrels". Current Biology. 8 (19): R669. doi:10.1016/s0960-9822(98)70427-0. PMID 9776723. S2CID 2918533.
  17. ^ a b Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, et al. (The ENCODE Project Consortium) (September 2012). "An integrated encyclopedia of DNA elements in the human genome". Nature. 489 (7414): 57–74. Bibcode:2012Natur.489...57T. doi:10.1038/nature11247. PMC 3439153. PMID 22955616.
  18. ^ a b Ponting CP, Hardison RC (November 2011). "What fraction of the human genome is functional?". Genome Research. 21 (11): 1769–1776. doi:10.1101/gr.116814.110. PMC 3205562. PMID 21875934.
  19. ^ a b Rands CM, Meader S, Ponting CP, Lunter G (July 2014). "8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage". PLOS Genetics. 10 (7): e1004525. doi:10.1371/journal.pgen.1004525. PMC 4109858. PMID 25057982.
  20. ^ McKie R (24 February 2013). "Scientists attacked over claim that 'junk DNA' is vital to life". The Observer.
  21. ^ Doolittle WF (April 2013). "Is junk DNA bunk? A critique of ENCODE". Proceedings of the National Academy of Sciences of the United States of America. 110 (14): 5294–5300. Bibcode:2013PNAS..110.5294D. doi:10.1073/pnas.1221376110. PMC 3619371. PMID 23479647.
  22. ^ Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E (2013). "On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE". Genome Biology and Evolution. 5 (3): 578–590. doi:10.1093/gbe/evt028. PMC 3622293. PMID 23431001.
  23. ^ Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, et al. (February 2018). "The Human Transcription Factors". Cell. 172 (4): 650–665. doi:10.1016/j.cell.2018.01.029. PMID 29425488. S2CID 3599827.
  24. ^ Linquist, Stefan; Doolittle, W. Ford; Palazzo, Alexander F. (1 April 2020). "Getting clear about the F-word in genomics". PLOS Genetics. 16 (4): e1008702. doi:10.1371/journal.pgen.1008702. PMC 7153884. PMID 32236092.
  25. ^ a b Carey M (2015). Junk DNA: A Journey Through the Dark Matter of the Genome. Columbia University Press. ISBN 978-0-231-17084-0.[page needed]
  26. ^ a b Liu G, Mattick JS, Taft RJ (July 2013). "A meta-analysis of the genomic and transcriptomic composition of complex life". Cell Cycle. 12 (13): 2061–2072. doi:10.1186/1877-6566-7-2. PMC 4685169. PMID 23759593.
  27. ^ a b Morris K, ed. (2012). Non-Coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection. Norfolk, UK: Caister Academic Press. ISBN 978-1-904455-94-3.[page needed]
  28. ^ Palazzo AF, Lee ES (2015). "Non-coding RNA: what is functional and what is junk?". Frontiers in Genetics. 6: 2. doi:10.3389/fgene.2015.00002. PMC 4306305. PMID 25674102.
  29. ^ Doolittle WF, Sapienza C (April 1980). "Selfish genes, the phenotype paradigm and genome evolution". Nature. 284 (5757): 601–603. Bibcode:1980Natur.284..601D. doi:10.1038/284601a0. PMID 6245369. S2CID 4311366.
  30. ^ Palazzo AF, Koonin EV (November 2020). "Functional Long Non-coding RNAs Evolve from Junk Transcripts". Cell. 183 (5): 1151–1161. doi:10.1016/j.cell.2020.09.047. PMID 33068526. S2CID 222815635.
  31. ^ Graur D, Zheng Y, Azevedo RB (January 2015). "An evolutionary classification of genomic function". Genome Biology and Evolution. 7 (3): 642–645. doi:10.1093/gbe/evv021. PMC 5322545. PMID 25635041.
  32. ^ Schmitz, Jonathan F.; Ullrich, Kristian K.; Bornberg-Bauer, Erich (2018-09-10). "Incipient de novo genes can evolve from frozen accidents that escaped rapid transcript turnover". Nature Ecology & Evolution. 2 (10): 1626–1632. doi:10.1038/s41559-018-0639-7. ISSN 2397-334X.
  33. ^ Neme, Rafik; Tautz, Diethard (2016-02-02). "Fast turnover of genome transcription across evolutionary time exposes entire non-coding DNA to de novo gene emergence". eLife. 5: e09977. doi:10.7554/eLife.09977. ISSN 2050-084X.