Codon degeneracy

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Artoria2e5 (talk | contribs) at 13:17, 21 December 2023 (~). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Degeneracy or redundancy[1] of codons is the redundancy of the genetic code, exhibited as the multiplicity of three-nucleotide codons that specify the same amino acid. The degeneracy of the genetic code is what accounts for the existence of synonymous mutations.[2]: Chp 15 

Background

Degeneracy of the genetic code was identified by Lagerkvist.[3] For instance, codons GAA and GAG both specify glutamic acid and thus exhibit redundancy; however, neither specifies any other amino acid, so neither is ambiguous.

The codons encoding one amino acid may differ in any of their three positions; however, more often than not, this difference is in the second or third position.[4] For instance, the amino acid glutamic acid is specified by GAA and GAG codons (difference in the third position); the amino acid leucine is specified by UUA, UUG, CUU, CUC, CUA, CUG codons (difference in the first or third position); and the amino acid serine is specified by UCA, UCG, UCC, UCU, AGU, AGC (difference in the first, second, or third position).[2]: 521–522 

Degeneracy results because there are more codons than encodable amino acids. For example, if there were two bases per codon, then only 16 amino acids could be coded for (4²=16). Because at least 21 codes are required (20 amino acids plus stop), the smallest sufficient codon length is three bases; 4³ gives 64 possible codons, so some degeneracy must exist.[2]: 521–522 
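The counting argument above can be checked with a short Python sketch. The function name `min_codon_length` is a hypothetical helper introduced here for illustration; the only inputs taken from the text are the four bases and the 21 required codes.

```python
def min_codon_length(n_codes: int, n_bases: int = 4) -> int:
    """Return the smallest codon length whose codon count covers n_codes."""
    length = 1
    while n_bases ** length < n_codes:
        length += 1
    return length

required = 20 + 1  # 20 amino acids plus a stop signal, as stated in the text
length = min_codon_length(required)
surplus = 4 ** length - required

print(length)       # 3
print(4 ** length)  # 64
print(surplus)      # 43 codons beyond the minimum, hence degeneracy
```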

Terminology

A position of a codon is said to be an n-fold degenerate site if only n of the four possible nucleotides (A, C, G, U) at this position specify the same amino acid. A nucleotide substitution at a 4-fold degenerate site is always a synonymous mutation, with no change to the amino acid.[2]: 521–522 

At a less degenerate site, some substitutions produce a nonsynonymous mutation. The only 3-fold degenerate site is the third position of an isoleucine codon: AUU, AUC, and AUA all encode isoleucine, but AUG encodes methionine. In computation, this position is often treated as a twofold degenerate site.[2]: 521–522 

A position is said to be non-degenerate if any mutation at this position changes the amino acid. For example, all three positions of methionine's AUG are non-degenerate, because the only codon coding for methionine is AUG. The same goes for tryptophan's UGG.[2]: 521–522 
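The definitions above can be illustrated with a minimal Python sketch. It assumes the standard genetic code written in U, C, A, G order; the names `CODON_TABLE` and `degeneracy` are introduced here for illustration, and stop codons are marked with `*` and treated like any other meaning.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def degeneracy(codon: str, position: int) -> int:
    """Count the nucleotides (original included) that keep the encoded
    amino acid unchanged at a 0-based position; 1 means non-degenerate."""
    target = CODON_TABLE[codon]
    variants = (codon[:position] + b + codon[position + 1:] for b in BASES)
    return sum(CODON_TABLE[v] == target for v in variants)

print(degeneracy("GGU", 2))                      # 4: glycine, third position
print(degeneracy("AUU", 2))                      # 3: isoleucine, third position
print(degeneracy("GAA", 2))                      # 2: glutamic acid, third position
print([degeneracy("AUG", i) for i in range(3)])  # [1, 1, 1]: methionine
```

Counting the original nucleotide itself means a non-degenerate site scores 1, which matches the n-fold wording used in this section.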

There are three amino acids encoded by six different codons: serine, leucine, and arginine. Only two amino acids are specified by a single codon each. One of these is methionine, specified by the codon AUG, which also specifies the start of translation; the other is tryptophan, specified by the codon UGG.
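These counts can be reproduced by tallying codons per amino acid. The sketch below assumes the standard genetic code in one-letter notation; `CODON_TABLE`, `counts`, and `by_count` are illustrative names, and stop codons are excluded from the tally.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons per amino acid, ignoring the three stop codons.
counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Group amino acids by how many codons specify them.
by_count = defaultdict(list)
for aa, n in sorted(counts.items()):
    by_count[n].append(aa)

print(by_count[6])  # ['L', 'R', 'S']: leucine, arginine, serine
print(by_count[1])  # ['M', 'W']: methionine, tryptophan
```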

Inverse table for the standard genetic code (compressed using IUPAC notation)
Amino acid | RNA codons | Compressed | Amino acid | RNA codons | Compressed
Ala, A | GCU, GCC, GCA, GCG | GCN | Ile, I | AUU, AUC, AUA | AUH
Arg, R | CGU, CGC, CGA, CGG; AGA, AGG | CGN, AGR; or CGY, MGR | Leu, L | CUU, CUC, CUA, CUG; UUA, UUG | CUN, UUR; or CUY, YUR
Asn, N | AAU, AAC | AAY | Lys, K | AAA, AAG | AAR
Asp, D | GAU, GAC | GAY | Met, M | AUG |
Asn or Asp, B | AAU, AAC; GAU, GAC | RAY | Phe, F | UUU, UUC | UUY
Cys, C | UGU, UGC | UGY | Pro, P | CCU, CCC, CCA, CCG | CCN
Gln, Q | CAA, CAG | CAR | Ser, S | UCU, UCC, UCA, UCG; AGU, AGC | UCN, AGY
Glu, E | GAA, GAG | GAR | Thr, T | ACU, ACC, ACA, ACG | ACN
Gln or Glu, Z | CAA, CAG; GAA, GAG | SAR | Trp, W | UGG |
Gly, G | GGU, GGC, GGA, GGG | GGN | Tyr, Y | UAU, UAC | UAY
His, H | CAU, CAC | CAY | Val, V | GUU, GUC, GUA, GUG | GUN
START | AUG, CUG, UUG | HUG | STOP | UAA, UGA, UAG | URA, UAR
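The compressed column can be expanded back into explicit codons. The following sketch assumes the IUPAC nucleotide ambiguity codes written with U in place of T; the names `IUPAC` and `expand` are introduced here for illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, using U instead of T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}

def expand(pattern: str) -> list[str]:
    """Expand a compressed codon such as 'AUH' into the codons it denotes."""
    return ["".join(bases) for bases in product(*(IUPAC[ch] for ch in pattern))]

print(expand("AUH"))                  # ['AUA', 'AUC', 'AUU']
print(expand("CGY") + expand("MGR"))  # the six arginine codons
print(expand("HUG"))                  # ['AUG', 'CUG', 'UUG']
```

Both arginine forms in the table, `CGN, AGR` and `CGY, MGR`, expand to the same six codons, which is why either compression is valid.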

Implications

These properties of the genetic code make it more fault-tolerant for point mutations. For example, in theory, fourfold degenerate codons can tolerate any point mutation at the third position, although codon usage bias restricts this in practice in many organisms; twofold degenerate codons tolerate only one of the three possible point mutations at the third position as a silent mutation, while the others are missense or nonsense mutations. Since transition mutations (purine to purine or pyrimidine to pyrimidine mutations) are more likely than transversion (purine to pyrimidine or vice versa) mutations, the equivalence of purines or that of pyrimidines at twofold degenerate sites adds further fault tolerance.[2]: 531–532 
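The claim about twofold degenerate sites can be checked directly against the standard genetic code. In this sketch, `is_transition` and `twofold_pairs` are illustrative names, and stop codons (marked `*`) are treated like any other meaning.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}

def is_transition(a: str, b: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    return a != b and (a in PURINES) == (b in PURINES)

# Collect (codon, synonymous third base) wherever the third position is
# twofold degenerate, i.e. exactly one other base keeps the same meaning.
twofold_pairs = []
for codon, aa in CODON_TABLE.items():
    partners = [b for b in BASES
                if b != codon[2] and CODON_TABLE[codon[:2] + b] == aa]
    if len(partners) == 1:
        twofold_pairs.append((codon, partners[0]))

# Every silent change at a twofold degenerate third position is a transition.
print(all(is_transition(codon[2], b) for codon, b in twofold_pairs))  # True
```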

Grouping of codons by amino acid residue molar volume and hydropathy.

A practical consequence of redundancy is that some errors in the genetic code cause only a synonymous mutation, or a substitution that does not affect the protein because hydrophilicity or hydrophobicity is maintained by an equivalent amino acid (conservative mutation). For example, a codon of NUN (where N = any nucleotide) tends to code for hydrophobic amino acids, NCN yields amino acid residues that are small in size and moderate in hydropathy, and NAN encodes average-sized hydrophilic residues.[5][6] These tendencies may result from the shared ancestry of the aminoacyl tRNA synthetases related to these codons.
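The grouping by second base can be listed from the standard genetic code. The sketch below only enumerates which amino acids fall under each pattern; it does not compute hydropathy or volume, and `by_second` is an illustrative name.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Collect the set of meanings for each choice of second base (N?N patterns).
by_second = {base: set() for base in BASES}
for codon, aa in CODON_TABLE.items():
    by_second[codon[1]].add(aa)

for base in BASES:
    print(f"N{base}N:", "".join(sorted(by_second[base])))
```

Under these assumptions, NUN covers F, I, L, M, and V, and NCN covers A, P, S, and T, consistent with the tendencies described above.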

These variable codes for amino acids are allowed because of modified bases in the first base of the anticodon of the tRNA; the base pair formed is called a wobble base pair. Wobble pairings include those formed by the modified base inosine and the non-Watson-Crick U-G base pair.[7]
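As a rough illustration of wobble pairing, the sketch below applies Crick's classic wobble rules, which are an assumption brought in here and are not spelled out in the text above. The names `WOBBLE`, `WATSON_CRICK`, and `codons_read` are hypothetical, anticodons are written 5'→3', and real tRNAs carry further modifications that this simple table ignores.

```python
# Assumed wobble rules: first anticodon base -> possible third codon bases.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}
# Standard Watson-Crick pairing for the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon: str) -> list[str]:
    """List the codons (5'->3') that an anticodon (5'->3') could read.

    The anticodon's first base pairs with the codon's third base (wobble);
    its second and third bases pair with the codon's second and first bases.
    """
    first, second, third = anticodon
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return [prefix + base for base in WOBBLE[first]]

print(codons_read("IGC"))  # ['GCU', 'GCC', 'GCA'], all alanine codons
print(codons_read("CAU"))  # ['AUG'], the single methionine codon
```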

References

  1. ^ "The Information in DNA Determines Cellular Function via Translation | Learn Science at Scitable". www.nature.com. Retrieved 2021-07-14.
  2. ^ a b c d e f g Watson JD, Baker TA, Bell SP, Gann A, Levine M, Losick R (2008). Molecular Biology of the Gene. San Francisco: Pearson/Benjamin Cummings. ISBN 978-0-8053-9592-1.
  3. ^ Lagerkvist, U. (1978). "Two out of three: An alternative method for codon reading". PNAS. 75: 1759–62.
  4. ^ Lehmann, J; Libchaber, A (July 2008). "Degeneracy of the genetic code and stability of the base pair at the second position of the anticodon". RNA. 14 (7): 1264–9. doi:10.1261/rna.1029808. PMC 2441979. PMID 18495942.
  5. ^ Yang; et al. (1990). Michel-Beyerle, M. E. (ed.). Reaction centers of photosynthetic bacteria: Feldafing-II-Meeting. Vol. 6. Berlin: Springer-Verlag. pp. 209–18. ISBN 3-540-53420-2.
  6. ^ Füllen G, Youvan DC (1994). "Genetic Algorithms and Recursive Ensemble Mutagenesis in Protein Engineering". Complexity International. 1. Archived from the original on 2011-03-15.
  7. ^ Varani G, McClain WH (July 2000). "The G x U wobble base pair. A fundamental building block of RNA structure crucial to RNA function in diverse biological systems". EMBO Rep. 1 (1): 18–23. doi:10.1093/embo-reports/kvd001. PMC 1083677. PMID 11256617.