Non-proteinogenic amino acids
Non-coded, non-proteinogenic or non-standard amino acids are amino acids which are not encoded in the genetic code. Despite the presence of only 22 amino acids in the genetic code (the proteinogenic amino acids), over 140 natural amino acids are known and thousands more combinations are possible. Several non-proteinogenic amino acids are noteworthy because they are:
- intermediates in biosynthesis
- post-translationally incorporated into protein
- physiologically active (e.g. components of bacterial cell walls, neurotransmitters and toxins)
- natural and man-made pharmacological compounds
- present in meteorites and in prebiotic experiments (e.g. Miller–Urey experiment)
Definition by negation 
Technically, any organic compound with an amine (-NH2) and a carboxylic acid (-COOH) functional group is an amino acid. The proteinogenic amino acids are a small subset of this group: they possess a central carbon atom (α- or 2-) bearing an amino group, a carboxyl group, a side chain and an α-hydrogen in the L (levo) configuration, with the exception of glycine, which is achiral, and proline, which is technically an imino acid.
The genetic code encodes 20 standard amino acids. However, there are three extra proteinogenic amino acids: selenocysteine, pyrrolysine and N-formylmethionine. The former two do not have a dedicated codon, but are added in place of a stop codon when a specific sequence is present: the UGA codon and the SECIS element for selenocysteine, and the UAG codon and the PYLIS downstream sequence for pyrrolysine. Formylmethionine is an N-formylated methionine that is encoded by the start codon AUG in bacteria and is never found in the interior of a protein sequence.
Formylmethionine. This amino acid is a methionine whose amino group has been protected by a formyl group
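The recoding rule described above (a stop codon read as an amino acid when a specific element is present) can be illustrated with a deliberately simplified sketch. It is a toy model, not a description of the actual mechanism: the `has_secis` and `has_pylis` flags are hypothetical stand-ins for the presence of a SECIS or PYLIS element, the codon table contains only the codons used in the demonstration, and every in-frame UGA or UAG is treated as recoded when the flag is set, which ignores the positional requirements of the real elements. The one-letter codes U (selenocysteine) and O (pyrrolysine) are the standard ones.

```python
# Toy model of stop-codon recoding; not a real translation engine.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Deliberately tiny codon table: only the codons used in the demo below.
CODON_TABLE = {"AUG": "M", "GCU": "A", "AAA": "K", "UGG": "W"}


def translate(mrna, has_secis=False, has_pylis=False):
    """Translate an mRNA string codon by codon.

    has_secis / has_pylis are hypothetical boolean flags standing in for
    the presence of a SECIS or PYLIS element in the transcript.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and has_secis:
            protein.append("U")  # selenocysteine instead of termination
        elif codon == "UAG" and has_pylis:
            protein.append("O")  # pyrrolysine instead of termination
        elif codon in STOP_CODONS:
            break  # ordinary termination
        else:
            protein.append(CODON_TABLE[codon])
    return "".join(protein)


# The same message is read differently depending on the recoding signal.
print(translate("AUGGCUUGAAAAUAA"))                   # UGA read as stop
print(translate("AUGGCUUGAAAAUAA", has_secis=True))   # UGA read as Sec
print(translate("AUGUAGUGGUAA", has_pylis=True))      # UAG read as Pyl
```

Without the flag, translation stops at the first stop codon; with it, the stop codon is read through and translation continues to the next ordinary stop.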
There are various groups of amino acids:
- 20 standard amino acids
- 23 proteinogenic amino acids
- over 80 amino acids created abiotically in high concentrations
- about 900 are produced by natural pathways
- over 118 engineered amino acids have been placed into protein
These groups overlap, but are not identical. All 23 proteinogenic amino acids are biosynthesised by organisms, but not all of them are abiotic (found in prebiotic experiments and meteorites); histidine, for example, is not. Many amino acids, such as ornithine, are metabolic intermediates that are abiotic, but not coded. Others are only metabolic intermediates, such as citrulline. Others are solely found in abiotic mixes, such as α-methylnorvaline. Over 30 unnatural amino acids have been translationally inserted into protein in engineered systems, yet are not biosynthetic.
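The overlap between these groups can be made concrete with a small sketch using Python sets. Membership is limited to the four examples named in the paragraph above; the set names and the `classify` helper are illustrative assumptions, not an established classification scheme.

```python
# Membership is taken only from the examples named in the text.
coded = {"histidine"}
biosynthetic = {"histidine", "ornithine", "citrulline"}
abiotic = {"ornithine", "alpha-methylnorvaline"}


def classify(name):
    """Report which of the three groups an amino acid belongs to."""
    return {
        "coded": name in coded,
        "biosynthetic": name in biosynthetic,
        "abiotic": name in abiotic,
    }


for name in ("histidine", "ornithine", "citrulline", "alpha-methylnorvaline"):
    print(name, classify(name))

# Set operations express the overlaps directly.
print(biosynthetic & abiotic)            # both biosynthetic and abiotic
print(biosynthetic - abiotic - coded)    # metabolic intermediates only
print(abiotic - biosynthetic)            # found only in abiotic mixes
```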
The IUPAC numbering system differentiates the carbons in an organic molecule by sequentially assigning a number to each one, including the carbon of a carboxylic group. In addition, the carbons along the side chain of amino acids can be labelled with Greek letters, where the α-carbon is the central chiral carbon bearing a carboxyl group, a side chain and, in α-amino acids, an amine group; the carbon of the carboxylic group is not counted. (Consequently, the IUPAC names of many non-proteinogenic α-amino acids start with 2-amino and end in -ic acid.)
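The relationship between the two labelling schemes is a fixed offset: because the carboxyl carbon is C1 and receives no Greek letter, the α-carbon is C2, the β-carbon is C3, and so on. A minimal sketch follows; the function name `greek_to_locant` is a hypothetical helper introduced for illustration.

```python
GREEK = "αβγδε"


def greek_to_locant(letter):
    """Return the IUPAC locant of the carbon labelled with a Greek letter.

    The carboxyl carbon is C1 and is not given a Greek letter, so α is C2.
    """
    return GREEK.index(letter) + 2


for letter in GREEK:
    print(f"{letter}-carbon = C{greek_to_locant(letter)}")
```

Under this mapping an α-amino acid is a 2-amino acid, β-alanine is 3-aminopropanoic acid and γ-aminobutyric acid (GABA) is 4-aminobutanoic acid.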
Natural, but non L-α-amino acids 
Most natural amino acids are α-amino acids in the L configuration, but some exceptions exist.
Non alpha 
Some non-α amino acids exist, most notably β-alanine and GABA. β-alanine is produced by aspartate decarboxylase (encoded by panD) and is joined to pantoate via an amide bond, forming pantothenic acid, a precursor of coenzyme A. GABA is a neurotransmitter in animals and possesses one more carbon than β-alanine in the hydrocarbon chain separating the terminal amino and carboxyl groups.
The reason why α amino acids are used in protein has been attributed to their frequency in meteorites and prebiotic experiments. An initial speculation that β amino acids have deleterious effects on secondary structure turned out to be incorrect. Additionally, several man-made inhibitors exist that are not α amino acids, such as isoserine.
D amino acids 
Most bacterial cell walls are formed by peptidoglycan, a polymer composed of amino sugars crosslinked with short oligopeptides bridged between each other. The oligopeptide is non-ribosomally synthesised and contains several peculiarities, including D-amino acids, generally D-alanine and D-glutamate. A further peculiarity is that the former is racemised by a PLP-binding enzyme (encoded by alr or the homologue dadX), whereas the latter is racemised by a cofactor-independent enzyme (murI). Some variants exist: in Thermotoga spp. D-lysine is present, and in certain vancomycin-resistant strains D-serine is present (vanT gene).
In animals, some D-amino acids are neurotransmitters.
Without a hydrogen on the α-carbon 
All proteinogenic amino acids have at least one hydrogen on the α-carbon. This is due to the different specificity that the ribosomal transferase activity would require for an α-hydrogen versus an α-methyl group, and to the biosynthetic problems posed by a quaternary carbon, which would block PLP-dependent catalysis (both SN2 and E2/attack).
Nevertheless, some exceptions are present. In some fungi α-aminoisobutyric acid is produced as a monomer to synthesise some antibiotics. This compound is similar to alanine, but possesses a methyl group instead of the α-hydrogen; because it carries two methyl groups on the α-carbon, the latter is not a stereocentre. Another compound similar to alanine without an α-hydrogen is dehydroalanine, which possesses a methylene side chain.
Twin amino acid stereocentres 
A subset of L α amino acids possess two ends that could each be considered an α amino acid (obviously only one end is the α). In protein, cysteine residues form disulfide bonds with other cysteine residues, thereby crosslinking the protein; two crosslinked cysteines form a cystine molecule. Cysteine and methionine are generally produced by direct sulfurylation, but in some species they can be produced by transsulfuration, where activated homoserine or serine is fused to a cysteine or homocysteine to form cystathionine, a molecule composed of a serine/cysteine moiety bridged via a thioether to a homoserine/homocysteine moiety. A similar compound is lanthionine, which can be seen as two cysteines joined via a thioether and is found in various organisms. Similarly, djenkolic acid, a plant toxin from jengkol beans, is composed of two cysteines joined via two thioethers separated by a methylene group. Diaminopimelic acid is used both as a bridge in peptidoglycan and as a precursor to lysine (via its decarboxylation).
In cells, especially autotrophs, several non-proteinogenic amino acids are found as metabolic intermediates. However, despite the catalytic flexibility of PLP-binding enzymes, many amino acids are synthesised as keto-acids (e.g. 4-methyl-2-oxopentanoate to leucine) and aminated in the last step, thus keeping the number of non-proteinogenic amino acid intermediates fairly low.
In addition to primary metabolism, several non-proteinogenic amino acids are precursors or final products of secondary metabolism, which makes compounds such as toxins.
Prebiotic amino acids and alternative biochemistries
In meteorites and in prebiotic experiments (e.g. Miller–Urey experiment) many more amino acids than the twenty standard amino acids are found, several of them at higher concentrations than the standard ones. It has been conjectured that if amino acid based life were to arise in parallel elsewhere in the universe, no more than 75% of the amino acids would be in common. The most notable anomaly is the lack of aminobutyric acid.
|Proportion of amino acids relative to glycine (%)|
|Molecule||Electric discharge||Murchison meteorite|
|N-ethyl alanine||< 0.05|
|N-ethyl β-alanine||< 0.05|
Straight side chain 
The genetic code has been described as a frozen accident, and the reason why there is only one standard amino acid with a straight chain (alanine) could simply be redundancy with valine, leucine and isoleucine. However, straight-chained amino acids are reported to form much more stable alpha helices.
Serine, homoserine, O-methyl-homoserine and O-ethyl-homoserine possess a hydroxymethyl, hydroxyethyl, O-methyl-hydroxyethyl and O-ethyl-hydroxyethyl side chain, respectively. Cysteine, homocysteine, methionine and ethionine possess the thiol equivalents. The selenol equivalents are selenocysteine, selenohomocysteine, selenomethionine and selenoethionine. Amino acids with the next chalcogen down are also found in nature: several species, such as Aspergillus fumigatus, Aspergillus terreus and Penicillium chrysogenum, are able in the absence of sulfur to produce tellurocysteine and telluromethionine and incorporate them into protein.
Hydroxyglycine, an amino acid with a hydroxyl side-chain, is highly unstable.
Expanded genetic code 
Post-translationally incorporated into protein
Despite not being encoded by the genetic code as proteinogenic amino acids, some non-standard amino acids are nevertheless found in proteins. These are formed by post-translational modification of the side chains of standard amino acids present in the target protein. These modifications are often essential for the function or regulation of a protein; for example, in γ-carboxyglutamate the carboxylation of glutamate allows for better binding of calcium cations, and in hydroxyproline the hydroxylation of proline is critical for maintaining connective tissues. Another example is the formation of hypusine in the translation initiation factor eIF5A, through modification of a lysine residue. Such modifications can also determine the localization of the protein, e.g., the addition of long hydrophobic groups can cause a protein to bind to a phospholipid membrane.
Hypusine. This amino acid is obtained by adding to the ε-amino group of a lysine a 4-aminobutyl moiety (obtained from spermidine)
Toxic analogues 
Several non-proteinogenic amino acids, such as thialysine, are toxic due to their ability to mimic certain properties of proteinogenic amino acids. Some non-proteinogenic amino acids are neurotoxic by mimicking amino acids used as neurotransmitters (i.e. not for protein biosynthesis), e.g. quisqualic acid, canavanine or azetidine-2-carboxylic acid.
Not amino acids 
Taurine is an amino sulfonic acid and not an amino acid; however, it is occasionally considered as such because the amounts required to suppress the auxotrophy in certain organisms (e.g. cats) are closer to those of "essential amino acids" (amino acid auxotrophy) than of vitamins (cofactor auxotrophy).