Expanded genetic code
An expanded genetic code refers to an artificially modified genetic code in which one or more specific codons have been allocated to encode an amino acid that is not among the 20 "standard" amino acids.
"Standard" or "natural" amino acids are the 20 proteinogenic alpha-amino acids that in nature are the building-blocks of all proteins within humans and other eukaryotes, and that are also directly encoded by the genetic code. All others are known as "non-standard", "non-canonical", or "unnatural".
In May 2014, researchers announced that they had successfully introduced two new artificial nucleotides into bacterial DNA, and by including individual artificial nucleotides in the culture media, were able to passage the bacteria 24 times; they did not create mRNA or proteins able to use the artificial nucleotides.
The translation of genetic information contained in messenger RNA (mRNA) into a protein is catalysed by ribosomes. Transfer RNAs (tRNAs) are used as keys to decode the mRNA into its encoded polypeptide. Each tRNA recognizes a specific three-nucleotide codon in the mRNA through a complementary sequence, called the anticodon, on one of its loops. Each three-nucleotide sense codon is translated into one of the twenty naturally occurring amino acids. There is at least one tRNA for every sense codon, and sometimes multiple codons code for the same amino acid. Many tRNAs are compatible with several codons. An enzyme called an aminoacyl tRNA synthetase covalently attaches the amino acid to the appropriate tRNA. Most cells have a different synthetase for each amino acid (20 synthetases). Some bacteria, however, have fewer than 20 aminoacyl tRNA synthetases and introduce the "missing" amino acid(s) by modifying a structurally related amino acid with an amidotransferase enzyme. Attachment of an amino acid to tRNA uses energy from ATP. The aminoacyl tRNA synthetase often does not recognize the anticodon but another part of the tRNA, meaning that if the anticodon were mutated, the encoding of that amino acid would change to a new codon.
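The codon–anticodon relationship described above can be illustrated with a minimal Python sketch. It assumes strict Watson–Crick pairing and ignores wobble pairing and modified bases, so it is a simplification rather than a model of real tRNA decoding; the function names are illustrative only.

```python
# Watson-Crick pairing for RNA bases (wobble pairing is deliberately ignored).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with a codon (5'->3')."""
    # Pairing is antiparallel: complement each base and reverse the order.
    return "".join(PAIR[base] for base in reversed(codon))


def codon_for(anticodon: str) -> str:
    """Return the codon (5'->3') read by a given anticodon (5'->3')."""
    # The pairing relation is symmetric, so the same operation applies.
    return anticodon_for(anticodon)


tyr_anticodon = anticodon_for("UAC")    # UAC is a tyrosine codon
amber_anticodon = anticodon_for("UAG")  # UAG is the amber stop codon
print(tyr_anticodon, amber_anticodon)
```

In this simplified picture, mutating a tRNA's anticodon to the value returned for the amber codon redirects that tRNA, and whatever amino acid its synthetase attaches, to the amber codon.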
In the ribosome, the information in mRNA is translated into a specific amino acid when the mRNA codon matches with the complementary anticodon of a tRNA, and the attached amino acid is added onto a growing polypeptide chain. When it is released from the ribosome, the polypeptide chain folds into a functioning protein.
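As a rough illustration of this decoding step, the following Python sketch translates an mRNA string using the standard genetic code. It is a simplification under stated assumptions: translation starts at the first AUG, reads in frame, and stops at the first stop codon; the function and variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    # Step through complete codons only.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the chain is released
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("AUGUUUUACUAG"))
```

The table maps 64 codons onto 20 amino acids plus stop signals, which is why several codons can encode the same amino acid.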
There are a few restrictions on the tRNA, synthetase, codon, and unnatural amino acid (Uaa) being incorporated into a protein. For successful translation of a novel amino acid, the codon to which the unnatural amino acid is assigned cannot already code for one of the 20 natural amino acids. Usually a nonsense codon (stop codon) or a four-base codon is used. Together, the tRNA, aminoacyl tRNA synthetase, and codon are called an orthogonal set. The orthogonal set must not crosstalk with the endogenous tRNA and synthetase sets, while still being functionally compatible with the ribosome and other components of the translation apparatus. The active site of the synthetase is modified to accept only the non-natural amino acid. The synthetase is also modified to recognize only the orthogonal tRNA. The tRNA–synthetase pair is often engineered in other bacteria or eukaryotic cells. The unnatural amino acid must be able to permeate the cytoplasm when it is added to the growth medium of the cell.
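The codon restriction can be sketched in Python as follows. This is a toy model, not a description of any real system: it handles only three-base codons (four-base codons are not modeled), reads the mRNA in frame from its first base, and uses the hypothetical label "Uaa" for the unnatural amino acid.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code listed in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_expanded(mrna, reassigned=None):
    """Translate in frame from the first base; return a list of residues.

    `reassigned` maps a stop codon to the label of an unnatural amino acid.
    """
    reassigned = reassigned or {}
    for codon in reassigned:
        # Only codons that do not encode a natural amino acid may be reused.
        if STANDARD.get(codon) != "*":
            raise ValueError(f"{codon} is not a stop codon in the standard code")
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in reassigned:
            residues.append(reassigned[codon])  # suppressed stop codon
            continue
        amino_acid = STANDARD[codon]
        if amino_acid == "*":
            break  # an unassigned stop codon still terminates translation
        residues.append(amino_acid)
    return residues


mrna = "AUGUAGGGCUAA"
print(translate_expanded(mrna))                  # amber codon acts as stop
print(translate_expanded(mrna, {"UAG": "Uaa"}))  # amber codon reassigned
```

With the amber codon reassigned, translation continues past UAG and terminates at the next, unassigned stop codon (UAA in this example).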
The possibility of reassigning codons was realized by Normanly et al. in 1990, when a viable mutant strain of E. coli read through the amber (stop) codon. As a result, the amber codon became the codon of choice for assignment to a novel amino acid. Later, in the Schultz lab, the tRNATyr/tyrosyl-tRNA synthetase (TyrRS) pair from Methanococcus jannaschii, an archaeon, was used to introduce a tyrosine instead of STOP, the default value of the amber codon. This was possible because of the differences between the endogenous bacterial synthetases and the orthogonal archaeal synthetase, which do not recognize each other's tRNAs.
A similar earlier concept is that of alloproteins, which are made by incubating cells with an unnatural amino acid in the absence of a similar coded amino acid, so that the former is incorporated into protein in place of the latter, for example L-2-aminohexanoic acid (Ahx) for methionine (Met).
This orthogonal set can then be mutated and screened through directed evolution to accept a different, even novel, amino acid. Mutations can be introduced into the plasmid containing the pair by error-prone PCR or through degenerate primers targeting the synthetase's active site. Selection involves multiple rounds of a two-step process. In the first step, the plasmid is transferred into cells expressing chloramphenicol acetyl transferase with a premature amber codon. In the presence of toxic chloramphenicol and the non-natural amino acid, the surviving cells will have overridden the amber codon using the orthogonal tRNA aminoacylated with either a standard amino acid or the non-natural one. To remove the former, in the second step the plasmid is inserted into cells carrying a (toxic) barnase gene with a premature amber codon and grown without the non-natural amino acid; this removes all the orthogonal synthetases that do not specifically recognize the non-natural amino acid. In addition to being recoded to a different codon, tRNAs can be mutated to recognize a four-base codon, allowing additional free coding options. The non-natural amino acid, as a result, introduces diverse physicochemical and biological properties that can be used as a tool to explore protein structure and function or to create novel or enhanced proteins for practical purposes.
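The logic of the two-step selection can be sketched as a small Python simulation. This is a deliberately idealized model: each synthetase variant is reduced to two yes/no properties, survival is deterministic, and the class, field, and variant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    name: str
    charges_natural: bool  # charges the orthogonal tRNA with a standard amino acid
    charges_uaa: bool      # charges it with the non-natural amino acid


def suppresses_amber(variant, uaa_in_medium):
    """True if the variant yields a charged suppressor tRNA in this medium."""
    return variant.charges_natural or (uaa_in_medium and variant.charges_uaa)


def positive_selection(library):
    # Chloramphenicol plus the non-natural amino acid: a cell survives only if
    # the amber codon in the resistance gene is read through.
    return [v for v in library if suppresses_amber(v, uaa_in_medium=True)]


def negative_selection(library):
    # Barnase gene with an amber codon, no non-natural amino acid: read-through
    # produces the toxin, so suppressing variants are eliminated.
    return [v for v in library if not suppresses_amber(v, uaa_in_medium=False)]


def select(library, rounds=1):
    for _ in range(rounds):
        library = negative_selection(positive_selection(library))
    return library


library = [
    SynthetaseVariant("inactive", False, False),
    SynthetaseVariant("natural_only", True, False),
    SynthetaseVariant("promiscuous", True, True),
    SynthetaseVariant("uaa_specific", False, True),
]
print([v.name for v in select(library)])
```

In this idealized model a single round suffices; multiple rounds matter in practice because real selections are leaky, which the sketch does not attempt to capture.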
The orthogonal pairs of synthetase and tRNA that work for one organism may not work for another, as the synthetase may mis-aminoacylate endogenous tRNAs or the tRNA may itself be mis-aminoacylated by an endogenous synthetase. As a result, the sets created to date differ between organisms.
Orthogonal sets in E. coli
- tRNATyr–TyrRS pair from the archaeon Methanococcus jannaschii
- tRNALys–LysRS pair from the archaeon Pyrococcus horikoshii
- tRNAGlu–GluRS pair from Methanosarcina mazei
- leucyl-tRNA synthetase from Methanobacterium thermoautotrophicum and a mutant leucyl tRNA derived from Halobacterium sp.
Orthogonal sets in yeast
- tRNATyr–TyrRS pair from Escherichia coli
- tRNALeu–LeuRS pair from Escherichia coli
- tRNAiMet from human and GlnRS from Escherichia coli
Orthogonal sets in mammalian cells
- tRNATyr–TyrRS pair from Bacillus stearothermophilus
- modified tRNATrp–TrpRS pair from Bacillus subtilis trp
- tRNALeu–LeuRS pair from Escherichia coli
Unnatural base pair (UBP)
An unnatural base pair (UBP) is a designed subunit (or nucleobase) of DNA which is created in a laboratory and does not occur in nature. A demonstration of UBPs was achieved in vitro by Ichiro Hirao's group at the RIKEN institute in Japan. In 2002, they developed an unnatural base pair between 2-amino-8-(2-thienyl)purine (s) and pyridine-2-one (y) that functions in vitro in transcription and translation for the site-specific incorporation of non-standard amino acids into proteins. In 2006, they created 7-(2-thienyl)imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa) as a third base pair for replication and transcription. Afterward, Ds and 4-[3-(6-aminohexanamido)-1-propynyl]-2-nitropyrrole (Px) were discovered to form a high-fidelity pair in PCR amplification. In 2013, they applied the Ds–Px pair to DNA aptamer generation by in vitro selection (SELEX) and demonstrated that genetic alphabet expansion significantly augments DNA aptamer affinities for target proteins.
In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the Scripps Research Institute in San Diego, California, published the design of an unnatural base pair (UBP). The two new artificial nucleotides forming the pair were named d5SICS and dNaM. More technically, these artificial nucleotides, which bear hydrophobic nucleobases, feature two fused aromatic rings that form a (d5SICS–dNaM) complex or base pair in DNA. In 2014, the same team reported that they had synthesized a stretch of circular DNA, known as a plasmid, containing natural T-A and C-G base pairs along with the best-performing UBP Romesberg's laboratory had designed, and had inserted it into cells of the common bacterium E. coli, which successfully replicated the unnatural base pair through multiple generations. This is the first known example of a living organism passing along an expanded genetic code to subsequent generations. It was achieved in part by the addition of a supportive algal gene that expresses a nucleotide triphosphate transporter, which efficiently imports the triphosphates of both nucleotides (d5SICSTP and dNaMTP) into E. coli. The natural bacterial replication pathways then use them to accurately replicate the plasmid containing d5SICS–dNaM.
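Replication with a six-letter alphabet can be pictured with a short Python sketch in which the pairing table simply gains one extra pair. The one-letter symbols X and Y are hypothetical stand-ins for d5SICS and dNaM, chosen here for illustration only; the sketch assumes perfect fidelity and says nothing about the actual chemistry.

```python
# X and Y are hypothetical one-letter stand-ins for d5SICS and dNaM.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C", "X": "Y", "Y": "X"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(strand))


template = "ATXGC"
copy = reverse_complement(template)  # first round of copying
restored = reverse_complement(copy)  # second round restores the template
print(copy, restored)
```

Because the unnatural pair is treated like any other entry in the table, it is retained through successive rounds of copying, which is the behaviour reported for the plasmid.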
The successful incorporation of a third base pair into a living microorganism is a significant breakthrough toward the goal of greatly expanding the number of amino acids that can be encoded by DNA, from the existing 20 amino acids to a theoretically possible 172, thereby expanding the potential for living organisms to produce novel proteins. The artificial strings of DNA do not yet encode anything, but scientists speculate that they could be designed to manufacture new proteins with industrial or pharmaceutical uses.
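The text does not say how the figure of 172 is derived. One reading that reproduces it, offered here as an assumption rather than as the source's reasoning, is that every three-base codon containing at least one unnatural base is assigned to a new amino acid, while the 20 existing amino acids keep their current codons:

```python
def codon_count(n_bases: int, codon_length: int = 3) -> int:
    """Number of distinct codons for an alphabet of n_bases letters."""
    return n_bases ** codon_length


natural_codons = codon_count(4)   # four-letter alphabet: A, C, G, T
expanded_codons = codon_count(6)  # six-letter alphabet with one unnatural pair

# Codons containing at least one unnatural base.
new_codons = expanded_codons - natural_codons

# Assumption: each new codon encodes a distinct new amino acid.
max_amino_acids = 20 + new_codons
print(natural_codons, expanded_codons, new_codons, max_amino_acids)
```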
With an expanded genetic code, the unnatural amino acid can be genetically directed to any chosen site in the protein of interest. Proteins with non-natural amino acids are called “alloproteins”. The high efficiency and fidelity of this process allow better control over the placement of the modification than modifying the protein post-translationally, which, in general, will target all amino acids of the same type, for example through the thiol group of cysteine or the amino group of lysine. Also, an expanded genetic code allows modifications to be carried out in vivo. The ability to site-specifically direct lab-synthesized chemical moieties into proteins allows many types of studies that would otherwise be extremely difficult.
- Probing protein structure and function: By using amino acids of slightly different size, such as O-methyltyrosine or dansylalanine instead of tyrosine, and by inserting genetically coded reporter moieties (color-changing and/or spin-active) into selected protein sites, chemical information about the protein's structure and function can be measured.
- Identifying and regulating protein activity: By using photocaged amino acids, protein function can be "switched" on or off by illuminating the organism.
- Changing the mode of action of a protein: One can start with the gene for a protein that binds a certain sequence of DNA and, by inserting a chemically active amino acid into the binding site, convert it to a protein that cuts the DNA rather than binding it.
- Improving immunogenicity and overcoming self-tolerance: By replacing strategically chosen tyrosines with p-nitrophenylalanine, a tolerated self-protein can be made immunogenic.
Unnatural amino acids can introduce unique chemical properties and reactivities into proteins. Alloproteins can be used as molecular switches for signal pathways, as photocrosslinkers, or as fluorescently labeled probes. The creation of alloproteins presents a way to expand the structural and chemical diversity of proteins.
One possible application of this method is biomedical: "chemical warheads" that target specific cellular components can be added to proteins.
See also
- List of genetic codes
- Directed evolution
- Nucleic acid analogue
- Protein labelling
- Protein methods
- Synthetic biology
References
- Xie, J; Schultz, PG (2005). "Adding amino acids to the genetic repertoire". Current Opinion in Chemical Biology 9 (6): 548–54. doi:10.1016/j.cbpa.2005.10.011. PMID 16260173.
- Modeling Electrostatic Contributions to Protein Folding and Binding - Tjong, p.1 footnote
- Frontiers in Drug Design and Discovery ed. Atta-Ur-Rahman & others, p.299
- Elzanowski A, Ostell J (2008-04-07). "The Genetic Codes". National Center for Biotechnology Information (NCBI). Retrieved 2010-03-10.
- Pollack, Andrew (May 7, 2014). "Researchers Report Breakthrough in Creating Artificial Genetic Code". New York Times. Retrieved May 7, 2014.
- Callaway, Ewen (May 7, 2014). "First life with 'alien' DNA". Nature. doi:10.1038/nature.2014.15179. Retrieved May 7, 2014.
- Malyshev, Denis A.; Dhami, Kirandeep; Lavergne, Thomas; Chen, Tingjian; Dai, Nan; Foster, Jeremy M.; Corrêa, Ivan R.; Romesberg, Floyd E. (May 7, 2014). "A semi-synthetic organism with an expanded genetic alphabet". Nature. doi:10.1038/nature13314. Retrieved May 7, 2014.
- Amos, Jonathan (8 May 2014). "Semi-synthetic bug extends ‘life's alphabet’". BBC News. Retrieved 2014-05-09.
- Wang, L.; Brock, A.; Herberich, B.; Schultz, P. G. (April 2001). "Expanding the Genetic Code of Escherichia coli". Science 292 (5516): 498–500. doi:10.1126/science.1060077. PMID 11313494.
- Alberts, Bruce; et al. (2008). Molecular Biology of the Cell (5th ed.). New York: Garland Science. ISBN 0815341059.
- Woese, Carl; et al. (2000). "Aminoacyl-tRNA synthetases, the genetic code, and the evolutionary process". Microbiol. Mol. Biol. Rev. 64: 202–236.
- Minnihan, Ellen C; Yokoyama, Kenichi; Stubbe, JoAnne (Nov 2009). "Unnatural amino acids: better than the real things?". F1000 Biology Reports 1 (88). doi:10.3410/B1-88.
- Sakamoto, K. (2002). "Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells". Nucleic Acids Research 30 (21): 4692–4699. doi:10.1093/nar/gkf589. PMC 135798. PMID 12409460.
- Normanly, J; Kleina, L.G.; Masson, J.M.; Abelson, J.; Miller, J.H. (1990). "Construction of Escherichia coli amber suppressor tRNA genes. III. Determination of tRNA specificity". J. Mol. Biol. 213 (4): 719–726. doi:10.1016/S0022-2836(05)80258-X. PMID 2141650.
- Wang, L.; Magliery, T.J.; Liu, D.R.; Schultz, P.G. (2000). "A new functional suppressor tRNA/aminoacyl-tRNA synthetase pair for the in vivo incorporation of unnatural amino acids into proteins". J. Am. Chem. Soc. 122 (20): 5010–5011. doi:10.1021/ja000595y.
- Koide, H.; Yokoyama, S.; Kawai, G.; Ha, J. M.; Oka, T.; Kawai, S.; Miyake, T.; Fuwa, T.; Miyazawa, T. (1988). "Biosynthesis of a protein containing a nonprotein amino acid by Escherichia coli: L-2-aminohexanoic acid at position 21 in human epidermal growth factor". Proceedings of the National Academy of Sciences of the United States of America 85 (17): 6237–6241. doi:10.1073/pnas.85.17.6237. PMC 281944. PMID 3045813.
- Watanabe, T; Muranaka, N; Hohsaka, T. (2008). "Four-base codon-mediated saturation mutagenesis in a cell-free translation system". J Biosci Bioeng 105 (3): 211–5. doi:10.1263/jbb.105.211. PMID 18397770.
- Anderson, J.C.; Wu, N.; Santoro, S.W.; Lakshman, V.; King, D.S.; Schultz, P.G. (2004). "An expanded genetic code with a functional quadruplet codon". Proc Natl Acad Sci USA 101 (20): 7566–7571. doi:10.1073/pnas.0401517101. PMC 419646. PMID 15138302.
- Santoro, S.W.; Anderson, J.C.; Lakshman, V.; Schultz, P.G. (2003). "An archaebacteria-derived glutamyl-tRNA synthetase and tRNA pair for unnatural amino acid mutagenesis of proteins in Escherichia coli". Nucleic Acids Res 31 (23): 6700–6709. doi:10.1093/nar/gkg903. PMC 290271. PMID 14627803.
- Anderson, J.C.; Schultz, P.G. (2003). "Adaptation of an orthogonal archaeal leucyl-tRNA and synthetase pair for four-base, amber, and opal suppression". Biochemistry 42 (32): 9598–9608. doi:10.1021/bi034550w. PMID 12911301.
- Chin, J.W.; Cropp, T.A.; Anderson, J.C.; Mukherji, M.; Zhang, Z.; Schultz, P.G. (2003). "An expanded eukaryotic genetic code". Science 301 (5635): 964–967. doi:10.1126/science.1084772. PMID 12920298.
- Wu, N.; Deiters, A.; Cropp, T.A.; King, D.; Schultz, P.G. (2004). "A genetically encoded photocaged amino Acid". J Am Chem Soc 126 (44): 14306–14307. doi:10.1021/ja040175z. PMID 15521721.
- Kowal, A.K.; Kohrer, C.; RajBhandary, U.L. (2001). "Twenty-first aminoacyl-tRNA synthetase–suppressor tRNA pairs for possible use in site-specific incorporation of amino acid analogues into proteins in eukaryotes and in eubacteria". Proc Natl Acad Sci USA 98 (5): 2268–2273. doi:10.1073/pnas.031488298. PMC 30127. PMID 11226228.
- Sakamoto, K.; Hayashi, A.; Sakamoto, A.; Kiga, D.; Nakayama, H.; Soma, A.; Kobayashi, T.; Kitabatake, M. et al. (2002). "Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells". Nucleic Acids Res. 30 (21): 4692–4699. doi:10.1093/nar/gkf589. PMC 135798. PMID 12409460.
- Zhang, Z.; Alfonta, L.; Tian, F.; Bursulaya, B.; Uryu, S.; King, D.S.; Schultz, P.G. (2004). "Selective incorporation of 5-hydroxytryptophan into proteins in mammalian cells". Proc. Natl. Acad. Sci. USA 101 (24): 8882–8887. doi:10.1073/pnas.0307029101. PMC 428441. PMID 15187228.
- Wang, W.; Takimoto, J.; Louie, G.V.; Baiga, T.J.; Noel, J.P.; Lee, K.F.; Slesinger, P.A.; Wang, L. (2007). "Genetically encoding unnatural amino acids for cellular and neuronal studies". Nat. Neurosci 10 (8): 1063–1072. doi:10.1038/nn1932. PMC 2692200. PMID 17603477.
- Hirao, I. et al. (2002) An unnatural base pair for incorporating amino acid analogs into proteins. Nat. Biotechnol. 20, 177-182
- Hirao, I. et al. (2006) An unnatural hydrophobic base pair system: site-specific incorporation of nucleotide analogs into DNA and RNA. Nat. Methods 3, 729-735
- Kimoto, M. et al. (2009) An unnatural base pair system for efficient PCR amplification and functionalization of DNA molecules. Nucleic Acids Res. 37, e14
- Yamashige, R. et al. (2012) Highly specific unnatural base pair systems as a third base pair for PCR amplification. Nucleic Acids Res. 40, 2793-2806
- Kimoto, M. et al. (2013) Generation of high-affinity DNA aptamers using an expanded genetic alphabet. Nat. Biotechnol. 31, 453-457
- Malyshev, Denis A.; Dhami, Kirandeep; Quach, Henry T.; Lavergne, Thomas; Ordoukhanian, Phillip (24 July 2012). "Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet". Proceedings of the National Academy of Sciences of the United States of America (PNAS) 109 (30): 12005–12010. doi:10.1073/pnas.1205176109. Retrieved 2014-05-11.
- Callaway, Ewen (May 7, 2014). "Scientists Create First Living Organism With 'Artificial' DNA". Nature News (Huffington Post). Retrieved 8 May 2014.
- Fikes, Bradley J. (May 8, 2014). "Life engineered with expanded genetic code". San Diego Union Tribune. Retrieved 8 May 2014.
- Sample, Ian (May 7, 2014). "First life forms to pass on artificial DNA engineered by US scientists". The Guardian. Retrieved 8 May 2014.
- Pollack, Andrew (May 7, 2014). "Scientists Add Letters to DNA’s Alphabet, Raising Hope and Fear". New York Times. Retrieved 8 May 2014.
- Wang, Q; Parrish, AR; Wang, L (2009). "Expanding the Genetic Code for Biological Studies". Chemistry & biology 16 (3): 323–36. doi:10.1016/j.chembiol.2009.03.001. PMC 2696486. PMID 19318213.
- Liu, CC; Mack, AV; Brustad, EM; Mills, JH; Groff, D; Smider, VV; Schultz, PG. (2009). "The Evolution of Proteins with Genetically Encoded "Chemical Warheads"". J Am Chem Soc. 131 (28): 9616–7. doi:10.1021/ja902985e. PMC 2745334. PMID 19555063.