Expanded genetic code
An expanded genetic code refers to an artificially modified genetic code in which one or more specific codons have been allocated to encode an amino acid which is not among the 20 "standard" amino acids.
"Standard" amino acids are the 20 proteinogenic alpha-amino acids which in nature are the building blocks of all proteins within humans and other eukaryotes, and which are also directly encoded by the universal genetic code. All others are known as "non-standard".
The translation of genetic information contained in messenger RNA (mRNA) into a protein is catalysed by ribosomes. Transfer RNAs (tRNAs) are used as keys to decode the mRNA into its encoded polypeptide. A tRNA recognizes a specific three-nucleotide codon in the mRNA through a complementary sequence, called the anticodon, on one of its loops. Each sense codon is translated into one of the twenty standard amino acids. There is at least one tRNA for every sense codon, and several codons often code for the same amino acid. Many tRNAs are compatible with several codons. An enzyme called an aminoacyl-tRNA synthetase covalently attaches the amino acid to the appropriate tRNA, using energy from ATP. Most cells have a different synthetase for each amino acid (20 synthetases). Some bacteria, however, have fewer than 20 aminoacyl-tRNA synthetases and introduce the "missing" amino acid(s) by modifying a structurally related amino acid with an amidotransferase enzyme. The aminoacyl-tRNA synthetase often recognizes not the anticodon but another part of the tRNA, so if the anticodon were mutated, that amino acid would be encoded by a new codon.
In the ribosome, the information in mRNA is translated into a specific amino acid when the mRNA codon matches with the complementary anticodon of a tRNA, and the attached amino acid is added onto a growing polypeptide chain. When it is released from the ribosome, the polypeptide chain folds into a functioning protein.
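As a rough illustration of the decoding described above, the following Python sketch reads an mRNA three nucleotides at a time and looks each codon up in the standard genetic code. The function name `translate` and the example sequence are made up for illustration, and the sketch ignores start-codon selection, wobble pairing and everything else the ribosome does.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in one-letter amino acid symbols ('*' = stop),
# listed in the conventional order: first, second, then third base over U, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, table=CODON_TABLE):
    """Decode an mRNA string codon by codon until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Illustrative sequence: AUG UUU GGC UAG
print(translate("AUGUUUGGCUAG"))  # MFG
```

Several codons map to the same letter in `AMINO_ACIDS`, which is the degeneracy mentioned above.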
Several restrictions apply to the tRNA, synthetase, codon, and unnatural amino acid (Uaa) being incorporated into a protein. For a novel amino acid to be translated successfully, the codon to which it is assigned must not already code for one of the 20 natural amino acids. Usually a nonsense codon (stop codon) or a four-base codon is used. Together, the tRNA, aminoacyl-tRNA synthetase, and codon are called an orthogonal set. The orthogonal set must not crosstalk with the endogenous tRNA and synthetase sets, while remaining functionally compatible with the ribosome and other components of the translation apparatus. The active site of the synthetase is modified to accept only the non-natural amino acid, and the synthetase is also modified to recognize only the orthogonal tRNA. The tRNA-synthetase pair is often engineered in other bacteria or eukaryotic cells. The unnatural amino acid must be able to reach the cytoplasm when it is added to the growth medium of the cell.
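The "no crosstalk" requirement can be stated as a small check. The sketch below is only a bookkeeping model: it assumes that aminoacylation is known as a set of (synthetase, tRNA) pairs, and all names in it (`is_orthogonal`, `oRS`, `o_tRNA`, the host entries) are hypothetical placeholders rather than measured data. Compatibility with the ribosome is not modelled.

```python
def is_orthogonal(o_synthetase, o_trna, host_synthetases, host_trnas, charges):
    """Return True if the candidate pair works and does not crosstalk with the host.

    `charges` is a set of (synthetase, tRNA) tuples meaning
    "this synthetase aminoacylates this tRNA".
    """
    pair_works = (o_synthetase, o_trna) in charges
    # The candidate synthetase must not charge any host tRNA.
    synthetase_leaks = any((o_synthetase, t) in charges for t in host_trnas)
    # No host synthetase may charge the candidate tRNA.
    trna_hijacked = any((s, o_trna) in charges for s in host_synthetases)
    return pair_works and not synthetase_leaks and not trna_hijacked


# Hypothetical host with two synthetase/tRNA pairs plus one introduced pair.
host_synthetases = ["host_TyrRS", "host_LeuRS"]
host_trnas = ["host_tRNA_Tyr", "host_tRNA_Leu"]
charges = {
    ("host_TyrRS", "host_tRNA_Tyr"),
    ("host_LeuRS", "host_tRNA_Leu"),
    ("oRS", "o_tRNA"),
}

print(is_orthogonal("oRS", "o_tRNA", host_synthetases, host_trnas, charges))  # True

# If a host synthetase also charged the introduced tRNA, the set would fail.
leaky = charges | {("host_LeuRS", "o_tRNA")}
print(is_orthogonal("oRS", "o_tRNA", host_synthetases, host_trnas, leaky))  # False
```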
The possibility of reassigning codons was realized by Normanly et al. in 1990 when a viable mutant strain of E. coli read through the amber (stop) codon. As a result, the amber codon became the codon of choice for assignment to a novel amino acid. Later, in the Schultz lab, the tRNATyr/tyrosyl-tRNA synthetase (TyrRS) pair from Methanococcus jannaschii, an archaeon, was used to introduce a tyrosine instead of STOP, the default value of the amber codon. As mentioned, this was possible because the endogenous bacterial synthetases and the orthogonal archaeal synthetase do not recognize each other's tRNAs.
A similar, earlier concept is that of alloproteins, which are made by incubating cells with an unnatural amino acid in the absence of a similar coded amino acid, so that the former is incorporated into protein in place of the latter, for example L-2-aminohexanoic acid (Ahx) in place of methionine (Met).
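The effect of reassigning the amber codon (UAG), or of adding a four-base codon, can be pictured with a toy decoder. This is a sketch under simplifying assumptions: the codon table is a small hand-picked subset, the letter "X" stands for an arbitrary unnatural amino acid, the quadruplet AGGA is just an example, and a quadruplet is assumed to win every time over triplet reading, which real suppression does not guarantee.

```python
# Toy subset of the standard code; '*' marks a stop codon.
TOY_TABLE = {
    "AUG": "M", "UUU": "F", "GGC": "G", "AGG": "R",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def translate(mrna, table, quadruplets=None):
    """Decode mRNA, trying any registered four-base codon before the triplet."""
    quadruplets = quadruplets or {}
    peptide, i = [], 0
    while i + 3 <= len(mrna):
        quad = mrna[i:i + 4]
        if quad in quadruplets:  # four-base codon read by a mutated tRNA
            peptide.append(quadruplets[quad])
            i += 4
            continue
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
        i += 3
    return "".join(peptide)


# Host code: the premature amber codon truncates the protein.
print(translate("AUGUAGGGCUAA", TOY_TABLE))  # M

# Expanded code: UAG is reassigned to the unnatural amino acid "X".
expanded = {**TOY_TABLE, "UAG": "X"}
print(translate("AUGUAGGGCUAA", expanded))  # MXG

# Four-base codon: AGGA is decoded as "X" instead of AGG followed by a frameshift.
print(translate("AUGAGGAGGCUAA", TOY_TABLE, {"AGGA": "X"}))  # MXG
```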
Directed evolution 
This orthogonal set can then be mutated and screened through directed evolution to accept a different, even novel, amino acid. Mutations can be introduced into the plasmid containing the pair by error-prone PCR or by using degenerate primers for the synthetase's active site. Selection involves multiple rounds of a two-step process. First, the plasmid is transferred into cells expressing chloramphenicol acetyl transferase with a premature amber codon. In the presence of toxic chloramphenicol and the non-natural amino acid, the surviving cells will have overridden the amber codon using the orthogonal tRNA aminoacylated with either a standard amino acid or the non-natural one. To remove the former, the plasmid is then inserted into cells carrying a (toxic) barnase gene with a premature amber codon and grown without the non-natural amino acid, which removes all the orthogonal synthetases that do not specifically recognize the non-natural amino acid. Besides being recoded to a different codon, the tRNA can be mutated to recognize a four-base codon, allowing additional free coding options. The non-natural amino acid thus introduces diverse physicochemical and biological properties, which can be used as a tool to explore protein structure and function or to create novel or enhanced proteins for practical purposes.
The orthogonal pairs of synthetase and tRNA which work for one organism may not work for another, as the synthetase may mis-aminoacylate endogenous tRNAs or the tRNA may itself be mis-aminoacylated by an endogenous synthetase. As a result, the sets created to date differ between organisms.
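The logic of the two-step selection can be sketched as two filters over a library of synthetase variants. The model is deliberately crude and rests on stated assumptions: each variant is reduced to two yes/no properties, survival is all-or-nothing, and the class, function and variant names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    charges_uaa: bool      # aminoacylates the tRNA with the non-natural amino acid
    charges_natural: bool  # aminoacylates the tRNA with a standard amino acid


def suppresses_amber(variant, uaa_present):
    """The amber codon is read through only if the tRNA gets charged."""
    return variant.charges_natural or (uaa_present and variant.charges_uaa)


def positive_selection(library):
    # Chloramphenicol plus Uaa: cells survive only if the amber codon in the
    # chloramphenicol acetyl transferase gene is read through.
    return [v for v in library if suppresses_amber(v, uaa_present=True)]


def negative_selection(library):
    # No Uaa: reading through the amber codon in the barnase gene kills the cell.
    return [v for v in library if not suppresses_amber(v, uaa_present=False)]


library = [
    Variant("inactive", charges_uaa=False, charges_natural=False),
    Variant("natural_only", charges_uaa=False, charges_natural=True),
    Variant("promiscuous", charges_uaa=True, charges_natural=True),
    Variant("uaa_specific", charges_uaa=True, charges_natural=False),
]

survivors = negative_selection(positive_selection(library))
print([v.name for v in survivors])  # ['uaa_specific']
```

The positive step alone keeps three of the four variants; only the negative step separates the specific synthetase from those that also accept standard amino acids.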
Orthogonal sets in E. coli 
- tRNATyr–TyrRS pair from the archaeon Methanococcus jannaschii
- tRNALys–LysRS pair from the archaeon Pyrococcus horikoshii
- tRNAGlu–GluRS pair from Methanosarcina mazei
- leucyl-tRNA synthetase from Methanobacterium thermoautotrophicum and a mutant leucyl tRNA derived from Halobacterium sp.
Orthogonal sets in yeast 
- tRNATyr–TyrRS pair from Escherichia coli
- tRNALeu–LeuRS pair from Escherichia coli
- tRNAiMet from human and GlnRS from Escherichia coli
Orthogonal sets in mammalian cells 
- tRNATyr–TyrRS pair from Bacillus stearothermophilus
- modified tRNATrp–TrpRS pair from Bacillus subtilis
- tRNALeu–LeuRS pair from Escherichia coli
Protein studies 
With an expanded genetic code, the unnatural amino acid can be genetically directed to any chosen site in the protein of interest. Proteins with non-natural amino acids are called "alloproteins". The high efficiency and fidelity of this process allow better control over the placement of the modification than modifying the protein post-translationally, which generally targets all amino acids of the same type, for example through the thiol group of cysteine or the ε-amino group of lysine. An expanded genetic code also allows modifications to be carried out in vivo. The ability to site-specifically direct lab-synthesized chemical moieties into proteins allows many types of studies which would otherwise be extremely difficult.
- Probing protein structure and function: by using amino acids of slightly different size, such as O-methyltyrosine or dansylalanine, instead of tyrosine, and by inserting genetically coded reporter moieties (color-changing and/or spin-active) into selected protein sites, chemical information about the protein's structure and function can be measured.
- Identifying and regulating protein activity: by using photocaged amino acids, protein function can be "switched" on or off by illuminating the organism.
- Changing the mode of action of a protein: one can start with the gene for a protein which binds a certain sequence of DNA and, by inserting a chemically active amino acid into the binding site, convert it to a protein which cuts the DNA rather than binding it.
- Improving immunogenicity and overcoming self-tolerance: by replacing strategically chosen tyrosines with p-nitrophenylalanine, a tolerated self-protein can be made immunogenic.
Unnatural amino acids can introduce unique chemical properties and reactivities into proteins. Alloproteins can be used as molecular switches for signal pathways, as photocrosslinkers, or as fluorescently labeled probes. The creation of alloproteins presents a way to expand the structural and chemical diversity of proteins.
One possible biomedical application of this method is the addition of "chemical warheads" to proteins which target specific cellular components.
See also 
- synthetic biology
- Novel base-pairs
- Directed evolution
- protein labelling
- protein methods
References
- Xie, J; Schultz, PG (2005). "Adding amino acids to the genetic repertoire". Current Opinion in Chemical Biology 9 (6): 548–54. doi:10.1016/j.cbpa.2005.10.011. PMID 16260173.
- Modeling Electrostatic Contributions to Protein Folding and Binding - Tjong, p.1 footnote
- Frontiers in Drug Design and Discovery, ed. Atta-Ur-Rahman & others, p. 299. http://books.google.com/books?id=VoJw6fIISSkC&pg=PA299&lpg=PA299&ots=C20L115r05&sig=4cix7yKNlod3xbzy2TWiOzEe6As&hl=en&sa=X&ei=H81LUL6MOfC10QX4wYG4Cw&ved=0CIcBEOgBMA8
- Elzanowski A, Ostell J (2008-04-07). "The Genetic Codes". National Center for Biotechnology Information (NCBI). Retrieved 2010-03-10.
- Wang, L.; Brock, A.; Herberich, B.; Schultz, P. G. (April 2001). "Expanding the Genetic Code of Escherichia coli". Science 292 (5516): 498–500. doi:10.1126/science.1060077. PMID 11313494.
- Alberts, Bruce; et al. (2008). Molecular Biology of the Cell (5th ed.). New York: Garland Science. ISBN 0815341059.
- Woese, Carl; et al. (2000). "Aminoacyl-tRNA synthetases, the genetic code, and the evolutionary process". Microbiol. Mol. Biol. Rev. 64: 202–236.
- Minnihan, Ellen C; Yokoyama, Kenichi; Stubbe, JoAnne (Nov 2009). "Unnatural amino acids: better than the real things?". F1000 Biology Reports 1 (88). doi:10.3410/B1-88.
- Sakamoto, K. (2002). "Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells". Nucleic Acids Research 30 (21): 4692–4699. doi:10.1093/nar/gkf589. PMC 135798. PMID 12409460.
- Normanly, J; Kleina, L.G.; Masson, J.M.; Abelson, J.; Miller, J.H. (1990). "Construction of Escherichia coli amber suppressor tRNA genes. III. Determination of tRNA specificity". J. Mol. Biol. 213 (4): 719–726. doi:10.1016/S0022-2836(05)80258-X. PMID 2141650.
- Wang, L.; Magliery, T.J.; Liu, D.R.; Schultz, P.G. (2000). "A new functional suppressor tRNA/aminoacyl-tRNA synthetase pair for the in vivo incorporation of unnatural amino acids into proteins". J. Am. Chem. Soc. 122 (20): 5010–5011. doi:10.1021/ja000595y.
- Koide, H.; Yokoyama, S.; Kawai, G.; Ha, J. M.; Oka, T.; Kawai, S.; Miyake, T.; Fuwa, T. et al. (1988). "Biosynthesis of a protein containing a nonprotein amino acid by Escherichia coli: L-2-aminohexanoic acid at position 21 in human epidermal growth factor". Proceedings of the National Academy of Sciences of the United States of America 85 (17): 6237–6241. doi:10.1073/pnas.85.17.6237. PMC 281944. PMID 3045813.
- Watanabe, T; Muranaka, N; Hohsaka, T. (2008). "Four-base codon-mediated saturation mutagenesis in a cell-free translation system". J Biosci Bioeng 105 (3): 211–5. doi:10.1263/jbb.105.211. PMID 18397770.
- Anderson, J.C.; Wu, N.; Santoro, S.W.; Lakshman, V.; King, D.S.; Schultz, P.G. (2004). "An expanded genetic code with a functional quadruplet codon". Proc Natl Acad Sci USA 101 (20): 7566–7571. doi:10.1073/pnas.0401517101. PMC 419646. PMID 15138302.
- Santoro, S.W.; Anderson, J.C.; Lakshman, V.; Schultz, P.G. (2003). "An archaebacteria-derived glutamyl-tRNA synthetase and tRNA pair for unnatural amino acid mutagenesis of proteins in Escherichia coli". Nucleic Acids Res 31 (23): 6700–6709. doi:10.1093/nar/gkg903. PMC 290271. PMID 14627803.
- Anderson, J.C.; Schultz, P.G. (2003). "Adaptation of an orthogonal archaeal leucyl-tRNA and synthetase pair for four-base, amber, and opal suppression". Biochemistry 42 (32): 9598–9608. doi:10.1021/bi034550w. PMID 12911301.
- Chin, J.W.; Cropp, T.A.; Anderson, J.C.; Mukherji, M.; Zhang, Z.; Schultz, P.G. (2003). "An expanded eukaryotic genetic code". Science 301 (5635): 964–967. doi:10.1126/science.1084772. PMID 12920298.
- Wu, N.; Deiters, A.; Cropp, T.A.; King, D.; Schultz, P.G. (2004). "A genetically encoded photocaged amino Acid". J Am Chem Soc 126 (44): 14306–14307. doi:10.1021/ja040175z. PMID 15521721.
- Kowal, A.K.; Kohrer, C.; RajBhandary, U.L. (2001). "Twenty-first aminoacyl-tRNA synthetase–suppressor tRNA pairs for possible use in site-specific incorporation of amino acid analogues into proteins in eukaryotes and in eubacteria". Proc Natl Acad Sci USA 98 (5): 2268–2273. doi:10.1073/pnas.031488298. PMC 30127. PMID 11226228.
- Sakamoto, K.; Hayashi, A.; Sakamoto, A.; Kiga, D.; Nakayama, H.; Soma, A.; Kobayashi, T.; Kitabatake, M. et al. (2002). "Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells". Nucleic Acids Res. 30 (21): 4692–4699. doi:10.1093/nar/gkf589. PMC 135798. PMID 12409460.
- Zhang, Z.; Alfonta, L.; Tian, F.; Bursulaya, B.; Uryu, S.; King, D.S.; Schultz, P.G. (2004). "Selective incorporation of 5-hydroxytryptophan into proteins in mammalian cells". Proc. Natl. Acad. Sci. USA 101 (24): 8882–8887. doi:10.1073/pnas.0307029101. PMC 428441. PMID 15187228.
- Wang, W.; Takimoto, J.; Louie, G.V.; Baiga, T.J.; Noel, J.P.; Lee, K.F.; Slesinger, P.A.; Wang, L. (2007). "Genetically encoding unnatural amino acids for cellular and neuronal studies". Nat. Neurosci 10 (8): 1063–1072. doi:10.1038/nn1932. PMC 2692200. PMID 17603477.
- Wang, Q; Parrish, AR; Wang, L (2009). "Expanding the Genetic Code for Biological Studies". Chemistry & biology 16 (3): 323–36. doi:10.1016/j.chembiol.2009.03.001. PMC 2696486. PMID 19318213.
- Liu, CC; Mack, AV; Brustad, EM; Mills, JH; Groff, D; Smider, VV; Schultz, PG. (2009). "The Evolution of Proteins with Genetically Encoded "Chemical Warheads"". J Am Chem Soc. 131 (28): 9616–7. doi:10.1021/ja902985e. PMC 2745334. PMID 19555063.