Nucleotides are organic molecules that serve as the monomers, or subunits, of nucleic acids such as DNA and RNA. Each nucleotide is composed of a nitrogenous base, a five-carbon sugar (ribose or deoxyribose), and at least one phosphate group. Thus a nucleoside plus a phosphate group yields a nucleotide.
Nucleotides serve to carry packets of energy within the cell in the form of the nucleoside triphosphates (ATP, GTP, CTP and UTP), playing a central role in metabolism. In addition, nucleotides participate in cell signaling (cGMP and cAMP), and are incorporated into important cofactors of enzymatic reactions (e.g. coenzyme A, FAD, FMN, NAD, and NADP+).
A nucleotide is composed of a nucleobase (also termed a nitrogenous base), a five-carbon sugar (either ribose or 2-deoxyribose), and one or, depending on the definition, more than one phosphate group. Authoritative chemistry sources such as the ACS Style Guide and IUPAC Gold Book state that the term nucleotide refers only to a molecule containing one phosphate. However, common usage in molecular biology textbooks often extends this definition to include molecules with two or three phosphate groups. Thus, the term "nucleotide" generally refers to a nucleoside monophosphate, but a nucleoside diphosphate or nucleoside triphosphate could be considered a nucleotide as well.
Without the phosphate group, the nucleobase and sugar compose a nucleoside. The phosphate groups form bonds with either the 2, 3, or 5-carbon of the sugar, with the 5-carbon site most common. Cyclic nucleotides form when the phosphate group is bound to two of the sugar's hydroxyl groups. Nucleotides contain either a purine or a pyrimidine base. Ribonucleotides are nucleotides in which the sugar is ribose. Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.
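The naming scheme described above (base, sugar, number of phosphates) can be illustrated with a small sketch. The helper name `abbreviate` and its parameters are hypothetical, chosen only for this example, and it covers just the common 5'-phosphate abbreviations such as UMP, UDP, and ATP.

```python
# Hypothetical helper: build abbreviations such as AMP, UDP or dATP
# from a base letter, a phosphate count, and the sugar type.
PHOSPHATE_LETTER = {1: "M", 2: "D", 3: "T"}  # mono-, di-, triphosphate


def abbreviate(base_letter: str, phosphates: int, deoxy: bool = False) -> str:
    """Return the usual abbreviation for a nucleoside 5'-phosphate."""
    if len(base_letter) != 1 or base_letter not in "ACGTU":
        raise ValueError(f"unknown base letter: {base_letter!r}")
    if phosphates not in PHOSPHATE_LETTER:
        raise ValueError("phosphate count must be 1, 2 or 3")
    prefix = "d" if deoxy else ""  # 'd' marks a deoxyribonucleotide
    return f"{prefix}{base_letter}{PHOSPHATE_LETTER[phosphates]}P"


print(abbreviate("U", 1))              # UMP
print(abbreviate("A", 3))              # ATP
print(abbreviate("A", 3, deoxy=True))  # dATP
```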
Nucleic acids are polymeric macromolecules made from nucleotide monomers. In DNA, the purine bases are adenine and guanine, while the pyrimidines are thymine and cytosine. RNA uses uracil in place of thymine. Adenine always pairs with thymine through two hydrogen bonds, while guanine pairs with cytosine through three hydrogen bonds, in each case because of the complementary structures of the two bases.
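The pairing rules can be written down as a short sketch. The function names `complement_strand` and `hydrogen_bonds` are illustrative, and the example assumes an idealized DNA duplex in which every base is paired according to the rules above.

```python
# Watson-Crick partners and hydrogen-bond counts for DNA, as stated in the text.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement_strand(strand: str) -> str:
    """Return the base-by-base complement (same order, not reversed)."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def hydrogen_bonds(strand: str) -> int:
    """Count hydrogen bonds holding the strand to its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(complement_strand("ATGC"))  # TACG
print(hydrogen_bonds("ATGC"))     # 2 + 2 + 3 + 3 = 10
```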
In vivo, nucleotides can be synthesized de novo or recycled through salvage pathways. The components used in de novo nucleotide synthesis are derived from biosynthetic precursors of carbohydrate and amino acid metabolism, and from ammonia and carbon dioxide. The liver is the major organ of de novo synthesis of all four nucleotides. De novo synthesis of pyrimidines and purines follows two different pathways. Pyrimidines are synthesized first from aspartate and carbamoyl phosphate in the cytoplasm to the common precursor ring structure orotic acid, onto which a phosphorylated ribosyl unit is covalently linked. Purines, by contrast, are built directly on the sugar template, with the ring assembled on the ribosyl unit. The syntheses of the purine and pyrimidine nucleotides are carried out by several enzymes in the cytoplasm of the cell, not within a specific organelle. Nucleotides are also broken down so that useful parts can be reused in synthesis reactions to create new nucleotides.
In vitro, protecting groups may be used during laboratory production of nucleotides. A purified nucleoside is protected to create a phosphoramidite, which can then be used to obtain analogues not found in nature and/or to synthesize an oligonucleotide.
Pyrimidine ribonucleotide synthesis
The synthesis of the pyrimidines CTP and UTP occurs in the cytoplasm and starts with the formation of carbamoyl phosphate from glutamine and CO2. Next, aspartate carbamoyltransferase catalyzes a condensation reaction between aspartate and carbamoyl phosphate to form carbamoyl aspartic acid, which is cyclized into 4,5-dihydroorotic acid by dihydroorotase. The latter is converted to orotate by dihydroorotate oxidase. The net reaction is:
- (S)-Dihydroorotate + O2 → Orotate + H2O2
Orotate is covalently linked with a phosphorylated ribosyl unit. The covalent linkage between the ribose and pyrimidine occurs at position C1 of the ribose unit, which contains a pyrophosphate, and N1 of the pyrimidine ring. Orotate phosphoribosyltransferase (PRPP transferase) catalyzes the net reaction yielding orotidine monophosphate (OMP):
- Orotate + 5-Phospho-α-D-ribose 1-diphosphate (PRPP) → Orotidine 5'-phosphate + Pyrophosphate
Orotidine 5'-monophosphate is decarboxylated by orotidine-5'-phosphate decarboxylase to form uridine monophosphate (UMP). PRPP transferase catalyzes both the ribosylation and decarboxylation reactions, forming UMP from orotic acid in the presence of PRPP. It is from UMP that other pyrimidine nucleotides are derived. UMP is phosphorylated by two kinases to uridine triphosphate (UTP) via two sequential reactions with ATP. First the diphosphate form UDP is produced, which in turn is phosphorylated to UTP. Both steps are fueled by ATP hydrolysis:
- ATP + UMP → ADP + UDP
- UDP + ATP → UTP + ADP
CTP is subsequently formed by amination of UTP by the catalytic activity of CTP synthetase. Glutamine is the NH3 donor and the reaction is fueled by ATP hydrolysis, too:
- UTP + Glutamine + ATP + H2O → CTP + ADP + Pi
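As a bookkeeping aid, the three reactions leading from UMP to CTP can be summed into a net reaction. The sketch below transcribes the equations exactly as written above (it adds no by-products beyond those listed), and the function name `net_reaction` is hypothetical.

```python
from collections import Counter

# Each reaction is (substrates, products) with stoichiometric coefficients,
# transcribed from the equations in the text.
REACTIONS = [
    ({"ATP": 1, "UMP": 1}, {"ADP": 1, "UDP": 1}),
    ({"UDP": 1, "ATP": 1}, {"UTP": 1, "ADP": 1}),
    ({"UTP": 1, "Glutamine": 1, "ATP": 1, "H2O": 1},
     {"CTP": 1, "ADP": 1, "Pi": 1}),
]


def net_reaction(reactions):
    """Sum the reactions; intermediates that cancel out are dropped."""
    balance = Counter()
    for substrates, products in reactions:
        for species, n in substrates.items():
            balance[species] -= n
        for species, n in products.items():
            balance[species] += n
    consumed = {s: -n for s, n in balance.items() if n < 0}
    produced = {s: n for s, n in balance.items() if n > 0}
    return consumed, produced


consumed, produced = net_reaction(REACTIONS)
print(consumed)  # three ATP, one UMP, one glutamine, one water
print(produced)  # three ADP, one CTP, one Pi
```

Under this accounting, converting one UMP to CTP costs three ATP: one for each phosphorylation and one for the amination.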
Purine ribonucleotide synthesis
The atoms which are used to build the purine nucleotides come from a variety of sources:
The biosynthetic origins of purine ring atoms:
- N1 arises from the amine group of Asp
- C2 and C8 originate from formate
- N3 and N9 are contributed by the amide group of Gln
- C4, C5 and N7 are derived from Gly
- C6 comes from HCO3− (CO2)
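The atom origins listed above can be held in a simple lookup table, which makes it easy to check that all nine ring atoms are accounted for. The names `PURINE_ATOM_SOURCES` and `atoms_from` are illustrative only.

```python
# Origin of each purine ring atom, transcribed from the list in the text.
PURINE_ATOM_SOURCES = {
    "N1": "aspartate (amine group)",
    "C2": "formate",
    "N3": "glutamine (amide group)",
    "C4": "glycine",
    "C5": "glycine",
    "C6": "HCO3- (CO2)",
    "N7": "glycine",
    "C8": "formate",
    "N9": "glutamine (amide group)",
}


def atoms_from(source_keyword: str) -> list:
    """Return the ring atoms whose listed source contains the keyword."""
    return sorted(atom for atom, source in PURINE_ATOM_SOURCES.items()
                  if source_keyword in source)


print(len(PURINE_ATOM_SOURCES))  # 9 ring atoms in total
print(atoms_from("glycine"))     # ['C4', 'C5', 'N7']
print(atoms_from("formate"))     # ['C2', 'C8']
```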
The de novo synthesis of purine nucleotides by which these precursors are incorporated into the purine ring proceeds by a 10-step pathway to the branch-point intermediate IMP, the nucleotide of the base hypoxanthine. AMP and GMP are subsequently synthesized from this intermediate via separate, two-step pathways. Thus, purine moieties are initially formed as part of the ribonucleotides rather than as free bases.
Six enzymes take part in IMP synthesis; three of them are multifunctional.
The pathway starts with the formation of PRPP. PRPS1 is the enzyme that activates R5P, which is formed primarily by the pentose phosphate pathway, to PRPP by reacting it with ATP. The reaction is unusual in that a pyrophosphoryl group is directly transferred from ATP to C1 of R5P and that the product has the α configuration about C1. This reaction is also shared with the pathways for the synthesis of Trp, His, and the pyrimidine nucleotides. Being on a major metabolic crossroad and requiring much energy, this reaction is highly regulated.
In the first reaction unique to purine nucleotide biosynthesis, PPAT catalyzes the displacement of PRPP's pyrophosphate group (PPi) by the amide nitrogen of glutamine. This is the committed step in purine synthesis. The reaction occurs with inversion of configuration about ribose C1, thereby forming β-5-phosphoribosylamine (5-PRA) and establishing the anomeric form of the future nucleotide. The other atoms of the purine ring are donated in later steps by glutamine (N), glycine (N and C), aspartate (N), folic acid (one-carbon units), and CO2.
Next, a glycine is incorporated, fueled by ATP hydrolysis, and its carboxyl group forms an amide bond to the NH2 previously introduced. A one-carbon unit from the folic acid coenzyme N10-formyl-THF is then added to the amino group of the substituted glycine, followed by the closure of the imidazole ring. Next, a second NH2 group is transferred from a glutamine to the first carbon of the glycine unit. A carboxylation of the second carbon of the glycine unit occurs concomitantly. This new carbon is modified by the addition of a third NH2 unit, this time transferred from an aspartate residue. Finally, a second one-carbon unit from formyl-THF is added to the nitrogen group and the ring is covalently closed to form the common purine precursor inosine monophosphate (IMP).
Inosine monophosphate is converted to adenosine monophosphate in two steps. First, GTP hydrolysis fuels the addition of aspartate to IMP by adenylosuccinate synthase, substituting the carbonyl oxygen for a nitrogen and forming the intermediate adenylosuccinate. Fumarate is then cleaved off forming adenosine monophosphate. This step is catalyzed by adenylosuccinate lyase.
Inosine monophosphate is converted to guanosine monophosphate by the oxidation of IMP forming xanthylate, followed by the insertion of an amino group at C2. NAD+ is the electron acceptor in the oxidation reaction. The amide group transfer from glutamine is fueled by ATP hydrolysis.
Pyrimidine and purine degradation
In humans, pyrimidine rings (C, T, U) can be degraded completely to CO2 and NH3 (urea excretion). Purine rings (G, A), however, cannot. Instead, they are degraded to the metabolically inert uric acid, which is then excreted from the body. Uric acid is formed when GMP is split into the base guanine and ribose. Guanine is deaminated to xanthine, which in turn is oxidized to uric acid. This last reaction is irreversible. Similarly, uric acid can be formed when AMP is deaminated to IMP, from which the ribose unit is removed to form hypoxanthine. Hypoxanthine is oxidized to xanthine and finally to uric acid. Instead of being excreted as uric acid, guanine and IMP can be recycled for nucleic acid synthesis in the presence of PRPP and aspartate (NH3 donor).
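The two purine degradation routes described above converge on xanthine. A minimal sketch, using a hypothetical `NEXT_STEP` table and `degradation_route` function, follows each intermediate to uric acid; it models only the steps named in the text and ignores the salvage branch.

```python
# Each intermediate maps to the next one on the way to uric acid,
# following the steps named in the text.
NEXT_STEP = {
    "GMP": "guanine",
    "guanine": "xanthine",
    "AMP": "IMP",
    "IMP": "hypoxanthine",
    "hypoxanthine": "xanthine",
    "xanthine": "uric acid",
}


def degradation_route(start: str) -> list:
    """Return the chain of intermediates from `start` to uric acid."""
    if start not in NEXT_STEP and start != "uric acid":
        raise KeyError(f"no degradation step known for {start!r}")
    route = [start]
    while route[-1] in NEXT_STEP:
        route.append(NEXT_STEP[route[-1]])
    return route


print(" -> ".join(degradation_route("GMP")))
print(" -> ".join(degradation_route("AMP")))
```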
Unnatural base pair (UBP)
An unnatural base pair (UBP) is a designed subunit (or nucleobase) of DNA which is created in a laboratory and does not occur in nature. In 2012, a group of American scientists led by Floyd Romesberg, a chemical biologist at the Scripps Research Institute in San Diego, California, reported that they had designed an unnatural base pair. The two new artificial nucleotides were named d5SICS and dNaM. More technically, these artificial nucleotides, which bear hydrophobic nucleobases, feature two fused aromatic rings that form a (d5SICS–dNaM) complex or base pair in DNA. In 2014, the same team reported that they had synthesized a stretch of circular DNA known as a plasmid, containing natural T-A and C-G base pairs along with the best-performing UBP Romesberg's laboratory had designed, and inserted it into cells of the common bacterium E. coli, which successfully replicated the unnatural base pairs through multiple generations. This is the first known example of a living organism passing along an expanded genetic code to subsequent generations. It was achieved in part by adding a supportive algal gene that expresses a nucleotide triphosphate transporter, which efficiently imports the triphosphates of both d5SICS and dNaM (d5SICSTP and dNaMTP) into E. coli bacteria. The natural bacterial replication pathways then use them to accurately replicate the plasmid containing d5SICS–dNaM.
The successful incorporation of a third base pair is a significant breakthrough toward the goal of greatly expanding the number of amino acids which can be encoded by DNA, from the existing 20 amino acids to a theoretically possible 172, thereby expanding the potential for living organisms to produce novel proteins. The artificial strings of DNA do not encode anything yet, but scientists speculate they could be designed to manufacture new proteins which could have industrial or pharmaceutical uses.
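To give a sense of scale, the sketch below counts the distinct codons available for a given alphabet size, assuming codons remain three letters long (an assumption of this example, not a claim from the text). It shows only how the raw codon space grows from four to six letters; it does not reproduce the figure of 172 amino acids quoted above, which depends on further considerations not covered here.

```python
def codon_count(alphabet_size: int, codon_length: int = 3) -> int:
    """Number of distinct codons of the given length over the alphabet."""
    return alphabet_size ** codon_length


print(codon_count(4))  # natural four-letter alphabet: 64 codons
print(codon_count(6))  # six-letter alphabet with one added pair: 216 codons
```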
Length unit
Nucleotide (abbreviated "nt") is a common unit of length for single-stranded nucleic acids, similar to how base pair is a unit of length for double-stranded nucleic acids.
Abbreviation codes for degenerate bases
The IUPAC has designated symbols for nucleotides. Apart from the five standard bases (A, G, C, T/U), degenerate bases are often used, especially when designing PCR primers. These nucleotide codes are listed here. Some primer sequences may also include the character "I", which codes for the non-standard nucleotide inosine. Inosine occurs in tRNAs, and will pair with adenine, cytosine, or thymine. This character does not appear in the following table, however, because it does not represent a degeneracy. While inosine can serve a similar function to the degeneracy "H", it is an actual nucleotide, rather than a representation of a mix of nucleotides that covers each possible pairing needed.
|Symbol||Description||Bases represented||Number of bases|
|B||not A (B comes after A)||C, G, T||3|
|D||not C (D comes after C)||A, G, T||3|
|H||not G (H comes after G)||A, C, T||3|
|V||not T (V comes after T and U)||A, C, G||3|
|N or -||any base (not a gap)||A, C, G, T||4|
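A degenerate primer stands for every concrete sequence obtained by substituting each symbol with one of the bases it represents. The sketch below expands such a primer; the function names `expand` and `degeneracy` are illustrative, and the two-base codes (R, Y, S, W, K, M) are taken from the standard IUPAC set rather than from the rows shown above.

```python
from itertools import product

# IUPAC nucleotide codes for DNA. The three- and four-base codes match the
# table; the two-base codes come from the standard IUPAC set.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(primer: str) -> list:
    """List every concrete sequence a degenerate primer represents."""
    pools = [IUPAC[symbol] for symbol in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


def degeneracy(primer: str) -> int:
    """Number of concrete sequences, without enumerating them."""
    count = 1
    for symbol in primer.upper():
        count *= len(IUPAC[symbol])
    return count


print(expand("ABT"))     # ['ACT', 'AGT', 'ATT']
print(degeneracy("NB"))  # 4 * 3 = 12
```

Computing the degeneracy separately is useful because the number of expanded sequences grows multiplicatively with each degenerate position.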
- Alberts B, Johnson A, Lewis J, Raff M, Roberts K & Walter P (2002). Molecular Biology of the Cell (4th ed.). Garland Science. ISBN 0-8153-3218-1. pp. 120–121.
- Coghill, Anne M.; Garson, Lorrin R., ed. (2006). The ACS style guide: effective communication of scientific information (3rd ed.). Washington, D.C.: American Chemical Society. p. 244. ISBN 978-0-8412-3999-9.
- "Nucleotides". IUPAC Gold Book. International Union of Pure and Applied Chemistry. doi:10.1351/goldbook.N04255. Retrieved 30 June 2014.
- Lehninger, Albert L. (1975). Biochemistry: the molecular basis of cell structure and function. New York: Worth Publishers Inc. doi:10.1002/jobm.19770170116.
- Stryer, Lubert (1988). Biochemistry (3rd ed.). New York: W. H. Freeman. ISBN 9780716719205.
- Garrett, Reginald H.; Grisham, Charles M. (2007). Biochemistry (4th ed.). Belmont, CA: Brooks/Cole, Cengage Learning.
- Zaharevitz, DW; Anderson, LW; Malinowski, NM; Hyman, R; Strong, JM; Cysyk, RL. "Contribution of de-novo and salvage synthesis to the uracil nucleotide pool in mouse tissues and tumors in vivo".
- See IUPAC nomenclature of organic chemistry for details on carbon residue numbering
- Jones, M. E. (1980). "Pyrimidine nucleotide biosynthesis in animals: Genes, enzymes, and regulation of UMP biosynthesis". Ann. Rev. Biochem 49 (1): 253–79. doi:10.1146/annurev.bi.49.070180.001345. PMID 6105839.
- McMurry, JE; Begley, TP (2005). The organic chemistry of biological pathways. Roberts & Company. ISBN 978-0-9747077-1-6.
- Malyshev, Denis A.; Dhami, Kirandeep; Quach, Henry T.; Lavergne, Thomas; Ordoukhanian, Phillip (24 July 2012). "Efficient and sequence-independent replication of DNA containing a third base pair establishes a functional six-letter genetic alphabet". Proceedings of the National Academy of Sciences of the United States of America (PNAS) 109 (30): 12005–12010. doi:10.1073/pnas.1205176109. Retrieved 2014-05-11.
- Malyshev, Denis A.; Dhami, Kirandeep; Lavergne, Thomas; Chen, Tingjian; Dai, Nan; Foster, Jeremy M.; Corrêa, Ivan R.; Romesberg, Floyd E. (May 7, 2014). "A semi-synthetic organism with an expanded genetic alphabet". Nature. doi:10.1038/nature13314. Retrieved May 7, 2014.
- Callaway, Ewan (May 7, 2014). "Scientists Create First Living Organism With 'Artificial' DNA". Nature News (Huffington Post). Retrieved 8 May 2014.
- Fikes, Bradley J. (May 8, 2014). "Life engineered with expanded genetic code". San Diego Union Tribune. Retrieved 8 May 2014.
- Sample, Ian (May 7, 2014). "First life forms to pass on artificial DNA engineered by US scientists". The Guardian. Retrieved 8 May 2014.
- Pollack, Andrew (May 7, 2014). "Scientists Add Letters to DNA’s Alphabet, Raising Hope and Fear". New York Times. Retrieved 8 May 2014.
- Nomenclature Committee of the International Union of Biochemistry (NC-IUB) (1984). "Nomenclature for Incompletely Specified Bases in Nucleic Acid Sequences". Retrieved 2008-02-04.
- Abbreviations and Symbols for Nucleic Acids, Polynucleotides and their Constituents (IUPAC)
- Provisional Recommendations 2004 (IUPAC)
- Chemistry explanation of nucleotide structure