Non-canonical base pairing

Non-canonical base pairing occurs when nucleobases hydrogen bond, or base pair, to one another in schemes other than the standard Watson-Crick base pairs (which are adenine (A) -- thymine (T) in DNA, adenine (A) -- uracil (U) in RNA, and guanine (G) -- cytosine (C) in both DNA and RNA). There are three main types of non-canonical base pairs: those stabilized by polar hydrogen bonds, those having interactions among C−H and O/N groups, and those that have hydrogen bonds between the bases themselves.^[1] The first non-canonical base pairs to be discovered were Hoogsteen base pairs, named after the American biochemist Karst Hoogsteen, who first described them.

Non-canonical base pairings commonly occur in the secondary structure of RNA (e.g. pairing of G with U), and in tRNA recognition. They are typically less stable than standard base pairings.^[2] The presence of non-canonical base pairs in double stranded DNA results in a disrupted double helix.^[3]

History

James Watson and Francis Crick published the double helical structure of DNA and proposed the canonical Watson-Crick base pairs in 1953.^[4] Ten years later, in 1963, Karst Hoogsteen reported that he had used single crystal X-ray diffraction to investigate alternative base pair structures, and that he had found an alternative structure for the nucleobase pair adenine-thymine in which the purine (A) takes on an alternative conformation with respect to the pyrimidine (T).^[5] Five years after Hoogsteen proposed the A-T Hoogsteen base pair, optical rotatory dispersion spectra that provided evidence for a G-C Hoogsteen base pair were reported.^[6] The G-C Hoogsteen base pair was first observed via X-ray crystallography years later, in 1986, by co-crystallizing DNA with triostin A (an antibiotic).^[7] After years of study of both Watson-Crick and Hoogsteen base pairs, it has been determined that both occur naturally in DNA and that they exist in equilibrium with one another; the conditions in which the DNA exists determine which form is favored.^[8] Since the structures of the canonical Watson-Crick and non-canonical Hoogsteen base pairs were determined, many other types of non-canonical base pairs have been described.

Structure

Base pairing

An estimated 60% of bases in structured RNA participate in canonical Watson-Crick base pairs.^[9] Base pairing occurs when two bases form hydrogen bonds with each other. These hydrogen bonds can be either polar or non-polar interactions. The polar hydrogen bonds are formed by N-H...O/N and/or O-H...O/N interactions. Non-polar hydrogen bonds are formed between C-H...O/N.^[10]

Edge interactions

Each base has three potential edges where it can interact with another base. The purine bases have three edges that are able to hydrogen bond: the Watson-Crick edge (WC), the Hoogsteen edge (H), and the sugar edge (S). Pyrimidine bases also have three hydrogen-bonding edges.^[9] As in the purines, there is a Watson-Crick edge (WC) and a sugar edge (S), but the third edge is referred to as the "C-H" edge (H). This C-H edge is sometimes also referred to as the Hoogsteen edge for simplicity. The various edges for the purine and pyrimidine bases are shown in Figure 2.^[10]


Besides the three edges of interaction, base pairs also differ in whether they adopt a cis or trans form. The cis and trans structures depend on the orientation of the ribose sugar relative to the hydrogen bond interaction. These orientations are shown in Figure 3. Combining the cis/trans forms with the three hydrogen-bonding edges gives 12 basic types of base pairing geometries that can be found in RNA structures. Those 12 types are WC:WC (cis/trans), WC:H (cis/trans), WC:S (cis/trans), H:S (cis/trans), H:H (cis/trans), and S:S (cis/trans).
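The count of 12 geometries follows from pairing the three edges without regard to order (six combinations) and allowing each combination to be cis or trans. The short Python sketch below illustrates that enumeration; the function name and the string labels are illustrative choices, not part of any established nomenclature tool.

```python
from itertools import combinations_with_replacement

# Edge labels as used in the text: Watson-Crick, Hoogsteen (or C-H), sugar.
EDGES = ("WC", "H", "S")
ORIENTATIONS = ("cis", "trans")


def base_pair_families():
    """Return every unordered pair of edges combined with each orientation."""
    families = []
    for edge_a, edge_b in combinations_with_replacement(EDGES, 2):
        for orientation in ORIENTATIONS:
            families.append(f"{orientation} {edge_a}:{edge_b}")
    return families


if __name__ == "__main__":
    for name in base_pair_families():
        print(name)
```

Treating the edge pair as unordered is what gives six combinations rather than nine; a classification that distinguishes which base contributes which edge would list more entries.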

Classification

These 12 types can be further divided into subgroups that depend on the directionality of the glycosidic bonds and steric extensions.^[11] Across all of the base pair combinations, there are 169 theoretically possible base pairs. The number of base pairs actually observed is much lower, however, since some of the combinations result in unfavorable interactions. The number of possible non-canonical base pairs is still being determined, since it depends strongly on the base pairing criteria used.^[12] Understanding the base pair configuration is difficult because the pairing depends strongly on the bases' surroundings. These surroundings consist of adjacent base pairs, adjacent loops, or a third interaction such as a base triple.^[13]

Since the bases are rigid and planar, the bonding between them is well defined. The spatial relationship between the two bases can be described by six rigid-body parameters, or intra-base pair parameters (three translational and three rotational), as shown in Figure 4.^[14] These parameters describe the base pair's three-dimensional conformation. The three translational parameters are known as shear, stretch, and stagger. They are directly related to the proximity and direction of the pair's hydrogen bonding. The rotational parameters are buckle, propeller, and opening. They relate to the non-planar conformation as compared to the ideal coplanar geometry.^[10] Intra-base pair parameters are used to determine the structure and stabilities of non-canonical base pairs. These parameters were originally created for the base pairings in DNA but can also be applied to non-canonical base pair models.^[14]
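As a minimal sketch of how the six intra-base pair parameters might be grouped in code, the Python example below stores them in a small record and separates the translational from the rotational values. The class name, the method names, and the units (angstroms for translations, degrees for rotations) are assumptions made for illustration, and the numbers in the usage line are made up rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntraBasePairParameters:
    """Six rigid-body parameters relating the two bases of one pair."""

    # Translational parameters (assumed to be in angstroms).
    shear: float
    stretch: float
    stagger: float
    # Rotational parameters (assumed to be in degrees).
    buckle: float
    propeller: float
    opening: float

    def translational(self):
        """Return (shear, stretch, stagger)."""
        return (self.shear, self.stretch, self.stagger)

    def rotational(self):
        """Return (buckle, propeller, opening)."""
        return (self.buckle, self.propeller, self.opening)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    pair = IntraBasePairParameters(0.1, -0.2, 0.0, 5.0, -10.0, 1.5)
    print(pair.translational(), pair.rotational())
```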

Types

The most common non-canonical base pairs are trans A:G Hoogsteen/sugar edge, A:U Hoogsteen/WC, and G:U Wobble pairs.^[15]

Hoogsteen base pairs

Hoogsteen base pairs occur between adenine (A) and thymine (T), and between guanine (G) and cytosine (C), as Watson-Crick base pairs do; however, the purine takes on an alternative conformation with respect to the pyrimidine. In the A-T Hoogsteen base pair, the adenine is rotated 180° about the glycosidic bond, resulting in an alternative hydrogen bonding scheme that has one hydrogen bond in common with the Watson-Crick base pair (adenine N6 and thymine O4), while the other, instead of occurring between adenine N1 and thymine N3 as in the Watson-Crick base pair, occurs between adenine N7 and thymine N3.^[8] The A-T base pair is shown in Figure 5. In the G-C Hoogsteen base pair, as in the A-T Hoogsteen base pair, the purine (guanine) is rotated 180° about the glycosidic bond while the pyrimidine (cytosine) remains in place. One hydrogen bond from the Watson-Crick base pair is maintained (guanine O6 and cytosine N4) and the other occurs between guanine N7 and a protonated cytosine N3 (note that the Hoogsteen G-C base pair has two hydrogen bonds, while the Watson-Crick G-C base pair has three).^[8]

Wobble base pairs

Wobble base pairing occurs between two nucleotides that do not form a Watson-Crick base pair. The four main examples are guanine-uracil (G-U), hypoxanthine-uracil (I-U), hypoxanthine-adenine (I-A), and hypoxanthine-cytosine (I-C). These wobble base pairs are very important in tRNA. Most organisms have fewer than 45 different tRNA molecules, but 61 would be necessary to pair canonically with every sense codon. Wobble base pairing was proposed by Francis Crick in 1966. It allows the base at the 5' end of the anticodon to form a non-standard base pair with the codon. Examples of wobble base pairs are given in Figure 6.
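The figure of 61 corresponds to the 64 possible triplets minus the three stop codons of the standard genetic code. The Python sketch below reproduces that count and illustrates how a single anticodon can read several codons when its 5' base is allowed the wobble pairs named above. The pairing table combines the Watson-Crick pairs with those wobble pairs only, and the function names and example anticodons are illustrative assumptions rather than a complete model of decoding.

```python
from itertools import product

BASES = "ACGU"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 5' anticodon base -> codon third-position bases it can pair with.
# Watson-Crick partners plus the wobble pairs G-U, I-U, I-A and I-C.
WOBBLE_PARTNERS = {
    "A": {"U"},
    "C": {"G"},
    "G": {"C", "U"},
    "U": {"A", "G"},
    "I": {"U", "A", "C"},
}


def sense_codons():
    """Return all triplets that are not stop codons."""
    triplets = ("".join(t) for t in product(BASES, repeat=3))
    return [codon for codon in triplets if codon not in STOP_CODONS]


def codons_read(anticodon):
    """Codons (5'->3') readable by an anticodon written 5'->3'.

    The strands pair antiparallel, so the 5' anticodon base faces the
    third codon position; the other two positions pair canonically.
    """
    wobble_base, middle, last = anticodon
    first_two = WATSON_CRICK[last] + WATSON_CRICK[middle]
    return sorted(first_two + third for third in WOBBLE_PARTNERS[wobble_base])


if __name__ == "__main__":
    print(len(sense_codons()))
    print(codons_read("GAA"))
    print(codons_read("IGC"))
```

With this table, an anticodon beginning with G reads two codons and one beginning with hypoxanthine (I) reads three, which is why fewer than 61 tRNAs can cover all sense codons.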

3-D Structure

Secondary and three-dimensional structures of RNA are formed and stabilized through non-canonical base pairs. Base pairs make up many secondary structural blocks that aid the folding of RNA complexes and three-dimensional structures. The overall folded RNA is stabilized by the tertiary and secondary structures canonically base pairing together.^[10] The many possible non-canonical base pairs permit a very large variety of structures, which allows for the diverse functions of RNA.^[9] The arrangement of the non-canonical bases allows long-range RNA interactions, recognition of proteins and other molecules, and structural stabilizing elements.^[14] Many of the common non-canonical base pairs can be added to a stacked RNA stem without disturbing its helical character.^[1]

Secondary

Figure 7: A hairpin structure found in pre-mRNA

Basic secondary structural elements of RNA include bulges, double helices, hairpin loops, and internal loops. An example of an RNA hairpin loop is given in Figure 7. As shown in the figure, hairpin loops and internal loops require a sudden change in backbone direction. Non-canonical base pairing allows for increased flexibility at junctions or turns in the secondary structure.^[10]

Tertiary

Three-dimensional structures are formed through long-range intramolecular interactions between the secondary structures. This leads to the formation of pseudoknots, ribose zippers, kissing hairpin loops, or co-axial pseudocontinuous helices.^[10] The three-dimensional structures of RNA are primarily determined through molecular simulations or computationally guided measurements.^[14] An example of a pseudoknot is given in Figure 8.

Experimental Methods

Watson-Crick canonical base pairing is not the only edge-to-edge conformation possible for nucleotides, since non-canonical pairing can take place as well. The sugar-phosphate backbone has an ionic character, which makes the bases sensitive to their environment and leads to conformational changes such as non-canonical pairing.^[16]^[1] Several experimental methods are used to determine these conformations, including NMR structure determination and X-ray crystallography.^[16]

Applications

RNA has many purposes throughout the cell including many important steps in gene expression. Various conformations of the non-Watson-Crick base pairs allow for a multitude of biological functions such as mRNA splicing, siRNA, transport, protein recognition, protein binding, and translation.^[17]^[18]

One example of a biological application of non-canonical base pairs is the kink-turn. Kink-turns are found in many functional RNA species. A kink-turn consists of a three-nucleotide bulge which is due to three Hoogsteen base pairs. It acts as a marker where various proteins can bind, such as the human 15-5k protein or proteins in the L7Ae family.^[19] A similar scenario is described for the binding of the HIV-1 Rev-response element (RRE) RNA. The RNA has an extra-wide deep groove that is caused by a cis Watson-Crick G:A pair followed by a trans Watson-Crick G:G pair. The HIV-1 Rev protein is then able to bind the RRE because of the widened groove.^[1]

References

1. Hermann T, Westhof E (December 1999). "Non-Watson-Crick base pairs in RNA-protein recognition". Chemistry & Biology. 6 (12): R335-43. doi:10.1016/s1074-5521(00)80003-4. PMID 10631510.
2. Lemieux S, Major F (October 2002). "RNA canonical and non-canonical base pairing types: a recognition method and complete repertoire". Nucleic Acids Research. 30 (19): 4250–63. doi:10.1093/nar/gkf540. PMC 140540. PMID 12364604.
3. Das J, Mukherjee S, Mitra A, Bhattacharyya D (October 2006). "Non-canonical base pairs and higher order structures in nucleic acids: crystal structure database analysis". Journal of Biomolecular Structure & Dynamics. 24 (2): 149–61. doi:10.1080/07391102.2006.10507108. PMID 16928138.
4. Watson JD, Crick FH (April 1953). "Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid". Nature. 171 (4356): 737–8. Bibcode:1953Natur.171..737W. doi:10.1038/171737a0. PMID 13054692.
5. Hoogsteen K (1963-09-10). "The crystal and molecular structure of a hydrogen-bonded complex between 1-methylthymine and 9-methyladenine". Acta Crystallographica. 16 (9): 907–916. doi:10.1107/S0365110X63002437. ISSN 0365-110X.
6. Courtois Y, Fromageot P, Guschlbauer W (December 1968). "Protonated polynucleotide structures. 3. An optical rotatory dispersion study of the protonation of DNA". European Journal of Biochemistry. 6 (4): 493–501. doi:10.1111/j.1432-1033.1968.tb00472.x. PMID 5701966.
7. Quigley GJ, Ughetto G, van der Marel GA, van Boom JH, Wang AH, Rich A (June 1986). "Non-Watson-Crick G.C and A.T base pairs in a DNA-antibiotic complex". Science. 232 (4755): 1255–8. doi:10.1126/science.3704650. PMID 3704650.
8. Nikolova EN, Zhou H, Gottardo FL, Alvey HS, Kimsey IJ, Al-Hashimi HM (December 2013). "A historical account of Hoogsteen base-pairs in duplex DNA". Biopolymers. 99 (12): 955–68. doi:10.1002/bip.22334. PMC 3844552. PMID 23818176.
9. Leontis NB, Westhof E (April 2001). "Geometric nomenclature and classification of RNA base pairs". RNA. 7 (4): 499–512. doi:10.1017/S1355838201002515. PMC 1370104. PMID 11345429.
10. Halder S, Bhattacharyya D (November 2013). "RNA structure and dynamics: a base pairing perspective". Progress in Biophysics and Molecular Biology. 113 (2): 264–83. doi:10.1016/j.pbiomolbio.2013.07.003. PMID 23891726.
11. Sponer JE, Leszczynski J, Sychrovský V, Sponer J (October 2005). "Sugar edge/sugar edge base pairs in RNA: stabilities and structures from quantum chemical calculations". The Journal of Physical Chemistry B. 109 (39): 18680–9. doi:10.1021/jp053379q. PMID 16853403.
12. Sharma P, Sponer JE, Sponer J, Sharma S, Bhattacharyya D, Mitra A (March 2010). "On the role of the cis Hoogsteen:sugar-edge family of base pairs in platforms and triplets-quantum chemical insights into RNA structural biology". The Journal of Physical Chemistry B. 114 (9): 3307–20. doi:10.1021/jp910226e. PMID 20163171.
13. Heus HA, Hilbers CW (October 2003). "Structures of non-canonical tandem base pairs in RNA helices: review". Nucleosides, Nucleotides & Nucleic Acids. 22 (5–8): 559–71. doi:10.1081/NCN-120021955. PMID 14565230.
14. Olson WK, Li S, Kaukonen T, Colasanti AV, Xin Y, Lu XJ (May 2019). "Effects of Noncanonical Base Pairing on RNA Folding: Structural Context and Spatial Arrangements of G·A Pairs". Biochemistry. 58 (20): 2474–2487. doi:10.1021/acs.biochem.9b00122. PMC 6729125. PMID 31008589.
15. Roy A, Panigrahi S, Bhattacharyya M, Bhattacharyya D (March 2008). "Structure, stability, and dynamics of canonical and noncanonical base pairs: quantum chemical studies". The Journal of Physical Chemistry B. 112 (12): 3786–96. doi:10.1021/jp076921e. PMID 18318519.
16. Lu XJ, Olson WK (September 2003). "3DNA: a software package for the analysis, rebuilding and visualization of three-dimensional nucleic acid structures". Nucleic Acids Research. 31 (17): 5108–21. doi:10.1093/nar/gkg680. PMC 212791. PMID 12930962.
17. Fernandes CL, Escouto GB, Verli H (2013-06-28). "Structural glycobiology of heparinase II from Pedobacter heparinus". Journal of Biomolecular Structure & Dynamics. 32 (7): 1092–102. doi:10.1080/07391102.2013.809604. PMID 23808670.
18. Storz G, Altuvia S, Wassarman KM (2005-06-01). "An abundance of RNA regulators". Annual Review of Biochemistry. 74 (1): 199–217. doi:10.1146/annurev.biochem.74.082803.133136. PMID 15952886.
19. Huang L, Lilley DM (January 2018). "The kink-turn in the structural biology of RNA". Quarterly Reviews of Biophysics. 51: e5. doi:10.1017/S0033583518000033. PMID 30912490.
