In molecular biology and genetics, splicing is a modification of the nascent pre-messenger RNA (pre-mRNA) transcript in which introns are removed and exons are joined. For nuclear-encoded genes, splicing takes place within the nucleus, either after or concurrently with transcription. The typical eukaryotic messenger RNA (mRNA) must be spliced before it can be used to produce a correct protein through translation. For many eukaryotic introns, splicing occurs in a series of reactions catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs), but there are also self-splicing introns.
Several methods of RNA splicing occur in nature; the type of splicing depends on the structure of the spliced intron and the catalysts required for splicing to occur.
The word intron is derived from the term intragenic region, that is, a region inside a gene. The term intron refers to both the DNA sequence within a gene and the corresponding sequence in the unprocessed RNA transcript. As part of the RNA processing pathway, introns are removed by RNA splicing either shortly after or concurrent with transcription. Introns are found in the genes of most organisms and many viruses. They can be located in a wide range of genes, including those that generate proteins, ribosomal RNA (rRNA), and transfer RNA (tRNA).
Spliceosomal introns often reside within the sequence of eukaryotic protein-coding genes. Within the intron, a donor site (5' end of the intron), a branch site (near the 3' end of the intron) and an acceptor site (3' end of the intron) are required for splicing. The splice donor site includes an almost invariant sequence GU at the 5' end of the intron, within a larger, less highly conserved region. The splice acceptor site at the 3' end of the intron terminates the intron with an almost invariant AG sequence. Upstream (5'-ward) from the AG there is a region high in pyrimidines (C and U), called the polypyrimidine tract. Upstream from the polypyrimidine tract is the branchpoint, which includes an adenine nucleotide. The consensus sequence for an intron (in IUPAC nucleic acid notation) is: M-A-G-[cut]-G-U-R-A-G-U (donor site) ... intron sequence ... C-U-R-[A]-Y (branch sequence 20-50 nucleotides upstream of acceptor site) ... Y-rich-N-C-A-G-[cut]-G (acceptor site). However, the specific sequence of intronic splicing elements and the number of nucleotides between the branchpoint and the nearest 3' acceptor site both affect splice site selection. In addition, point mutations in the underlying DNA or errors during transcription can activate a cryptic splice site in a part of the transcript that is usually not spliced. This results in a mature messenger RNA with a missing section of an exon. In this way, a point mutation, which would otherwise affect only a single amino acid, can manifest as a deletion in the final protein.
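As an illustration of how the IUPAC donor consensus above can be read, the following minimal sketch translates it into a regular expression and scans an RNA string for strict matches. The function names and the example sequence are hypothetical, and real donor sites frequently deviate from the consensus, so a strict match of this kind would miss many genuine sites.

```python
import re

# IUPAC nucleotide codes used in the consensus above (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "M": "[AC]",    # amino: A or C
    "R": "[AG]",    # purine: A or G
    "Y": "[CU]",    # pyrimidine: C or U
    "N": "[ACGU]",  # any nucleotide
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[base] for base in consensus)

# Donor consensus M-A-G-[cut]-G-U-R-A-G-U: three exonic bases, then six
# intronic bases. The lookahead lets overlapping candidates be reported.
DONOR = re.compile("(?=(" + consensus_to_regex("MAGGURAGU") + "))")

def find_donor_sites(rna):
    """Return 0-based positions of the first intronic base (the G of GU)."""
    return [match.start() + 3 for match in DONOR.finditer(rna)]

# Illustrative sequence (an assumption, not taken from a real gene).
print(find_donor_sites("UUCAGGUAAGUUU"))  # [5]
```

The same `consensus_to_regex` helper can be applied to the branch sequence, for example `consensus_to_regex("CURAY")`.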
Formation and activity
Splicing is catalyzed by the spliceosome, a large RNA-protein complex composed of five small nuclear ribonucleoproteins (snRNPs, pronounced 'snurps'). The RNA components of snRNPs interact with the intron and may be involved in catalysis. Two types of spliceosomes have been identified (the major and the minor), which contain different snRNPs.
- The major spliceosome splices introns containing GU at the 5' splice site and AG at the 3' splice site. It is composed of the U1, U2, U4, U5, and U6 snRNPs and is active in the nucleus. In addition, a number of proteins including U2AF and SF1 are required for the assembly of the spliceosome.
- E complex: U1 binds to the GU sequence at the 5' splice site, along with the accessory proteins ASF/SF2, U2AF (which binds at the Py-AG site) and SF1/BBP (BBP = branch binding protein);
- A complex: U2 binds to the branch site and ATP is hydrolyzed;
- B1 complex: the U5/U4/U6 trimer binds, with U5 binding the exon at the 5' site and U6 binding to U2;
- B2 complex: U1 is released, U5 shifts from exon to intron and U6 binds at the 5' splice site;
- C1 complex: U4 is released and U6/U2 catalyzes the first transesterification, in which the 5' site is cleaved and the 5' end of the intron is ligated to the branchpoint A to form the lariat; U5 binds the exon at the 3' splice site;
- C2 complex: U2/U5/U6 remain bound to the lariat, the 3' site is cleaved and the exons are ligated using ATP hydrolysis. The spliced RNA is released and the lariat is debranched.
- This type of splicing is termed canonical splicing or the lariat pathway, and it accounts for more than 99% of splicing. By contrast, when the intronic flanking sequences do not follow the GU-AG rule, noncanonical splicing is said to occur (see "minor spliceosome" below).
- The minor spliceosome is very similar to the major spliceosome; however, it splices out rare introns with different splice site sequences. While the minor and major spliceosomes contain the same U5 snRNP, the minor spliceosome has different but functionally analogous snRNPs for U1, U2, U4, and U6, which are called U11, U12, U4atac, and U6atac, respectively. Unlike the major spliceosome, it has been reported to act outside the nucleus, but very close to the nuclear membrane.
- Trans-splicing is a form of splicing that joins two exons that are not within the same RNA transcript.
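The GU-AG rule that distinguishes canonical from noncanonical splicing in the list above can be expressed as a simple check on the terminal dinucleotides of an intron. This is only a sketch: the function name `classify_intron` is hypothetical, and the check looks at nothing except the first and last two bases, so it says nothing about which spliceosome actually processes a given intron.

```python
def classify_intron(intron):
    """Classify an intron by the GU-AG rule using its terminal dinucleotides.

    Accepts RNA or DNA letters in either case; T is read as U.
    """
    rna = intron.upper().replace("T", "U")
    if len(rna) < 4:
        raise ValueError("intron too short to hold both splice-site dinucleotides")
    if rna.startswith("GU") and rna.endswith("AG"):
        return "canonical"
    return "noncanonical"

# Illustrative sequences (assumptions, not taken from real genes).
print(classify_intron("GUAAGUCUAACUUUUUCAG"))  # canonical
print(classify_intron("AUAUCCUUUUAC"))         # noncanonical
```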
Self-splicing occurs for rare introns that form a ribozyme, performing the functions of the spliceosome by RNA alone. There are three kinds of self-splicing introns: Group I, Group II and Group III. Group I and II introns perform splicing similar to the spliceosome without requiring any protein. This similarity suggests that Group I and II introns may be evolutionarily related to the spliceosome. Self-splicing may also be very ancient, and may have existed in an RNA world that preceded proteins.
Two transesterifications characterize the mechanism in which group I introns are spliced:
- The 3'OH of a free guanine nucleoside (or one located in the intron) or of a nucleotide cofactor (GMP, GDP, GTP) attacks the phosphate at the 5' splice site.
- 3'OH of the 5'exon becomes a nucleophile and the second transesterification results in the joining of the two exons.
The mechanism by which group II introns are spliced (two transesterification reactions, as for group I introns) is as follows:
- The 2'OH of a specific adenosine in the intron attacks the 5' splice site, thereby forming the lariat.
- The 3'OH of the 5' exon triggers the second transesterification at the 3' splice site, thereby joining the exons together.
tRNA (also tRNA-like) splicing is another rare form of splicing that usually occurs in tRNA. The splicing reaction involves a different biochemistry than the spliceosomal and self-splicing pathways. Ribonucleases cleave the RNA and ligases join the exons together.
Splicing occurs in all the kingdoms or domains of life; however, the extent and types of splicing can be very different between the major divisions. Eukaryotes splice many protein-coding messenger RNAs and some non-coding RNAs. Prokaryotes, on the other hand, splice rarely, and mostly non-coding RNAs. Another important difference between these two groups of organisms is that prokaryotes completely lack the spliceosomal pathway.
Because spliceosomal introns are not conserved in all species, there is debate concerning when spliceosomal splicing evolved. Two models have been proposed: the intron late and intron early models (see intron evolution).
Spliceosomal splicing and self-splicing involve a two-step biochemical process. Both steps involve transesterification reactions that occur between RNA nucleotides. tRNA splicing, however, is an exception and does not occur by transesterification.
In spliceosomal splicing and self-splicing, the two transesterification reactions occur sequentially. First, the 2'OH of a specific branchpoint nucleotide within the intron, which is defined during spliceosome assembly, performs a nucleophilic attack on the first nucleotide of the intron at the 5' splice site, forming the lariat intermediate. Second, the 3'OH of the released 5' exon performs a nucleophilic attack at the last nucleotide of the intron at the 3' splice site, thus joining the exons and releasing the intron lariat.
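The two sequential steps described above can be followed on a plain sequence string. The sketch below is a bookkeeping model only, not a chemical simulation: the function name `splice`, its index conventions and the example sequence are assumptions made for illustration, and the check that the branchpoint is an adenine follows the description of the branchpoint given earlier.

```python
def splice(pre_mrna, donor, branch, acceptor):
    """Model the two transesterification steps on a sequence string.

    donor    : index of the first intron base (the G of GU)
    branch   : index of the branchpoint nucleotide inside the intron
    acceptor : index of the last intron base (the G of AG)
    Returns (mature_rna, excised_intron).
    """
    if not donor < branch < acceptor < len(pre_mrna):
        raise ValueError("expected donor < branch < acceptor within the transcript")
    if pre_mrna[branch] != "A":
        raise ValueError("branchpoint is assumed to be an adenine")

    # Step 1: the 2'OH of the branchpoint attacks the 5' splice site.
    # The 5' exon is released; the intron (now a lariat) stays attached
    # to the 3' exon.
    exon_5prime = pre_mrna[:donor]
    lariat_intermediate = pre_mrna[donor:]

    # Step 2: the 3'OH of the released 5' exon attacks the 3' splice site.
    # The exons are joined and the intron lariat is released.
    intron_length = acceptor - donor + 1
    intron = lariat_intermediate[:intron_length]
    exon_3prime = lariat_intermediate[intron_length:]
    return exon_5prime + exon_3prime, intron

# Illustrative transcript (an assumption, not taken from a real gene).
exon1, intron, exon2 = "AUGGCC", "GUAAGUCUAACUUUUUCAG", "GGAUAA"
pre = exon1 + intron + exon2
donor = len(exon1)
acceptor = donor + len(intron) - 1
branch = donor + intron.index("CUAAC") + 3  # the bracketed A of C-U-R-[A]-Y
mature, excised = splice(pre, donor, branch, acceptor)
print(mature)   # AUGGCCGGAUAA
print(excised)  # GUAAGUCUAACUUUUUCAG
```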
In many cases, the splicing process can create a range of unique proteins by varying the exon composition of the same mRNA. This phenomenon is called alternative splicing. Alternative splicing can occur in many ways: exons can be extended or skipped, or introns can be retained. It is estimated that 95% of transcripts from multiexon genes undergo alternative splicing, some instances of which occur in a tissue-specific manner and/or under specific cellular conditions. Alternative splicing of pre-mRNA transcripts is regulated by a system of trans-acting proteins (activators and repressors) that bind to cis-acting sites or "elements" (enhancers and silencers) on the pre-mRNA transcript itself. These proteins and their respective binding elements promote or reduce the usage of a particular splice site. Adding to the complexity, the effects of regulatory factors are often position-dependent. For example, a splicing factor that serves as a splicing activator when bound to an intronic enhancer element may serve as a repressor when bound to its splicing element in the context of an exon, and vice versa. In addition to the position-dependent effects of enhancer and silencer elements, the location of the branchpoint (i.e., its distance upstream of the nearest 3' acceptor site) also affects splicing. The secondary structure of the pre-mRNA transcript plays a role in regulating splicing as well, for example by bringing together splicing elements or by masking a sequence that would otherwise serve as a binding element for a splicing factor.
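To give a sense of how exon skipping alone multiplies the possible transcripts, the following sketch enumerates every isoform obtainable by including or skipping each internal exon. It is a deliberately simplified model: the function name `exon_skipping_isoforms` and the example exons are hypothetical, the first and last exons are assumed to be always included, and exon extension, intron retention and all regulatory constraints are ignored, so the result is an upper bound for this one mode rather than a prediction.

```python
from itertools import product

def exon_skipping_isoforms(exons):
    """Enumerate isoforms produced by skipping any subset of internal exons.

    The first and last exons are assumed to be constitutive.
    """
    if len(exons) < 2:
        raise ValueError("need at least a first and a last exon")
    first, *internal, last = exons
    isoforms = []
    # Each internal exon is independently kept (True) or skipped (False).
    for choice in product([True, False], repeat=len(internal)):
        kept = [exon for exon, keep in zip(internal, choice) if keep]
        isoforms.append("".join([first, *kept, last]))
    return isoforms

# Illustrative exons (assumptions, not taken from a real gene).
for isoform in exon_skipping_isoforms(["AUG", "GCC", "UUU", "UAA"]):
    print(isoform)
# AUGGCCUUUUAA
# AUGGCCUAA
# AUGUUUUAA
# AUGUAA
```

With n internal exons this model yields 2^n isoforms, which is why even a modest number of cassette exons can account for many distinct transcripts.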
Experimental manipulation of splicing
Splicing events can be experimentally altered by binding steric-blocking antisense oligos, such as Morpholinos or peptide nucleic acids, to snRNP binding sites, to the branchpoint nucleotide that closes the lariat, or to splice-regulatory element binding sites.
Based on data current as of 2011, one-third of all hereditary diseases are thought to have a splicing component. Common errors include:
- Mutation of a splice site resulting in loss of function of that site. This results in exposure of a premature stop codon, loss of an exon, or inclusion of an intron.
- Mutation of a splice site reducing specificity. This may result in variation in the splice location, causing insertion or deletion of amino acids or, most likely, a disruption of the reading frame.
- Displacement of a splice site, leading to inclusion or exclusion of more RNA than expected, resulting in longer or shorter exons.
Although cells guard against many splicing errors through a quality control mechanism termed nonsense-mediated mRNA decay (NMD), a number of splicing-related diseases exist, as suggested above.
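The difference between an insertion or deletion of amino acids and a disruption of the reading frame, mentioned in the list of errors above, comes down to whether the change in mRNA length is a multiple of three. The sketch below makes that arithmetic explicit. The function name `splice_shift_effect` is hypothetical, and the model ignores the content of the affected sequence, so an in-frame insertion that happens to contain a stop codon would still be reported as in-frame.

```python
def splice_shift_effect(shift):
    """Describe the reading-frame consequence of a displaced splice site.

    shift : nucleotides added to (+) or removed from (-) the mature mRNA.
    """
    if shift == 0:
        return "no change"
    if shift % 3 == 0:
        # Whole codons are gained or lost; downstream codons are unchanged.
        kind = "insertion" if shift > 0 else "deletion"
        return f"in-frame {kind} of {abs(shift) // 3} codon(s)"
    # Any other length change alters every downstream codon.
    return "frameshift"

print(splice_shift_effect(6))   # in-frame insertion of 2 codon(s)
print(splice_shift_effect(-3))  # in-frame deletion of 1 codon(s)
print(splice_shift_effect(4))   # frameshift
```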
In addition to RNA, proteins can undergo splicing. Although the biomolecular mechanisms are different, the principle is the same: parts of the protein, called inteins instead of introns, are removed. The remaining parts, called exteins instead of exons, are fused together. Protein splicing has been observed in a wide range of organisms, including bacteria, archaea, plants, yeast and humans.
- Primary transcript
- Minor spliceosome
- Exon Junction Complex
- SWAP protein domain, a splicing regulator