Transcription (biology)

Simplified diagram of mRNA synthesis and processing. Enzymes not shown.


Transcription is the first step of gene expression, in which a particular segment of DNA is copied into RNA by the enzyme RNA polymerase. Both RNA and DNA are nucleic acids, which use base pairs of nucleotides as a complementary language that can be converted back and forth from DNA to RNA by the action of the correct enzymes. During transcription, a DNA sequence is read by an RNA polymerase, which produces a complementary, antiparallel RNA strand. Unlike DNA replication, transcription yields an RNA complement that contains uracil (U) wherever thymine (T) would occur in a DNA complement. Also unlike DNA replication, transcription does not require an RNA primer to initiate synthesis.

Transcription proceeds in the following five or six steps (the sixth applies only to cells with a nucleus), each moving like a wave along the DNA:

One or more sigma factors (in bacteria) or transcription factors (in eukaryotes) initiate transcription of a gene by enabling binding of RNA polymerase to promoter DNA.
RNA polymerase moves a transcription bubble, like the slider of a zipper, which splits the double-helix DNA molecule into two strands of unpaired DNA nucleotides by breaking the hydrogen bonds between complementary DNA nucleotides.
RNA polymerase adds matching RNA nucleotides that are paired with complementary DNA nucleotides of one DNA strand.
The RNA sugar-phosphate backbone forms with assistance from RNA polymerase, producing an RNA strand.
Hydrogen bonds of the untwisted RNA-DNA helix break, freeing the newly synthesized RNA strand.
If the cell has a nucleus, the RNA is further processed (addition of a 3' poly-A tail and a 5' cap) and exits to the cytoplasm through the nuclear pore complex.

The stretch of DNA transcribed into an RNA molecule is called a transcription unit and encodes at least one gene. If the gene transcribed encodes a protein, the result of transcription is messenger RNA (mRNA), which is then used to create that protein via the process of translation. Alternatively, the transcribed gene may encode a non-coding RNA (such as microRNA or lincRNA), ribosomal RNA (rRNA) or transfer RNA (tRNA), which are other components of the protein-assembly process, or another ribozyme.^[1]

A DNA transcription unit encoding a protein contains not only the sequence that will eventually be directly translated into the protein (the coding sequence) but also regulatory sequences that direct and regulate the synthesis of that protein. The regulatory sequence before (i.e., upstream from) the coding sequence is called the five prime untranslated region (5'UTR), and the sequence following (downstream from) the coding sequence is called the three prime untranslated region (3'UTR).^[1]
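The layout described above (5'UTR, then coding sequence, then 3'UTR) can be illustrated with a minimal sketch. It works on the mRNA rather than the DNA, and it assumes, purely for illustration, that the coding sequence runs from the first AUG to the first in-frame stop codon. The function name `split_mrna` and the example sequence are hypothetical.

```python
# Stop codons of the standard genetic code, written as RNA.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Split an mRNA (written 5'->3') into (5'UTR, coding sequence, 3'UTR).

    Simplifying assumption: the coding sequence starts at the first AUG and
    ends at the first in-frame stop codon. Returns None if no complete
    reading frame is found.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    # Walk codon by codon from the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # keep the stop codon with the coding sequence
            return mrna[:start], mrna[start:end], mrna[end:]
    return None


# Hypothetical transcript: 6 nt 5'UTR, 12 nt coding sequence, 11 nt 3'UTR.
example = "GGCACC" + "AUGGCUUGGUAA" + "CUUAAUAAAGC"
five_utr, cds, three_utr = split_mrna(example)
print(five_utr)   # GGCACC
print(cds)        # AUGGCUUGGUAA
print(three_utr)  # CUUAAUAAAGC
```

Real genes are more involved (introns, alternative start sites), so this is only a way to picture the three regions.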

Transcription has some proofreading mechanisms, but they are fewer and less effective than the controls for copying DNA; therefore, transcription has a lower copying fidelity than DNA replication.^[2]

As in DNA replication, the DNA template is read in the 3' → 5' direction during transcription, while the complementary RNA is synthesised in the 5' → 3' direction, so its 5' end is created first. Although DNA is arranged as two antiparallel strands in a double helix, only one of the two DNA strands, called the template strand, is used for transcription. This is because RNA is only single-stranded, as opposed to double-stranded DNA. The other DNA strand (the non-template strand) is called the coding strand, because its sequence is the same as the newly created RNA transcript (except for the substitution of uracil for thymine). Because only the strand read 3' → 5' is used, transcription does not need the Okazaki fragments seen in DNA replication.^[1]
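The relationship between template strand, coding strand and transcript can be shown with a short sketch. The sequences and the function name `transcribe` are made up for illustration; the template is written 3' → 5' so that reading it left to right mirrors the direction in which the polymerase reads it.

```python
# Base-pairing rules for DNA template -> RNA product.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5'->3') made from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


coding_5_to_3 = "ATGGCCTTA"    # coding (non-template) strand
template_3_to_5 = "TACCGGAAT"  # template strand, complementary to the above

rna = transcribe(template_3_to_5)
print(rna)  # AUGGCCUUA

# The transcript matches the coding strand, with U in place of T.
print(rna == coding_5_to_3.replace("T", "U"))  # True
```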

Transcription is divided into five stages: pre-initiation, initiation, promoter clearance, elongation and termination.^[1]

Major steps

Pre-initiation

In eukaryotes, RNA polymerase, and therefore the initiation of transcription, requires the presence of a core promoter sequence in the DNA. Promoters are regions of DNA that promote transcription and, in eukaryotes, are found at -30, -75, and -90 base pairs upstream from the transcription start site (abbreviated to TSS). Core promoters are sequences within the promoter that are essential for transcription initiation. RNA polymerase is able to bind to core promoters in the presence of various specific transcription factors.^{[citation needed]}

The most characterized type of core promoter in eukaryotes is a short DNA sequence known as a TATA box, found 25-30 base pairs upstream from the TSS.^{[citation needed]} The TATA box, as a core promoter, is the binding site for a transcription factor known as TATA-binding protein (TBP), which is itself a subunit of another transcription factor, called Transcription Factor II D (TFIID). After TFIID binds to the TATA box via the TBP, five more transcription factors and RNA polymerase combine around the TATA box in a series of stages to form a preinitiation complex. One transcription factor, Transcription factor II H, has two components with helicase activity and so is involved in the separating of opposing strands of double-stranded DNA to form the initial transcription bubble. However, only a low, or basal, rate of transcription is driven by the preinitiation complex alone. Other proteins known as activators and repressors, along with any associated coactivators or corepressors, are responsible for modulating transcription rate.^{[citation needed]}

Thus, the preinitiation complex contains:^{[citation needed]}

Core Promoter Sequence
Transcription Factors
RNA Polymerase
Activators and Repressors.
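The positional rule for the TATA box given above (25-30 base pairs upstream from the TSS) can be turned into a small check. This is a sketch under stated assumptions: the motif is reduced to the exact string `TATAAA`, the distance is measured from the first base of the motif to the TSS, and the function name `has_tata_box` and all sequences are hypothetical.

```python
def has_tata_box(dna, tss, motif="TATAAA"):
    """Report whether `motif` starts 25-30 bp upstream of the TSS.

    `dna` is the coding strand written 5'->3'; `tss` is the 0-based index
    of the transcription start site.
    """
    for offset in range(25, 31):
        start = tss - offset
        if start >= 0 and dna[start:start + len(motif)] == motif:
            return True
    return False


# Hypothetical promoter: the motif begins 28 bp upstream of the TSS.
promoter = "G" * 20 + "TATAAA" + "C" * 22 + "ATGGCC"
print(has_tata_box(promoter, tss=48))  # True

# Same motif only 10 bp upstream: outside the 25-30 bp window.
shifted = "G" * 38 + "TATAAA" + "C" * 4 + "ATGGCC"
print(has_tata_box(shifted, tss=48))   # False
```

Real core promoters tolerate variation in the motif, so an exact string match will miss many genuine TATA boxes.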

Transcription preinitiation in archaea is, in essence, homologous to that of eukaryotes, but is much less complex.^[3] The archaeal preinitiation complex assembles at a TATA-box binding site; however, in archaea, this complex is composed of only RNA polymerase (which resembles eukaryotic RNA polymerase II), TBP, and TFB (the archaeal homologue of eukaryotic transcription factor II B (TFIIB)).^[4]^[5]

Initiation

In bacteria, transcription begins with the binding of RNA polymerase to the promoter in DNA. The RNA polymerase core enzyme consists of five subunits: 2 α subunits, 1 β subunit, 1 β' subunit, and 1 ω subunit. At the start of initiation, the core enzyme is associated with a sigma factor that helps it locate the promoter elements centred roughly 35 and 10 base pairs upstream of the transcription start site (the -35 and -10 elements).^[6] Together, the sigma factor and the core enzyme form the holoenzyme.
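As a rough illustration of what recognising the -35 and -10 elements involves, the sketch below scans a sequence for two hexamers separated by a fixed-length spacer. The consensus hexamers `TTGACA` and `TATAAT` and the 16-18 bp spacer are commonly cited values for E. coli promoters and are assumptions here, not taken from this article. The function name `find_promoter_boxes` and the test sequence are hypothetical.

```python
def find_promoter_boxes(dna, box35="TTGACA", box10="TATAAT",
                        spacers=range(16, 19)):
    """Return (index of -35 box, index of -10 box) pairs found in `dna`.

    Both hexamers must match exactly and be separated by 16-18 bp.
    Real promoters usually deviate from the consensus, so this exact-match
    scan is only a toy model of promoter recognition.
    """
    hits = []
    pos = dna.find(box35)
    while pos != -1:
        for gap in spacers:
            p10 = pos + len(box35) + gap
            if dna[p10:p10 + len(box10)] == box10:
                hits.append((pos, p10))
        pos = dna.find(box35, pos + 1)
    return hits


spacer = "GCGCGCGCGCGCGCGCG"  # 17 bp
seq = "CCGG" + "TTGACA" + spacer + "TATAAT" + "GCGCGCA"
print(find_promoter_boxes(seq))  # [(4, 27)]
```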

Transcription initiation is more complex in eukaryotes. Eukaryotic RNA polymerase does not directly recognize the core promoter sequences. Instead, a collection of proteins called transcription factors mediates the binding of RNA polymerase and the initiation of transcription. Only after certain transcription factors are attached to the promoter does the RNA polymerase bind to it. The completed assembly of transcription factors and RNA polymerase binds to the promoter, forming a transcription initiation complex. Transcription in the archaea domain is similar to transcription in eukaryotes.^[7]

Promoter clearance

After the first bond is synthesized, the RNA polymerase must clear the promoter. During this time there is a tendency to release the RNA transcript and produce truncated transcripts. This is called abortive initiation and is common for both eukaryotes and prokaryotes.^[8]

In prokaryotes, abortive initiation continues to occur until the σ factor rearranges, resulting in the transcription elongation complex (which gives a 35 bp moving footprint). The σ factor is released according to a stochastic model.^[9] Mechanistically, promoter clearance occurs through a scrunching mechanism, in which the energy accumulated during scrunching is used to move the RNAP complex and clear the promoter.^[10] Once the transcript reaches approximately 23 nucleotides, it no longer slips and elongation can occur.^{[citation needed]} This, like most of the remainder of transcription, is an energy-dependent process, consuming adenosine triphosphate (ATP).^{[citation needed]}

In eukaryotes, after several rounds of 10 nt abortive initiation, promoter clearance coincides with TFIIH's phosphorylation of serine 5 on the carboxy-terminal domain of RNAP II, leading to the recruitment of capping enzyme (CE).^[11]^[12] The exact mechanism by which CE induces promoter clearance in eukaryotes is not yet known.

Elongation

One strand of the DNA, the template strand (or noncoding strand), is used as a template for RNA synthesis. As transcription proceeds, RNA polymerase traverses the template strand and uses base-pairing complementarity with the DNA template to create an RNA copy. Although RNA polymerase traverses the template strand from 3' → 5', the coding (non-template) strand and newly formed RNA can also be used as reference points, so transcription can be described as occurring 5' → 3'. This produces an RNA molecule from 5' → 3' that is an exact copy of the coding strand, except that thymines are replaced with uracils and the nucleotides contain a ribose (5-carbon) sugar where DNA has deoxyribose (one fewer oxygen atom) in its sugar-phosphate backbone.^{[citation needed]}

mRNA transcription can involve multiple RNA polymerases on a single DNA template and multiple rounds of transcription (amplification of particular mRNA), so many mRNA molecules can be rapidly produced from a single copy of a gene.^{[citation needed]}

Elongation also involves a proofreading mechanism that can replace incorrectly incorporated bases. In eukaryotes, this may correspond with short pauses during transcription that allow appropriate RNA editing factors to bind. These pauses may be intrinsic to the RNA polymerase or due to chromatin structure.^{[citation needed]}

Termination

Bacteria use two different strategies for transcription termination: Rho-independent termination and Rho-dependent termination. In Rho-independent transcription termination, also called intrinsic termination, RNA transcription stops when the newly synthesized RNA molecule forms a G-C-rich hairpin loop followed by a run of Us. When the hairpin forms, the mechanical stress breaks the weak rU-dA bonds that now fill the DNA-RNA hybrid. This pulls the poly-U transcript out of the active site of the RNA polymerase, in effect terminating transcription. In Rho-dependent termination, a protein factor called Rho destabilizes the interaction between the template and the mRNA, thus releasing the newly synthesized mRNA from the elongation complex.^[13]
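The sequence signature of intrinsic termination described above (a G-C-rich hairpin followed by a run of Us) can be searched for with a simple scan. This is a minimal sketch, not a real terminator predictor: the stem length, loop length, G-C threshold and U-run length are arbitrary illustrative parameters, the stem must pair perfectly, and the function names and example transcript are hypothetical.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_intrinsic_terminators(rna, stem=6, loop=4, min_gc=0.6, u_run=4):
    """Return start indices of hairpin-plus-U-run motifs in `rna`.

    A hit needs a perfectly paired stem of `stem` nt on each side of a
    `loop`-nt loop, a G+C fraction of at least `min_gc` in the stem, and
    `u_run` consecutive U residues immediately after the hairpin.
    """
    hits = []
    span = 2 * stem + loop
    for i in range(len(rna) - span - u_run + 1):
        left = rna[i:i + stem]
        right = rna[i + stem + loop:i + span]
        tail = rna[i + span:i + span + u_run]
        gc_fraction = (left.count("G") + left.count("C")) / stem
        if (right == reverse_complement(left)
                and gc_fraction >= min_gc
                and tail == "U" * u_run):
            hits.append(i)
    return hits


# Hypothetical transcript: GCCGCC / UUCG loop / GGCGGC, then six Us.
transcript = "AUGAAA" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU" + "A"
print(find_intrinsic_terminators(transcript))  # [6]
```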

Transcription termination in eukaryotes is less understood but involves cleavage of the new transcript followed by template-independent addition of As at its new 3' end, in a process called polyadenylation.^[14]

Measuring and detecting transcription

Transcription can be measured and detected in a variety of ways:^{[citation needed]}

Nuclear run-on assay: measures the relative abundance of newly formed transcripts
RNase protection assay and ChIP-Chip of RNAP: detect active transcription sites
RT-PCR: measures the absolute abundance of total or nuclear RNA; however, this may differ from transcription rates
DNA microarrays: measure the relative abundance of global total or nuclear RNA; however, this may differ from transcription rates
In situ hybridization: detects the presence of a transcript
MS2 tagging: by incorporating RNA stem loops, such as MS2, into a gene, these become incorporated into newly synthesized RNA. The stem loops can then be detected using a fusion of GFP and the MS2 coat protein, which has a high affinity, sequence-specific interaction with the MS2 stem loops. The recruitment of GFP to the site of transcription is visualised as a single fluorescent spot. This new approach has revealed that transcription occurs in discontinuous bursts, or pulses (see Transcriptional bursting). With the notable exception of in situ techniques, most other methods provide cell population averages, and are not capable of detecting this fundamental property of genes.^[15]
Northern blot: the traditional method, and until the advent of RNA-Seq, the most quantitative
RNA-Seq: applies next-generation sequencing techniques to sequence whole transcriptomes, which allows the measurement of relative abundance of RNA, as well as the detection of additional variations such as fusion genes, post-transcriptional edits and novel splice sites

Transcription factories

Active transcription units are clustered in the nucleus, in discrete sites called transcription factories or euchromatin. Such sites can be visualized by allowing engaged polymerases to extend their transcripts in tagged precursors (Br-UTP or Br-U) and immuno-labeling the tagged nascent RNA. Transcription factories can also be localized using fluorescence in situ hybridization or marked by antibodies directed against polymerases. There are ~10,000 factories in the nucleoplasm of a HeLa cell, among which are ~8,000 polymerase II factories and ~2,000 polymerase III factories. Each polymerase II factory contains ~8 polymerases. As most active transcription units are associated with only one polymerase, each factory usually contains ~8 different transcription units. These units might be associated through promoters and/or enhancers, with loops forming a ‘cloud’ around the factory.^[16]

History

A molecule that allows the genetic material to be realized as a protein was first hypothesized by François Jacob and Jacques Monod. Severo Ochoa won a Nobel Prize in Physiology or Medicine in 1959 for developing a process for synthesizing RNA in vitro with polynucleotide phosphorylase, which was useful for cracking the genetic code. RNA synthesis by RNA polymerase was established in vitro by several laboratories by 1965; however, the RNA synthesized by these enzymes had properties that suggested the existence of an additional factor needed to terminate transcription correctly.^{[citation needed]}

In 1972, Walter Fiers became the first person to prove the existence of the terminating enzyme.

Roger D. Kornberg won the 2006 Nobel Prize in Chemistry "for his studies of the molecular basis of eukaryotic transcription".^[17]

Reverse transcription

Some viruses (such as HIV, the cause of AIDS), have the ability to transcribe RNA into DNA. HIV has an RNA genome that is duplicated into DNA. The resulting DNA can be merged with the DNA genome of the host cell. The main enzyme responsible for synthesis of DNA from an RNA template is called reverse transcriptase.

In the case of HIV, reverse transcriptase is responsible for synthesizing a complementary DNA strand (cDNA) to the viral RNA genome. The enzyme ribonuclease H then digests the RNA strand, and reverse transcriptase synthesises a second, complementary strand of DNA to form double-stranded cDNA. The cDNA is integrated into the host cell's genome by the enzyme integrase, which causes the host cell to generate viral proteins that reassemble into new viral particles. In HIV infection, the host T cell subsequently undergoes programmed cell death (apoptosis).^[18] However, in other retroviruses, the host cell remains intact as the virus buds out of the cell.
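The two synthesis steps described above can be sketched at the level of sequence alone, ignoring the enzymes, primers and RNA digestion involved. The function names and the short RNA sequence are hypothetical; both strands are written 5' → 3'.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(rna_5_to_3):
    """DNA strand complementary to the RNA genome, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5_to_3))


def second_strand(dna_5_to_3):
    """DNA strand complementary to the first cDNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in reversed(dna_5_to_3))


viral_rna = "AUGGCCUUA"              # made-up fragment of an RNA genome
minus = first_strand_cdna(viral_rna)
plus = second_strand(minus)

print(minus)  # TAAGGCCAT
print(plus)   # ATGGCCTTA

# The second strand carries the RNA's sequence, with T in place of U.
print(plus == viral_rna.replace("U", "T"))  # True
```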

Some eukaryotic cells contain an enzyme with reverse transcription activity called telomerase. Telomerase is a reverse transcriptase that lengthens the ends of linear chromosomes. Telomerase carries an RNA template from which it synthesizes a repeating sequence of DNA, or "junk" DNA. This repeated sequence of DNA is called a telomere and can be thought of as a "cap" for a chromosome. It is important because every time a linear chromosome is duplicated, it is shortened. With this "junk" DNA or "cap" at the ends of chromosomes, the shortening eliminates some of the non-essential, repeated sequence rather than the protein-encoding DNA sequence, which lies farther from the chromosome end.
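The protective role of the telomere "cap" can be pictured with a toy calculation. All numbers below (telomere length, loss per division, telomerase gain) are hypothetical values chosen for illustration, not measurements, and the function name is made up.

```python
def divisions_until_cap_is_gone(telomere_bp, loss_per_division,
                                telomerase_gain=0):
    """Count the divisions a telomere can absorb before it is used up.

    Each division removes `loss_per_division` bp and telomerase adds back
    `telomerase_gain` bp. Returns None when telomerase replaces at least
    as much as is lost, meaning the cap never runs out in this model.
    """
    if telomerase_gain >= loss_per_division:
        return None
    net_loss = loss_per_division - telomerase_gain
    divisions = 0
    while telomere_bp >= net_loss:
        telomere_bp -= net_loss
        divisions += 1
    return divisions


# Hypothetical 10,000 bp telomere losing 100 bp per division.
print(divisions_until_cap_is_gone(10_000, 100))       # 100
print(divisions_until_cap_is_gone(10_000, 100, 50))   # 200
print(divisions_until_cap_is_gone(10_000, 100, 100))  # None
```

The last case corresponds to the situation in which telomerase fully offsets the loss at each division.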

Telomerase is often activated in cancer cells, enabling them to duplicate their genomes indefinitely without losing important protein-coding DNA sequence. Activation of telomerase could be part of the process that allows cancer cells to become immortal. Immortalization via telomerase-driven telomere lengthening has been shown to occur in 90% of all cancerous tumors in vivo, with the remaining 10% using an alternative telomere maintenance route called ALT, or Alternative Lengthening of Telomeres.^[19]

Inhibitors

Transcription inhibitors can be used as antibiotics against, for example, pathogenic bacteria (antibacterials) and fungi (antifungals). An example of such an antibacterial is rifampicin, which inhibits prokaryotic transcription of DNA into mRNA by binding the beta subunit of DNA-dependent RNA polymerase. 8-Hydroxyquinoline is an antifungal transcription inhibitor.^[20] Histone methylation may also inhibit transcription.

References

[1] Eldra P. Solomon, Linda R. Berg, Diana W. Martin. Biology, 8th Edition, International Student Edition. Thomson Brooks/Cole. ISBN 978-0495317142.
[2] Berg J, Tymoczko JL, Stryer L (2006). Biochemistry (6th ed.). San Francisco: W. H. Freeman. ISBN 0-7167-8724-5.
[3] Littlefield, O., Korkhin, Y., and Sigler, P.B. (1999). "The structural basis for the oriented assembly of a TBP/TFB/promoter complex". PNAS. 96 (24): 13668–13673. doi:10.1073/pnas.96.24.13668. PMC 24122. PMID 10570130.
[4] Hausner, W., Thomm, M. (2001). "Events during Initiation of Archaeal Transcription: Open Complex Formation and DNA-Protein Interactions". Journal of Bacteriology. 183 (10): 3025–3031. doi:10.1128/JB.183.10.3025-3031.2001. PMC 95201. PMID 11325929.
[5] Qureshi, SA; Bell, SD; Jackson, SP (1997). "Factor requirements for transcription in the archaeon Sulfolobus shibatae". EMBO Journal. 16 (10): 2927–2936. doi:10.1093/emboj/16.10.2927. PMC 1169900. PMID 9184236.
[6] Raven, Peter H. (2011). Biology (9th ed.). New York: McGraw-Hill. pp. 278–301. ISBN 978-0-07-353222-6.
[7] Mohamed Ouhammouch, Robert E. Dewhurst, Winfried Hausner, Michael Thomm, and E. Peter Geiduschek (2003). "Activation of archaeal transcription by recruitment of the TATA-binding protein". Proceedings of the National Academy of Sciences of the United States of America. 100 (9): 5097–5102. doi:10.1073/pnas.0837150100. PMC 154304. PMID 12692306.
[8] doi:10.1126/science.1169237.
[9] PMID 16285918.
[10] PMID 15037753.
[11] PMID 15136722.
[12] PMID 8156590.
[13] Richardson J. Rho-dependent termination and ATPases in transcript termination. Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression. 2002;1577(2):251-260. Available at: http://dx.doi.org/10.1016/S0167-4781(02)00456-6. Retrieved March 5, 2011.
[14] Lykke-Andersen S, Jensen TH. Overlapping pathways dictate termination of RNA polymerase II transcription. Biochimie. 2007;89(10):1177-82. Available at: http://dx.doi.org/10.1016/j.biochi.2007.05.007. Retrieved August 5, 2010.
[15] Raj, A. and van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216-26.
[16] Papantonis, A (26 October 2012). "TNFα signals through specialized factories where responsive coding and miRNA genes are transcribed". EMBO J.
[17] "Chemistry 2006". Nobel Foundation. Retrieved March 29, 2007.
[18] Kolesnikova I. N. (2000). "Some patterns of apoptosis mechanism during HIV-infection". Dissertation (in Russian). Retrieved February 20, 2011.
[19] ALT and Telomerase from Nature. Retrieved May 2010.
[20] 8-Hydroxyquinoline info from SIGMA-ALDRICH. Retrieved Feb 2012.
