In genetics, a promoter is a region of DNA that initiates transcription of a particular gene. Promoters are located near the transcription start sites of genes, on the same strand and upstream on the DNA (towards the 5' region of the sense strand). Promoters can be about 100–1000 base pairs long.
Overview
For transcription to take place, the enzyme that synthesizes RNA, known as RNA polymerase, must attach to the DNA near a gene. Promoters contain specific DNA sequences and response elements that provide a secure initial binding site for RNA polymerase and for proteins called transcription factors, which recruit RNA polymerase. These transcription factors recognize specific activator or repressor sequences of nucleotides within particular promoters, bind to them, and thereby regulate gene expression.
- In bacteria: the promoter is recognized by RNA polymerase and an associated sigma factor, which in turn are often brought to the promoter DNA by an activator protein binding to its own DNA binding site nearby.
- In eukaryotes: the process is more complicated, and at least seven different factors are necessary for the binding of RNA polymerase II to the promoter.
Identification of relative location
As promoters are typically immediately adjacent to the gene in question, positions in the promoter are designated relative to the transcriptional start site, where transcription of DNA begins for a particular gene. Positions upstream are negative numbers counting back from -1; for example, -100 is a position 100 base pairs upstream.
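The numbering convention can be sketched in a few lines of Python. The function name `relative_to_tss` and the example coordinates are hypothetical; the sketch assumes the common convention that the first transcribed base is numbered +1, so that there is no position 0, and that genomic coordinates increase along the plus strand.

```python
def relative_to_tss(genomic_pos: int, tss_pos: int, strand: str = "+") -> int:
    """Convert a genomic coordinate to a position relative to the TSS.

    Assumption: the transcription start site itself is +1 and the base
    immediately upstream of it is -1 (there is no position 0).
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Signed distance measured along the direction of transcription.
    offset = genomic_pos - tss_pos if strand == "+" else tss_pos - genomic_pos
    # The TSS and everything downstream are shifted by one to skip position 0.
    return offset + 1 if offset >= 0 else offset


print(relative_to_tss(4900, 5000))       # -100: 100 bp upstream on the plus strand
print(relative_to_tss(5000, 5000))       # 1: the transcription start site itself
print(relative_to_tss(5100, 5000, "-"))  # -100: upstream of a minus-strand gene
```

For a gene on the minus strand, "upstream" lies at higher genomic coordinates, which is why the sign of the offset is flipped.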
Relative location in the cell nucleus
In the cell nucleus, promoters appear to be distributed preferentially at the edges of the chromosomal territories, likely to facilitate the co-expression of genes on different chromosomes. Furthermore, in humans, promoters show certain structural features characteristic of each chromosome.
Promoter elements
- Core promoter - the minimal portion of the promoter required to properly initiate transcription
  - Includes the transcription start site (TSS) and elements directly upstream
  - A binding site for RNA polymerase
  - General transcription factor binding sites, e.g. TATA box
- Proximal promoter - the proximal sequence upstream of the gene that tends to contain primary regulatory elements
  - Approximately 250 base pairs upstream of the start site
  - Specific transcription factor binding sites
- Distal promoter - the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter
  - Anything further upstream (but not an enhancer or other regulatory region whose influence is positional/orientation independent)
  - Specific transcription factor binding sites
- In bacterial promoters, the sequence at -10 (the -10 element) has the consensus sequence TATAAT.
- The sequence at -35 (the -35 element) has the consensus sequence TTGACA.
- The above consensus sequences, while conserved on average, are not found intact in most promoters. On average, only 3 to 4 of the 6 base pairs in each consensus sequence are found in any given promoter. Few natural promoters have been identified to date that possess intact consensus sequences at both the -10 and -35; artificial promoters with complete conservation of the -10 and -35 elements have been found to transcribe at lower frequencies than those with a few mismatches with the consensus.
- Some promoters contain one or more upstream promoter element (UP element) subsites (consensus sequence 5'-AAAAAARNR-3' when centered in the -42 region; consensus sequence 5'-AWWWWWTTTTT-3' when centered in the -52 region; W = A or T; R = A or G; N = any base).
The above promoter sequences are recognized only by RNA polymerase holoenzyme containing sigma-70. RNA polymerase holoenzymes containing other sigma factors recognize different core promoter sequences.
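The degenerate letters in the UP element consensus sequences (W, R, N) can be read as character classes. As a minimal sketch, assuming only the three ambiguity codes defined above are needed, a consensus string can be translated into a regular expression and searched for in a sequence; the function names and the test sequence are hypothetical.

```python
import re

# Ambiguity codes used in the text: W = A or T, R = A or G, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "R": "[AG]", "N": "[ACGT]"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a degenerate consensus string into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_matches(consensus: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all matches, including overlaps."""
    # Wrapping the pattern in a lookahead reports overlapping matches too.
    lookahead = re.compile(f"(?={consensus_to_regex(consensus)})")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


up_subsite_42 = "AAAAAARNR"    # consensus when centered in the -42 region
up_subsite_52 = "AWWWWWTTTTT"  # consensus when centered in the -52 region

print(find_matches(up_subsite_42, "CCAAAAAAGTGCC"))  # [2]
```

An exact match to a consensus is a strict criterion; as noted above, most natural promoters deviate from the consensus at several positions.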
<-- upstream                                                            downstream -->
5'-XXXXXXXPPPPPXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
          -35        -10       Gene to be transcribed
(Note that the optimal spacing between the -35 and -10 sequences is 17 bp.)
Probability of occurrence of each nucleotide

for -10 sequence
T    A    T    A    A    T
77%  76%  60%  61%  56%  82%

for -35 sequence
T    T    G    A    C    A
69%  79%  61%  56%  54%  54%
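The percentages in the table can be turned into a short worked calculation. The sketch below sums the per-position frequencies to get the average number of consensus bases per promoter, and multiplies them to estimate how often a perfect hexamer would occur. The multiplication assumes that the six positions vary independently, which is a simplifying assumption and not something the table states; the function name is hypothetical.

```python
from math import prod

# Frequency of the consensus nucleotide at each position, from the table above.
CONSENSUS_FREQ = {
    "-10 (TATAAT)": [0.77, 0.76, 0.60, 0.61, 0.56, 0.82],
    "-35 (TTGACA)": [0.69, 0.79, 0.61, 0.56, 0.54, 0.54],
}


def summarize(freqs: list[float]) -> tuple[float, float]:
    """Return (expected number of consensus matches, P(all six match)).

    The expected count is a plain sum; the probability of a perfect match
    is a product and is valid only if positions are independent (assumed).
    """
    return sum(freqs), prod(freqs)


for name, freqs in CONSENSUS_FREQ.items():
    expected, p_perfect = summarize(freqs)
    print(f"{name}: expected matches = {expected:.2f}, "
          f"P(perfect hexamer) = {p_perfect:.3f}")
# -10 (TATAAT): expected matches = 4.12, P(perfect hexamer) = 0.098
# -35 (TTGACA): expected matches = 3.73, P(perfect hexamer) = 0.054
```

An average of roughly four matching bases per hexamer is broadly in line with the statement that only 3 to 4 of the 6 base pairs are found in any given promoter, and the low probability of a perfect hexamer fits the observation that few natural promoters carry both intact consensus sequences.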
Gene promoters are typically located upstream of the gene and can have regulatory elements several kilobases away from the transcriptional start site (enhancers). In eukaryotes, the transcriptional complex can cause the DNA to bend back on itself, which allows for placement of regulatory sequences far from the actual site of transcription. Eukaryotic RNA-polymerase-II-dependent promoters can contain a TATA element (consensus sequence TATAAA), which is recognized by the general transcription factor TATA-binding protein (TBP); and a B recognition element (BRE), which is recognized by the general transcription factor TFIIB. The TATA element and BRE typically are located close to the transcriptional start site (typically within 30 to 40 base pairs).
Eukaryotic promoter regulatory sequences typically bind proteins called transcription factors that are involved in the formation of the transcriptional complex. An example is the E-box (sequence CACGTG), which binds transcription factors in the basic helix-loop-helix (bHLH) family (e.g. BMAL1-Clock, cMyc).
Bidirectional promoters (mammalian)
Bidirectional promoters are short (<1 kbp), intergenic regions of DNA between the 5' ends of the genes in a bidirectional gene pair. A “bidirectional gene pair” refers to two adjacent genes coded on opposite strands, with their 5' ends oriented toward one another. The two genes are often functionally related, and modification of their shared promoter region allows them to be co-regulated and thus co-expressed. Bidirectional promoters are a common feature of mammalian genomes. About 11% of human genes are bidirectionally paired.
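The definition of a bidirectional gene pair can be expressed as a simple test on gene annotations. The sketch below is illustrative only: the `Gene` record, the function name, and the example genes are hypothetical, coordinates are assumed to be on a single chromosome, and only neighbors in sorted order are compared, which ignores nested or overlapping genes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate on the chromosome
    end: int     # rightmost coordinate on the chromosome
    strand: str  # "+" or "-"


def bidirectional_pairs(genes: list[Gene], max_gap: int = 1000) -> list[tuple[str, str]]:
    """Find adjacent genes on opposite strands whose 5' ends face each other."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        # A minus-strand gene has its 5' end at its rightmost coordinate and a
        # plus-strand gene at its leftmost one, so a "-" gene followed by a
        # "+" gene places both 5' ends on either side of the shared region.
        gap = right.start - left.end
        if left.strand == "-" and right.strand == "+" and 0 <= gap < max_gap:
            pairs.append((left.name, right.name))
    return pairs


genes = [
    Gene("geneA", 1000, 5000, "-"),
    Gene("geneB", 5400, 9000, "+"),
    Gene("geneC", 9500, 12000, "+"),
    Gene("geneD", 30000, 32000, "-"),
]
print(bidirectional_pairs(genes))  # [('geneA', 'geneB')]
```

The default `max_gap` of 1000 reflects the "<1 kbp" length given above; geneB and geneC are close together but lie on the same strand, so they do not form a bidirectional pair.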
Bidirectionally paired genes in the Gene Ontology database shared at least one database-assigned functional category with their partners 47% of the time. Microarray analysis has shown bidirectionally paired genes to be co-expressed to a higher degree than random genes or neighboring unidirectional genes. Although co-expression does not necessarily indicate co-regulation, methylation of bidirectional promoter regions has been shown to downregulate both genes, and demethylation to upregulate both genes. There are exceptions to this, however. In some cases (about 11%), only one gene of a bidirectional pair is expressed. In these cases, the promoter is implicated in suppression of the non-expressed gene. The mechanism behind this could be competition for the same polymerases, or chromatin modification. Divergent transcription could shift nucleosomes to upregulate transcription of one gene, or remove bound transcription factors to downregulate transcription of one gene.
Some functional classes of genes are more likely to be bidirectionally paired than others. Genes implicated in DNA repair are five times more likely to be regulated by bidirectional promoters than by unidirectional promoters. Genes encoding chaperone proteins are three times more likely, and mitochondrial genes are more than twice as likely. Many basic housekeeping and cellular metabolic genes are regulated by bidirectional promoters. The overrepresentation of bidirectionally paired DNA repair genes associates these promoters with cancer. Forty-five percent of human somatic oncogenes seem to be regulated by bidirectional promoters, significantly more than non-cancer-causing genes. Hypermethylation of the promoters between gene pairs WNT9A/CD558500, CTDSPL/BC040563, and KCNK15/BF195580 has been associated with tumors.
Certain sequence characteristics have been observed in bidirectional promoters, including a lack of TATA boxes, an abundance of CpG islands, and a symmetry around the midpoint, with Cs and As dominant on one side and Gs and Ts on the other. CCAAT boxes are common, as they are in many promoters that lack TATA boxes. In addition, the motifs NRF-1, GABPA, YY1, and ACTACAnnTCCC are represented in bidirectional promoters at significantly higher rates than in unidirectional promoters. The absence of TATA boxes in bidirectional promoters suggests that TATA boxes play a role in determining the directionality of promoters, but the existence of bidirectional promoters that do possess TATA boxes, and of unidirectional promoters that lack them, indicates that they cannot be the only factor.
Although the term "bidirectional promoter" refers specifically to promoter regions of mRNA-encoding genes, luciferase assays have shown that over half of human genes do not have a strong directional bias. Research suggests that non-coding RNAs are frequently associated with the promoter regions of mRNA-encoding genes. It has been hypothesized that the recruitment and initiation of RNA Polymerase II usually begins bidirectionally, but divergent transcription is halted at a checkpoint later during elongation. Possible mechanisms behind this regulation include sequences in the promoter region, chromatin modification, and the spatial orientation of the DNA.
Subgenomic promoters
A subgenomic promoter is a promoter added to a virus for a specific heterologous gene, resulting in the formation of mRNA for that gene alone.
Detection of promoters
A wide variety of algorithms have been developed to facilitate detection of promoters in genomic sequence, and promoter prediction is a common element of many gene prediction methods. A promoter region is located before the -35 and -10 consensus sequences. The closer the promoter region is to the consensus sequences, the more often transcription of that gene will take place. There is no set pattern for promoter regions as there is for consensus sequences.
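As a toy illustration of consensus-based detection, and not a reproduction of any published algorithm, the sketch below scans a sequence for hexamers resembling the bacterial -35 and -10 consensus sequences separated by a spacer close to the 17 bp optimum. The thresholds (at least 4 of 6 matching bases, spacers of 15 to 19 bp), the function names, and the test sequence are assumptions chosen for the example.

```python
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def matches(site: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(site, consensus))


def scan_promoters(seq: str, min_matches: int = 4, spacers=range(15, 20)):
    """Yield (pos35, pos10, spacer, score) for candidate -35/-10 pairs.

    Positions are 0-based starts of each hexamer; score is the total number
    of consensus matches over both hexamers (maximum 12).
    """
    seq = seq.upper()
    for i in range(len(seq) - 11):
        m35 = matches(seq[i:i + 6], MINUS_35)
        if m35 < min_matches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break                   # longer spacers would run off the end
            m10 = matches(seq[j:j + 6], MINUS_10)
            if m10 >= min_matches:
                yield i, j, spacer, m35 + m10


# Hypothetical sequence: perfect -35 and -10 elements with a 17 bp spacer.
test_seq = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GGGG"
print(list(scan_promoters(test_seq)))  # [(2, 25, 17, 12)]
```

A scan of this kind produces many false positives on real genomes, because short degenerate hexamers occur frequently by chance; practical promoter prediction methods combine several lines of evidence.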
Evolutionary change
A major question in evolutionary biology is how important tinkering with promoter sequences is to evolutionary change, such as the changes that have occurred in the human lineage since its separation from chimpanzees.
Some evolutionary biologists, for example Allan Wilson, have proposed that evolution in promoter or regulatory regions may be more important than changes in coding sequences over such time frames.
A key reason for the importance of promoters is their potential to incorporate endocrine and environmental signals into changes in gene expression. A great variety of changes in the extracellular or intracellular environment may affect gene expression, depending on the exact configuration of a given promoter: the combination and arrangement of specific DNA sequences that constitute the promoter define the exact groups of proteins that can be bound to it at a given time point. Once the cell receives a physiological, pathological, or pharmacological stimulus, a number of cellular proteins are modified biochemically by signal cascades. Through changes in structure, specific proteins acquire the capability to enter the nucleus of the cell and bind to promoter DNA, or to other proteins that are themselves already bound to a given promoter. The multi-protein complexes that form have the potential to change levels of gene expression. As a result, the amount of gene product inside the cell may increase or decrease.
Binding
The binding of RNAP (R) to a promoter (P) is a two-step process:
- R + P ⇌ RP (closed complex)
- RP (closed complex) → RP (open complex)
Diseases associated with aberrant promoter function
Though OMIM is a major resource for gathering information on the relationship between mutations and natural variation in gene sequence and susceptibility to hundreds of diseases, a sophisticated search strategy is required to extract diseases associated with defects in transcriptional control where the promoter is believed to have direct involvement.
This is a list of diseases where evidence suggests some promoter malfunction, through either direct mutation of a promoter sequence or mutation in a transcription factor or transcriptional co-activator.
Most diseases are heterogeneous in etiology, meaning that one "disease" is often many different diseases at the molecular level, though symptoms exhibited and response to treatment may be identical. How diseases of different molecular origin respond to treatments is partially addressed in the discipline of pharmacogenomics.
Not listed here are the many kinds of cancers involving aberrant transcriptional regulation owing to creation of chimeric genes through pathological chromosomal translocation. Importantly, intervention on the number or structure of promoter-bound proteins is one key to treating a disease without affecting expression of unrelated genes sharing elements with the target gene. Genes where change is not desirable are capable of influencing the potential of a cell to become cancerous and form a tumor.
Canonical sequences and wild-type
The use of the term canonical sequence to refer to a promoter is often problematic, and can lead to misunderstandings about promoter sequences. Canonical implies perfect, in some sense.
In the case of a transcription factor binding site, there may be a single sequence that binds the protein most strongly under specified cellular conditions. This might be called canonical.
However, natural selection may favor less energetic binding as a way of regulating transcriptional output. In this case, the most common sequence in a population may be called the wild-type sequence. It may not even be the most advantageous sequence to have under prevailing conditions.
Diseases that may be associated with promoter variations
Some cases of many genetic diseases are associated with variations in promoters or transcription factors.
Constitutive vs regulated promoters
Some promoters are called constitutive because they are active in all circumstances in the cell, while others are regulated, becoming active in response to specific stimuli.
Use of the word promoter
When referring to a promoter, some authors actually mean promoter plus operator. For example, the lac promoter is described as IPTG-inducible, which means that besides the lac promoter, the lac operator is also present. If the lac operator were not present, IPTG would have no inducing effect. Another example is the tac promoter system (Ptac): it is written as "tac promoter", while in fact it denotes both promoter and operator.
See also
- Lac operon