Cap analysis gene expression
Cap analysis gene expression (CAGE) is a technique used in molecular biology to produce a snapshot of the 5′ end of the messenger RNA population in a biological sample. Small fragments (usually 27 nucleotides long) from the very beginnings of mRNAs (the 5′ ends of capped transcripts) are extracted, reverse-transcribed to DNA, PCR-amplified and sequenced. CAGE was first published by Hayashizaki, Carninci and co-workers in 2003 and has been used extensively in the FANTOM research projects.
The output of CAGE is a set of short nucleotide sequences (often called tags) with their observed counts. Using a reference genome, a researcher can usually determine, with some confidence, the original mRNA (and therefore the gene) from which a tag was extracted. The copy numbers of CAGE tags provide a simple digital measure of RNA transcript abundance in a biological sample.
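The idea of digital quantification can be illustrated with a minimal Python sketch. It assumes the tags are already available as plain strings. The function names and the tags-per-million scaling are illustrative assumptions, not part of any published CAGE pipeline.

```python
from collections import Counter


def count_tags(tags):
    """Count how many times each tag sequence was observed."""
    return Counter(tags)


def tags_per_million(counts):
    """Scale raw counts to tags per million (an assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


# Toy input: four sequenced tags, three of them identical (made-up data).
reads = ["ACGT", "ACGT", "ACGT", "TTGA"]
counts = count_tags(reads)
tpm = tags_per_million(counts)
```

Scaling by the total number of tags makes libraries of different sequencing depth comparable. Real analyses usually count tags after mapping them to the genome rather than by raw sequence.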
Unlike the similar technique serial analysis of gene expression (SAGE, SuperSAGE), in which tags come from other parts of transcripts, CAGE is primarily used to locate exact transcription start sites (TSSs) in the genome. This knowledge in turn allows a researcher to investigate the promoter structure necessary for gene expression.
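Because a CAGE tag starts at the capped 5′ end of a transcript, the start site implied by a mapped tag is the 5′-most aligned base on the strand of the transcript. The following Python sketch tallies tags per candidate start site. The tuple layout and the 0-based, half-open coordinates are assumptions made for illustration, not a format defined by the CAGE protocol.

```python
from collections import Counter


def tss_position(start, end, strand):
    """Return the 5' end of a mapped tag.

    Coordinates are assumed to be 0-based and half-open: [start, end).
    """
    if strand == "+":
        return start
    if strand == "-":
        return end - 1
    raise ValueError(f"unknown strand: {strand!r}")


def tss_counts(alignments):
    """Count tags per (chromosome, strand, 5' position)."""
    counts = Counter()
    for chrom, start, end, strand in alignments:
        counts[(chrom, strand, tss_position(start, end, strand))] += 1
    return counts


# Toy alignments of 27-nucleotide tags (made-up coordinates).
alignments = [
    ("chr1", 100, 127, "+"),
    ("chr1", 100, 127, "+"),
    ("chr1", 200, 227, "-"),
]
peaks = tss_counts(alignments)
```

Positions supported by many tags are candidate transcription start sites. Grouping nearby positions into clusters is a separate step that this sketch does not attempt.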
However, the CAGE protocol has a known bias: a nonspecific guanine (G) is often added at the extreme 5′ end of CAGE tags, which is attributed to template-free 5′-extension during first-strand cDNA synthesis. This can cause erroneous mapping of CAGE tags, for instance to nontranscribed pseudogenes. On the other hand, the added G has also been used as a signal to select more reliable TSS peaks.
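One simple way to handle the added G during analysis is to test whether a leading G disagrees with the genome while the rest of the tag matches. The Python sketch below is a simplified illustration for plus-strand tags and exact matching only. The function name and the matching rule are assumptions, not the procedure of any specific published tool.

```python
def trim_added_g(tag, genome, start):
    """Detect and remove a probable non-templated 5' G.

    tag[i] is assumed to pair with genome[start + i] on the plus strand.
    Returns (tag, 5' position, whether a G was trimmed).
    """
    body = tag[1:]
    # The first base is a G in the tag but not in the genome.
    first_mismatch = tag[:1] == "G" and genome[start:start + 1] != "G"
    # The remainder of the tag matches the genome exactly.
    body_matches = genome[start + 1:start + 1 + len(body)] == body
    if body and first_mismatch and body_matches:
        return body, start + 1, True
    return tag, start, False


genome = "TTACGTACCA"  # made-up reference sequence
extra = trim_added_g("GACGT", genome, 1)      # leading G is not in the genome
templated = trim_added_g("GTACC", genome, 4)  # leading G is in the genome
```

The returned flag could also be tallied per position, in the spirit of using the added G as a signal of genuinely capped 5′ ends. A templated G cannot be told apart from an added one with this rule, so the flag undercounts additions.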
The original CAGE method (Shiraki et al., 2003) used CAP Trapper to capture the 5′ ends, oligo-dT primers to synthesize the cDNAs, the type IIS restriction enzyme MmeI to cleave the tags, and the Sanger method to sequence them.
Random reverse-transcription primers were introduced in 2006 by Kodzius et al. to better detect non-polyadenylated RNAs.
In 2008, barcode multiplexing was added to the DeepCAGE protocol (Maeda et al., 2008).
In nanoCAGE (Plessy et al., 2010), the 5′ ends of RNAs were captured with the template-switching method instead of CAP Trapper, in order to analyze smaller starting amounts of total RNA. Longer tags were cleaved with the type III restriction enzyme EcoP15I and sequenced directly on the Solexa (later Illumina) platform without concatenation.
The CAGEscan methodology (Plessy et al., 2010), in which the enzymatic tag cleavage is skipped and the 5′ cDNAs are sequenced paired-end, was introduced in the same article to connect novel promoters to known annotations.
With HeliScopeCAGE (Kanamori-Katayama et al., 2011), the CAP Trapper-based CAGE protocol was changed to skip the enzymatic tag cleavage and directly sequence the capped 5′ ends on the HeliScope platform, without PCR amplification. The protocol was then automated by Itoh et al. in 2012.
In 2012, the standard CAGE protocol was updated by Takahashi et al. to cleave tags with EcoP15I and sequence them on the Illumina-Solexa platform.
In 2013, Batut et al. combined CAP Trapper, template switching, and 5′-phosphate-dependent exonuclease digestion in RAMPAGE to maximize promoter specificity.
In 2014, Murata et al. published the nAnTi-CAGE protocol, where capped 5′ ends are sequenced on the Illumina platform with no PCR amplification and no tag cleavage.
- Shiraki, T; Kondo, S; Katayama, S; et al. (2003-12-23). "Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage". Proc Natl Acad Sci U S A. 100 (26): 15776–81. PMC . PMID 14663149. doi:10.1073/pnas.2136655100.
- Zhao, Xiaobei (2011). "Systematic Clustering of Transcription Start Site Landscapes". PLoS ONE. Public Library of Science. 6 (8): e23409. PMC . PMID 21887249. doi:10.1371/journal.pone.0023409. Retrieved 2011-08-24.
- Cumbie, Jason (2015). "NanoCAGE-XL and CapFilter: an approach to genome wide identification of high confidence transcription start sites". BMC Genomics. BioMed Central. 16: 597. PMC . PMID 26268438. doi:10.1186/s12864-015-1670-6. Retrieved 2017-04-05.
- Carninci, Piero (1996). "High-efficiency full-length cDNA cloning by biotinylated CAP trapper". Genomics. 37 (3): 327–36. PMID 8938445. doi:10.1006/geno.1996.0567. Retrieved 2013-10-16.
- Kodzius, Rimantas (2006). "CAGE: cap analysis of gene expression". Nat Methods. 3 (3): 211–22. PMID 16489339. doi:10.1038/nmeth0306-211. Retrieved 2013-10-16.
- Valen, Eivind (2009). "Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE". Genome Res. 19 (2): 255–265. PMC . PMID 19074369. doi:10.1101/gr.084541.108. Retrieved 2013-10-16.
- Maeda, Norihiro (2008). "Development of a DNA barcode tagging method for monitoring dynamic changes in gene expression by using an ultra high-throughput sequencer". BioTechniques. 45 (1): 95–7. PMID 18611171. doi:10.2144/000112814. Retrieved 2016-04-28.
- Plessy, Charles (2010). "Linking promoters to functional transcripts in small samples with nanoCAGE and CAGEscan". Nat Methods. 7 (7): 528–34. PMC . PMID 20543846. doi:10.1038/nmeth.1470. Retrieved 2013-10-16.
- Kanamori-Katayama, Mutsumi (2011). "Unamplified cap analysis of gene expression on a single-molecule sequencer". Genome Res. 21 (7): 1150–9. PMC . PMID 21596820. doi:10.1101/gr.115469.110. Retrieved 2013-10-16.
- Itoh, Masayoshi (2012). "Automated workflow for preparation of cDNA for cap analysis of gene expression on a single molecule sequencer". PLoS ONE. 7 (1): e30809. PMC . PMID 22303458. doi:10.1371/journal.pone.0030809. Retrieved 2013-10-16.
- Takahashi, Hazuki (2012). "5' end-centered expression profiling using cap-analysis gene expression and next-generation sequencing". Nat Protoc. 7 (3): 542–61. PMC . PMID 22362160. doi:10.1038/nprot.2012.005. Retrieved 2013-10-16.
- Batut, Philippe (2013). "High-fidelity promoter profiling reveals widespread alternative promoter usage and transposon-driven developmental gene expression". Genome Res. 23 (1): 169–80. PMC . PMID 22936248. doi:10.1101/gr.139618.112. Retrieved 2013-10-16.
- Murata, Mitsuyoshi (2014). "Detecting Expressed Genes Using CAGE". Methods Mol Biol. 1164: 67–85. PMID 24927836. doi:10.1007/978-1-4939-0805-9_7. Retrieved 2014-08-13.
- Poulain, Stéphane (2017). "NanoCAGE: A Method for the Analysis of Coding and Noncoding 5'-Capped Transcriptomes". Methods Mol Biol. 1543: 57–109. PMID 28349422. doi:10.1007/978-1-4939-6716-2_4. Retrieved 2017-04-04.