Serial analysis of gene expression

Serial analysis of gene expression (SAGE) is a technique used by molecular biologists to produce a snapshot of the messenger RNA population in a sample of interest, in the form of small tags that correspond to fragments of those transcripts. The original technique was developed by Dr. Victor Velculescu at the Oncology Center of Johns Hopkins University and published in 1995.^[1] Several variants have been developed since, most notably LongSAGE,^[2] its more robust derivative RL-SAGE (Robust-LongSAGE)^[3] and the more recent SuperSAGE.^[4] Many of these improve on the original by capturing longer tags, which allows the source gene to be identified with more confidence.

Overview

Briefly, SAGE experiments proceed as follows:

Isolate the mRNA of an input sample (e.g. a tumour).
Extract a small chunk of sequence from a defined position of each mRNA molecule.
Link these small pieces of sequence together to form a long chain (or concatemer).
Clone these chains into a vector which can be taken up by bacteria.
Sequence these chains using modern high-throughput DNA sequencers.
Process this data with a computer to count the small sequence tags.
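
The steps above can be imitated in software to make the data flow concrete. The following is a minimal sketch, not a description of any published protocol or analysis tool: the anchor site `CATG`, the 10-base tag length, the toy transcripts and all function names are assumptions made for illustration, and the concatemer is simplified to tags joined end to end.

```python
from collections import Counter

ANCHOR = "CATG"  # assumed anchoring site marking the "defined position"
TAG_LEN = 10     # assumed tag length in bases


def extract_tag(transcript, anchor=ANCHOR, tag_len=TAG_LEN):
    """Return the tag that follows the 3'-most anchor site, or None."""
    pos = transcript.rfind(anchor)
    if pos == -1:
        return None  # no anchor site, so this transcript yields no tag
    start = pos + len(anchor)
    tag = transcript[start:start + tag_len]
    # Discard tags cut short by the end of the transcript.
    return tag if len(tag) == tag_len else None


def build_concatemer(tags):
    """Join tags end to end (a simplification of the real chain)."""
    return "".join(tags)


def count_tags(concatemer, tag_len=TAG_LEN):
    """Cut a sequenced chain back into fixed-length tags and count them."""
    last_start = len(concatemer) - tag_len
    tags = [concatemer[i:i + tag_len] for i in range(0, last_start + 1, tag_len)]
    return Counter(tags)


# Toy "mRNA" sequences (made up); the first transcript is present twice.
transcripts = [
    "GGGACATGAAACCCGGGTTTAAAA",
    "GGGACATGAAACCCGGGTTTAAAA",
    "TTTTCATGCCCCCGGGGGAAAAAT",
]

tags = [t for t in (extract_tag(s) for s in transcripts) if t is not None]
counts = count_tags(build_concatemer(tags))
print(counts)
```

In this toy run the tag from the duplicated transcript is counted twice and the other tag once, which is the sense in which tag counts reflect transcript abundance.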

Analysis

The output of SAGE is a list of short sequence tags and the number of times each tag is observed. Using sequence databases, a researcher can usually determine, with some confidence, the original mRNA (and therefore the gene) from which a tag was extracted.
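
Mapping tags back to genes amounts to a lookup against a reference table built from sequence databases. The sketch below assumes such a table already exists as a Python dictionary; the tags, gene names (`GENE_A` and so on) and the `annotate` function are hypothetical. It also shows why identification is only possible "with some confidence": a short tag may match several genes or none.

```python
# Hypothetical reference table: tag -> genes whose transcripts contain it.
TAG_TO_GENES = {
    "AAACCCGGGT": ["GENE_A"],
    "CCCCCGGGGG": ["GENE_B", "GENE_C"],  # the same tag occurs in two genes
}


def annotate(tag_counts, reference):
    """Attach candidate genes and a match status to each observed tag."""
    rows = []
    # Most abundant tags first; ties broken alphabetically by tag.
    for tag, count in sorted(tag_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        genes = reference.get(tag, [])
        if not genes:
            status = "unmatched"   # possibly an unknown transcript or an error
        elif len(genes) == 1:
            status = "unique"
        else:
            status = "ambiguous"   # longer tags would help to resolve this
        rows.append((tag, count, genes, status))
    return rows


tag_counts = {"AAACCCGGGT": 2, "CCCCCGGGGG": 1, "TTTTTTTTTT": 1}
rows = annotate(tag_counts, TAG_TO_GENES)
for row in rows:
    print(row)
```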

Statistical methods can be applied to the tag and count lists from different samples in order to determine which genes differ in expression between them. For example, a normal tissue sample can be compared against a corresponding tumour to determine which genes tend to be more (or less) active.
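
One simple way to compare a single tag between two samples is a test on a 2×2 table of counts (this tag versus all other tags, in each sample). The sketch below uses a two-sided Fisher's exact test written with the standard library only; the choice of test, the function names and the example counts are assumptions made for illustration, not a method prescribed by the SAGE literature cited here. In practice many tags are tested at once, so a correction for multiple testing would also be needed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1 = a + b          # total tags in sample 1
    col1 = a + c          # total occurrences of this tag in both samples
    n = a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x copies of the tag in sample 1.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    probs = (prob(x) for x in range(lo, hi + 1))
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-9)))


def tags_per_million(count, total):
    """Normalise a raw tag count to the sequencing depth of its library."""
    return count * 1_000_000 / total


def compare_tag(count_1, total_1, count_2, total_2):
    """Return normalised abundances and a p-value for one tag in two samples."""
    p = fisher_exact_two_sided(count_1, total_1 - count_1,
                               count_2, total_2 - count_2)
    return tags_per_million(count_1, total_1), tags_per_million(count_2, total_2), p


# Made-up example: a tag seen 30 times in 50,000 tags from normal tissue
# and 90 times in 50,000 tags from the corresponding tumour.
normal_tpm, tumour_tpm, p_value = compare_tag(30, 50_000, 90, 50_000)
print(normal_tpm, tumour_tpm, p_value)
```

A small p-value together with a large difference in normalised abundance would flag the tag as a candidate for differential expression.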

Applications

Although SAGE was originally conceived for use in cancer studies, it has been successfully used to describe the transcriptome in other diseases and in a wide variety of organisms.

Comparison to DNA microarrays

The general goal of the technique is similar to that of the DNA microarray. However, SAGE is based on sequencing mRNA output rather than hybridizing it to probes, so expression is quantified by directly counting tags derived from transcripts. Microarray spot intensities, by contrast, fall on a continuous gradient and are prone to background noise, which makes SAGE measurements more exact. In addition, the mRNA sequences do not need to be known a priori, so previously unknown genes or gene variants can be discovered. Microarray experiments are much cheaper to perform, however, so large-scale studies do not typically use SAGE.

Variant Protocols: miRNA cloning

MicroRNAs, or miRNAs for short, are small (~22 nt) segments of RNA which have been found to play a crucial role in gene regulation. One of the most commonly used methods for cloning and identifying miRNAs within a cell or tissue was developed in the Bartel Lab and published in a paper by Lau et al. (2001). Since then, several variant protocols have arisen, but most share the same basic format. The procedure is quite similar to SAGE: the small RNAs are isolated, linkers are added to each, and the RNA is converted to cDNA by RT-PCR. The linkers, which contain internal restriction sites, are then digested with the appropriate restriction enzyme, and the resulting sticky ends are ligated together to form concatemers. The concatemers are ligated into plasmids, which are used to transform bacteria and so generate many copies of the plasmid containing the inserts. The inserts may then be sequenced to identify the miRNAs present and to estimate the expression level of a given miRNA by counting the number of times it occurs, as in SAGE.
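
The final counting step of this procedure can be sketched as splitting each sequenced concatemer at the restriction site left between inserts and tallying the pieces. This is an illustration under stated assumptions only: the site `GAATTC` is a placeholder rather than the enzyme used by Lau et al., the two 22-base inserts are made up, the 18 to 26 base length window is an arbitrary choice, and strand orientation is ignored.

```python
from collections import Counter

SITE = "GAATTC"  # placeholder restriction site separating inserts (assumption)


def split_inserts(read, site=SITE, min_len=18, max_len=26):
    """Split one concatemer read at the site and keep miRNA-sized pieces."""
    pieces = read.split(site)
    return [p for p in pieces if min_len <= len(p) <= max_len]


def count_small_rnas(reads, site=SITE):
    """Count how often each insert sequence occurs across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(split_inserts(read, site))
    return counts


# Two made-up 22-base inserts standing in for cloned small RNAs.
seq_a = "ACGTACGTACGTACGTACGTAC"
seq_b = "TTGACCTTGACCTTGACCTTGA"

reads = [
    SITE.join([seq_a, seq_b, seq_a]),
    SITE.join([seq_b, seq_a]),
]

mirna_counts = count_small_rnas(reads)
print(mirna_counts.most_common())
```

A real analysis would also have to handle inserts that happen to contain the site themselves and sequencing errors within the site, both of which this sketch ignores.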

References

[1] Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. (1995). "Serial analysis of gene expression". Science. 270 (5235): 484–7. doi:10.1126/science.270.5235.484. PMID 7570003.
[2] Saha S; et al. (2002). "Using the transcriptome to annotate the genome". Nat Biotechnol. 20 (5): 508–12. doi:10.1038/nbt0502-508. PMID 11981567.
[3] Gowda M, Jantasuriyarat C, Dean RA, Wang GL. (2004). "Robust-LongSAGE (RL-SAGE): a substantially improved LongSAGE method for gene discovery and transcriptome analysis". Plant Physiol. 134 (3): 890–7. doi:10.1104/pp.103.034496. PMC 389912. PMID 15020752.
[4] Matsumura H, Ito A, Saitoh H, Winter P, Kahl G, Reuter M, Krüger DH, Terauchi R. (2005). "SuperSAGE". Cell Microbiol. 7 (1): 11–8. doi:10.1111/j.1462-5822.2004.00478.x. PMID 15617519.
