2 base encoding
This article was last edited by Hashtpa9 at 22:25, 28 February 2008 (UTC).
The dream of re-sequencing a whole human genome in a reasonable time and at a reasonable cost (less than $1000) is becoming a reality with recently developed next-generation sequencing technologies. These technologies generate hundreds of thousands of short sequence reads at one time. Well-known examples include 454 pyrosequencing (introduced in 2005), the Solexa system (introduced in 2006) and 2-base encoding sequencing (introduced in 2007-2008). These methods have reduced the cost from almost $0.01/base in 2004 to nearly $0.0001/base in 2006 and increased sequencing capacity from 1,000,000 bases/machine/day in 2004 to more than 100,000,000 bases/machine/day in 2006.
General features
The general steps in all of these next-generation sequencing techniques include:
1- Random fragmentation of genomic DNA
2- Immobilization of single DNA fragments on a solid support like a bead or planar solid surface
3- Amplification of DNA fragments on the solid surface by PCR to form polymerase colonies
4- Sequencing and subsequent in situ interrogation after each cycle using fluorescence scanning or chemiluminescence [1].
In 2005 Shendure et al. described a sequencing procedure that uses multiple cycles of ligation of fluorescently labeled 9-mer probes, each of which distinguishes its central base. In each cycle, every fifth base is identified. The process is then repeated with different primers to sequence the remaining four bases in each gap [2]. The most recent next-generation sequencing technology, called 2-base encoding or SOLiD (Sequencing by Oligonucleotide Ligation and Detection), has been developed by Applied Biosystems and will be commercially available in 2008. Like the method of Shendure et al., and unlike the other two next-generation sequencing technologies, 2-base encoding is based on sequencing by ligation rather than sequencing by synthesis. Its fundamental difference from the earlier 9-mer probes, which distinguish a single central base, is that it uses fluorescently labeled 8-mer probes that distinguish the 2 central bases.
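The every-fifth-base scheme described for Shendure et al. can be pictured with a small sketch. It is only an illustration of the description above, not the published protocol: the function name `interrogated_positions`, the 0-based positions and the idea of numbering primers by their offset are assumptions made for this example.

```python
def interrogated_positions(read_length, spacing=5):
    """Map each primer offset to the template positions it reads.

    Hypothetical helper: one primer round reads every `spacing`-th base,
    and each further round starts one base later to fill the gaps.
    """
    return {
        offset: list(range(offset, read_length, spacing))
        for offset in range(spacing)
    }


# Five primer rounds over a 15-base stretch (positions are 0-based).
rounds = interrogated_positions(15)
for offset, positions in rounds.items():
    print(f"primer offset {offset}: positions {positions}")
```

Taken together, the five rounds touch every position exactly once, which is why one initial round plus four further primers are enough to fill each gap.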
How it works
The SOLiD Sequencing System uses probes with dual base encoding.
The underlying chemistry is summarized in the following steps:
- Step1, preparing a library: This step begins with shearing the genomic DNA into small fragments. Then two different adapters are added (for example A1 and A2). The resultant library contains template DNA fragments, which are tagged with one adapter at each end.
- Step2, emulsion PCR: in this step, an emulsion (water-in-oil) PCR reaction is performed using DNA fragments from the library, two primers (P1 and P2) complementary to the previously added adapters (P1 to A1 and P2 to A2), the other PCR reaction components, and beads coupled with one of the primers (e.g. P1). The aim is to place one DNA template and one bead in a single emulsion droplet.
In each droplet, the DNA template anneals to the P1-coupled bead through its A1 end. DNA polymerase then extends from P1 to make the complementary sequence, which eventually results in a bead enriched with PCR products from a single template. After the PCR reaction, the templates are denatured and dissociate from the beads.
- Step3, bead enrichment: in practice, only 30% of beads carry target DNA. To enrich for these beads, large polystyrene beads coated with A2 are added to the solution. Any bead carrying extended products binds a polystyrene bead through its P2 end. The resulting complexes are separated from the untargeted beads and then melted to dissociate the targeted beads from the polystyrene. This step can increase the proportion of targeted beads from 30% before enrichment to 80% after enrichment.
After enrichment, the 3’ end of the products (the P2 end) is modified so that it can form a covalent bond in the next step. The products of this step are therefore DNA-coupled beads in which each DNA strand carries a 3’ modification.
- Step4, bead deposition: in this step, the products of the previous step are deposited onto a glass slide. The beads attach to the glass surface randomly through covalent bonds between the 3’-modified beads and the glass.
- Step5, sequencing reaction
To decode the colors, we need to know two things: first, that each color represents a pair of adjacent bases, and second, the identity of one base in the sequence [3].
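The decoding rule can be sketched in a few lines of Python. The article does not list the color table, so this sketch assumes the commonly described scheme in which the bases are numbered A=0, C=1, G=2, T=3 and each color (0 to 3) is the XOR of two adjacent bases. The names `encode` and `decode` are made up for this illustration.

```python
# Assumed numbering of bases; a color is the XOR of two adjacent bases.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode(seq):
    """Turn a base sequence into a list of colors, one per adjacent pair."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colors):
    """Rebuild the base sequence from one known base plus the colors."""
    bases = [first_base]
    for color in colors:
        # The previous base and the color together fix the next base.
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


colors = encode("ATGGC")
print(colors)               # [3, 1, 0, 3]
print(decode("A", colors))  # ATGGC
```

Under this assumption each color is shared by four different base pairs, so the colors alone are ambiguous; the known first base is what anchors the whole read.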
Advantages
- Accuracy: each base in this sequencing method is interrogated and interpreted twice. This can increase the accuracy of the system to more than 99.94%, which is higher than that of the other systems.
- Detection of SNPs and other small changes: one of the main advantages of this system is the ease with which it detects single nucleotide polymorphisms (SNPs) as well as other small alterations in the template sequence. A change in two adjacent colors is characteristic of a SNP. The detection of other alterations is summarized in Figure 2.
- Detection of errors: as discussed earlier, each base in this system is represented by two colors, so an alteration in any single base results in a change in two colors. Therefore, while a change in two or more colors indicates a real change in the sequence, a change in just one color indicates an error and should not be considered a real change (Figure 2).
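The error-versus-change rule above can be illustrated with a minimal sketch. It assumes that the read and the reference are already aligned and have both been converted to lists of colors of equal length. The function name `classify_color_changes` and the labels it returns are hypothetical, and a real variant caller would apply further checks.

```python
def classify_color_changes(ref_colors, read_colors):
    """Group color mismatches into runs and label each run.

    A run of one mismatching color is labeled a likely error; a run of
    two or more adjacent mismatching colors is labeled a candidate change.
    """
    if len(ref_colors) != len(read_colors):
        raise ValueError("color lists must have the same length")

    mismatches = [
        i for i, (r, q) in enumerate(zip(ref_colors, read_colors)) if r != q
    ]

    calls = []
    i = 0
    while i < len(mismatches):
        start = end = mismatches[i]
        # Extend the run while the next mismatch is directly adjacent.
        while i + 1 < len(mismatches) and mismatches[i + 1] == end + 1:
            i += 1
            end = mismatches[i]
        label = "likely error" if start == end else "candidate change"
        calls.append((start, end, label))
        i += 1
    return calls


reference = [3, 1, 0, 3]
print(classify_color_changes(reference, [3, 3, 2, 3]))  # two adjacent colors differ
print(classify_color_changes(reference, [3, 1, 0, 2]))  # one isolated color differs
```

The first call reports one candidate change spanning colors 1 to 2, and the second reports a likely error at color 3.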
Conclusion
Next-generation sequencing method | Number of bp/run | Duration of each run | Read length (bp) | Cost | Accuracy |
---|---|---|---|---|---|
454 | 100 million | 7.5 hours | up to 250 | 1:10-20 of Sanger sequencing | > 99.5% |
Solexa (1G) | 1000 million | 3 days | up to 50 | 1:20-30 of 454 sequencing | > 99.93% |
2 base encoding | 2000 million | 4 days | up to 35 | 1:20-30 of 454 sequencing | > 99.94% |
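To put the platforms on a common footing, the figures in the table can be converted to bases per day and to a minimum number of reads per run. This is only back-of-the-envelope arithmetic on the values listed above. The dictionary layout and the `summarize` helper are assumptions for this sketch, and because the read lengths are "up to" values, the read counts are lower bounds.

```python
# Values copied from the comparison table; run time is expressed in days.
platforms = {
    "454": {"bp_per_run": 100e6, "run_days": 7.5 / 24, "read_length": 250},
    "Solexa (1G)": {"bp_per_run": 1000e6, "run_days": 3, "read_length": 50},
    "2 base encoding": {"bp_per_run": 2000e6, "run_days": 4, "read_length": 35},
}


def summarize(platform):
    """Derive daily throughput and a lower bound on reads per run."""
    return {
        "bp_per_day": platform["bp_per_run"] / platform["run_days"],
        "min_reads_per_run": platform["bp_per_run"] / platform["read_length"],
    }


summary = {name: summarize(values) for name, values in platforms.items()}
for name, stats in summary.items():
    print(
        f"{name}: {stats['bp_per_day'] / 1e6:.0f} million bp/day, "
        f"at least {stats['min_reads_per_run'] / 1e6:.1f} million reads/run"
    )
```

On these numbers the three systems come out at roughly 320, 333 and 500 million bp per day respectively, so the per-day gap is smaller than the per-run figures suggest.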
References
- [1] Sequencing breakthroughs for genomic ecology and evolutionary biology. http://www.blackwell-synergy.com/doi/full/10.1111/j.1471-8286.2007.02019.x?cookieSet=1
- [2] Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome. http://www.sciencemag.org/cgi/content/full/309/5741/1728
- [3] http://seqanswers.com/forums/showthread.php?t=10