Single molecule real time sequencing
Single molecule real time sequencing (SMRT) is a parallelized single molecule DNA sequencing method. It uses a zero-mode waveguide (ZMW), a structure that creates an illuminated observation volume small enough to observe a single nucleotide being incorporated by DNA polymerase. A single DNA polymerase enzyme is affixed at the bottom of each ZMW with a single molecule of DNA as a template. Each of the four DNA bases is attached to one of four different fluorescent dyes. When a nucleotide is incorporated by the DNA polymerase, the fluorescent tag is cleaved off and diffuses out of the observation area of the ZMW, where its fluorescence is no longer observable. A detector records the fluorescent signal of each nucleotide incorporation, and the base call is made according to the corresponding fluorescence of the dye.
The DNA sequencing is done on a chip that contains many ZMWs. Inside each ZMW, a single active DNA polymerase with a single molecule of single-stranded DNA template is immobilized at the bottom. Light penetrates through the bottom and creates a visualization chamber that allows the activity of the DNA polymerase to be monitored at the single-molecule level. The signal from each phospho-linked nucleotide incorporated by the DNA polymerase is detected as DNA synthesis proceeds, which yields the DNA sequence in real time.
For each of the nucleotide bases, there is a corresponding fluorescent dye molecule that enables the detector to identify the base being incorporated by the DNA polymerase as it performs DNA synthesis. The fluorescent dye molecule is attached to the phosphate chain of the nucleotide. When the nucleotide is incorporated by the DNA polymerase, the fluorescent dye is cleaved off with the phosphate chain as part of the natural DNA synthesis process, during which a phosphodiester bond is created to elongate the DNA chain. The cleaved fluorescent dye molecule then diffuses out of the detection volume so that the fluorescent signal is no longer detected.
The ZMW holes are ~70 nm in diameter and ~100 nm in depth. Due to the behavior of light when it travels through a small aperture, the optical field decays exponentially inside the chamber.
The observation volume within an illuminated ZMW is ~20 zeptoliters (20 × 10⁻²¹ liters). Within this volume, the activity of DNA polymerase incorporating a single nucleotide can be readily detected.
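To put these figures side by side, the following minimal sketch compares the quoted observation volume with the geometric volume of the hole. It assumes the ZMW is an ideal cylinder with the approximate dimensions given above, which is a simplification; the function and variable names are illustrative only.

```python
import math

# Approximate dimensions and observation volume quoted in the text.
DIAMETER_NM = 70.0
DEPTH_NM = 100.0
OBSERVATION_VOLUME_ZL = 20.0

# Unit conversion: 1 nm^3 = 1e-27 m^3 = 1e-24 L = 1e-3 zL.
ZL_PER_NM3 = 1e-3


def cylinder_volume_zl(diameter_nm, depth_nm):
    """Volume of an idealized cylindrical well, in zeptoliters."""
    radius_nm = diameter_nm / 2.0
    return math.pi * radius_nm ** 2 * depth_nm * ZL_PER_NM3


geometric_volume_zl = cylinder_volume_zl(DIAMETER_NM, DEPTH_NM)
observed_fraction = OBSERVATION_VOLUME_ZL / geometric_volume_zl

print(f"geometric volume: {geometric_volume_zl:.0f} zL")
print(f"observed fraction: {observed_fraction:.1%}")
```

Under this idealization the hole holds roughly 385 zeptoliters, so the ~20 zeptoliter observation volume is only about 5% of it. This is consistent with the optical field decaying exponentially inside the chamber, so that only the region near the bottom is effectively illuminated.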
Sequencing performance for the technology can be measured in read length and total throughput per experiment.
Pacific Biosciences commercialized SMRT sequencing in 2011, after releasing a beta version of its instrument in late 2010. At commercialization, read length had a normal distribution with a mean of about 1.1 kilobases. A new chemistry kit released in early 2012 increased the sequencer's read length; an early customer of the chemistry cited mean read lengths of 2.5 to 2.9 kilobases. The XL chemistry kit released in late 2012 increased average read length to more than 4.3 kilobases. The P4 binding kit released in August 2013, combined with the XL chemistry kit, yielded average read lengths of more than 5 kilobases and, when coupled with input DNA size selection (using an electrophoresis instrument such as BluePippin), average read lengths of over 7 kilobases.
Throughput per experiment is influenced both by the read length of the DNA molecules sequenced and by the total multiplex of a SMRT Cell. The prototype SMRT Cell contained about 3,000 ZMW holes that allowed parallelized DNA sequencing. At commercialization, each SMRT Cell was patterned with 150,000 ZMW holes that were read in two sets of 75,000. In April 2013, the company released a new version of the sequencer called the "PacBio RS II" that uses all 150,000 ZMW holes concurrently, doubling the throughput per experiment. As of November 2013, the highest-throughput mode, which combines P5 binding, C3 chemistry, BluePippin size selection, and a PacBio RS II, officially yielded 350 megabases per SMRT Cell, though a human de novo data set released with the chemistry averaged 500 megabases per SMRT Cell. Throughput varies with the type of sample being sequenced. In September 2015, the company announced the launch of a new sequencing instrument, the Sequel System, that increased capacity to 1 million ZMW holes.
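As a rough illustration of how yield relates to the number of ZMW holes, the sketch below divides the per-cell yields quoted above by the 150,000 holes of a PacBio RS II SMRT Cell. The result is an average over every hole on the cell, not a read length: the text does not state what fraction of holes produce a usable read, so no such fraction is assumed here. The function name is illustrative.

```python
# Figures quoted in the text for the PacBio RS II (November 2013).
ZMW_HOLES_PER_CELL = 150_000
OFFICIAL_YIELD_BASES = 350e6       # 350 megabases per SMRT Cell
HUMAN_DATASET_YIELD_BASES = 500e6  # 500 megabases per SMRT Cell


def bases_per_zmw(total_bases, zmw_count):
    """Average yield per ZMW hole, counting every hole on the cell."""
    return total_bases / zmw_count


official_per_zmw = bases_per_zmw(OFFICIAL_YIELD_BASES, ZMW_HOLES_PER_CELL)
human_per_zmw = bases_per_zmw(HUMAN_DATASET_YIELD_BASES, ZMW_HOLES_PER_CELL)

print(f"official spec: {official_per_zmw:.0f} bases per ZMW hole")
print(f"human data set: {human_per_zmw:.0f} bases per ZMW hole")
```

These averages come to roughly 2,300 and 3,300 bases per hole respectively. Since not every hole necessarily yields a read, the mean length of the reads actually produced would be expected to be higher than these per-hole averages.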
Single molecule real time sequencing may be applicable to a broad range of genomics research.
For de novo genome sequencing, read lengths from single molecule real time sequencing are comparable to or greater than those from the Sanger sequencing method, which is based on dideoxynucleotide chain termination. The longer read length makes de novo genome sequencing and genome assembly easier. Scientists are also using single molecule real time sequencing in hybrid assemblies for de novo genomes, combining short-read sequence data with long-read sequence data. In 2012, several peer-reviewed publications demonstrated the automated finishing of bacterial genomes, including one paper that updated the Celera Assembler with a pipeline for genome finishing using long SMRT sequencing reads. In 2013, scientists estimated that long-read sequencing could be used to fully assemble and finish the majority of bacterial and archaeal genomes.
The same DNA molecule can be resequenced independently by creating a circular DNA template and using a strand-displacing enzyme that separates the newly synthesized DNA strand from the template. In August 2012, scientists from the Broad Institute published an evaluation of SMRT sequencing for SNP calling.
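One use of repeated, independent passes over the same molecule is to combine them so that random errors in individual passes are outvoted. The toy sketch below illustrates the idea with a column-wise majority vote. It assumes the passes are already aligned and of equal length, and the example reads are made up; it is not the algorithm used by any actual sequencing software.

```python
from collections import Counter


def majority_consensus(passes):
    """Column-wise majority vote over equal-length, pre-aligned passes."""
    if not passes:
        raise ValueError("need at least one pass")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("passes must be pre-aligned to equal length")
    consensus = []
    for column in zip(*passes):
        # most_common(1) returns the most frequent base in this column;
        # ties go to the base that was seen first.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Hypothetical passes over one molecule; three of them contain one error.
passes = ["ACGTTGCA", "ACGTAGCA", "ACCTTGCA", "ACGTTGGA", "ACGTTGCA"]
consensus = majority_consensus(passes)
print(consensus)
```

In this made-up example each error occurs in only one pass at a given position, so the vote recovers the underlying sequence. Real reads also contain insertions and deletions, which is why an alignment step would be needed before any such vote.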
The dynamics of the polymerase can indicate whether a base is methylated. Scientists have demonstrated the use of single molecule real time sequencing for detecting methylation and other base modifications. In 2012, a team of scientists used SMRT sequencing to generate the full methylomes of six bacteria. In November 2012, scientists published a report on genome-wide methylation of an outbreak strain of E. coli.