Transposon sequencing

Transposon insertion sequencing (Tn-seq) combines transposon insertional mutagenesis with massively parallel sequencing (MPS) of the transposon insertion sites to identify genes contributing to a function of interest in bacteria. The method was originally established by concurrent work in four laboratories under the acronyms HITS,^[1] INSeq,^[2] TraDIS,^[3] and Tn-seq.^[4] Numerous variations have subsequently been developed and applied to diverse biological systems. Collectively, the methods are often termed Tn-seq, as they all involve monitoring the fitness of transposon insertion mutants via DNA sequencing.^[5]

Transposons are highly regulated, discrete DNA segments that can relocate within the genome. They are ubiquitous, occurring in Eubacteria, Archaea, and Eukarya, including humans. Transposons have a large influence on gene expression and can be used to determine gene function: when a transposon inserts into a gene, the gene's function is usually disrupted.^[6] Because of this property, transposons have been adapted for use in insertional mutagenesis.^[7] The development of microbial genome sequencing was a major advance for transposon mutagenesis,^[8]^[9] because the function affected by a transposon insertion could be linked to the disrupted gene by locating the insertion site in the sequenced genome. Massively parallel sequencing allows the insertion sites in large mixtures of different mutants to be sequenced simultaneously. Genome-wide analysis is therefore feasible if the mutant collection contains transposon insertions distributed throughout the genome.^[5]

Transposon sequencing requires the creation of a transposon insertion library: a pool of mutants that collectively carry transposon insertions in all non-essential genes. The library is grown under an experimental condition of interest. Mutants with transposons inserted in genes required for growth under the test condition decrease in frequency in the population. To identify the mutants being lost, genomic sequences adjacent to the transposon ends are amplified by PCR and sequenced by MPS to determine the location and abundance of each insertion mutation. The importance of each gene for growth under the test condition is determined by comparing the abundance of each mutant before and after growth under that condition. Tn-seq is useful for studying both the fitness contribution of single genes and gene interactions.^[10]
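The before-and-after comparison described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the fitness calculation used in any of the published protocols: it assumes read counts per insertion site are already available, that each site has been assigned to a gene, and it scores each gene by the log2 ratio of its relative abundance after versus before selection. The function name, the pseudocount, and the example counts are all hypothetical.

```python
import math
from collections import defaultdict


def gene_log2_fold_change(before, after, insertion_to_gene, pseudocount=1.0):
    """Per-gene log2 ratio of relative read abundance (after / before).

    before, after: dicts mapping insertion position -> read count.
    insertion_to_gene: dict mapping insertion position -> gene name.
    pseudocount: small value added to avoid division by zero (an assumption).
    """
    total_before = sum(before.values())
    total_after = sum(after.values())

    # Pool the reads of all insertions that fall in the same gene.
    gene_before = defaultdict(float)
    gene_after = defaultdict(float)
    for site, gene in insertion_to_gene.items():
        gene_before[gene] += before.get(site, 0)
        gene_after[gene] += after.get(site, 0)

    scores = {}
    for gene in gene_before:
        freq_before = (gene_before[gene] + pseudocount) / total_before
        freq_after = (gene_after[gene] + pseudocount) / total_after
        scores[gene] = math.log2(freq_after / freq_before)
    return scores


# Hypothetical counts: insertions in geneA are depleted after selection.
before = {101: 500, 250: 500, 900: 500, 1200: 500}
after = {101: 50, 250: 40, 900: 950, 1200: 960}
insertion_to_gene = {101: "geneA", 250: "geneA", 900: "geneB", 1200: "geneB"}

scores = gene_log2_fold_change(before, after, insertion_to_gene)
for gene, score in sorted(scores.items()):
    print(f"{gene}: {score:+.2f}")
```

A strongly negative score marks a gene whose disruption reduces growth under the test condition; a score near zero or above marks a gene that is dispensable under that condition.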

Signature-tagged mutagenesis (STM) is an older technique that also involves pooling transposon insertion mutants to determine the importance of the disrupted genes under selective growth conditions.^[11] High-throughput versions of STM use genomic microarrays, which are less accurate and have a lower dynamic range than massively parallel sequencing.^[5] With the invention of next-generation sequencing, genomic data became increasingly available. However, knowledge of gene function, rather than sequence data, remains the limiting factor in understanding the roles genes play.^[12]^[13] A high-throughput approach for studying genotype–phenotype relationships, such as Tn-seq, was therefore needed.

Methodology

Transposon sequencing begins by transducing a bacterial population with a transposon carried by a bacteriophage. Tn-seq uses the Himar1 mariner transposon, a commonly used and stable transposon. After transduction, genomic DNA from the mutant pool is cleaved and the transposon-genome junctions are amplified by PCR. A recognition site for MmeI, a type IIS restriction endonuclease, can be introduced by a single-nucleotide change in each terminal repeat of the mariner transposon.^[14] The introduced site lies 4 base pairs before the end of the terminal repeat.

MmeI makes a staggered cut, leaving a 2-base overhang, 20 bases downstream of its recognition site.^[15]

When MmeI digests DNA from a library of transposon insertion mutants, it releases fragments that contain the left or right transposon end together with 16 base pairs of flanking genomic DNA. These 16 base pairs are enough to determine the location of the transposon insertion in the bacterial genome. The 2-base overhang left by MmeI facilitates ligation of an adaptor. A primer specific to the adaptor and a primer specific to the transposon are then used to amplify the fragment by PCR. The 120 base pair product is isolated by agarose gel or polyacrylamide gel electrophoresis (PAGE) purification, and massively parallel sequencing is used to determine the sequences of the flanking 16 base pairs.^[10]
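The 16 base pair figure follows from the geometry described above: the MmeI site ends 4 bases before the end of the terminal repeat and the enzyme cuts 20 bases downstream, so 20 − 4 = 16 genomic bases stay attached to the transposon end. The Python sketch below illustrates this on the top strand only and then locates the flank in a reference by exact string search. The transposon-end and reference sequences are made up for the example, the function name is hypothetical, and real pipelines use a read aligner rather than `str.find`.

```python
import re

# MmeI recognition sequence 5'-TCCRAC-3' (R = A or G).
MMEI_SITE = re.compile(r"TCC[AG]AC")
# Top-strand cut distance downstream of the site, as stated in the text.
TOP_STRAND_CUT = 20


def flank_after_mmei(transposon_end: str, genome_after: str) -> str:
    """Return the genomic bases left attached to the transposon end.

    Simplified model: top strand only, using the last MmeI site found
    inside the transposon end sequence.
    """
    sites = list(MMEI_SITE.finditer(transposon_end))
    if not sites:
        raise ValueError("no MmeI site in the transposon end")
    fragment = transposon_end + genome_after
    cut = sites[-1].end() + TOP_STRAND_CUT
    return fragment[len(transposon_end):cut]


# Hypothetical sequences, for illustration only.
transposon_end = "GGTTAACC" + "TCCGAC" + "AGGT"  # site ends 4 bases before the end
reference = "TTGACCAGATTACAGGCTTAACGTTCAGGATCCAA"
insertion_point = 7                              # transposon inserted after base 7

flank = flank_after_mmei(transposon_end, reference[insertion_point:])
position = reference.find(flank)                 # exact-match "mapping"

print(flank, len(flank), position)
```

In this toy case the recovered flank is 16 bases long and maps back to the position where the insertion was placed.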

Gene function is then inferred from the effect of each insertion on mutant fitness under the conditions tested.

Advantages and disadvantages

Unlike high-throughput insertion tracking by deep sequencing (HITS) and transposon-directed insertion site sequencing (TraDIS), Tn-seq is specific to the Himar1 mariner transposon and cannot be applied to other transposons or insertional elements.^[10] However, the Tn-seq protocol is less time-intensive. HITS and TraDIS use a DNA shearing technique that produces a range of PCR product sizes, which can cause shorter DNA templates to be preferentially amplified over longer ones. Tn-seq produces a product of uniform size, reducing the possibility of PCR bias.^[10]
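To see why a spread of template sizes matters, consider a toy model in which each PCR cycle multiplies a template by (1 + efficiency). The Python sketch below compares a short and a long template that start at equal abundance. The efficiencies and cycle number are hypothetical values chosen for illustration, not measurements from any of the cited protocols.

```python
def relative_yield(efficiency: float, cycles: int) -> float:
    """Fold amplification after a number of cycles in a simple exponential model."""
    return (1.0 + efficiency) ** cycles


cycles = 20
short_efficiency = 0.95  # hypothetical: short template copied more efficiently
long_efficiency = 0.85   # hypothetical: long template copied less efficiently

# Sheared library: templates of different length amplify at different rates.
bias = relative_yield(short_efficiency, cycles) / relative_yield(long_efficiency, cycles)

# Uniform product size: every template shares the same efficiency.
uniform = relative_yield(0.90, cycles) / relative_yield(0.90, cycles)

print(f"short/long ratio after {cycles} cycles: {bias:.2f}")
print(f"ratio with uniform product size: {uniform:.2f}")
```

Even a modest per-cycle difference compounds over many cycles, so two equally abundant mutants can appear to differ severalfold, whereas templates of identical length keep their original ratio.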

Tn-seq can be used both to determine the fitness contribution of single genes and to map gene interactions in microorganisms. Existing methods for these types of study depend on preexisting genomic microarrays or gene knockout arrays, whereas Tn-seq does not. Because it relies on massively parallel sequencing, Tn-seq is reproducible, sensitive, and robust.^[10]

Applications

Tn-seq has proven to be a useful technique for identifying new gene functions. Its high sensitivity allows it to detect genotype-phenotype relationships that less sensitive methods might have deemed insignificant. For example, Tn-seq identified essential genes and pathways that are important for the utilization of cholesterol in Mycobacterium tuberculosis.^[16]

Tn-seq has been used to study higher-order genome organization through gene interactions. Genes function as a highly linked network, so studying a gene's impact on phenotype requires that its interactions with other genes also be considered. These networks can be studied by screening for synthetic lethality and for other genetic interactions, in which a double mutant shows a fitness that differs from the value expected from the two single mutants. Tn-seq was used to determine genetic interactions between five query genes and the rest of the genome in Streptococcus pneumoniae, which revealed both aggravating and alleviating genetic interactions.^[4]^[10]
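One way to make "unexpected fitness" concrete is to compare the observed double-mutant fitness with an expectation derived from the single mutants. The Python sketch below assumes a multiplicative null model (expected fitness is the product of the two single-mutant fitnesses), which is a common choice in genetic interaction studies but is an assumption here rather than something stated in this article. The function names, the tolerance, and the fitness values are hypothetical.

```python
def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Observed double-mutant fitness minus the multiplicative expectation."""
    expected = w_a * w_b
    return w_ab - expected


def classify(epsilon: float, tolerance: float = 0.05) -> str:
    """Label an interaction; the tolerance is an arbitrary illustrative cut-off."""
    if epsilon < -tolerance:
        return "aggravating"
    if epsilon > tolerance:
        return "alleviating"
    return "no interaction"


# Hypothetical single-mutant fitness values (wild type = 1.0).
w_a, w_b = 0.9, 0.8

labels = {}
for w_ab in (0.50, 0.72, 0.85):
    epsilon = interaction_score(w_a, w_b, w_ab)
    labels[w_ab] = classify(epsilon)
    print(f"double mutant {w_ab:.2f}: epsilon = {epsilon:+.2f} ({labels[w_ab]})")
```

A double mutant that is less fit than expected indicates an aggravating interaction (synthetic lethality in the extreme case), while one that is fitter than expected indicates an alleviating interaction.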

Tn-seq used in combination with RNA-seq can be utilized to examine the role of non-coding DNA regions.^[17]

References

[1] Gawronski JD, Wong SM, Giannoukos G, Ward DV, Akerley BJ (2009). "Tracking insertion mutants within libraries by deep sequencing and a genome-wide screen for Haemophilus genes required in the lung". Proceedings of the National Academy of Sciences of the United States of America. 106: 16422–7. doi:10.1073/pnas.0906627106.
[2] Goodman AL, McNulty NP, Zhao Y, Leip D, Mitra RD, Lozupone CA, Knight R, Gordon JI (September 2009). "Identifying genetic determinants needed to establish a human gut symbiont in its habitat". Cell Host & Microbe. 6 (3): 279–89. doi:10.1016/j.chom.2009.08.003. PMC 2895552. PMID 19748469.
[3] Langridge GC, Phan MD, Turner DJ, Perkins TT, Parts L, Haase J, et al. (2009). "Simultaneous assay of every Salmonella Typhi gene using one million transposon mutants". Genome Research. 19: 2308–16. doi:10.1101/gr.097097.109.
[4] van Opijnen T, Bodi KL, Camilli A (October 2009). "Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms". Nature Methods. 6 (10): 767–72. doi:10.1038/nmeth.1377. PMC 2957483. PMID 19767758.
[5] Barquist L, Boinett CJ, Cain AK (July 2013). "Approaches to querying bacterial genomes with transposon-insertion sequencing". RNA Biology. 10 (7): 1161–9. doi:10.4161/rna.24765. PMC 3849164. PMID 23635712.
[6] Hayes F (2003). "Transposon-based strategies for microbial functional genomics and proteomics". Annual Review of Genetics. 37 (1): 3–29. doi:10.1146/annurev.genet.37.110801.142807. PMID 14616054.
[7] Kleckner N, Chan RK, Tye BK, Botstein D (October 1975). "Mutagenesis by insertion of a drug-resistance element carrying an inverted repetition". Journal of Molecular Biology. 97 (4): 561–75. doi:10.1016/s0022-2836(75)80059-3. PMID 1102715.
[8] Smith V, Chou KN, Lashkari D, Botstein D, Brown PO (December 1996). "Functional analysis of the genes of yeast chromosome V by genetic footprinting". Science. 274 (5295): 2069–74. Bibcode:1996Sci...274.2069S. doi:10.1126/science.274.5295.2069. PMID 8953036.
[9] Akerley BJ, Rubin EJ, Camilli A, Lampe DJ, Robertson HM, Mekalanos JJ (July 1998). "Systematic identification of essential genes by in vitro mariner mutagenesis". Proceedings of the National Academy of Sciences of the United States of America. 95 (15): 8927–32. Bibcode:1998PNAS...95.8927A. doi:10.1073/pnas.95.15.8927. PMC 21179. PMID 9671781.
[10] van Opijnen T, Bodi KL, Camilli A (October 2009). "Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms". Nature Methods. 6 (10): 767–72. doi:10.1038/nmeth.1377. PMC 2957483. PMID 19767758.
[11] Mazurkiewicz P, Tang CM, Boone C, Holden DW (December 2006). "Signature-tagged mutagenesis: barcoding mutants for genome-wide screens". Nature Reviews Genetics. 7 (12): 929–39. doi:10.1038/nrg1984. PMID 17139324. S2CID 27956117.
[12] Bork P (April 2000). "Powers and pitfalls in sequence analysis: the 70% hurdle". Genome Research. 10 (4): 398–400. doi:10.1101/gr.10.4.398. PMID 10779480.
[13] Kasif S, Steffen M (January 2010). "Biochemical networks: the evolution of gene annotation". Nature Chemical Biology. 6 (1): 4–5. doi:10.1038/nchembio.288. PMC 2907659. PMID 20016491.
[14] Goodman AL, McNulty NP, Zhao Y, Leip D, Mitra RD, Lozupone CA, Knight R, Gordon JI (September 2009). "Identifying genetic determinants needed to establish a human gut symbiont in its habitat". Cell Host & Microbe. 6 (3): 279–89. doi:10.1016/j.chom.2009.08.003. PMC 2895552. PMID 19748469.
[15] Morgan RD, Dwinell EA, Bhatia TK, Lang EM, Luyten YA (August 2009). "The MmeI family: type II restriction-modification enzymes that employ single-strand modification for host protection". Nucleic Acids Research. 37 (15): 5208–21. doi:10.1093/nar/gkp534. PMC 2731913. PMID 19578066.
[16] Griffin JE, Gawronski JD, Dejesus MA, Ioerger TR, Akerley BJ, Sassetti CM (September 2011). "High-resolution phenotypic profiling defines genes essential for mycobacterial growth and cholesterol catabolism". PLOS Pathogens. 7 (9): e1002251. doi:10.1371/journal.ppat.1002251. PMC 3182942. PMID 21980284.
[17] Mann B, van Opijnen T, Wang J, Obert C, Wang YD, Carter R, McGoldrick DJ, Ridout G, Camilli A, Tuomanen EI, Rosch JW (2012). "Control of virulence by small RNAs in Streptococcus pneumoniae". PLOS Pathogens. 8 (7): e1002788. doi:10.1371/journal.ppat.1002788. PMC 3395615. PMID 22807675.
