
Genome skimming: Difference between revisions

From Wikipedia, the free encyclopedia
OAbot (talk | contribs)
m Open access bot: doi added to citation with #oabot.
Citation bot (talk | contribs)
Alter: pages, title, url. Add: pages, issue, volume, bibcode, pmc, pmid. Removed URL that duplicated unique identifier. Removed parameters. Formatted dashes. | You can use this bot yourself. Report bugs here. | Activated by Headbomb | via #UCB_webform
Line 1: Line 1:
{{short description|Method of genome sequencing}}
{{short description|Method of genome sequencing}}
[[File:Genome Skimming.png|thumb|Genome skimming allows for assembly of high-copy fractions of the genome into contiguous, complete genomes.|alt=|500x500px]]
[[File:Genome Skimming.png|thumb|Genome skimming allows for assembly of high-copy fractions of the genome into contiguous, complete genomes.|alt=|500x500px]]
'''Genome skimming''' is a sequencing approach that uses low-pass, shallow [[DNA sequencing|sequencing]] of a [[genome]] (up to 5%), to generate fragments of DNA, known as '''genome skims'''.<ref name=":0">{{Cite journal|last=Straub|first=Shannon C. K.|last2=Parks|first2=Matthew|last3=Weitemier|first3=Kevin|last4=Fishbein|first4=Mark|last5=Cronn|first5=Richard C.|last6=Liston|first6=Aaron|date=February 2012|title=Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics|url=http://doi.wiley.com/10.3732/ajb.1100335|journal=American Journal of Botany|language=en|volume=99|issue=2|pages=349–364|doi=10.3732/ajb.1100335}}</ref><ref name=":13" /> These genome skims contain information about the high-copy fraction of the genome.<ref name=":13">{{Cite journal|last=Dodsworth|first=Steven|date=September 2015|title=Genome skimming for next-generation biodiversity analysis|url=https://linkinghub.elsevier.com/retrieve/pii/S1360138515001764|journal=Trends in Plant Science|language=en|volume=20|issue=9|pages=525–527|doi=10.1016/j.tplants.2015.06.012}}</ref> The high-copy fraction of the genome consists of the [[ribosomal DNA]], plastid genome ([[Chloroplast DNA|plastome]]), mitochondrial genome ([[Mitochondrial DNA|mitogenome]]), and nuclear repeats such as [[Microsatellite|microsatellites]] and [[Transposable element|transposable elements]].<ref name=":1">{{Cite book|last=Dodsworth, Steven Andrew, author.|url=http://worldcat.org/oclc/1108700470|title=Genome skimming for phylogenomics.|oclc=1108700470}}</ref> It employs high-throughput, [[DNA sequencing|next generation sequencing]] technology to generate these skims.<ref name=":0" /> Although these skims are merely 'the tip of the genomic iceberg', [[Phylogenomics|phylogenomic analysis]] of them can still provide insights on [[Evolutionary history of life|evolutionary history]] and [[biodiversity]] at a lower cost and larger scale than traditional methods.<ref name=":13" /><ref name=":1" /><ref 
name=":2">{{Cite journal|last=Trevisan|first=Bruna|last2=Alcantara|first2=Daniel M.C.|last3=Machado|first3=Denis Jacob|last4=Marques|first4=Fernando P.L.|last5=Lahr|first5=Daniel J.G.|date=2019-09-13|title=Genome skimming is a low-cost and robust strategy to assemble complete mitochondrial genomes from ethanol preserved specimens in biodiversity studies|url=https://peerj.com/articles/7543|journal=PeerJ|language=en|volume=7|pages=e7543|doi=10.7717/peerj.7543|issn=2167-8359|pmc=6746217|pmid=31565556}}</ref> Because genome skimming requires only a small amount of DNA, its methodology can be applied in fields other than genomics. Such applications include determining the traceability of products in the food industry, enforcing international regulations regarding biodiversity and biological resources, and [[Forensic science|forensics]].<ref name=":3">{{Cite journal|last=Malé|first=Pierre-Jean G.|last2=Bardon|first2=Léa|last3=Besnard|first3=Guillaume|last4=Coissac|first4=Eric|last5=Delsuc|first5=Frédéric|last6=Engel|first6=Julien|last7=Lhuillier|first7=Emeline|last8=Scotti-Saintagne|first8=Caroline|last9=Tinaut|first9=Alexandra|last10=Chave|first10=Jérôme|date=April 2014|title=Genome skimming by shotgun sequencing helps resolve the phylogeny of a pantropical tree family|url=http://doi.wiley.com/10.1111/1755-0998.12246|journal=Molecular Ecology Resources|language=en|pages=n/a–n/a|doi=10.1111/1755-0998.12246}}</ref>
'''Genome skimming''' is a sequencing approach that uses low-pass, shallow [[DNA sequencing|sequencing]] of a [[genome]] (up to 5%), to generate fragments of DNA, known as '''genome skims'''.<ref name=":0">{{Cite journal|last=Straub|first=Shannon C. K.|last2=Parks|first2=Matthew|last3=Weitemier|first3=Kevin|last4=Fishbein|first4=Mark|last5=Cronn|first5=Richard C.|last6=Liston|first6=Aaron|date=February 2012|title=Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics|journal=American Journal of Botany|language=en|volume=99|issue=2|pages=349–364|doi=10.3732/ajb.1100335|pmid=22174336}}</ref><ref name=":13" /> These genome skims contain information about the high-copy fraction of the genome.<ref name=":13">{{Cite journal|last=Dodsworth|first=Steven|date=September 2015|title=Genome skimming for next-generation biodiversity analysis|journal=Trends in Plant Science|language=en|volume=20|issue=9|pages=525–527|doi=10.1016/j.tplants.2015.06.012|pmid=26205170}}</ref> The high-copy fraction of the genome consists of the [[ribosomal DNA]], plastid genome ([[Chloroplast DNA|plastome]]), mitochondrial genome ([[Mitochondrial DNA|mitogenome]]), and nuclear repeats such as [[Microsatellite|microsatellites]] and [[Transposable element|transposable elements]].<ref name=":1">{{Cite book|last=Dodsworth, Steven Andrew, author.|title=Genome skimming for phylogenomics.|oclc=1108700470}}</ref> It employs high-throughput, [[DNA sequencing|next generation sequencing]] technology to generate these skims.<ref name=":0" /> Although these skims are merely 'the tip of the genomic iceberg', [[Phylogenomics|phylogenomic analysis]] of them can still provide insights on [[Evolutionary history of life|evolutionary history]] and [[biodiversity]] at a lower cost and larger scale than traditional methods.<ref name=":13" /><ref name=":1" /><ref name=":2">{{Cite journal|last=Trevisan|first=Bruna|last2=Alcantara|first2=Daniel M.C.|last3=Machado|first3=Denis 
Jacob|last4=Marques|first4=Fernando P.L.|last5=Lahr|first5=Daniel J.G.|date=2019-09-13|title=Genome skimming is a low-cost and robust strategy to assemble complete mitochondrial genomes from ethanol preserved specimens in biodiversity studies|journal=PeerJ|language=en|volume=7|pages=e7543|doi=10.7717/peerj.7543|issn=2167-8359|pmc=6746217|pmid=31565556}}</ref> Because genome skimming requires only a small amount of DNA, its methodology can be applied in fields other than genomics. Such applications include determining the traceability of products in the food industry, enforcing international regulations regarding biodiversity and biological resources, and [[Forensic science|forensics]].<ref name=":3">{{Cite journal|last=Malé|first=Pierre-Jean G.|last2=Bardon|first2=Léa|last3=Besnard|first3=Guillaume|last4=Coissac|first4=Eric|last5=Delsuc|first5=Frédéric|last6=Engel|first6=Julien|last7=Lhuillier|first7=Emeline|last8=Scotti-Saintagne|first8=Caroline|last9=Tinaut|first9=Alexandra|last10=Chave|first10=Jérôme|date=April 2014|title=Genome skimming by shotgun sequencing helps resolve the phylogeny of a pantropical tree family|journal=Molecular Ecology Resources|language=en|pages=n/a|doi=10.1111/1755-0998.12246|pmid=24606032}}</ref>


== Current Uses ==
== Current Uses ==
In addition to the assembly of the smaller organellar genomes, genome skimming can also be used to uncover conserved [[Sequence homology|ortholog]] sequences for [[Phylogenomics|phylogenomic studies]]. In phylogenomic studies of multicellular [[Pathogen|pathogens]], genome skimming can be used to find [[Effector (biology)|effector genes]], discover [[Endosymbiont|endosymbionts]] and characterize [[Genetic variation|genomic variation]].<ref name=":4">{{Cite journal|last=Denver|first=Dee R.|last2=Brown|first2=Amanda M. V.|last3=Howe|first3=Dana K.|last4=Peetz|first4=Amy B.|last5=Zasada|first5=Inga A.|date=2016-08-04|editor-last=Round|editor-first=June L.|title=Genome Skimming: A Rapid Approach to Gaining Diverse Biological Insights into Multicellular Pathogens|url=https://dx.plos.org/10.1371/journal.ppat.1005713|journal=PLOS Pathogens|language=en|volume=12|issue=8|pages=e1005713|doi=10.1371/journal.ppat.1005713|issn=1553-7374|pmc=4973915|pmid=27490201}}</ref>
In addition to the assembly of the smaller organellar genomes, genome skimming can also be used to uncover conserved [[Sequence homology|ortholog]] sequences for [[Phylogenomics|phylogenomic studies]]. In phylogenomic studies of multicellular [[Pathogen|pathogens]], genome skimming can be used to find [[Effector (biology)|effector genes]], discover [[Endosymbiont|endosymbionts]] and characterize [[Genetic variation|genomic variation]].<ref name=":4">{{Cite journal|last=Denver|first=Dee R.|last2=Brown|first2=Amanda M. V.|last3=Howe|first3=Dana K.|last4=Peetz|first4=Amy B.|last5=Zasada|first5=Inga A.|date=2016-08-04|editor-last=Round|editor-first=June L.|title=Genome Skimming: A Rapid Approach to Gaining Diverse Biological Insights into Multicellular Pathogens|journal=PLOS Pathogens|language=en|volume=12|issue=8|pages=e1005713|doi=10.1371/journal.ppat.1005713|issn=1553-7374|pmc=4973915|pmid=27490201}}</ref>


=== High-copy DNA ===
=== High-copy DNA ===


==== Ribosomal DNA ====
==== Ribosomal DNA ====
The [[Internal transcribed spacer|Internal transcribed spacers (ITS)]] are non-coding regions within the 18-5.8-28S rDNA in eukaryotes, and are one feature of rDNA that has been used in genome skimming studies.<ref name=":5">{{Cite journal|last=Lin|first=Geng-Ming|last2=Lai|first2=Yu-Heng|last3=Audira|first3=Gilbert|last4=Hsiao|first4=Chung-Der|date=November 2017|title=A Simple Method to Decode the Complete 18-5.8-28S rRNA Repeated Units of Green Algae by Genome Skimming|url=https://www.mdpi.com/1422-0067/18/11/2341|journal=International Journal of Molecular Sciences|language=en|volume=18|issue=11|pages=2341|doi=10.3390/ijms18112341|doi-access=free}}</ref> ITS are used to detect different species within a [[genus]], due to their high inter-species variability.<ref name=":5" /> They have low individual variability, which prevents identification of distinct strains or individuals.<ref name=":5" /> They are also present in all [[Eukaryote|eukaryotes]], have a high rate of evolution, and have been used in [[Phylogenetics|phylogenetic analysis]] between and across species.<ref name=":5" />
The [[Internal transcribed spacer|Internal transcribed spacers (ITS)]] are non-coding regions within the 18-5.8-28S rDNA in eukaryotes, and are one feature of rDNA that has been used in genome skimming studies.<ref name=":5">{{Cite journal|last=Lin|first=Geng-Ming|last2=Lai|first2=Yu-Heng|last3=Audira|first3=Gilbert|last4=Hsiao|first4=Chung-Der|date=November 2017|title=A Simple Method to Decode the Complete 18-5.8-28S rRNA Repeated Units of Green Algae by Genome Skimming|journal=International Journal of Molecular Sciences|language=en|volume=18|issue=11|pages=2341|doi=10.3390/ijms18112341|pmid=29113146|pmc=5713310|doi-access=free}}</ref> ITS are used to detect different species within a [[genus]], due to their high inter-species variability.<ref name=":5" /> They have low individual variability, which prevents identification of distinct strains or individuals.<ref name=":5" /> They are also present in all [[Eukaryote|eukaryotes]], have a high rate of evolution, and have been used in [[Phylogenetics|phylogenetic analysis]] between and across species.<ref name=":5" />


When targeting nuclear rDNA, it is suggested that a minimum final [[Coverage (genetics)|sequencing depth]] of 100X is achieved, and sequences with less than 5X depth are masked.<ref name=":0" />
When targeting nuclear rDNA, it is suggested that a minimum final [[Coverage (genetics)|sequencing depth]] of 100X is achieved, and sequences with less than 5X depth are masked.<ref name=":0" />
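
The depth thresholds above can be illustrated with a minimal sketch. It assumes that per-base depth values are already available for an assembled rDNA sequence and that the 100X recommendation is checked against the mean depth; both the function names and the toy values are hypothetical and are not taken from the cited study.

```python
from statistics import mean


def mask_low_depth(sequence, depths, min_depth=5, mask_char="N"):
    """Replace every base whose depth is below min_depth with mask_char."""
    if len(sequence) != len(depths):
        raise ValueError("sequence and depths must have the same length")
    return "".join(
        base if depth >= min_depth else mask_char
        for base, depth in zip(sequence, depths)
    )


def meets_target_depth(depths, target=100):
    """Return True when the mean depth reaches the suggested target.

    Using the mean is an assumption; a stricter check could use the
    median or the minimum depth instead.
    """
    return mean(depths) >= target


# Toy values for illustration only (not real sequencing data).
toy_sequence = "ACGTACGT"
toy_depths = [120, 130, 4, 110, 2, 150, 140, 125]

masked = mask_low_depth(toy_sequence, toy_depths)
deep_enough = meets_target_depth(toy_depths)
```

In practice the depths would come from a read-mapping step, and the same function can be reused with a different `min_depth` for other targets.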


==== Plastomes ====
==== Plastomes ====
The [[Plastid|plastid genome]], or plastome, has been used extensively in identification and evolutionary studies using genome skimming due to its high abundance within plants (~3-5% of cell DNA), small size, simple structure, and greater conservation of gene structure than nuclear or mitochondrial genes.<ref name=":17">{{Cite journal|last=Liu|first=Luxian|last2=Wang|first2=Yuewen|last3=He|first3=Peizi|last4=Li|first4=Pan|last5=Lee|first5=Joongku|last6=Soltis|first6=Douglas E.|last7=Fu|first7=Chengxin|date=2018-04-04|title=Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data|url=https://doi.org/10.1186/s12864-018-4633-x|journal=BMC Genomics|volume=19|issue=1|pages=235|doi=10.1186/s12864-018-4633-x|issn=1471-2164|pmc=5885378|pmid=29618324}}</ref><ref name=":6" /> Plastid studies have previously been limited by the number of regions that could be assessed in traditional approaches.<ref name=":6">{{Cite journal|last=Hinsinger|first=Damien Daniel|last2=Strijk|first2=Joeri Sergej|date=2019-01-10|title=Plastome of Quercus xanthoclada and comparison of genomic diversity amongst selected Quercus species using genome skimming|url=https://phytokeys.pensoft.net/article/36365/|journal=PhytoKeys|language=en|volume=132|pages=75–89|doi=10.3897/phytokeys.132.36365|issn=1314-2003|doi-access=free}}</ref> Using genome skimming, the sequencing of the entire plastid genome, or plastome, can be done at a fraction of the cost and time required for typical sequencing approaches like [[Sanger sequencing]].<ref name=":1" /> Plastomes have been suggested as a method to replace traditional [[DNA barcoding|DNA barcodes]] in plants,<ref name=":1" /> such as the ''rbcL'' and ''matK'' barcode genes. 
Compared to the typical DNA barcode, genome skimming produces plastomes at a tenth of the cost per base.<ref name=":3" /> Recent uses of genome skims of plastomes have allowed greater resolution of phylogenies, higher differentiation of specific groups within taxa, and more accurate estimates of biodiversity.<ref name=":6" /> Additionally, the plastome has been used to compare species within a genus to look at evolutionary changes and diversity within a group.<ref name=":6" />
The [[Plastid|plastid genome]], or plastome, has been used extensively in identification and evolutionary studies using genome skimming due to its high abundance within plants (~3-5% of cell DNA), small size, simple structure, and greater conservation of gene structure than nuclear or mitochondrial genes.<ref name=":17">{{Cite journal|last=Liu|first=Luxian|last2=Wang|first2=Yuewen|last3=He|first3=Peizi|last4=Li|first4=Pan|last5=Lee|first5=Joongku|last6=Soltis|first6=Douglas E.|last7=Fu|first7=Chengxin|date=2018-04-04|title=Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data|journal=BMC Genomics|volume=19|issue=1|pages=235|doi=10.1186/s12864-018-4633-x|issn=1471-2164|pmc=5885378|pmid=29618324}}</ref><ref name=":6" /> Plastid studies have previously been limited by the number of regions that could be assessed in traditional approaches.<ref name=":6">{{Cite journal|last=Hinsinger|first=Damien Daniel|last2=Strijk|first2=Joeri Sergej|date=2019-01-10|title=Plastome of Quercus xanthoclada and comparison of genomic diversity amongst selected Quercus species using genome skimming|journal=PhytoKeys|language=en|volume=132|pages=75–89|doi=10.3897/phytokeys.132.36365|pmid=31607787|pmc=6783484|issn=1314-2003|doi-access=free}}</ref> Using genome skimming, the sequencing of the entire plastid genome, or plastome, can be done at a fraction of the cost and time required for typical sequencing approaches like [[Sanger sequencing]].<ref name=":1" /> Plastomes have been suggested as a method to replace traditional [[DNA barcoding|DNA barcodes]] in plants,<ref name=":1" /> such as the ''rbcL'' and ''matK'' barcode genes. 
Compared to the typical DNA barcode, genome skimming produces plastomes at a tenth of the cost per base.<ref name=":3" /> Recent uses of genome skims of plastomes have allowed greater resolution of phylogenies, higher differentiation of specific groups within taxa, and more accurate estimates of biodiversity.<ref name=":6" /> Additionally, the plastome has been used to compare species within a genus to look at evolutionary changes and diversity within a group.<ref name=":6" />


When targeting plastomes, it is suggested that a minimum final sequencing depth of 30X is achieved for single-copy regions to ensure high quality assemblies. [[Single-nucleotide polymorphism|Single nucleotide polymorphisms (SNPs)]] with less than 20X depth should be masked.<ref name=":0" />
When targeting plastomes, it is suggested that a minimum final sequencing depth of 30X is achieved for single-copy regions to ensure high quality assemblies. [[Single-nucleotide polymorphism|Single nucleotide polymorphisms (SNPs)]] with less than 20X depth should be masked.<ref name=":0" />
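
As a rough illustration of how the 30X target relates to sequencing effort, the sketch below applies the standard average-coverage relation (depth = reads × read length × target fraction / target size). The plastome size of 150,000 bp and the read length of 150 bp are assumptions chosen for the example, and the function name is hypothetical; only the 30X target and the ~3-5% plastid fraction come from the text above.

```python
import math


def reads_for_target_depth(target_depth, target_size_bp, read_length_bp, target_fraction):
    """Estimate the total reads needed for a target region to reach target_depth.

    Rearranges: depth = reads * read_length * fraction / target_size.
    Assumes reads are sampled uniformly, which real libraries only approximate.
    """
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in the interval (0, 1]")
    if read_length_bp <= 0 or target_size_bp <= 0:
        raise ValueError("sizes and read lengths must be positive")
    bases_needed = target_depth * target_size_bp
    return math.ceil(bases_needed / (read_length_bp * target_fraction))


# Assumed example values: 150 kb plastome, 150 bp reads.
PLASTOME_SIZE_BP = 150_000
READ_LENGTH_BP = 150

reads_at_3_percent = reads_for_target_depth(30, PLASTOME_SIZE_BP, READ_LENGTH_BP, 0.03)
reads_at_5_percent = reads_for_target_depth(30, PLASTOME_SIZE_BP, READ_LENGTH_BP, 0.05)
```

Under these assumptions the estimate is on the order of a million reads or fewer; uneven coverage and repeated regions would raise the real requirement.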


==== Mitogenomes ====
==== Mitogenomes ====
The [[Mitochondrial DNA|mitochondrial genome]], or mitogenome, is used as a [[molecular marker]] in a great variety of studies because of its [[Non-Mendelian inheritance|maternal inheritance]], high copy-number in the cell, lack of [[Homologous recombination|recombination]], and high mutation rate. It is often used for phylogenetic studies as it is very uniform across metazoan groups: a circular, double-stranded DNA molecule of about 15 to 20 kilobases that carries 37 genes (2 ribosomal RNA genes, 13 protein-coding genes, and 22 transfer RNA genes). Mitochondrial barcode sequences, such as COI, [[NADH2 dehydrogenase (ubiquinone)|NADH2]], [[16S rRNA (guanine1405-N7)-methyltransferase|16S rRNA]], and [[12S rRNA]], can also be used for taxonomic identification.<ref name=":7">{{Cite journal|last=Johri|first=Shaili|last2=Solanki|first2=Jitesh|last3=Cantu|first3=Vito Adrian|last4=Fellows|first4=Sam R.|last5=Edwards|first5=Robert A.|last6=Moreno|first6=Isabel|last7=Vyas|first7=Asit|last8=Dinsdale|first8=Elizabeth A.|date=December 2019|title=‘Genome skimming’ with the MinION hand-held sequencer identifies CITES-listed shark species in India’s exports market|url=http://www.nature.com/articles/s41598-019-40940-9|journal=Scientific Reports|language=en|volume=9|issue=1|pages=4476|doi=10.1038/s41598-019-40940-9|issn=2045-2322|pmc=6418218|pmid=30872700|via=}}</ref> The increasing publication of complete [[Mitochondrial DNA|mitogenomes]] allows for inference of robust phylogenies across many taxonomic groups, and it can capture events such as gene rearrangements and positioning of mobile genetic elements. Using genome skimming to assemble complete mitogenomes, phylogenetic history and biodiversity of many organisms can be resolved.<ref name=":2" />
The [[Mitochondrial DNA|mitochondrial genome]], or mitogenome, is used as a [[molecular marker]] in a great variety of studies because of its [[Non-Mendelian inheritance|maternal inheritance]], high copy-number in the cell, lack of [[Homologous recombination|recombination]], and high mutation rate. It is often used for phylogenetic studies as it is very uniform across metazoan groups: a circular, double-stranded DNA molecule of about 15 to 20 kilobases that carries 37 genes (2 ribosomal RNA genes, 13 protein-coding genes, and 22 transfer RNA genes). Mitochondrial barcode sequences, such as COI, [[NADH2 dehydrogenase (ubiquinone)|NADH2]], [[16S rRNA (guanine1405-N7)-methyltransferase|16S rRNA]], and [[12S rRNA]], can also be used for taxonomic identification.<ref name=":7">{{Cite journal|last=Johri|first=Shaili|last2=Solanki|first2=Jitesh|last3=Cantu|first3=Vito Adrian|last4=Fellows|first4=Sam R.|last5=Edwards|first5=Robert A.|last6=Moreno|first6=Isabel|last7=Vyas|first7=Asit|last8=Dinsdale|first8=Elizabeth A.|date=December 2019|title='Genome skimming' with the MinION hand-held sequencer identifies CITES-listed shark species in India's exports market|journal=Scientific Reports|language=en|volume=9|issue=1|pages=4476|doi=10.1038/s41598-019-40940-9|issn=2045-2322|pmc=6418218|pmid=30872700|bibcode=2019NatSR...9.4476J}}</ref> The increasing publication of complete [[Mitochondrial DNA|mitogenomes]] allows for inference of robust phylogenies across many taxonomic groups, and it can capture events such as gene rearrangements and positioning of mobile genetic elements. Using genome skimming to assemble complete mitogenomes, phylogenetic history and biodiversity of many organisms can be resolved.<ref name=":2" />


When targeting mitogenomes, there are no specific suggestions for minimum final sequencing depth, as mitogenomes are more variable in size and complexity in plant species, which increases the difficulty of assembling repeated sequences. However, highly conserved coding sequences and nonrepetitive flanking regions can be assembled using [[Sequence assembly|reference-guided assembly]]. Sequences should be masked in the same way as when targeting plastomes and nuclear ribosomal DNA.<ref name=":0" />
When targeting mitogenomes, there are no specific suggestions for minimum final sequencing depth, as mitogenomes are more variable in size and complexity in plant species, which increases the difficulty of assembling repeated sequences. However, highly conserved coding sequences and nonrepetitive flanking regions can be assembled using [[Sequence assembly|reference-guided assembly]]. Sequences should be masked in the same way as when targeting plastomes and nuclear ribosomal DNA.<ref name=":0" />
Line 27: Line 27:


=== Low-copy DNA ===
=== Low-copy DNA ===
Low-copy DNA can prove useful for evolutionary developmental and phylogenetic studies.<ref name=":8">{{Cite journal|last=Berger|first=Brent A.|last2=Han|first2=Jiahong|last3=Sessa|first3=Emily B.|last4=Gardner|first4=Andrew G.|last5=Shepherd|first5=Kelly A.|last6=Ricigliano|first6=Vincent A.|last7=Jabaily|first7=Rachel S.|last8=Howarth|first8=Dianella G.|date=2017|title=The unexpected depths of genome-skimming data: A case study examining Goodeniaceae floral symmetry genes1|url=https://bsapubs.onlinelibrary.wiley.com/doi/abs/10.3732/apps.1700042|journal=Applications in Plant Sciences|language=en|volume=5|issue=10|pages=1700042|doi=10.3732/apps.1700042|issn=2168-0450|pmc=5664964|pmid=29109919}}</ref> It can be mined from high-copy fractions in a number of ways, such as developing [[Oligonucleotide synthesis|primers]] from databases that contain conserved [[Sequence homology|orthologous genes]], single-copy conserved orthologous genes, and shared-copy genes.<ref name=":8" /> Another method is looking for novel probes that target low-copy genes using transcriptomics via Hyb-Seq.<ref name=":8" /> While nuclear genomes assembled using genome skims are extremely fragmented, some low-copy single-copy nuclear genes can be successfully assembled.<ref>{{Cite journal|last=Berger|first=Brent A.|last2=Han|first2=Jiahong|last3=Sessa|first3=Emily B.|last4=Gardner|first4=Andrew G.|last5=Shepherd|first5=Kelly A.|last6=Ricigliano|first6=Vincent A.|last7=Jabaily|first7=Rachel S.|last8=Howarth|first8=Dianella G.|date=October 2017|title=The Unexpected Depths of Genome-Skimming Data: A Case Study Examining Goodeniaceae Floral Symmetry Genes|url=http://doi.wiley.com/10.3732/apps.1700042|journal=Applications in Plant Sciences|language=en|volume=5|issue=10|pages=1700042|doi=10.3732/apps.1700042|issn=2168-0450|pmc=5664964|pmid=29109919}}</ref>
Low-copy DNA can prove useful for evolutionary developmental and phylogenetic studies.<ref name=":8">{{Cite journal|last=Berger|first=Brent A.|last2=Han|first2=Jiahong|last3=Sessa|first3=Emily B.|last4=Gardner|first4=Andrew G.|last5=Shepherd|first5=Kelly A.|last6=Ricigliano|first6=Vincent A.|last7=Jabaily|first7=Rachel S.|last8=Howarth|first8=Dianella G.|date=2017|title=The unexpected depths of genome-skimming data: A case study examining Goodeniaceae floral symmetry genes1|journal=Applications in Plant Sciences|language=en|volume=5|issue=10|pages=1700042|doi=10.3732/apps.1700042|issn=2168-0450|pmc=5664964|pmid=29109919}}</ref> It can be mined from high-copy fractions in a number of ways, such as developing [[Oligonucleotide synthesis|primers]] from databases that contain conserved [[Sequence homology|orthologous genes]], single-copy conserved orthologous genes, and shared-copy genes.<ref name=":8" /> Another method is looking for novel probes that target low-copy genes using transcriptomics via Hyb-Seq.<ref name=":8" /> While nuclear genomes assembled using genome skims are extremely fragmented, some low-copy single-copy nuclear genes can be successfully assembled.<ref>{{Cite journal|last=Berger|first=Brent A.|last2=Han|first2=Jiahong|last3=Sessa|first3=Emily B.|last4=Gardner|first4=Andrew G.|last5=Shepherd|first5=Kelly A.|last6=Ricigliano|first6=Vincent A.|last7=Jabaily|first7=Rachel S.|last8=Howarth|first8=Dianella G.|date=October 2017|title=The Unexpected Depths of Genome-Skimming Data: A Case Study Examining Goodeniaceae Floral Symmetry Genes|journal=Applications in Plant Sciences|language=en|volume=5|issue=10|pages=1700042|doi=10.3732/apps.1700042|issn=2168-0450|pmc=5664964|pmid=29109919}}</ref>


=== Low-quantity degraded DNA ===
=== Low-quantity degraded DNA ===
Previous methods of recovering degraded DNA were based on [[Sanger sequencing]], relied on large intact DNA templates, and were affected by contamination and the method of preservation. Genome skimming, on the other hand, can be used to extract genetic information from preserved specimens in [[Herbarium|herbaria]] and museums, where the DNA is often very degraded and very little remains.<ref name=":2" /><ref name=":9">{{Cite journal|last=Zeng|first=Chun-Xia|last2=Hollingsworth|first2=Peter M.|last3=Yang|first3=Jing|last4=He|first4=Zheng-Shan|last5=Zhang|first5=Zhi-Rong|last6=Li|first6=De-Zhu|last7=Yang|first7=Jun-Bo|date=2018-06-05|title=Genome skimming herbarium specimens for DNA barcoding and phylogenomics|url=https://doi.org/10.1186/s13007-018-0300-0|journal=Plant Methods|volume=14|issue=1|pages=43|doi=10.1186/s13007-018-0300-0|issn=1746-4811|pmc=5987614|pmid=29928291}}</ref> Studies in plants show that DNA as old as 80 years, and as little as 500 pg of degraded DNA, can be used with genome skimming to infer genomic information.<ref name=":9" /> In [[Herbarium|herbaria]], even with low-yield and low-quality DNA, one study was still able to produce "high-quality complete chloroplast and ribosomal DNA sequences" at a large scale for downstream analyses.<ref name=":11">{{Cite journal|last=Nevill|first=Paul G.|last2=Zhong|first2=Xiao|last3=Tonti-Filippini|first3=Julian|last4=Byrne|first4=Margaret|last5=Hislop|first5=Michael|last6=Thiele|first6=Kevin|last7=van Leeuwen|first7=Stephen|last8=Boykin|first8=Laura M.|last9=Small|first9=Ian|date=2020-01-04|title=Large scale genome skimming from herbarium material for accurate plant identification and phylogenomics|url=https://doi.org/10.1186/s13007-019-0534-5|journal=Plant Methods|volume=16|issue=1|pages=1|doi=10.1186/s13007-019-0534-5|issn=1746-4811|pmc=6942304|pmid=31911810}}</ref>
Previous methods of recovering degraded DNA were based on [[Sanger sequencing]], relied on large intact DNA templates, and were affected by contamination and the method of preservation. Genome skimming, on the other hand, can be used to extract genetic information from preserved specimens in [[Herbarium|herbaria]] and museums, where the DNA is often very degraded and very little remains.<ref name=":2" /><ref name=":9">{{Cite journal|last=Zeng|first=Chun-Xia|last2=Hollingsworth|first2=Peter M.|last3=Yang|first3=Jing|last4=He|first4=Zheng-Shan|last5=Zhang|first5=Zhi-Rong|last6=Li|first6=De-Zhu|last7=Yang|first7=Jun-Bo|date=2018-06-05|title=Genome skimming herbarium specimens for DNA barcoding and phylogenomics|journal=Plant Methods|volume=14|issue=1|pages=43|doi=10.1186/s13007-018-0300-0|issn=1746-4811|pmc=5987614|pmid=29928291}}</ref> Studies in plants show that DNA as old as 80 years, and as little as 500 pg of degraded DNA, can be used with genome skimming to infer genomic information.<ref name=":9" /> In [[Herbarium|herbaria]], even with low-yield and low-quality DNA, one study was still able to produce "high-quality complete chloroplast and ribosomal DNA sequences" at a large scale for downstream analyses.<ref name=":11">{{Cite journal|last=Nevill|first=Paul G.|last2=Zhong|first2=Xiao|last3=Tonti-Filippini|first3=Julian|last4=Byrne|first4=Margaret|last5=Hislop|first5=Michael|last6=Thiele|first6=Kevin|last7=van Leeuwen|first7=Stephen|last8=Boykin|first8=Laura M.|last9=Small|first9=Ian|date=2020-01-04|title=Large scale genome skimming from herbarium material for accurate plant identification and phylogenomics|journal=Plant Methods|volume=16|issue=1|pages=1|doi=10.1186/s13007-019-0534-5|issn=1746-4811|pmc=6942304|pmid=31911810}}</ref>


In field studies, invertebrates are stored in ethanol, which is usually discarded during DNA-based studies.<ref name=":10">{{Cite journal|last=Linard|first=B.|last2=Arribas|first2=P.|last3=Andújar|first3=C.|last4=Crampton‐Platt|first4=A.|last5=Vogler|first5=A. P.|date=2016|title=Lessons from genome skimming of arthropod-preserving ethanol|url=https://onlinelibrary.wiley.com/doi/abs/10.1111/1755-0998.12539|journal=Molecular Ecology Resources|language=en|volume=16|issue=6|pages=1365–1377|doi=10.1111/1755-0998.12539|issn=1755-0998}}</ref> Genome skimming has been shown to detect the low quantity of DNA in this ethanol fraction and to provide information about the biomass of the specimens in a fraction, the microbiota of outer tissue layers, and the gut contents (such as prey) released by the vomit reflex.<ref name=":10" /> Thus, genome skimming can provide an additional method of understanding [[ecology]] via low-copy DNA.<ref name=":10" />
In field studies, invertebrates are stored in ethanol, which is usually discarded during DNA-based studies.<ref name=":10">{{Cite journal|last=Linard|first=B.|last2=Arribas|first2=P.|last3=Andújar|first3=C.|last4=Crampton‐Platt|first4=A.|last5=Vogler|first5=A. P.|date=2016|title=Lessons from genome skimming of arthropod-preserving ethanol|journal=Molecular Ecology Resources|language=en|volume=16|issue=6|pages=1365–1377|doi=10.1111/1755-0998.12539|pmid=27235167|issn=1755-0998|url=https://hal.archives-ouvertes.fr/hal-01636888/file/Linard_et_al_MER_R1_final_clean_dryad.pdf}}</ref> Genome skimming has been shown to detect the low quantity of DNA in this ethanol fraction and to provide information about the biomass of the specimens in a fraction, the microbiota of outer tissue layers, and the gut contents (such as prey) released by the vomit reflex.<ref name=":10" /> Thus, genome skimming can provide an additional method of understanding [[ecology]] via low-copy DNA.<ref name=":10" />


== Workflow ==
== Workflow ==
Line 41: Line 41:
==== Plants ====
==== Plants ====
* Plant DNAzol Reagent
* Plant DNAzol Reagent
* Qiagen DNeasy Plant Mini kit<ref name=":3" /><ref name=":11" /><ref name=":15">{{Cite journal|last=Liu|first=Shih-Hui|last2=Edwards|first2=Christine E.|last3=Hoch|first3=Peter C.|last4=Raven|first4=Peter H.|last5=Barber|first5=Janet C.|date=May 2018|title=Genome skimming provides new insight into the relationships in Ludwigia section Macrocarpon , a polyploid complex|url=http://doi.wiley.com/10.1002/ajb2.1086|journal=American Journal of Botany|language=en|volume=105|issue=5|pages=875–887|doi=10.1002/ajb2.1086|via=}}</ref><ref name=":16">{{Cite journal|last=Nauheimer|first=Lars|last2=Cui|first2=Lujing|last3=Clarke|first3=Charles|last4=Crayn|first4=Darren M.|last5=Bourke|first5=Greg|last6=Nargar|first6=Katharina|date=2019|title=Genome skimming provides well resolved plastid and nuclear phylogenies, showing patterns of deep reticulate evolution in the tropical carnivorous plant genus Nepenthes (Caryophyllales)|url=http://www.publish.csiro.au/?paper=SB18057|journal=Australian Systematic Botany|language=en|doi=10.1071/SB18057|issn=1030-1887}}</ref>
* Qiagen DNeasy Plant Mini kit<ref name=":3" /><ref name=":11" /><ref name=":15">{{Cite journal|last=Liu|first=Shih-Hui|last2=Edwards|first2=Christine E.|last3=Hoch|first3=Peter C.|last4=Raven|first4=Peter H.|last5=Barber|first5=Janet C.|date=May 2018|title=Genome skimming provides new insight into the relationships in Ludwigia section Macrocarpon , a polyploid complex|journal=American Journal of Botany|language=en|volume=105|issue=5|pages=875–887|doi=10.1002/ajb2.1086|pmid=29791715}}</ref><ref name=":16">{{Cite journal|last=Nauheimer|first=Lars|last2=Cui|first2=Lujing|last3=Clarke|first3=Charles|last4=Crayn|first4=Darren M.|last5=Bourke|first5=Greg|last6=Nargar|first6=Katharina|date=2019|title=Genome skimming provides well resolved plastid and nuclear phylogenies, showing patterns of deep reticulate evolution in the tropical carnivorous plant genus Nepenthes (Caryophyllales)|url=http://www.publish.csiro.au/?paper=SB18057|journal=Australian Systematic Botany|volume=32|issue=3|pages=243–254|language=en|doi=10.1071/SB18057|issn=1030-1887}}</ref>
* Tiangen DNAsecure Plant kit<ref name=":9" />
* Tiangen DNAsecure Plant kit<ref name=":9" />
* Invitrogen ChargeSwitch gDNA Plant kit<ref name=":16" />
* Invitrogen ChargeSwitch gDNA Plant kit<ref name=":16" />
Line 47: Line 47:
==== Other ====
==== Other ====
* Quick-DNA Plus Extraction kit<ref name=":7" />
* Quick-DNA Plus Extraction kit<ref name=":7" />
* Cetyl Trimethylammonium Bromide (CTAB) method<ref name=":5" /><ref name=":geneious">{{Cite journal|last=Ripma|first=Lee A.|last2=Simpson|first2=Michael G.|last3=Hasenstab-Lehman|first3=Kristen|date=December 2014|title=Geneious! Simplified Genome Skimming Methods for Phylogenetic Systematic Studies: A Case Study in Oreocarya (Boraginaceae)|url=http://doi.wiley.com/10.3732/apps.1400062|journal=Applications in Plant Sciences|language=en|volume=2|issue=12|pages=1400062|doi=10.3732/apps.1400062|issn=2168-0450|pmc=4259456|pmid=25506521|via=}}</ref><ref name=":14" /><ref name=":18">{{Cite journal|last=Stoughton|first=Thomas R.|last2=Kriebel|first2=Ricardo|last3=Jolles|first3=Diana D.|last4=O'Quinn|first4=Robin L.|date=March 2018|title=Next-generation lineage discovery: A case study of tuberous Claytonia L.|url=http://doi.wiley.com/10.1002/ajb2.1061|journal=American Journal of Botany|language=en|volume=105|issue=3|pages=536–548|doi=10.1002/ajb2.1061|via=|doi-access=free}}</ref><ref name=":19" />
* Cetyl Trimethylammonium Bromide (CTAB) method<ref name=":5" /><ref name=":geneious">{{Cite journal|last=Ripma|first=Lee A.|last2=Simpson|first2=Michael G.|last3=Hasenstab-Lehman|first3=Kristen|date=December 2014|title=Geneious! Simplified Genome Skimming Methods for Phylogenetic Systematic Studies: A Case Study in Oreocarya (Boraginaceae)|journal=Applications in Plant Sciences|language=en|volume=2|issue=12|pages=1400062|doi=10.3732/apps.1400062|issn=2168-0450|pmc=4259456|pmid=25506521}}</ref><ref name=":14" /><ref name=":18">{{Cite journal|last=Stoughton|first=Thomas R.|last2=Kriebel|first2=Ricardo|last3=Jolles|first3=Diana D.|last4=O'Quinn|first4=Robin L.|date=March 2018|title=Next-generation lineage discovery: A case study of tuberous Claytonia L.|journal=American Journal of Botany|language=en|volume=105|issue=3|pages=536–548|doi=10.1002/ajb2.1061|pmid=29672830|doi-access=free}}</ref><ref name=":19" />
* Qiagen DNeasy Tissue Extraction kit<ref name=":12">{{Cite journal|last=Jackson|first=David|last2=Emslie|first2=Steven D|last3=van Tuinen|first3=Marcel|date=2012|title=Genome skimming identifies polymorphism in tern populations and species|url=http://bmcresnotes.biomedcentral.com/articles/10.1186/1756-0500-5-94|journal=BMC Research Notes|language=en|volume=5|issue=1|pages=94|doi=10.1186/1756-0500-5-94|issn=1756-0500|pmc=3292991|pmid=22333071}}</ref>
* Qiagen DNeasy Tissue Extraction kit<ref name=":12">{{Cite journal|last=Jackson|first=David|last2=Emslie|first2=Steven D|last3=van Tuinen|first3=Marcel|date=2012|title=Genome skimming identifies polymorphism in tern populations and species|journal=BMC Research Notes|language=en|volume=5|issue=1|pages=94|doi=10.1186/1756-0500-5-94|issn=1756-0500|pmc=3292991|pmid=22333071}}</ref>
* Qiagen DNeasy Blood and Tissue kit<ref name=":2" /><ref name=":4" />
* Qiagen DNeasy Blood and Tissue kit<ref name=":2" /><ref name=":4" />
{{col-end}}
{{col-end}}
Line 57: Line 57:
{{columns-list|colwidth=15em|
{{columns-list|colwidth=15em|
* Illumina TruSeq DNA Sample Preparation kit<ref name=":3" /><ref name=":4" /><ref name=":10" />
* Illumina TruSeq DNA Sample Preparation kit<ref name=":3" /><ref name=":4" /><ref name=":10" />
* Illumina TruSeq PCR-free kit<ref name=":5" /><ref name=":19">{{Cite journal|last=Dodsworth|first=Steven|last2=Guignard|first2=Maïté S.|last3=Christenhusz|first3=Maarten J. M.|last4=Cowan|first4=Robyn S.|last5=Knapp|first5=Sandra|last6=Maurin|first6=Olivier|last7=Struebig|first7=Monika|last8=Leitch|first8=Andrew R.|last9=Chase|first9=Mark W.|last10=Forest|first10=Félix|date=2018-10-29|title=Potential of Herbariomics for Studying Repetitive DNA in Angiosperms|url=https://www.frontiersin.org/article/10.3389/fevo.2018.00174/full|journal=Frontiers in Ecology and Evolution|volume=6|pages=174|doi=10.3389/fevo.2018.00174|issn=2296-701X|doi-access=free}}</ref>
* Illumina TruSeq PCR-free kit<ref name=":5" /><ref name=":19">{{Cite journal|last=Dodsworth|first=Steven|last2=Guignard|first2=Maïté S.|last3=Christenhusz|first3=Maarten J. M.|last4=Cowan|first4=Robyn S.|last5=Knapp|first5=Sandra|last6=Maurin|first6=Olivier|last7=Struebig|first7=Monika|last8=Leitch|first8=Andrew R.|last9=Chase|first9=Mark W.|last10=Forest|first10=Félix|date=2018-10-29|title=Potential of Herbariomics for Studying Repetitive DNA in Angiosperms|journal=Frontiers in Ecology and Evolution|volume=6|pages=174|doi=10.3389/fevo.2018.00174|issn=2296-701X|doi-access=free}}</ref>
* NEXTFlex DNA Sequencing kit<ref name=":geneious" />
* NEXTFlex DNA Sequencing kit<ref name=":geneious" />
* NEBNext Ultra II DNA<ref name=":6" /><ref name=":9" /><ref name=":15" />
* NEBNext Ultra II DNA<ref name=":6" /><ref name=":9" /><ref name=":15" />
Line 67: Line 67:


=== Sequencing ===
=== Sequencing ===
[[DNA sequencing|Sequencing]] with short reads or long reads will depend on the target genome or genes. [[Microsatellite|Microsatellites]] in nuclear repeats require longer reads.<ref name=":20">{{Cite journal|last=Xia|first=Yun|last2=Luo|first2=Wei|last3=Yuan|first3=Siqi|last4=Zheng|first4=Yuchi|last5=Zeng|first5=Xiaomao|date=December 2018|title=Microsatellite development from genome skimming and transcriptome sequencing: comparison of strategies and lessons from frog species|url=https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-018-5329-y|journal=BMC Genomics|language=en|volume=19|issue=1|pages=886|doi=10.1186/s12864-018-5329-y|issn=1471-2164|pmc=6286531|pmid=30526480|via=}}</ref> The following sequencing platforms have been used in genome skimming:
[[DNA sequencing|Sequencing]] with short reads or long reads will depend on the target genome or genes. [[Microsatellite|Microsatellites]] in nuclear repeats require longer reads.<ref name=":20">{{Cite journal|last=Xia|first=Yun|last2=Luo|first2=Wei|last3=Yuan|first3=Siqi|last4=Zheng|first4=Yuchi|last5=Zeng|first5=Xiaomao|date=December 2018|title=Microsatellite development from genome skimming and transcriptome sequencing: comparison of strategies and lessons from frog species|journal=BMC Genomics|language=en|volume=19|issue=1|pages=886|doi=10.1186/s12864-018-5329-y|issn=1471-2164|pmc=6286531|pmid=30526480}}</ref> The following sequencing platforms have been used in genome skimming:


{{columns-list|colwidth=15em|
{{columns-list|colwidth=15em|
* Illumina HiSeq 2000 platform<ref name=":3" /><ref name=":geneious" /><ref name=":21">{{Cite journal|last=Fonseca|first=Luiz Henrique M.|last2=Lohmann|first2=Lúcia G.|date=January 2020|title=Exploring the potential of nuclear and mitochondrial sequencing data generated through genome‐skimming for plant phylogenetics: A case study from a clade of neotropical lianas|url=https://onlinelibrary.wiley.com/doi/abs/10.1111/jse.12533|journal=Journal of Systematics and Evolution|language=en|volume=58|issue=1|pages=18–32|doi=10.1111/jse.12533|issn=1674-4918|via=}}</ref><ref name=":22">{{Cite journal|last=Bock|first=Dan G.|last2=Kane|first2=Nolan C.|last3=Ebert|first3=Daniel P.|last4=Rieseberg|first4=Loren H.|date=February 2014|title=Genome skimming reveals the origin of the Jerusalem Artichoke tuber crop species: neither from Jerusalem nor an artichoke|url=http://doi.wiley.com/10.1111/nph.12560|journal=New Phytologist|language=en|volume=201|issue=3|pages=1021–1030|doi=10.1111/nph.12560|via=}}</ref>
* Illumina HiSeq 2000 platform<ref name=":3" /><ref name=":geneious" /><ref name=":21">{{Cite journal|last=Fonseca|first=Luiz Henrique M.|last2=Lohmann|first2=Lúcia G.|date=January 2020|title=Exploring the potential of nuclear and mitochondrial sequencing data generated through genome‐skimming for plant phylogenetics: A case study from a clade of neotropical lianas|journal=Journal of Systematics and Evolution|language=en|volume=58|issue=1|pages=18–32|doi=10.1111/jse.12533|issn=1674-4918}}</ref><ref name=":22">{{Cite journal|last=Bock|first=Dan G.|last2=Kane|first2=Nolan C.|last3=Ebert|first3=Daniel P.|last4=Rieseberg|first4=Loren H.|date=February 2014|title=Genome skimming reveals the origin of the Jerusalem Artichoke tuber crop species: neither from Jerusalem nor an artichoke|journal=New Phytologist|language=en|volume=201|issue=3|pages=1021–1030|doi=10.1111/nph.12560|pmid=24245977}}</ref>
* Illumina HiSeq 2500 platform<ref name=":17" /><ref name=":6" /><ref name=":11" /><ref name=":18" /><ref name=":16" /><ref name=":23">{{Cite journal|last=Richter|first=Sandy|last2=Schwarz|first2=Francine|last3=Hering|first3=Lars|last4=Böggemann|first4=Markus|last5=Bleidorn|first5=Christoph|date=December 2015|title=The Utility of Genome Skimming for Phylogenomic Analyses as Demonstrated for Glycerid Relationships (Annelida, Glyceridae)|url=https://academic.oup.com/gbe/article-lookup/doi/10.1093/gbe/evv224|journal=Genome Biology and Evolution|language=en|volume=7|issue=12|pages=3443–3462|doi=10.1093/gbe/evv224|issn=1759-6653|pmc=4700955|pmid=26590213|via=}}</ref>
* Illumina HiSeq 2500 platform<ref name=":17" /><ref name=":6" /><ref name=":11" /><ref name=":18" /><ref name=":16" /><ref name=":23">{{Cite journal|last=Richter|first=Sandy|last2=Schwarz|first2=Francine|last3=Hering|first3=Lars|last4=Böggemann|first4=Markus|last5=Bleidorn|first5=Christoph|date=December 2015|title=The Utility of Genome Skimming for Phylogenomic Analyses as Demonstrated for Glycerid Relationships (Annelida, Glyceridae)|journal=Genome Biology and Evolution|language=en|volume=7|issue=12|pages=3443–3462|doi=10.1093/gbe/evv224|issn=1759-6653|pmc=4700955|pmid=26590213}}</ref>
* Illumina HiSeq 4000 platform<ref name=":14" />
* Illumina HiSeq 4000 platform<ref name=":14" />
* Illumina HiSeq X Ten platform<ref name=":5" /><ref name=":9" /><ref name=":14" />
* Illumina HiSeq X Ten platform<ref name=":5" /><ref name=":9" /><ref name=":14" />
Line 110: Line 110:
* SOAPdenovo-Trans<ref name=":14" />
* SOAPdenovo-Trans<ref name=":14" />
* Celera<ref name=":10" />
* Celera<ref name=":10" />
* IDBA-UD<ref name=":10" /><ref name=":23" /><ref name=":24">{{Cite journal|last=Grandjean|first=Frederic|last2=Tan|first2=Mun Hua|last3=Gan|first3=Han Ming|last4=Lee|first4=Yin Peng|last5=Kawai|first5=Tadashi|last6=Distefano|first6=Robert J.|last7=Blaha|first7=Martin|last8=Roles|first8=Angela J.|last9=Austin|first9=Christopher M.|date=November 2017|title=Rapid recovery of nuclear and mitochondrial genes by genome skimming from Northern Hemisphere freshwater crayfish|url=http://doi.wiley.com/10.1111/zsc.12247|journal=Zoologica Scripta|language=en|volume=46|issue=6|pages=718–728|doi=10.1111/zsc.12247|via=}}</ref>
* IDBA-UD<ref name=":10" /><ref name=":23" /><ref name=":24">{{Cite journal|last=Grandjean|first=Frederic|last2=Tan|first2=Mun Hua|last3=Gan|first3=Han Ming|last4=Lee|first4=Yin Peng|last5=Kawai|first5=Tadashi|last6=Distefano|first6=Robert J.|last7=Blaha|first7=Martin|last8=Roles|first8=Angela J.|last9=Austin|first9=Christopher M.|date=November 2017|title=Rapid recovery of nuclear and mitochondrial genes by genome skimming from Northern Hemisphere freshwater crayfish|journal=Zoologica Scripta|language=en|volume=46|issue=6|pages=718–728|doi=10.1111/zsc.12247}}</ref>
* Newbler<ref name=":10" />
* Newbler<ref name=":10" />
* Ray-Meta<ref name=":10" />
* Ray-Meta<ref name=":10" />
Line 181: Line 181:


=== Hyb-Seq ===
=== Hyb-Seq ===
Hyb-Seq is a new protocol for capturing low-copy nuclear genes that combines target enrichment and genome skimming.<ref>{{Cite journal|last=Weitemier|first=Kevin|last2=Straub|first2=Shannon C. K.|last3=Cronn|first3=Richard C.|last4=Fishbein|first4=Mark|last5=Schmickl|first5=Roswitha|last6=McDonnell|first6=Angela|last7=Liston|first7=Aaron|date=September 2014|title=Hyb-Seq: Combining Target Enrichment and Genome Skimming for Plant Phylogenomics|url=http://doi.wiley.com/10.3732/apps.1400042|journal=Applications in Plant Sciences|language=en|volume=2|issue=9|pages=1400042|doi=10.3732/apps.1400042|issn=2168-0450|pmc=4162667|pmid=25225629}}</ref> Target enrichment of the low-copy loci is achieved through designed enrichment probes for specific single-copy exons, but requires a nuclear draft genome and transcriptome of the targeted organism. The target-enriched libraries are then sequenced, and the resulting reads processed, assembled, and identified. Using off-target reads, [[Ribosomal DNA|rDNA cistrons]] and complete plastomes can also be assembled. Through this process, Hyb-Seq is able to produce genome-scale datasets for [[phylogenomics]].
Hyb-Seq is a new protocol for capturing low-copy nuclear genes that combines target enrichment and genome skimming.<ref>{{Cite journal|last=Weitemier|first=Kevin|last2=Straub|first2=Shannon C. K.|last3=Cronn|first3=Richard C.|last4=Fishbein|first4=Mark|last5=Schmickl|first5=Roswitha|last6=McDonnell|first6=Angela|last7=Liston|first7=Aaron|date=September 2014|title=Hyb-Seq: Combining Target Enrichment and Genome Skimming for Plant Phylogenomics|journal=Applications in Plant Sciences|language=en|volume=2|issue=9|pages=1400042|doi=10.3732/apps.1400042|issn=2168-0450|pmc=4162667|pmid=25225629}}</ref> Target enrichment of the low-copy loci is achieved through designed enrichment probes for specific single-copy exons, but requires a nuclear draft genome and transcriptome of the targeted organism. The target-enriched libraries are then sequenced, and the resulting reads processed, assembled, and identified. Using off-target reads, [[Ribosomal DNA|rDNA cistrons]] and complete plastomes can also be assembled. Through this process, Hyb-Seq is able to produce genome-scale datasets for [[phylogenomics]].


=== GetOrganelle ===
=== GetOrganelle ===
GetOrganelle is a toolkit that assembles organellar genomes using genome skimming reads.<ref>{{Cite journal|last=Jin|first=Jian-Jun|last2=Yu|first2=Wen-Bin|last3=Yang|first3=Jun-Bo|last4=Song|first4=Yu|last5=dePamphilis|first5=Claude W.|last6=Yi|first6=Ting-Shuang|last7=Li|first7=De-Zhu|date=2018-03-09|title=GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes|url=http://biorxiv.org/lookup/doi/10.1101/256479|language=en|doi=10.1101/256479|doi-access=free}}</ref> Organelle-associated reads are recruited using a modified “baiting and iterative mapping” approach. Reads that [[Sequence alignment|align]] to the target genome using Bowtie2<ref>{{Cite journal|last=Langmead|first=Ben|last2=Salzberg|first2=Steven L|date=Mar 2012|title=Fast gapped-read alignment with Bowtie 2|url=http://www.nature.com/articles/nmeth.1923|journal=Nature Methods|language=en|volume=9|issue=4|pages=357–359|doi=10.1038/nmeth.1923|issn=1548-7091|pmc=3322381|pmid=22388286|via=}}</ref> are referred to as “seed reads”. The seed reads are used as “baits” to recruit more organelle-associated reads via multiple iterations of extension. The read extension algorithm uses a [[Hash function|hashing approach]], where the reads are cut into substrings of certain lengths, referred to as “words”. At each extension iteration, these “words” are added to a [[hash table]], referred to as a “baits pool”, which dynamically increases in size with each iteration. Due to the low sequencing coverage of genome skims, non-target reads, even those with high sequence similarity to target reads, are largely not recruited.
Using the final recruited organellar-associated reads, GetOrganelle conducts a ''de novo'' [[Sequence assembly|assembly]], using [[SPAdes (software)|SPAdes]].<ref>{{Cite journal|last=Bankevich|first=Anton|last2=Nurk|first2=Sergey|last3=Antipov|first3=Dmitry|last4=Gurevich|first4=Alexey A.|last5=Dvorkin|first5=Mikhail|last6=Kulikov|first6=Alexander S.|last7=Lesin|first7=Valery M.|last8=Nikolenko|first8=Sergey I.|last9=Pham|first9=Son|last10=Prjibelski|first10=Andrey D.|last11=Pyshkin|first11=Alexey V.|date=May 2012|title=SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing|url=http://www.liebertpub.com/doi/10.1089/cmb.2012.0021|journal=Journal of Computational Biology|language=en|volume=19|issue=5|pages=455–477|doi=10.1089/cmb.2012.0021|issn=1066-5277|pmc=3342519|pmid=22506599|via=}}</ref> The [[De Bruijn graph|assembly graph]] is filtered and untangled, producing all possible paths of the graph, and therefore all configurations of the circular organellar genomes.
GetOrganelle is a toolkit that assembles organellar genomes using genome skimming reads.<ref>{{Cite journal|last=Jin|first=Jian-Jun|last2=Yu|first2=Wen-Bin|last3=Yang|first3=Jun-Bo|last4=Song|first4=Yu|last5=dePamphilis|first5=Claude W.|last6=Yi|first6=Ting-Shuang|last7=Li|first7=De-Zhu|date=2018-03-09|title=GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes|language=en|doi=10.1101/256479|doi-access=free}}</ref> Organelle-associated reads are recruited using a modified “baiting and iterative mapping” approach. Reads that [[Sequence alignment|align]] to the target genome using Bowtie2<ref>{{Cite journal|last=Langmead|first=Ben|last2=Salzberg|first2=Steven L|date=Mar 2012|title=Fast gapped-read alignment with Bowtie 2|journal=Nature Methods|language=en|volume=9|issue=4|pages=357–359|doi=10.1038/nmeth.1923|issn=1548-7091|pmc=3322381|pmid=22388286}}</ref> are referred to as “seed reads”. The seed reads are used as “baits” to recruit more organelle-associated reads via multiple iterations of extension. The read extension algorithm uses a [[Hash function|hashing approach]], where the reads are cut into substrings of certain lengths, referred to as “words”. At each extension iteration, these “words” are added to a [[hash table]], referred to as a “baits pool”, which dynamically increases in size with each iteration. Due to the low sequencing coverage of genome skims, non-target reads, even those with high sequence similarity to target reads, are largely not recruited.
Using the final recruited organellar-associated reads, GetOrganelle conducts a ''de novo'' [[Sequence assembly|assembly]], using [[SPAdes (software)|SPAdes]].<ref>{{Cite journal|last=Bankevich|first=Anton|last2=Nurk|first2=Sergey|last3=Antipov|first3=Dmitry|last4=Gurevich|first4=Alexey A.|last5=Dvorkin|first5=Mikhail|last6=Kulikov|first6=Alexander S.|last7=Lesin|first7=Valery M.|last8=Nikolenko|first8=Sergey I.|last9=Pham|first9=Son|last10=Prjibelski|first10=Andrey D.|last11=Pyshkin|first11=Alexey V.|date=May 2012|title=SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing|journal=Journal of Computational Biology|language=en|volume=19|issue=5|pages=455–477|doi=10.1089/cmb.2012.0021|issn=1066-5277|pmc=3342519|pmid=22506599}}</ref> The [[De Bruijn graph|assembly graph]] is filtered and untangled, producing all possible paths of the graph, and therefore all configurations of the circular organellar genomes.
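
The recruitment loop described above can be sketched in a few lines of Python. This is a minimal illustration, not GetOrganelle's actual implementation: the function names (<code>words</code>, <code>recruit_reads</code>), the word length <code>k</code>, and the round limit are hypothetical, and the sketch ignores reverse complements, quality filtering, and the Bowtie2 seeding step (the seed reads are simply passed in).

```python
def words(read, k):
    """Cut a read into all overlapping substrings ("words") of length k."""
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def recruit_reads(seed_reads, all_reads, k=5, max_rounds=10):
    """Iteratively recruit reads that share at least one word with the baits pool.

    Hypothetical helper for illustration only; not part of GetOrganelle.
    """
    # The "baits pool" is a hash-based set of words that grows each round.
    baits = set()
    for read in seed_reads:
        baits |= words(read, k)

    recruited = set(seed_reads)
    for _ in range(max_rounds):
        # A read is recruited if any of its words is already in the pool.
        newly = [r for r in all_reads
                 if r not in recruited and words(r, k) & baits]
        if not newly:
            break  # no extension possible; stop early
        for read in newly:
            recruited.add(read)
            baits |= words(read, k)
    return recruited
```

With a seed read <code>"AAACCCGGGT"</code> and a pool of reads in which one overlaps the seed and a second overlaps only the first, the second read is recruited in a later round, while an unrelated read is never recruited.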


=== Skmer ===
=== Skmer ===
Skmer is an assembly-free and alignment-free tool to compute genomic distances between the query and reference genome skims.<ref name=":25">{{Cite journal|last=Sarmashghi|first=Shahab|last2=Bohmann|first2=Kristine|last3=P. Gilbert|first3=M. Thomas|last4=Bafna|first4=Vineet|last5=Mirarab|first5=Siavash|date=December 2019|title=Skmer: assembly-free and alignment-free sample identification using genome skims|url=https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1632-4|journal=Genome Biology|language=en|volume=20|issue=1|pages=34|doi=10.1186/s13059-019-1632-4|issn=1474-760X|pmc=6374904|pmid=30760303}}</ref> Skmer uses a two-stage approach to compute these distances. First, it generates k-mer frequency profiles using a tool called JellyFish,<ref>{{Cite journal|last=Marçais|first=Guillaume|last2=Kingsford|first2=Carl|date=2011-03-15|title=A fast, lock-free approach for efficient parallel counting of occurrences of k-mers|url=https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btr011|journal=Bioinformatics|language=en|volume=27|issue=6|pages=764–770|doi=10.1093/bioinformatics/btr011|issn=1460-2059|pmc=3051319|pmid=21217122}}</ref> and these k-mers are then converted into hashes.<ref name=":25" /> A random subset of these hashes is selected to form a so-called "sketch".<ref name=":25" /> For its second stage, Skmer uses Mash<ref>{{Cite journal|last=Ondov|first=Brian D.|last2=Treangen|first2=Todd J.|last3=Melsted|first3=Páll|last4=Mallonee|first4=Adam B.|last5=Bergman|first5=Nicholas H.|last6=Koren|first6=Sergey|last7=Phillippy|first7=Adam M.|date=Dec 2016|title=Mash: fast genome and metagenome distance estimation using MinHash|url=http://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x|journal=Genome Biology|language=en|volume=17|issue=1|pages=132|doi=10.1186/s13059-016-0997-x|issn=1474-760X|pmc=4915045|pmid=27323842|via=}}</ref> to estimate the [[Jaccard index]] of two of these sketches.<ref name=":25" /> The combination of these two stages is used to estimate the evolutionary distance.<ref name=":25" />
Skmer is an assembly-free and alignment-free tool to compute genomic distances between the query and reference genome skims.<ref name=":25">{{Cite journal|last=Sarmashghi|first=Shahab|last2=Bohmann|first2=Kristine|last3=P. Gilbert|first3=M. Thomas|last4=Bafna|first4=Vineet|last5=Mirarab|first5=Siavash|date=December 2019|title=Skmer: assembly-free and alignment-free sample identification using genome skims|journal=Genome Biology|language=en|volume=20|issue=1|pages=34|doi=10.1186/s13059-019-1632-4|issn=1474-760X|pmc=6374904|pmid=30760303}}</ref> Skmer uses a two-stage approach to compute these distances. First, it generates k-mer frequency profiles using a tool called JellyFish,<ref>{{Cite journal|last=Marçais|first=Guillaume|last2=Kingsford|first2=Carl|date=2011-03-15|title=A fast, lock-free approach for efficient parallel counting of occurrences of k-mers|journal=Bioinformatics|language=en|volume=27|issue=6|pages=764–770|doi=10.1093/bioinformatics/btr011|issn=1460-2059|pmc=3051319|pmid=21217122}}</ref> and these k-mers are then converted into hashes.<ref name=":25" /> A random subset of these hashes is selected to form a so-called "sketch".<ref name=":25" /> For its second stage, Skmer uses Mash<ref>{{Cite journal|last=Ondov|first=Brian D.|last2=Treangen|first2=Todd J.|last3=Melsted|first3=Páll|last4=Mallonee|first4=Adam B.|last5=Bergman|first5=Nicholas H.|last6=Koren|first6=Sergey|last7=Phillippy|first7=Adam M.|date=Dec 2016|title=Mash: fast genome and metagenome distance estimation using MinHash|journal=Genome Biology|language=en|volume=17|issue=1|pages=132|doi=10.1186/s13059-016-0997-x|issn=1474-760X|pmc=4915045|pmid=27323842}}</ref> to estimate the [[Jaccard index]] of two of these sketches.<ref name=":25" /> The combination of these two stages is used to estimate the evolutionary distance.<ref name=":25" />
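
The sketching and Jaccard-estimation steps can be illustrated with a short Python example. It is a simplified sketch under stated assumptions, not Skmer's or Mash's code: all function names are hypothetical, SHA-1 stands in for the hash function, and the sketch keeps the smallest hash values (a common MinHash variant). The final function applies the distance formula published for Mash, <math>D = -\tfrac{1}{k}\ln\tfrac{2J}{1+J}</math>, and does not attempt to reproduce how Skmer combines the k-mer profiles with the Jaccard estimate.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_hash(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k, size):
    """Keep the `size` smallest k-mer hashes as the sketch."""
    return sorted(kmer_hash(x) for x in kmers(seq, k))[:size]


def jaccard_from_sketches(a, b, size):
    """Estimate the Jaccard index from two bottom-`size` sketches."""
    union_bottom = sorted(set(a) | set(b))[:size]
    if not union_bottom:
        return 0.0  # no k-mers at all (sequences shorter than k)
    shared = set(a) & set(b)
    return sum(1 for h in union_bottom if h in shared) / len(union_bottom)


def mash_distance(j, k):
    """Distance from a Jaccard estimate, using the formula published for Mash."""
    if j == 0:
        return 1.0  # no shared k-mers: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

When the sketch size is at least the number of distinct k-mers, the estimate coincides with the exact Jaccard index; smaller sketches trade accuracy for speed and memory.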


=== Geneious ===
=== Geneious ===
[https://www.geneious.com Geneious] is an integrative software platform that allows users to perform various steps in bioinformatic analysis, such as [[Sequence assembly|assembly]], [[Sequence alignment|alignment]], and [[phylogenetics]], by incorporating other tools within a GUI-based platform.<ref name=":geneious" /><ref name=":26">{{Cite web|url=https://ostr.ccr.cancer.gov/bioinformatics/software/geneious/|title=Geneious – OSTR|language=en-US|access-date=2020-02-28}}</ref>
[https://www.geneious.com Geneious] is an integrative software platform that allows users to perform various steps in bioinformatic analysis, such as [[Sequence assembly|assembly]], [[Sequence alignment|alignment]], and [[phylogenetics]], by incorporating other tools within a GUI-based platform.<ref name=":geneious" /><ref name=":26">{{Cite web|url=https://ostr.ccr.cancer.gov/bioinformatics/software/geneious/|title=Geneious – OSTR|language=en-US|access-date=2020-02-28}}</ref>
== ''In silico'' Genome skimming ==
== ''In silico'' Genome skimming ==
Although genome skimming is usually chosen as a cost-effective method to sequence organellar genomes, it can also be done ''in silico'' if (deep) whole-genome sequencing data have already been obtained. ''In silico'' genome skimming, which subsamples the whole-genome sequencing reads, has been demonstrated to simplify organellar genome assembly.<ref>{{Cite journal|last=Lin|first=Diana|last2=Coombe|first2=Lauren|last3=Jackman|first3=Shaun D.|last4=Gagalova|first4=Kristina K.|last5=Warren|first5=René L.|last6=Hammond|first6=S. Austin|last7=Kirk|first7=Heather|last8=Pandoh|first8=Pawan|last9=Zhao|first9=Yongjun|last10=Moore|first10=Richard A.|last11=Mungall|first11=Andrew J.|date=2019-06-06|editor-last=Rokas|editor-first=Antonis|title=Complete Chloroplast Genome Sequence of a White Spruce ( Picea glauca , Genotype WS77111) from Eastern Canada|url=http://genomea.asm.org/lookup/doi/10.1128/MRA.00381-19|journal=Microbiology Resource Announcements|language=en|volume=8|issue=23|pages=e00381–19, /mra/8/23/MRA.00381–19.atom|doi=10.1128/MRA.00381-19|issn=2576-098X|pmc=6554609|pmid=31171622}}</ref><ref>{{Cite journal|last=Lin|first=Diana|last2=Coombe|first2=Lauren|last3=Jackman|first3=Shaun D.|last4=Gagalova|first4=Kristina K.|last5=Warren|first5=René L.|last6=Hammond|first6=S. Austin|last7=McDonald|first7=Helen|last8=Kirk|first8=Heather|last9=Pandoh|first9=Pawan|last10=Zhao|first10=Yongjun|last11=Moore|first11=Richard A.|date=2019-06-13|editor-last=Stajich|editor-first=Jason E.|title=Complete Chloroplast Genome Sequence of an Engelmann Spruce ( Picea engelmannii , Genotype Se404-851) from Western Canada|url=http://genomea.asm.org/lookup/doi/10.1128/MRA.00382-19|journal=Microbiology Resource Announcements|language=en|volume=8|issue=24|pages=e00382–19, /mra/8/24/MRA.00382–19.atom|doi=10.1128/MRA.00382-19|issn=2576-098X|pmc=6588038|pmid=31196920}}</ref> Because the organellar genomes are high-copy in the cell, ''in silico'' genome skimming essentially filters out nuclear sequences, leaving a higher organellar-to-nuclear sequence ratio for assembly and reducing the complexity of the assembly. ''In silico'' genome skimming was first done as a proof of concept, optimizing the parameters for read type, read length, and sequencing coverage.<ref name=":0" />
Although genome skimming is usually chosen as a cost-effective method to sequence organellar genomes, it can also be done ''in silico'' if (deep) whole-genome sequencing data have already been obtained. ''In silico'' genome skimming, which subsamples the whole-genome sequencing reads, has been demonstrated to simplify organellar genome assembly.<ref>{{Cite journal|last=Lin|first=Diana|last2=Coombe|first2=Lauren|last3=Jackman|first3=Shaun D.|last4=Gagalova|first4=Kristina K.|last5=Warren|first5=René L.|last6=Hammond|first6=S. Austin|last7=Kirk|first7=Heather|last8=Pandoh|first8=Pawan|last9=Zhao|first9=Yongjun|last10=Moore|first10=Richard A.|last11=Mungall|first11=Andrew J.|date=2019-06-06|editor-last=Rokas|editor-first=Antonis|title=Complete Chloroplast Genome Sequence of a White Spruce ( Picea glauca , Genotype WS77111) from Eastern Canada|journal=Microbiology Resource Announcements|language=en|volume=8|issue=23|pages=e00381–19, /mra/8/23/MRA.00381–19.atom|doi=10.1128/MRA.00381-19|issn=2576-098X|pmc=6554609|pmid=31171622}}</ref><ref>{{Cite journal|last=Lin|first=Diana|last2=Coombe|first2=Lauren|last3=Jackman|first3=Shaun D.|last4=Gagalova|first4=Kristina K.|last5=Warren|first5=René L.|last6=Hammond|first6=S. Austin|last7=McDonald|first7=Helen|last8=Kirk|first8=Heather|last9=Pandoh|first9=Pawan|last10=Zhao|first10=Yongjun|last11=Moore|first11=Richard A.|date=2019-06-13|editor-last=Stajich|editor-first=Jason E.|title=Complete Chloroplast Genome Sequence of an Engelmann Spruce ( Picea engelmannii , Genotype Se404-851) from Western Canada|journal=Microbiology Resource Announcements|language=en|volume=8|issue=24|pages=e00382–19, /mra/8/24/MRA.00382–19.atom|doi=10.1128/MRA.00382-19|issn=2576-098X|pmc=6588038|pmid=31196920}}</ref> Because the organellar genomes are high-copy in the cell, ''in silico'' genome skimming essentially filters out nuclear sequences, leaving a higher organellar-to-nuclear sequence ratio for assembly and reducing the complexity of the assembly. ''In silico'' genome skimming was first done as a proof of concept, optimizing the parameters for read type, read length, and sequencing coverage.<ref name=":0" />
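
As a rough illustration of the subsampling idea, the following Python sketch keeps a random fraction of reads and computes the coverage expected afterwards. It is not taken from the cited studies: the function names, the default seed, and any fractions or coverage values used with it are hypothetical, and real pipelines would subsample paired-end FASTQ records rather than in-memory lists.

```python
import random


def subsample_reads(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction`.

    Hypothetical helper for illustration; a fixed seed makes the
    subsample reproducible.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    return [read for read in reads if rng.random() < fraction]


def expected_coverage(coverage, fraction):
    """Expected sequencing depth after keeping `fraction` of the reads."""
    return coverage * fraction
```

Because subsampling scales every depth by the same factor, a high-copy organellar genome can remain deep enough to assemble at a fraction where single-copy nuclear regions fall to a depth too low to assemble, which is the effect the paragraph above describes as filtering out nuclear sequences.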


== Other Applications ==
== Other Applications ==
Beyond the uses listed above, genome skimming has also been applied to other tasks, such as quantifying pollen mixtures<ref name=":14">{{Cite journal|last=Lang|first=Dandan|last2=Tang|first2=Min|last3=Hu|first3=Jiahui|last4=Zhou|first4=Xin|date=November 2019|title=Genome‐skimming provides accurate quantification for pollen mixtures|url=https://onlinelibrary.wiley.com/doi/abs/10.1111/1755-0998.13061|journal=Molecular Ecology Resources|language=en|volume=19|issue=6|pages=1433–1446|doi=10.1111/1755-0998.13061|issn=1755-098X|pmc=6900181|pmid=31325909}}</ref> and the monitoring and conservation of certain populations.<ref>{{Cite journal|last=Johri|first=Shaili|last2=Doane|first2=Michael|last3=Allen|first3=Lauren|last4=Dinsdale|first4=Elizabeth|date=2019-03-29|title=Taking Advantage of the Genomics Revolution for Monitoring and Conservation of Chondrichthyan Populations|url=https://www.mdpi.com/1424-2818/11/4/49|journal=Diversity|language=en|volume=11|issue=4|pages=49|doi=10.3390/d11040049|issn=1424-2818|doi-access=free}}</ref> Genome skimming can also be used for variant calling, to examine [[Single-nucleotide polymorphism|single nucleotide polymorphisms]] across a species.<ref name=":12" />
Beyond the uses listed above, genome skimming has also been applied to other tasks, such as quantifying pollen mixtures<ref name=":14">{{Cite journal|last=Lang|first=Dandan|last2=Tang|first2=Min|last3=Hu|first3=Jiahui|last4=Zhou|first4=Xin|date=November 2019|title=Genome‐skimming provides accurate quantification for pollen mixtures|journal=Molecular Ecology Resources|language=en|volume=19|issue=6|pages=1433–1446|doi=10.1111/1755-0998.13061|issn=1755-098X|pmc=6900181|pmid=31325909}}</ref> and the monitoring and conservation of certain populations.<ref>{{Cite journal|last=Johri|first=Shaili|last2=Doane|first2=Michael|last3=Allen|first3=Lauren|last4=Dinsdale|first4=Elizabeth|date=2019-03-29|title=Taking Advantage of the Genomics Revolution for Monitoring and Conservation of Chondrichthyan Populations|url=https://www.mdpi.com/1424-2818/11/4/49|journal=Diversity|language=en|volume=11|issue=4|pages=49|doi=10.3390/d11040049|issn=1424-2818|doi-access=free}}</ref> Genome skimming can also be used for variant calling, to examine [[Single-nucleotide polymorphism|single nucleotide polymorphisms]] across a species.<ref name=":12" />


== Advantages ==
== Advantages ==
Line 214: Line 214:


=== Scalability ===
=== Scalability ===
Both the wet-lab and the bioinformatics parts of genome skimming have certain challenges with scalability. Although the cost of sequencing in genome skimming is affordable at $80 for 1 Gb in 2016, the library preparation for sequencing is still very expensive, at least ~$200 per sample (as of 2016). Additionally, most library preparation protocols have not been fully automated with robotics yet. On the bioinformatics side, large complex databases and automated workflows need to be designed to handle the large amounts of data resulting from genome skimming. The automation of the following processes need to be implemented:<ref>{{Cite journal|last=Coissac|first=Eric|last2=Hollingsworth|first2=Peter M.|last3=Lavergne|first3=Sébastien|last4=Taberlet|first4=Pierre|date=April 2016|title=From barcodes to genomes: extending the concept of DNA barcoding|url=http://doi.wiley.com/10.1111/mec.13549|journal=Molecular Ecology|language=en|volume=25|issue=7|pages=1423–1428|doi=10.1111/mec.13549|doi-access=free}}</ref>
Both the wet-lab and the bioinformatics parts of genome skimming have certain challenges with scalability. Although the cost of sequencing in genome skimming is affordable at $80 for 1 Gb in 2016, the library preparation for sequencing is still very expensive, at least ~$200 per sample (as of 2016). Additionally, most library preparation protocols have not been fully automated with robotics yet. On the bioinformatics side, large complex databases and automated workflows need to be designed to handle the large amounts of data resulting from genome skimming. The automation of the following processes need to be implemented:<ref>{{Cite journal|last=Coissac|first=Eric|last2=Hollingsworth|first2=Peter M.|last3=Lavergne|first3=Sébastien|last4=Taberlet|first4=Pierre|date=April 2016|title=From barcodes to genomes: extending the concept of DNA barcoding|journal=Molecular Ecology|language=en|volume=25|issue=7|pages=1423–1428|doi=10.1111/mec.13549|pmid=26821259|doi-access=free}}</ref>


# Assembly of the standard barcodes
# Assembly of the standard barcodes

Revision as of 00:26, 12 May 2020

Genome skimming allows for assembly of high-copy fractions of the genome into contiguous, complete genomes.

Genome skimming is a sequencing approach that uses low-pass, shallow sequencing of a genome (up to 5%) to generate fragments of DNA, known as genome skims.[1][2] These genome skims contain information about the high-copy fraction of the genome.[2] The high-copy fraction of the genome consists of the ribosomal DNA, plastid genome (plastome), mitochondrial genome (mitogenome), and nuclear repeats such as microsatellites and transposable elements.[3] Genome skimming employs high-throughput, next generation sequencing technology to generate these skims.[1] Although these skims are merely 'the tip of the genomic iceberg', phylogenomic analysis of them can still provide insights on evolutionary history and biodiversity at a lower cost and larger scale than traditional methods.[2][3][4] Because genome skimming requires only a small amount of DNA, its methodology can also be applied in fields other than genomics, for tasks such as determining the traceability of products in the food industry, enforcing international regulations regarding biodiversity and biological resources, and forensics.[5]

Current Uses

In addition to the assembly of the smaller organellar genomes, genome skimming can also be used to uncover conserved ortholog sequences for phylogenomic studies. In phylogenomic studies of multicellular pathogens, genome skimming can be used to find effector genes, discover endosymbionts and characterize genomic variation.[6]

High-copy DNA

Ribosomal DNA

The internal transcribed spacers (ITS) are non-coding regions within the 18-5.8-28S rDNA in eukaryotes, and are one feature of rDNA that has been used in genome skimming studies.[7] ITS are used to detect different species within a genus, due to their high inter-species variability.[7] They have low individual variability, which prevents identification of distinct strains or individuals.[7] They are also present in all eukaryotes, have a high evolution rate, and have been used in phylogenetic analysis between and across species.[7]

When targeting nuclear rDNA, it is suggested that a minimum final sequencing depth of 100X is achieved, and sequences with less than 5X depth are masked.[1]
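The masking rule above can be illustrated with a minimal sketch. It assumes that a consensus sequence and a per-base depth list have already been produced by some read-mapping step; the function names `mask_low_depth` and `mean_depth` are hypothetical and do not come from any tool cited here. The default threshold of 5 mirrors the 5X figure suggested for nuclear rDNA.

```python
def mask_low_depth(sequence, depths, min_depth=5, mask_char="N"):
    """Replace every base whose depth is below min_depth with mask_char."""
    if len(sequence) != len(depths):
        raise ValueError("sequence and depths must have the same length")
    return "".join(
        base if depth >= min_depth else mask_char
        for base, depth in zip(sequence, depths)
    )


def mean_depth(depths):
    """Average per-base depth across the assembled region."""
    return sum(depths) / len(depths)


# Toy data (illustrative only): two positions fall below 5X.
consensus = "ACGTACGT"
per_base_depth = [120, 110, 4, 3, 150, 140, 5, 130]

masked = mask_low_depth(consensus, per_base_depth)
print(masked)                      # ACNNACGT
print(mean_depth(per_base_depth))  # 82.75
```

In this toy case the mean depth (82.75X) falls short of the suggested 100X minimum, so a real dataset with these values would likely need deeper sequencing before the rDNA assembly is used.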

Plastomes

The plastid genome, or plastome, has been used extensively in identification and evolutionary studies using genome skimming due to its high abundance within plants (~3-5% of cell DNA), small size, simple structure, and greater conservation of gene structure than nuclear or mitochondrial genes.[8][9] Plastid studies have previously been limited by the number of regions that could be assessed in traditional approaches.[9] Using genome skimming, the sequencing of the entire plastome can be done at a fraction of the cost and time required for typical sequencing approaches like Sanger sequencing.[3] Plastomes have been suggested as a method to replace traditional DNA barcodes in plants,[3] such as the rbcL and matK barcode genes. Compared to the typical DNA barcode, genome skimming produces plastomes at a tenth of the cost per base.[5] Recent uses of genome skims of plastomes have allowed greater resolution of phylogenies, higher differentiation of specific groups within taxa, and more accurate estimates of biodiversity.[9] Additionally, the plastome has been used to compare species within a genus to look at evolutionary changes and diversity within a group.[9]

When targeting plastomes, it is suggested that a minimum final sequencing depth of 30X is achieved for single-copy regions to ensure high quality assemblies. Single nucleotide polymorphisms (SNPs) with less than 20X depth should be masked.[1]

Mitogenomes

The mitochondrial genome, or mitogenome, is used as a molecular marker in a great variety of studies because of its maternal inheritance, high copy-number in the cell, lack of recombination, and high mutation rate. It is often used for phylogenetic studies as it is very uniform across metazoan groups: a circular, double-stranded DNA molecule of about 15 to 20 kilobases with 37 genes (2 ribosomal RNA genes, 13 protein-coding genes, and 22 transfer RNA genes). Mitochondrial barcode sequences, such as COI, NADH2, 16S rRNA, and 12S rRNA, can also be used for taxonomic identification.[10] The increased publishing of complete mitogenomes allows for inference of robust phylogenies across many taxonomic groups, and it can capture events such as gene rearrangements and positioning of mobile genetic elements. Using genome skimming to assemble complete mitogenomes, phylogenetic history and biodiversity of many organisms can be resolved.[4]

When targeting mitogenomes, there are no specific suggestions for minimum final sequencing depth, as mitogenomes are more variable in size and complexity in plant species, which increases the difficulty of assembling repeated sequences. However, highly conserved coding sequences and nonrepetitive flanking regions can be assembled using reference-guided assembly. Sequences should be masked in the same way as when targeting plastomes and nuclear ribosomal DNA.[1]

Nuclear repeats (satellites or transposable elements)

Nuclear repeats in the genome are an underused source of phylogenetic data. When 5% of the nuclear genome is sequenced, thousands of copies of the nuclear repeats will be present. Although the sequenced repeats are only a sample of those in the entire genome, it has been shown that these sequenced fractions accurately reflect genomic abundance. These repeats can be clustered de novo and their abundance estimated. The distribution and occurrence of these repeat types can be phylogenetically informative and provide information about the evolutionary history of various species.[1]

Low-copy DNA

Low-copy DNA can prove useful for evolutionary developmental and phylogenetic studies.[11] It can be mined from high-copy fractions in a number of ways, such as developing primers from databases that contain conserved orthologous genes, single-copy conserved orthologous genes, and shared copy genes.[11] Another method is looking for novel probes that target low-copy genes using transcriptomics via Hyb-Seq.[11] While nuclear genomes assembled using genome skims are extremely fragmented, some low-copy, single-copy nuclear genes can be successfully assembled.[12]

Low-quantity degraded DNA

Previous methods of recovering degraded DNA were based on Sanger sequencing, relied on large intact DNA templates, and were affected by contamination and the method of preservation. Genome skimming, on the other hand, can be used to extract genetic information from preserved species in herbaria and museums, where the DNA is often very degraded and very little remains.[4][13] Studies in plants show that specimens as old as 80 years, with as little as 500 pg of degraded DNA, can be used with genome skimming to infer genomic information.[13] In herbaria, even with low yield and low quality DNA, one study was still able to produce "high-quality complete chloroplast and ribosomal DNA sequences" at a large scale for downstream analyses.[14]

In field studies, invertebrates are stored in ethanol, which is usually discarded during DNA-based studies.[15] Genome skimming has been shown to detect the low quantity of DNA in this ethanol fraction and to provide information about the biomass of the specimens in a fraction, the microbiota of outer tissue layers, and the gut contents (like prey) released by the vomit reflex.[15] Thus, genome skimming can provide an additional method of understanding ecology via low-copy DNA.[15]

Workflow

DNA extraction

DNA extraction protocols will vary depending on the source of the sample (i.e. plants, animals, etc.). The following DNA extraction protocols have been used in genome skimming:

Library preparation

Library preparation protocols will depend on a variety of factors: organism, tissue type, etc. In the case of preserved specimens, specific modifications to library preparation protocols may have to be made.[1] The following library preparation protocols have been used in genome skimming:

  • Illumina TruSeq DNA Sample Preparation kit[5][6][15]
  • Illumina TruSeq PCR-free kit[7][21]
  • NEXTFlex DNA Sequencing kit[18]
  • NEBNext Ultra II DNA[9][13][16]
  • NEBNext Multiplex Oligos[16]
  • Nextera XT DNA Library Preparation kit[4]
  • TruSeq Nano DNA LT Library Preparation kit[14][17]
  • Rapid Sequencing kit[10]

Sequencing

Sequencing with short reads or long reads will depend on the target genome or genes. Microsatellites in nuclear repeats require longer reads.[23] The following sequencing platforms have been used in genome skimming:

The Illumina MiSeq platform has been chosen by certain researchers because its reads are comparatively long for a short-read platform.[6]

Assembly

After genome skimming, high-copy organellar DNA can be assembled with a reference guide or assembled de novo. High-copy nuclear repeats can be clustered de novo.[1] Assemblers chosen will depend on the target genome and whether short or long reads are used. The following tools have been used to assemble genomes from genome skims:

Other

Annotation

Annotation is used to identify genes in the genome assemblies. The annotation tool chosen will depend on the target genome, and the target features of that genome. The following annotation tools have been used in genome skimming to annotate organellar genomes:

Other

Phylogeny construction

The assembled sequences are globally aligned, and then phylogenetic trees are constructed using phylogeny construction software. The software chosen for phylogeny construction will depend on whether a Maximum Likelihood (ML), Maximum Parsimony (MP), or Bayesian Inference (BI) method is appropriate. The following phylogeny construction programs have been used in genome skimming:

Tools and Pipelines

Various protocols, pipelines, and bioinformatic tools have been developed to help automate the downstream processes of genome skimming.

Hyb-Seq

Hyb-Seq is a new protocol for capturing low-copy nuclear genes that combines target enrichment and genome skimming.[29] Target enrichment of the low-copy loci is achieved through designed enrichment probes for specific single-copy exons, but requires a nuclear draft genome and transcriptome of the targeted organism. The target-enriched libraries are then sequenced, and the resulting reads processed, assembled, and identified. Using off-target reads, rDNA cistrons and complete plastomes can also be assembled. Through this process, Hyb-Seq is able to produce genome-scale datasets for phylogenomics.

GetOrganelle

GetOrganelle is a toolkit that assembles organellar genomes using genome skimming reads.[30] Organelle-associated reads are recruited using a modified “baiting and iterative mapping” approach. The reads that align to the target genome, identified using Bowtie2,[31] are referred to as “seed reads”. The seed reads are used as “baits” to recruit more organelle-associated reads via multiple iterations of extension. The read extension algorithm uses a hashing approach, where the reads are cut into substrings of certain lengths, referred to as “words”. At each extension iteration, these “words” are added to a hash table, referred to as a “baits pool”, which dynamically increases in size with each iteration. Due to the low sequencing coverage of genome skims, non-target reads, even those with high sequence similarity to target reads, are largely not recruited. Using the final recruited organelle-associated reads, GetOrganelle conducts a de novo assembly, using SPAdes.[32] The assembly graph is filtered and untangled, producing all possible paths of the graph, and therefore all configurations of the circular organellar genomes.
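The baiting and extension idea described above can be sketched in a few lines. This is a simplified toy, not GetOrganelle's actual implementation: it assumes the seed reads are already known, ignores reverse complements and sequencing errors, and recruits any read that shares at least one word with the baits pool. The names `words` and `recruit_reads` and the word size of 5 are hypothetical choices made for this illustration.

```python
def words(read, word_size):
    """Yield every substring ("word") of length word_size in a read."""
    for i in range(len(read) - word_size + 1):
        yield read[i:i + word_size]


def recruit_reads(seed_reads, all_reads, word_size=5, max_rounds=10):
    """Iteratively recruit reads that share a word with the baits pool."""
    # The baits pool starts with the words of the seed reads.
    baits_pool = set()
    for read in seed_reads:
        baits_pool.update(words(read, word_size))

    recruited = set(seed_reads)
    for _ in range(max_rounds):
        # Find reads that overlap the current pool by at least one word.
        newly_recruited = [
            read for read in all_reads
            if read not in recruited
            and any(w in baits_pool for w in words(read, word_size))
        ]
        if not newly_recruited:
            break  # no further extension is possible
        # The pool grows with the words of each newly recruited read.
        for read in newly_recruited:
            recruited.add(read)
            baits_pool.update(words(read, word_size))
    return recruited


seeds = ["ACGTACGTAA"]
reads = ["CGTAATTGGC", "TTGGCCATCA", "GGGGGGGGGG"]
recruited = recruit_reads(seeds, reads)
print(sorted(recruited))
```

Here the second read is only reachable through the first recruited read, which shows why the pool has to grow over several iterations; the unrelated poly-G read is never recruited.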

Skmer

Skmer is an assembly-free and alignment-free tool to compute genomic distances between the query and reference genome skims.[33] Skmer uses a two-stage approach to compute these distances. First, it generates k-mer frequency profiles using a tool called JellyFish,[34] and these k-mers are then converted into hashes.[33] A random subset of these hashes is selected to form a so-called "sketch".[33] In its second stage, Skmer uses Mash[35] to estimate the Jaccard index of two of these sketches.[33] The combination of these two stages is used to estimate the evolutionary distance.[33]
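The sketching idea can be illustrated with a toy example. The code below is a minimal sketch of MinHash-style Jaccard estimation, not the code of Skmer, JellyFish or Mash: it assumes a bottom-s sketch (the smallest hash values stand in for the random subset), uses SHA-1 as an arbitrary hash function, and converts the Jaccard index to a distance with the formula published for Mash, D = -(1/k) ln(2j / (1 + j)). It does not model the effects of low coverage or sequencing error. All function names are hypothetical.

```python
import hashlib
import math


def kmers(sequence, k):
    """Return the set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is an arbitrary choice)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def sketch(sequence, k, sketch_size):
    """Bottom-s sketch: the sketch_size smallest k-mer hashes."""
    return sorted(hash_kmer(km) for km in kmers(sequence, k))[:sketch_size]


def jaccard_estimate(sketch_a, sketch_b, sketch_size):
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:sketch_size]
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(jaccard, k):
    """Convert a Jaccard index into a Mash-style distance."""
    if jaccard == 0:
        return 1.0
    return max(0.0, -math.log(2 * jaccard / (1 + jaccard)) / k)


k, size = 4, 50
a = sketch("ACGTACGGTTCAGGCATTAGC", k, size)
b = sketch("ACGTACGGTTCAGGCATTAGC", k, size)
c = sketch("CCCCCCCCCC", k, size)

print(mash_distance(jaccard_estimate(a, b, size), k))  # identical input
print(mash_distance(jaccard_estimate(a, c, size), k))  # no shared k-mers
```

Real genome skims would use much larger values of k and sketch size than this toy; the small values here only keep the example readable.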

Geneious

Geneious is an integrative software platform that allows users to perform various steps in bioinformatic analysis, such as assembly, alignment, and phylogenetics, by incorporating other tools within a GUI-based platform.[18][28]

In silico Genome skimming

Although genome skimming is usually chosen as a cost-effective method to sequence organellar genomes, genome skimming can be done in silico if (deep) whole-genome sequencing data has already been obtained. In silico genome skimming, which subsamples the reads of the whole-genome dataset, has been demonstrated to simplify organellar genome assembly.[36][37] Since the organellar genomes will be high-copy in the cell, in silico genome skimming essentially filters out nuclear sequences, leaving a higher organellar to nuclear sequence ratio for assembly and reducing the complexity of the assembly. In silico genome skimming was first done as a proof-of-concept, optimizing the parameters for read type, read length, and sequencing coverage.[1]
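Subsampling of this kind can be sketched as follows. This is an illustrative toy rather than the procedure used in the cited studies: it assumes the reads are already loaded into a list, keeps each read independently with a fixed probability, and uses a fixed random seed for reproducibility. The function name `subsample_reads` and the 5% fraction are hypothetical choices.

```python
import random


def subsample_reads(reads, fraction, seed=0):
    """Keep each read independently with probability `fraction`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # seeded so the subsample is reproducible
    return [read for read in reads if rng.random() < fraction]


# Toy read names standing in for a deep whole-genome dataset.
all_reads = [f"read{i}" for i in range(1000)]
skim = subsample_reads(all_reads, 0.05, seed=42)
print(len(skim), "of", len(all_reads), "reads kept")
```

For paired-end data, both mates of a pair would need to be kept or dropped together, which this single-list sketch does not handle.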

Other Applications

Other than the current uses listed above, genome skimming has also been applied to other tasks, such as quantifying pollen mixtures[19] and the monitoring and conservation of certain populations.[38] Genome skimming can also be used for variant calling, to examine single nucleotide polymorphisms across a species.[22]

Advantages

Genome skimming is a cost-effective, rapid and reliable method to generate large shallow datasets,[5] since several datasets (plastid, mitochondrial, nuclear) are generated per run.[3] It is very simple to implement, requires less lab work and optimization, and does not require a priori knowledge of the organism nor its genome size.[3] This provides a low-risk avenue for biological inquiry and hypothesis generation without a huge commitment of resources.[6]

Genome skimming is especially advantageous in cases where the genomic DNA may be old and degraded from chemical treatments, such as specimens from herbarium and museum collections,[4] a largely untapped genomic resource. Genome skimming allows for the molecular characterization of rare or extinct species.[5] The preservation processes in ethanol often damage the genomic DNA, which hinders the success of standard PCR protocols[3] and other amplicon-based approaches.[5] This presents an opportunity to sequence samples with very low DNA concentrations, without the need for DNA enrichment or amplification. Library preparation specific to genome skimming has been shown to work with as little as 37 ng of DNA (0.2 ng/μL), 135-fold less than recommended by Illumina.[1]

Although genome skimming is mostly used to extract high-copy plastomes and mitogenomes, it can also provide partial sequences of low-copy nuclear sequences. These sequences may not be sufficiently complete for phylogenomic analysis, but can be sufficient for designing PCR primers and probes for hybridization-based approaches.[1]

Genome skimming is not dependent on any specific primers and remains unaffected by gene rearrangements.[4]

Limitations

Genome skimming only scratches the surface of the genome, so it will not suffice for biological questions that require gene prediction and annotation.[6] These downstream steps are required for deeper and more meaningful analyses.

Although plastid genomic sequences are abundant in genome skims, the presence of mitochondrial and nuclear pseudogenes of plastid origin can potentially pose issues for plastome assemblies.[1]

A combination of sequencing depth and read type, as well as genomic target (plastome, mitogenome, etc.) will influence the success of single-end and paired-end assemblies, so these parameters must be carefully chosen.[1]

Scalability

Both the wet-lab and the bioinformatics parts of genome skimming have certain challenges with scalability. Although the cost of sequencing in genome skimming was affordable at $80 for 1 Gb in 2016, the library preparation for sequencing was still very expensive, at least ~$200 per sample (as of 2016). Additionally, most library preparation protocols have not yet been fully automated with robotics. On the bioinformatics side, large complex databases and automated workflows need to be designed to handle the large amounts of data resulting from genome skimming. The automation of the following processes needs to be implemented:[39]

  1. Assembly of the standard barcodes
  2. Assembly of organellar DNA (as well as nuclear ribosomal tandem repeats)
  3. Annotation of the different assembled fragments
  4. Removal of potential contaminant sequences
  5. Estimation of sequencing coverage for single-copy genes
  6. Extraction of reads corresponding to single-copy genes
  7. Identification of unknown specimen from a small shotgun sequencing or any DNA fragment
  8. Identification of the different organisms from shotgun sequencing of environmental DNA (metagenomics)

Some of these scalability challenges have already been addressed, as shown above in the "Tools and Pipelines" section.

See also

References

  1. ^ a b c d e f g h i j k l m n o Straub, Shannon C. K.; Parks, Matthew; Weitemier, Kevin; Fishbein, Mark; Cronn, Richard C.; Liston, Aaron (February 2012). "Navigating the tip of the genomic iceberg: Next-generation sequencing for plant systematics". American Journal of Botany. 99 (2): 349–364. doi:10.3732/ajb.1100335. PMID 22174336.
  2. ^ a b c Dodsworth, Steven (September 2015). "Genome skimming for next-generation biodiversity analysis". Trends in Plant Science. 20 (9): 525–527. doi:10.1016/j.tplants.2015.06.012. PMID 26205170.
  3. ^ a b c d e f g Dodsworth, Steven Andrew. Genome skimming for phylogenomics. OCLC 1108700470.
  4. ^ a b c d e f g h i j k l m n o p Trevisan, Bruna; Alcantara, Daniel M.C.; Machado, Denis Jacob; Marques, Fernando P.L.; Lahr, Daniel J.G. (2019-09-13). "Genome skimming is a low-cost and robust strategy to assemble complete mitochondrial genomes from ethanol preserved specimens in biodiversity studies". PeerJ. 7: e7543. doi:10.7717/peerj.7543. ISSN 2167-8359. PMC 6746217. PMID 31565556.
  5. ^ a b c d e f g h i j k l Malé, Pierre-Jean G.; Bardon, Léa; Besnard, Guillaume; Coissac, Eric; Delsuc, Frédéric; Engel, Julien; Lhuillier, Emeline; Scotti-Saintagne, Caroline; Tinaut, Alexandra; Chave, Jérôme (April 2014). "Genome skimming by shotgun sequencing helps resolve the phylogeny of a pantropical tree family". Molecular Ecology Resources: n/a. doi:10.1111/1755-0998.12246. PMID 24606032.
  6. ^ a b c d e f g h i Denver, Dee R.; Brown, Amanda M. V.; Howe, Dana K.; Peetz, Amy B.; Zasada, Inga A. (2016-08-04). Round, June L. (ed.). "Genome Skimming: A Rapid Approach to Gaining Diverse Biological Insights into Multicellular Pathogens". PLOS Pathogens. 12 (8): e1005713. doi:10.1371/journal.ppat.1005713. ISSN 1553-7374. PMC 4973915. PMID 27490201.
  7. ^ a b c d e f g h i j k Lin, Geng-Ming; Lai, Yu-Heng; Audira, Gilbert; Hsiao, Chung-Der (November 2017). "A Simple Method to Decode the Complete 18-5.8-28S rRNA Repeated Units of Green Algae by Genome Skimming". International Journal of Molecular Sciences. 18 (11): 2341. doi:10.3390/ijms18112341. PMC 5713310. PMID 29113146.
  8. ^ a b c d e f g Liu, Luxian; Wang, Yuewen; He, Peizi; Li, Pan; Lee, Joongku; Soltis, Douglas E.; Fu, Chengxin (2018-04-04). "Chloroplast genome analyses and genomic resource development for epilithic sister genera Oresitrophe and Mukdenia (Saxifragaceae), using genome skimming data". BMC Genomics. 19 (1): 235. doi:10.1186/s12864-018-4633-x. ISSN 1471-2164. PMC 5885378. PMID 29618324.
  9. ^ a b c d e f g h i j k l m Hinsinger, Damien Daniel; Strijk, Joeri Sergej (2019-01-10). "Plastome of Quercus xanthoclada and comparison of genomic diversity amongst selected Quercus species using genome skimming". PhytoKeys. 132: 75–89. doi:10.3897/phytokeys.132.36365. ISSN 1314-2003. PMC 6783484. PMID 31607787.
  10. ^ a b c d e f g h i Johri, Shaili; Solanki, Jitesh; Cantu, Vito Adrian; Fellows, Sam R.; Edwards, Robert A.; Moreno, Isabel; Vyas, Asit; Dinsdale, Elizabeth A. (December 2019). "'Genome skimming' with the MinION hand-held sequencer identifies CITES-listed shark species in India's exports market". Scientific Reports. 9 (1): 4476. Bibcode:2019NatSR...9.4476J. doi:10.1038/s41598-019-40940-9. ISSN 2045-2322. PMC 6418218. PMID 30872700.
  11. ^ a b c Berger, Brent A.; Han, Jiahong; Sessa, Emily B.; Gardner, Andrew G.; Shepherd, Kelly A.; Ricigliano, Vincent A.; Jabaily, Rachel S.; Howarth, Dianella G. (2017). "The unexpected depths of genome-skimming data: A case study examining Goodeniaceae floral symmetry genes1". Applications in Plant Sciences. 5 (10): 1700042. doi:10.3732/apps.1700042. ISSN 2168-0450. PMC 5664964. PMID 29109919.
  12. ^ Berger, Brent A.; Han, Jiahong; Sessa, Emily B.; Gardner, Andrew G.; Shepherd, Kelly A.; Ricigliano, Vincent A.; Jabaily, Rachel S.; Howarth, Dianella G. (October 2017). "The Unexpected Depths of Genome-Skimming Data: A Case Study Examining Goodeniaceae Floral Symmetry Genes". Applications in Plant Sciences. 5 (10): 1700042. doi:10.3732/apps.1700042. ISSN 2168-0450. PMC 5664964. PMID 29109919.
  13. ^ a b c d e f g h Zeng, Chun-Xia; Hollingsworth, Peter M.; Yang, Jing; He, Zheng-Shan; Zhang, Zhi-Rong; Li, De-Zhu; Yang, Jun-Bo (2018-06-05). "Genome skimming herbarium specimens for DNA barcoding and phylogenomics". Plant Methods. 14 (1): 43. doi:10.1186/s13007-018-0300-0. ISSN 1746-4811. PMC 5987614. PMID 29928291.
  14. ^ a b c d e f g h i j k Nevill, Paul G.; Zhong, Xiao; Tonti-Filippini, Julian; Byrne, Margaret; Hislop, Michael; Thiele, Kevin; van Leeuwen, Stephen; Boykin, Laura M.; Small, Ian (2020-01-04). "Large scale genome skimming from herbarium material for accurate plant identification and phylogenomics". Plant Methods. 16 (1): 1. doi:10.1186/s13007-019-0534-5. ISSN 1746-4811. PMC 6942304. PMID 31911810.
  15. ^ a b c d e f g h i j k Linard, B.; Arribas, P.; Andújar, C.; Crampton‐Platt, A.; Vogler, A. P. (2016). "Lessons from genome skimming of arthropod-preserving ethanol" (PDF). Molecular Ecology Resources. 16 (6): 1365–1377. doi:10.1111/1755-0998.12539. ISSN 1755-0998. PMID 27235167.
  16. ^ a b c d e f g h i Liu, Shih-Hui; Edwards, Christine E.; Hoch, Peter C.; Raven, Peter H.; Barber, Janet C. (May 2018). "Genome skimming provides new insight into the relationships in Ludwigia section Macrocarpon , a polyploid complex". American Journal of Botany. 105 (5): 875–887. doi:10.1002/ajb2.1086. PMID 29791715.
  17. ^ a b c d e f g h Nauheimer, Lars; Cui, Lujing; Clarke, Charles; Crayn, Darren M.; Bourke, Greg; Nargar, Katharina (2019). "Genome skimming provides well resolved plastid and nuclear phylogenies, showing patterns of deep reticulate evolution in the tropical carnivorous plant genus Nepenthes (Caryophyllales)". Australian Systematic Botany. 32 (3): 243–254. doi:10.1071/SB18057. ISSN 1030-1887.
  18. ^ a b c d e f g h i Ripma, Lee A.; Simpson, Michael G.; Hasenstab-Lehman, Kristen (December 2014). "Geneious! Simplified Genome Skimming Methods for Phylogenetic Systematic Studies: A Case Study in Oreocarya (Boraginaceae)". Applications in Plant Sciences. 2 (12): 1400062. doi:10.3732/apps.1400062. ISSN 2168-0450. PMC 4259456. PMID 25506521.
  19. ^ a b c d e f g h Lang, Dandan; Tang, Min; Hu, Jiahui; Zhou, Xin (November 2019). "Genome‐skimming provides accurate quantification for pollen mixtures". Molecular Ecology Resources. 19 (6): 1433–1446. doi:10.1111/1755-0998.13061. ISSN 1755-098X. PMC 6900181. PMID 31325909.
  20. ^ a b c d Stoughton, Thomas R.; Kriebel, Ricardo; Jolles, Diana D.; O'Quinn, Robin L. (March 2018). "Next-generation lineage discovery: A case study of tuberous Claytonia L." American Journal of Botany. 105 (3): 536–548. doi:10.1002/ajb2.1061. PMID 29672830.
  21. ^ a b c d e f Dodsworth, Steven; Guignard, Maïté S.; Christenhusz, Maarten J. M.; Cowan, Robyn S.; Knapp, Sandra; Maurin, Olivier; Struebig, Monika; Leitch, Andrew R.; Chase, Mark W.; Forest, Félix (2018-10-29). "Potential of Herbariomics for Studying Repetitive DNA in Angiosperms". Frontiers in Ecology and Evolution. 6: 174. doi:10.3389/fevo.2018.00174. ISSN 2296-701X.
  22. ^ a b c d Jackson, David; Emslie, Steven D; van Tuinen, Marcel (2012). "Genome skimming identifies polymorphism in tern populations and species". BMC Research Notes. 5 (1): 94. doi:10.1186/1756-0500-5-94. ISSN 1756-0500. PMC 3292991. PMID 22333071.
  23. ^ a b c Xia, Yun; Luo, Wei; Yuan, Siqi; Zheng, Yuchi; Zeng, Xiaomao (December 2018). "Microsatellite development from genome skimming and transcriptome sequencing: comparison of strategies and lessons from frog species". BMC Genomics. 19 (1): 886. doi:10.1186/s12864-018-5329-y. ISSN 1471-2164. PMC 6286531. PMID 30526480.
  24. ^ a b c d e f g Fonseca, Luiz Henrique M.; Lohmann, Lúcia G. (January 2020). "Exploring the potential of nuclear and mitochondrial sequencing data generated through genome‐skimming for plant phylogenetics: A case study from a clade of neotropical lianas". Journal of Systematics and Evolution. 58 (1): 18–32. doi:10.1111/jse.12533. ISSN 1674-4918.
  25. ^ a b c d Bock, Dan G.; Kane, Nolan C.; Ebert, Daniel P.; Rieseberg, Loren H. (February 2014). "Genome skimming reveals the origin of the Jerusalem Artichoke tuber crop species: neither from Jerusalem nor an artichoke". New Phytologist. 201 (3): 1021–1030. doi:10.1111/nph.12560. PMID 24245977.
  26. ^ a b c d e f Richter, Sandy; Schwarz, Francine; Hering, Lars; Böggemann, Markus; Bleidorn, Christoph (December 2015). "The Utility of Genome Skimming for Phylogenomic Analyses as Demonstrated for Glycerid Relationships (Annelida, Glyceridae)". Genome Biology and Evolution. 7 (12): 3443–3462. doi:10.1093/gbe/evv224. ISSN 1759-6653. PMC 4700955. PMID 26590213.
  27. ^ a b c d e f g Grandjean, Frederic; Tan, Mun Hua; Gan, Han Ming; Lee, Yin Peng; Kawai, Tadashi; Distefano, Robert J.; Blaha, Martin; Roles, Angela J.; Austin, Christopher M. (November 2017). "Rapid recovery of nuclear and mitochondrial genes by genome skimming from Northern Hemisphere freshwater crayfish". Zoologica Scripta. 46 (6): 718–728. doi:10.1111/zsc.12247.
  28. ^ a b "Geneious – OSTR". Retrieved 2020-02-28.
  29. ^ Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron (September 2014). "Hyb-Seq: Combining Target Enrichment and Genome Skimming for Plant Phylogenomics". Applications in Plant Sciences. 2 (9): 1400042. doi:10.3732/apps.1400042. ISSN 2168-0450. PMC 4162667. PMID 25225629.
  30. ^ Jin, Jian-Jun; Yu, Wen-Bin; Yang, Jun-Bo; Song, Yu; dePamphilis, Claude W.; Yi, Ting-Shuang; Li, De-Zhu (2018-03-09). "GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes". doi:10.1101/256479.
  31. ^ Langmead, Ben; Salzberg, Steven L (Mar 2012). "Fast gapped-read alignment with Bowtie 2". Nature Methods. 9 (4): 357–359. doi:10.1038/nmeth.1923. ISSN 1548-7091. PMC 3322381. PMID 22388286.
  32. ^ Bankevich, Anton; Nurk, Sergey; Antipov, Dmitry; Gurevich, Alexey A.; Dvorkin, Mikhail; Kulikov, Alexander S.; Lesin, Valery M.; Nikolenko, Sergey I.; Pham, Son; Prjibelski, Andrey D.; Pyshkin, Alexey V. (May 2012). "SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing". Journal of Computational Biology. 19 (5): 455–477. doi:10.1089/cmb.2012.0021. ISSN 1066-5277. PMC 3342519. PMID 22506599.
  33. ^ a b c d e Sarmashghi, Shahab; Bohmann, Kristine; P. Gilbert, M. Thomas; Bafna, Vineet; Mirarab, Siavash (December 2019). "Skmer: assembly-free and alignment-free sample identification using genome skims". Genome Biology. 20 (1): 34. doi:10.1186/s13059-019-1632-4. ISSN 1474-760X. PMC 6374904. PMID 30760303.
  34. ^ Marçais, Guillaume; Kingsford, Carl (2011-03-15). "A fast, lock-free approach for efficient parallel counting of occurrences of k-mers". Bioinformatics. 27 (6): 764–770. doi:10.1093/bioinformatics/btr011. ISSN 1460-2059. PMC 3051319. PMID 21217122.
  35. ^ Ondov, Brian D.; Treangen, Todd J.; Melsted, Páll; Mallonee, Adam B.; Bergman, Nicholas H.; Koren, Sergey; Phillippy, Adam M. (Dec 2016). "Mash: fast genome and metagenome distance estimation using MinHash". Genome Biology. 17 (1): 132. doi:10.1186/s13059-016-0997-x. ISSN 1474-760X. PMC 4915045. PMID 27323842.
  36. ^ Lin, Diana; Coombe, Lauren; Jackman, Shaun D.; Gagalova, Kristina K.; Warren, René L.; Hammond, S. Austin; Kirk, Heather; Pandoh, Pawan; Zhao, Yongjun; Moore, Richard A.; Mungall, Andrew J. (2019-06-06). Rokas, Antonis (ed.). "Complete Chloroplast Genome Sequence of a White Spruce ( Picea glauca , Genotype WS77111) from Eastern Canada". Microbiology Resource Announcements. 8 (23): e00381–19, /mra/8/23/MRA.00381–19.atom. doi:10.1128/MRA.00381-19. ISSN 2576-098X. PMC 6554609. PMID 31171622.
  37. ^ Lin, Diana; Coombe, Lauren; Jackman, Shaun D.; Gagalova, Kristina K.; Warren, René L.; Hammond, S. Austin; McDonald, Helen; Kirk, Heather; Pandoh, Pawan; Zhao, Yongjun; Moore, Richard A. (2019-06-13). Stajich, Jason E. (ed.). "Complete Chloroplast Genome Sequence of an Engelmann Spruce ( Picea engelmannii , Genotype Se404-851) from Western Canada". Microbiology Resource Announcements. 8 (24): e00382–19, /mra/8/24/MRA.00382–19.atom. doi:10.1128/MRA.00382-19. ISSN 2576-098X. PMC 6588038. PMID 31196920.
  38. ^ Johri, Shaili; Doane, Michael; Allen, Lauren; Dinsdale, Elizabeth (2019-03-29). "Taking Advantage of the Genomics Revolution for Monitoring and Conservation of Chondrichthyan Populations". Diversity. 11 (4): 49. doi:10.3390/d11040049. ISSN 1424-2818.
  39. ^ Coissac, Eric; Hollingsworth, Peter M.; Lavergne, Sébastien; Taberlet, Pierre (April 2016). "From barcodes to genomes: extending the concept of DNA barcoding". Molecular Ecology. 25 (7): 1423–1428. doi:10.1111/mec.13549. PMID 26821259.