MEGAN

MEGAN
Developer(s)	Daniel Huson et al.
Stable release	6.4 / 2016
Written in	Java
Operating system	Windows, Unix, Linux, Mac OS X
Type	Bioinformatics
License	Free open source "community edition", commercial "Ultimate edition" licensed by Computomics
Website	http://ab.inf.uni-tuebingen.de/software/megan6/welcome/

MEGAN ("MEtaGenome ANalyzer") is a computer program that allows optimized analysis of large metagenomic datasets.^[1]^[2]

Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample. A long-term goal of most metagenomic studies is to inventory and measure the extent and role of microbial biodiversity in an ecosystem, motivated by discoveries that the diversity of microbial organisms and viral agents in the environment is far greater than previously estimated.^[3] Tools such as MEGAN, which allow the investigation of very large data sets obtained from environmental samples, in particular by shotgun sequencing, are designed to sample and investigate the unknown biodiversity of such samples where more precise techniques suited to smaller, better-known samples cannot be used.

Fragments of DNA from a metagenomic sample, such as ocean water or soil, are compared against databases of known DNA sequences using BLAST or another sequence comparison tool to assemble the segments into discrete comparable sequences. MEGAN is then used to compare the resulting sequences with gene sequences from GenBank at NCBI.^[4] The program has been used to investigate the DNA of a mammoth recovered from the Siberian permafrost^[5] and the Sargasso Sea data set.^[6]

Introduction

Metagenomics is the study of the genomic content of samples from the same habitat, with the aim of determining the extent and role of species diversity. Targeted or random sequencing is widely used, with the reads compared against sequence databases.^[1] Recent developments in sequencing technology have increased the number of metagenomic samples. MEGAN is an easy-to-use tool for analysing such metagenomic data. The first version of MEGAN was released in 2007^[1] and the most recent version is MEGAN6.^[7] The first version could analyse the taxonomic content of a single dataset, while the latest version can analyse multiple datasets and adds new features (querying different databases, a new algorithm, etc.).

MEGAN Pipeline

A MEGAN analysis starts with collecting reads from any shotgun sequencing platform. Second, the reads are compared with sequence databases using BLAST or a similar tool. Third, MEGAN assigns a taxon ID to each processed read based on the NCBI taxonomy and creates a MEGAN file that contains the information required for statistical and graphical analysis. Lastly, the lowest common ancestor (LCA) algorithm can be run to inspect assignments, to analyse the data and to create summaries of the data at different levels of the NCBI taxonomy. The LCA algorithm finds the lowest common ancestor of a set of species in the taxonomy; a read whose hits span several species is assigned to that ancestor.^[1]^[2]
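The LCA step can be illustrated with a minimal Python sketch. It is not MEGAN's implementation: the toy taxonomy, the taxon names, and the helper functions `lineage`, `lowest_common_ancestor` and `assign_reads` are hypothetical, and the sketch ignores any score or support thresholds a real tool may apply. It only shows the basic idea of placing a read on the deepest taxon shared by all of its database hits.

```python
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); not taken from the NCBI taxonomy files.
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Escherichia coli": "Gammaproteobacteria",
    "Salmonella enterica": "Gammaproteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def lineage(taxon):
    """Return the path from the root of the taxonomy down to `taxon`."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest taxon shared by the lineages of all given taxa."""
    lineages = [lineage(t) for t in taxa]
    lca = None
    # Walk the lineages in parallel from the root until they diverge.
    for nodes in zip(*lineages):
        if len(set(nodes)) != 1:
            break
        lca = nodes[0]
    return lca


def assign_reads(hits_per_read):
    """Map each read to the LCA of the taxa it matched in the database search."""
    return {read: lowest_common_ancestor(hits) for read, hits in hits_per_read.items()}


if __name__ == "__main__":
    # Hypothetical search results: read name -> taxa of its significant hits.
    hits = {
        "read1": ["Escherichia coli"],
        "read2": ["Escherichia coli", "Salmonella enterica"],
        "read3": ["Escherichia coli", "Bacillus subtilis"],
    }
    assignments = assign_reads(hits)
    # Count how many reads were placed on each taxon.
    print(Counter(assignments.values()))
```

A read that matches a single species stays at that species, while a read that matches distantly related species moves up to a higher-level taxon, which is why conserved sequences tend to be assigned near the root.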

How to use MEGAN

The latest version of MEGAN can be downloaded from the project website.^[8] It is available for Windows, Mac and Unix platforms. The Community edition is open source and free to use; the Ultimate edition, which adds command-line support, is licensed by Computomics.

MEGAN can be used to explore the taxonomic diversity of a dataset collected from any type of metagenomic project or sequencing platform. In the pre-processing step, the set of DNA reads is compared with sequence databases, which can be computationally expensive and complex for a standard user. MEGAN simplifies the task: the sequence comparison is completed on a computer cluster, after which the data analysis can be carried out on a workstation. In addition, functional analysis using SEED, KEGG and COG/EGGNOG is possible. Principal coordinate analysis (PCoA) of taxonomic and functional profiles is also available in the latest version. Comparative visualization options provide additional ways to display and present data.^[7]
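Comparing several datasets, for example as input to a PCoA, starts from a pairwise distance between their profiles. The sketch below is illustrative only and assumes the Bray-Curtis dissimilarity, a common choice in ecology; the text does not state which distance MEGAN uses, and the function names and sample counts are hypothetical.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two profiles given as {taxon: read count}."""
    taxa = set(profile_a) | set(profile_b)
    # Sum of the counts that the two profiles have in common, taxon by taxon.
    shared = sum(min(profile_a.get(t, 0), profile_b.get(t, 0)) for t in taxa)
    total = sum(profile_a.values()) + sum(profile_b.values())
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - 2.0 * shared / total


def distance_matrix(profiles):
    """Return sample names (sorted) and the symmetric matrix of pairwise dissimilarities."""
    names = sorted(profiles)
    matrix = [[bray_curtis(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical read counts per taxon for three samples.
    samples = {
        "soil": {"Proteobacteria": 60, "Firmicutes": 40},
        "ocean": {"Proteobacteria": 40, "Firmicutes": 60},
        "gut": {"Bacteroidetes": 100},
    }
    names, matrix = distance_matrix(samples)
    for name, row in zip(names, matrix):
        print(name, [round(d, 2) for d in row])
```

The resulting matrix is what an ordination method such as PCoA would take as input; the ordination itself is left out of this sketch.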

References

[1] Huson, D. H.; A. Auch; Ji Qi; S. C. Schuster (2007). "MEGAN Analysis of Metagenomic Data". Genome Research. 17 (3). Woodbury, New York: Cold Spring Harbor Laboratory Press: 377–386. doi:10.1101/gr.5969107. PMC 1800929. PMID 17255551. Retrieved April 3, 2008.
[2] Huson, Daniel H. (2011). "Integrative analysis of environmental sequences using MEGAN4". Genome Research. 21 (9): 1552–1560. doi:10.1101/gr.120618.111. PMC 3166839. PMID 21690186.
[3] Nee, S. (2004). "More than meets the eye". Nature. 429 (6994): 804–805. doi:10.1038/429804a. PMID 15215837.
[4] Frias-Lopez, Jorge; Yanmei Shi; Gene W. Tyson; Maureen L. Coleman; Stephan C. Schuster; Sallie W. Chisholm; and Edward F. DeLong (March 11, 2008). "Microbial community gene expression in ocean surface waters" (PDF). PNAS. 105 (10). United States of America: National Academy of Sciences: 3805–3810. doi:10.1073/pnas.0708897105. PMC 2268829. PMID 18316740. Retrieved April 3, 2008.
[5] Poinar, Hendrik N. (2007). "Metagenomics to Paleogenomics: Large-Scale Sequencing of Mammoth DNA". Science. 331 (6016). United States of America: American Association for the Advancement of Science: 392–394. doi:10.1126/science.331.6016.392. PMID 21273464. Retrieved April 3, 2008.
[6] Smith, Hamilton O; Venter, J Craig (April 2004). "Environmental Genome Shotgun Sequencing of the Sargasso Sea". Science. 304 (5667): 66–74. doi:10.1126/science.1093857. PMID 15001713.
[7] "MEGAN6 — Algorithms in Bioinformatics". ab.inf.uni-tuebingen.de. Retrieved June 12, 2016.
[8] "MEGAN6-download". ab.inf.uni-tuebingen.de. Retrieved June 12, 2016.
