= MEGAN =

MEGAN
- Developer: Daniel Huson et al.
- Latest Release Version: 6.25.10
- Latest Release Date: 2024
- Repo: https://github.com/husonlab/megan-ce
- Programming Language: Java
- Operating System: Windows, Unix, Linux, macOS
- Platform: Java
- Genre: Bioinformatics
- License: Free open source "community edition", commercial "Ultimate edition" licensed by Computomics

MEGAN (MEtaGenome ANalyzer) is a computer program that allows optimized analysis of large metagenomic datasets.

Metagenomics is the analysis of the genomic sequences from a usually uncultured environmental sample. One of its long-term goals is to inventory and measure the extent and role of microbial biodiversity in the ecosystem, based on discoveries that the diversity of microbial organisms and viral agents in the environment is far greater than previously estimated. MEGAN is an example of a tool that allows the investigation of very large datasets from environmental samples (using shotgun sequencing techniques in particular). It is designed to survey the unknown biodiversity of environmental samples for which more precise techniques, suited to smaller and better-characterized samples, cannot be used.

Fragments of DNA from a metagenomic sample, such as ocean water or soil, are compared against databases of known sequences using BLAST or another sequence comparison tool. MEGAN then uses the resulting matches to assign each read to a taxon in the NCBI taxonomy. The program has been used to analyze DNA from a woolly mammoth recovered from the Siberian permafrost, as well as the Sargasso Sea dataset.
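As an illustration of what the comparison step hands to a downstream analyzer, the following minimal Python sketch groups matches by read from BLAST tabular output (the twelve default columns of `-outfmt 6`). The function name `group_hits_by_read`, the `min_bitscore` cutoff, and the example rows are assumptions made for this sketch; they are not taken from MEGAN, which reads its input through its own importers.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def group_hits_by_read(handle, min_bitscore=50.0):
    """Return {read_id: [(subject_id, bitscore), ...]}, best hit first.

    min_bitscore is an illustrative cutoff, not a MEGAN default.
    Reads with no hit at or above the cutoff are omitted.
    """
    hits = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(BLAST_COLUMNS, row))
        score = float(record["bitscore"])
        if score >= min_bitscore:
            hits[record["qseqid"]].append((record["sseqid"], score))
    # Sort each read's hits by descending bit score.
    for read_id in hits:
        hits[read_id].sort(key=lambda hit: hit[1], reverse=True)
    return dict(hits)


# Made-up rows for illustration only (not real BLAST results).
EXAMPLE = (
    "read1\tsubjB\t90.0\t100\t10\t0\t1\t100\t5\t104\t1e-30\t120.0\n"
    "read1\tsubjA\t98.0\t100\t2\t0\t1\t100\t11\t110\t1e-50\t180.0\n"
    "read2\tsubjC\t75.0\t60\t15\t0\t1\t60\t1\t60\t1e-3\t40.0\n"
)

hits = group_hits_by_read(io.StringIO(EXAMPLE))
print(hits)
```

In this sketch `read2` is dropped because its only hit falls below the cutoff; a real analysis would typically report such reads as unassigned rather than discard them silently.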

== Introduction ==

Metagenomics is the study of genomic content of samples from the same habitat, aimed at determining the role and extent of species diversity. Both targeted and random sequencing are commonly used, with comparisons made against sequence databases. Recent developments in sequencing technology have led to an increase in the number of metagenomics samples. MEGAN is a tool for analyzing metagenomics data. The first version of MEGAN was released in 2007, and the most recent version is MEGAN6. While the initial version could analyze the taxonomic content of a single dataset, later versions can handle multiple datasets and include new features such as querying different databases and employing updated algorithms.

== MEGAN Pipeline ==

MEGAN analysis starts with collecting reads from any shotgun sequencing platform. The reads are then compared with sequence databases using BLAST or a similar tool. Next, MEGAN assigns a taxon ID to each read based on its matches and the NCBI taxonomy, creating a MEGAN file that contains the information needed for statistical and graphical analysis. Lastly, the lowest common ancestor (LCA) algorithm can be run to inspect assignments, analyze data, and create summaries at different levels of the NCBI taxonomy. The LCA algorithm places each read on the lowest taxonomic node that is an ancestor of all taxa the read matched: a read that matches a single species is assigned to that species, while a read that matches several species is assigned to a higher-level taxon that contains them all.
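The LCA idea can be sketched in a few lines of Python. This is a simplified illustration, not MEGAN's implementation: the toy taxonomy in `PARENT` omits most ranks, the read-to-taxon hits in `READ_HITS` are invented, and the function names are hypothetical. Filtering of weak hits before the LCA step is left out.

```python
# Toy taxonomy as child -> parent; the root has no parent.
# Heavily simplified: most intermediate ranks are omitted.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Escherichia coli": "Gammaproteobacteria",
    "Salmonella enterica": "Gammaproteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus subtilis": "Firmicutes",
}


def lineage(taxon):
    """Return the path from the root down to the given taxon."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path[::-1]


def lowest_common_ancestor(taxa):
    """Return the deepest node shared by the lineages of all taxa.

    Returns None if taxa is empty (the read had no hits).
    """
    paths = [lineage(t) for t in taxa]
    lca = None
    # Walk the lineages in parallel from the root until they diverge.
    for level in zip(*paths):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


def assign_reads(read_hits):
    """Map each read to the LCA of the taxa it matched."""
    return {read: lowest_common_ancestor(taxa) for read, taxa in read_hits.items()}


# Invented hits for illustration only.
READ_HITS = {
    "read1": ["Escherichia coli"],
    "read2": ["Escherichia coli", "Salmonella enterica"],
    "read3": ["Escherichia coli", "Bacillus subtilis"],
}

assignments = assign_reads(READ_HITS)
for read, taxon in assignments.items():
    print(read, "->", taxon)
```

Here `read1` stays at species level, `read2` moves up to the class shared by both species, and `read3` moves up to Bacteria, which shows how ambiguous reads end up at higher ranks.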

== Release history ==

| Version | Release date | Major highlights |
| 1.0 | 25 January 2007 | Initial standalone interactive tool for single metagenome analysis |
| 2.0 | | Multiple dataset support, read extraction by taxon, COGs analysis, accession numbers for read identification, basic charting capabilities |
| 3.0 | | New RMA file format |
| 4.0 | 20 June 2011 | Introduced integrative, comparative, functional analyses |
| 5.0 | April 2013 | COG/EGGNOG analysis, new SEED and KEGG mapping files, PCoA analysis of taxonomy and function, improved LCA algorithm, new charts, biome extraction methods |
| 6.0 | 21 June 2016 | New RMA6 file format; supports NCBI/SILVA taxonomy, InterPro2GO, SEED, eggNOG, and KEGG; gene-centric read assembly; enhanced GUI and metadata tools |
| 7.0 | 2 February 2024 | Updated taxonomic/functional mapping, KEGG, UniRef, enhanced security and scripting |
