Description: For comparative metagenomic studies
Research center: University of Alberta
Laboratory: David S. Wishart
Primary citation: [1]
Release date: 2012
Data format: Data input: taxonomic profile and sample-specific metadata. Data output: statistical analysis results and plots/graphs.
Data release: Last updated in 2014
Curation policy: Manually curated

METAGENassist is a freely available web server for comparative metagenomic analysis.[1] Comparative metagenomic studies involve the large-scale comparison of genomic or taxonomic census data from bacterial samples across different environments. Historically this has required a sound knowledge of statistics, computer programming, genetics and microbiology. As a result, only a small number of researchers are routinely able to perform comparative metagenomic studies. To circumvent these limitations, METAGENassist was developed to allow metagenomic analyses to be performed by non-specialists, easily and intuitively over the web. METAGENassist is particularly notable for its rich graphical output and its extensive database of bacterial phenotypic information.


METAGENassist is designed to support a wide range of statistical comparisons across metagenomic samples. It accepts bacterial census data or taxonomic profile data derived from 16S rRNA data, classical DNA sequencing, next-generation shotgun sequencing or even classical microbial culturing techniques. These taxonomic profile data can be supplied in standard comma-separated value (CSV) format or in program-specific formats generated by tools such as mothur[2] and QIIME.[3]

Once the data are uploaded to the website, METAGENassist offers users a large selection of data pre-processing and data quality checking tools, including: 1) taxonomic name normalization; 2) taxonomic-to-phenotypic mapping; 3) data integrity/quality checks; and 4) data normalization. METAGENassist also supports an extensive collection of classical univariate and multivariate analyses, such as fold-change analysis, t-tests, one-way ANOVA, partial least-squares discriminant analysis (PLS-DA) and principal component analysis (PCA). Each of these analyses generates graphs and tables in PNG or PDF format, and all of the processed data and images are available for download. These analysis and visualization tools can be used to identify key features that distinguish or characterize microbial populations in different environments or under different conditions.

METAGENassist distinguishes itself from most other metagenomics data analysis tools through its extensive use of automated taxonomic-to-phenotypic mapping and its ability to support sophisticated data analyses with the resulting phenotypic data. METAGENassist's phenotype database covers more than 11,000 microbial species annotated with 20 different phenotypic categories, including oxygen requirements, energy source(s), metabolism, and GC content. This gives users substantially more features with which to compare and analyze different samples. The phenotype database is regularly updated with information retrieved from several resources including BacMap,[4] GOLD,[5] and other NCBI taxonomy resources.[6]
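
The pre-processing and comparison steps described above can be illustrated with a minimal Python sketch. It is not METAGENassist's own code: the CSV layout, the read counts, the three-entry oxygen-requirement lookup and all function names are assumptions made up for illustration. The sketch normalizes each sample to relative abundances, maps taxa onto a phenotype category, and computes a log2 fold change between two groups of samples.

```python
import csv
import io
import math
from collections import defaultdict

# Hypothetical taxonomic profile: one row per sample, one column per taxon,
# values are raw counts (illustrative numbers, not real data).
PROFILE_CSV = """sample,group,Escherichia coli,Clostridium difficile,Pseudomonas aeruginosa
s1,gut,60,30,10
s2,gut,50,40,10
s3,soil,10,10,80
s4,soil,20,20,60
"""

# Hypothetical lookup for a single phenotypic category (oxygen requirement).
OXYGEN = {
    "Escherichia coli": "facultative anaerobe",
    "Clostridium difficile": "anaerobe",
    "Pseudomonas aeruginosa": "aerobe",
}


def load_profile(text):
    """Parse the CSV into {sample: (group, {taxon: count})}."""
    profile = {}
    for row in csv.DictReader(io.StringIO(text)):
        sample = row.pop("sample")
        group = row.pop("group")
        profile[sample] = (group, {taxon: float(v) for taxon, v in row.items()})
    return profile


def to_relative(counts):
    """Data normalization: scale one sample's counts so they sum to 1."""
    total = sum(counts.values())
    return {taxon: value / total for taxon, value in counts.items()}


def to_phenotypes(abundances, lookup):
    """Taxonomic-to-phenotypic mapping: sum abundances per phenotype label."""
    totals = defaultdict(float)
    for taxon, value in abundances.items():
        totals[lookup.get(taxon, "unknown")] += value
    return dict(totals)


def group_means(profile, lookup):
    """Mean phenotype abundance within each sample group."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = defaultdict(int)
    for group, counts in profile.values():
        sizes[group] += 1
        phenotypes = to_phenotypes(to_relative(counts), lookup)
        for phenotype, value in phenotypes.items():
            sums[group][phenotype] += value
    return {
        group: {p: s / sizes[group] for p, s in per_phenotype.items()}
        for group, per_phenotype in sums.items()
    }


def log2_fold_change(means, group_a, group_b, phenotype):
    """Fold-change analysis: log2 ratio of group_a's mean to group_b's mean."""
    return math.log2(means[group_a][phenotype] / means[group_b][phenotype])


profile = load_profile(PROFILE_CSV)
means = group_means(profile, OXYGEN)
print(log2_fold_change(means, "soil", "gut", "aerobe"))
```

Taxa that are missing from the lookup are pooled under an "unknown" label rather than dropped, so each sample's phenotype abundances still sum to 1. A fuller analysis would add significance testing (for example a t-test per phenotype), which is omitted here for brevity.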

References


  1. Arndt, D.; Xia, J.; Liu, Y.; Zhou, Y.; Guo, A.C.; Cruz, J.A.; Sinelnikov, I.; Budwill, K.; Nesbø, C.L.; Wishart, D.S. (July 2012). "METAGENassist: a comprehensive web server for comparative metagenomics". Nucleic Acids Res. 40 (Web Server issue): W88–W95. doi:10.1093/nar/gks497. PMC 3394294. PMID 22645318.
  2. Schloss, P.D.; Westcott, S.L.; Ryabin, T.; Hall, J.R.; Hartmann, M.; Hollister, E.B.; Lesniewski, R.A.; Oakley, B.B.; Parks, D.H.; Robinson, C.J.; et al. (2009). "Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities". Appl. Environ. Microbiol. 75 (23): 7537–7541. Bibcode:2009ApEnM..75.7537S. doi:10.1128/AEM.01541-09. PMC 2786419. PMID 19801464.
  3. Caporaso, J.G.; Kuczynski, J.; Stombaugh, J.; Bittinger, K.; Bushman, F.D.; Costello, E.K.; Fierer, N.; Pena, A.G.; Goodrich, J.K.; Gordon, J.I.; et al. (2010). "QIIME allows analysis of high-throughput community sequencing data". Nat. Methods. 7 (5): 335–336. doi:10.1038/nmeth.f.303. PMC 3156573. PMID 20383131.
  4. Cruz, J.; Liu, Y.; Liang, Y.; Zhou, Y.; Wilson, M.; Dennis, J.J.; Stothard, P.; Van Domselaar, G.; Wishart, D.S. (2012). "BacMap: an up-to-date electronic atlas of annotated bacterial genomes". Nucleic Acids Res. 40 (Database issue): D599–D604. doi:10.1093/nar/gkr1105. PMC 3245156. PMID 22135301.
  5. Pagani, I.; Liolios, K.; Jansson, J.; Chen, I.M.; Smirnova, T.; Nosrat, B.; Markowitz, V.M.; Kyrpides, N.C. (2012). "The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata". Nucleic Acids Res. 40 (Database issue): D571–D579. doi:10.1093/nar/gkr1100. PMC 3245063. PMID 22135293.
  6. Sayers, E.W.; Barrett, T.; Benson, D.A.; Bolton, E.; Bryant, S.H.; Canese, K.; Chetvernin, V.; Church, D.M.; Dicuccio, M.; Federhen, S.; et al. (2012). "Database resources of the National Center for Biotechnology Information". Nucleic Acids Res. 40 (Database issue): D13–D25. doi:10.1093/nar/gkr1184. PMC 3245031. PMID 22140104.