|Description||genome-wide collections of gene phylogenies.|
|Laboratory||Comparative Genomics Group, Centre for Genomic Regulation (CRG), Barcelona, Spain.|
|Authors||Jaime Huerta-Cepas, Salvador Capella-Gutierrez, Leszek Pryszcz, Marina Marcet-Houben, Ernst Thür, Laia Carreté, Miguel Ángel Naranjo-Ortiz and Toni Gabaldón|
|Primary citation||Huerta-Cepas et al. (2014)|
PhylomeDB is a public biological database for complete catalogs of gene phylogenies (phylomes). It allows users to interactively explore the evolutionary history of genes through the visualization of phylogenetic trees and multiple sequence alignments. In addition, PhylomeDB provides genome-wide orthology and paralogy predictions based on the analysis of these phylogenetic trees. The automated pipeline used to reconstruct the trees aims to provide a high-quality phylogenetic analysis of different genomes, including Maximum Likelihood tree inference, alignment trimming and evolutionary model testing.
PhylomeDB also includes a public download section with the complete set of trees, alignments and orthology predictions, as well as a web API that facilitates cross-linking trees from external sources. Finally, PhylomeDB provides an advanced tree visualization interface based on the ETE toolkit, which integrates tree topologies, taxonomic information, domain mapping and alignment visualization in a single interactive tree image.
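As a rough illustration of how orthology and paralogy can be read off a gene tree, the sketch below applies a species-overlap rule: an internal node whose two child subtrees share at least one species is treated as a duplication, otherwise as a speciation. The text above does not say which algorithm the pipeline uses, so this rule is an assumption of the sketch, as are the nested-tuple tree encoding, the `Species_gene` leaf naming and all function names.

```python
from itertools import product

# Assumed encoding: a tree is either a leaf string "Species_gene"
# or a 2-tuple (left_subtree, right_subtree). Leaf names are hypothetical.


def leaves(node):
    """Return all leaf names under a node."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


def species(leaf):
    """Extract the species code from a 'Species_gene' leaf name."""
    return leaf.split("_", 1)[0]


def classify_pairs(node, orthologs=None, paralogs=None):
    """Split all leaf pairs into orthologs and paralogs by species overlap."""
    if orthologs is None:
        orthologs, paralogs = set(), set()
    if isinstance(node, str):
        return orthologs, paralogs
    left, right = node
    left_leaves, right_leaves = leaves(left), leaves(right)
    shared = {species(x) for x in left_leaves} & {species(x) for x in right_leaves}
    # Shared species on both sides -> duplication node -> paralogous pairs.
    # No shared species -> speciation node -> orthologous pairs.
    target = paralogs if shared else orthologs
    for a, b in product(left_leaves, right_leaves):
        target.add(frozenset((a, b)))
    classify_pairs(left, orthologs, paralogs)
    classify_pairs(right, orthologs, paralogs)
    return orthologs, paralogs


# Toy tree: a duplication in the human/mouse ancestor, with yeast as outgroup.
tree = ((("Hsa_A1", "Mmu_A1"), ("Hsa_A2", "Mmu_A2")), "Sce_A")
orthologs, paralogs = classify_pairs(tree)
```

In this toy tree the pair `Hsa_A1`/`Mmu_A1` is separated by a speciation node and is therefore classified as orthologous, while `Hsa_A1`/`Hsa_A2` is separated by the duplication node and is classified as paralogous.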
New features in PhylomeDB
The tree search engine of PhylomeDB was updated to provide a gene-centric view of all PhylomeDB resources. After a protein or gene search, all the available trees in PhylomeDB are listed and organized by phylome and tree type. Users can switch among all available seed and collateral trees without losing focus on the searched protein or gene.
In PhylomeDB v4, all the information available for each tree is now shown in an integrated layout in which tree topology, taxonomy data, alignments, domain annotations and event-age (phylostratigraphy) information are rendered in the same figure using the newest visualization features provided by the ETE toolkit v2.2:
- Pfam domains have been mapped to each alignment in the database and are now displayed in a compact panel at the right side of the tree. For each sequence, domains and their names are shown; clicking a domain gives a short description and an external link to Pfam. Protein regions not mapped to domains are shown using the standard amino acid color codes, while gap regions are represented by a flat line.
- Tree images have also been simplified to improve readability. Mappings and cross-links to general and organism-oriented databases have been extended to include the major Arabidopsis thaliana sequence database TAIR, the Drosophila database FlyBase, and the Ascomycete-based genome database Génolevures.
- Speciation and duplication events are indicated using different node colors. Branch support values are now automatically highlighted for poorly supported partitions using a transparent red bubble whose size is inversely related to the branch bootstrap or aLRT value.
- Internal tree searches can be performed on any of the annotated node attributes, while links to other databases are provided through the contextual menu of the tree browser that appears when clicking any node.
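The support highlighting described in the list above can be pictured with a small sketch. It assumes support values scaled to the range 0 to 1, a cutoff above which no bubble is drawn, and a linear mapping from support to bubble radius; none of these details are given in the text, and `bubble_radius`, `max_radius` and `threshold` are hypothetical names.

```python
def bubble_radius(support, max_radius=20.0, threshold=0.9):
    """Return a bubble radius that grows as branch support drops.

    Assumptions of this sketch (not taken from PhylomeDB itself):
    - support is a bootstrap or aLRT value scaled to [0, 1];
    - branches at or above `threshold` get no bubble;
    - the radius decreases linearly with support.
    """
    if not 0.0 <= support <= 1.0:
        raise ValueError("support must be between 0 and 1")
    if support >= threshold:
        return 0.0
    return max_radius * (1.0 - support)


# Radii for a well-supported, a moderately supported and an unsupported branch.
radii = [bubble_radius(s) for s in (0.95, 0.5, 0.0)]
```

With these defaults a branch with support 0.5 gets a bubble of half the maximum radius, while a branch at or above the cutoff gets none.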
Users can also download data at several levels: the whole database, a specific phylome or, from the tree entry page, the data corresponding to that tree. This release adds the option to download the orthology predictions from a tree in the recently developed OrthoXML standard format, in addition to a tabulated format.
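To give a feel for the two export formats, the following sketch writes a list of ortholog pairs both as tab-separated text and as a simplified OrthoXML-style document. The element names follow the general OrthoXML layout (`species`, `database`, `genes`, `gene`, `groups`, `orthologGroup`, `geneRef`), but the output is not validated against the official schema, and the protein identifiers, the `origin` and `database` values and the helper names are hypothetical. The actual files produced by PhylomeDB may differ.

```python
import xml.etree.ElementTree as ET


def to_tabular(pairs):
    """One ortholog pair per line, tab-separated."""
    return "\n".join(f"{a}\t{b}" for a, b in pairs)


def to_orthoxml(pairs, species_of):
    """Build a simplified OrthoXML-style string.

    species_of maps protein id -> (species name, NCBI taxonomy id).
    """
    root = ET.Element("orthoXML", {
        "xmlns": "http://orthoXML.org/2011/",
        "version": "0.3",
        "origin": "example",        # placeholder, not a real source name
        "originVersion": "0",
    })
    gene_ids = {}       # protein id -> numeric gene id used by geneRef
    genes_parent = {}   # species name -> its <genes> element
    for protein in sorted({p for pair in pairs for p in pair}):
        name, taxid = species_of[protein]
        if name not in genes_parent:
            sp = ET.SubElement(root, "species", {"name": name, "NCBITaxId": str(taxid)})
            db = ET.SubElement(sp, "database", {"name": "example", "version": "0"})
            genes_parent[name] = ET.SubElement(db, "genes")
        gene_ids[protein] = str(len(gene_ids) + 1)
        ET.SubElement(genes_parent[name], "gene",
                      {"id": gene_ids[protein], "protId": protein})
    groups = ET.SubElement(root, "groups")
    for a, b in pairs:
        # Each predicted pair becomes its own ortholog group in this sketch.
        group = ET.SubElement(groups, "orthologGroup")
        ET.SubElement(group, "geneRef", {"id": gene_ids[a]})
        ET.SubElement(group, "geneRef", {"id": gene_ids[b]})
    return ET.tostring(root, encoding="unicode")


# Hypothetical protein identifiers; the taxonomy ids are human and mouse.
species_of = {
    "Hsa_A1": ("Homo sapiens", 9606), "Hsa_A2": ("Homo sapiens", 9606),
    "Mmu_A1": ("Mus musculus", 10090), "Mmu_A2": ("Mus musculus", 10090),
}
pairs = [("Hsa_A1", "Mmu_A1"), ("Hsa_A2", "Mmu_A2")]
tabular_text = to_tabular(pairs)
xml_text = to_orthoxml(pairs, species_of)
```

The tabulated form is convenient for quick filtering with standard text tools, whereas the XML form keeps species and gene identifiers in a structure that other orthology resources can parse.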
Quest for Orthologs
The Quest for Orthologs (QfO) consortium involves more than 30 phylogenomic databases. The main goal of the consortium is to improve and standardize orthology predictions through collaboration and to discuss emerging methods.
References
- Huerta-Cepas, J; Capella-Gutierrez, S; Pryszcz, LP; Marcet-Houben, M; Gabaldón, T (Jan 2014). "PhylomeDB v4: zooming into the plurality of evolutionary histories of a genome". Nucleic Acids Res. England. 42 (Database issue): D897–902. doi:10.1093/nar/gkt1177. PMID 24275491.
- Huerta-Cepas, J; Bueno, A; Dopazo, J; Gabaldón, T (Jan 2008). "PhylomeDB: a database for genome-wide collections of gene phylogenies". Nucleic Acids Res. England. 36 (Database issue): D491–6. doi:10.1093/nar/gkm899. PMID 17962297.
- Huerta-Cepas, J; Capella-Gutierrez, S; Pryszcz, LP; Denisov, I; Kormes, D; Marcet-Houben, M; Gabaldón, T (Jan 2011). "PhylomeDB v3.0: an expanding repository of genome-wide collections of trees, alignments and phylogeny-based orthology and paralogy predictions". Nucleic Acids Res. England. 39 (Database issue): D556–60. doi:10.1093/nar/gkq1109. PMID 21075798.
- Capella-Gutierrez, S; Silla-Martínez, JM; Gabaldón, T (Aug 2009). "trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses". Bioinformatics. 25: 1972–3. doi:10.1093/bioinformatics/btp348. PMID 19505945.
- Huerta-Cepas, J; Dopazo, J; Gabaldón, T (Jan 2010). "ETE: a python Environment for Tree Exploration". BMC Bioinformatics. 11: 24. doi:10.1186/1471-2105-11-24. PMID 20070885.