List of biological databases

Biological databases are stores of biological information.[1] The journal Nucleic Acids Research regularly publishes special issues on biological databases and maintains a list of such databases. For instance, the 2016 issue lists about 180 such databases and updates to previously described databases.[2]

Meta Databases

Meta databases are databases of databases: they collect data about data in order to generate new data. They can merge information from different sources and make it available in a new, more convenient form, or with an emphasis on a particular disease or organism.
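
The merging idea can be illustrated with a minimal sketch. It assumes each source database exposes records that share a common identifier; the source names, field names, and records below are invented for illustration and do not correspond to any real database.

```python
from collections import defaultdict


def merge_sources(sources):
    """Merge records from several sources into one view keyed by gene id.

    `sources` maps a source name to a list of record dicts; every record
    must carry a "gene_id" key. Values are stored per source so that the
    provenance of each field is preserved.
    """
    merged = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            gene_id = record["gene_id"]
            for field, value in record.items():
                if field == "gene_id":
                    continue
                # Keep every source's value, tagged with where it came from.
                merged[gene_id].setdefault(field, {})[source_name] = value
    return dict(merged)


# Hypothetical sources and records, invented for this example.
sources = {
    "sequence_db": [{"gene_id": "G1", "length": 1200}],
    "disease_db": [
        {"gene_id": "G1", "disease": "example disorder"},
        {"gene_id": "G2", "disease": "another disorder"},
    ],
}
meta = merge_sources(sources)
```

Keeping the source name next to each value is one possible design choice: it lets a user of the merged view see which underlying database a piece of information came from.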

Model Organism Databases

Model organism databases provide in-depth biological data for intensively studied model organisms.

Nucleic Acid Databases

DNA Databases

Primary Databases

The International Nucleotide Sequence Database (INSD) consists of the following databases.

The three databases, DDBJ (Japan), GenBank (USA) and the European Nucleotide Archive (Europe), are repositories for nucleotide sequence data from all organisms. All three accept nucleotide sequence submissions and exchange new and updated data daily to keep their contents synchronised. They are primary databases, as they house original sequence data. They collaborate with the Sequence Read Archive (SRA), which archives raw reads from high-throughput sequencing instruments.
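
The daily exchange can be pictured as every archive adopting the newest version of every record. The sketch below is a toy model of that idea only, not the actual exchange protocol used by these databases; the accession identifiers, version numbers, and sequences are hypothetical.

```python
def synchronise(archives):
    """Bring every archive up to the newest version of every record.

    `archives` maps an archive name to a dict of
    accession -> (version, sequence). The dicts are updated in place.
    """
    newest = {}
    for records in archives.values():
        for accession, (version, sequence) in records.items():
            # Remember the highest version seen for each accession.
            if accession not in newest or version > newest[accession][0]:
                newest[accession] = (version, sequence)
    for records in archives.values():
        records.update(newest)
    return archives


# Hypothetical accessions and sequences, invented for this example.
archives = {
    "DDBJ": {"EX0001": (1, "ACGT")},
    "GenBank": {"EX0001": (2, "ACGTT"), "EX0002": (1, "GGCC")},
    "ENA": {},
}
synchronise(archives)
```

After the call, all three dictionaries hold the same records, which mirrors the goal of the exchange: a submission made to any one archive becomes available from the others.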

Secondary Databases

  • RefSeq
  • SNP / Disease Databases
  • OMIM (Online Mendelian Inheritance in Man): inherited diseases
  • HapMap
  • 23andMe's database

Gene Expression Databases (mostly microarray data)

Genome Databases

These databases collect genome sequences, annotate and analyze them, and provide public access. Some add curation of experimental literature to improve computed annotations. A database may hold the genomes of many species or the genome of a single model organism.

Phenotype Databases

  • PHI-base: pathogen-host interaction database. It links gene information to phenotypic information from microbial pathogens on their hosts. Information is manually curated from peer-reviewed literature.
  • RGD (Rat Genome Database): genomic and phenotype data for Rattus norvegicus

RNA Databases

Amino Acid / Protein Databases

Protein Sequence Databases

Protein Structure Databases

  • Protein Data Bank (PDB), comprising:
    • Protein Data Bank in Europe (PDBe)
    • Protein Data Bank in Japan (PDBj)
    • Research Collaboratory for Structural Bioinformatics (RCSB)

For more protein structure databases, see also Protein structure database.

Protein Model Databases

  • SWISS-MODEL: server and repository for protein structure models
  • ModBase: database of comparative protein structure models (Sali Lab, UCSF)
  • Protein Model Portal (PMP): meta database that combines several databases of protein structure models (Biozentrum, Basel, Switzerland)
  • Similarity Matrix of Proteins (SIMAP): database of protein similarities computed using FASTA

Protein-Protein and Other Molecular Interactions

Additional Databases

Signal Transduction Pathway Databases

Metabolic Pathway and Protein Function Databases

Exosomal Databases

Mathematical Model Databases

Taxonomic Databases

  • EzTaxon-e: database for the identification of prokaryotes based on 16S ribosomal RNA gene sequences
  • BacDive: bacterial metadatabase that provides strain-linked information about bacterial and archaeal biodiversity, including taxonomy information

Radiologic Databases

Wiki-Style Databases

Specialized Databases


  1. Wren JD, Bateman A (2008). "Databases, data tombs and dust in the wind". Bioinformatics. 24 (19): 2127–8. doi:10.1093/bioinformatics/btn464. PMID 18819940.
  2. "Nucleic Acids Research Database issue 2016". Nucleic Acids Research. Oxford University Press. Retrieved 26 Oct 2016.
  3. Dash, Sudhansu; Campbell, Jacqueline D.; Cannon, Ethalinda K. S.; Cleary, Alan M.; Huang, Wei; Kalberer, Scott R.; Karingula, Vijay; Rice, Alex G.; Singh, Jugpreet (2016-01-04). "Legume information system ( a key component of a set of federated data resources for the legume family". Nucleic Acids Research. 44 (D1): D1181–D1188. doi:10.1093/nar/gkv1159. ISSN 0305-1048. PMC 4702835. PMID 26546515.
  4. "Sharing epigenomes globally". Nature Methods. 15 (3): 151. 2018. doi:10.1038/nmeth.4630. ISSN 1548-7105.

External links