The Bioinformatic Harvester is a bioinformatic meta search engine for gene and protein-associated information, created by the European Molecular Biology Laboratory (EMBL) and subsequently hosted and further developed by the Karlsruhe Institute of Technology (KIT). Harvester currently covers human, mouse, rat, zebrafish, Drosophila and Arabidopsis thaliana. It cross-links more than 50 popular bioinformatic resources and allows searches across them. Harvester serves tens of thousands of pages every day to scientists and physicians.
|Developer(s)||Urban Liebel, Björn Kindler|
|Stable release||4 / May 24, 2011|
|Operating system||Web based|
How Harvester works
Harvester collects information from protein and gene databases along with information from so-called "prediction servers". A prediction server provides, for example, online sequence analysis for a single protein. Harvester's search index is based on the IPI and UniProt protein information collections. The collection consists of:
- ~72,000 human, ~57,000 mouse, ~41,000 rat, ~51,000 zebrafish and ~35,000 Arabidopsis protein pages, which cross-link ~50 major bioinformatic resources.
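The collection step described above can be illustrated with a minimal sketch. It assumes that records from several sources are grouped into one page per protein accession and that a word index is then built over those pages. The record layout, the source labels, the accessions and the function names `build_pages` and `build_index` are all hypothetical; nothing here reflects Harvester's actual implementation.

```python
from collections import defaultdict

# Hypothetical records as they might arrive from separate sources.
# Accessions, source labels and values are placeholders, not real data.
RECORDS = [
    {"accession": "ACC0001", "source": "protein_db",
     "field": "description", "value": "example golgi protein"},
    {"accession": "ACC0001", "source": "localisation_predictor",
     "field": "localisation", "value": "golgi apparatus"},
    {"accession": "ACC0002", "source": "protein_db",
     "field": "description", "value": "example kinase"},
]


def build_pages(records):
    """Group per-source records into one page per protein accession."""
    pages = defaultdict(dict)
    for rec in records:
        source_fields = pages[rec["accession"]].setdefault(rec["source"], {})
        source_fields[rec["field"]] = rec["value"]
    return dict(pages)


def build_index(pages):
    """Map each lowercase word on a page to the accessions containing it."""
    index = defaultdict(set)
    for accession, sources in pages.items():
        for fields in sources.values():
            for value in fields.values():
                for word in value.lower().split():
                    index[word].add(accession)
    return dict(index)


pages = build_pages(RECORDS)
index = build_index(pages)
```

In this sketch `pages` holds one entry per accession, with the fields grouped by source, and `index` maps each word to the set of accessions on whose page it appears.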
Text based information
From the following databases:
- UniProt, one of the largest protein databases
- SOURCE, convenient gene information overview
- Simple Modular Architecture Research Tool (SMART)
- SOSUI, predicts transmembrane domains
- PSORT, predicts protein localisation
- HomoloGene, compares proteins from different species
- gfp-cdna, protein localisation with fluorescence microscopy
- International Protein Index (IPI)
Databases rich in graphical elements
These databases are not collected but cross-linked, and are displayed via iframes. An iframe is a window within an HTML page that gives an embedded view of, and interactive access to, the linked database. Several such iframes are combined on a single Harvester protein page, which allows convenient side-by-side comparison of information from several databases.
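As an illustration of how several iframes can be combined on one page, the following sketch generates a simple HTML page with one iframe per linked resource. The function name `protein_page`, the resource titles and the `example.org` URLs are placeholders chosen for this example; the markup Harvester actually produces is not documented here.

```python
from html import escape


def protein_page(accession, frames):
    """Return an HTML page embedding one iframe per linked resource.

    `frames` is a list of (title, url) pairs supplied by the caller.
    """
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        f"<head><title>{escape(accession)}</title></head>",
        "<body>",
    ]
    for title, url in frames:
        parts.append(f"<h2>{escape(title)}</h2>")
        # Escape attribute values so characters such as & stay valid HTML.
        parts.append(
            f'<iframe src="{escape(url, quote=True)}" '
            f'title="{escape(title, quote=True)}" '
            'width="100%" height="400"></iframe>'
        )
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


# Placeholder URLs; a real page would point at the linked databases.
html_page = protein_page(
    "Q9NPD3",
    [
        ("Resource A", "https://example.org/a?id=Q9NPD3"),
        ("Resource B", "https://example.org/b?id=Q9NPD3&view=full"),
    ],
)
```

Each iframe loads its resource independently in the browser, so the embedding page needs to store only the link and not the data itself.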
- NCBI-BLAST, an algorithm for comparing biological sequences from the NCBI
- Ensembl, automatic gene annotation by the EMBL-EBI and Sanger Institute
- FlyBase, a database for the model organism Drosophila melanogaster
- GoPubMed, a knowledge-based search engine for biomedical texts
- iHOP, information hyperlinked over proteins via gene/protein synonyms
- Mendelian Inheritance in Man, a project cataloguing known diseases with a genetic component
- RZPD, German Resource Center for Genome Research in Berlin/Heidelberg
- STRING, Search Tool for the Retrieval of Interacting Genes/Proteins, developed by EMBL, SIB and UZH
- Zebrafish Information Network
- LOCATE subcellular localization database (mouse)
Access from external application
- UCSC Genome Browser, working draft assemblies for genomes
- Google Scholar
- PolyMeta, meta search engine for Google, Yahoo, MSN, Ask, Exalead, AllTheWeb, GigaBlast
What one can find
Harvester allows different search terms and single words to be combined. Examples of supported queries:
- Gene-name: "golga3"
- Gene-alias: "ADAP-S ADAS ADHAPS ADPS" (one gene name is sufficient)
- Gene-Ontologies: "Enzyme linked receptor protein signaling pathway"
- Unigene-Cluster: "Hs.449360"
- Go-annotation: "intra-Golgi transport"
- Molecular function: "protein kinase binding"
- Protein: "Q9NPD3"
- Protein domain: "SH2 sar"
- Protein Localisation: "endoplasmic reticulum"
- Chromosome: "2q31"
- Disease relevant: use the word "diseaselink"
- Combinations: "golgi diseaselink" (finds all golgi proteins associated with a disease)
- mRNA: "AL136897"
- Word: "Cancer"
- Comment: "highly expressed in heart"
- Author: "Merkel, Schmidt"
- Publication or project: "cDNA sequencing project"
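The combined query "golgi diseaselink" can be read as requiring every term to match. A minimal sketch of that reading follows, assuming an index that maps lowercase terms to sets of protein accessions. The index contents, the accessions and the `search` function are hypothetical, and the AND semantics and case-insensitive matching are assumptions made for illustration, not a description of how Harvester evaluates queries.

```python
# Hypothetical index: lowercase term -> set of protein accessions.
# The accessions are placeholders, not real identifiers.
INDEX = {
    "golgi": {"ACC0001", "ACC0003"},
    "diseaselink": {"ACC0003", "ACC0004"},
    "cancer": {"ACC0004"},
}


def search(query, index):
    """Return accessions matching every whitespace-separated query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from a copy so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


matches = search("golgi diseaselink", INDEX)
```

Under these assumptions a single word such as "Cancer" is looked up directly, while each additional term narrows the result set.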
See also
- Biological databases
- European Bioinformatics Institute
- Human Protein Reference Database
- Sequence profiling tool
Literature
- Liebel U, Kindler B, Pepperkok R (August 2004). "'Harvester': a fast meta search engine of human protein resources". Bioinformatics 20 (12): 1962–3. doi:10.1093/bioinformatics/bth146. PMID 14988114.
- Liebel U, Kindler B, Pepperkok R (2005). "Bioinformatic "Harvester": a search engine for genome-wide human, mouse, and rat protein resources". Meth. Enzymol. 404: 19–26. doi:10.1016/S0076-6879(05)04003-6. PMID 16413254.
Notes and references
- Manoj, M, Elizabeth, Jacob (Oct 2008). "Information retrieval on Internet using meta-search engines: A review". JSIR (CSIR) 67 (10): 739–746. ISSN 0022-4456.