Similarity Matrix of Proteins, or SIMAP, is a database of protein similarities that was created using distributed computing and is freely accessible for scientific purposes. SIMAP uses the FASTA algorithm to precalculate protein similarity, while a second application uses hidden Markov models to search for protein domains.
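The idea of precalculating similarities can be illustrated with a minimal sketch. It is not SIMAP's implementation and does not use the FASTA algorithm: as a stand-in it scores pairs by the overlap of their 3-letter subsequences (a Jaccard index), and the sequence names and function names are hypothetical. The point is only that every pair is scored once in advance, so later queries are simple lookups.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-length substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Shared k-mers divided by all distinct k-mers (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def precompute_similarities(sequences, k=3):
    """Score every unordered pair once and store it under a sorted key.

    Assumption: Jaccard overlap of k-mers is a toy substitute for a real
    alignment score such as the one FASTA would produce.
    """
    profiles = {name: kmers(seq, k) for name, seq in sequences.items()}
    return {
        tuple(sorted((x, y))): jaccard(profiles[x], profiles[y])
        for x, y in combinations(profiles, 2)
    }


def lookup(table, x, y):
    """Fetch a stored score; no comparison work happens at query time."""
    return table[tuple(sorted((x, y)))]


# Hypothetical toy sequences, not real database entries.
toy = {
    "protA": "MKVLAAGIVGL",
    "protB": "MKVLAAGIVGL",
    "protC": "WWWWPPPPHHH",
}
table = precompute_similarities(toy)
```

Calling `lookup(table, "protB", "protA")` returns the stored score regardless of argument order, because each pair is stored under a sorted key.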
The project usually receives new work units at the beginning of each month. More recently (2010), the inclusion of environmental sequences in the database has required longer periods of activity, for example several months of continuous work. Such updates typically occur twice a year.
In the fourth quarter of 2010, the project relocated to the University of Vienna because of failing electrical infrastructure at the Technical University of Munich. As part of the move, a project-specific URL was created, which required existing volunteers and users to detach from the project and reattach to it.
SIMAP uses the Berkeley Open Infrastructure for Network Computing (BOINC) distributed computing platform.
Application performance notes:
- Work unit CPU times can vary widely, ranging between 15 minutes and 3 hours.
- Work units vary in size from 1.5 to 2.2 MB each, averaging around 2 MB.
- SIMAP provides client software optimized for SSE-enabled processors and x86-64 processors. Non-SSE applications are provided for older processors but require manual installation. The operating systems supported by SIMAP are Linux, Windows, Mac OS and other UNIX platforms.
- At times the database has been complete, with all publicly known protein sequences and metagenomes already precalculated by the project. The available work then consists of newly published protein sequences and metagenomes that need to be precomputed for SIMAP.