
HHpred/HHsearch: Difference between revisions

From Wikipedia, the free encyclopedia
m Reverted edits by Soeding to last revision by 84.57.21.154 (HG)
Soeding (talk | contribs)
I renamed this page to "HHpred / HHsearch".
{{Infobox Software
| name = HHsearch
| developer = Johannes Söding
| latest_release_version = 1.5.0
| latest_release_date = {{release date|2008|12}}
| programming language = [[C++]]
| language = [[English language|English]]
| genre = [[Bioinformatics]] tool
| license = [[Creative Commons|Creative Commons Attribution-NonCommercial 2.0]]
| website = ftp://toolkit.lmb.uni-muenchen.de/hhsearch/
}}

'''HHsearch''' is a program for [[protein]] sequence searching<ref name="pmid15531603">{{ cite journal
| author = Söding J
| title = Protein homology detection by HMM-HMM comparison
| journal = Bioinformatics
| year = 2005
| volume = 21
| issue = 7
| pages = 951-960
| PMID = 15531603}}
</ref> that is free for non-commercial use. [http://toolkit.lmb.uni-muenchen.de/hhpred '''HHpred'''] is a free protein function and [[protein structure prediction]] server based on the HHsearch method.<ref name="pmid15980461">{{ cite journal
| author = Söding J, Biegert A, Lupas AN.
| title = The HHpred interactive server for protein homology detection and structure prediction
| journal = Nucleic Acids Res
| year = 2005
| volume = 33
| issue = Web Server issue
| pages = W244-248
| PMID = 15980461}}
</ref> HHpred/HHsearch are among the most popular methods for protein structure prediction and the detection of remotely related sequences, having been cited over 340 times [http://scholar.google.de/scholar?q=(hhpred+OR+HHsearch)+and+protein&hl=en&lr=&btnG=Search (Google Scholar search)].

Biologists frequently perform sequence searches to infer the function of an unknown protein from its sequence. For this purpose, the protein's sequence is compared to the sequences of other proteins in public databases, and its function is deduced from those of the most similar sequences. Often, no sequences with annotated functions can be found in such a search. In this case, more sensitive methods are required to identify more remotely related proteins or [[protein family|protein families]]. From these relationships, hypotheses about the protein's functions, [[Protein structure prediction|structure]], and [[Protein domain|domain composition]] can be inferred. HHsearch searches databases of proteins and protein families with a query protein sequence. The HHpred server and the HHsearch software package offer many popular, regularly updated databases, such as the [[Protein Data Bank|Protein Data Bank (PDB)]], [[InterPro]], [[Pfam]], [http://www.ncbi.nlm.nih.gov/COG/ COG], and [http://scop.mrc-lmb.cam.ac.uk/scop/ SCOP].

HHsearch belongs to the class of profile-profile comparison tools, which includes the most sensitive sequence search methods to date.<ref name="pmid10975570">{{cite journal
| author = Jaroszewski L, Rychlewski L, Godzik A.
| title = Improving the quality of twilight-zone alignments.
| journal = Protein Sci
| year = 2000
| volume = 9
| issue = 8
| pages = 1487-1496
| PMID = 10975570}}
</ref> <ref>{{cite journal
| author = Sadreyev RI, Baker D, Grishin NV
| title = Profile-profile comparisons by COMPASS predict intricate homologies between protein families
| journal = Protein Sci
| year = 2003
| volume = 12
| issue = 10
| pages = 2262-2272
| PMID = 14500884}}
</ref> <ref>{{cite journal
| author = Dunbrack RL Jr.
| title = Sequence comparison and protein structure prediction.
| journal = Curr Opin Struct Biol
| year = 2006
| volume = 16
| issue = 3
| pages = 374-384
| PMID = 16713709}}
</ref> <ref name="pmid15531603" />
They represent both the query sequence and the database sequences by ''sequence profiles'', also called [[position-specific scoring matrix|''position-specific scoring matrices (PSSMs)'']]. Profiles are calculated from a [[multiple sequence alignment]] of related sequences which are typically collected using the [[BLAST|PSI-BLAST]] program<ref>{{ cite journal
| author = Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ
| title = Basic local alignment search tool.
| journal = J Mol Biol
| year = 1990
| volume = 215
| issue = 3
| pages = 403-410
| PMID = 2231712}}
</ref> from [[National Center for Biotechnology Information|NCBI]]. A profile is a matrix that contains, for each position in the query sequence, a similarity score for each of the 20 amino acids. These scores are calculated from the frequencies of the amino acids at the corresponding positions in the multiple sequence alignment. Because profiles contain much more information than a single sequence (e.g. the position-specific degree of conservation), profile-profile comparison methods are much more powerful than sequence-sequence comparison methods like [[BLAST]] or profile-sequence comparison methods like [[BLAST|PSI-BLAST]].<ref name="pmid10975570" />
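
The calculation of a profile from alignment frequencies can be illustrated with a minimal Python sketch. It is not the procedure used by PSI-BLAST or HHsearch: the function name <code>profile_from_alignment</code>, the uniform background frequency, and the simple additive pseudocount are assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def profile_from_alignment(alignment, pseudocount=1.0):
    """Turn an alignment (equal-length strings) into per-column log-odds scores.

    Returns one dict per alignment column, mapping each of the 20 amino
    acids to a score in bits. Gap characters such as '-' are ignored.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Assumption: every amino acid is equally likely in the background.
    background = 1.0 / len(AMINO_ACIDS)

    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {}
        for aa in AMINO_ACIDS:
            # The pseudocount keeps unseen amino acids from scoring minus infinity.
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        profile.append(column)
    return profile


toy_alignment = ["ACD", "ACE", "ACD", "GCD"]
toy_profile = profile_from_alignment(toy_alignment)
```

In this toy alignment the second column is fully conserved, so cysteine receives the highest score there, while amino acids that never occur in a column receive negative scores. This position-specific conservation signal is what a single sequence cannot provide.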

HHpred represents query and database proteins by [[hidden Markov model|profile hidden Markov models (HMMs)]], an extension of [[position-specific scoring matrix|sequence profiles]] that also records position-specific amino acid insertion and deletion frequencies. HHsearch searches a database of HMMs with a query HMM. Before searching the actual database of HMMs, HHsearch/HHpred builds a [[multiple sequence alignment]] of sequences related to the query using a context-specific version of [[BLAST|PSI-BLAST]] ([[CS-BLAST|CSI-BLAST]]). From this alignment, a profile HMM is calculated. The databases contain HMMs that are precalculated in the same fashion using PSI-BLAST. The output of HHpred and HHsearch is a ranked list of database matches (including E-values and probabilities for a true relationship) and the pairwise query-database sequence alignments. A search through the PDB database of proteins with solved 3D structure takes a few minutes. If a significant match with a protein of known structure (a "template") is found in the PDB database, HHpred allows the user to build a homology model using the [http://www.salilab.org/modeller/ MODELLER] software, starting from the pairwise query-template alignment.
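
How two profile columns can be compared is sketched below in Python. The sketch follows the general idea of a log-sum-of-odds column score, in which the probability that both columns emit the same amino acid is weighed against the background frequency. It ignores the insertion and deletion states of a profile HMM, and the helper names, the uniform background, and the toy columns are assumptions for illustration rather than HHsearch's actual implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_score(q, p, background):
    """Log-sum-of-odds score (in bits) between two amino acid distributions."""
    return math.log2(sum(q[a] * p[a] / background[a] for a in AMINO_ACIDS))


def peaked_column(favoured, weight=0.81):
    """Hypothetical helper: a column dominated by one amino acid."""
    rest = (1.0 - weight) / (len(AMINO_ACIDS) - 1)
    return {a: (weight if a == favoured else rest) for a in AMINO_ACIDS}


# Assumption: uniform background frequencies.
uniform = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}

leucine_rich = peaked_column("L")
aspartate_rich = peaked_column("D")

same = column_score(leucine_rich, leucine_rich, uniform)
different = column_score(leucine_rich, aspartate_rich, uniform)
uninformative = column_score(uniform, leucine_rich, uniform)
```

Two columns that favour the same amino acid score positively, two columns that favour different amino acids score negatively, and a column that merely matches the background contributes a score of zero.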

Applications of HHpred/HHsearch include protein structure prediction, function prediction, domain prediction, domain boundary prediction, and evolutionary classification of proteins. In the CASP7 benchmark experiment (see [[CASP|CASP - Critical Assessment of Techniques for Protein Structure Prediction]]), HHpred5 was ranked 2nd out of 68 automatic structure prediction servers, while being more than 50 times faster than the best 20 servers.<ref>{{ cite journal
| author = Battey JN, Kopp J, Bordoli L, Read RJ, Clarke ND, Schwede T
| title = Automated server predictions in CASP7
| journal = Proteins
| year = 2007
| volume = 69
| issue = Suppl 8
| pages = 68-82
| pmid = 17894354
}}
</ref>


==See also==
*[[Sequence alignment software]]
*[[Protein structure prediction]]
*[[Position-specific scoring matrix]]
*[[Multiple sequence alignment]]
*[[CASP|CASP - Critical Assessment of Techniques for Protein Structure Prediction]]
*[[BLAST| BLAST (Basic Local Alignment Search Tool)]]
*[[CS-BLAST| Context-specific BLAST (CS-BLAST)]]

==References==
{{reflist}}


==External links==
*[http://toolkit.lmb.uni-muenchen.de/hhpred Free HHpred server at the University of Munich (LMU)]
*[http://toolkit.tuebingen.mpg.de/hhpred Free HHpred server at the Max Planck Institute in Tübingen]
*[http://predictioncenter.org/ CASP website]

{{bioinformatics-stub}}
[[Category:Bioinformatics]]
[[Category:Bioinformatics software]]
[[Category:Computational science]]

Revision as of 20:47, 14 April 2009