Protein–DNA interaction site predictor

The structural and physical properties of DNA place important constraints on the binding sites formed on the surfaces of DNA-binding proteins. The characteristics of these binding sites can be used to predict DNA-binding sites from the structural, and even the sequence, properties of unbound proteins. The same approach has been implemented successfully for predicting protein–protein interfaces, and here it is applied to DNA-binding proteins. The first attempts to use sequence and evolutionary features to predict DNA-binding sites in proteins were made by Ahmad et al. (2004) and Ahmad and Sarai (2005). Some methods use structural information and therefore require a three-dimensional structure of the protein, while others use only sequence information and need no structure to make a prediction.
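As a rough illustration of what "sequence-only" prediction involves, the sketch below encodes each residue by a window of its neighbours and applies a toy rule. The function names, the padding character `X`, the window size, and the threshold are all assumptions made for this example. The scoring rule (fraction of lysine and arginine in the window) is only a placeholder motivated by the negative charge of DNA; it is not the method of any predictor described in this article, which use trained machine-learning models instead.

```python
from typing import List

PAD = "X"             # assumed placeholder for positions beyond the sequence ends
POSITIVE = set("KR")  # lysine and arginine, the positively charged residues used here


def sliding_windows(sequence: str, half_width: int = 3) -> List[str]:
    """Return one window of length 2*half_width + 1 centred on each residue."""
    padded = PAD * half_width + sequence.upper() + PAD * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


def positive_fraction(window: str) -> float:
    """Toy feature: fraction of window positions occupied by K or R."""
    return sum(1 for aa in window if aa in POSITIVE) / len(window)


def toy_predict(sequence: str, half_width: int = 3,
                threshold: float = 0.4) -> List[bool]:
    """Label a residue as putatively binding when its window is rich in K/R.

    This is a hypothetical stand-in for a trained classifier.
    """
    return [positive_fraction(w) >= threshold
            for w in sliding_windows(sequence, half_width)]
```

A real sequence-based predictor would replace `positive_fraction` with a richer feature vector per window and `toy_predict` with a classifier trained on known protein–DNA complexes.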

Web servers

Structure- and sequence-based prediction of DNA-binding sites in DNA-binding proteins can be performed on the web servers listed below.

DISIS predicts DNA-binding sites directly from the amino acid sequence and is therefore applicable to all known proteins. It uses machine learning algorithms and is based on the physicochemical properties of each residue and its environment, predicted structural features, and evolutionary data.[1]

DISIS2 takes the raw amino acid sequence and generates all of its features from it, including secondary structure, solvent accessibility, disorder, B-value, protein-protein interaction, coiled coils, and evolutionary profiles. It uses many more predicted features than the previous version, DISIS. From these features, DISIS2 predicts DNA-binding residues in DNA-binding proteins.

DNABindR predicts DNA-binding sites from amino acid sequences using machine learning algorithms.[2]

DISPLAR makes a prediction based on properties of the protein structure. Knowledge of the protein structure is required.[3]

BindN makes a prediction based on chemical properties of the input protein sequence. Knowledge of the protein structure is not required.[4]

BindN+ is an upgraded version of BindN that applies support vector machines (SVMs) to sequence-based prediction of DNA- or RNA-binding residues from biochemical features and evolutionary information.[5]

DP-Bind combines multiple methods to make a consensus prediction based on the profile of evolutionary conservation and properties of the input protein sequence. The profile of evolutionary conservation is generated automatically by the web server. Knowledge of the protein structure is not required.[6]

DBS-PSSM[7] and DBS-Pred[8] predict DNA-binding sites in a protein from its sequence information.
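The idea of a consensus prediction, as mentioned for DP-Bind, can be sketched with a simple per-residue majority vote. The text here does not say how DP-Bind actually combines its methods, so the voting rule and the function name `majority_consensus` are assumptions chosen for illustration. Each input list holds one predictor's labels for the same protein (1 for binding, 0 for non-binding).

```python
from typing import List, Sequence


def majority_consensus(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-residue 0/1 labels from several predictors by strict majority.

    A residue is labelled 1 only if more than half of the predictors say 1,
    so ties (possible with an even number of predictors) resolve to 0.
    """
    if not predictions:
        raise ValueError("need at least one predictor")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictors must label the same number of residues")
    n = len(predictions)
    # zip(*predictions) walks the residues, yielding one vote per predictor.
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*predictions)]
```

Using an odd number of predictors avoids ties. A weighted vote or a second-level classifier would be alternative ways to combine the same inputs.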

References


  1. Ofran Y, Mysore V, Rost B. "Prediction of DNA-binding residues from sequence" Bioinformatics 2007 23(13):i347-53
  2. Yan C, Terribilini M, Wu F, Jernigan RL, Dobbs D, Honavar V. "Predicting DNA-binding sites of proteins from amino acid sequence" BMC Bioinformatics 2006 7:262
  3. Tjong H, Zhou H-X. "DISPLAR: an accurate method for predicting DNA-binding sites on protein surfaces" Nucleic Acids Research 2007 35:1465-1477
  4. Wang L, Brown SJ. "BindN: a web-based tool for efficient prediction of DNA and RNA binding sites in amino acid sequences" Nucleic Acids Research 2006 34(Web Server issue):W243-8 PMID 16845003
  5. Wang L, Huang C, Yang MQ, Yang JY. "BindN+ for accurate prediction of DNA and RNA-binding residues from protein sequence features" BMC Systems Biology 2010 4(Suppl 1):S3 doi:10.1186/1752-0509-4-S1-S3
  6. Hwang S, Gou Z, Kuznetsov IB. "DP-Bind: a web server for sequence-based prediction of DNA-binding residues in DNA-binding proteins" Bioinformatics 2007 23(5):634-636 PMID 17237068
  7. Ahmad S, Sarai A. "PSSM based prediction of DNA-binding sites in proteins" BMC Bioinformatics 2005 6:33 (This article also shows how prediction can be sped up significantly by generating alignments against limited data sets.)
  8. Ahmad S, Gromiha MM, Sarai A. "Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information" Bioinformatics 2004 20:477-486 (This article also uses amino acid composition analysis to predict DNA-binding proteins, and uses structure information to improve binding site prediction. The method is based on single sequences only, and thousands of proteins can be processed in less than an hour. A standalone version is also available.)