Protein design
Protein design is the design of new protein molecules, either from scratch or by making calculated variations on a known structure. The use of rational design techniques for proteins is a major aspect of protein engineering.
The design of minimalist computer models of proteins (lattice proteins), and the secondary structural modification of real proteins, began in the mid-1990s. The de novo design of real proteins became possible shortly afterwards, and in the 21st century it has become a productive field of research. There is great hope that the design of new proteins, small and large, will have applications in medicine and bioengineering (see examples below).
Overview
The number of possible amino acid sequences is enormous, but only a subset of them will fold reliably and quickly to a single native state. Protein design involves identifying novel sequences within this subset, in particular those with a physiologically active native state. Physically, the native state of a protein is the conformational free energy minimum for the chain. Therefore protein design is the search for sequences which have the chosen structure as a free energy minimum. In a sense it is the reverse of structure prediction: in design, a tertiary structure is specified, and a sequence is identified which will fold to it. Hence it is also referred to as inverse folding.
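To give a sense of how large the sequence space is, the short sketch below counts the sequences of a given length, assuming an alphabet of the 20 standard amino acids. The function names are illustrative only and do not come from any design package.

```python
import math

NUM_AMINO_ACIDS = 20  # assumption: only the 20 standard amino acids are used


def sequence_space_size(length):
    """Number of distinct amino acid sequences of the given length."""
    return NUM_AMINO_ACIDS ** length


def log10_sequence_space(length):
    """Base-10 logarithm of the sequence count, convenient for long chains."""
    return length * math.log10(NUM_AMINO_ACIDS)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(f"length {n}: about 10^{log10_sequence_space(n):.0f} sequences")
```

Even for a short chain the count is far too large to enumerate, which is why design methods search the space rather than list it.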
Protein design requires an understanding of the molecular interactions that stabilize proteins in specific folded configurations; experience has shown, however, that it does not require an understanding of the dynamical process by which proteins fold.
Prion diseases such as mad cow disease illustrate why a designed protein should have only one stable conformation. In mad cow disease, a healthy protein has a fatal weakness: it can also adopt a second, abnormally folded conformation that is low in free energy and therefore very stable. For reasons that are not yet fully understood, the misfolded prion protein induces other copies of the same protein to adopt the misfolded shape, setting off a cascade in which previously functional proteins rapidly misfold. In the new conformation they lose their normal function and tend to form aggregates called plaques. The buildup of these aggregates in the brain leads to progressive neuronal death and eventually to the death of the organism. A designed protein should therefore have only one stable tertiary structure, and researchers must take great care to ensure that this holds in all environments, especially in vivo.
Examples of designed proteins
The early 21st century saw the creation of small proteins with real biological functions, including chiroselective catalysis,[1] ion detection,[2] and antiviral behaviour.[3] Using computational methods, a protein with a novel fold (Top7) was designed in 2003,[4] as were sensors for unnatural molecules.[5] A later computational redesign, confirmed experimentally, switched the cofactor specificity of Candida boidinii xylose reductase from NADPH to NADH.[6]
On the other hand, it is widely believed that not all possible protein structures are designable: there are compact configurations of the chain to which no sequence can fold. In particular, conformations that are poor in secondary structure are unlikely to be designable. The designability of a given structure is still poorly understood.
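Designability is often explored with minimalist lattice models. The sketch below is one such toy illustration, not a method taken from the cited work: it assumes a two-letter (H/P) alphabet on a 2D square lattice, an energy of -1 for every non-bonded H-H contact, and defines the designability of a conformation as the number of sequences that have it as their unique lowest-energy conformation. All names are hypothetical.

```python
from itertools import product

MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def conformations(n):
    """Self-avoiding walks of n >= 2 monomers, without rotations or mirror images."""
    results = []

    def extend(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in path:
                continue  # site already occupied
            if not turned and dy == -1:
                continue  # first move off the x-axis must go up (removes mirror images)
            path.append(nxt)
            extend(path, turned or dy != 0)
            path.pop()

    # Fixing the first bond along +x removes the four rotations.
    extend([(0, 0), (1, 0)], False)
    return results


def energy(seq, conf):
    """-1 for each pair of H residues adjacent on the lattice but not along the chain."""
    index_at = {site: i for i, site in enumerate(conf)}
    total = 0
    for i, (x, y) in enumerate(conf):
        if seq[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # look right and up so each pair is counted once
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                total -= 1
    return total


def designability(n):
    """Map each conformation to the number of sequences with it as unique ground state."""
    confs = conformations(n)
    counts = {}
    for seq in product("HP", repeat=n):
        energies = [energy(seq, c) for c in confs]
        best = min(energies)
        if energies.count(best) == 1:  # skip sequences with degenerate ground states
            winner = confs[energies.index(best)]
            counts[winner] = counts.get(winner, 0) + 1
    return counts


if __name__ == "__main__":
    counts = designability(8)
    print(len(conformations(8)), "conformations,", len(counts), "are designable")
```

In this toy model, conformations that never appear in the result are "undesignable": no sequence folds uniquely to them. Real proteins are far more complex, so the model only illustrates the concept.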
Models of protein structure and function used in protein design
Protein design can be accomplished using computer models, which, while simplifying the problem, are able to generate sequences that fold to the desired structure. Computational protein design algorithms search the sequence-conformation space for sequences that are low in energy when folded to the target structure. This search space is large; currently the most challenging requirement for computational protein design is a fast, yet accurate, energy function that can distinguish optimal sequences from similar suboptimal ones.
Computational protein design algorithms use rotamer libraries and models of protein energetics to evaluate how mutations would affect a protein's structure and function. These energy functions typically include a combination of molecular mechanics, statistical (i.e. knowledge-based), and other empirical terms. However, the trend has been towards using more physically based potential energy functions.[7]
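The search described above can be sketched as a simulated-annealing Monte Carlo optimisation over discrete choices at each design position. This is a minimal illustration under stated assumptions: each "choice" stands for an amino acid and rotamer pair, the energy is a sum of made-up self and pairwise terms drawn at random, and none of the names correspond to a real design package or force field.

```python
import math
import random


def random_tables(n_pos, n_choices, seed=0):
    """Hypothetical energy tables; a real program would compute these from a force field."""
    rng = random.Random(seed)
    self_e = [[rng.uniform(-1, 1) for _ in range(n_choices)] for _ in range(n_pos)]
    pair_e = {
        (i, j): [[rng.uniform(-1, 1) for _ in range(n_choices)] for _ in range(n_choices)]
        for i in range(n_pos)
        for j in range(i + 1, n_pos)
    }
    return self_e, pair_e


def total_energy(assign, self_e, pair_e):
    """Sum of self energies plus pairwise energies for one assignment of choices."""
    e = sum(self_e[i][r] for i, r in enumerate(assign))
    for (i, j), table in pair_e.items():
        e += table[assign[i]][assign[j]]
    return e


def anneal(self_e, pair_e, steps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing with the Metropolis criterion; returns the best state seen."""
    rng = random.Random(seed)
    n = len(self_e)
    current = [rng.randrange(len(self_e[i])) for i in range(n)]
    cur_e = total_energy(current, self_e, pair_e)
    best, best_e = list(current), cur_e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n)
        old = current[i]
        current[i] = rng.randrange(len(self_e[i]))  # propose a new choice at position i
        new_e = total_energy(current, self_e, pair_e)
        if new_e <= cur_e or rng.random() < math.exp(-(new_e - cur_e) / t):
            cur_e = new_e
            if cur_e < best_e:
                best, best_e = list(current), cur_e
        else:
            current[i] = old  # reject the move
    return best, best_e


if __name__ == "__main__":
    self_e, pair_e = random_tables(n_pos=8, n_choices=5)
    best, best_e = anneal(self_e, pair_e)
    print("best assignment:", best, "energy:", round(best_e, 3))
```

Recomputing the full energy at every step keeps the sketch short; practical implementations typically update only the terms that involve the changed position.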
Ancestral sequence reconstruction
Ancestral reconstruction techniques have been used to design proteins with putative ancient functions.[8]
Software
Iterative Protein Redesign and Optimization. IPRO redesigns proteins to increase or give specificity to native or novel substrates and cofactors. This is done by repeatedly randomly perturbing the backbones of the proteins around specified design positions, identifying the lowest energy combination of rotamers, and determining if the new design has a lower binding energy than previous ones. The iterative nature of this process allows IPRO to make additive mutations to the protein sequence that collectively improve the specificity towards the desired substrates and/or cofactors. Experimental testing of predictions by IPRO successfully switched the cofactor preference of Candida boidinii xylose reductase from NADPH to NADH.[6]
EGAD: A Genetic Algorithm for protein Design.[9] A free, open-source software package for protein design and prediction of mutation effects on protein folding stabilities and binding affinities. EGAD can also consider multiple structures simultaneously for designing specific binding proteins or locking proteins into specific conformational states. In addition to natural protein residues, EGAD can also consider free-moving ligands with or without rotatable bonds. EGAD can be used with single or multiple processors.
RosettaDesign. A software package, under active development and free for academic use, that has seen extensive successful use.[10][11][12][13] RosettaDesign is accessible via a web server.[14]
SHARPEN. A permissive open-source library for protein design and structure prediction. SHARPEN offers a variety of combinatorial optimization methods (e.g. Monte Carlo, Simulated Annealing, FASTER[15]) and can score proteins using the successful Rosetta all-atom force field or molecular mechanics force fields (OPLSaa). In addition to the protein modeling library, SHARPEN includes tools for scalable distributed computing.
WHAT IF. Software for protein modelling, design, validation, and visualisation.
Abalone. Software for protein modelling and visualisation.
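The iterative loop described for IPRO above (perturb the backbone, find the lowest-energy rotamers, keep the design if its binding energy is lower) can be sketched generically as follows. This is not IPRO's code or interface: the three callables, the greedy acceptance rule, and the toy model in which a backbone is a list of numbers are all assumptions made for illustration.

```python
import random


def iterative_redesign(backbone, perturb, optimise_rotamers, binding_energy,
                       n_iter=200, seed=0):
    """Keep a perturbed design only when its binding energy is lower than the best so far."""
    rng = random.Random(seed)
    best_backbone = backbone
    best_rotamers = optimise_rotamers(best_backbone)
    best_e = binding_energy(best_backbone, best_rotamers)
    for _ in range(n_iter):
        trial_backbone = perturb(best_backbone, rng)        # random local backbone move
        trial_rotamers = optimise_rotamers(trial_backbone)  # lowest-energy rotamers for it
        trial_e = binding_energy(trial_backbone, trial_rotamers)
        if trial_e < best_e:                                # greedy acceptance (assumption)
            best_backbone, best_rotamers, best_e = trial_backbone, trial_rotamers, trial_e
    return best_backbone, best_rotamers, best_e


# --- Toy stand-ins so the loop can be run; none of this models real chemistry. ---
CHOICES = (-1.0, 0.0, 1.0)  # hypothetical rotamer "values" available at every position


def toy_perturb(backbone, rng):
    """Return a copy of the backbone with one coordinate nudged by Gaussian noise."""
    trial = list(backbone)
    i = rng.randrange(len(trial))
    trial[i] += rng.gauss(0.0, 0.3)
    return trial


def toy_binding_energy(backbone, rotamers):
    """Toy energy: squared mismatch between backbone coordinates and rotamer values."""
    return sum((b - r) ** 2 for b, r in zip(backbone, rotamers))


def toy_optimise_rotamers(backbone):
    """Pick the best rotamer at each position (exact here, as the toy energy is separable)."""
    return [min(CHOICES, key=lambda r: (b - r) ** 2) for b in backbone]


if __name__ == "__main__":
    start = [0.4, -0.6, 0.5]
    _, rotamers, e = iterative_redesign(start, toy_perturb, toy_optimise_rotamers,
                                        toy_binding_energy)
    print("rotamers:", rotamers, "binding energy:", round(e, 4))
```

Because a trial is accepted only when it improves on the best design so far, the returned energy can never be higher than that of the starting design.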
See also
- Ancestral reconstruction
- Molecular design software
- PEGylation
- Protein engineering
- Protein structure prediction software
- Software for molecular modeling
- Meganucleases
References
- [1] Saghatelian, Alan; Yokobayashi, Yohei; Soltani, Kathy & Ghadiri, M. Reza (2001), "A chiroselective peptide replicator", Nature 409 (6822): 797–801, doi:10.1038/35057238, PMID 11236988, http://www.nature.com/nature/journal/v409/n6822/abs/409797a0.html
- [2] Nagai, Takeharu; Sawano, Asako; Park, Eun Sun & Miyawaki, Atsushi (2001), "Circularly permuted green fluorescent proteins engineered to sense Ca2+", PNAS 98 (6): 3197–3202, Bibcode 2001PNAS...98.3197N, doi:10.1073/pnas.051636098, PMC 30630, PMID 11248055
- [3] Root, Michael J.; Kay, Michael S.; Kim, Peter S. (2001), "Protein design of an HIV-1 entry inhibitor", Science 291 (5505): 884–888, Bibcode 2001Sci...291..884R, doi:10.1126/science.1057453, http://www.sciencemag.org/content/291/5505/884.abstract
- [4] Kuhlman, Brian; Dantas, Gautam; Ireton, Gregory C.; Varani, Gabriele; Stoddard, Barry L. & Baker, David (2003), "Design of a Novel Globular Protein Fold with Atomic-Level Accuracy", Science 302 (5649): 1364–1368, Bibcode 2003Sci...302.1364K, doi:10.1126/science.1089427, PMID 14631033
- [5] Looger, Loren L.; Dwyer, Mary A.; Smith, James J. & Hellinga, Homme W. (2003), "Computational design of receptor and sensor proteins with novel functions", Nature 423 (6936): 185–190, Bibcode 2003Natur.423..185L, doi:10.1038/nature01556, PMID 12736688
- [6] Khoury, GA; Fazelinia, H; Chin, JW; Pantazes, RJ; Cirino, PC; Maranas, CD (October 2009), "Computational design of Candida boidinii xylose reductase for altered cofactor specificity", Protein Science 18 (10): 2125–38, doi:10.1002/pro.227, PMC 2786976, PMID 19693930
- [7] Boas, F. E. & Harbury, P. B. (2007), "Potential energy functions for protein design", Current Opinion in Structural Biology 17 (2): 199–204, doi:10.1016/j.sbi.2007.03.006, PMID 17387014
- [8] Chang, BS; Jönsson, K; Kazmi, MA; Donoghue, MJ; Sakmar, TP (2002), "Recreating a functional ancestral archosaur visual pigment", Molecular Biology and Evolution 19 (9): 1483–9, PMID 12200476
- [9] User's Manual for EGAD! a Genetic Algorithm for protein Design!, http://egad.ucsd.edu/EGAD_manual/index.html
- [10] Liu, Y; Kuhlman, B (July 2006), "RosettaDesign server for protein design", Nucleic Acids Research 34 (Web Server issue): W235–8, doi:10.1093/nar/gkl163, PMC 1538902, PMID 16845000
- [11] Dantas, Gautam; Kuhlman, Brian; Callender, David; Wong, Michelle; Baker, David (2003), "A Large Scale Test of Computational Protein Design: Folding and Stability of Nine Completely Redesigned Globular Proteins", Journal of Molecular Biology 332 (2): 449, doi:10.1016/S0022-2836(03)00888-X, PMID 12948494
- [12] Dobson, N; Dantas, G; Baker, D; Varani, G (2006), "High-Resolution Structural Validation of the Computational Redesign of Human U1A Protein", Structure 14 (5): 847, doi:10.1016/j.str.2006.02.011, PMID 16698546
- [13] Dantas, G; Corrent, C; Reichow, S; Havranek, J; Eletr, Z; Isern, N; Kuhlman, B; Varani, G et al. (2007), "High-resolution Structural and Thermodynamic Analysis of Extreme Stabilization of Human Procarboxypeptidase by Computational Protein Design", Journal of Molecular Biology 366 (4): 1209, doi:10.1016/j.jmb.2006.11.080, PMID 17196978
- [14] http://rosettadesign.med.unc.edu/
- [15] Desmet, J; Spriet, J; Lasters, I (July 2002), "Fast and accurate side-chain topology and energy refinement (FASTER) as a new method for protein structure optimization", Proteins 48 (1): 31–43, doi:10.1002/prot.10131, PMID 12012335
Further reading
- Dahiyat, Bassil I. & Mayo, Stephen L. (1997), "De Novo Protein Design: Fully Automated Sequence Selection", Science 278 (5335): 82–87, doi:10.1126/science.278.5335.82, PMID 9311930
- Sander, Chris; Vriend, Gerrit; Bazan, Fernando; Horovitz, Amnon; Nakamura, Haruki; Ribas, Luis; Finkelstein, Alexei V.; Lockhart, Andrew et al. (1992), "Protein Design on computers. Five new proteins: Shpilka, Grendel, Fingerclasp, Leather and Aida", Proteins: Structure, Function, and Bioinformatics 12 (2): 105–110, doi:10.1002/prot.340120203, PMID 1603799
- Jin, Wenzhen; Kambara, Ohki; Sasakawa, Hiroaki; Tamura, Atsuo & Takada, Shoji (2003), "De Novo Design of Foldable Proteins with Smooth Folding Funnel: Automated Negative Design and Experimental Verification", Structure 11 (5): 581–590, doi:10.1016/S0969-2126(03)00075-3, PMID 12737823
- Pokala, Navin & Handel, Tracy M. (2005), "Energy Functions for Protein Design: Adjustment with Protein–Protein Complex Affinities, Models for the Unfolded State, and Negative Design of Solubility and Specificity", Journal of Molecular Biology 347 (1): 203–227, doi:10.1016/j.jmb.2004.12.019, PMID 15733929
- Röthlisberger, Daniela; Khersonsky, Olga; Wollacott, Andrew M.; Jiang, Lin; Dechancie, Jason; Betker, Jamie; Gallaher, Jasmine L.; Althoff, Eric A. et al. (2008), "Kemp elimination catalysts by computational enzyme design", Nature 453 (7192): 190, Bibcode 2008Natur.453..190R, doi:10.1038/nature06879, PMID 18354394
- Jiang, Lin; Althoff, Eric A.; Clemente, Fernando R.; Doyle, Lindsey; Rothlisberger, Daniela; Zanghellini, Alexandre; Gallaher, Jasmine L.; Betker, Jamie L. et al. (2008), "De Novo Computational Design of Retro-Aldol Enzymes", Science 319 (5868): 1387, Bibcode 2008Sci...319.1387J, doi:10.1126/science.1152692, PMID 18323453