Ebook: Protein Folding and Drug Design
One of the great unsolved problems of science, and of physics in particular, is the prediction of the three-dimensional structure of a protein from its amino acid sequence: the folding problem. It may be stated that the deep connection existing between physics and protein folding is not so much, or in any case not only, through physical methods (experimental: X-rays, NMR, etc., or theoretical: statistical mechanics, spin glasses, etc.), but through physical concepts. In fact, protein folding can be viewed as an emergent property contained neither in the atoms forming the protein nor in the forces acting among them, in a similar way as superconductivity emerges as an unexpected coherent phenomenon taking place on a sea of electrons at low temperature. Much is already known about the protein folding problem, thanks, among other things, to protein engineering experiments as well as to a variety of theoretical inputs: the inverse folding problem, funnel-like energy landscapes (Peter Wolynes), helix-coil transitions, etc. Although the various models are quite different in appearance, the fact that they can account for much of the experimental findings is likely because they contain much of the same (right) physics. This physics is related to the important role played by selected, highly conserved, "hot" amino acids which participate in the stability of independent folding units which, upon docking, give rise to a (post-critical) folding nucleus lying beyond the highest maximum of the free energy associated with the process.
“One of the great unsolved problems of science is the prediction of the three-dimensional structure of a protein from its amino acid sequence: the folding problem”. Thus wrote Sir Alan Fersht, an illustrious scientist from Cambridge, a few years ago. But because, according to another Cambridge scholar, Lord Rutherford: “science is either physics or it is stamp collection”, the “protein folding problem” is also one of the great unsolved problems of physics. This is the reason why the Italian Physical Society organized the present “Enrico Fermi” Summer School, on the premises of Villa Monastero, where Enrico Fermi lectured for the last time in Italy (summer 1954) before his untimely death on November 28th of that year.
It may be stated that the deep connection existing between physics and protein folding is not so much, or in any case not only, through physical methods (experimental: X-rays, NMR, etc., or theoretical: statistical mechanics, spin glasses, etc.), but through physical concepts, in particular those associated with the transition of many-body (finite) systems between an initial and a final phase implying breaking of symmetry. In fact, protein folding can be viewed as an emergent property contained neither in the atoms forming the protein nor in the forces acting among them, in a similar way as superconductivity emerges as an unexpected coherent phenomenon taking place on a sea of electrons at low temperature.
Let us recall that, although one does not yet know how to read the 3D structure of a protein from its 1D structure, much is known about the protein folding problem, thanks, among other things, to protein engineering experiments (ϕ-value determination, Alan Fersht and Luis Serrano) as well as to a variety of theoretical inputs: the inverse folding problem (Eugene Shakhnovich), funnel-like energy landscapes (Peter Wolynes), helix-coil transitions (Harold Scheraga), etc.
Although the various models are quite different in appearance, the fact that they can account for much of the experimental findings is likely because they contain much of the same (right) physics. This physics is related to the important role played by selected, highly conserved, “hot” amino acids which participate in the stability of independent folding units which, upon docking, give rise to a (post-critical) folding nucleus lying beyond the highest maximum of the free energy associated with the process. In the same way as the Heisenberg (matrix) and Schrödinger (differential equation) versions of quantum mechanics have been shown to contain the same physics, it is highly likely that the physics underlying the different views of protein folding presented by the lecturers is, to a large extent, equivalent.
This impression also emerged from the answers given by the lecturers to the many questions and comments put forward by the lively group of students who attended the School. Within this context, we want to thank them for their attendance and acknowledge the assiduity of their interventions in terms of questions and comments, the high level of the ten-minute talks many of them gave, as well as the high level of the posters presented.
At the basis of their collective response was the high level of the lecturers' and seminar speakers' presentations. In particular, Harold Scheraga, Peter Wolynes, Eugene Shakhnovich, Amos Maritan, Luis Serrano, Leonid Mirny and Guido Tiana reminded us that, while we still do not know how to solve the “protein folding problem”, a series of powerful concepts and methods has been developed (e.g., chain folding initiation events, foldons and folding funnels, local elementary structures, etc.) which have shed much light on the mechanism underlying the folding of proteins.
It is remarkable that the red thread running through these concepts and the corresponding results, starting from those associated with the simplified lattice models discussed by R. A. Broglia, seems to extend all the way to HIV-1 reproduction in infected cells, opening the way for the development of non-conventional (folding) inhibitors, as was reported by Stefano Rusconi, a non-obvious consequence of the in vitro experiments reported by Davide Provasi. A new interdisciplinarity embracing not only physicists, chemists and biologists but also medical doctors, which is likely to be needed to solve such formidable problems as those created by HIV, seems to be in its nascent stage.
The remarkable advances in the field of ab initio studies which have taken place in recent years were reported by Michele Parrinello, Wilfred Van Gunsteren, Paolo Carloni and Peter Winn. Peter Wolynes, Gennady Verkhivker and Giorgio Colombo updated students and lecturers alike on the latest developments in drug design and on the many successes, as well as challenges, facing this exciting field lying at the borderline between pure and applied research. The role quantum mechanics plays within this context, as well as within the framework of protein folding, was discussed by Kenneth Merz.
The School would not have been possible without the indefatigable support of the President of the Italian Physical Society, Professor Franco Bassani, who very early in the programming of the Enrico Fermi School realized the relevance, for physicists, of the subject of protein folding and drug design. To him and to the Secretarial staff (Ramona Brigatti and Giovanna Bianchi Bazzi) headed by Barbara Alzani, as well as to the Villa Monastero housekeeper, Antonio Cintorino, our warmest thanks.
Aside from the financial support provided by the Italian Physical Society, we acknowledge the support coming from the University of Milan. The presence in the concluding Session of the School of the Vice President of research, Prof. Giampiero Sironi, and of the Dean of the Faculty of Sciences, Prof. M. Pignanelli, testifies to the importance ascribed by our University to the interdisciplinary field of protein folding and drug design. Within this context, the presence of Prof. Mauro Moroni (Head of the Department of Clinical Sciences, Division of Infective Diseases) and of Prof. Massimo Galli (Director of the Institute of Infective Diseases) of the Faculty of Medicine (and Sacco Hospital) of the University of Milan was only natural. We also acknowledge the support of INFN (Istituto Nazionale di Fisica Nucleare).
Last, but not least, we acknowledge the privilege of having held this School in the evocative premises of Villa Monastero, at Varenna, on Lake Como. As one of the lecturers vividly put it, it felt almost surreal to be able to carry out business as usual, namely to discuss with each other what we understand, as well as what we do not understand, about protein folding, in such beautiful surroundings.
R. A. Broglia, L. Serrano and G. Tiana
A description is provided of experimental studies of the folding pathways of bovine pancreatic ribonuclease A (RNase A) and of a physics-based theoretical approach to compute both the folded native structure of a globular protein and the pathways leading to it, using the information contained in the amino acid sequence and an empirical potential energy function. A brief discussion of hydrophobic interactions is also provided.
Introduction; 1. The basis of folding landscapes; 2. Random sequences; 3. The statistical energy landscape; 4. The energy landscape of long evolved proteins; 5. Local vs. global descriptions of the folding landscape
There have been considerable attempts in the past to relate a phenotypic trait, the habitat temperature of organisms, to their genotypes, most importantly the compositions of their genomes and proteomes. However, despite the accumulation of anecdotal evidence, an exact and conclusive relationship between the former and the latter has been elusive.
We present an exhaustive study of the relationship between the amino acid composition of proteomes, the nucleotide composition of DNA, and the optimal growth temperature (OGT) of prokaryotes. Based on 204 complete proteomes of archaea and bacteria spanning the temperature range from −10°C to +110°C, we performed an exhaustive enumeration of all possible sets of amino acids and found a set of amino acids whose total fraction in a proteome is correlated, to a remarkable extent, with the optimal growth temperature. The universal set is Ile, Val, Tyr, Trp, Arg, Glu, Leu (IVYWREL), and the correlation coefficient is as high as 0.93. We also found that the G+C content in 204 complete genomes does not exhibit a significant correlation with optimal growth temperature (R=−0.10). On the other hand, the fraction of A+G in coding DNA is correlated with temperature, to a considerable extent, due to the codon patterns of the IVYWREL amino acids. Further, we found a strong and independent correlation between OGT and the frequency with which pairs of A and G nucleotides appear as nearest neighbors in genome sequences. This adaptation is achieved via codon bias. We further analyze the physical reason for the observed amino acid composition bias and determine that it is due to positive design, which seeks to lower the energy of the native state of proteins, and negative design, which increases the energy of misfolded conformations. Together these two factors work to increase the energy gap in proteins and therefore increase their stability. These findings present a direct link between the principles of protein structure and stability and the evolutionary mechanisms of thermophilic adaptation.
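The kind of analysis summarized above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' actual pipeline: the function names (`residue_counts`, `set_fraction`, `pearson_r`, `best_residue_set`) are hypothetical, a proteome is assumed to be a list of one-letter amino acid sequences, and the sequences and temperatures in the demo are made-up toy values, not data from the study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

IVYWREL = frozenset("IVYWREL")  # the residue set named in the abstract


def residue_counts(proteome):
    """Count one-letter residues over all sequences of a proteome."""
    counts = Counter()
    for seq in proteome:
        counts.update(seq)
    return counts


def set_fraction(proteome, residues=IVYWREL):
    """Total fraction of residues in `proteome` that belong to `residues`."""
    counts = residue_counts(proteome)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("proteome contains no residues")
    return sum(counts[aa] for aa in residues) / total


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_residue_set(proteomes, temperatures, alphabet):
    """Score every non-empty subset of `alphabet` by the correlation between
    its total fraction and temperature; return (best r, best subset).
    For the full 20-letter alphabet this loops over about a million subsets."""
    all_counts = [residue_counts(p) for p in proteomes]
    totals = [sum(c.values()) for c in all_counts]
    letters = sorted(alphabet)
    best_r, best_set = float("-inf"), None
    for size in range(1, len(letters) + 1):
        for subset in combinations(letters, size):
            fractions = [sum(c[aa] for aa in subset) / t
                         for c, t in zip(all_counts, totals)]
            r = pearson_r(fractions, temperatures)
            if r > best_r:
                best_r, best_set = r, frozenset(subset)
    return best_r, best_set


if __name__ == "__main__":
    # Toy, made-up proteomes and growth temperatures (illustration only).
    toy_proteomes = [["AAAG"], ["AIAG"], ["IIAG"]]
    toy_temperatures = [10.0, 50.0, 90.0]
    print(best_residue_set(toy_proteomes, toy_temperatures, "AIG"))
```

Counting residues once per proteome keeps the inner loop cheap, since each candidate set then needs only a handful of additions per organism.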
1. Introduction; 2. The lattice model; 2.1. Inverse folding problem: the design of good folders; 2.2. Role of the different amino acids in the folding process; 2.3. Extension of the inverse folding strategy; 2.4. How many mutations can a designed protein tolerate?; 3. Hierarchical folding of a model protein; 4. Solving the protein folding problem in the case of a notional protein (three-step-strategy (3SS)); 5. Lattice model design of resistance proof, folding-inhibitor peptides; 6. Drug resistance; 7. Design and folding of dimeric proteins; 8. Conclusions
1. Introduction; 2. Simulation methods; 2.1 Monte Carlo simulations: the energy model; 2.2. Monte Carlo simulations: simulated tempering dynamics; 2.3. Binding free energy calculations; 3. Results and discussion
Proteins are molecular machines which play a vital role in life. The study of proteins has proved to be a daunting problem because of its sheer complexity. We discuss how two physics-based ideas, spin glasses and the phase behavior of a compact flexible tube, are useful for the development of a framework for understanding proteins. The tube picture provides a simple explanation for how geometry and symmetry determine the menu of possible native state folds of proteins, whereas the spin glass paradigm is useful for determining how one of the folds from this predetermined menu is selected as the native-state structure of a given protein sequence.
1. Introduction; 2. Laser tweezers; 3. Synthesis of molecular constructs for use in mechanical manipulation studies; 4. Mechanical manipulation of single RNase H molecules by laser tweezers; 5. Conclusion
1. History of the hierarchical view of folding; 2. The folding of the SH3 domain: A computational model; 3. Equilibrium thermodynamics; 4. Folding kinetics; 5. The uneven distribution of energy; 6. Inhibition of the folding of SH3; 7. Conclusions
1. Introduction; 2. Methods; 3. Results; 3.1. Solvation; 3.2. Association; 4. Conclusions
In this report, we demonstrate the use of semiempirical quantum mechanics (QM) and molecular dynamics (MD) simulations in conjunction with the Fröhlich-Kirkwood theory to calculate the dielectric permittivity of proteins. The proteins staphylococcal nuclease and T4 lysozyme were examined in order to investigate the structural basis of the macroscopic dielectric permittivity from microscopic simulations. Use of QM allowed a realistic representation of the electronic polarizability of the proteins, which is otherwise inaccessible because of the fixed point charge models used in the classical force fields typically employed to study proteins. The key findings of this study were that the dielectric permittivity is not a constant but varies with the region of the protein and with its structural and electronic features. It is highest in the surface and boundary regions and drops off sharply towards the interior of the protein. Electronic polarization, whether due to the solvent or to the protein environment, significantly influences the permittivity.
1. Introduction; 2. Calculation of electrostatic potentials in biomolecular systems; 3. Examples of the use of electrostatic potentials in biomolecular systems; 4. Comparison of protein electrostatic potentials: Protein Interaction Property Similarity Analysis (PIPSA); 5. Electrostatic potentials in the ubiquitin and ubiquitin like systems; 6. High-throughput modelling of protein electrostatic potentials; 7. Conclusions
1. Introduction; 2. Materials and methods; 2.1. Structural classification of protein tyrosine kinases; 2.2. A hierarchical model of biomolecular recognition; 2.3. Monte Carlo binding simulations: simulated tempering dynamics; 3. Results and discussion; 4. Conclusions
1. Materials and methods; 1.1. Peptide molecular-dynamics simulations; 1.2. Docking procedure; 1.3. Pharmacophore generation; 2. Results; 2.1. Shepherdin and Shepherdin RV; 2.2. Simulations of Shepherdin-RV mutants; 2.3. Shepherdin[79-83]; 2.4. Characterization of Hsp90/shepherdin binding interface; 2.5. Pharmacophoric hypotheses and small molecule identification; 3. Discussion
1. Energy landscapes and molecular recognition; 2. Binding hot spots and convergent solutions at protein-protein interfaces; 3. Targeting P53-MDM2 interfaces with molecular modulators; 4. Materials and methods; 4.1. Structural analysis; 4.2. Hierarchical models of biomolecular recognition; 5. Results and discussion; 5.1. Conformational landscape of MDM2 and specific binding with small molecular mimics; 5.2. The energy landscape analysis of a hot spot at the consensus binding site of the constant fragment (Fc) of human immunoglobulin G.
Bioinformatics and MD approaches are highly complementary tools for structural predictions. We have used these two approaches together to investigate structural and functional properties of tetrameric voltage-gated ion channels for which structural information is completely (CNG channels) or partially (HCN channels) lacking. Besides their predictive power, MD simulations have provided insights into the functional properties of HCN channels. In addition, bioinformatics has allowed us to extend many of our findings obtained for a protein from one specific organism to the entire class of HCN and CNG channels, and MD has allowed us to extend them to other proteins featuring the same fold but with a totally different function.
1. Introduction; 1.1. The Michaelis-Menten framework; 1.2. Spectrophotometric assay; 1.3. Circular Dichroism; 2. Materials and methods; 3. Discussion; 4. Conclusions
1. Introduction; 2. Experimental strategy; 3. Results; 4. Conclusions