The project will provide new technology to accelerate the discovery of protein structures and assemblies from humans and other species. Knowing the 3D structures of proteins has important biomedical implications for the development of protein-based therapies and targeted therapeutic drugs, but the 3D structures of thousands of important protein types remain unsolved. The project aims to close this gap, based on two recent advances: (1) the rapid development of new DNA sequencing technologies and (2) a recent breakthrough in protein 3D structure prediction using statistical physics and biomolecular computation. The new structure prediction method, developed by the applicant team, extracts evolutionary residue-residue couplings from multiple sequence alignments, using a maximum entropy method. The team will use the evolutionary couplings as distance constraints to predict the structures of many single domains, multidomain proteins and protein complexes, and to map functional sites on known and predicted structures, with potentially broad impact on diverse biological research areas. The team will also aim to aid the development of hybrid computational-experimental technologies for structure determination. For X-ray crystallography, the aim is to bridge the gap between the predicted 3D structures and the basin of convergence for molecular replacement, allowing structure determination from a single native dataset without the need for anomalous or derivative diffraction datasets. For NMR, the aim is to add evolutionary couplings to NMR-derived backbone and residue-residue distance information and thus reduce experimental effort and/or permit the solution of larger structures. The project is a close collaboration between the Computational Biology Program at Memorial Sloan-Kettering Cancer Center and the Department of Systems Biology at Harvard Medical School. Experimental collaborations with PSI:Biology centers and the international structural genomics effort will aim to implement a more efficient technology for the determination of biomedically relevant protein structures.
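To make the idea of extracting residue-residue couplings from a multiple sequence alignment more concrete, the following is a minimal illustrative sketch, not the applicant team's implementation. It assumes a mean-field approximation to the maximum entropy (Potts) model, in which couplings are estimated by inverting the pseudocount-regularized covariance matrix of the alignment columns. The function names, the toy alignment and the pseudocount value of 0.5 are hypothetical choices made for illustration. Steps that a practical pipeline would likely need, such as sequence reweighting and score correction, are omitted.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus the gap symbol
Q = len(ALPHABET)


def one_hot(msa):
    """Encode equal-length aligned sequences as an (N, L, Q) one-hot array."""
    index = {ch: k for k, ch in enumerate(ALPHABET)}
    x = np.zeros((len(msa), len(msa[0]), Q))
    for s, seq in enumerate(msa):
        for i, ch in enumerate(seq):
            x[s, i, index[ch]] = 1.0
    return x


def mean_field_couplings(msa, pseudocount=0.5):
    """Return an (L, L) matrix of coupling strengths between alignment columns.

    Hypothetical helper: mean-field inversion of the column covariance matrix,
    summarized per column pair by the Frobenius norm of the coupling block.
    """
    x = one_hot(msa)
    n, length, q = x.shape

    # Single-column and pairwise frequencies, mixed with a uniform pseudocount
    # so that the covariance matrix below is invertible.
    fi = (1 - pseudocount) * x.mean(axis=0) + pseudocount / q
    fij = np.einsum("sia,sjb->iajb", x, x) / n
    fij = (1 - pseudocount) * fij + pseudocount / q**2
    for i in range(length):
        # Within one column, two different states never co-occur.
        fij[i, :, i, :] = np.diag(fi[i])

    # Connected correlations; the last state of each column is dropped
    # because the frequencies in a column sum to one.
    c = fij - fi[:, :, None, None] * fi[None, None, :, :]
    c = c[:, : q - 1, :, : q - 1].reshape(length * (q - 1), length * (q - 1))

    # Mean-field estimate: couplings are minus the inverse covariance.
    j = -np.linalg.inv(c).reshape(length, q - 1, length, q - 1)

    scores = np.sqrt((j**2).sum(axis=(1, 3)))
    np.fill_diagonal(scores, 0.0)
    return scores


# Toy alignment (invented): columns 0 and 3 covary (A with D, K with E),
# while columns 1 and 2 vary independently of everything else.
toy_msa = ["AGLD", "AGVD", "ASLD", "ASVD", "KGLE", "KGVE", "KSLE", "KSVE"]
scores = mean_field_couplings(toy_msa)

# List column pairs from strongest to weakest estimated coupling.
pairs = [(i, j) for i in range(len(scores)) for j in range(i + 1, len(scores))]
for i, j in sorted(pairs, key=lambda p: -scores[p]):
    print(f"columns {i}-{j}: {scores[i, j]:.3f}")
```

In a structure prediction setting, the highest-scoring column pairs would then be treated as candidate residue-residue contacts and passed on as distance constraints to a folding or docking procedure.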

Public Health Relevance

Knowing the 3D structures of proteins has important biomedical implications for the development of protein-based therapies and targeted therapeutic drugs. Currently, the 3D structures of thousands of important protein types remain unsolved. By combining advances in genomic sequencing with statistical physics and biomolecular computation, the project will provide new technology to accelerate the discovery of protein structures from humans, as well as from pathogenic microorganisms such as bacteria, viruses and fungi.

Agency
National Institutes of Health (NIH)
Type
Research Project (R01)
Project #
5R01GM106303-02
Application #
8658439
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Edmonds, Charles G
Project Start
Project End
Budget Start
Budget End
Support Year
2
Fiscal Year
2014
Total Cost
Indirect Cost
City
New York
State
NY
Country
United States
Zip Code
10065
Michel, Mirco; Hayat, Sikander; Skwark, Marcin J et al. (2014) PconsFold: improved contact predictions improve protein models. Bioinformatics 30:i482-8
Hopf, Thomas A; Schärfe, Charlotta P I; Rodrigues, João P G L M et al. (2014) Sequence co-evolution gives 3D contacts and structures of protein complexes. Elife 3: