Proteins are the workhorses of the cell, involved in virtually every process in life. Many proteins are flexible molecules that undergo structural changes as part of their function. In other words, they can assume various possible structures (conformations) via changes that range from small-scale movements to large domain motions. The question of how the structure and dynamics of proteins relate to their function has challenged scientists for several decades and remains largely open. Existing computational methods for simulating protein dynamics can sample atomic-level dynamic processes, yet their usefulness is limited: they require large computational resources, and they can only model interactions that take place on very short time scales (e.g., several hundred nanoseconds). A clearer picture of the connection between protein structure, dynamics and function promises to advance our understanding of how molecular machines work and may aid in drug design and functional analysis.

This work proposes a computational framework for efficient, large-scale exploration of protein conformational changes. Given a protein structure, the aim is to efficiently generate a diverse set of conformations representing the low-energy landscape of this protein under physiological conditions. The suggested methodology can be used to explore the conformational space of proteins and protein complexes and to gain a better understanding of protein dynamics and function.

To overcome the computational demands of a full-scale conformational search, the search will be done in two stages. The first stage is a fast, approximate, geometry-based exploration of the low-energy landscape of proteins and protein complexes, temporarily sacrificing small-scale detail for efficiency. This approximate search is enhanced with a novel biasing scheme that drives the search towards the more flexible regions of the protein, reducing the huge search space to a manageable size. In the second stage, the reduced representation of the conformational landscape will be enhanced and complemented with detailed, physics-based simulations applied to interesting and important regions of the proteins or to intermediate structures. This last stage will take advantage of massively parallel computing. The combination of fast, approximate search techniques and detailed physics-based simulation methods will create an enhanced, more complete picture of the low-energy landscape of those proteins and will improve understanding of how proteins perform their function. The methodology can be applied to problems related to protein interactions and rational drug design.
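
As a rough illustration of the biasing idea only, the sketch below perturbs one backbone torsion angle per step and chooses which residue to perturb with probability proportional to a per-residue flexibility score. Everything here is an assumption made for illustration: the abstract does not say how flexibility is scored, how moves are generated, or how conformations are represented, and the names `pick_residue` and `biased_explore` are hypothetical. The physics-based refinement stage is not shown.

```python
import random


def pick_residue(flexibility, rng):
    """Pick a residue index with probability proportional to its score.

    `flexibility` is a hypothetical list of non-negative scores, one per
    residue; residues with a score of 0 (treated as rigid) are not chosen.
    """
    indices = range(len(flexibility))
    return rng.choices(indices, weights=flexibility, k=1)[0]


def biased_explore(angles, flexibility, n_steps, max_delta=15.0, seed=0):
    """Toy first-stage search: random torsion moves biased to flexible residues.

    `angles` holds one torsion angle (in degrees) per residue. Returns the
    list of visited conformations, starting with the input conformation.
    """
    rng = random.Random(seed)
    current = list(angles)
    conformations = [list(current)]
    for _ in range(n_steps):
        i = pick_residue(flexibility, rng)
        delta = rng.uniform(-max_delta, max_delta)
        # Wrap the perturbed angle back into the range [-180, 180).
        current[i] = (current[i] + delta + 180.0) % 360.0 - 180.0
        conformations.append(list(current))
    return conformations


if __name__ == "__main__":
    # Four residues; only the third is scored as flexible in this toy input.
    start = [0.0, 0.0, 0.0, 0.0]
    scores = [0.0, 0.0, 1.0, 0.0]
    visited = biased_explore(start, scores, n_steps=5)
    for conformation in visited:
        print(conformation)
```

Concentrating moves on the flexible residues is what shrinks the search space in this toy version: rigid residues keep their starting angles, so only a subset of the degrees of freedom is ever sampled.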

The broader impact of this project is partly due to the central role of proteins in virtually every basic biological function. This project addresses a significant question for the biological research community. Educational and outreach activities will be implemented through the following: a) Interdisciplinary collaborations with members of the CS department and other departments in the College of Science and Mathematics at UMass Boston. b) Training and mentoring the research of undergraduate and graduate students, including women and students from under-represented groups in science. c) Helping to set up a Bioinformatics research and teaching program at UMass Boston.

Project Report

Proteins are the workhorses of the cell, involved in virtually every process in life. Many proteins are flexible molecules that undergo structural changes as part of their function. In other words, they can assume various possible structures (conformations) via changes that range from small-scale movements to large domain motions. The question of how the structure and dynamics of proteins relate to their function has challenged scientists for several decades and remains largely open. Existing computational methods for simulating protein dynamics can sample atomic-level dynamic processes, yet their usefulness is limited: they require large computational resources, and they can only model interactions that take place on very short time scales (e.g., several hundred nanoseconds). Understanding the connection between protein structure, dynamics and function can contribute substantially to our understanding of how molecular machines work and may aid in drug design and functional analysis.

In this work we proposed a computational framework for efficient, large-scale exploration of protein conformational changes. Given a protein structure, we aim to efficiently generate a diverse set of conformations representing the low-energy landscape of this protein under physiological conditions. The suggested methodology can be used to explore the conformational space of proteins and protein complexes and to gain a better understanding of protein dynamics and function.

To overcome the computational demands of a full-scale conformational search, we implemented a two-stage search scheme. In the first two years of the award we devised approximate search techniques, enhanced with a novel biasing scheme that drives the search towards the more flexible regions of the protein, reducing the huge search space to a manageable size. We used graph-based rigidity analysis techniques to reliably model protein flexibility. The second stage, detailed physics-based simulation, takes advantage of massively parallel computing. The combination of fast, approximate search techniques and detailed physics-based simulation methods helps create an enhanced, more complete picture of the low-energy landscape of those proteins. Starting from the second year, we used clustering and filtering techniques to help discover low-energy regions in the protein conformational landscape that may correspond to interesting intermediate regions. Lastly, we developed machine-learning-based methods that used rigidity analysis and evolutionary information to further explore protein conformational preferences, with applications to folding and protein-protein interactions.

The broader impact of this project is partly due to the central role of proteins in virtually every basic biological function. This project addresses a significant question for the biological research community. Educational and outreach activities are implemented through the following: a) Interdisciplinary collaborations with members of the College of Science and Mathematics at UMass Boston and with researchers from other universities. b) Training and mentoring the research of undergraduate and graduate students, including women and students from under-represented groups in science: a female PhD student, who recently graduated, was fully funded by the grant, and another PhD student was partially funded. c) Helping to set up a Bioinformatics research and teaching program at UMass Boston: I am now offering a graduate Algorithms in Bioinformatics course (CS612), initially as a special topics course and, starting in Spring 2012, as a regular course. This is the only bioinformatics course in the Computer Science department and one of the few in the university.

Dissemination: During the funding period we published 3 peer-reviewed journal papers, 9 peer-reviewed conference papers and presentations, and several conference abstracts.

Equipment: The grant money was used to purchase a computational cluster for UMass Boston. Part of the research was performed on this cluster.

Budget Start: 2011-09-01
Budget End: 2014-12-31
Fiscal Year: 2011
Total Cost: $249,774
Name: University of Massachusetts Boston
City: Dorchester
State: MA
Country: United States
Zip Code: 02125