This project will connect information about the DNA sequences of genes to the function of the proteins those genes encode. Structural information on proteins and on how proteins interact is essential for understanding the effects of genetic variation on the function of cells. Large DNA sequencing projects have produced huge amounts of information, for all types of organisms, on DNA changes that alter the building blocks of proteins (the amino acids), but understanding of how these changes affect protein and cell function lags far behind. This project will provide new and improved methods for determining how proteins fold and how they interact with other proteins, which affects how cells behave. The research will allow researchers to connect structural properties of all the proteins occurring in a single organism to changes in the appearance, behavior, and environmental responses of that organism. The results will be published through Web-available databases, making the total effect of structural and interaction changes on the organism available to other researchers and the public. The long-term research goals are to understand the molecular properties that drive life processes and, in biotechnology applications, to select changes that yield the intended product with the expected function. The research outcomes will be included in graduate and undergraduate courses. Minority students, women, undergraduates (including those at primarily teaching institutions), and high-school students will be involved in different parts of the research. The project participants will present their results at scientific conferences.

The project will result in an integrated approach for large-scale prediction of protein structures and their association. A database of predicted structures and complexes for model organisms will be established, onto which genetic variants will be mapped and their phenotypic effects assessed. The research objectives are: (1) to develop high-throughput structure-based methods to predict interactions of experimentally determined and modeled proteins; (2) to develop advanced methodology for high-throughput modeling of individual proteins; (3) to generate a genome-wide database of protein structures and protein-protein complexes for model organisms; and (4) to assess phenotypic effects of genetic variation. Approaches will be developed to discriminate non-interacting from interacting proteins, and to model the structures of protein-protein complexes, based on similarity to experimentally determined protein-protein complexes and on properties of the intermolecular energy landscape. A novel approach to fold detection will extend the number of proteins that can be modeled. A pipeline will be developed to integrate protein structure prediction with the prediction of protein complexes, and to use structure-based approaches to predict the effects of single amino acid variants (SAVs). This collaborative project combines highly complementary areas of expertise: the US team's in high-throughput modeling of protein interactions, and the UK team's in protein structure prediction and SAV effects. The results of the project will be available at http://vakser.compbio.ku.edu and www.sbg.bio.ic.ac.uk.

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Type: Standard Grant (Standard)
Application #: 1565107
Program Officer: Peter McCartney
Project Start:
Project End:
Budget Start: 2016-09-01
Budget End: 2020-08-31
Support Year:
Fiscal Year: 2015
Total Cost: $873,278
Indirect Cost:
Name: University of Kansas
Department:
Type:
DUNS #:
City: Lawrence
State: KS
Country: United States
Zip Code: 66045