This award is for an interdisciplinary project to develop efficient computational methods and tools for determining protein structures from 3D images in the presence of errors. The cryo-electron microscopy (cryo-EM) technique has significantly enhanced our understanding of the machinery of large molecular assemblies. This technique allows the visualization of tens to over a thousand interacting molecules in an assembly and hence provides direct insights into important biological functions and into the mechanisms that cause abnormal function. The Electron Microscopy Data Bank (EMDB) archives density images of molecular assemblies at resolutions ranging from 2 Å to 100 Å, and the rapidly accumulating data are a valuable resource for data mining to improve the accuracy of interpretation. The research will concretely demonstrate the applicability of data mining the EMDB and the use of sophisticated graph-algorithmic methods for better protein structure prediction. The computational methods and tools being developed can be used to interpret large molecular assemblies with important biological functions, such as viruses, membrane-bound channels, and cellular machines. Understanding these mechanisms is crucial for designing drugs and vaccines. The data mining and algorithmic techniques developed are likely to apply to other computational problems, especially those that involve searching large spaces and handling data errors. Through the involvement of undergraduate researchers and summer high school students in the project, particularly women and those to be recruited from the African-American community, the project will contribute to the NSF goal of strengthening the pipeline and broadening participation in STEM fields.

The long-term goal of this project is to develop sequential and parallel computational methods and tools to derive atomic structures for EMDB density images in the mid-resolution range of 4-10 Å. Such methods will be generally applicable to determining the structure of proteins for which no existing templates can be identified. Novel ideas will be implemented in two difficult but critical steps of this problem. The first addresses the long-standing challenge of accurately identifying β-strands from density images. The second addresses the problem of finding the correct mapping of secondary structures (helices and β-strands) in a density map to those identified on the protein sequence in the presence of errors, making use of advanced computational methods. This project is a multidisciplinary effort bringing together expertise from computational structural biology, data mining, efficient algorithm design, and parallel computing. The methodology to be developed will be validated using experimentally derived cryo-EM density images and will be incorporated into Chimera, a popular 3D molecular viewer, for the cryo-EM community. The outcome of the project will be made available at the following website: www.cs.odu.edu/~jhe/

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 1356621
Program Officer: Jennifer Weller
Project Start:
Project End:
Budget Start: 2014-07-01
Budget End: 2018-06-30
Support Year:
Fiscal Year: 2013
Total Cost: $589,703
Indirect Cost:
Name: Old Dominion University Research Foundation
Department:
Type:
DUNS #:
City: Norfolk
State: VA
Country: United States
Zip Code: 23508