This research develops new probabilistic and Bayesian statistical models for the analysis of three-dimensional structures of biological macromolecules. A secondary focus is the development of statistical methods for characterizing the biomechanical behavior of molecules under applied forces, such as those that arise in the study of cellular proteins (e.g., molecular motors) and in problems of bioengineering (e.g., nano-scale sensing) and materials science (e.g., smart materials and biologically inspired materials). The emphasis is on comparative modeling and classification of protein structures and protein force-response curves. Stochastic models of three-dimensional shapes are explored, motivated by the difficulty of quantitatively comparing complex structural information about proteins. Bayesian techniques are developed for unsolved problems of matching unlabeled point sets, extending techniques from statistical shape analysis. Computational algorithms that make these techniques practical for high-throughput searching by biomolecular scientists are also a major focus, and several directions are developed involving exact algorithms, Monte Carlo sampling techniques, and rapid approximations based on geometric algorithms. A second component of this research involves Bayesian models for errors-in-variables regression and functional data analysis techniques for force-extension curves measured on single molecules. Bayesian hierarchical random-effects models for functional data analysis are developed, along with efficient algorithms for the associated Bayesian computations.

This statistical methodology research is broadly applicable, but all of the work is motivated by and directly relevant to significant interdisciplinary applications in biomedical science and biomolecular engineering. The research advances stochastic modeling for applications in computational biology, bioinformatics, and computational chemistry, and will provide unified theoretical frameworks as well as practical data analysis techniques and software tools for scientific researchers and engineers in these areas. Such advances will be important to statisticians working in this area, as well as to domain scientists in need of new methodology and tools for data analysis and predictive modeling.

Agency
National Science Foundation (NSF)
Institute
Division of Mathematical Sciences (DMS)
Type
Standard Grant (Standard)
Application #
0605141
Program Officer
Gabor J. Szekely
Project Start
Project End
Budget Start
2006-09-01
Budget End
2007-08-31
Support Year
Fiscal Year
2006
Total Cost
$39,870
Indirect Cost
Name
Duke University
Department
Type
DUNS #
City
Durham
State
NC
Country
United States
Zip Code
27705