Nonparametric Bayesian Approaches to Modeling Protein Structure

Dahl, David

Abstract

This proposal's objective is to develop a new class of statistical models to advance scientific knowledge of protein tertiary structure and to extend template-based modeling to protein loop regions. As an advance in basic science, improved modeling of protein structure will broadly impact biomedical fields. The following specific aims will be accomplished.
The first aim (Random Partition Models Indexed by Pairwise Information) is to develop probability models for partitions that are explicitly non-exchangeable, utilizing available pairwise information to influence the clustering of data. Four distributions are proposed, each using the pairwise information by modifying identities from the Chinese Restaurant Process, a popular probability model for clustering. Hierarchical clustering uses pairwise distance, but current methods for protein structure modeling do not. The proposed method provides a means to incorporate this type of information into Bayesian nonparametric models for protein structure.
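To make the idea of the first aim concrete, the following is a minimal sketch of one simple way pairwise information could influence a Chinese Restaurant Process style allocation. It is an illustration under stated assumptions, not one of the four distributions proposed here: the function name `sample_partition`, the `similarity` matrix, and the specific weighting rule are all hypothetical. Items are seated one at a time; a new cluster is opened with weight `alpha` as in the standard process, while the weight for joining an existing cluster is the item's share of similarity to that cluster's members rather than the cluster's size alone.

```python
import random


def sample_partition(similarity, alpha=1.0, seed=None):
    """Draw one partition of items 0..n-1 (illustrative sketch only).

    similarity: symmetric n x n list of lists with strictly positive entries
                (assumed input; e.g. a decreasing function of pairwise distance).
    alpha:      mass parameter controlling how readily new clusters open.
    """
    rng = random.Random(seed)
    n = len(similarity)
    clusters = []  # each cluster is a list of item indices

    for t in range(n):
        # Index 0 is "open a new cluster", weighted by alpha as in the CRP.
        weights = [alpha]
        # Total similarity between item t and all previously seated items.
        total_sim = sum(similarity[t][j] for j in range(t))
        for members in clusters:
            # The standard CRP weight would be len(members). Here the total
            # mass t of seated items is split in proportion to similarity.
            share = sum(similarity[t][j] for j in members) / total_sim
            weights.append(t * share)
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == 0:
            clusters.append([t])
        else:
            clusters[choice - 1].append(t)
    return clusters


if __name__ == "__main__":
    # Two tight groups {0, 1, 2} and {3, 4}: high similarity within a group,
    # low similarity across groups (made-up numbers for illustration).
    groups = [0, 0, 0, 1, 1]
    sim = [[1.0 if groups[i] == groups[j] else 0.01 for j in range(5)]
           for i in range(5)]
    print(sample_partition(sim, alpha=0.5, seed=1))
```

When every similarity is equal, each joining weight reduces to the cluster size, so the sketch falls back to the ordinary Chinese Restaurant Process. Because the result depends on the order in which items are seated, the induced distribution is non-exchangeable in general.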
The second aim (Template-Based Modeling of Loop Conformation Space Using Partition Models) applies the proposed random partition models to loop modeling. This proposal will improve our previous estimation approach by accounting for the influences of individual amino acids as well as for influences from neighboring residues. New methods based on the random partition models will provide rigorous statistical modeling at and between residue positions, allowing one to limit and precisely sample the conformational space. This will in turn allow for a clearer understanding of the roles of loops in catalytic sites and protein signaling.
The final aim (New Paradigm for Protein Packing and Higher-Order Structure Using Partition Models) applies the statistical modeling to estimate the propensities of a new model of protein packing called the ball/socket model. Statistical modeling of the amino acid propensities within the ball/socket motifs and between patterns of motifs will allow insights into the rules governing packing, filling a substantial gap in current understanding of protein structure. The statistical model estimating these propensities will exploit the known pairwise information by using the proposed random partition models. Such analysis is currently not available to the scientific community.

Public Health Relevance

More accurate modeling of protein structure from sequence will greatly aid the biomedical community in better understanding disease states. Moreover, producing accurate models of protein structure directly from sequence leverages the vast amounts of genetic information produced by the many genome projects. Accurate protein structure modeling also informs drug discovery by prioritizing targets.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM104972-04
Application #: 8839256
Study Section: Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer: Wehrle, Janna P

Project Start: 2012-07-01
Project End: 2016-04-30
Budget Start: 2015-05-01
Budget End: 2016-04-30
Support Year: 4
Fiscal Year: 2015
Total Cost: $350,928
Indirect Cost: $31,548

Institution

Name: Brigham Young University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 009094012

City: Provo
State: UT
Country: United States
Zip Code: 84602

Related projects


NIH 2015 R01 GM	Nonparametric Bayesian Approaches to Modeling Protein Structure Dahl, David B. / Brigham Young University	$350,928
NIH 2014 R01 GM	Nonparametric Bayesian Approaches to Modeling Protein Structure Dahl, David B. / Brigham Young University	$350,928
NIH 2013 R01 GM	Nonparametric Bayesian Approaches to Modeling Protein Structure Dahl, David B. / Brigham Young University	$338,645
NIH 2012 R01 GM	Nonparametric Bayesian Approaches to Modeling Protein Structure Dahl, David B. / Brigham Young University	$350,796

Publications

Dahl, David B; Day, Ryan; Tsai, Jerry W (2017) Random Partition Distribution Indexed by Pairwise Information. J Am Stat Assoc 112:721-732

Li, Qiwei; Dahl, David B; Vannucci, Marina et al. (2016) KScons: a Bayesian approach for protein residue contact prediction using the knob-socket model of protein tertiary structure. Bioinformatics 32:3774-3781

Fraga, Keith J; Joo, Hyun; Tsai, Jerry (2016) An amino acid code to define a protein's tertiary packing surface. Proteins 84:201-16

Joo, Hyun; Chavan, Archana G; Fraga, Keith J et al. (2015) An amino acid code for irregular and mixed protein packing. Proteins 83:2147-61

Li, Qiwei; Dahl, David B; Vannucci, Marina et al. (2014) Bayesian model of protein primary sequence for secondary structure prediction. PLoS One 9:e109832

Joo, Hyun; Tsai, Jerry (2014) An amino acid code for β-sheet packing structure. Proteins 82:2128-40

Day, Ryan; Joo, Hyun; Chavan, Archana C et al. (2013) Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. Comput Biol Chem 42:40-8
