Computational Approaches to Protein Structure Prediction

Goldstein, Richard

Abstract

The large and growing databases of known DNA sequences represent a knowledge base with the power to revolutionize biology, biochemistry, and biotechnology. Without knowledge of a protein's folded conformation, however, it is difficult to answer the most basic questions of what the protein does and how it does it, let alone develop a rational approach to drug design. While the determination of the protein sequence is relatively straightforward, the experimental determination of the protein structure using X-ray crystallography or multidimensional NMR is complicated, time-consuming, and uncertain. In spite of decades of theoretical, computational, and experimental effort, we can neither understand how the protein is able to find its final folded state nor predict a priori what this final state will be. Even partial successes in these endeavors, such as the prediction of a limited set of biologically important proteins, would be highly significant. Energy functions will be developed and optimized for the prediction of the structure of a set of training proteins, using a criterion based on a Bayesian analysis. Generalization to proteins not included in the training set will allow the prediction of tertiary structure of proteins of biochemical interest, such as the VHR protein, a member of the family of protein phosphatases with dual specificity for tyrosine and serine. As the Bayesian optimization strategy is a general approach, the use of optimized energy functions represents a powerful and flexible way to combine traditional physicochemical interactions with other interactions that may not have a purely physicochemical interpretation, such as those based on information obtained through the analysis of evolutionary patterns or experimental observations, and those whose purpose is to restrict the conformation space to be searched.
The Bayesian approach will also be used to address an important but conceptually simpler problem, the cost function for the optimal alignment of two sequences, providing a testbed for exploring optimization strategies. By altering the energetics and dynamics, it will be possible to explore various models of protein folding, in an attempt to ascertain the circumstances under which different behaviors are observed, and what consequences of the models might be experimentally observable.
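
For readers unfamiliar with the alignment problem mentioned above, the following is a minimal sketch of how a cost function enters an optimal pairwise alignment. It uses the standard Needleman-Wunsch dynamic program with a linear gap penalty, which is an assumption made for illustration and not a description of the project's own method. The names `alignment_score` and `simple_sub`, and the match, mismatch, and gap values, are hypothetical; in an optimization setting such as the one proposed, the substitution scores and gap penalty would be the parameters being tuned.

```python
from typing import Callable


def alignment_score(a: str, b: str,
                    sub: Callable[[str, str], float],
                    gap: float) -> float:
    """Best global alignment score of a and b (Needleman-Wunsch).

    sub(x, y) scores aligning residue x with residue y; gap is the
    (usually negative) score added for each gapped position.
    """
    m = len(b)
    # Row 0: aligning an empty prefix of a against j residues of b.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = max(
                prev[j - 1] + sub(a[i - 1], b[j - 1]),  # align two residues
                prev[j] + gap,                          # gap in b
                curr[j - 1] + gap,                      # gap in a
            )
        prev = curr
    return prev[m]


def simple_sub(x: str, y: str) -> float:
    """Illustrative cost function: +1 for a match, -1 for a mismatch."""
    return 1.0 if x == y else -1.0


score = alignment_score("AC", "AGC", simple_sub, gap=-2.0)
```

Changing `simple_sub` or `gap` changes which alignment is optimal, which is why the choice of cost function is itself a target for optimization.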

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: First Independent Research Support & Transition (FIRST) Awards (R29)
Project #: 1R29LM005770-01
Application #: 2238157
Study Section: Biomedical Library and Informatics Review Committee (BLR)

Project Start: 1995-04-01
Project End: 2000-03-31
Budget Start: 1995-04-01
Budget End: 1996-03-31
Support Year: 1
Fiscal Year: 1995
Total Cost
Indirect Cost

Institution

Name: University of Michigan Ann Arbor
Department: Chemistry
Type: Schools of Arts and Sciences
DUNS #: 791277940

City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109

Related projects


NIH 1999 R29 LM	Computational Approaches to Protein Structure Prediction Goldstein, Richard A. / University of Michigan Ann Arbor
NIH 1998 R29 LM	Computational Approaches to Protein Structure Prediction Goldstein, Richard A. / University of Michigan Ann Arbor
NIH 1997 R29 LM	Computational Approaches to Protein Structure Prediction Goldstein, Richard A. / University of Michigan Ann Arbor
NIH 1996 R29 LM	Computational Approaches to Protein Structure Prediction Goldstein, Richard A. / University of Michigan Ann Arbor
NIH 1995 R29 LM	Computational Approaches to Protein Structure Prediction Goldstein, Richard A. / University of Michigan Ann Arbor

Publications

Koshi, J M; Goldstein, R A (1996) Probabilistic reconstruction of ancestral protein sequences. J Mol Evol 42:313-20

Govindarajan, S; Goldstein, R A (1996) Why are some protein structures so common? Proc Natl Acad Sci U S A 93:3341-5

Thompson, M J; Goldstein, R A (1996) Constructing amino acid residue substitution classes maximally indicative of local protein structure. Proteins 25:28-37

Thompson, M J; Goldstein, R A (1996) Predicting solvent accessibility: higher accuracy using Bayesian statistics and optimized residue substitution classes. Proteins 25:38-47

Koshi, J M; Goldstein, R A (1995) Context-dependent optimal substitution matrices. Protein Eng 8:641-5
