A basic tenet of the protein folding problem is that the information contained in the amino acid sequence is sufficient to dictate the three-dimensional, folded structure of a protein. The goal of the present study is to understand and quantify this idea using techniques of information and complexity theory. Potential applications of these approaches to protein structure determination and prediction will also be explored. From an information-theoretic point of view, protein folding can be envisioned as a communication process by which the sequence information is transmitted to the three-dimensional structure. There are a number of questions one can ask regarding such information transfer. How much information is transferred from sequence to structure? How redundant is the information? Is information transfer via protein folding a "noisy" or a "noiseless" communication channel? These questions are approached by taking advantage of recent advances in our understanding of the relationship between thermodynamic entropy, information entropy and algorithmic complexity. The information content of the sequence is determined from the information entropy, and the content of the three-dimensional structure is related to the algorithmic complexity. Algorithmic complexity measures the length of the shortest computational representation of a structure. With these quantities, the information content (or data compression) of sequence and structural data will be determined. Using maximum entropy techniques, the shared or mutual information between sequence and structure will also be determined. Knowledge of this shared information will be used to develop models of the "communication channel" of protein folding. This approach can also be used to quantitatively compare structure prediction algorithms. A long-term goal is to incorporate this shared information into a maximum entropy algorithm for X-ray and NMR structure determination.
This approach will also provide an algorithm for determining structures by jointly optimizing X-ray and NMR data.
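
The three quantities named in the abstract (information entropy of a sequence, a compression-based stand-in for algorithmic complexity, and mutual information between sequence and structure) can be illustrated with a minimal Python sketch. It is not the project's method: the function names, the toy sequence, and the three-letter secondary-structure string are hypothetical, the estimators are simple frequency counts rather than the maximum entropy techniques the abstract proposes, and zlib compression gives only a rough upper bound on algorithmic complexity, which is not computable in general.

```python
import math
import zlib
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits per symbol, estimated from observed frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(text):
    """Compressed size divided by raw size.

    Assumption: zlib is used as a crude, computable upper-bound proxy for
    algorithmic complexity. On very short strings the ratio can exceed 1
    because of the fixed header overhead.
    """
    raw = text.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    joint = list(zip(xs, ys))
    return shannon_entropy(xs) + shannon_entropy(ys) - shannon_entropy(joint)


if __name__ == "__main__":
    # Hypothetical toy data: a 22-residue sequence and a per-residue
    # secondary-structure label (H = helix, C = coil).
    sequence = "MKTAYIAKQRQISFVKSHFSRQ"
    structure = "CC" + "H" * 16 + "CCCC"

    print("sequence entropy (bits/residue):", shannon_entropy(sequence))
    print("structure entropy (bits/label):", shannon_entropy(structure))
    print("sequence compression ratio:", compression_ratio(sequence))
    print("mutual information (bits):", mutual_information(sequence, structure))
```

On data this small the plug-in mutual information estimate is strongly biased upward, so a real analysis would need a large set of aligned sequence and structure data and a bias correction.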

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Academic Research Enhancement Awards (AREA) (R15)
Project #
1R15GM055910-01
Application #
2024459
Study Section
Molecular and Cellular Biophysics Study Section (BBCA)
Project Start
1997-06-01
Project End
1999-08-31
Budget Start
1997-06-01
Budget End
1999-08-31
Support Year
1
Fiscal Year
1997
Name
University of Denver
Department
Chemistry
Type
Schools of Arts and Sciences
City
Denver
State
CO
Country
United States
Zip Code
80208
Dewey, T G (2001) A sequence alignment algorithm with an arbitrary gap penalty function. J Comput Biol 8:177-90
Dewey, T G (2000) Information dynamics of in vitro selection-amplification systems. Pac Symp Biocomput :602-13
Dewey, T G (1999) Statistical mechanics of protein sequences. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics 60:4652-8
Dewey, T G; Delle Donne, M (1998) Non-equilibrium thermodynamics of molecular evolution. J Theor Biol 193:593-9