The goal of this project is to automate and improve the prediction of protein structure and function based on multiple sequence data.
Specific aims are to: 1) partition the entire sequence database into a comprehensive set of homologous protein domains; 2) design advanced, family-specific multiple alignment models of these domains; 3) develop statistical procedures for estimating the significance of sequence-to-model similarity scores; 4) devise corresponding alignment optimization procedures; and 5) develop tools for predicting aspects of protein structure and function based on these alignments. Methods include Gibbs sampling and hidden Markov model procedures for multiple sequence alignment, structural threading methods, dynamic programming and BLAST-like database search procedures, and other statistical and algorithmic methods. A comprehensive database of protein domains will be made available to the biomedical community through the National Center for Biotechnology Information.
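As an illustration of the dynamic programming methods named above, the following is a minimal sketch of global pairwise alignment scoring (the Needleman-Wunsch recurrence). It is not the project's own procedure: the function name, the match/mismatch scores, and the linear gap penalty are assumptions chosen for illustration, and the family-specific models described in the aims would use position-specific scores instead.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not taken from the project.
    Only two rows of the dynamic programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of a against each prefix of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap, or b[j-1] with a gap.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Under these assumed scores, the best alignment of the two sequences in the usage line pairs three identical residues and opens one gap.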

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Research Project (R01)
Project #
1R01LM006747-01
Application #
2740163
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Bean, Carol A
Project Start
1998-09-30
Project End
2001-08-31
Budget Start
1998-09-30
Budget End
1999-08-31
Support Year
1
Fiscal Year
1998
Name
Cold Spring Harbor Laboratory
DUNS #
065968786
City
Cold Spring Harbor
State
NY
Country
United States
Zip Code
11724