Large and rapidly growing sequence and structural databases provide a vast new resource for the biomedical sciences. The usefulness of computational approaches for extracting information from these databases to address some of the most difficult and important problems in molecular and structural biology has become increasingly apparent. However, these data often have several characteristics that are well known to make them resistant to analysis, including missing data, likelihood or posterior surfaces with multiple local extrema, and the need to control the dimension of the models used to describe these complex data. Progress has been made on some of these issues, most notably the missing data problem, through the use of Bayesian recursive algorithms, expectation-maximization (EM) algorithms, hidden Markov models (HMMs), and Markov chain Monte Carlo (MCMC) sampling algorithms. However, the other issues remain largely unsolved. Recent advances in MCMC technology have opened up fresh approaches to these difficult data analysis problems. Specifically, the recent emergence of multiscale MCMC algorithms, which are effective in identifying optima in rough landscapes, and the development of reversible jump MCMC algorithms for inference on the dimension of a problem have initiated changes in this arena. In the last few months, a class of multistage MCMC algorithms called simulated sintering, which permits Bayesian inference on rough landscapes, including those inherent in many reversible jump algorithms, has presented an opportunity for a breakthrough on these very difficult data analysis challenges.
The aims of this research are to explore the development, adaptation, and application of these methods to some of the grand challenges of molecular and structural biology.

Agency: National Institutes of Health (NIH)
Institute: National Center for Research Resources (NCRR)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21RR014036-02
Application #: 6188486
Study Section: Special Emphasis Panel (ZRR1-BT-4 (01))
Program Officer: Marron, Michael T
Project Start: 1999-05-01
Project End: 2002-04-30
Budget Start: 2000-05-01
Budget End: 2002-04-30
Support Year: 2
Fiscal Year: 2000
Total Cost: $93,125
Indirect Cost:
Name: Wadsworth Center
Department:
Type:
DUNS #: 153695478
City: Menands
State: NY
Country: United States
Zip Code: 12204
McCue, Lee Ann; Thompson, William; Carmack, C Steven et al. (2002) Factors influencing the identification of transcription factor binding sites by cross-species comparison. Genome Res 12:1523-32
Webb, Bobbie-Jo M; Liu, Jun S; Lawrence, Charles E (2002) BALSA: Bayesian algorithm for local sequence alignment. Nucleic Acids Res 30:1268-77
McCue, L; Thompson, W; Carmack, C et al. (2001) Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes. Nucleic Acids Res 29:774-82