The goal of this project is to develop technology for analyzing complex structured databases, particularly in biological domains, in order to find useful patterns and regularities in them. The work is motivated by two challenge problems from molecular biology: gene expression data and tuberculosis data. Gene expression data are key to understanding the function of certain genes in an organism and, in turn, how those genes manifest themselves as properties of the organism. Tuberculosis data are useful for understanding infection patterns and, eventually, for controlling the spread of TB. Similar data can provide deep scientific understanding of many other biological processes. This project will develop languages for statistical modeling of biological processes, techniques for learning the models from data, and algorithms for reasoning with the resulting models. The project will be based on probabilistic relational models, an extension of Bayesian networks recently developed by the Principal Investigator.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Application #: 0082554
Program Officer: Kevin L. Thompson
Project Start:
Project End:
Budget Start: 2000-09-01
Budget End: 2003-12-31
Support Year:
Fiscal Year: 2000
Total Cost: $494,034
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Palo Alto
State: CA
Country: United States
Zip Code: 94304