Advancing knowledge in the biological sciences involves experimentally testing hypotheses and interpreting the results based on prior scientific work. To generate a valid hypothesis, researchers must collect, evaluate and integrate large amounts of different kinds of information about organisms, cells, genes and proteins. Once a hypothesis is generated, the challenge is to evaluate it with respect to what is already known.

Our proposed research will:

1. Test the scalability and extensibility of a novel computer system that allows biologists to construct and evaluate alternative hypotheses against a knowledge base on the yeast Saccharomyces cerevisiae.

2. Test a knowledge archive that supports the archiving and search of validated hypotheses.

In our work, we define a hypothesis as a statement about relationships between components of a biological system that is intended to explain experimental observations. A set of validated hypotheses can be used as building blocks to construct larger, more complex models such as pathways. We plan to develop and test a new paradigm: hypothesis-driven querying of model organism knowledgebases.

The proposed work, called HyQue (for Hypothesis-based Querying of pathway models), will take as input working hypotheses about pathway models expressed in a knowledge-based formalism, evaluate their consistency using existing data in a knowledgebase, and provide as output contradictory evidence and suggestions for improving hypotheses. HyQue will incorporate formal knowledge representations based upon Semantic Web standards and an ontology to represent biological objects and relationships. HyQue will also contain a library of rules that determine counts of support and contradiction for a given hypothesis.
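The evaluation step described above can be illustrated with a minimal sketch. It is not the HyQue implementation: it assumes that a hypothesis and the knowledgebase are both reduced to simple (subject, relation, object) triples, and that a single hypothetical rule treats "activates" and "inhibits" as mutually exclusive. The names `evaluate`, `Evaluation`, `OPPOSITES` and the placeholder proteins are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass

# A statement is a (subject, relation, object) triple, loosely mirroring
# the triple model used by Semantic Web standards such as RDF.
Triple = tuple[str, str, str]

# Hypothetical rule: these relation pairs are treated as mutually exclusive.
OPPOSITES = {"activates": "inhibits", "inhibits": "activates"}


@dataclass
class Evaluation:
    support: int                      # hypothesis statements found in the knowledgebase
    contradiction: int                # hypothesis statements opposed by a known fact
    contradicting_facts: list[Triple]  # the known facts that oppose the hypothesis


def evaluate(hypothesis: list[Triple], knowledge_base: set[Triple]) -> Evaluation:
    """Count supporting and contradicting evidence for each hypothesis statement."""
    support = 0
    contradicting: list[Triple] = []
    for subj, rel, obj in hypothesis:
        if (subj, rel, obj) in knowledge_base:
            support += 1
        opposite = OPPOSITES.get(rel)
        if opposite is not None and (subj, opposite, obj) in knowledge_base:
            contradicting.append((subj, opposite, obj))
    return Evaluation(support, len(contradicting), contradicting)


# Placeholder data only; these are not real biological facts.
KNOWLEDGE_BASE = {
    ("proteinA", "activates", "proteinB"),
    ("proteinB", "inhibits", "proteinC"),
}
HYPOTHESIS = [
    ("proteinA", "activates", "proteinB"),
    ("proteinB", "activates", "proteinC"),
]

result = evaluate(HYPOTHESIS, KNOWLEDGE_BASE)
print(result.support, result.contradiction, result.contradicting_facts)
```

In this toy case the first statement is supported and the second is contradicted, so the returned contradicting fact is the kind of output that could prompt a user to revise the hypothesis. A real system would draw its facts from a curated source and use a much richer rule library.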

We will prototype an archive of hypotheses that allows users to compare their hypothesis with other hypotheses submitted by their peers. We will explore the capability to:

(1) express working hypotheses about the yeast cell cycle; (2) provide integration of data in the Saccharomyces Genome Database to evaluate/test pathway-specific hypotheses; and (3) archive these results.

As analytical tools and database resources proliferate, biologists require facilities to integrate existing data into knowledge that can create a shared understanding of biological models. Our work will explore the expressivity and scalability of novel querying and contradiction-based reasoning methods that use rich formal knowledge specifications of biological events to accomplish information integration. It will lead to a new paradigm of querying biological knowledge that can dynamically retrieve, integrate and interpret information in terms of biologically relevant relationships asserted as pathway models. Our work will also examine the value of Semantic Web technologies in building a knowledgebase for such querying and reasoning, aid in standardizing models of biological knowledge, and add momentum to a range of ongoing ontology-building efforts.

Further information on the project can be found at the project web page: http://nigam.web.stanford.edu/hyque

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0849207
Program Officer: Vasant G. Honavar
Budget Start: 2008-09-01
Budget End: 2010-08-31
Fiscal Year: 2008
Total Cost: $200,000

Name: Stanford University
City: Palo Alto
State: CA
Country: United States
Zip Code: 94304