The Institute for Systems Biology is awarded a grant to design an informatics solution for integrating systems biology data to expedite network inference and knowledge discovery. Specific aims include (1) definition of a flexible data standard for capturing experiment and sample information in digital form; (2) development of tools to assist data curation, retrieval, and statistical analysis; and (3) creation of education-oriented database extensions that facilitate high-school training programs. As part of the project, the PIs will develop an inquiry-based education program in systems biology that brings mathematics, natural science, and computer science to high-school curricula.

The central strategy of the project is motivated by a key observation: although complex, each data type (for example, microarray ratios, tandem mass spectra, protein/DNA networks, sequences, and annotations) is usually well defined in its 'native' context, and complexity arises mostly when these data are combined. Tools and approaches developed in this project will therefore capitalize on the virtues of small, independent, specialized, and robust databases and software, and will enable downstream integration by loosely connecting these sub-systems using carefully crafted rules and technologies such as Java RMI. An important provision of this simple approach is that researchers will be able to dynamically relate their data to other valuable, remotely managed databases through an evolving set of rules and an inference engine that applies these rules. Furthermore, because knowledge of experiment design and sample processing is indispensable for drawing meaningful insights from data, data integration will be guided by a flexible experiment meta-information schema, which will be formalized in this project. To extend the practicality of this schema for day-to-day use, user-friendly software tools developed in this project will connect laboratory scientists directly with automated data processing pipelines, helping them capture experiment details and populate the meta-information schema.
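
The loose coupling described above can be illustrated with a minimal sketch. It is not the project's implementation: the `Record` and `Rule` classes, the `infer_links` function, the store names, and the attribute names such as `sample_id` are all hypothetical. The stores are plain in-memory lists standing in for independently managed databases, and no remote invocation (Java RMI or otherwise) is attempted. Each rule states only which attributes must match for records in two stores to be related, so rules can be added or changed without touching the stores themselves.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Record:
    """One entry in a small, specialized store (all fields are hypothetical)."""
    key: str                 # identifier local to the store
    attrs: Dict[str, object]  # experiment or sample meta-information


@dataclass
class Rule:
    """Relates two stores when the named attributes hold equal values."""
    left_store: str
    left_attr: str
    right_store: str
    right_attr: str


def infer_links(stores: Dict[str, List[Record]],
                rules: List[Rule]) -> List[Tuple[str, str]]:
    """Apply every rule and collect (left_key, right_key) pairs that satisfy it."""
    links = []
    for rule in rules:
        # Index the right-hand store by the attribute this rule joins on.
        index: Dict[object, List[Record]] = {}
        for rec in stores[rule.right_store]:
            value = rec.attrs.get(rule.right_attr)
            if value is not None:
                index.setdefault(value, []).append(rec)
        # Look up each left-hand record in that index.
        for rec in stores[rule.left_store]:
            value = rec.attrs.get(rule.left_attr)
            for match in index.get(value, []):
                links.append((rec.key, match.key))
    return links


# Two independent stores, each well defined in its own context (made-up data).
stores = {
    "microarray": [
        Record("probe_1", {"sample_id": "S1", "ratio": 2.1}),
        Record("probe_2", {"sample_id": "S2", "ratio": 0.4}),
    ],
    "mass_spec": [
        Record("spectrum_7", {"sample_id": "S1"}),
    ],
}

# The rule set can evolve without changing either store.
rules = [Rule("microarray", "sample_id", "mass_spec", "sample_id")]

links = infer_links(stores, rules)
print(links)
```

In this sketch the shared sample identifier plays the role of the experiment meta-information that guides integration; in a distributed setting, each store lookup would be replaced by a call to a remotely managed database.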

Agency: National Science Foundation (NSF)
Institute: Division of Biological Infrastructure (DBI)
Application #: 0640950
Program Officer: Peter H. McCartney
Project Start:
Project End:
Budget Start: 2007-07-01
Budget End: 2012-09-30
Support Year:
Fiscal Year: 2006
Total Cost: $1,845,336
Indirect Cost:
Name: Institute for Systems Biology
Department:
Type:
DUNS #:
City: Seattle
State: WA
Country: United States
Zip Code: 98109