This project will develop a system that integrates a web-based experiment-workspace environment with a flexible scientific workflow automation system supporting rapid prototyping and user-transparent distributed computing. The experiment-workspace web interface will provide biologists with an extensible virtual laboratory for defining computational protocols, executing "experiments" based on those protocols, visualizing data in the context of their provenance, and managing projects. The KEPLER scientific workflow system will be extended to automate the enactment of these experiments on available distributed resources and to record workflow and data provenance. Bioinformatics specialists and software engineers will develop new experiment templates using the system and will share these templates with experiment-workspace users. The system will support users who can make effective use of both interfaces and will facilitate productive collaboration among biologists, bioinformatics specialists, and software developers. A prototype of this system will integrate and accelerate the computational and experimental components of research projects that employ sequential chromatin immunoprecipitation followed by DNA microarray analysis (ChIP-chip) to identify direct transcription factor targets. The IT and CS challenges in automating ChIP-chip workflows are typical of those plaguing complex, genome-scale analyses (e.g., dealing with multiple alternate or related analysis modules simultaneously; managing thousands of workflow runs and resulting datasets; and recording workflow, data, and parameter dependencies). Implementing this system will require innovative approaches for managing nested collections of scientific data, combining functional programming and stream-processing methodologies for scientific workflows, and generalizing data-typing and integration approaches to support interoperability of workflow components developed by different organizations.

This proposal brings together investigators recognized for their expertise in scientific workflow modeling, design, and automation; scientific data management; ChIP-chip research and data analysis; and development of collaborative computational environments for scientific research. Together, they will enhance student engagement in the research.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0612326
Program Officer: Sylvia J. Spengler
Budget Start: 2006-07-01
Budget End: 2010-06-30
Fiscal Year: 2006
Total Cost: $600,139
Name: University of California Davis
City: Davis
State: CA
Country: United States
Zip Code: 95618