Workflow-based systems have emerged as an alternative to the ad hoc approaches to data exploration that are widely used in the scientific community. Workflows can capture computational tasks at various levels of detail and systematically record the provenance (history) information necessary for reproducibility, result publication, and sharing. Although the benefits of scientific workflow systems are well known, workflows are hard to create and maintain, and this has been a major barrier to wider adoption of the technology in the scientific domain.

The goal of this project is to produce new algorithms and techniques for exploring and re-using the knowledge embedded in workflow specifications and in the provenance of the data they manipulate. This project addresses key limitations in existing workflow systems. First, it develops a set of usable tools that enable casual users (who do not necessarily have programming expertise) to perform exploratory tasks and solve problems through workflows. These include intuitive user interfaces for manipulating collections of workflows and for querying workflows by example. Second, it builds a scalable provenance management infrastructure to support the efficient execution of these operations.

The research results of this project advance the state of the art and build fundamental knowledge in storing, querying, and re-using the provenance of computational tasks. The project has the potential to impact a variety of applications where the creation and maintenance of workflows is currently a major bottleneck, including large computational science projects and portals. Furthermore, it makes workflows and workflow technology more accessible to casual users. Through our interdisciplinary collaborations, this project will have an immediate impact by helping to improve the scientific discovery process. The involvement of graduate and undergraduate students in the project will provide mentoring opportunities. The PI is committed to recruiting minority students. The results of this project will be disseminated as research papers and as freely available tools at the project website: www.cs.utah.edu/~juliana/projects/NSF-IIS-0746500

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0746500
Program Officer: Xiaoyang Wang
Project Start:
Project End:
Budget Start: 2008-04-15
Budget End: 2011-07-31
Support Year:
Fiscal Year: 2007
Total Cost: $434,507
Indirect Cost:
Name: University of Utah
Department:
Type:
DUNS #:
City: Salt Lake City
State: UT
Country: United States
Zip Code: 84112