Workflows have recently emerged as a paradigm for conducting large-scale scientific analyses. The structure of a workflow specifies which analysis routines need to be executed, the data flow among them, and relevant execution details. These workflows often need to be executed in distributed environments, where data sources may reside in different physical locations and individual steps may require high-end computing and memory resources at remote sites. Workflows help manage the coordinated execution of related tasks. They also offer a systematic way to capture scientific methodology and to record provenance information for their results. Yet robust and flexible workflow creation, mapping, and execution remain largely open research problems.
Scientific workflows present challenges beyond those of business workflows and other kinds of process models. They typically use very large, distributed data sets, employ computationally intensive tasks, and require high-end, distributed computing technology. They are also often designed iteratively and interactively, reflecting the nature of the scientific exploration and analysis process. On the other hand, scientific workflows also have simpler requirements in terms of their data flow structure, execution management, or security/privacy constraints. Currently, scientific workflows are mostly designed without formal principles and are rarely optimized, scalable, or reusable.
The aim of this workshop is to bring together IT researchers and practitioners as well as domain scientists. Application scientists will be asked to describe requirements and desired new analyses and computations that are not possible with today's technologies. IT researchers will be asked to identify problems in their specific areas of expertise. Discussions will focus on four main topics: (1) applications and requirements; (2) dynamic workflows and user steering; (3) data and workflow descriptions; and (4) system-level management to support large-scale workflows.
The outcome of the workshop will be a report outlining research directions and activities that will bring the needed communities together to work on producing a new paradigm for scientific workflows. Easy-to-use tools for building efficient, scalable and reusable scientific workflows are likely to bring benefits to many fields, and can raise the pace and quality of research work in many areas.
The workshop Web site (http://vtcpc.isi.edu/wiki) provides further information and will be used to disseminate the workshop report and other results.