DC: Medium: Intelligent Data Placement in Support of Scientific Workflows

Chervenak, Ann; Deelman, Ewa

Abstract

Transformative research is conducted via computational analyses of large data sets in the terabyte and petabyte range. These analyses are often enabled by scientific workflows, which provide automation as well as efficient, reliable execution on campus and national cyberinfrastructure resources. Workflows face many issues related to data management, such as locating input data, finding necessary storage co-located with computing capabilities, and efficiently staging data so that the computation progresses but storage resources do not fill up. Such data placement decisions need to be made within the context of individual workflows and across multiple concurrent workflows. Scientific collaborations also need to perform data placement operations to disseminate and replicate key data sets. Additional challenges arise when multiple scientific collaborations share cyberinfrastructure and compete for limited storage and compute resources. This project will explore the interplay between data management and computation management for these scenarios. The project will include the design of algorithms and methodologies that support large-scale data management for efficient workflow-based computations composed of individual analyses and workflow ensembles while preserving policies governing data storage and access. The algorithms will be evaluated for their impact on the performance of synthetic and real-world workflows running in simulated and physical cyberinfrastructures. New approaches to data and computation management can potentially transform how scientific analyses are conducted at the petascale. Besides advancing computer science, this work will have direct impact on data and computation management for a range of scientific disciplines that manage large data sets and use them in complex analyses running on cyberinfrastructure.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0905032
Program Officer: Mohamed G. Gouda

Budget Start: 2009-09-01
Budget End: 2013-08-31
Fiscal Year: 2009
Total Cost: $810,000

Institution

University of Southern California, Los Angeles, CA, United States
