Data-intensive applications, such as web information retrieval and data-driven scientific applications, consume significant computational and I/O resources. With the ubiquity of computer networks and advances in server clustering technologies, it has become feasible to make large-scale datasets and related services available to many concurrent users in real time. System support for online applications must achieve high performance in the presence of concurrent execution and runtime workload fluctuation. Extending such support to data-intensive applications presents significant challenges, including highly concurrent I/O at the host level and multi-dimensional resource allocation (e.g., CPU, memory, network, and disk I/O) on distributed platforms.

The overarching goal of this project is to develop system-level support that achieves high throughput and efficient resource utilization for data-intensive online applications. Large-scale online applications often contain multiple application components and data partitions, hosted on server and storage clusters. Efficient resource utilization at both the host level and the cluster level is important for achieving high system throughput. The project has two parallel research thrusts. At the host level, the project will investigate operating system support (particularly I/O prefetching and memory management) for I/O-intensive concurrent application execution. At the cluster level, the proposed research will explore efficient resource allocation to application and data components on heterogeneous distributed platforms using a profile-driven approach.

Techniques developed in this project will be evaluated through experimentation with real applications drawn from both the scientific domain and web-based commercial services. Developed software artifacts and experimental datasets will be released for public use. In parallel with the research, this project will enhance the curriculum of relevant systems-area courses at the PI's institution. This effort includes the development of a new operating system instructional laboratory platform focusing on performance-oriented system design and implementation. Additionally, a new graduate course on scalable online systems and applications will be created, drawing on results from this project and from related research.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0448413
Program Officer: Almadena Y. Chtchelkanova
Budget Start: 2005-01-15
Budget End: 2011-12-31
Fiscal Year: 2004
Total Cost: $401,459
Name: University of Rochester
City: Rochester
State: NY
Country: United States
Zip Code: 14627