Advances in technology have enabled individuals, organizations and government agencies to collect and store massive amounts of data across all areas of human endeavor. A critical challenge is to extract actionable information from such tera- and peta-scale data stores as efficiently as possible so that domain scientists can make critical advances in fields including the sciences, engineering, medicine and homeland security.

Toward this objective, the PI seeks to employ an architecture-conscious approach to scalable data analysis on modern cluster systems interconnected through a high-speed network. The central thesis of this work is that current-day algorithms for data analysis often grossly under-utilize architectural resources (processors, memory, disk and network). This project seeks to address this limitation in the context of key application drivers drawn from scientific simulations, bioinformatics and security applications. Specifically, the project will investigate and leverage locality-enhancing techniques, new features of modern architectures, efficient handling of large out-of-core data structures, multi-level load balancing and distribution of work among cluster nodes, and mechanisms that support remote memory paging on modern clusters.

The main scientific outcomes of this research will include the ability to process and analyze hitherto intractably large datasets, enabling new scientific discoveries in the corresponding domains, and the ability to fully utilize the underlying parallel architecture to respond efficiently to domain expert queries. Another expected outcome is a set of generic runtime abstractions, derived from the specific solutions obtained, that can be deployed by a host of data-intensive applications. The broader outcomes of this work will include the training of capable undergraduate and graduate students. Women and minorities will be especially encouraged to participate, and existing interactions with a local HBCU will be strengthened through various initiatives.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0702587
Program Officer: Almadena Y. Chtchelkanova
Budget Start: 2007-06-01
Budget End: 2011-05-31
Fiscal Year: 2007
Total Cost: $325,000

Name: Ohio State University
City: Columbus
State: OH
Country: United States
Zip Code: 43210