Data-intensive science requires tera- and peta-scale computing. Hardware demands are high, and software, from the system to the application level, is highly specialized. As a result, scientific investigation is constrained, and new, more accessible models for large-scale computing are required. The proposed program seeks to leverage new off-the-shelf computing technologies to develop concepts and tools (e.g., programming strategies, general-purpose and domain-specific libraries) that enable practical and transparent Scalable Heterogeneous Computing (SHC). Results will be applied to three cutting-edge challenges in radio astronomy, quantum chemistry, and neuroscience. These three applications share the need to process massive data streams, and together they span a parameter space of processing challenges defined by computational complexity, data volume, and throughput. Addressing these challenges requires scalable algorithms for massive datasets, systems with high-bandwidth memory access, and the ability to process high-throughput data streams. The proposed SHC strategies, tools, and optimizations would bring commercially available, massively parallel graphics processing units (GPUs) and fast, low-power, high-capacity solid-state storage (SSS) devices into general scientific computing. The tools developed will be applied to the analysis of radio astronomy data generated by the Murchison Wide-Field Array, to the development of an SHC-enabled molecular quantum chemistry code, and to the Connectome project, an effort to make a complete map of neuron connectivity in mammalian brains. Education is tightly integrated into the SHC program at the undergraduate and graduate levels, and results will be disseminated through tutorials, workshops, and documented open-source libraries.

LAY ABSTRACT

Scientists engaged in data-intensive research, from astronomy to neuroscience, are in desperate need of new strategies and tools. This project approaches the challenge by leveraging new off-the-shelf hardware and software technologies in unique combinations, bringing massively parallel graphics processing units and fast solid-state storage devices together with traditional central processing units. Project staff will develop scalable algorithms that leverage this commercially available hardware to process massive datasets and streams of data. The project will use three major scientific challenges as testbeds for developing these new approaches: a radio astronomy telescope called the Murchison Wide-Field Array; exploration of chemistry at the quantum level; and the Connectome, an effort to make a complete map of neuron connectivity in mammalian brains. Postdoctoral researchers from astronomy, chemistry, neuroscience, and computer science will work together at Harvard's Initiative in Innovative Computing, where they will combine experience from these domains to develop algorithms and code that broadly enable advances in science. The project also includes a strong educational component, with tutorials, workshops, and documented code libraries.

Agency: National Science Foundation (NSF)
Institute: Division of Physics (PHY)
Type: Standard Grant (Standard)
Application #: 0835713
Program Officer: Richard Houghton Pratt
Budget Start: 2008-10-01
Budget End: 2013-09-30
Fiscal Year: 2008
Total Cost: $2,000,251
Name: Harvard University
City: Cambridge
State: MA
Country: United States
Zip Code: 02138