This project will develop theoretical and algorithmic foundations for writing programs that efficiently share limited memory on multi-core computers. On a multi-core computer, each process is allocated some share of the memory, but its share can fluctuate over time as other processes start, stop, and change their demands for memory. Most of today's programs do not cope well with memory fluctuations: they have difficulty taking full advantage of additional memory freed up by other processes, and they can slow to a crawl when their memory allocation decreases.

By enabling programmers to write software that can adapt to memory fluctuations, this research will provide new levels of flexibility, performance, and resource utilization for scientific and commercial applications running on shared-memory multi-core infrastructure. Cloud services will respond more rapidly to changes in workload. High-performance-computing applications will achieve higher memory utilization, enabling scientists to do more with less hardware. By creating a more efficient and flexible computing infrastructure, this project has the potential to accelerate the pace of discovery in other scientific fields. For example, biological applications such as protein docking are likely to benefit from this research because the performance of current software is limited by contention for memory.

This project will build upon the PIs' recently proposed notion of cache-adaptive algorithms, i.e., algorithms that automatically adapt to memory fluctuations. It will develop cache-adaptive theory and applications in four ways:

1. The PIs will extend cache-adaptive analytical techniques to apply to more algorithms, such as cache-oblivious FFT and cache-oblivious serial and parallel dynamic programs.

2. The PIs will develop the foundations of cache-adaptive data structures, such as cache-adaptive priority queues.

3. The PIs will measure the impact of adaptivity on actual performance, focusing on cache-adaptive sorting, serial and parallel dynamic programs, and stencil computations.

4. The PIs will implement cache-adaptive parallel software for computational biology applications, such as protein-protein docking, dynamic programs, and other HPC simulations.
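To make the notion concrete, the following is a minimal, self-contained C sketch of the cache-oblivious divide-and-conquer style referred to in items 1-3 above. It is illustrative only, not code from the PIs' work; the function name co_transpose and the base-case cutoff BASE are assumptions chosen for exposition. A recursive matrix transpose keeps subdividing the longer dimension, so at some recursion depth the working set fits in whatever memory happens to be available at that moment, which is the intuition behind why such algorithms lend themselves to cache-adaptive analysis.

    #include <stdio.h>

    enum { BASE = 16 };   /* illustrative base-case cutoff, not tuned */

    /* Transpose the block rows [i0, i1) x columns [j0, j1) of a row-major
     * matrix A (leading dimension lda) into B (leading dimension ldb),
     * so that B[j][i] = A[i][j].  The recursion always splits the longer
     * side, so subproblems shrink geometrically and eventually fit in the
     * cache or memory share currently available, whatever its size. */
    static void co_transpose(const double *A, double *B,
                             int i0, int i1, int j0, int j1,
                             int lda, int ldb)
    {
        int di = i1 - i0, dj = j1 - j0;
        if (di <= BASE && dj <= BASE) {
            /* Base case: transpose a small block directly. */
            for (int i = i0; i < i1; i++)
                for (int j = j0; j < j1; j++)
                    B[j * ldb + i] = A[i * lda + j];
        } else if (di >= dj) {
            /* Split the row range in half. */
            int im = i0 + di / 2;
            co_transpose(A, B, i0, im, j0, j1, lda, ldb);
            co_transpose(A, B, im, i1, j0, j1, lda, ldb);
        } else {
            /* Split the column range in half. */
            int jm = j0 + dj / 2;
            co_transpose(A, B, i0, i1, j0, jm, lda, ldb);
            co_transpose(A, B, i0, i1, jm, j1, lda, ldb);
        }
    }

    int main(void)
    {
        enum { N = 3, M = 4 };
        double A[N][M], B[M][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < M; j++)
                A[i][j] = 10 * i + j;
        co_transpose(&A[0][0], &B[0][0], 0, N, 0, M, M, N);
        printf("B[2][1] = %g (expect %g)\n", B[2][1], A[1][2]);
        return 0;
    }

Because the recursion makes no reference to the actual cache or memory size, the same code runs unchanged whether the process's memory share grows or shrinks; the cache-adaptive analysis developed in this project quantifies how well such recursions perform when that share fluctuates over time.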

The PIs will offer courses on parallel algorithms, parallel programming, and cache-efficient and external-memory algorithms as part of a new degree program in computational sciences that is being launched at Stony Brook through its recently established Institute for Advanced Computational Sciences (IACS). These courses are designed to disseminate high-performance computing research results to students and faculty in other fields, such as physics, chemistry, biology, and math. The PIs will also design a course, targeted at computer science students, on theoretical and systems aspects of external-memory computing in the context of big data, databases, and file systems. The PIs will use supercomputing resources from the XSEDE program, giving students access to some of the world's fastest supercomputing clusters for their programming assignments and course projects.

The PIs will engage in outreach and dissemination by organizing parallel programming workshops as part of the IACS and in collaboration with Brookhaven National Laboratory. The PIs will also give tutorials on parallel computing, memory-efficient computing, and big data at conferences and at other universities.

Budget Start: 2014-08-01
Budget End: 2019-07-31
Fiscal Year: 2014
Total Cost: $799,999
Name: State University of New York at Stony Brook
City: Stony Brook
State: NY
Country: United States
Zip Code: 11794