Computer models of physical systems are a vital part of modern scientific and engineering research and development. Large-scale computational models of the Earth's weather, climate, and geological activity; models of biological systems; astronomical models of galaxies; and even macroeconomic models require immense computing resources. These simulations run on many thousands of processors for several months at a time, consuming tens or hundreds of millions of CPU hours before completion. This project takes a radically new approach to the design and implementation of next-generation, exascale supercomputing by leveraging recent developments at the intersection of conventional integrated circuit technology and emerging resistive random access memory (RRAM) devices. The goal of this project is to accelerate solvers for large linear systems, which form the backbone of modern scientific computing. A novel set of digital and analog hardware primitives is co-designed with a new class of algorithms that exploit the proposed accelerator. A small-scale prototype is being designed, fabricated, and tested through the EAGER program to demonstrate the feasibility of the fundamental building blocks.

RRAM is a non-volatile memory technology that avoids the scalability challenges of static and dynamic random access memories (SRAM and DRAM), and is a promising "universal memory" candidate, offering read speeds as fast as SRAM and DRAM and densities comparable to flash memory. Beyond simply relying on RRAM for storage, the project integrates circuit-, architecture-, and algorithm-level innovations to develop a qualitatively new hardware accelerator with orders of magnitude greater performance per watt than classical digital computers. Digital memristor-based circuits avoid data movement by performing bitwise matrix-vector multiplication in parallel across an entire dataset. Analog hardware quickly provides an accurate initial seed to an iterative solver, in which error-free digital circuits refine the initial estimate to solve a system of linear equations. A novel iterative solver algorithm, uniquely adapted to the proposed hardware, compensates for the inaccuracies and random variations introduced by the analog circuits, systematically reducing the error through a small number of digital iterations.
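The seed-then-refine idea described above can be illustrated with a small software sketch. This is not the project's algorithm or hardware model: the analog stage is simulated by a hypothetical function, `noisy_analog_seed`, that perturbs the known exact answer by up to 5% per component, and plain Jacobi iteration stands in for the digital refinement. The 3x3 matrix, the noise level, and the tolerance are all illustrative assumptions.

```python
import math
import random


def residual_norm(A, x, b):
    """Euclidean norm of b - A x, computed in ordinary (digital) arithmetic."""
    n = len(b)
    r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(v * v for v in r))


def noisy_analog_seed(x_exact, rel_error, rng):
    """Hypothetical stand-in for an analog solver: the exact answer with
    up to +/- rel_error multiplicative noise on each component."""
    return [v * (1.0 + rng.uniform(-rel_error, rel_error)) for v in x_exact]


def jacobi_refine(A, b, x0, tol=1e-10, max_iters=1000):
    """Refine x0 by Jacobi iteration; returns (solution, iterations used)."""
    n = len(b)
    x = list(x0)
    for k in range(max_iters):
        if residual_norm(A, x, b) < tol:
            return x, k
        # Every component is updated from the previous iterate only.
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x, max_iters


# Small diagonally dominant system (assumed for illustration): A x = b.
A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
x_exact = [1.0, 2.0, 3.0]
b = [6.0, 12.0, 14.0]

rng = random.Random(0)
seed = noisy_analog_seed(x_exact, 0.05, rng)
x_seeded, iters_seeded = jacobi_refine(A, b, seed)
x_cold, iters_cold = jacobi_refine(A, b, [0.0, 0.0, 0.0])
print("iterations from noisy seed:", iters_seeded)
print("iterations from zero start:", iters_cold)
```

In this toy setting the noisy seed reaches the same residual tolerance in fewer Jacobi sweeps than a zero start, because the digital iterations only have to remove the small error left by the approximate stage. How large the saving is on real problems depends on the solver and on the analog error characteristics, neither of which this sketch attempts to model.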

This combination of digital computation and analog memristor circuits, within high-density RRAM configurations, is expected to have a transformative effect on high-performance computing (HPC). The system under investigation has the potential to reduce execution time from months to hours, enabling solutions to scientific problems heretofore beyond the reach of modern HPC systems. The project brings together researchers in computer architecture, high-performance integrated circuit design, numerical algorithms, and scientific computing to accomplish this multidisciplinary effort. Algorithm-, architecture-, and circuit-level innovations are being disseminated to the broader research community through published papers, as well as tutorials on the simulation tools. The educational component of the project involves (1) training students in VLSI, architecture, and optimization and (2) incorporating resistive memories into the architecture and circuits curricula. The PIs are also personally involved in local programs promoting the participation of women and underrepresented minorities in computer science and engineering, and will initiate an effort to increase the enrollment of local minorities in the University of Rochester CS and ECE programs.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1548078
Program Officer: Marilyn McClure
Project Start:
Project End:
Budget Start: 2015-08-01
Budget End: 2016-07-31
Support Year:
Fiscal Year: 2015
Total Cost: $93,721
Indirect Cost:

Name: University of Rochester
Department:
Type:
DUNS #:
City: Rochester
State: NY
Country: United States
Zip Code: 14627