As transistor sizes continue to scale, we are about to witness stunning levels of chip integration, with 1,000 cores on a single die. At these processor counts, it has been an accepted tenet among many researchers that shared memory does not scale, because of the high hardware overhead of supporting fine-grain synchronization and communication between so many cores. This research seeks to disprove that tenet: it shows that fine-grain data sharing is scalable with the use of fast on-chip wireless communication. The Principal Investigators (PIs) augment each core with a transceiver that enables on-chip broadcast in 5-7 nanoseconds, and design a multicore architecture that supports it. The PIs then implement synchronization and communication primitives and libraries that support fine-grain data sharing with unprecedentedly low overhead. With these primitives, the PIs redesign popular runtimes such as OpenMP and Cilk, and rethink algorithms and applications.

This effort contributes to multidisciplinary research and education on scalable parallel computing at the University of Illinois. The PIs will enhance the courses that they teach in parallel computer architecture, parallel compilation techniques, and parallel programming. They are also working with the department to broaden the course offerings with multidisciplinary undergraduate courses in the general area of parallel computing, as part of Illinois' CS+X major structure, where X can be one of many other disciplines. The work will also have an impact on industry, since it addresses a real and very timely technical problem: extraordinary on-chip integration coupled with unscalable fine-grain data sharing. The ability to work closely with IBM Research and Microsoft Research will be crucial.
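To make the fine-grain sharing primitives described above more concrete, the following is a minimal sketch of producer-consumer sharing over an on-chip wireless broadcast channel. The wchan_bcast() and wchan_poll() intrinsics are hypothetical stand-ins for a hardware interface of the kind the project proposes, not its actual API.

```c
/*
 * Minimal sketch of fine-grain producer-consumer sharing over an
 * on-chip wireless broadcast channel.  wchan_bcast() and wchan_poll()
 * are hypothetical names for the hardware interface; real names and
 * semantics will differ.
 */
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical hardware interface: broadcast a 64-bit word to all
 * cores (on the order of 5-7 ns), and poll the local receive mailbox. */
extern void wchan_bcast(uint64_t word);
extern bool wchan_poll(uint64_t *word);   /* true if a word arrived */

/* Producer: publish a new value to every core with one broadcast,
 * instead of a chain of coherence invalidations and misses. */
void publish(uint64_t value)
{
    wchan_bcast(value);
}

/* Consumer: spin on the core-local mailbox; no shared cache line is
 * bounced between cores while waiting. */
uint64_t consume(void)
{
    uint64_t value;
    while (!wchan_poll(&value))
        ;   /* busy-wait on local state only */
    return value;
}
```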

This is a cross-disciplinary effort that cuts across three areas: architecture, programming systems, and algorithms and applications. The architecture work focuses on supporting on-chip wireless communication by extending cache coherence transactions, trading off wireless transmission power against error rate, and supporting multiple wireless channels. The programming systems work focuses on redesigning the MPI communication primitives, making shared memory scalable for OpenMP and Cilk, and designing a best-effort API for application resiliency. The algorithms and applications work focuses on studying and developing algorithms and applications that can take advantage of the architecture. The PIs study problems in the areas of graphs, numerics, dynamic programming, recognition-mining-synthesis, and MapReduce.
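As an illustration of how a runtime primitive might change on such an architecture, the following is a minimal sketch of a barrier of the kind an OpenMP or Cilk runtime could build on the broadcast channel, again using the hypothetical wchan_bcast()/wchan_poll() interface from the sketch above. The point is the design choice: each core announces its arrival once over the broadcast medium and counts arrivals locally, rather than all cores contending on a shared counter through the coherence protocol.

```c
/*
 * Sketch of a broadcast-based barrier, of the kind an OpenMP or Cilk
 * runtime might use.  wchan_bcast()/wchan_poll() are assumed names for
 * the wireless interface, not the project's actual API, and the sketch
 * ignores the episode tagging needed to run barriers back to back.
 */
#include <stdint.h>
#include <stdbool.h>

extern void wchan_bcast(uint64_t word);
extern bool wchan_poll(uint64_t *word);

#define NCORES 1024   /* assumed core count */

void wireless_barrier(uint64_t my_core_id)
{
    uint64_t word, arrivals = 0;

    wchan_bcast(my_core_id);   /* announce arrival to every core */

    /* Count arrivals locally; this assumes the broadcast also loops
     * back to the sender, so each core eventually observes NCORES
     * arrivals.  No shared counter line ping-pongs across the chip. */
    while (arrivals < NCORES) {
        if (wchan_poll(&word))
            arrivals++;
    }
}
```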

Project Start:
Project End:
Budget Start: 2016-07-01
Budget End: 2020-06-30
Support Year:
Fiscal Year: 2016
Total Cost: $879,180
Indirect Cost:
Name: University of Illinois Urbana-Champaign
Department:
Type:
DUNS #:
City: Champaign
State: IL
Country: United States
Zip Code: 61820