Computing substrates such as multi-core processors and Field Programmable Gate Arrays (FPGAs) share a common structure: a two-dimensional array of processing elements interconnected by a routing fabric. At one end of the spectrum, FPGAs have single-output programmable logic functions; at the other end, multi-core chips have complex 32/64-bit processing cores. Different applications achieve their best area-power-performance tradeoffs on different programmable substrates. This project is developing a large-scale multi-core substrate with hundreds or thousands of simple processing cores, along with a compilation system that maps computations onto this fabric. This many-core architecture, named the Diastolic Array, is coarser-grained than FPGAs but finer-grained than conventional multi-cores. Efficiently exploiting such a large number of processing cores requires spatially mapping a computation onto the processing cores and its communication onto the point-to-point interconnect network. To be practically viable, this mapping process must be automated and effective. The project addresses this challenge by developing the hardware architecture and the compilation system simultaneously.
A diastolic array chip is expected to outperform FPGAs or general-purpose processors on an interesting class of applications, enabling more efficient prototyping and low-volume production. The outcomes of this project, such as a statically configured interconnection architecture with associated algorithms for routing and resource allocation, will also be applicable to other multi-core designs. Finally, the project is developing a new parallel processing module for an undergraduate computer architecture class, giving sophomores early exposure to parallel hardware and experience with writing parallel programs and using compilers that exploit parallelism.