Society is witnessing an explosion in the use of Deep Neural Networks (DNNs) across all facets of daily life, including health, finance, entertainment, and transportation. DNNs are used by performing DNN inference, which queries the DNN with an input (for example, an image) to obtain an answer (for example, a classification). Society relies on inference every day, running it on devices ranging from cloud servers to personal computers. The goal of this project is to develop new ways to make inference efficient (fast and low power) on these devices.
The technical approach is to explore how a new phenomenon, called weight repetition, can be exploited on general-purpose devices such as Central Processing Units (CPUs) and Graphical Processing Units (GPUs). The key idea is that when a DNN weight value is repeated, the inference operations involving that weight can be simplified. The first project thrust will develop high-efficiency, weight repetition-aware software kernels that run on unmodified hardware. The second thrust will develop novel training techniques to co-design the DNN with the weight repetition-aware kernels. Finally, the third thrust will explore which targeted hardware modifications can further improve the efficiency achieved by the first two thrusts.
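The simplification that weight repetition enables can be illustrated with a small sketch (the function names and values below are illustrative assumptions, not artifacts of the project): in a dot product, inputs that share the same repeated weight value can first be summed with cheap additions, so that each unique weight requires only one multiplication.

```python
from collections import defaultdict

def dot_product_naive(weights, inputs):
    """Baseline: one multiplication per weight, regardless of repetition."""
    return sum(w * x for w, x in zip(weights, inputs))

def dot_product_weight_repetition(weights, inputs):
    """Sketch of a weight repetition-aware dot product: group inputs by
    their (repeated) weight value using additions only, then perform one
    multiplication per *unique* weight value."""
    groups = defaultdict(float)
    for w, x in zip(weights, inputs):
        groups[w] += x  # additions only
    return sum(w * s for w, s in groups.items())  # one multiply per unique weight

# Only two unique weight values among five weights: two multiplies instead of five.
weights = [0.5, -1.0, 0.5, 0.5, -1.0]
inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(dot_product_naive(weights, inputs))             # -3.0
print(dot_product_weight_repetition(weights, inputs)) # -3.0
```

The arithmetic savings grow with the degree of repetition: a layer whose weights take only a handful of distinct values (for example, after quantization) needs only that handful of multiplications per output, which is the kind of simplification the kernels in the first thrust would exploit.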
By proving weight repetition's effectiveness on general-purpose devices, this project will unlock innovation in software, algorithms and hardware. The project will also amplify the improvement possible from related, but orthogonal, techniques such as weight quantization and weight sparsity. To support the cross-stack approach, the project will train a new class of students and researchers who can work across high-performance software, hardware and DNN training algorithms to build co-designed Machine Learning stacks and, in the future, apply the lessons learned to other high-impact problems that require cross-layer solutions.
The project will store all publications, code, and datasets on public-facing websites, hosted at the University of Illinois for at least three years after the end of the project. These materials will also be made available via commercial websites, with links mirrored at http://cwfletcher.net/weightrepetition.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.