Deep neural networks (DNNs) have been employed in a wide range of application domains thanks to their extraordinary performance. Hardware implementations of DNNs are of critical importance for ubiquitous embedded and Internet of Things (IoT) devices, which call for high performance in energy- and resource-constrained systems. This project aims to address the challenges of mapping complicated DNN models onto hardware for energy-efficient and performance-driven implementations. The proposed techniques will promote wider adoption of deep learning in both high-performance and low-power computing systems. The project will also enhance economic opportunities and have significant societal benefits via solutions that support broader adoption of intelligent systems for big data analytics, weather modeling and forecasting, disease diagnosis and drug delivery, and medical image processing. The research advances will be incorporated into coursework taught by the investigators. Activities to engage underrepresented, undergraduate, and K-12 students will be designed in collaboration with the Northeastern University Center of STEM Education and the University of Southern California's Viterbi Center for Engineering Diversity. All software code from the project will be released via GitHub, and educational modules and tutorials will be made available to the research community, industry, and government.

Exploiting the inherent model redundancy of DNNs, this project will develop an algorithm-hardware co-optimization framework that greatly reduces DNN computation and storage requirements by leveraging the alternating direction method of multipliers (ADMM), a powerful optimization technique. The project first addresses the challenge of applying ADMM to DNN training: because the objective function is non-convex, there are no guarantees on solution feasibility, solution quality, or runtime. To overcome this, an integrated framework of ADMM regularization followed by masked mapping and retraining will be developed, and further improvements in solution quality, performance-driven computation/storage reduction, and hardware feasibility will be investigated. Next, the project proposes a unified weight and intermediate-result pruning and quantization technique that exploits all four redundancy sources of DNN models. With the resulting reductions in computation and storage, nearly all DNN models, or at least their most computationally intensive convolutional layers, can then be placed on a single chip, which benefits the energy efficiency of hardware implementations. Finally, design-time parameterization and algorithm-hardware co-design solutions will be developed for efficient utilization of available hardware resources, achieving high performance, energy efficiency, and adaptation capability. Extensive experimentation and evaluation will be performed to validate and tune the proposed techniques on prototype systems built with FPGA devices.
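The ADMM regularization and masked retraining steps described above can be illustrated with a minimal sketch. It is not the project's implementation: a toy least-squares model stands in for a DNN, the sparsity constraint is a simple "at most k nonzero weights" rule, and every name here (`project_sparse`, `admm_prune`, `make_toy_data`, and the parameters `rho`, `lr`, `outer`, `inner`, `retrain`) is hypothetical and chosen only for illustration.

```python
import random

def project_sparse(v, k):
    """Euclidean projection onto vectors with at most k nonzeros (keep k largest magnitudes)."""
    keep = set(sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k])
    return [v[i] if i in keep else 0.0 for i in range(len(v))]

def loss_and_grad(w, xs, ys):
    """Loss 0.5 * mean((x.w - y)^2) of a linear model, and its gradient in w."""
    n = len(xs)
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in zip(xs, ys):
        r = sum(wi * xi for wi, xi in zip(w, x)) - y
        loss += 0.5 * r * r / n
        for j, xj in enumerate(x):
            grad[j] += r * xj / n
    return loss, grad

def admm_prune(xs, ys, k, rho=1.0, lr=0.05, outer=30, inner=50, retrain=200):
    """Assumed setup: minimize loss(w) subject to w = z, with z having at most k nonzeros."""
    d = len(xs[0])
    w, z, u = [0.0] * d, [0.0] * d, [0.0] * d
    for _ in range(outer):
        # W-step: gradient descent on loss(w) + (rho / 2) * ||w - z + u||^2.
        for _ in range(inner):
            _, g = loss_and_grad(w, xs, ys)
            w = [wi - lr * (gi + rho * (wi - zi + ui))
                 for wi, gi, zi, ui in zip(w, g, z, u)]
        # Z-step: project w + u onto the sparsity constraint set.
        z = project_sparse([wi + ui for wi, ui in zip(w, u)], k)
        # U-step: update the scaled dual variable.
        u = [ui + wi - zi for ui, wi, zi in zip(u, w, z)]
    # Masked mapping: fix the sparsity pattern found by ADMM and zero the rest.
    mask = [1.0 if zi != 0.0 else 0.0 for zi in z]
    w = [wi * mi for wi, mi in zip(w, mask)]
    # Masked retraining: update only the surviving weights.
    for _ in range(retrain):
        _, g = loss_and_grad(w, xs, ys)
        w = [(wi - lr * gi) * mi for wi, gi, mi in zip(w, g, mask)]
    return w, mask

def make_toy_data(n=100, seed=0):
    """Synthetic noise-free regression data generated from a sparse weight vector."""
    rng = random.Random(seed)
    w_true = [2.0, 0.0, 0.0, -3.0, 0.0, 0.0, 1.5, 0.0]
    xs = [[rng.gauss(0.0, 1.0) for _ in w_true] for _ in range(n)]
    ys = [sum(wi * xi for wi, xi in zip(w_true, x)) for x in xs]
    return xs, ys, w_true

if __name__ == "__main__":
    xs, ys, w_true = make_toy_data()
    w, mask = admm_prune(xs, ys, k=3)
    print("mask:", mask)
    print("pruned weights:", [round(wi, 3) for wi in w])
```

In this toy problem the loss is convex, so the sketch sidesteps the non-convexity that the project targets; it only shows the alternation between a regularized training step, a projection onto the constraint set, and a dual update, followed by retraining under a fixed mask.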

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1901378
Program Officer: Sankar Basu
Project Start:
Project End:
Budget Start: 2019-06-01
Budget End: 2022-05-31
Support Year:
Fiscal Year: 2019
Total Cost: $750,000
Indirect Cost:
Name: Northeastern University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02115