Deep neural networks (DNNs) are an important artificial intelligence (AI) technique that has recently found widespread application in fields such as image recognition, machine translation, autonomous vehicles and healthcare diagnosis. Conventionally, DNNs are implemented in the cloud, where a large amount of computing resources is available in a centrally pooled manner. To achieve stronger data privacy, shorter response times and a lighter data transmission burden, deploying DNN functionality in a distributed manner at the edge of the network has become a very attractive proposition. However, DNN learning on mobile devices at the network edge is very challenging because the large time and energy consumption of learning conflicts with the limited on-device resources. To address this challenge, this project leverages low-rank tensors, a powerful mathematical tool for representing and compressing tensor-format data, to form a new family of ultra-low-cost deep neural networks. This brings an order-of-magnitude reduction in time and energy consumption for deep neural network learning. Investigations in many areas of big data research will benefit as well. This project involves graduate and undergraduate students, especially those from underrepresented groups, through summer research experiences and senior design projects, to broaden participation in computing. The outcomes of this project will be disseminated to the community in the form of technical publications, talks and tutorials in both academic institutions and industry.

To remove the barriers to real-time, energy-efficient DNN learning on resource- and energy-constrained embedded devices, this project considers innovations at three levels: 1) at the theory level, it develops a novel redundancy-free matrix-vector multiplication scheme to reduce computational cost, including a new online update scheme for low-rank tensors to enable fast updates of compressed data; 2) at the algorithm level, it develops low-rank tensor-based forward and backward propagation schemes to support low-cost, accelerated inference and training, including a catastrophic-forgetting-resilient training scheme and a training-aware compression scheme to improve learning robustness and memory efficiency; and 3) at the hardware design level, it proposes an efficient hardware architecture that fully utilizes the benefits of low-rank tensors to improve hardware performance for on-device DNN inference and learning. Finally, the efficacy of the proposed research will be validated and evaluated through software implementations of different DNN models in different target applications. A field-programmable gate array (FPGA)-based hardware prototype will also be developed.
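
The abstract does not specify the tensor format or the redundancy-free multiplication scheme, so the following is only a minimal sketch of the general idea behind low-rank compression, using a plain rank-r matrix factorization W = UV as a simplified stand-in for a low-rank tensor format. It illustrates why a factored weight matrix makes matrix-vector multiplication cheaper: applying V and then U costs r(m + n) multiply-adds instead of mn. All function names, sizes and the rank are hypothetical choices for illustration and are not taken from the project.

```python
import random


def matvec(M, x):
    """Dense matrix-vector product; costs rows * cols multiply-adds."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def low_rank_matvec(U, V, x):
    """Compute (U V) x as U (V x) without forming the full matrix.

    U is m x r and V is r x n, so the cost is r * (m + n) multiply-adds
    rather than the m * n needed for the dense product.
    """
    return matvec(U, matvec(V, x))


def matmul(A, B):
    """Plain matrix product, used here only to build the dense reference."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def demo(m=64, n=48, r=4, seed=0):
    """Compare dense and factored products (sizes are illustrative assumptions)."""
    rng = random.Random(seed)
    U = [[rng.uniform(-1, 1) for _ in range(r)] for _ in range(m)]
    V = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(r)]
    x = [rng.uniform(-1, 1) for _ in range(n)]

    W = matmul(U, V)  # dense m x n weight matrix of rank at most r
    dense = matvec(W, x)
    factored = low_rank_matvec(U, V, x)

    max_err = max(abs(a - b) for a, b in zip(dense, factored))
    dense_cost = m * n            # multiply-adds for W x
    factored_cost = r * (m + n)   # multiply-adds for U (V x)
    return max_err, dense_cost, factored_cost


if __name__ == "__main__":
    err, dense_cost, factored_cost = demo()
    print(f"max abs difference: {err:.2e}")
    print(f"multiply-adds: dense={dense_cost}, factored={factored_cost}")
```

In this sketch the two results agree up to floating-point rounding, while the factored form stores r(m + n) weights instead of mn. Tensor formats generalize the same trade-off to higher-order reshapings of the weights; how the project removes redundant computation in that setting is not described in the abstract.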

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 1955909
Program Officer: Sankar Basu
Project Start:
Project End:
Budget Start: 2020-07-01
Budget End: 2023-06-30
Support Year:
Fiscal Year: 2019
Total Cost: $262,975
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: Piscataway
State: NJ
Country: United States
Zip Code: 08854