SHF: Medium: TensorNN: An Algorithm and Hardware Co-design Framework for On-device Deep Neural Network Learning using Low-rank Tensors

Yuan, Bo

Abstract

Deep neural networks (DNNs) are an important artificial intelligence (AI) technique that has recently gained widespread use in fields such as image recognition, machine translation, autonomous vehicles, and healthcare diagnosis. Conventional DNNs are implemented using cloud computing, where a large amount of computing resources is available in a centrally pooled manner. To achieve stronger data privacy, shorter response time, and a lighter data transmission burden, deploying DNN functionality in a distributed manner at the edge of the network has become a very attractive proposition. However, DNN learning on mobile devices at the edge of the network is very challenging, because its large time and energy consumption conflicts with the limited on-device resources. To address this challenge, this project leverages low-rank tensors, a powerful mathematical tool for representing and compressing tensor-format data, to form a new family of ultra-low-cost deep neural networks. This brings an order-of-magnitude reduction in time and energy consumption for deep neural network learning. Investigations in many areas of big data research will benefit as well. This project involves graduate and undergraduate students, especially from underrepresented groups, through summer research experiences and senior design projects, to broaden participation in computing. The outcomes of this project will be disseminated to the community through technical publications, talks, and tutorials at both academic institutions and industry.

To remove the barriers to real-time, energy-efficient DNN learning on resource- and energy-constrained embedded devices, this project considers innovations at three levels: 1) at the theory level, it develops a novel redundancy-free matrix-vector multiplication scheme to reduce computational cost, including a new online update scheme for low-rank tensors to enable fast updates of compressed data; 2) at the algorithm level, it develops low-rank tensor-based forward and backward propagation schemes to support low-cost accelerated inference and training, including a catastrophic-forgetting-resilient training scheme and a training-aware compression scheme to improve learning robustness and memory efficiency; and 3) at the hardware design level, it proposes an efficient hardware architecture that fully utilizes the benefits of low-rank tensors to achieve improved hardware performance for on-device DNN inference and learning. Finally, the efficacy of the proposed research will be validated and evaluated through software implementations of different DNN models in different target applications. A field-programmable gate array (FPGA)-based hardware prototype will also be developed.
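The abstract does not say which tensor format or multiplication scheme the project uses, so the following is only a minimal sketch of the simplest low-rank case: a weight matrix stored as two factors, W = U V, with an assumed rank r. It shows why a low-rank representation can cut the cost of a matrix-vector product: computing U (V x) needs r(m + n) multiplications instead of the m n needed by the dense product. All names, sizes, and the rank are hypothetical and chosen for illustration; this is not the project's redundancy-free scheme.

```python
import random


def matvec(matrix, vector):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul(a, b):
    """Dense matrix-matrix product for list-of-rows matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def low_rank_matvec(u, v, vector):
    """Compute (U V) x as U (V x), never forming the full m x n matrix."""
    return matvec(u, matvec(v, vector))


def multiply_counts(m, n, r):
    """Scalar multiplications: dense m*n versus factored r*(m + n)."""
    return m * n, r * (m + n)


random.seed(0)
m, n, r = 64, 64, 4  # hypothetical layer size and rank (assumptions)
u = [[random.uniform(-1, 1) for _ in range(r)] for _ in range(m)]  # m x r factor
v = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(r)]  # r x n factor
x = [random.uniform(-1, 1) for _ in range(n)]

y_factored = low_rank_matvec(u, v, x)
y_dense = matvec(matmul(u, v), x)  # reference: build W explicitly, then multiply
max_error = max(abs(a - b) for a, b in zip(y_factored, y_dense))
dense_cost, factored_cost = multiply_counts(m, n, r)
```

With these assumed sizes, the factored product uses 512 multiplications against 4,096 for the dense one, and the two results agree up to floating-point rounding. Higher-order tensor decompositions apply the same idea across more than two factors.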

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Computing and Communication Foundations (CCF)
Application #: 1955909
Program Officer: Sankar Basu

Project Start
Project End
Budget Start: 2020-07-01
Budget End: 2023-06-30
Support Year
Fiscal Year: 2019
Total Cost: $262,975
Indirect Cost

Institution

Rutgers University, Piscataway, NJ, United States
