With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely adopted because of their high accuracy, excellent scalability, and self-adaptiveness. Many applications employ DNNs as the core technology, such as face detection, speech recognition, and scene parsing. To meet the accuracy requirements of these applications, DNN models are becoming deeper and larger, and are evolving at a fast pace. They are computation and memory intensive and pose significant challenges to the conventional von Neumann architecture. The key problem addressed by the project is how to accelerate deep learning, not only inference but also training and model compression, which have not received enough attention in prior research. This endeavor has the potential to enable the design of fast and energy-efficient deep learning systems whose applications are found in daily life, ranging from autonomous driving through mobile devices to IoT systems, thus benefiting society at large.

The outcome of this project is FASTLEAP, a Field Programmable Gate Array (FPGA)-based platform for accelerating deep learning. The platform takes a dataset as input and outputs a model that is trained, pruned, and mapped onto an FPGA, optimized for fast inference. The project will utilize emerging FPGA technologies that have access to High Bandwidth Memory (HBM) and include floating-point DSP units. From a vertical perspective, FASTLEAP integrates innovations from multiple levels of the whole system stack, from algorithm and architecture down to efficient FPGA hardware implementation. From a horizontal perspective, it embraces systematic DNN model compression and associated FPGA-based training, as well as FPGA-based inference acceleration of compressed DNN models. The platform will be delivered as a complete solution, with both a software tool chain and a hardware implementation, to ensure ease of use. At the algorithm level of FASTLEAP, the proposed Alternating Direction Method of Multipliers for Neural Networks (ADMM-NN) framework will perform unified weight pruning and quantization, given training data, a target accuracy, and target FPGA platform characteristics (performance models, inter-accelerator communication). The training procedure in ADMM-NN is performed on a platform with multiple FPGA accelerators, dictated by architecture-level optimizations on communication and parallelism. Finally, the optimized FPGA inference design is generated from the trained, compressed DNN model, accounting for FPGA performance modeling. The project will address the following SPX research areas: 1) Algorithms: bridging the gap between deep learning developments in theory and their system implementations, cognizant of the performance model of the platform. 2) Applications: scaling of deep learning for domains such as image processing. 3) Architecture and Systems: automatic generation of deep learning designs on FPGA, optimizing area, energy efficiency, latency, and throughput.
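To make the ADMM-based compression step more concrete, below is a minimal sketch of ADMM weight pruning in Python with PyTorch. The toy layer, hyperparameters (rho, prune_ratio), and the projection helper are illustrative assumptions, not the project's actual tooling. ADMM recasts the constrained problem (minimize the training loss subject to a sparsity constraint on the weights) as alternating steps: a gradient update on an augmented loss, a Euclidean projection onto the sparsity set, and a dual update.

```python
# Minimal, self-contained sketch of ADMM-based weight pruning (illustrative
# assumptions throughout; not the FASTLEAP/ADMM-NN implementation).
# Problem: min f(W) s.t. W in S (at most k nonzeros), solved by alternating
# an SGD step on the augmented loss with a projection onto S.
import torch
import torch.nn as nn

def project_to_sparse(w: torch.Tensor, prune_ratio: float) -> torch.Tensor:
    """Euclidean projection onto {w : at most k nonzeros}: keep the
    largest-magnitude entries, zero the rest."""
    k = int(w.numel() * (1.0 - prune_ratio))
    flat = w.flatten().abs()
    threshold = flat.kthvalue(w.numel() - k).values if k < w.numel() else -1.0
    mask = (w.abs() > threshold).float()
    return w * mask

# Toy model and data, purely for illustration.
model = nn.Linear(16, 2)
x, y = torch.randn(64, 16), torch.randint(0, 2, (64,))
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

rho, prune_ratio = 1e-2, 0.9
W = model.weight
Z = project_to_sparse(W.detach().clone(), prune_ratio)  # auxiliary variable
U = torch.zeros_like(W)                                 # scaled dual variable

for step in range(100):
    opt.zero_grad()
    # W-update: SGD on f(W) + (rho/2) * ||W - Z + U||^2
    loss = loss_fn(model(x), y) + (rho / 2) * (W - Z + U).pow(2).sum()
    loss.backward()
    opt.step()
    if step % 10 == 9:
        # Z-update: project (W + U) back onto the sparsity constraint set.
        Z = project_to_sparse((W + U).detach(), prune_ratio)
        # U-update: dual ascent on the constraint W = Z.
        U = U + W.detach() - Z

# Hard-prune W by projection; in practice one would retrain with the
# resulting sparsity mask fixed to recover accuracy.
model.weight.data = project_to_sparse(W.detach(), prune_ratio)
```

Quantization fits the same alternating scheme: the sparsity projection is replaced by a projection onto a quantization grid (e.g., rounding to the nearest representable level), which is how ADMM can treat pruning and quantization in a unified way.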

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2019-10-01
Budget End: 2022-09-30
Fiscal Year: 2019
Total Cost: $848,718
Name: University of Southern California
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089