Deep learning has led to remarkable artificial intelligence breakthroughs on many important problems, such as object recognition for autonomous vehicles, voice-activated assistants, and automated machine translation. At the heart of these breakthroughs is the design of complex, domain-specific deep neural network architectures. However, only a small set of highly trained researchers is equipped with the resources and expertise to undertake this arduous, ad hoc design process. Moreover, design efforts have been largely limited to applications in a handful of domains, most notably computer vision and natural language processing. While the burgeoning field of neural architecture search (NAS) aims to automate the design of neural network architectures, existing work on NAS has to date focused narrowly on these same well-studied domains. This project aims to develop, analyze, and implement novel methods that enable automated architecture design beyond these restricted domains. The project involves collaborations with practitioners in new domains to empower them to develop new architectures for their applications. The project will also include educational efforts to create a new course that covers the complete lifecycle of machine learning workflows, including extensive treatment of the automated design and tuning of neural networks. The course material will be freely distributed to facilitate worldwide adoption and will be adapted to create a short course for high school students.

The focus of this project is to develop principled, efficient, and automated Neural Architecture Search (NAS) capabilities to enable practitioners to seamlessly create novel architectures for new problems. To achieve this goal, the project proposes a fundamentally new NAS paradigm driven by the co-design of the two core components of NAS, namely architecture search spaces and methods to search through these spaces. The researchers will demonstrate the effectiveness of their proposed techniques across numerous domains, including those where expert-designed architectures do not exist. The technical problems being tackled blend ideas from optimization, learning theory, signal processing, and machine learning systems, and draw connections to the problems of compressed sensing, weak supervision, and meta-learning. The proposed NAS work will be transformational in unlocking the potential of novel deep learning applications in new domains, and will provide training opportunities for graduate students. Moreover, the project's proposed activities emphasize accessibility and broad dissemination via: foundational educational material disseminated to data scientists worldwide; growing and promoting diversity in the Machine Learning Systems research community; and widespread industry adoption via open-source activities, contributions to Carnegie Mellon University's Machine Learning blog, and a recurring podcast.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 2046613
Program Officer: Rebecca Hwa
Budget Start: 2021-04-01
Budget End: 2026-03-31
Fiscal Year: 2020
Total Cost: $120,643
Name: Carnegie-Mellon University
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213