Recent advances in modern machine learning, deep learning in particular, are ushering in the era of artificial intelligence, which has the potential to revolutionize every aspect of our daily lives. However, much as in the early days of the steam engine, a satisfactory understanding of deep learning has so far been elusive. We currently lack a formal theory of deep learning: one that could explain why we can train overly complex models with seemingly too little training data and still find solutions that generalize to previously unseen data, why models trained for one task also perform well on another related task, or why trained models are so vulnerable to slight, nearly imperceptible corruptions of the data. This project aims to address this need by developing an explanatory and prescriptive theory of deep learning that is tightly integrated with, and motivated by, practice. Rather than viewing learning simply as a black-box optimization problem, the approach investigates the inner workings of training by shedding light on algorithmic heuristics that potentially play an equally important role in endowing trained models with excellent generalization properties. Given the broad applicability of deep learning and the complementary nature of the theoretical analyses and empirical studies in the proposed research, the project is particularly suited to integrating research into education and outreach. The proposed educational activities include curriculum development, summer internships, hackathons, and instructor outreach through local Baltimore programs.

The project investigates the role of explicit algorithmic regularization, in the form of early stopping, batch normalization, and dropout, as well as the choice of optimization algorithm and network architecture, in providing an inductive bias that aids generalization. A second overarching goal of the project is to understand, more broadly, the generalization phenomenon in deep learning. It seeks to understand why systems that memorize the training data can still generalize well, how the neural network architecture enables transfer learning, and how to design robust algorithms that guarantee that deep learning solutions generalize despite adversarial corruption of the data.
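To make the notion of algorithmic regularization concrete, the sketch below illustrates one of the heuristics named above, early stopping: training is halted once held-out validation loss stops improving, which limits how far a model can fit noise in the training data. This is a minimal illustrative example, not the project's method; the function name and the synthetic loss values are hypothetical.

```python
# Illustrative sketch: early stopping as an explicit algorithmic
# regularizer. Training halts once validation loss fails to improve
# for `patience` consecutive checks.

def train_with_early_stopping(val_losses, patience=3):
    """Return the epoch index of the checkpoint early stopping selects.

    `val_losses` stands in for the per-epoch validation losses that a
    real training loop would produce (names here are hypothetical).
    """
    best_loss = float("inf")
    best_epoch = 0
    bad_checks = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            bad_checks = 0
        else:
            bad_checks += 1
            if bad_checks >= patience:
                break  # stop before overfitting sets in
    return best_epoch

# Synthetic run: validation loss improves, then degrades after epoch 3.
losses = [1.0, 0.6, 0.4, 0.35, 0.4, 0.45, 0.5, 0.6]
print(train_with_early_stopping(losses))  # selects epoch 3
```

The regularizing effect is implicit in the stopping rule itself: the returned checkpoint is the one with the best held-out performance, rather than the final, potentially overfit, iterate.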

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1943251
Program Officer: Rebecca Hwa
Budget Start: 2020-02-15
Budget End: 2025-01-31
Fiscal Year: 2019
Total Cost: $192,804
Name: Johns Hopkins University
City: Baltimore
State: MD
Country: United States
Zip Code: 21218