Unsupervised learning of useful features, or representations, is one of the most basic challenges of machine learning. Unsupervised representation learning techniques capitalize on unlabeled data which is often cheap and abundant and sometimes virtually unlimited. The goal of these ubiquitous techniques is to learn a representation that reveals intrinsic low-dimensional structure in data, disentangles underlying factors of variation by incorporating universal AI priors such as smoothness and sparsity, and is useful across multiple tasks and domains.

This project aims to develop new theory and methods for representation learning that can easily scale to large datasets. In particular, this project is concerned with methods for large-scale unsupervised feature learning, including Principal Component Analysis (PCA) and Partial Least Squares (PLS). To capitalize on massive amounts of unlabeled data, this project will develop appropriate computational approaches and study them in the ?data laden? regime. Therefore, instead of viewing representation learning as dimensionality reduction techniques and focusing on an empirical objective on finite data, these methods are studied with the goal of optimizing a population objective based on sample. This view suggests using Stochastic Approximation approaches, such as Stochastic Gradient Descent (SGD) and Stochastic Mirror Descent, that are incremental in nature and process each new sample with a computationally cheap update. Furthermore, this view enables a rigorous analysis of benefits of stochastic approximation algorithms over traditional finite-data methods. The project aims to develop stochastic approximation approaches to PCA and PLS and related problems and extensions, including deep, and sparse variants, and analyze these problems in the data-laden regime.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
1546462
Program Officer
Sylvia Spengler
Project Start
Project End
Budget Start
2015-09-01
Budget End
2018-08-31
Support Year
Fiscal Year
2015
Total Cost
$400,810
Indirect Cost
Name
Princeton University
Department
Type
DUNS #
City
Princeton
State
NJ
Country
United States
Zip Code
08544