Machine Learning (ML) has been described as the fuel of the next industrial revolution. Yet, current state-of-the-art ML algorithms still rely heavily on having a human in the loop in order to work properly. Indeed, training an ML algorithm requires significant human intervention to tweak and tune its many knobs, and most of the time this tuning is carried out without any theoretical guidelines. The most common options for ML practitioners are to follow their intuition or to exhaustively evaluate all the possibilities, making the overall time needed to train these algorithms difficult to predict. In this project, the investigator aims to design truly parameter-free training algorithms for ML, removing the burden of human parameter-tuning and making ML algorithms more scalable. The investigator also proposes an Education Plan designed to fill the gap between theoretical and applied ML in the minds of most students. It will involve a collaboration with the PROgram in Mathematics for Young Scientists (PROMYS), to introduce basic concepts of machine learning theory to both high school students and teachers, and with Smith College, a women's college.
The project stems from new methods the investigator has recently introduced to design optimal parameter-free optimization algorithms, e.g., Stochastic Gradient Descent without learning rates, for the particular case of convex functions (an illustrative sketch of such a method is given after the aims below). In this project, the investigator proposes to go beyond convex functions. In particular, the investigator plans to pursue the following inter-related aims:

1. Computational complexity of learning without prior knowledge. The objective is to fully characterize the computational complexity of learning, without assuming knowledge of unknown quantities, in both worst cases and "easy" cases. In other words, if a learning algorithm needs parameters to achieve optimal performance, the goal is to characterize the overall computational complexity, including the search for the optimal parameters.

2. Non-convex stochastic optimization without parameters. The goal is to generalize recent results on parameter-free optimization to the non-convex problems arising in modern ML. The investigator plans to use a novel optimization framework that mixes elements of Online Mirror Descent and Dual Averaging, the two frameworks commonly used to design optimization algorithms.

3. Reducing parameters in optimization for Deep Learning. Deep Learning models are the most parameter-heavy in ML. The project considers three applications of this research to widely used deep-learning algorithms: adaptive scaling heuristics, saddle-point optimization, and Bayesian Optimization with Gaussian Processes.
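To make the idea of "Stochastic Gradient Descent without learning rates" concrete, the following is a minimal, illustrative Python sketch of a coin-betting-style parameter-free gradient method. It is not the project's actual algorithm: the function name parameter_free_sgd, the Krichevsky-Trofimov betting fraction, the initial wealth eps, and the assumption that (sub)gradients are bounded in [-1, 1] are all choices made here for the example.

```python
import numpy as np

def parameter_free_sgd(grad_fn, dim, n_steps, eps=1.0):
    """Coin-betting ("learning-rate-free") gradient method; illustrative only.

    Assumes grad_fn(w) returns a (sub)gradient with entries in [-1, 1].
    eps is the initial betting "wealth", not a learning rate: it only
    affects constants in the convergence bound, not the rate itself.
    """
    w = np.zeros(dim)               # current iterate (the current "bet")
    w_avg = np.zeros(dim)           # running average of iterates
    wealth = np.full(dim, eps)      # per-coordinate betting wealth
    grad_sum = np.zeros(dim)        # running sum of negative gradients
    for t in range(1, n_steps + 1):
        g = np.clip(grad_fn(w), -1.0, 1.0)   # enforce the bounded-gradient assumption
        w_avg += (w - w_avg) / t             # average iterate carries the convex guarantee
        wealth += -g * w                     # outcome of the current bet
        grad_sum += -g
        w = (grad_sum / (t + 1)) * wealth    # Krichevsky-Trofimov fraction of the wealth
    return w_avg

# Example: minimize f(w) = |w - 3| with no step-size tuning.
w_hat = parameter_free_sgd(lambda w: np.sign(w - 3.0), dim=1, n_steps=5000)
```

The point of the sketch is only that no step size appears anywhere: the update is driven entirely by the accumulated gradients and the betting wealth, which is the flavor of parameter-free method the project builds on.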
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.