This Phase I project forms an NSF TRIPODS Institute, based at Lehigh University in collaboration with Stony Brook University and Northwestern University, focused on advancing tools for machine learning applications. A critical component of machine learning is mathematical optimization, through which historical data are used to train tools for making future predictions and decisions. Traditionally, optimization techniques for machine learning have focused on simplified models and algorithms. However, recent revolutionary leaps in the successes of machine learning tools---e.g., for image and speech recognition---have in many cases been made possible by a shift toward more complex techniques, often involving deep neural networks. Continued advances in the use of such techniques require combined efforts by statisticians, computer scientists, and applied mathematicians to develop more sophisticated models and algorithms, along with more comprehensive theoretical guarantees that support their use. In addition to its research goals, the institute trains Ph.D. students and postdoctoral fellows in statistics, computer science, and applied mathematics, and hosts interdisciplinary workshops and Winter/Summer schools.
The research efforts in Phase I focus on the analysis of nonconvex machine learning models, the design of optimization algorithms for training them, and the development of nonparametric models and associated algorithms. The primary setting is deep neural networks (DNNs), considered both in general and with respect to specific architectures of interest. The institute's research emphasizes the need to connect state-of-the-art approaches for training DNNs with statistical performance guarantees (e.g., on generalization error), which are currently not well understood. Algorithm development centers on second-order-derivative-type techniques, including (Hessian-free) Newton, quasi-Newton, and Gauss-Newton methods and their limited-memory variants. Recent advances have been made in the design of such methods; the PIs build on these efforts with their broad expertise in the design and implementation of such methods, including in parallel and distributed computing environments. The development of nonparametric models promises to free machine learning approaches from restrictions imposed by large numbers of user-defined parameters (e.g., those defining a network structure or the learning rate of an optimization algorithm). Such models could yield significant advances in machine learning, and the institute's work in this area also draws on the PIs' expertise in derivative-free optimization methods, which are needed for training in nonparametric settings.
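To make the limited-memory quasi-Newton techniques mentioned above concrete, the following is a minimal sketch of the standard L-BFGS two-loop recursion with a simple Armijo backtracking line search, applied to a toy strongly convex quadratic. This is an illustrative textbook sketch, not the institute's implementation; all function names and the test problem are assumptions made for this example.

```python
import numpy as np

def two_loop(g, s_list, y_list):
    """L-BFGS two-loop recursion: apply an implicit inverse-Hessian
    approximation, built from recent curvature pairs (s, y), to g."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):  # newest to oldest
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:  # scale by gamma = (s^T y) / (y^T y) from the latest pair
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):  # oldest to newest
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q  # approximates (inverse Hessian) @ g

def lbfgs(f, grad, x0, m=5, iters=50, tol=1e-8):
    """Minimize f via limited-memory BFGS, storing only the m most
    recent curvature pairs rather than a dense Hessian approximation."""
    x = np.asarray(x0, dtype=float).copy()
    g = grad(x)
    s_list, y_list = [], []
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        d = -two_loop(g, s_list, y_list)  # descent direction
        t, fx = 1.0, f(x)
        while f(x + t * d) > fx + 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5  # backtrack until sufficient decrease holds
        x_new = x + t * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-10:  # keep pair only if curvature condition holds
            s_list.append(s)
            y_list.append(y)
            if len(s_list) > m:  # drop the oldest pair (limited memory)
                s_list.pop(0)
                y_list.pop(0)
        x, g = x_new, g_new
    return x

# Toy strongly convex quadratic: f(x) = 0.5 x^T A x - b^T x,
# whose unique minimizer solves A x = b.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 2.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
x_star = lbfgs(f, grad, np.zeros(2))
```

The limited-memory structure is the point: only the last m vector pairs are stored, so each step costs O(mn) memory and arithmetic rather than the O(n^2) of a dense quasi-Newton matrix, which is what makes such methods candidates for training large models.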
In this TRIPODS institute, the PIs approach all of these research directions with a unified perspective spanning the three disciplines of statistics, computer science, and applied mathematics. Indeed, because machine learning draws so heavily from these areas, future progress requires close collaboration between optimization experts, learning theorists, and statisticians---communities of researchers that, as yet, have tended to operate separately, with differing terminology and publication venues. With an emphasis on deep learning, this institute aims to foster intercollegiate and interdisciplinary collaborations that overcome these hindrances.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.