The success of deep learning has had a major impact across industry, commerce, science and society. But many aspects of this technology differ markedly from classical methodology and remain poorly understood. Gaining a theoretical understanding will be crucial for overcoming its drawbacks. The Collaboration on the Theoretical Foundations of Deep Learning aims to address these challenges: understanding the mathematical mechanisms that underpin the practical success of deep learning, using this understanding to elucidate the limitations of current methods and extend them beyond the domains where they are currently applicable, and initiating the study of the array of mathematical problems that emerge. The team has planned a range of mechanisms to facilitate collaboration, including teleconference and in-person research meetings, a centrally organized postdoc program, and a program for visits between institutions by postdocs and graduate students. Research outcomes from the collaboration have strong potential to directly impact the many application domains for deep learning. The project will also have broad impacts through its education, human resource development and broadening participation programs, in particular through training a diverse cohort of graduate students and postdocs using an approach that emphasizes strong mentorship, flexibility, and breadth of collaboration opportunities; through an annual summer school that will deliver a curriculum in the theoretical foundations of deep learning to a diverse group of graduate students, postdocs, and junior faculty; and through targeting broader participation in the collaboration’s research workshops and summer schools.
The collaboration’s research agenda is built on the following hypotheses: that overparametrization allows efficient optimization; that interpolation with implicit regularization enables generalization; and that depth confers representational richness through compositionality. The team aims to formulate and rigorously study these hypotheses as general mathematical phenomena, with the objective of understanding deep learning, extending its applicability, and developing new methods. Beyond enabling the development of improved deep learning methods based on principled design techniques, understanding the mathematical mechanisms that underlie the success of deep learning will also have repercussions for statistics and mathematics, including a new perspective on classical statistical methods, such as reproducing kernel Hilbert spaces and decision forests, and new research directions in nonlinear matrix theory and in understanding random landscapes. In addition, the collaboration’s research workshops will be open to the public and will serve the broader research community in addressing these key challenges.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.