In many fields of science and engineering, probabilistic latent variable models are a powerful and widely used tool for drawing inferences from complex data. They provide a flexible framework by modeling the complexity in observed data as arising from interactions between simpler random and unobserved quantities. Latent variable models used in modern applications are often high-dimensional, and this leads to both statistical and computational challenges for inference: Surprising phenomena emerge in which structure in one latent variable can create spurious and problematic artifacts in classical inference procedures for another. These classical procedures also commonly lead to non-convex optimization problems over a large number of parameters, which are computationally difficult to solve.
This research will study these statistical and computational challenges for inference in high-dimensional latent variable models. The aim is to answer the following questions: How and why can one source of latent variation lead to artifacts in classical statistical estimates for another? What are the geometric properties of objective function landscapes in these models that render them difficult to optimize? And can we design improved inferential procedures that correct for these artifacts and are easier to compute? The research will apply techniques from random matrix theory, free probability theory, and statistical physics to obtain a better understanding of these questions in high-dimensional settings.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.