Emerging data science applications require efficient extraction of actionable insights from large and messy datasets. The number of relevant features often overwhelms the volume of available data, which dramatically complicates statistical inference and subsequent decision making. In the existing statistical literature, most theory aims at understanding the average or global behavior of a statistical estimator in high dimensions. In many applications, however, the goal is not to explore the global behavior of a parameter estimator, but rather to perform inference and reasoning on its local, yet important, operational properties. The techniques and methods developed in the project will further advance the interplay among a broad range of areas, including high-dimensional statistics, harmonic analysis, statistical physics, optimization, complex analysis, and statistical machine learning. The project provides research training opportunities for graduate students.

This project pursues fine-grained inferential procedures and theory, aimed at enlarging the uncertainty assessment toolbox for various low-complexity models in high dimensions. Focusing on a few stylized problems, this research program consists of four major thrusts: (1) construct optimal confidence intervals for linear functionals of eigenvectors in low-rank matrix estimation; (2) design fine-grained hypothesis testing procedures for sparse regression under general designs; (3) develop entry-wise inference schemes for principal component analysis with missing data; and (4) conduct reliable and adaptive statistical eigen-analysis under minimal eigen-gaps. Emphasis is placed on algorithms that are model-agnostic and fully adaptive to data heteroscedasticity. Addressing these problems calls for the development of new statistical theory that enables reliable inference for a broad class of local properties of the unknown parameters.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 2014279
Program Officer: Pena Edsel
Project Start:
Project End:
Budget Start: 2020-07-01
Budget End: 2023-06-30
Support Year:
Fiscal Year: 2020
Total Cost: $100,000
Indirect Cost:
Name: Princeton University
Department:
Type:
DUNS #:
City: Princeton
State: NJ
Country: United States
Zip Code: 08544