Structured, high-dimensional regression problems are a topic of major current interest owing to recent applications in modern scientific fields. Such structured, high-dimensional data arise naturally in bioinformatics, signal processing, quantum mechanics, and networks. Handling such massive high-dimensional data is impossible unless the underlying parameter of interest has additional structure. Examples of additional structure that arise in applications include sparsity and low-rankness, and powerful estimation methods have been developed in the last decade to recover the true parameter whenever such structure is present in the data. However, identifying whether such structure is present in the data is more challenging than the corresponding estimation problem: new phase transitions appear and novel methods are needed. The goal of the current project is to develop statistical methodology for identifying the existence of such additional structure, and to study the theoretical phase transitions of the problem.

The high-dimensional setting is the situation where the number of unknown parameters exceeds the number of samples. Estimation procedures have successfully handled this setting under additional structural assumptions on the problem. Common structural assumptions include sparsity and low-rankness, where the unknown parameter possesses some low-dimensional property. In the so-called sparse or low-rank regime, consistent estimation of the unknown parameter becomes possible even in high-dimensional "large p, small n" settings. The proposed research will focus on uncertainty quantification in such structured, high-dimensional settings. Methodologies will be developed to construct confidence sets and confidence bands tailored to structured high-dimensional problems. A major challenge that will be studied both theoretically and empirically is identifying whether the sparse or low-rank regime actually occurs, or equivalently, the construction of adaptive confidence sets. An adaptive confidence set captures the true parameter with high probability, and its size shrinks optimally with respect to the unknown sparsity or low-rankness of the problem. The optimality properties of adaptive confidence sets will be studied, as well as the corresponding phase transitions with respect to the problem parameters. The proposed research is motivated by, and will be directly applicable to, scientific fields where high-dimensional data arise. An adaptive confidence set has major practical applications because it provides a certificate that the sparse or low-rank regime actually occurs, and such a certificate asserts that the high-dimensional estimation and inferential methods used in practice are accurate. Identifying whether the sparse regime occurs has direct applications in bioinformatics and signal processing, while identifying whether the low-rank regime occurs has direct applications in quantum tomography and matrix completion.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1811976
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2018-06-01
Budget End: 2021-05-31
Support Year:
Fiscal Year: 2018
Total Cost: $179,997
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: Piscataway
State: NJ
Country: United States
Zip Code: 08854