Contemporary scientific problems involving large, high-dimensional datasets have ushered in a new era for statisticians. A largely unexplored area of research concerns incorporating ideas from robust statistics into the growing collection of methods for high-dimensional estimation and inference. Preliminary results are promising, but many theoretical and philosophical challenges arise in generalizing notions from classical robust statistics to high-dimensional settings. The investigator will develop new statistical methodology for high-dimensional robust estimation and derive rigorous theory for the proposed estimators. This research project is highly interdisciplinary, cutting across statistics, engineering, and computer science. Results of the research will be disseminated broadly, leading to cross-pollination between fields and revitalized interest in robust statistics. In addition, the investigator will refine and test her methods in radiology applications, initiating new scientific collaborations and leading to more robust medical imaging procedures for use in medical research. The investigator will also develop new educational material based on the research, to be incorporated into cross-listed classes in machine learning at the graduate and undergraduate levels. The investigator will work to improve the image of statistics and data science by engaging the wider community through public talks and visits to high school math circles across the state of Wisconsin.

Questions to be explored in this research project include: (1) How do existing notions of robustness apply to high-dimensional settings? (2) How should high-dimensional estimation procedures be modified to protect against deviations from distributional assumptions? (3) How might one quantify the relative robustness of various proposals? The project aims to generate novel theoretical results that advance the frontiers of both statistics and optimization theory. New algorithms will be devised for high-dimensional statistical estimation with guaranteed accuracy under a broader set of model assumptions. The theoretical analysis will involve studying a variety of non-convex estimators of independent interest in the optimization community, with emphasis on objectives and optimization algorithms that give rise to statistically consistent solutions. The research project will also address long-standing open questions in robust statistics involving optimization of low-dimensional non-convex objective functions. In addition, the investigator will examine a variety of new problem settings arising in machine learning applications, including adversarially contaminated data, non-i.i.d. observations, and mislabeled datasets split into training and testing data.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1749857
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2018-09-01
Budget End: 2023-08-31
Support Year:
Fiscal Year: 2017
Total Cost: $233,130
Indirect Cost:
Name: University of Wisconsin Madison
Department:
Type:
DUNS #:
City: Madison
State: WI
Country: United States
Zip Code: 53715