Resampling methods are a broad class of tools for measuring the variability of statistical results; for example, they allow a researcher to determine whether the outcome of an experiment is significant. Over the last few decades, these methods have been studied extensively, and they have become fundamental to the practice of statistics, in large part because they can solve complex problems while relying on relatively few assumptions. Nevertheless, much remains to be understood about the performance of resampling methods in modern data analysis, where observations tend to have large numbers of features (high-dimensional data), or where the quantity of data is so large that it outstrips computational resources (large-scale data). The proposed research will extend the applicability of resampling methods in both of these challenging settings, guided by the two research themes discussed below.
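As a concrete illustration of how a resampling method measures variability, the following minimal sketch computes a percentile bootstrap confidence interval for a sample mean. It is not taken from the proposal: the function name `bootstrap_ci`, its parameters, and the data values are all hypothetical and chosen only for illustration.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data) (illustrative helper)."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    # Read off the empirical alpha/2 and 1 - alpha/2 quantiles.
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical measurements, made up for illustration only.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
low, high = bootstrap_ci(sample)
print(f"sample mean = {statistics.mean(sample):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The only assumption the procedure makes about the data is that the observations can be treated as independent draws from a common distribution, which is one sense in which resampling relies on relatively few assumptions.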

First, in the setting of high-dimensional data, the understanding of inference problems, including tests and confidence intervals, remains underdeveloped in comparison with estimation and prediction problems. Given that resampling methods are a general-purpose approach to inference, it is important to know how they are influenced by the effects of low-dimensional structure and regularization. In particular, the proposed research will study the performance of resampling methods in high-dimensional models involving structured covariance matrices.

Second, in the setting of large-scale data, randomized algorithms have received growing attention for their ability to produce fast approximate solutions. Although the outputs of such algorithms are random, their fluctuations can often be reduced at the expense of greater computation. This general trait of randomized algorithms leads to the problem of optimizing a tradeoff between precision and computational cost. Towards a solution, the proposed research will investigate how resampling methods can be used to measure this tradeoff for a collection of popular randomized algorithms.
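The tradeoff between precision and computational cost can be sketched on a toy problem. The example below is an assumption-laden illustration, not one of the algorithms studied in the proposal: the "randomized algorithm" is simply a uniform subsample used to approximate the mean of a large synthetic data set, and the bootstrap is applied to the subsample to estimate the approximation error without rerunning the algorithm. The names `sampled_mean` and `bootstrap_error` are hypothetical.

```python
import random
import statistics

def sampled_mean(values, s, rng):
    """Toy randomized algorithm: approximate mean(values) from s uniform draws."""
    draws = [values[rng.randrange(len(values))] for _ in range(s)]
    return statistics.fmean(draws), draws

def bootstrap_error(draws, rng, n_boot=200, q=0.9):
    """Estimate the q-quantile of the approximation error by resampling the draws."""
    center = statistics.fmean(draws)
    errs = sorted(
        abs(statistics.fmean(rng.choices(draws, k=len(draws))) - center)
        for _ in range(n_boot)
    )
    return errs[int(q * n_boot) - 1]

rng = random.Random(0)
# Synthetic stand-in for a data set too large to process exactly.
values = [rng.expovariate(1.0) for _ in range(100_000)]

# A larger sample size s costs more computation but should shrink the error estimate.
for s in (100, 400, 1600):
    approx, draws = sampled_mean(values, s, rng)
    est = bootstrap_error(draws, rng)
    print(f"s={s:5d}  approx={approx:.3f}  bootstrap error estimate={est:.3f}")
```

In this sketch the error estimate is computed from the algorithm's own output, so a user could in principle increase `s` only until the estimated error falls below a chosen tolerance.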

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1613218
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2016-07-01
Budget End: 2020-06-30
Support Year:
Fiscal Year: 2016
Total Cost: $150,000
Indirect Cost:
Name: University of California Davis
Department:
Type:
DUNS #:
City: Davis
State: CA
Country: United States
Zip Code: 95618