Beran

Modern statistical methods for large data sets rely on rotation of the data in high dimensions (as in Fourier analysis, wavelet transforms, or projection pursuit), on smoothing (as in nonparametric regression), on shrinkage (as in Stein estimation or ridge regression), on variable selection (as in linear regression), and on combinations of these ideas (such as thresholding of wavelet or Fourier coefficients). Concurrently, statisticians have introduced computer-aided techniques, such as cross-validation or the bootstrap, for assessing the uncertainty in patterns recovered through data analyses. This research project develops: (a) necessary and sufficient conditions under which bootstrap distributions converge correctly, plus diagnostic methods for detecting bootstrap failure in data analyses; (b) modulation estimators that recover a signal from noise by adaptively tapering the rotated data, plus confidence regions for the signal that are centered at the modulation estimators; (c) nonparametric bootstrap confidence sets for all pairwise rotational differences among the mean directions (or mean axes) of several independent samples of directional (or axial) data.

The computer revolution in scientific and social measurement has created large, complex data sets. In response, data analysts have devised computer-assisted methods for recovering patterns from data. However, incompleteness of the data and measurement errors can induce errors in the conclusions reached. The recent controversy about Census undercounts in the cities is a prominent example. How much uncertainty is there in patterns recovered through sophisticated data analyses? The statistical technique called the bootstrap, which relies on fast computers, has grown since 1979 into the most widely applicable method for assessing the uncertainties inherent in data analyses. Unfortunately, bootstrap methods, as currently used, can sometimes give a misleading assessment of uncertainty.

Part (a) of this research project provides computer-intensive ways to detect and to correct such bootstrap failure. This portion of the work contributes to the Federal Strategic Area of high-performance computing. Part (b) of the project develops uncertainty assessments for signals recovered from noisy measurements. Electronic images recorded by a satellite camera are an instance of such measurements. This portion of the work provides statistical methodology for analyzing satellite data in the Federal Strategic Area of global change. Part (c) of the project develops uncertainty assessments for analyses of directional and axial data sets. Geophysical measurements in earthquake studies, oil exploration, and studies of volcanic activity are examples of such directional and axial data.
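As a rough illustration of the kind of bootstrap uncertainty assessment the abstract refers to, the sketch below computes a percentile bootstrap confidence interval for a statistic of a single sample. It is a generic textbook construction, not the project's own method. The function name `bootstrap_percentile_ci`, its parameters, and the example measurements are all hypothetical, chosen only for illustration.

```python
import random
import statistics


def bootstrap_percentile_ci(data, statistic=statistics.fmean,
                            n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for statistic(data).

    Hypothetical helper for illustration; a generic percentile
    bootstrap, not the procedure developed in the project.
    """
    if not data:
        raise ValueError("data must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Resample with replacement from the observed data and recompute
    # the statistic on each resample.
    replicates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    # Read the interval endpoints off the sorted bootstrap replicates.
    alpha = 1.0 - level
    lo_idx = int((alpha / 2) * (n_resamples - 1))
    hi_idx = int((1 - alpha / 2) * (n_resamples - 1))
    return replicates[lo_idx], replicates[hi_idx]


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the function.
    sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 4.9,
              5.6, 4.2, 5.2, 4.8, 5.5, 4.6, 5.4, 4.3, 5.7, 5.0]
    low, high = bootstrap_percentile_ci(sample)
    print(f"95% percentile bootstrap interval for the mean: [{low:.3f}, {high:.3f}]")
```

For a smooth statistic such as the mean, an interval of this kind is usually a reasonable first assessment. For statistics such as the sample maximum, the same recipe is known to behave poorly, which is the sort of bootstrap failure that part (a) of the project aims to detect.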

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 9530492
Program Officer: Joseph M. Rosenblatt
Project Start:
Project End:
Budget Start: 1996-07-01
Budget End: 2000-06-30
Support Year:
Fiscal Year: 1995
Total Cost: $171,000
Indirect Cost:
Name: University of California Berkeley
Department:
Type:
DUNS #:
City: Berkeley
State: CA
Country: United States
Zip Code: 94704