The investigator develops new methods and theory for problems in multiple testing and simultaneous inference. The classical approach to dealing with multiplicity is to require decision rules that control the familywise error rate, the probability of rejecting at least one true null hypothesis. But when the number of tests is large, control of the familywise error rate is so stringent that false null hypotheses have little chance of being detected. In response, the false discovery rate and other measures of error control have gained wide use. For each measure of error control, the goal is to construct procedures that achieve that control under the weakest possible assumptions. Resampling methods offer viable approaches to obtaining valid distributional approximations while assuming very little about the stochastic mechanism generating the data. While many new methods have been developed, many more open questions remain and are studied here. The technical challenges include: asymptotic regimes in which both the sample size and the number of tests grow; orders of error in approximation; uniformity of approximation; optimality theory; and directional errors. Related problems are also studied, such as the statistical evaluation of bioequivalence across multiple measures, testing for stochastic dominance, and inference for partially identified econometric models.
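To make these notions of error control concrete, the sketch below contrasts the Benjamini-Hochberg step-up rule, which controls the false discovery rate, with a permutation-based single-step max-statistic adjustment, a standard resampling approach to familywise error control. It is an illustrative outline under assumed conditions (a two-sample comparison across many features), not one of the investigator's proposed procedures; the function names and the toy data are assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm

def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up procedure controlling the false discovery rate at level alpha."""
    p = np.asarray(pvalues)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= np.arange(1, m + 1) * alpha / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest k with p_(k) <= k * alpha / m
        reject[order[:k + 1]] = True
    return reject

def two_sample_t(X, y):
    """Welch t-statistic for each column of X, comparing groups y == 1 and y == 0."""
    a, b = X[y == 1], X[y == 0]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return (a.mean(axis=0) - b.mean(axis=0)) / se

def maxT_adjusted_pvalues(X, y, n_perm=1000, seed=0):
    """Single-step max-|t| permutation adjustment: the adjusted p-value for
    hypothesis j is the fraction of label permutations whose maximal |t|
    exceeds the observed |t_j|.  This controls the familywise error rate
    without parametric assumptions on the joint law of the statistics."""
    rng = np.random.default_rng(seed)
    obs = np.abs(two_sample_t(X, y))
    exceed = np.zeros_like(obs)
    for _ in range(n_perm):
        exceed += np.max(np.abs(two_sample_t(X, rng.permutation(y)))) >= obs
    return (exceed + 1) / (n_perm + 1)

# Toy data: 200 features measured on 20 samples in 2 groups; 20 features truly shifted.
rng = np.random.default_rng(1)
n, m, m1 = 20, 200, 20
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, m))
X[y == 1, :m1] += 2.0

pvals = 2 * norm.sf(np.abs(two_sample_t(X, y)))        # naive per-test p-values
print("BH (FDR 5%) rejections:    ", benjamini_hochberg(pvals, alpha=0.05).sum())
print("max-T (FWER 5%) rejections:", (maxT_adjusted_pvalues(X, y) <= 0.05).sum())
```

In a typical run on data like this, the max-T adjustment rejects fewer hypotheses than the BH rule, illustrating the point above that familywise error control becomes stringent as the number of tests grows.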

Virtually any scientific experiment sets out to answer questions about the process under investigation, questions that can often be translated formally into a set of hypotheses. It is the exception, rather than the rule, that only a single hypothesis is considered. For example, in clinical trials, even a single treatment may be evaluated using multiple outcome measures, multiple time points, multiple doses, and multiple subgroups. Moreover, "data snooping" (or "data mining") gives rise to additional hypotheses during the course of the analysis. The statistician is then faced with the challenge of accounting for all possible errors in a complex data analysis, so that any resulting inferences or interesting conclusions can reliably be viewed as real structure rather than artifacts of random data. In general, the philosophical approach is to develop practical methods that can be applied in increasingly complex situations as the scope of modern data analysis continues to grow. The broader impact of this work is potentially quite large because the resulting inferential tools can be applied to fields as diverse as genetics, bioengineering, image processing and neuroimaging, clinical trials, education, astronomy, finance, and econometrics. For example, current methods in biotechnology and genomics produce DNA microarray experiments, in which expression levels for thousands of genes must be analyzed simultaneously. These many burgeoning fields of application demand new statistical methods, creating challenging and exciting opportunities for young scholars under the direction of the PI.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1007732
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2010-07-01
Budget End: 2015-06-30
Support Year:
Fiscal Year: 2010
Total Cost: $369,997
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305