This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
The investigator will develop designs for statistical experiments that adapt over time to incoming data -- so-called "adaptive designs" -- along with methods for analyzing the data such experiments produce, for two general classes of problems: (I) optimal parameter estimation, control, and design in multiperiod regression problems with nonlinear models (e.g., generalized linear models), and (II) time-sequential tests of multiple hypotheses. In Part I, recent computational advances known as approximate dynamic programming, which hold promise for constructing optimal or nearly optimal estimation and control procedures in nonlinear regression models, will be harnessed. In Part II, recent methodological advances will be used and extended to develop a unified approach to testing multiple hypotheses over time or in stages in a statistically optimal way, with either stringent familywise error rate (FWER) control or the more lenient false discovery rate (FDR) control. For both parts, the resulting procedures will be studied analytically, assessed through extensive numerical simulations, and applied to real data. Applications to economics, DNA microarray analysis, psychometric testing, biomedical trials, and engineering control problems will be addressed, and practical algorithms will be developed for real, on-line implementation in these areas.
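To make the two error criteria concrete, the brief sketch below applies two classical single-stage procedures to a set of hypothetical p-values: the Holm step-down rule, which controls the familywise error rate, and the Benjamini-Hochberg step-up rule, which controls the false discovery rate. These textbook baselines are shown only to illustrate the FWER and FDR criteria; they are not the time-sequential, multistage procedures proposed in Part II, and the p-values are purely illustrative.

```python
import numpy as np

def holm_rejections(pvals, alpha=0.05):
    """Holm step-down procedure: controls the familywise error rate (FWER)."""
    m = len(pvals)
    order = np.argsort(pvals)
    reject = np.zeros(m, dtype=bool)
    for k, i in enumerate(order):          # k-th smallest p-value is pvals[i]
        if pvals[i] <= alpha / (m - k):    # Holm threshold alpha / (m - k)
            reject[i] = True
        else:
            break                          # step-down: stop at the first failure
    return reject

def bh_rejections(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: controls the false discovery rate (FDR)."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = np.nonzero(pvals[order] <= alpha * np.arange(1, m + 1) / m)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:                             # reject every hypothesis up to the
        reject[order[:below[-1] + 1]] = True   # largest index passing its threshold
    return reject

# Eight hypothetical p-values (purely illustrative).
p = np.array([0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9])
print("Holm (FWER) rejections:", int(holm_rejections(p).sum()))   # 1
print("BH   (FDR)  rejections:", int(bh_rejections(p).sum()))     # 3
```

On the same p-values the FDR criterion admits more rejections than the FWER criterion, which is the sense in which FWER control is the more stringent of the two.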
Many statistical challenges of great societal importance require adapting one's actions quickly and intelligently as new information arrives over time. Such challenges arise in solar energy, robotics, automobile emissions, homeland security, and health care. In this project the investigator will develop the statistical procedures and algorithms that underlie some of the most challenging problems in these areas. Part I of the project concerns regression problems in which the observer can influence the settings under which statistical information is generated, and asks how to design and change these settings over time so as to learn about unknown parameters as efficiently as possible while controlling the effects of the chosen settings. Recent computational advances known as approximate dynamic programming hold the potential to solve previously intractable problems of this form. Part II concerns how to combine statistical information from many disparate sources over time as efficiently as possible in order to reach justified conclusions, and how to avoid the false discoveries that arise when many hypotheses are tested at once. The theory underlying these problems will be studied to develop data-analysis and computational algorithms for real, on-line implementation, and software packages will be developed to facilitate applications.
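As a toy illustration of the Part I setting, the sketch below runs a hypothetical adaptive dose-response experiment under a logistic regression model: after each observation, the next dose is chosen greedily to maximize the determinant of the Fisher information evaluated at the current parameter estimate (a one-step-lookahead, locally D-optimal rule). This myopic rule is only a simple stand-in for the approximate dynamic programming procedures the project proposes, and the model, dose grid, and helper names such as fit_logistic are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))  # clip to avoid overflow

def fisher_info(X, beta):
    # Fisher information of a logistic regression model at parameter beta.
    p = logistic(X @ beta)
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X

def fit_logistic(X, y, ridge=0.1, iters=25):
    # Newton-Raphson fit with a modest ridge penalty, which keeps the
    # iterations stable and the estimate finite if the responses separate.
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = logistic(X @ beta)
        grad = X.T @ (y - p) - ridge * beta
        hess = fisher_info(X, beta) + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical dose-response truth and allowable dose settings (illustrative only).
beta_true = np.array([-1.0, 2.0])
doses = np.linspace(0.0, 2.0, 21)

# A few fixed start-up observations before adaptation begins.
X = np.array([[1.0, d] for d in (0.0, 0.7, 1.3, 2.0)])
y = rng.binomial(1, logistic(X @ beta_true))

for _ in range(30):
    beta_hat = fit_logistic(X, y)
    # Greedy one-step-lookahead D-optimality: pick the dose whose added
    # observation maximizes det of the updated Fisher information.
    dets = [np.linalg.det(fisher_info(np.vstack([X, [1.0, d]]), beta_hat))
            for d in doses]
    x_next = np.array([1.0, doses[int(np.argmax(dets))]])
    X = np.vstack([X, x_next])
    y = np.append(y, rng.binomial(1, logistic(x_next @ beta_true)))

print("final estimate:", fit_logistic(X, y), " true value:", beta_true)
```

A full dynamic programming formulation would also account for how today's dose choice affects the value of every future choice; the approximate dynamic programming methods referenced above are aimed at making that full look-ahead computationally tractable.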