The investigator proposes two distinct threads of research. One thread involves estimation of the distributional components of random effects and mixed effects models, and the other is an investigation of BayesSim, a new and potentially more efficient way of conducting statistical simulation studies. Random and mixed effects models are widely used statistical models with applications in many fields. The investigator studies minimum distance methods for nonparametrically estimating the distributions of the random components of such models. This study includes an investigation of bootstrap methods for placing error bounds on these nonparametric estimates. The investigator also studies the use of minimum distance methods for testing common assumptions associated with mixed models, such as normality and independence of random effects and errors.

The complexity of modern statistical models has made simulation the most commonly used means of investigating new statistical methodology. Almost every simulation study published in a statistics journal proceeds as follows: a few models are selected, hundreds or thousands of data sets are generated from each model, the methodology of interest is applied to every data set, and the results are summarized. The investigator considers a different simulation strategy, called BayesSim. The main idea is to do a better job of sampling all the relevant models. The strategy is to generate one data set (or at most a few data sets) from each of hundreds or thousands of randomly selected models. This approach has the potential to provide more complete information about a statistical method, at a reduced computational cost relative to the traditional simulation method.
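The contrast between the two simulation strategies can be sketched in a few lines of Python. This is only an illustrative sketch, not the investigator's implementation: the abstract does not say how models are randomly selected or how results are summarized, so the contaminated-normal model, the uniform draw of the contamination fraction `eps`, the comparison of the sample mean with the sample median, and the binned summary are all assumptions made here, and every function name is hypothetical.

```python
import random
import statistics


def generate_data(eps, n, rng):
    """Draw n points from a contaminated normal centered at 0:
    N(0, 1) with probability 1 - eps, N(0, 5^2) with probability eps."""
    return [rng.gauss(0.0, 5.0 if rng.random() < eps else 1.0) for _ in range(n)]


def squared_errors(data):
    """Squared error of the sample mean and sample median for the true center 0."""
    return statistics.fmean(data) ** 2, statistics.median(data) ** 2


def traditional_sim(eps_values, reps, n, rng):
    """Traditional design: a few fixed models, many data sets from each."""
    results = {}
    for eps in eps_values:
        errs = [squared_errors(generate_data(eps, n, rng)) for _ in range(reps)]
        results[eps] = (
            statistics.fmean(e[0] for e in errs),  # average squared error of the mean
            statistics.fmean(e[1] for e in errs),  # average squared error of the median
        )
    return results


def bayes_sim(num_models, n, rng, eps_max=0.3):
    """BayesSim-style design (assumed form): many randomly selected models,
    one data set from each. Returns (eps, mean_sq_err, median_sq_err) records."""
    records = []
    for _ in range(num_models):
        eps = rng.uniform(0.0, eps_max)  # assumed: models drawn uniformly over eps
        mean_err, med_err = squared_errors(generate_data(eps, n, rng))
        records.append((eps, mean_err, med_err))
    return records


def summarize_by_bin(records, num_bins, eps_max=0.3):
    """Average the per-model errors within equal-width bins of eps.
    Returns (bin_center, avg_mean_err, avg_median_err, count) for non-empty bins."""
    bins = [[] for _ in range(num_bins)]
    for eps, mean_err, med_err in records:
        index = min(int(eps / eps_max * num_bins), num_bins - 1)
        bins[index].append((mean_err, med_err))
    width = eps_max / num_bins
    summary = []
    for i, members in enumerate(bins):
        if members:
            summary.append((
                (i + 0.5) * width,
                statistics.fmean(m[0] for m in members),
                statistics.fmean(m[1] for m in members),
                len(members),
            ))
    return summary


if __name__ == "__main__":
    rng = random.Random(2011)
    # Same total budget of 1500 data sets under both designs.
    print(traditional_sim([0.0, 0.1, 0.2], reps=500, n=30, rng=rng))
    for row in summarize_by_bin(bayes_sim(1500, n=30, rng=rng), num_bins=6):
        print(row)
```

With the same total number of generated data sets, the traditional design reports performance at three values of `eps` only, while the BayesSim-style design spreads the same effort over the whole range of `eps`, so the summary can show how the comparison changes as the model varies.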

Random effects models are often used for microarray data in genetics. The goal of a microarray study is often to determine whether certain genes are expressed more strongly in diseased patients than in healthy ones. Incorrect specification of the distributions in the random effects model could mean that such genes go undetected. Part of the research in this proposal is aimed at improving methods of determining these distributions. Mixed effects models are used in small area estimation, an enormously important technique in the field of survey sampling. Estimates of quantities in small areas, such as counties, that are based on surveys taken within those areas are often unreliable because of small sample sizes. Small area techniques use information from nearby larger areas to infer or predict quantities of interest in the small area. For example, the U.S. Census Bureau has a special program, called Small Area Income and Poverty Estimates, that provides more current estimates of certain income and poverty statistics than those from the most recent decennial census. Small area estimates made by the U.S. Census Bureau affect the allocation of federal funds to local jurisdictions, and hence have a major impact on U.S. society. The investigator's research on random and mixed effects models could improve methods for small area estimation. BayesSim has the potential to improve the answers obtained from any statistical simulation study, and hence could have a large impact on the entire field of statistics.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1106694
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2011-08-01
Budget End: 2014-07-31
Support Year:
Fiscal Year: 2011
Total Cost: $170,665
Indirect Cost:
Name: Texas A&M Research Foundation
Department:
Type:
DUNS #:
City: College Station
State: TX
Country: United States
Zip Code: 77845