Prediction and Model Selection for New Challenging Problems with Complex Data

Jiang, Jiming

Abstract

Mixed model prediction, that is, prediction based on a class of statistical models known as mixed effects models, has a fairly long history. Its traditional fields of application include genetics, agriculture, education, and surveys. New and challenging problems have now emerged in fields such as business and the health sciences, as well as in the traditional fields. Methods of mixed model prediction are potentially applicable to these problems, but not without further methodological and computational development. Some of these problems occur when interest is at the subject level, as in personalized medicine, or at the level of a (small) sub-population, such as a small community, rather than at the level of a large population. In such cases, it is possible to make substantial gains in prediction accuracy by identifying the class to which a new subject belongs. Other challenging problems arise when applying existing model search strategies to incomplete or missing data, when searching for or selecting a model with prediction as the primary interest, and when making statistical inference based on the result of model search or selection. This collaborative research project aims to solve these challenging problems in prediction and model selection for complex data, such as incomplete or missing data and data that are correlated due to the presence of random effects.

In this collaborative research project, the PIs develop a novel statistical method, called classified mixed model prediction, to identify the class of a new subject. The new subject is thereby associated with the random effect of the same class in the training data, so that the mixed model prediction method can be used to make the best prediction. Furthermore, the PIs further develop a recently proposed method, called the E-MS algorithm, for model selection in the presence of incomplete or missing data. The PIs also develop an idea called predictive model selection by deriving a predictive measure of lack-of-fit and combining this measure with a recently developed class of model selection strategies, called the fence methods. Finally, the PIs develop a unified jackknife method to accurately assess uncertainty in mixed model analysis after model selection. Theories will be established for these new methods, and their performance and potential gains will be studied through extensive Monte Carlo simulations. The new methods will be implemented in the R language and environment for statistical computing and graphics. All of the developed methodologies will be applied and tested in a number of applications through a series of close collaborations with experts who will provide access to the data as well as guidance in the interpretation and dissemination of findings. The fields of application include genetics, health and medicine, agriculture, education, business, and the economy. The research project will also promote teaching, training, and learning that involve under-represented groups, and build research networks between the PIs' institutions.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1510219
Program Officer: Gabor Szekely

Budget Start: 2015-08-15
Budget End: 2018-07-31
Fiscal Year: 2015
Total Cost: $112,340

Institution

University of California Davis, Davis, CA, United States