Inference under Selection and Model Uncertainty

Young, Linda; Casella, George

Abstract

This project has two distinct parts, each suggested by problems of inference in genomic experiments. The first problem arises because, typically, thousands of genes are screened and a smaller number are selected for further study. Statistical inference must take this selection mechanism into account; otherwise the actual confidence coefficient is smaller than the nominal level and approaches zero as the number of genes increases. The goal is to construct valid frequentist confidence intervals for the means of the selected populations. This will provide a confidence interval alternative to the False Discovery Rate. The second problem deals with inference under model uncertainty, where the goal is to account for the variability induced by the collection of models. Here a Bayesian approach is taken, seeking to construct intervals that account for model uncertainty, to investigate the impact of the choice of priors on the model space, and to construct new search algorithms that take advantage of parallel processing and can be used when there are more covariates than observations.
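
The loss of coverage under selection can be illustrated with a small simulation. The sketch below is not the project's method. It assumes a deliberately simple setting that the abstract does not specify: every gene has true mean 0, each estimate is an independent N(0, 1) draw, the gene with the largest estimate is selected, and the naive 95% interval (estimate plus or minus 1.96) is reported for it. The function names `naive_coverage` and `exact_coverage` are hypothetical and introduced only for this illustration.

```python
import random


def naive_coverage(num_genes, trials=2000, z=1.96, seed=0):
    """Estimate how often the naive interval x_max +/- z covers the true mean.

    Assumption (illustrative only): every gene has true mean 0 and its
    estimate is an independent N(0, 1) draw, so the selected gene's true
    mean is also 0.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        # Screen all genes and keep the one with the largest estimate.
        x_max = max(rng.gauss(0.0, 1.0) for _ in range(num_genes))
        # The naive interval ignores selection; check whether it contains 0.
        if abs(x_max) <= z:
            covered += 1
    return covered / trials


def exact_coverage(num_genes, upper=0.975):
    """Closed form under the same assumption: P(-1.96 <= max <= 1.96).

    P(max <= 1.96) is upper**num_genes and P(max < -1.96) is
    (1 - upper)**num_genes for independent standard normal estimates.
    """
    return upper ** num_genes - (1.0 - upper) ** num_genes


if __name__ == "__main__":
    for k in (1, 10, 100, 1000):
        print(k, round(naive_coverage(k), 3), round(exact_coverage(k), 3))
```

Under these assumptions the closed form is roughly 0.975 raised to the number of genes, which shrinks toward zero as more genes are screened. This matches the behavior the abstract describes for intervals that ignore the selection mechanism.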

The work will have impact in both genomic studies and high performance computing. First, for inference from genomic studies, a valid statistical procedure to screen results will be provided. Ensuring that the inferences are valid is of crucial importance, as illustrated by a recent NY Times article where a genomic disease therapy was found to be useless because of faulty statistical inference ("How Bright Promise in Cancer Testing Fell Apart", NY Times, July 7, 2011). Second, parallel processing algorithms, using high performance computing, will be developed. These algorithms take advantage of the abundance of processors typically available and split the large genomic selection problem across the many processors. As a result, answers from these statistical procedures can be available in real time and thus be relevant in a clinical setting.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1105127
Program Officer: Gabor Szekely

Budget Start: 2011-09-15
Budget End: 2015-08-31
Fiscal Year: 2011
Total Cost: $172,353

University of Florida, Gainesville, FL, United States
