The use of Bayesian statistics in the applied sciences has increased dramatically over the last decade, largely because Markov chain Monte Carlo (MCMC) methods have become available for estimating posterior distributions. Along with this increased use, researchers now routinely consider more complex models, for example hierarchical models with many levels and regression models with many potential predictors. Considering more sophisticated models has two consequences. Because the parameters live in larger spaces, there is a stronger need both for MCMC methods that give accurate estimates of the relevant posterior distributions and expectations for a given model, and for methods for model diagnostics and selection.

An important component of this project is the development of tools for model assessment and sensitivity analysis. The investigators develop methods for efficiently calculating and plotting a very large number of Bayes factors. They consider situations in which there is a set of models indexed by several continuous hyperparameters. Calculating the Bayes factors helps determine whether a subset of hyperparameter values constitutes a class of reasonable choices. The investigators also develop a set of computationally efficient schemes for estimating the posterior expectation of a function of a parameter as the prior is varied continuously. This enables users to determine which aspects of the prior have the biggest impact on the posterior.

Markov chains used in complex settings often involve enhancements designed to speed up convergence, but little is known theoretically about the effect of these enhancements. In this project, techniques from operator theory are used to analyze the long-term behavior of Markov chains. The operator theory framework provides tools for better understanding the behavior of these chains. This understanding enables the development of results on the accuracy of the estimates the chains produce, and also suggests ways to improve the chains or design better ones. The investigators apply the theoretical results obtained from operator theory to very concrete problems of model selection and assessment.
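
As a rough illustration of how Bayes factors and posterior expectations can be computed across many hyperparameter values from a single set of posterior draws, here is a minimal sketch. It is not the investigators' method; it assumes a toy conjugate model (y_i ~ N(theta, 1) with prior theta ~ N(0, tau^2)) and uses the standard identity that the Bayes factor of hyperparameter tau against a reference tau0 equals the posterior expectation, under tau0, of the prior density ratio. The names `posterior_params`, `reweight`, `tau`, and `tau0` are hypothetical, and exact posterior draws stand in for MCMC output.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of N(mean, sd^2) at x."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((x - mean) / sd) ** 2


def posterior_params(y, tau):
    """Exact posterior (mean, sd) of theta for y_i ~ N(theta, 1), theta ~ N(0, tau^2)."""
    precision = len(y) + 1.0 / tau ** 2
    return sum(y) / precision, math.sqrt(1.0 / precision)


def reweight(draws, tau0, tau):
    """Reuse posterior draws obtained under tau0 to study the prior with scale tau.

    Returns (estimated Bayes factor of tau vs. tau0,
             estimated posterior mean of theta under tau).
    """
    # Importance weights are prior density ratios; the likelihood cancels.
    weights = [
        math.exp(normal_logpdf(t, 0.0, tau) - normal_logpdf(t, 0.0, tau0))
        for t in draws
    ]
    bayes_factor = sum(weights) / len(weights)
    post_mean = sum(w * t for w, t in zip(weights, draws)) / sum(weights)
    return bayes_factor, post_mean


rng = random.Random(0)
y = [rng.gauss(1.0, 1.0) for _ in range(20)]        # simulated data (assumption)
tau0 = 1.0                                           # reference hyperparameter
m0, s0 = posterior_params(y, tau0)
draws = [rng.gauss(m0, s0) for _ in range(20000)]    # stand-in for MCMC output

for tau in (0.5, 1.0, 2.0):
    bf, mean = reweight(draws, tau0, tau)
    print(f"tau={tau}: Bayes factor vs tau0 = {bf:.3f}, posterior mean = {mean:.3f}")
```

Because only the weights change with `tau`, the same draws can be reused over a fine grid of hyperparameter values, which is what makes plotting the resulting Bayes factor surface cheap in this toy setting. The estimates degrade when `tau` is far from `tau0`, since the weights then become highly variable.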

Model selection in complex situations is an important and pervasive problem in scientific and medical research. It includes, in particular, variable selection in regression, where a few important variables are to be selected from many candidates and used for understanding, prediction, and decision making. Different models can lead to different conclusions, with potential impact on public policy. Whereas an extensive body of diagnostic tools is available for frequentist methods, the tools available for Bayesian methods are much more limited. The tools for Bayesian model assessment and sensitivity analysis, together with the theoretical results regarding Markov chains that will be obtained in this project, will enable researchers to correctly evaluate the accuracy of estimates produced from Markov chains that explore very large spaces, and to correctly determine how long chains need to be run in order to provide a required level of accuracy.
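
To make the link between estimate accuracy and run length concrete, the sketch below uses the batch means method, one common way to estimate the Monte Carlo standard error of a chain average; it is shown as an illustration and is not taken from the project itself. It assumes a stationary AR(1) chain as a stand-in for MCMC output, and the helper names `ar1_chain`, `batch_means_variance`, and `required_length` are hypothetical.

```python
import math
import random
import statistics


def ar1_chain(n, rho, rng):
    """Stationary AR(1) chain with N(0, 1) marginal; a stand-in for MCMC output."""
    scale = math.sqrt(1.0 - rho ** 2)
    x = rng.gauss(0.0, 1.0)
    chain = []
    for _ in range(n):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


def batch_means_variance(chain):
    """Batch means estimate of the asymptotic variance sigma^2 of the chain average."""
    n = len(chain)
    m = int(math.sqrt(n))   # batch size (a common default choice)
    b = n // m              # number of batches
    means = [sum(chain[i * m:(i + 1) * m]) / m for i in range(b)]
    return m * statistics.variance(means)


def required_length(sigma2, target_se):
    """Chain length at which sqrt(sigma2 / n) falls to target_se."""
    return math.ceil(sigma2 / target_se ** 2)


rng = random.Random(1)
chain = ar1_chain(40000, 0.5, rng)
sigma2_hat = batch_means_variance(chain)
std_error = math.sqrt(sigma2_hat / len(chain))

print(f"estimated asymptotic variance: {sigma2_hat:.2f}")
print(f"Monte Carlo standard error of the mean: {std_error:.4f}")
print(f"length needed for standard error 0.005: {required_length(sigma2_hat, 0.005)}")
```

For this AR(1) chain the true asymptotic variance is (1 + rho) / (1 - rho), which is 3 when rho is 0.5, so the estimate can be checked against a known value. In realistic settings no such reference exists, and the validity of the standard error rests on the chain satisfying a central limit theorem, which is the kind of question the theoretical results above address.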

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0805860
Program Officer: Gabor J. Szekely
Budget Start: 2008-08-01
Budget End: 2012-07-31
Fiscal Year: 2008
Total Cost: $179,992
Name: University of Florida
City: Gainesville
State: FL
Country: United States
Zip Code: 32611