In recent years, a new trend has been growing in applied statistics: it is becoming ever more feasible to build application-specific models that are designed to account for the structure inherent in a particular data-generating mechanism. Such models have long been advocated on theoretical grounds, but the recent development of new computational tools (e.g., hardware, software, and algorithms) for statistical analysis has begun to bring such model fitting into routine practice. Of course, much work remains to be done. The flexibility of such methods comes at a cost: they require problem-specific coding and long computation times, and they present difficulties in ascertaining convergence. This proposal aims to tackle some of these difficulties using newly developed efficient Monte Carlo techniques. The PIs plan to study a number of outstanding theoretical questions concerning the behavior and extended application of these efficient methods by developing new algorithms for several important models that are prime candidates for them. The PIs are involved in several ongoing substantive data-analytic projects (e.g., in computational biology and high-energy astrophysics) which both help to clarify the relevant theoretical questions and stand to benefit from the new methodology. The computational goals of this research are by no means an end unto themselves, but rather a means to improved data analysis and statistical inference. As has been so clearly illustrated in recent years, improved computational tools can open up whole new areas of statistical application and increase reliability, thus improving statistical inference.

Research will focus on such newly developed Monte Carlo techniques as multi-point Metropolis and the methods of conditional, joint, and marginal data augmentation. Multi-point Metropolis generalizes the Metropolis-Hastings algorithm by allowing multiple dependent proposals at each iteration. As a consequence, the multi-point method can jump further, is less likely to be caught in a local mode, and thus can substantially improve mixing. The methods of conditional, joint, and marginal augmentation have already substantially improved the performance of the EM and Data Augmentation (DA) algorithms in a wide range of models (e.g., mixed-effects models, finite mixture models, multivariate t-models, probit generalized linear models, generalized linear mixed models, and Poisson image models). In particular, these new algorithms maintain the stable convergence properties of EM and DA while sometimes reducing the required computation time by over 99%. These methods, especially in tandem, have the potential to significantly improve and extend Markov chain Monte Carlo in statistical practice.

This program is being jointly funded by the Divisions of Mathematical Sciences and Astronomical Sciences and the Office of Multidisciplinary Activities from the Directorate for Mathematical and Physical Sciences.
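
To make the idea of several proposals per iteration concrete, the following is a minimal illustrative sketch, not the PIs' algorithm. It implements a simplified multiple-try variant in which the trial points are drawn independently from a symmetric Gaussian random walk, rather than the dependent proposals of multi-point Metropolis. The bimodal target, the function names (target_density, mtm_step, run_mtm), and the tuning values (k, scale) are all assumptions chosen for illustration.

```python
import math
import random


def target_density(x):
    """Hypothetical unnormalized target: equal mixture of N(-3, 1) and N(3, 1)."""
    return math.exp(-0.5 * (x + 3.0) ** 2) + math.exp(-0.5 * (x - 3.0) ** 2)


def mtm_step(x, density, k, scale, rng):
    """One multiple-try update with a symmetric random-walk proposal.

    With a symmetric proposal, the weight of a candidate can be taken to be
    the target density evaluated at that candidate.
    """
    # 1. Draw k independent trial points around the current state.
    trials = [x + rng.gauss(0.0, scale) for _ in range(k)]
    weights = [density(y) for y in trials]
    total = sum(weights)
    if total == 0.0:
        # Every trial landed where the density underflows: stay put.
        return x

    # 2. Select one trial with probability proportional to its weight.
    y = rng.choices(trials, weights=weights, k=1)[0]

    # 3. Build the reference set: k - 1 draws around y, plus the current state.
    refs = [y + rng.gauss(0.0, scale) for _ in range(k - 1)] + [x]
    ref_total = sum(density(z) for z in refs)

    # 4. Accept y with the generalized Metropolis ratio.
    if rng.random() < min(1.0, total / ref_total):
        return y
    return x


def run_mtm(n_iter, k=5, scale=3.0, x0=0.0, seed=0):
    """Run the sampler for n_iter iterations and return the visited states."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_iter):
        x = mtm_step(x, target_density, k, scale, rng)
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = run_mtm(5000)
    share_right = sum(1 for d in draws if d > 0) / len(draws)
    print(f"share of draws in the right-hand mode: {share_right:.2f}")
```

Setting k=1 reduces this step to ordinary random-walk Metropolis, so comparing runs with k=1 and larger k on the same target is one simple way to see the effect of multiple proposals on movement between modes.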

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0438240
Program Officer: Grace Yang
Project Start:
Project End:
Budget Start: 2003-08-01
Budget End: 2005-07-31
Support Year:
Fiscal Year: 2004
Total Cost: $152,100
Indirect Cost:
Name: University of California Irvine
Department:
Type:
DUNS #:
City: Irvine
State: CA
Country: United States
Zip Code: 92697