Highly complex stochastic models arise frequently in scientific applications. They often lead to statistical inference problems in which the likelihood function is analytically or computationally intractable. Such problems lie beyond the reach of current Monte Carlo methods. The goal of this proposal is to develop efficient Monte Carlo algorithms for statistical inference problems with intractable likelihoods. The proposed research considers two categories of such problems. In the first category, the likelihood is available in analytical form once the problem is placed in an appropriate augmented space. For such problems, a new data augmentation scheme is proposed that leads to a more efficient Markov chain Monte Carlo algorithm. In the second category, the model is a "black box": only a generating stochastic mechanism is available to simulate data from the model. For such problems, several likelihood-free Monte Carlo algorithms are proposed that extend the power of current Monte Carlo methods. The proposed methods are applied to inference problems in population genetics, panel studies, and hydrological models.
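
To illustrate what placing a problem in an augmented space means, the sketch below uses the classic textbook genetic linkage example, not the new scheme proposed here. Four multinomial cell counts have probabilities (1/2 + theta/4, (1 - theta)/4, (1 - theta)/4, theta/4); splitting the first cell with a latent count z gives the augmented likelihood a simple Beta form in theta. The function name gibbs_linkage, the uniform prior, the starting value, and the iteration counts are assumptions chosen for illustration.

```python
import random
import statistics


def gibbs_linkage(counts, n_iter, burn_in, rng):
    """Data augmentation (Gibbs) sampler for the textbook linkage model.

    Illustrative sketch only: a uniform prior on theta is assumed.
    """
    y1, y2, y3, y4 = counts
    theta = 0.5  # arbitrary starting value (assumption)
    draws = []
    for it in range(n_iter):
        # Imputation step: split the first cell. Each of its y1 observations
        # belongs to the latent "theta/4" part with probability theta/(2+theta).
        p = theta / (2.0 + theta)
        z = sum(rng.random() < p for _ in range(y1))
        # Posterior step: given z, the augmented likelihood is proportional to
        # theta^(z + y4) * (1 - theta)^(y2 + y3), so theta is Beta distributed.
        theta = rng.betavariate(z + y4 + 1, y2 + y3 + 1)
        if it >= burn_in:
            draws.append(theta)
    return draws


rng = random.Random(1)
draws = gibbs_linkage((125, 18, 20, 34), n_iter=3000, burn_in=500, rng=rng)
posterior_mean = statistics.fmean(draws)
```

The observed-data likelihood is awkward in theta because of the sum in the first cell probability, while both conditional draws in the augmented space are standard distributions. That trade is the general idea behind data augmentation.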

The proposed research addresses the urgent need for innovative Monte Carlo methodology for problems with intractable likelihood functions. This is of fundamental importance in statistics: it allows researchers to concentrate on scientifically plausible statistical models without worrying about mathematical intractability. Applications of the proposed methods include statistical inference in molecular population genetics, which can help locate genes responsible for genetic diseases, and Bayesian calibration of hydrological models, which can be used to predict groundwater flow. The proposed research has a significant impact on education through the direct involvement of graduate and undergraduate students in the research, the incorporation of the proposed algorithms into related courses, and the dissemination of research results to the scientific community.

Project Report

This grant supported research on developing efficient Monte Carlo algorithms for statistical inference problems with intractable likelihoods. Two categories of problems without likelihoods have been considered. In the first category, the likelihood is available in analytical form once the problem is placed in an appropriate augmented space. For such problems, a new data augmentation scheme has been developed that leads to a more efficient Monte Carlo algorithm. In the second category, the model is a 'black box': only a generating stochastic mechanism is available to simulate data from the model. A specific problem in this category that the PI and his collaborators have studied is the calibration of hydrological and environmental simulation models. A new Monte Carlo algorithm was developed to improve the efficiency of calibrating computationally expensive simulation models. A series of papers has been published or submitted for publication. The new algorithms address the urgent need to extend the power and range of applicability of Monte Carlo methods in order to meet the increasing demands of the scientific research community. This is of fundamental importance in statistics: it extends the range of models that can be fitted to observed data, and allows researchers to concentrate on scientifically plausible statistical models without worrying about mathematical intractability. Applications of the new algorithms include statistical inference in molecular population genetics, which can help locate genes responsible for genetic diseases, and calibration of hydrological models, which can be used to investigate watershed hydrology, agriculture, sedimentation, nutrient cycling, and pesticide transport. The new algorithms will improve the methods currently used in these areas, and new theories arising from these applications will be of interest across a broad range of areas in statistics and other disciplines.

The grant provided partial support for several Ph.D. students, all of whom have been exposed to new developments in Monte Carlo methodology resulting from the grant. In addition, the new algorithms and their applications have been incorporated into graduate and undergraduate courses. Finally, the PI has presented results of research under this grant at several conferences and departmental seminars.
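
The 'black box' setting described in this report can be illustrated with a basic rejection sampler in the style of approximate Bayesian computation. This is a generic likelihood-free baseline, not the algorithm developed under the grant. The simulator, the Uniform(-5, 5) prior, the sample-mean summary statistic, and the tolerance are all hypothetical choices made for the sketch; a real hydrological simulator would take the place of simulate.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical black-box simulator: n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lies close to the observed one.

    The likelihood is never evaluated; only the simulator is called.
    """
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed uniform prior
        fake = simulate(theta, len(observed), rng)
        if abs(statistics.fmean(fake) - obs_summary) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # stand-in for real field data
posterior_sample = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
```

Every proposal costs one full simulator run and most proposals are rejected, which is why plain rejection becomes impractical for computationally expensive simulation models and why more efficient algorithms are of interest.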

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0806175
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2011-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $259,888
Indirect Cost:
Name: University of Illinois Urbana-Champaign
Department:
Type:
DUNS #:
City: Champaign
State: IL
Country: United States
Zip Code: 61820