Bayesian inference is a powerful statistical method that allows statisticians and other scientists to combine existing knowledge with new data samples for better inferences and decisions. The difficulty of sampling from posterior distributions is one of the biggest impediments to wider adoption of Bayesian procedures in high-dimensional and big-data analysis. There is a need for fast and accurate posterior approximation methods to assist with the practical implementation of Bayesian statistics in high-dimensional problems. This research project will use ideas from the related field of optimization to develop a Bayesian posterior approximation method that satisfies these requirements. The methodology will find applications in a wide range of areas, such as finance, marketing science, epidemiology, biology, and the medical sciences.

More specifically, there is a need in statistics for posterior approximation methods in high-dimensional problems that: (a) produce approximations that are easier to explore by Markov Chain Monte Carlo (MCMC), and (b) are well understood from a theoretical viewpoint. This project will use the Moreau-Yosida approximation and related tools from optimization and variational analysis to develop a Bayesian posterior approximation method that satisfies these two conditions. The research from this project will help clarify similarities and differences between optimization and simulation problems. It will also contribute to the theoretical analysis of MCMC algorithms, with a special focus on understanding their mixing time in high-dimensional settings. The project will also address open problems in high-dimensional Bayesian variable selection and will develop novel modeling and computational solutions. Variable selection plays an important role in many applied research areas, including biomedical research, epidemiology, marketing science, and social science research. Hence, results from this research will allow researchers in those areas to better handle available data and gain new insights into relevant scientific questions. On the educational side, the material from this research will form a key component of the doctoral dissertations of the Ph.D. students supported by this grant. The project will also enable the PI to use the related scientific problems and datasets to enrich the learning experience of students in his classes and possibly other classes taught by his colleagues. Furthermore, novel methodologies from this research will be widely disseminated to the scientific community through academic seminars and presentations at high-visibility conferences in statistical computing.
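As a small illustration of the Moreau-Yosida approximation mentioned above, the sketch below smooths the non-differentiable function f(u) = |u|, which arises for example as the negative log-density of a Laplace prior. It is not the project's method, only a textbook instance under stated assumptions: the function names and the smoothing parameter `gamma` are hypothetical choices made for this example. The envelope is defined as the minimum over u of f(u) + (x - u)^2 / (2 * gamma), and for |u| the minimizer is the soft-thresholding operator.

```python
import math


def prox_abs(x, gamma):
    """Proximal operator of f(u) = |u| (soft-thresholding); illustrative name."""
    return math.copysign(max(abs(x) - gamma, 0.0), x)


def moreau_envelope_abs(x, gamma):
    """Moreau-Yosida envelope of |u| at x: min_u |u| + (x - u)^2 / (2 * gamma)."""
    u = prox_abs(x, gamma)  # the minimizer of the envelope problem
    return abs(u) + (x - u) ** 2 / (2.0 * gamma)


def moreau_gradient_abs(x, gamma):
    """Gradient of the envelope, (x - prox(x)) / gamma, defined for every x."""
    return (x - prox_abs(x, gamma)) / gamma


if __name__ == "__main__":
    gamma = 1.0  # assumed smoothing level, chosen only for the demonstration
    for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(x, moreau_envelope_abs(x, gamma), moreau_gradient_abs(x, gamma))
```

Unlike |x| itself, the envelope has a gradient at x = 0, which is one reason a smoothed target of this kind can be easier for gradient-based samplers to explore. Smaller values of `gamma` keep the approximation closer to the original function at the cost of less smoothing.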

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1854545
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2018-07-01
Budget End: 2019-06-30
Support Year:
Fiscal Year: 2018
Total Cost: $221,874
Indirect Cost:
Name: Boston University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215