Bayesian analysis is a widely used technique in data science for estimation, prediction, and interpretation. The new era of complex and big data poses unprecedented challenges for Bayesian statistics. This research project addresses these challenges from three perspectives. First, the investigator will study the relation between prior knowledge and scientific conclusions by conducting rigorous mathematical analysis in the framework of Bayesian statistics. Second, the investigator aims to find novel ways of modeling data sets that take into account new features of modern big data. Finally, the investigator intends to push the boundary of Bayesian computation by inventing new algorithms that are both fast and theoretically sound. The results of the research are expected to have a positive impact in areas that routinely apply Bayesian statistics, including population genetics, astronomy, computer vision, political science, social science, and animal science.
Bayesian analysis is an important statistical framework for both modeling and computation. However, applying Bayesian procedures correctly to a specific problem is non-trivial. The choices of prior, likelihood, and algorithm all influence the final conclusion drawn from a posterior distribution. Despite many successful applications of Bayesian analysis in various scientific areas, solid theoretical foundations for how to perform Bayesian inference are still lacking. The goal of this project is to develop a coherent theory of optimal Bayesian inference. Specifically, the investigator will study: 1) Bayesian theory: optimal posterior contraction in parametric, nonparametric, and high-dimensional models; 2) Bayesian modeling: likelihood functions that are free of nuisance parameters, and Bayesian edge-exchangeable network analysis; 3) Bayesian computation: algorithmic and statistical properties of variational inference; and 4) applications to single-cell RNA sequencing analysis.