This proposal consists of three complementary themes with a particular research focus on statistical inference for data arising from dynamical systems, partial differential equations (often representing physical phenomena) and diffusions. The research questions in the proposal are motivated by the genuine need for novel statistical inference in these areas, where the data is naturally high-dimensional. A highlight of this proposal is the interdisciplinary nature of the problems, which requires the integration of techniques from a wide spectrum of fields in applied mathematics, probability and statistics. The main themes are 1) stability of Markov Chain Monte Carlo algorithms in high dimensions, 2) statistical inference for inverse problems from diffusions and dynamical systems, and 3) applications to climate science and temperature prediction. The first two themes aim at developing methods and improving the theoretical understanding of statistical inference procedures, whilst the third will directly apply the insights gained from the first two to answer a few concrete, relevant open problems in climate science. The main thread connecting the three themes is the development and theoretical analysis of novel and efficient Markov Chain Monte Carlo techniques.
Advances in technology and computing power have made many historically intractable problems in statistics amenable to routine implementation using certain probabilistic algorithms. Despite two decades of intense research, our theoretical understanding of the behavior of these complex algorithms in high dimensions is still primitive. The PI proposes to study these algorithms and quantify their behavior in high dimensions, and apply them to solve concrete problems in climate science.
Intellectual Merit: The PI was supported by the NSF grant DMS-1107070 for the period 2010-2013. Funded by this grant, the PI has written 16 papers in 5 different areas: (1) High Dimensional Bayesian Inference, (2) Random Matrices, (3) Dynamical Systems, (4) Markov Chain Monte Carlo, and (5) Stochastic Processes. Almost all of these papers are published in premier journals, including Annals of Statistics, Proceedings of the National Academy of Sciences, Annals of Probability, Annals of Applied Probability, Bernoulli, Stochastic Processes and Applications, and Journal of Computational and Graphical Statistics; 7 papers are submitted and several others are in progress.
Broader Impact: This project required assembling new tools from modern probability, computer science and spectral theory for solving concrete problems in statistics. The PI also successfully collaborated with a distinguished group of senior scientists, both nationally and internationally. The PI recently chaired a panel discussion aimed at graduate students at Harvard University entitled "Why being an excellent teacher is necessary for being an excellent researcher". At Harvard University, he has single-handedly designed and taught two courses: Stochastic Calculus (a rigorous course on Brownian motion and martingales aimed at advanced graduate students) and Generalized Linear Models (a very applied modeling course accessible to freshman undergraduates). He co-taught a course on missing data methods with Prof. Donald Rubin. The PI ran the Statistics Colloquium at Harvard University for the year 2010-11 and also for the academic year 2014-15. In Spring 2014, he will co-teach a course named "The Art and Practice of Teaching Statistics" with Prof. Xiao-Li Meng; this course focuses exclusively on developing innovative classroom techniques for teaching statistics to a wide audience.