The primary objective of this proposal is to develop a flexible framework for the design and modeling of large-scale simulations, with broad applications to other areas of statistics. The investigator studies a multi-step method for fitting massive data from computer simulations that simultaneously mitigates numerical singularity and improves interpolation accuracy. Theoretical bounds on the numerical and nominal accuracy of this method will be derived. The investigator also proposes new Sudoku-inspired designs to efficiently pool data from multiple sources, and employs sliced Latin hypercube designs to enhance stochastic optimization and cross-validation.
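To make the design machinery concrete, the following is a minimal illustrative sketch of one standard way to build a sliced Latin hypercube design: n = m*t points in [0,1)^d that form a Latin hypercube design overall, while each of the t slices of m points, after collapsing levels, is itself a Latin hypercube design. This is not necessarily the proposal's own construction; the function name sliced_lhd and its parameters are hypothetical.

```python
import numpy as np

def sliced_lhd(m, t, d, seed=None):
    """Illustrative sketch: a sliced Latin hypercube design with n = m*t
    points in [0,1)^d.  The full design occupies all n levels in each
    dimension (an ordinary LHD), and each of the t slices of m points
    collapses to an LHD on m levels."""
    rng = np.random.default_rng(seed)
    n = m * t
    X = np.empty((n, d))
    for k in range(d):
        # Block i holds levels {i*t+1, ..., (i+1)*t}; shuffling within each
        # block and reading off column j gives slice j one level per block,
        # so slice j collapses (via ceil(level/t)) to a permutation of
        # {1, ..., m}, while the t columns together cover all n levels.
        blocks = np.arange(1, n + 1).reshape(m, t)
        for i in range(m):
            rng.shuffle(blocks[i])
        for j in range(t):
            col = rng.permutation(blocks[:, j])  # random order within the slice
            u = rng.random(m)                    # jitter within each grid cell
            X[j * m:(j + 1) * m, k] = (col - u) / n
    return X  # rows [j*m, (j+1)*m) form slice j
```

For instance, sliced_lhd(m=5, t=3, d=2) returns 15 points whose three consecutive blocks of 5 rows can each serve as a cross-validation fold or a batch in stochastic optimization, while every slice retains one-dimensional stratification of the input space.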
Large-scale simulations are widely used for studying complex phenomena in science and engineering. The trend of replacing physical experiments with simulations to save cost and time has accelerated in recent years. The proposed research draws impetus from computer simulations but applies broadly to other areas of statistics concerned with modeling massive data and borrowing information from multiple sources. Beyond statistics, the research will make significant contributions to discrete mathematics, computer science, and high-performance computing. Dissemination through journal publications, industrial collaborations, and the release of open-source software will promote broad adoption of the developed methods and significantly improve the use of complex simulations in U.S. industries. The research will also add considerable value to rigorous uncertainty quantification efforts at national laboratories in support of national security. Students from under-represented groups will be trained through a puzzle-based learning approach. Graduate students will benefit from multidisciplinary training at the interface of statistics and optimization. Ph.D. students will be supervised following the Wisconsin model of balancing statistical theory and practice, and will gain first-hand research experience they can draw on throughout their careers.