This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. The subproject and investigator (PI) may have received primary funding from another NIH source, and thus could be represented in other CRISP entries. The institution listed is for the Center, which is not necessarily the institution for the investigator. Computer models of disease take a systems biology approach toward understanding pathogen-host interactions and mechanisms of the immune response. When a computer model is expensive to run, in time or money, the number of available runs is limited, and methods such as calibration and sensitivity analysis demand a fast emulator of the computer code. Gaussian processes (GPs) are statistical models commonly used to predict output from (i.e., emulate) complex computer models. Importantly, current GP models assume that the response of interest has constant variance, which is not true of all stochastic computer models, particularly those used in systems biology. The primary goal of the current work is to improve the accuracy of GP emulators of computer models that do not have constant variance. We propose a GP fitting scheme that uses 'plug-in' estimates of input-dependent variance components and in which, when replicate observations are made, the GP is fit to the collection of sample mean outputs. For responses with either constant or non-constant variance, the proposed method is computationally more efficient and often more accurate than the standard approach, which assumes constant variance and includes replicate observations directly in the GP fitting process. We implement our new model in the R package mlegp, and also add parallel support for fitting multiple, independent GPs. The package mlegp is publicly available on the Comprehensive R Archive Network (CRAN).
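The fitting scheme described above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the mlegp implementation. Several things here are assumptions made for illustration: the toy `simulator`, the squared-exponential kernel, and the fixed `length_scale` and `signal_var` values (mlegp estimates its parameters by maximum likelihood, which this sketch does not attempt). The sketch keeps only the two ideas from the abstract: the GP is fit to the sample mean at each design point, and the noise term for each mean is a plug-in estimate, the sample variance of the replicates divided by the number of replicates.

```python
import math
import random
from statistics import mean, variance


def sq_exp_kernel(x1, x2, length_scale=0.2, signal_var=1.0):
    # Squared-exponential covariance; hyperparameters are fixed assumptions.
    return signal_var * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def cholesky(A):
    # Lower-triangular factor L with A = L L^T (A symmetric positive definite).
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def chol_solve(L, b):
    # Solve (L L^T) x = b by forward then backward substitution.
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_plugin_gp(design, replicates, jitter=1e-8):
    """Fit a GP to sample means with plug-in, input-dependent noise.

    design:     list of scalar input points
    replicates: list of lists; replicates[i] holds the outputs at design[i]
                (at least two per point, so a sample variance exists)
    Returns a function mapping a new input to the predicted mean output.
    """
    means = [mean(r) for r in replicates]
    # Plug-in estimate of the variance of each sample mean: s_i^2 / n_i.
    noise = [variance(r) / len(r) for r in replicates]
    n = len(design)
    K = [[sq_exp_kernel(design[i], design[j])
          + (noise[i] + jitter if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    center = mean(means)  # constant mean function, estimated crudely
    alpha = chol_solve(cholesky(K), [m - center for m in means])

    def predict(x):
        k = [sq_exp_kernel(x, d) for d in design]
        return center + sum(ki * ai for ki, ai in zip(k, alpha))

    return predict


def simulator(x):
    # Hypothetical stochastic model whose noise level grows with the input.
    return random.gauss(math.sin(2 * math.pi * x), 0.05 + 0.5 * x)


random.seed(1)
design = [i / 7 for i in range(8)]
replicates = [[simulator(x) for _ in range(10)] for x in design]
emulator = fit_plugin_gp(design, replicates)
for x in (0.1, 0.5, 0.9):
    truth = math.sin(2 * math.pi * x)
    print(f"x={x:.1f}  emulator={emulator(x):+.3f}  noise-free mean={truth:+.3f}")
```

Because the GP is fit to one sample mean per design point, the covariance matrix in this sketch is 8 x 8 rather than 80 x 80, which is where the computational saving over fitting all replicates directly comes from.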