Many procedures for statistical estimation and inference in economics and other disciplines rely on intensive computation, including computer simulation and resampling methods that reuse randomly drawn subsets of the data under analysis. This proposal consists mainly of two projects that study the statistical properties of simulation-based and resampling-based estimators applicable to highly nonlinear and computationally intensive models. This matters because ignoring the statistical uncertainty introduced by simulation and resampling can lead to erroneous conclusions in statistical and economic analysis. In such models, generating the random numbers is easy, but computing the moment condition or the likelihood function is typically difficult. Whether using overlapping simulations for all observations improves computational efficiency depends on the specific model.

The goal of the first project is to study the large-sample distribution of a particular type of simulation-based estimator in which the same set of simulation draws is used for all observations.

Two important cases are considered: estimators that solve a system of simulated moment conditions (the method of simulated moments, MSM) and estimators that maximize a simulated likelihood (maximum simulated likelihood, MSL). The theory developed in this project applies to many simulation estimators used in empirical work that involve both overlapping simulation draws and non-differentiable moment functions. It is proven that both MSM and MSL are consistent when both the sample size and the number of simulations increase without bound. Under suitable regularity conditions, both MSM and MSL converge to a limiting normal distribution at a rate equal to the square root of the minimum of the number of observations and the number of simulation draws.
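As an illustration of overlapping draws with a non-differentiable moment function, the following is a minimal sketch under assumptions that are not part of the proposal: a scalar binary-choice model, a frequency (indicator) simulator, and a bisection solver. The names `simulated_moment` and `msm_estimate`, the model, and all parameter values are hypothetical.

```python
import numpy as np

def simulated_moment(theta, x, y, u):
    """Simulated moment for the toy model y = 1{x*theta + eps > 0}.

    u has shape (R,) and is the SAME set of draws for every observation
    (overlapping draws). The frequency simulator is a step function of
    theta, so the moment is not differentiable.
    """
    # p_hat[i] = share of draws r with x[i]*theta + u[r] > 0
    p_hat = (x[:, None] * theta + u[None, :] > 0).mean(axis=1)
    return np.mean(x * (y - p_hat))

def msm_estimate(x, y, u, lo=-5.0, hi=5.0, tol=1e-6):
    """Locate the sign change of the simulated moment by bisection.

    In this toy model the moment is a decreasing step function of theta,
    so bisection works without derivatives.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if simulated_moment(mid, x, y, u) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical data: n observations, R shared simulation draws.
rng = np.random.default_rng(0)
n, R, theta0 = 2000, 500, 1.0
x = rng.normal(size=n)
y = (x * theta0 + rng.normal(size=n) > 0).astype(float)
u = rng.normal(size=R)  # drawn once, reused for all n observations

theta_hat = msm_estimate(x, y, u)
print(theta_hat)
```

Because `u` is drawn once, the simulation error does not average out across observations; it shrinks only as `R` grows, which is why the number of draws enters the convergence rate.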

The conditions differ between MSM and MSL. For MSL with independent random draws, asymptotic normality requires that the number of simulations increase faster than the square root of the sample size. With overlapping draws, by contrast, asymptotic normality holds as long as both the number of simulations and the number of observations increase to infinity. It is also found that the total number of simulations has to increase without bound but can be much smaller than the total number of observations. In this case, the error in the parameter estimates is dominated by the simulation error. This is a necessary cost of inference when the simulation model is very costly to compute.
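The rate and conditions described above can be written compactly. The notation is assumed here and does not appear in the proposal: $n$ is the number of observations, $R$ the number of simulation draws, $\hat\theta$ the MSM or MSL estimator, $\theta_0$ the true parameter, and $\Sigma$ an unspecified limiting variance.

```latex
% Overlapping draws: limiting distribution as n and R both diverge
\sqrt{\min(n, R)}\,\bigl(\hat\theta - \theta_0\bigr)
  \;\stackrel{d}{\longrightarrow}\; N(0, \Sigma),
  \qquad n \to \infty,\; R \to \infty .

% MSL with independent draws: additional rate condition for asymptotic normality
\frac{R}{\sqrt{n}} \;\longrightarrow\; \infty .
```

When $R$ is much smaller than $n$, the rate is $\sqrt{R}$, which is the sense in which the simulation error dominates.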

The second project proposes a fast resampling method that provides valid inference in nonlinear parametric and semiparametric models. The method does not require recomputing the second-stage estimator in each resampling iteration, yet it still provides valid inference under very weak assumptions for a large class of nonlinear models. These models can be highly nonlinear in the parameters to be estimated and can also be semiparametric through dependence on a first-stage nonparametric functional estimation procedure. The fast resampling method directly exploits the score function representations computed on each bootstrap sample, thereby reducing computational time considerably. The method can also be extended to models in which the first-stage computation is more intensive than the second stage, by using a linear representation for the first stage when resampling the second-stage estimation procedure. The Monte Carlo experiments conducted so far demonstrate the good performance of the fast bootstrap method and its large gains in numerical speed.
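One common way to exploit a score representation is to estimate the model once and then, on each bootstrap sample, take a single linearized step using the resampled scores instead of re-optimizing. The sketch below illustrates that idea on a logit model. It is an assumption-laden illustration and not necessarily the construction used in the proposal: the model, the function names `logit_score_hessian`, `fit_logit`, and `fast_bootstrap`, and all parameter values are hypothetical.

```python
import numpy as np

def logit_score_hessian(theta, X, y):
    """Per-observation scores and the average information matrix for logit."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    scores = X * (y - p)[:, None]                       # shape (n, k)
    info = (X * (p * (1 - p))[:, None]).T @ X / len(y)  # minus average Hessian
    return scores, info

def fit_logit(X, y, iters=25):
    """Full estimation by Newton's method; run only ONCE on the original data."""
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        scores, info = logit_score_hessian(theta, X, y)
        theta = theta + np.linalg.solve(info, scores.mean(axis=0))
    return theta

def fast_bootstrap(X, y, B=500, seed=0):
    """Score-based bootstrap: no re-optimization inside the resampling loop."""
    rng = np.random.default_rng(seed)
    n = len(y)
    theta_hat = fit_logit(X, y)
    scores, info = logit_score_hessian(theta_hat, X, y)
    info_inv = np.linalg.inv(info)
    draws = np.empty((B, X.shape[1]))
    for b in range(B):
        idx = rng.integers(0, n, size=n)  # resample observation indices
        # One linearized step from theta_hat using the resampled scores
        draws[b] = theta_hat + info_inv @ scores[idx].mean(axis=0)
    return theta_hat, draws

# Hypothetical data for illustration.
rng = np.random.default_rng(1)
n = 1000
theta0 = np.array([0.5, 1.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ theta0))).astype(float)

theta_hat, draws = fast_bootstrap(X, y)
se_boot = draws.std(axis=0)  # bootstrap standard errors

# Information-based standard errors, for comparison only
_, info = logit_score_hessian(theta_hat, X, y)
se_info = np.sqrt(np.diag(np.linalg.inv(info)) / n)
print(theta_hat, se_boot, se_info)
```

Each bootstrap iteration costs one index draw and one matrix-vector product, whereas a conventional bootstrap would call `fit_logit` `B` times. The saving grows with the cost of the estimator.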

Developing sampling theorems with overlapping draws and nonsmooth functions in the first project provides an important complement to existing results in the literature on the asymptotics of simulation estimators. The fast resampling method in the second project is used to approximate the limit distribution of parametric and semiparametric estimators, possibly simulation-based, that admit an asymptotic linear representation. It can also be used for bias reduction and variance estimation, which are important components of econometric inference in empirical models.

The results of the project can guide empirical researchers who make extensive use of computationally intensive nonlinear models, for which both obtaining the estimator and conducting inference on the parameter of interest can be numerically challenging. Beyond applications in economics, nonlinear models are also widely used in statistics and in various disciplines in the social and natural sciences, where researchers often resort to simulation-based and resampling-based methods for estimation and inference. The analysis helps researchers who use these models understand and account for the statistical uncertainty introduced by the simulation and resampling procedures.

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 1325805
Program Officer: Kwabena Gyimah-Brempong
Project Start:
Project End:
Budget Start: 2013-08-15
Budget End: 2016-07-31
Support Year:
Fiscal Year: 2013
Total Cost: $176,707
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305