Structural economic choice models show how individual agents (firms, consumers, workers) make choices under different choice sets. There is no reason to believe that the parameters of these models are the same for all agents: agents will make different choices when faced with the same choice set. As the parameters vary across agents, the goal of empirical work with these models is to estimate the distribution of unobserved heterogeneity, or the distribution of random coefficients. This distribution is needed to calculate the welfare effects of a policy (say, consumer surplus from a tax policy) or to compute demand for a new good, out of sample. In terms of statistical theory, a model with unobserved heterogeneity is called a mixtures model. The statistics literature has focused attention on mixtures of distributions in some parametric class, such as mixtures of normals. Much less attention has been paid to using the tools of mixtures to study more complex nonlinear statistical models, such as structural economic choice models. Nonparametric identification means that only one distribution of unobserved heterogeneity is consistent with limiting information on conditional outcome probabilities. We show identification using cross-sectional data on agents facing different choice sets. The statistics literature establishes that linear independence of the class of models being mixed over is necessary and sufficient for identification. This condition is difficult to verify algebraically for economic choice models. This project introduces a new condition, reducibility, that is sufficient for linear independence and hence identification. Reducibility is a property of economic models that can be easily verified, as is shown for a group of models of wide empirical use in economics. After showing identification, parametric or nonparametric distributions of unobserved heterogeneity can be more confidently estimated.
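The identification argument above can be written compactly. The notation here is assumed for illustration and is not taken from the project itself: $y$ is the observed choice, $x$ describes the choice set, $\beta$ is the vector of random coefficients, $g$ is the choice model for a single agent type, and $F$ is the distribution of heterogeneity. The linear independence condition is stated in its usual form for finite mixtures.

```latex
% Observed conditional choice probabilities are a mixture over agent types:
P(y \mid x) = \int g(y \mid x, \beta) \, dF(\beta)

% Identification: the distribution F is pinned down by P, i.e.
\int g(y \mid x, \beta) \, dF_1(\beta) = \int g(y \mid x, \beta) \, dF_2(\beta)
  \ \text{for all } (y, x)
  \quad \Longrightarrow \quad F_1 = F_2

% Linear independence of the class being mixed over: for any finite set of
% distinct types \beta_1, \dots, \beta_K,
\sum_{k=1}^{K} c_k \, g(y \mid x, \beta_k) = 0 \ \text{for all } (y, x)
  \quad \Longrightarrow \quad c_1 = \dots = c_K = 0
```

A simple case where the condition fails shows why it needs checking model by model. In a binary logit with $g(1 \mid x, \beta) = 1 / (1 + e^{-\beta x})$, the identity $g(1 \mid x, \beta) + g(1 \mid x, -\beta) = 2\, g(1 \mid x, 0)$ holds for every $x$, so the types $\beta$, $-\beta$, and $0$ are linearly dependent on that outcome.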
The most common tool for estimating mixtures models is the EM algorithm, which is certainly applicable. However, the EM algorithm has numerical issues and may be inappropriate for economic choice models that themselves require complex calculations, such as dynamic programming models. This project introduces a new, computationally simple mixtures estimator to resolve these issues. The estimator is nonparametric.
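The summary does not describe how the new estimator works, so the following is only a hedged illustration of what a computationally simple nonparametric mixtures estimator can look like, not the project's method. It assumes a binary logit choice model, a fixed grid of candidate coefficient values, and weights on the grid points estimated by least squares subject to the weights being nonnegative and summing to one. All function names, the grid, and the simulated data are hypothetical. Because the choice model is evaluated once per grid point before optimization begins, an expensive model would not need to be re-solved inside the optimization loop.

```python
import math
import random


def logit_prob(x, beta):
    """Assumed choice model: binary logit probability of choosing the good."""
    return 1.0 / (1.0 + math.exp(-beta * x))


def choice_matrix(x, grid):
    """Evaluate the choice model once for every observation and grid point."""
    return [[logit_prob(x_i, beta) for beta in grid] for x_i in x]


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    ordered = sorted(v, reverse=True)
    cumulative, tau = 0.0, 0.0
    for k, u_k in enumerate(ordered, start=1):
        cumulative += u_k
        candidate = (cumulative - 1.0) / k
        if u_k - candidate > 0:
            tau = candidate
    return [max(v_j - tau, 0.0) for v_j in v]


def objective(weights, G, y):
    """Mean squared gap between observed choices and mixed probabilities."""
    total = 0.0
    for row, y_i in zip(G, y):
        fitted = sum(w * g for w, g in zip(weights, row))
        total += (y_i - fitted) ** 2
    return total / len(y)


def estimate_weights(x, y, grid, steps=500):
    """Projected gradient descent for simplex-constrained least squares."""
    G = choice_matrix(x, grid)
    n, r = len(y), len(grid)
    weights = [1.0 / r] * r  # start from uniform weights
    # Entries of G lie in (0, 1), so the gradient's Lipschitz constant is at
    # most 2 * r; this step size therefore never increases the objective.
    step_size = 1.0 / (2.0 * r)
    for _ in range(steps):
        residuals = [
            sum(w * g for w, g in zip(weights, row)) - y_i
            for row, y_i in zip(G, y)
        ]
        gradient = [
            2.0 / n * sum(res * row[j] for res, row in zip(residuals, G))
            for j in range(r)
        ]
        weights = project_to_simplex(
            [w - step_size * g for w, g in zip(weights, gradient)]
        )
    return weights


def simulate_choices(x, grid, true_weights, rng):
    """Draw one type per agent from the grid, then a binary choice."""
    choices = []
    for x_i in x:
        beta = rng.choices(grid, weights=true_weights)[0]
        choices.append(1 if rng.random() < logit_prob(x_i, beta) else 0)
    return choices


# Hypothetical simulated cross-section of agents facing different x.
rng = random.Random(0)
grid = [0.5, 1.0, 1.5, 2.0, 2.5]
true_weights = [0.6, 0.0, 0.0, 0.0, 0.4]
x = [rng.uniform(-3.0, 3.0) for _ in range(200)]
y = simulate_choices(x, grid, true_weights, rng)
weights = estimate_weights(x, y, grid)
print([round(w, 3) for w in weights])
```

The grid in this sketch avoids pairs of the form $\beta$ and $-\beta$ because such logit types are linearly dependent on this outcome, which is exactly the kind of failure an identification condition is meant to rule out. With a small sample the estimated weights can be far from the true ones; the sketch is meant to show the mechanics, not finite-sample performance.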
BROADER IMPACT: The main broader impact will be to make the identification and estimation of distributions of unobserved heterogeneity (random coefficients) much simpler. This is done on two fronts: making the conditions for showing identification of new models easier to verify and introducing a computationally simple estimator. Models of choice by economic agents are used every day in thousands of empirical applications. For example, the decision of a married woman to participate in the labor market induces a selection problem: wages are observed only for the women who do work. The preferences and job opportunities of those observed to work are not representative of those who do not work. Jointly estimating the participation and wage decisions is necessary to resolve this selection problem. Understanding this process is necessary for understanding changes in the gender-wage gap over the last 30 years. For another example, consider the environmental economics problem of forecasting electricity use for appliances. The only data available on an appliance's electricity use are for those consumers who buy the appliance, a selected sample. Changing the prices of the appliances or of electricity itself (say, through a tax) will shift both the set of consumers who buy the appliance and electricity use conditional on buying. Estimating the distribution of heterogeneity is necessary for computing welfare measurements of the effect of the tax. One of the investigators has used these methods to estimate the consumer welfare implications of mergers of wireless carriers, the response of teacher attendance in India to financial incentives, and the job mobility response of experienced engineers to wage offers from competing firms. Overall, we see showing identification as making economists and statisticians more comfortable with estimating complex models.
We hope to place these models on firm theoretical ground, so policymakers and other researchers treat results from these models with more confidence. We also expand the class of models that can be estimated. As parametric versions of some of these methods are in constant use in empirical work, we see this work as having tremendous impact on all areas of applied economics and related fields. One of the investigators has taught a mini course to graduate students and established researchers on these techniques at INSEE-CREST / ENSAE in Paris, France. Both investigators are teaching these techniques to PhD students at their home institutions and will do so more intensively for the undergraduate and graduate students funded by this grant.