Surveys are usually designed to produce reliable estimates of various characteristics of interest for large geographic areas or socio-economic domains. However, for effective planning of health, social, and other services and for apportioning government funds, there has been a growing demand to produce similar estimates for small geographic areas and subpopulations, commonly referred to as small areas. This research project aims to develop a new method of small area estimation that could lead to a dramatic improvement in accuracy over traditional methods in practical situations. Model-based small area estimation uses statistical models, such as mixed effects models, to "borrow strength." In particular, empirical best linear unbiased prediction (EBLUP) is a well-known model-based method that has had a dominant influence in small area estimation. From a practical point of view, however, any proposed model is subject to misspecification. When the proposed statistical model is incorrect, EBLUP is no longer efficient or even effective. In such cases, a new method, known as observed best prediction (OBP), may be superior. This project involves several important research topics on OBP, including theoretical developments, assessment of uncertainties under weak model assumptions, and implementation of OBP via user-friendly software. The research will largely expand the results of our earlier studies and contribute to making the OBP method more effective, practical, and easy to use.
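
As a hedged illustration of what "borrowing strength" means, consider a standard area-level mixed model. The abstract does not commit to a particular model, so both the model and the notation below are assumptions made for illustration only.

```latex
% Illustrative area-level mixed model for m small areas.
% Notation is assumed for this sketch and is not taken from the abstract.
\begin{align*}
y_i &= x_i^\top \beta + v_i + e_i, \qquad i = 1, \dots, m, \\
v_i &\sim (0, A), \qquad e_i \sim (0, D_i) \text{ with } D_i \text{ known}, \\
\tilde{\theta}_i &= y_i - B_i \,(y_i - x_i^\top \beta), \qquad B_i = \frac{D_i}{A + D_i}.
\end{align*}
```

Here y_i is the direct survey estimate for area i, x_i holds the auxiliary variables, v_i is an area-specific random effect, and e_i is the sampling error. An area with a noisy direct estimate (large D_i) has B_i close to 1, so its prediction is pulled strongly toward the regression fit, which is built from all areas. EBLUP replaces beta and A with estimates derived under the assumed model, which is why its performance depends on that model being correct.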

The research introduces a completely new idea and method to model-based statistical methods in survey sampling. It is expected to impact other scientific areas where statistical methods are used for prediction problems. The project will develop and freely disseminate R code to implement the OBP method. The education component of the project will introduce the OBP method into courses at the investigators' institutions. These courses are expected to draw students and researchers from statistics, biostatistics, genetic epidemiology, animal and plant sciences, educational research, social sciences, and government agencies. The project is supported by the Methodology, Measurement, and Statistics Program and a consortium of federal statistical agencies as part of a joint activity to support research on survey and statistical methodology.

Project Report

Surveys are designed to provide reliable estimates of quantities of interest at a geographical level such as census tracts, states, or regions. The reliability comes from the fact that the sample size at that geographical level is adequate to produce a direct estimate. If the sample size is too small, the geographical area is termed a small area, and direct estimates derived from such areas are known to be unreliable. To circumvent this problem and provide reliable small area estimates, statistical models can be built that borrow information across areas via auxiliary variables. These models are, however, assumed to be correctly specified, an assumption that is often dubious. Our work has shown that misspecification of the underlying small area model can lead to significant degradation of the small area estimates. Our research in this grant developed a new parameter estimation technique, called observed best prediction (OBP), for robust small area estimation, where robustness is with respect to misspecification of two of the most commonly used small area models: the Fay-Herriot model and the nested error regression model. The resulting estimates were studied theoretically and empirically and shown to have far superior performance when the underlying small area model was misspecified. Interestingly, if enough areas are studied, then even if the model is correct, nothing is lost by using the new OBP estimates. The research then went a step further and developed accuracy measures for the new OBP estimates, which are required of any small area estimate in practice. Application areas included a study of kidney transplant failure rates at hospitals and a large smoking cessation study. This research will have a major impact on the official statistics communities, where small area estimates are in constant use.
The research will also open up new avenues of future methodological research and application because a brand new estimation technique has been introduced. In fact, the research in this grant has already been recognized as a significant advance in small area estimation in a prominent review paper written for the journal Statistical Science. One graduate student is writing her doctoral dissertation on ideas related to the research in this grant, and the material developed has been incorporated into a graduate-level Biostatistics class on generalized linear models, which is taken by a wide range of graduate students throughout the University of Miami.
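
To make the contrast between the model-based and OBP parameter estimates described in this report more concrete, here is a minimal Python sketch. It is not the project's software (which is in R), and it rests on several simplifying assumptions: a Fay-Herriot model whose only covariate is an intercept, a between-area variance A treated as known, and made-up toy data. The function names are hypothetical. The squared-shrinkage weights used for the best predictive estimate follow from minimizing the expected squared prediction error of the shrinkage predictor, as we understand the published OBP literature.

```python
# Sketch only. Assumed model: y_i = beta + v_i + e_i, with Var(v_i) = A
# (treated as known here) and Var(e_i) = D_i (known sampling variances).

def shrinkage_factors(A, D):
    """B_i = D_i / (A + D_i): how strongly area i is pulled toward beta."""
    return [d / (A + d) for d in D]

def gls_beta(y, A, D):
    """Model-based estimate of beta (as used in BLUP): weights 1 / (A + D_i)."""
    w = [1.0 / (A + d) for d in D]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def bpe_beta(y, A, D):
    """Best predictive estimate of beta: minimizes sum_i B_i^2 (y_i - beta)^2."""
    w = [b * b for b in shrinkage_factors(A, D)]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def predict(y, A, D, beta):
    """Shrinkage predictor: theta_i = y_i - B_i * (y_i - beta)."""
    return [yi - b * (yi - beta) for yi, b in zip(y, shrinkage_factors(A, D))]

# Hypothetical toy data (not from the project).
y = [10.0, 12.0, 20.0, 8.0]   # direct estimates
D = [1.0, 1.0, 9.0, 4.0]      # sampling variances
A = 1.0                       # between-area variance, assumed known

beta_gls = gls_beta(y, A, D)
beta_bpe = bpe_beta(y, A, D)
blup = predict(y, A, D, beta_gls)   # BLUP-style predictions
obp = predict(y, A, D, beta_bpe)    # OBP-style predictions
```

The two estimates of beta differ only in their weights. The model-based weights 1 / (A + D_i) favor areas with precise direct estimates, whereas the predictive weights B_i^2 favor areas with noisy direct estimates, which are the areas whose predictions lean most heavily on the model. In the full method A would also be estimated rather than fixed, and covariates would enter through a regression term.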

Agency
National Science Foundation (NSF)
Institute
Division of Social and Economic Sciences (SES)
Type
Standard Grant (Standard)
Application #
1122399
Program Officer
Cheryl L. Eavey
Project Start
Project End
Budget Start
2011-10-01
Budget End
2014-09-30
Support Year
Fiscal Year
2011
Total Cost
$78,632
Indirect Cost
Name
University of Miami School of Medicine
Department
Type
DUNS #
City
Coral Gables
State
FL
Country
United States
Zip Code
33146