Data quality is a critical problem, especially for regularly administered instruments such as the Current Population Survey (CPS), which acquire a life of their own and resist major methodological innovation. This study implements a framework for resolving inconsistencies in the unemployment duration data from a four-month panel of the CPS. The inconsistency is that, in some records, the change in the reported number of weeks of unemployment does not equal the time elapsed between surveys, yet the reported data do not clearly indicate that one unemployment spell has ended and another has begun. Many records contain too little information to resolve ambiguities about the exact sequence of labor force states and the true duration of unemployment.

The proposed method for cleaning the unemployment duration data from the monthly CPS panels is probabilistic. For each record that is ambiguous about the true labor force state and the true duration of unemployment, a joint probability distribution is formulated over the possible consistent sequences of labor force states and true durations. Consistent sequences are then drawn by Monte Carlo sampling and used as inputs to a model of unemployment durations in the population. Repeated sampling of consistent sequences for each record yields different estimates of the population distribution and allows a measure of the information lost to inconsistencies in the data to be calculated. A proportional hazards Markov chain model is developed that includes covariates parametrically and analyzes duration dependence nonparametrically. The model is well suited to short, data-rich CPS panels and can provide a dynamic picture of how unemployment spell lengths among various labor force groups change in response to seasonal or other changes in the economic climate.

Two empirical studies are planned. The first will use the 4-8-4 sampling scheme of the CPS to estimate the effects of occurrence dependence on the length of unemployment spells. The second will use CPS data for all 12 months of 1981 to examine how the labor market prospects of various groups are affected by the onset of a recession. This is a very promising proposal by a young researcher on an important topic. If successful, the study could have a major effect on data analysis using the CPS, with possible extensions of the approach to the Panel Study of Income Dynamics (PSID) and other panel databases that face similar inconsistency problems.
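
The consistency check and the repeated sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation. Everything in it is an assumption made for illustration: the 4-to-5-week interview gap, the rule that a short reported duration signals a new spell, the function names, the candidate durations and their probabilities, and the use of the spread of the sample mean across draws as a stand-in for the information-loss measure.

```python
import random
import statistics

# Assumption: consecutive monthly interviews are about 4 to 5 weeks apart.
GAP_WEEKS = (4, 5)


def is_consistent(weeks):
    """Check a sequence of reported weeks unemployed, one value per month.

    A month-to-month step is accepted if the reported duration grew by the
    time between surveys (same spell continues), or if the later report is
    no longer than the gap (a new spell could have started in between).
    """
    for prev, curr in zip(weeks, weeks[1:]):
        continues = (curr - prev) in GAP_WEEKS
        new_spell = curr <= max(GAP_WEEKS)
        if not (continues or new_spell):
            return False
    return True


def sample_durations(candidates, rng):
    """Draw one duration per record from its candidate distribution.

    candidates: one list per record of (duration, probability) pairs,
    a hypothetical stand-in for the joint distribution over consistent
    sequences.
    """
    draws = []
    for options in candidates:
        durations = [d for d, _ in options]
        probs = [p for _, p in options]
        draws.append(rng.choices(durations, weights=probs, k=1)[0])
    return draws


def between_draw_spread(candidates, n_draws=200, seed=0):
    """Repeat the sampling and return (average estimate, spread of estimates).

    The estimate here is simply the mean duration; its standard deviation
    across draws reflects how much the ambiguity in the records matters.
    """
    rng = random.Random(seed)
    means = [
        statistics.mean(sample_durations(candidates, rng))
        for _ in range(n_draws)
    ]
    return statistics.mean(means), statistics.pstdev(means)
```

In this sketch a record with a single candidate duration contributes no spread, while a record with several plausible durations makes the estimate vary from draw to draw. A real application would replace the mean with the duration model described above.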

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 9113095
Program Officer: James H. Blackman
Budget Start: 1991-07-15
Budget End: 1993-03-31
Fiscal Year: 1991
Total Cost: $82,114

Name: Rutgers University
City: New Brunswick
State: NJ
Country: United States
Zip Code: 08901