This proposal seeks funding to create and distribute a nationally representative 1-in-100 Public Use Microdata Sample of the 1920 United States Population Census. Public Use Samples already exist or are in preparation for the census years 1850, 1880, 1900, 1910, 1940, 1950, 1960, 1970, 1980, and 1990. Although the historical samples for the period 1880-1950 have only been available for a few years, they have already led to an outpouring of new research on the nature of long-term social change. As each new sample is created, the value of the previous census files is enhanced, because they become increasingly useful for cohort analysis and studies of social change. Microdata files allow researchers to make tabulations tailored to their specific research questions and to avoid many of the incompatibilities in the published data for different census years. In addition, the public use samples have allowed researchers to move beyond simple tabular analysis and apply increasingly sophisticated multivariate techniques. These data have dramatically increased the power of quantitative social science research. A new public use sample for the 1920 census will bridge the existing gap between the 1910 and the 1940 public use samples. When the 1920 sample is complete, we will have a continuous series of microdata for every census year in the twentieth century, with the sole exception of the 1930 census, which is still protected by the Census confidentiality rules. The case for a sample from 1920 is especially compelling because the enumerators' manuscripts include a great deal of information on demography and social structure that can only be exploited through the creation of a new microdata set. The period from 1910 to 1940 is critical for the study of topics such as fertility decline, urbanization, immigration, household composition, and occupational structure.
Besides converting a sample of the 1920 population into machine-readable form, the project will evaluate the sample quality through consistency checks, random blind verification, and comparison with aggregate statistics in the published census volumes; edit and allocate missing, illegible, and inconsistent data through logical rules and imputation procedures; construct new variables on household composition and relationships within families; create alternative coding schemes to ensure that the 1920 sample is compatible with all other public use samples currently available; and prepare documentation for the user file, including detailed descriptions of the sampling and data processing procedures, and a guide to the use of the sample.

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Project (R01)
Study Section
Social Sciences and Population Study Section (SSP)
University of Minnesota Twin Cities
Schools of Arts and Sciences
United States