The CENSOC project, so named because it links 1940 Census data with Social Security Administration death records, will construct and share a new, large-scale, public microdata set for advancing understanding of mortality disparities in the United States. The project uses record linkage techniques to match deaths at ages 65 and over observed from 1975 to 2009 back to individual, family, and neighborhood characteristics in the census. Building on preliminary studies, we estimate that modern data-linkage techniques will allow us to construct a data set of about 15 million deaths, more than 30 times the size of the largest existing sample surveys. The unprecedented scale and detail of CENSOC data will allow researchers to make new discoveries in areas such as (a) mortality disparities by education, national origin, and race; (b) early life conditions and later-life mortality; and (c) geographic variation and the neighborhood determinants of mortality. These topics are increasingly important for understanding the growing disparities in life expectancy in the United States. The creation and distribution of population-level administrative mortality data with individual characteristics is central to the goal of promoting rigorous and replicable scientific research on mortality in the United States. Following the model of the Human Mortality Database (HMD) and the Integrated Public Use Microdata Series (IPUMS), the CENSOC data will be available for analysis on a distribution website. For those wishing to work with identifiable records and the complete count census, access will be possible through the existing network of more than 50 Complete Count Census repositories established under licensing agreements between U.S. academic institutions and the University of Minnesota. This secure access will allow additional data sets with individual identifiers to be linked to the CENSOC data.
To facilitate use of this rich data set, the project will develop new methods for estimating mortality rates that are especially appropriate for linked data. We will also carry out a set of "high resolution" studies on mortality disparities and longevity determinants that will advance knowledge and demonstrate the potential uses of the CENSOC data set. By taking advantage of existing administrative records, the CENSOC project has the potential to provide a vast, richly detailed, public "big data" resource for researchers studying old-age mortality disparities and the determinants of longevity.

Public Health Relevance

This project will use existing administrative mortality and census data to create a new public, linked-record dataset for studying mortality determinants in the United States. The resulting data, linking methods, and estimation techniques will enable a more accurate and more detailed investigation of mortality trends and determinants in the US, informing public health policy priorities.

National Institutes of Health (NIH)
National Institute on Aging (NIA)
Research Project (R01)
Study Section
Social Sciences and Population Studies B Study Section (SSPB)
Program Officer
Karraker, Amelia Wilkes
University of California Berkeley
Social Sciences
Schools of Arts and Sciences
United States