PI: Susan Leonard, University of Michigan

Dramatically falling death rates led to increased life expectancy in the US between the late 19th century and the early 20th century. The widely accepted explanation is that deaths from infectious diseases such as smallpox and cholera were on the wane while deaths from degenerative diseases like heart disease and cancer were on the rise, but changes in the names of diseases make this hypothesis difficult to test. This study aims to develop a database that systematically standardizes the causes of death, groups deaths into categories, and provides information about the quality, reliability, and importance of standardization and classification to understanding long-term trends in death rates. The database will contain individual death records from Holyoke and Northampton, Massachusetts, from 1850 to 1912 and two alternative cause-of-death classifications, one based on a standardization of the causes of death given in the original records and the other based on assigning deaths to the early International Classification of Causes of Death. The data will also include measurements of how reliable the classifications are and of whether classification makes a significant difference in mortality trends.

BROADER IMPACTS: The database will be made available to the public in a form suitable for use by historical mortality researchers and by public users such as genealogists. With these data, researchers will be able to better understand the role of controlling infectious disease in lowering overall death rates. The methods used to create the database will also be useful for researchers working in other settings where the meaning of causes of death may be unclear.

Project Report

Dramatically falling death rates in the US in the late 19th century and the early 20th century led to dramatically increased life expectancy. The generally accepted explanation for this decline is that deaths from infectious diseases like smallpox and cholera were falling swiftly while deaths from degenerative diseases like heart disease and cancer were rising slowly, although changes in how doctors named and described disease have always made this hypothesis difficult to test. However, data from Northampton and Holyoke, Massachusetts, show just the opposite of the trend towards longer life. The death rate rose as the rural towns grew into industrial cities and the population increased, and there was no dramatic and continued decline in the death rate. To understand this phenomenon, death records from the period need to be entered into a database and grouped into categories by standardizing and classifying the causes of death. Our project team developed such a database for Holyoke and Northampton, Massachusetts, from 1850 to 1912. We classified deaths to both the early International Classification of Causes of Death (ICD) and the mid-19th century classification system used by Massachusetts. Classification is necessary because so many causes appear in death records (over 10,000 unique causes of death in our database). Also, over time diseases were called by different names and more information was recorded about deaths. Using the techniques of text mining and principal component analysis of matrices of the words used in recorded causes of death, we developed a clustering of deaths based on the ‘natural’ language used in describing deaths. With these techniques we identified clusters, first by period, revealing profound changes in cultural views of deaths that impacted cause-specific trends in death rates over time.
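As a rough illustration of the word-matrix approach described above, the following minimal sketch builds a word-count matrix from a few cause-of-death strings and extracts the leading principal component by power iteration. It is not the project's actual pipeline: the sample strings, the function names (term_matrix, center_columns, first_component, project), and the choice of power iteration are all assumptions made for this example.

```python
import math
from collections import Counter

# Hypothetical cause-of-death strings (illustrative only, not project data).
causes = [
    "consumption of lungs",
    "phthisis pulmonalis",
    "consumption",
    "cholera infantum",
    "cholera infantum and teething",
    "heart disease",
    "organic disease of heart",
]

def term_matrix(texts):
    """Return (vocabulary, rows); rows[i][j] counts word j in text i."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    rows = []
    for t in texts:
        row = [0.0] * len(vocab)
        for word, count in Counter(t.lower().split()).items():
            row[index[word]] = float(count)
        rows.append(row)
    return vocab, rows

def center_columns(rows):
    """Subtract each column's mean so every word column averages zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def project(centered, axis):
    """Score of each record along the given axis (dot product per row)."""
    return [sum(x * a for x, a in zip(row, axis)) for row in centered]

def first_component(centered, iterations=200):
    """Approximate the leading principal axis by power iteration on X^T X."""
    d = len(centered[0])
    # Deterministic, non-uniform starting vector, normalized to unit length.
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = project(centered, v)  # X v
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(d)]        # X^T (X v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

vocab, rows = term_matrix(causes)
centered = center_columns(rows)
axis = first_component(centered)
for text, score in zip(causes, project(centered, axis)):
    print(f"{score:+.3f}  {text}")
```

Records whose scores lie close together along the leading axes use similar vocabulary. A full analysis would keep several components and apply a clustering algorithm to the projected scores; this sketch stops at the first component.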
The clusters were also computed across periods, allowing comparison with ICD-based categories and providing a measure (correlation of clusters with taxonomies) of the extent to which various cause-of-death classifications used in analysis are consistent with natural language discrimination in causes of death. With this information, we discovered strong time trends in how the words used to describe deaths were used. These trends correspond to changing medical education and new licensing requirements for physicians, and they confirm changes in the cultural treatment of specific causes of death over time. Together, these different classifications reveal how different views of death changed, and which diseases contributed to trends in death rates. In brief, we found that the rate of deaths in Northampton and Holyoke due to infectious diseases associated with inadequate sewerage systems and contaminated water was dropping from the 1870s on, but these diseases were not yet completely under control by 1912. Just as importantly, declines in tuberculosis and other infectious respiratory diseases were offset by increases in major chronic degenerative diseases. We believe that together these trends explain why death rates did not continue to fall in these emerging industrial cities, as they had elsewhere. Our data also confirm a decline in the acceptability of 'natural cause' and non-preventable 'child wastage' as causes of death, and show primary causes becoming more definite and standardized, with less detail regarding secondary causes and contributing circumstances. For researchers and the general public, we have posted an accessible database on the web linking the causes of death and their associated ICD codes with a search function for interested users: https://sites.google.com/a/umich.edu/grammars-of-death/. The site will continue to be updated and improved.
Other researchers can use this database to understand causes of death in their own data as well as the crucial role that controlling infectious disease plays in lowering overall death rates. The methods used to create the database will also be useful for researchers working in other settings where the meaning of causes of death may be unclear (those results will be published). More broadly, the database can help the public understand historical causes of death in a larger perspective.
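The report mentions a correlation between word-based clusters and classification taxonomies without naming the statistic used. As one possible way to quantify such agreement, the sketch below computes Cramér's V from the contingency table of two labelings. The choice of Cramér's V, the function name cramers_v, and the sample labels are assumptions made for illustration, not the project's documented method.

```python
import math
from collections import Counter

def cramers_v(labels_a, labels_b):
    """Cramér's V between two labelings of the same records (0 = none, 1 = perfect)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, n_a in count_a.items():
        for b, n_b in count_b.items():
            expected = n_a * n_b / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(count_a), len(count_b)) - 1
    if k == 0:
        return 0.0  # one labeling has a single category; association is undefined
    return math.sqrt(chi2 / (n * k))

# Hypothetical labels: word-based cluster ids and ICD-style categories.
clusters = [0, 0, 0, 1, 1, 2, 2]
categories = ["respiratory", "respiratory", "respiratory",
              "digestive", "digestive", "circulatory", "circulatory"]
print(cramers_v(clusters, categories))
```

A value near 1 would indicate that the clusters reproduce the taxonomy's categories, and a value near 0 that the two groupings are unrelated.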

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Type: Standard Grant (Standard)
Application #: 0961304
Program Officer: Patricia White
Project Start:
Project End:
Budget Start: 2010-04-01
Budget End: 2012-03-31
Support Year:
Fiscal Year: 2009
Total Cost: $146,224
Indirect Cost:
Name: University of Michigan Ann Arbor
Department:
Type:
DUNS #:
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109