This application seeks continued funding to expand and improve the National Historical Geographic Information System (NHGIS), the nation's most comprehensive source for statistical data, geographic data, and metadata describing spatial characteristics of the American population. Census data provide the denominators that are essential for all spatial analyses of epidemiological and demographic change; it is impossible to interpret the incidence of disease or death without knowing the population at risk. Over the past decade, NHGIS has collected area-level U.S. census data from diverse sources, formatted them consistently, developed comprehensive standardized machine-readable documentation, and created high-precision boundary files describing historical census tracts and counties. To make it feasible to assess change over time, NHGIS has developed integrated data tables that incorporate comparable data across multiple census years. To make the data broadly accessible to the research community, the project has produced powerful new dissemination tools that allow health researchers to navigate quickly and easily through the intricacies of the U.S. statistical system. With 253 billion data points, NHGIS is the largest accessible population database in the world, and it represents an essential component of big data infrastructure for population and health research. Over the past five years, the number of researchers using the database has increased by 600%, and the project is now disseminating 26 terabytes of data per year. This continuation project has four major goals: (1) To expand the database through the addition of all new American Community Survey summary files, new historical census data, new health datasets, and additional integrated time series tables. (2) To realign historical NHGIS boundaries to ensure compatibility with the 2010 census features, thereby allowing users to execute spatiotemporal analyses that include both recent and historical data.
(3) To improve data infrastructure and access by implementing novel database structures, an Application Programming Interface that allows external software to access NHGIS data, and new capabilities to reduce redundant effort by researchers. (4) To ensure dissemination and preservation through the development of a long-term preservation plan, implementation of digital object identifiers (DOIs) to facilitate data citation and research replication, and user support, training, and outreach. NHGIS represents a permanent and substantial contribution to the nation's health research infrastructure. It gives investigators the power to analyze variation in human behavior simultaneously across both time and space, opening unprecedented opportunities to understand processes of change. With the addition of new data describing health and the environment, NHGIS will become an even more powerful tool for spatiotemporal analysis of population health.
The proposed expansion, improvement, and support of the database are directly relevant to the central mission of the National Institutes of Health as the steward of medical and behavioral research for the nation. Census data provide the denominators that are essential for all spatial analysis of epidemiological and demographic change; it is impossible to interpret the incidence of disease or death without knowing the population at risk. These data are advancing fundamental knowledge about human population dynamics and spatial organization, and they address key priorities of the Demographic and Behavioral Sciences Branch of NICHD.