This application seeks continued funding to expand and improve the National Historical Geographic Information System (NHGIS), the nation's most comprehensive source for statistical data, geographic data, and metadata describing spatial characteristics of the American population. Census data provide the denominators that are essential for all spatial analyses of epidemiological and demographic change; it is impossible to interpret the incidence of disease or death without knowing the population at risk. Over the past decade, NHGIS has collected area-level U.S. census data from diverse sources, formatted them consistently, developed comprehensive standardized machine-readable documentation, and created high-precision boundary files describing historical census tracts and counties. To make it feasible to assess change over time, NHGIS has developed integrated data tables that incorporate comparable data across multiple census years. To make the data broadly accessible to the research community, the project has produced powerful new dissemination tools that allow health researchers to navigate quickly and easily through the intricacies of the U.S. statistical system. With 253 billion data points, NHGIS is the largest accessible population database in the world, and it represents an essential component of big data infrastructure for population and health research. Over the past five years, the number of researchers using the database has increased 600%, and the project is now disseminating 26 terabytes of data per year. This continuation project has four major goals: (1) To expand the database through the addition of all new American Community Survey summary files, new historical census data, new health datasets, and additional integrated time series tables. (2) To realign historical NHGIS boundaries to ensure compatibility with the 2010 census features, thereby allowing users to execute spatiotemporal analyses that include both recent and historical data. (3) To improve data infrastructure and access by implementing novel database structures, an Application Programming Interface that allows external software to access NHGIS data, and new capabilities to reduce redundant effort by researchers. (4) To ensure dissemination and preservation through the development of a long-term preservation plan, implementation of digital object identifiers (DOIs) to facilitate data citation and research replication, and user support, training, and outreach. NHGIS represents a permanent and substantial contribution to the nation's health research infrastructure. It gives investigators the power to analyze variation in human behavior simultaneously across both time and space, opening unprecedented opportunities to understand processes of change. With the addition of new data describing health and the environment, NHGIS will become an even more powerful tool for spatiotemporal analysis of population health.

Public Health Relevance

The proposed expansion, improvement, and support of the database are directly relevant to the central mission of the National Institutes of Health as the steward of medical and behavioral research for the nation. Census data provide the denominators that are essential for all spatial analysis of epidemiological and demographic change; it is impossible to interpret the incidence of disease or death without knowing the population at risk. These data are advancing fundamental knowledge about human population dynamics and spatial organization, and they address key priorities of the Demographic and Behavioral Sciences Branch of NICHD.

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Project (R01)
Study Section
Special Emphasis Panel (SSPB)
Program Officer
Bures, Regina M
University of Minnesota Twin Cities
Organized Research Units
United States
Kugler, Tracy A; Fitch, Catherine A (2018) Interoperable and accessible census and survey data from IPUMS. Sci Data 5:180007
Schroeder, Jonathan P (2017) Hybrid Areal Interpolation of Census Counts from 2000 Blocks to 2010 Geographies. Comput Environ Urban Syst 62:53-63
Saporito, Salvatore; Van Riper, David (2016) Do Irregularly Shaped School Attendance Zones Contribute to Racial Segregation or Integration? Soc Curr 3:64-83
Schroeder, Jonathan P; Van Riper, David C (2013) Because Muncie's Densities Are Not Manhattan's: Using Geographical Weighting in the EM Algorithm for Areal Interpolation. Geogr Anal 45:216-237
Saporito, Salvatore; Van Riper, David; Wakchaure, Ashwini (2013) Building the School Attendance Boundary Information System (SABINS): Collecting, Processing, and Modeling K to 12 Educational Geography. J Urban Reg Inf Syst Assoc 25:49-62
Sobek, Matthew; Cleveland, Lara; Flood, Sarah et al. (2011) Big Data: Large-Scale Historical Infrastructure from the Minnesota Population Center. Hist Methods 44:61-68
Noble, Petra; Van Riper, David; Ruggles, Steven et al. (2011) Harmonizing Disparate Data across Time and Place: The Integrated Spatio-Temporal Aggregate Data Series. Hist Methods 44:79-85
Ghosh, Debarchana; Manson, Steven M; McMaster, Robert B (2010) Delineating West Nile Virus Transmission Cycles at Various Scales: The Nearest Neighbor Distance-Time Model. Cartogr Geogr Inf Sci 37:149-163
Schroeder, Jonathan P (2010) Bicomponent Trend Maps: A Multivariate Approach to Visualizing Geographic Time Series. Cartogr Geogr Inf Sci 37:169-187