This project will create and freely disseminate an Integrated Spatio-Temporal Aggregate Data Series (ISTADS) for the United States covering the years between 1790 and 2010. To reduce barriers to research, the project will build an integrated database that will enable researchers to undertake consistent analyses of spatial and temporal variation across thousands of small geographic areas from the first decades of the republic to the present. The project entails five complementary tasks. 1. Expand an existing spatio-temporal database to include newly available aggregate census data from several sources and add county-level vital statistics. 2. Design an efficient metadata system that will identify comparable data elements across censuses. This system will build on recent innovations in the design of data integration metadata developed at the University of Minnesota and elsewhere. 3. Construct integrated statistical tables spanning between two and twenty-three censuses with closely comparable categories in each census year, allowing easy analysis of change over time. 4. Create integrated geographic units that maximize cross-temporal comparability through aggregation and interpolation. 5. Provide a web-based interface to data and metadata so that users can easily identify data that are comparable across time, export multi-year merged datasets suitable for statistical analysis and visualization, and incorporate the statistical data and corresponding shapefiles into a GIS or statistical package. This is basic infrastructure for population and health research, and it is urgently needed. The ready availability of integrated aggregate census data in a GIS framework will offer opportunities to address a broad range of research problems. Key areas include residential segregation and settlement patterns; suburbanization and urban sprawl; rural depopulation; concentration of poverty; causes and levels of change in ecosystems; criminology; and environmental justice.
Virtually all of these topics have important implications for public health. At present, it is exceedingly difficult for investigators to analyze small-area data in a consistent way across time. By providing the infrastructure to access this vast collection, the data series will allow researchers for the first time to address simultaneously the broad sweep of time and the detail of spatial organization. This power to analyze variation in human behavior across both time and space will stimulate innovation across fields ranging from history to epidemiology.
The proposed database is directly relevant to the central mission of the National Institutes of Health as the steward of medical and behavioral research for the nation; this infrastructure will advance fundamental knowledge about the nature of human population dynamics and will spark new health-related research. Researchers will have access to a vast collection of aggregate data, including detailed fertility and mortality statistics for the period since 1900, that will allow new studies that address simultaneously the broad sweep of time and the detail of spatial organization.