The search for a robust vector-based, multi-scale geospatial data architecture has challenged GIS and cartographic communities for decades. The successful data model must generate representations at multiple resolutions that preserve line length, local coordinate density, and correct topology. All three are prerequisite to robust spatial analysis and modeling. In particular, the solution must preserve internal and relative topologies. Internal topology means that (for example) stream channels must connect at their confluence, at every level of resolution. Relative topology preserves feature registration between layers: the road feature in one layer must register to the river feature in another layer at the bridge feature in a third layer. Preservation of internal and relative topology at multiple levels of resolution confounds many aspects of GIS-based spatial analysis: simply stated, if features are "conflated" (if they do not register spatially) it becomes difficult to assess spatial relationships that underlie most GIS investigations. This project will implement a pyramid-based vector data architecture along with algorithms for retrieval and data management. The first phase of research implements algorithms for simple vectors, compound vectors (e.g., stream tributaries) and multi-layer vector archives (e.g., roads and hydrography) and demonstrates them on a public domain website that delivers example vector files at progressive resolutions. The second phase addresses core intellectual questions about the nature of geographic information whose content and structure vary with resolution, by embedding uncertainty metrics in the data pyramid that can help to automatically determine if vector data representations at two resolutions are in fact different, and if independently compiled vector datasets contain essentially the same content or unique details across a range of resolutions.
The research will benefit human and physical geographers, by facilitating access to important vector datasets, as for example historical GIS datasets of US Census boundaries dating back to 1790. It will validate general purpose geospatial data products such as USGS National Map, by reducing costs of compiling and maintaining numerous 'single purpose' vector databases at resolutions tailored to special purpose applications. The website demonstrating progressive vector data delivery will show how the pyramid architecture can reduce delivery times for very large data archives. The research will refine current metadata practice that assumes positional accuracy is uniform throughout a vector archive. Improved understanding of internal and relative topology in scale-dependent data will support creation of robust, multi-scale, topologically correct vector databases. Broader impacts will improve choices about selecting GIS data at appropriate levels of detail, and about integrating geospatial data sets compiled at discrepant resolutions.