This proposal responds to PA-11-238, Spatial Uncertainty: Data, Modeling and Communication (R01). Our research focuses on documenting, visualizing and utilizing data error and uncertainty information in spatial analysis. When features undergo spatial aggregation, the errors introduced by the process go undocumented, so data users are unaware of the magnitude of error and uncertainty in a given dataset. Health outcomes of geocoded individual respondents often must be aggregated, either geographically or categorically, to preserve privacy when indices such as derived cancer rates are published. Properly explaining health outcomes by neighborhood-level characteristics requires knowing and using the geographic distribution of individuals within areal units, together with the areal associations among these distributions. Meanwhile, although data quality information is becoming more readily available, existing mapping tools do not adequately incorporate it. Data users also often ignore data error and uncertainty information, treating spatial data and associated maps as error- and uncertainty-free. Thus, analyses such as geographic cluster detection are performed without considering the quality of the data. This proposal addresses these data quality issues with the following specific aims:
1) Formulate indices to quantify the impacts of aggregation error. We would address two aspects: the distributions of geocoded individuals within areal units, and the impacts of attribute errors through spatial aggregation.
2) Develop methods and tools to visualize attribute errors arising from sampling and spatial aggregation. We would enhance our current data quality visualization tools for a GIS, modify existing visualization frameworks, and introduce tools to support new legend designs and map classification methods.
3) Introduce spatial statistical methods that incorporate error and uncertainty information into analyses of global and local spatial pattern detection. We would evaluate the reliability of existing methods and propose new methods to account for sampling, specification, and measurement error. We would also incorporate the aggregation error measures developed under Aim 1.
Ignoring error in spatial data is detrimental to the formulation of effective policies and the making of sound decisions. Our proposed work would enhance future data gathering and processing efforts, enable users to consider different types of error information, improve the reliability of spatial pattern detection by incorporating data quality information, translate uncertainty information into maps, and communicate data quality information to users. The results would be broadly applicable.
In this project, we would develop methods to identify various types of error, including aggregation error, in spatial data. We would develop visual-analytical techniques and tools, inside and outside of GIS, to help users recognize sampling and aggregation errors and manage spatial error through spatial aggregation. We would also evaluate existing methods commonly used in spatial epidemiology to detect global and local clustering patterns, and develop new ones.