The term "local spatial statistics" refers to a set of statistics used to measure and test for clustering (e.g., of crime or disease) around a specified geographic location. These statistics have become extremely popular for testing hypotheses in geography and in other disciplines. They are used to determine whether there is raised incidence of some phenomenon in the area surrounding the location of interest. Implementation requires that the spatial extent of the area of influence be defined. In practice, this is almost always done arbitrarily, most commonly by defining it to consist of the location of interest together with any immediately adjacent locations. However, the underlying process may operate at a different spatial scale. For instance, disease risk may extend beyond locations adjacent to a source of pollution. It is tempting to explore different spatial scales to see which scale yields the most significant local statistic. Indeed, some studies take this approach, but it raises another problem: interpretation must account for the fact that multiple scales have been examined, because one is likely to find statistical significance if enough tests at various spatial scales are carried out.
Professor Peter Rogerson in the Department of Geography at the State University of New York at Buffalo will address this problem by developing methods for assessing the statistical significance of the most significant local spatial statistic. The primary objective is to modify and extend non-spatial statistical methods that have been applied previously to the problem of finding changepoints in temporal sequences of data; these methods are based upon results from probability theory. The goal will be to bring these results to bear on the problem of assessing the significance of local spatial statistics defined over a range of spatial scales. This will be carried out for a variety of different local spatial statistics.
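The multiple-scale problem described above, and the idea of judging the most significant statistic against its own null distribution, can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the project's method: it uses Monte Carlo simulation in place of the probability-theory results the project draws on, it assumes each location carries an independent standard normal value under the null hypothesis, and it defines the local statistic at scale k as the standardized sum over the focal location and its k-1 nearest neighbors. All function names and parameter values are hypothetical.

```python
import math
import random


def local_stats(values):
    """Return the local statistic at every scale k = 1..len(values).

    `values` are standardized observations sorted by distance from the
    focal location (focal location first). The statistic at scale k is
    the sum of the first k values divided by sqrt(k), which is standard
    normal under the assumed null of independent N(0, 1) values.
    """
    stats, running = [], 0.0
    for k, v in enumerate(values, start=1):
        running += v
        stats.append(running / math.sqrt(k))
    return stats


def simulate_null_max(n_scales, n_sims, rng):
    """Monte Carlo draws of the maximum statistic over all scales under the null."""
    return [
        max(local_stats([rng.gauss(0.0, 1.0) for _ in range(n_scales)]))
        for _ in range(n_sims)
    ]


def naive_false_positive_rate(null_maxima, critical=1.645):
    """Share of null simulations in which the best scale looks 'significant'
    when judged against the single-test one-sided 5% critical value."""
    return sum(m > critical for m in null_maxima) / len(null_maxima)


def adjusted_critical_value(null_maxima, alpha=0.05):
    """Empirical (1 - alpha) quantile of the null distribution of the maximum."""
    ordered = sorted(null_maxima)
    index = min(len(ordered) - 1, math.ceil((1 - alpha) * len(ordered)) - 1)
    return ordered[index]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
null_maxima = simulate_null_max(n_scales=10, n_sims=5000, rng=rng)
naive_rate = naive_false_positive_rate(null_maxima)
adjusted = adjusted_critical_value(null_maxima)

print(f"False-positive rate using 1.645 at the best scale: {naive_rate:.3f}")
print(f"Critical value adjusted for scanning 10 scales:    {adjusted:.3f}")
```

In this sketch the naive false-positive rate exceeds the nominal 5% because the best of several correlated statistics is being compared with a threshold meant for a single test; comparing the observed maximum with the adjusted critical value restores the intended error rate. The scale at which the maximum occurs then serves as a data-driven estimate of the extent of the area of influence.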
A consequence of the popularity of local spatial statistics has been increased visibility for the spatial sciences across a wide range of disciplines. The new methods to be developed will have the desirable feature of allowing the data to "speak for themselves": rather than arbitrarily prespecifying the area of influence around a location, the data will be used to estimate the appropriate geographic scale at which the underlying spatial process operates. In addition to the estimation of scale, the methods provide a statistical assessment of significance, thereby allowing tests of hypotheses regarding increased risk (e.g., risk of disease or crime) around a location. The methods should have a broad impact on both geography and other fields that examine spatial processes. The research includes a plan for dissemination of the methods for use in software packages currently used for local statistics. More fundamentally, the choice of scale is central to spatial analysis: the newly developed methods will (i) provide guidance on this choice, and (ii) facilitate testing for the presence of spatial clustering around particular locations, across a range of scales. The project will result in a set of research papers that will be submitted to peer-reviewed journals in geography, training for graduate students in the field of spatial science, and a research experience for one to two undergraduates.
This project developed statistical methods to determine the extent of the geographical influence of specific locations on outcomes of interest. Examples include (a) determining the geographic extent to which hazardous waste sites influence the health of the population, and (b) determining the geographic extent to which alcohol outlets or a set of vacant houses might influence rates of crime in the surrounding area. The methods were based upon statistical methods that had been used previously to find changes in time series data (e.g., to find temporal changes in stock prices). These methods were modified so that they could be used in geographic applications such as those described above. The methods allow researchers to assess more accurately whether there is raised risk (e.g., of crime or disease) around locations of interest. If the methods, when employed, do indicate raised risk, they also indicate the geographic extent of the area in which raised risk exists. The results should be of interest to statisticians interested in the development of new methodology, and to social and physical scientists who work with geographic data and who are interested in assessing the geographic range over which locations of interest influence outcomes such as crime or disease. Outcomes were disseminated through four publications and five presentations at professional conferences.