Statistical methods for the analysis of spatial discrete data are relatively underdeveloped compared to methods for continuous data. This is a notable methodological gap, since discrete data are routinely collected in the earth and social sciences. For instance, counts of deaths from different causes are collected on a regular basis by government agencies throughout the U.S. and classified according to demographic variables such as age, gender and race. This project aims to fill this gap by developing a comprehensive study of models for geostatistical discrete data. The project consists of three parts. First, a class of hierarchical spatial models is developed that seeks to ameliorate some limitations of currently used models that the investigator has identified. Some of these limitations, which concern the spatial association structures these models can represent, are especially severe when the data consist mostly of small counts, which is precisely the case in which models that account for the discreteness of the data are most needed. The properties of these new models and likelihood-based methods to fit them are studied. Second, a class of non-hierarchical spatial models is developed that seeks to represent a wide range of spatial discrete data, not just counts, with spatial association structures complementary to those in the class of hierarchical spatial models. The models in this class are constructed by separately modeling the marginal and spatial association structures, using an approach akin to copulas. The properties of these models and likelihood-based methods to fit them are also studied. Third, a recently proposed Bayesian method to assess the goodness-of-fit of statistical models is studied, and its soundness for use in the aforementioned classes of models is explored. The method, based on a distributional identity between pivotal quantities evaluated at different parameter values, is applicable to both hierarchical and non-hierarchical models. Developing such methods is a pressing need, since formal methods to assess the adequacy of spatial models are notoriously lacking.
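
To illustrate what separately modeling the marginal and spatial association structures can look like, the following is a minimal sketch and not the project's actual model. It assumes a Gaussian copula with an exponential correlation function and Poisson marginals; these choices, along with every function name and parameter value (`exp_correlation`, `poisson_quantile`, `simulate_counts`, `mean=1.5`, `range_par=2.0`), are hypothetical and chosen only for illustration.

```python
import math
import random


def exp_correlation(coords, range_par):
    """Exponential correlation: corr(s, t) = exp(-||s - t|| / range_par)."""
    n = len(coords)
    return [[math.exp(-math.dist(coords[i], coords[j]) / range_par)
             for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular factor L with L L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def poisson_quantile(u, mean):
    """Smallest integer k with P(Y <= k) >= u for Y ~ Poisson(mean)."""
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    # The iteration cap guards against floating-point stalls for u near 1.
    while cdf < u and k < 10_000:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def simulate_counts(coords, mean, range_par, rng):
    """Draw one realization of spatially associated Poisson counts."""
    low = cholesky(exp_correlation(coords, range_par))
    n = len(coords)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Association structure: correlated standard normal field z = L eps.
    z = [sum(low[i][k] * eps[k] for k in range(i + 1)) for i in range(n)]
    # Marginal structure: map each z_i to a uniform, then to a Poisson count.
    u = [min(normal_cdf(zi), 1.0 - 1e-12) for zi in z]
    return [poisson_quantile(ui, mean) for ui in u]


# Hypothetical example: a 4 x 4 grid of sampling locations.
coords = [(x, y) for x in range(4) for y in range(4)]
counts = simulate_counts(coords, mean=1.5, range_par=2.0, rng=random.Random(0))
print(counts)
```

In this sketch the count at every location has exactly a Poisson marginal distribution with the chosen mean, whatever the correlation function is, while the spatial association enters only through the latent Gaussian field. Either component can therefore be changed without altering the other.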

Spatial data are now routinely collected in many earth and social sciences, such as ecology, epidemiology, demography and geography, but methodology for the analysis of discrete data (say, death counts) is much less developed than the corresponding methodology for the analysis of continuous data (say, temperature). The investigator proposes to fill this gap by constructing new classes of models that, on the one hand, ameliorate some limitations of currently used models that the investigator has identified and, on the other hand, broaden the range of data patterns the models can represent. The project will also develop methodology to assess the adequacy of the newly proposed models, a ubiquitous task in science, since any model is an imperfect representation of the phenomenon under study. The statistical methodology developed in the course of this project would have immediate methodological and practical impact on the earth and social sciences, where spatial discrete data are routinely collected but models and methods for their analysis are scarce. The proposed classes of models will substantially expand the set of tools available to spatial data analysts and make it possible to represent a wide range of behaviors of spatial discrete data. Graduate students will be engaged in the project, which will contribute to their statistical training in Bayesian methods and spatial statistics, as well as to the future development of the Ph.D. program in Applied Statistics at the University of Texas at San Antonio.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1208896
Program Officer: Gabor Szekely
Project Start:
Project End:
Budget Start: 2012-09-01
Budget End: 2016-08-31
Support Year:
Fiscal Year: 2012
Total Cost: $149,991
Indirect Cost:
Name: University of Texas at San Antonio
Department:
Type:
DUNS #:
City: San Antonio
State: TX
Country: United States
Zip Code: 78249