This proposal lays out a comprehensive framework for statistical inference on point-referenced spatial data observed at a large number of locations. Statistical theory is used to develop methods that are mathematically formal yet computationally feasible and that can serve a broad range of applications. Hierarchical models implemented through Markov chain Monte Carlo (MCMC) methods have become especially popular for spatial modelling, given their flexibility, their power to fit models that would be infeasible with classical methods, and their avoidance of possibly inappropriate asymptotics. However, fitting hierarchical spatial models often involves expensive matrix decompositions whose computational cost grows as the cube of the number of spatial locations, rendering such models infeasible for large spatial data sets. This burden is aggravated in multivariate settings with several spatially dependent response variables, and also when data are collected at frequent time points and spatiotemporal process models are used. The investigators propose a class of models based upon a stochastic process obtained by projecting the original process onto a lower-dimensional subspace. The investigators term these predictive process models and propose to explore their theoretical properties. The long-term goal of the PI is to develop a full suite of statistical methods for estimating spatial models in a wide variety of experiments in forestry and ecology. A recurrent theme that distinguishes the proposed methods from existing ones is that the modeler need not sacrifice richness in modeling to accommodate large datasets. This resolves the statistical irony that large datasets are precisely where statistical estimates of rich association structures are permissible, yet also where fitting such structures is computationally hardest. The emphasis is on models that can be fitted even with moderately powerful computing tools and so would be accessible to a large number of researchers.
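
The projection idea described above can be illustrated with a small sketch. It is not the investigators' implementation; it assumes one common way of realizing such a projection, namely conditioning the parent process on its values at a small set of m "knots", so that only an m x m matrix has to be decomposed instead of an n x n one. The exponential covariance, its parameters (sigma2, phi), the 3 x 3 knot grid, and all function names are illustrative assumptions.

```python
import math

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance between two 2-D points (an assumed choice)."""
    return sigma2 * math.exp(-phi * math.dist(a, b))

def solve_spd(A, b):
    """Solve A x = b for a symmetric positive-definite A via Cholesky."""
    m = len(A)
    L = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # Forward substitution: L y = b
    y = [0.0] * m
    for i in range(m):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * m
    for i in reversed(range(m)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, m))) / L[i][i]
    return x

def predictive_cov(s, t, knots):
    """Covariance of the projected process: c(s)^T C*^{-1} c(t).

    c(.) holds covariances between a location and the knots, and C* is the
    m x m covariance among the knots, so the cubic cost is in m, not n.
    """
    c_s = [exp_cov(s, k) for k in knots]
    c_t = [exp_cov(t, k) for k in knots]
    C_star = [[exp_cov(a, b) for b in knots] for a in knots]
    x = solve_spd(C_star, c_t)
    return sum(u * v for u, v in zip(c_s, x))

# Hypothetical 3 x 3 grid of knots on the unit square (m = 9).
knots = [(x / 2, y / 2) for x in range(3) for y in range(3)]
```

Under these assumptions, the projected covariance agrees with the parent covariance at the knots themselves, and its variance at any other location does not exceed the parent variance; the difference is the information lost by working in the lower-dimensional subspace.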

With the increasing popularity and availability of spatial referencing technologies such as Geographical Information Systems (GIS) and Global Positioning Systems (GPS), which can identify geographical coordinates with a simple hand-held device, scientists and researchers in a variety of disciplines today have access to large amounts of geocoded data. The broader impact of the proposed methods is best assessed by connecting the outcome of this research with the widely recognized impact of GIS on human society. From identifying spatial disparities in health standards to producing more precise weather predictions, GIS technology is used today in almost every sphere of society. By freeing investigators from ad hoc and qualitative methods that often produce spurious findings, the proposed methods can have far-reaching beneficial effects in environmental research, potentially touching unexpected corners of society. Consider, for example, an ecologist who fails to recognize critical symbiotic relationships among multiple species because of inadequate models. Mathematical formalism, for all its complexities, reduces such errors, which arise from the qualitative techniques currently prevalent in forestry and ecological analysis. These and many other scientific problems require formal spatial analysis that harnesses the full power of the information that large datasets carry. They arise in, among other fields, public and environmental health, meteorology, engineering, and the geosciences, where the fundamental goal is the same: to use new findings to help improve human society.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0706870
Program Officer: Gabor J. Szekely
Budget Start: 2007-06-15
Budget End: 2010-08-31
Fiscal Year: 2007
Total Cost: $253,511
Name: University of Minnesota Twin Cities
City: Minneapolis
State: MN
Country: United States
Zip Code: 55455