Many behaviors of interest involve discrete responses in a temporal and spatial context. Examples include the success of plant species in a series of adjacent fields, land-use designations across 30-meter grid cells, popular election outcomes across counties, and levels of crime across neighborhoods and over time. In the transportation arena, such responses include trade-flow distributions across zones and vehicle-ownership levels across households. All these behaviors can be measured (and/or coded) as discrete responses, dependent on various influential factors and exhibiting some degree of temporal and spatial dependence or autocorrelation. Significant uncertainty generally lingers in predictive models; unobservable yet influential factors remain. The size of such contributions varies, often in a continuous fashion over space. In contrast to time-series data, the dependencies are two-dimensional. This added complexity tends to limit model specifications to the use of weight matrices, smaller data sets, and arbitrary correlation patterns. Methods are needed to capitalize on the emergence of huge and highly detailed digital data sets. This work seeks to address existing gaps by developing new statistical models for discrete response data that incorporate the effects of spatial and temporal autocorrelation. The research will develop, estimate, apply, and compare dynamic ordered and unordered probit models for spatial processes, based on a marriage of satellite imagery and more commonly available databases for urban systems analysis. The first of these models emphasizes ordered responses (such as differing intensities of land use), while the second recognizes unordered, categorical data (using a latent-response optimization framework). Both sets of models will apply over time and space, using a combination of LandSat satellite imagery and more readily available data sets over several years.
Multiple parameter estimation techniques will be explored, including maximum simulated likelihood estimation (MSLE), Bayesian methods, generalized method of moments (GMM), and non-parametric techniques. Model application will be demonstrated using land-cover/land-use data acquired via LandSat satellite imagery for Austin, Texas, and less urbanized regions of the globe as data sets become available. The Austin imagery will be supplemented by U.S. Census data and land-use and transportation-systems data maintained by the region's planning agency.
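
For orientation, the ordered-response case can be written as a generic spatial ordered probit. This is only an illustrative textbook form, not necessarily the specification developed in the project; the symbols (latent response y*, spatial weight matrix W, autocorrelation parameter ρ, thresholds γ) are assumptions introduced here.

```latex
\begin{aligned}
y^{*} &= \rho W y^{*} + X\beta + \varepsilon, \qquad \varepsilon \sim N(0, I_n), \\
y_i &= k \quad \text{if } \gamma_{k-1} < y_i^{*} \le \gamma_k, \qquad k = 1, \dots, K, \\
\gamma_0 &= -\infty, \qquad \gamma_K = +\infty .
\end{aligned}
```

In the unordered case, each alternative would instead have its own latent value, and the observed category would be the alternative with the largest latent value.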

Almost all data sets have a spatial dimension, and the world is poised to benefit from improvements in spatial econometric methods and channels of data acquisition for a tremendous variety of applications. The first of these models will be used to better understand and anticipate changes in the intensity of land development (e.g., undeveloped, lightly developed, and highly developed), while the second will be used to appreciate variations in land use over a categorical (rather than ordered) set of designations (e.g., residential versus commercial versus undeveloped). The focus and most challenging aspects of the work are methodological in nature. Nevertheless, the use of land-use data sets offers a meaningful and highly tangible application that demonstrates the value of new spatial econometric methods and the benefits of satellite imagery in tandem with more traditional data sets. The work's primary contributions are specification and estimation techniques for wholly new statistical methods that recognize temporal and spatial dependencies in discrete multiple-response data, and the demonstration of how satellite images can be used for purposes of metropolitan planning and transportation systems modeling. The model specifications and estimation techniques to be developed will fill a key void in the fields of spatial statistics and spatial econometrics, where models of continuous response data are the norm. The generic nature of the spatial econometric methods to be developed makes them applicable to many social, environmental, and other issues, wherever outcomes are discrete in nature and observed over time and space. Their application to land-cover change will enhance current understanding of regional development and human activity patterns, facilitating public and private policy evaluation.

Project Report

PI (University of Texas): Kara Kockelman
PI (Bucknell/RPI): Xiaokun Cara Wang

This project resulted in several papers by the team analyzing a variety of spatial transportation data, including land use change, traffic volumes, and roadway crashes. Below is a summary of the methods and findings of the 8 publications produced by this project, both published and under review. These papers explore a wide variety of cutting-edge features of spatial data forecasting, addressing important theoretical and estimation questions while generating meaningful results regarding land development, traffic volume prediction, and crash safety that are useful for the general public. More details can be found at www.ce.utexas.edu/prof/kockelman/.

Papers

Wang, Y., Kockelman, K., and Damien, P. (2012) A Spatial Autoregressive Multinomial Probit Model for Anticipating Land Use Change in Austin, Texas. Proceedings of IATBR's 13th International Conference on Travel Behavior Research Board, in Toronto (2012). Under review for presentation in the 92nd Annual Meeting of the Transportation Research Board and for publication in Transportation Research Record.

Wang, Y., Kockelman, K., and Wang, X. (2012) Understanding Spatial Filtering for Analysis of Land Use Data Sets. Under review for presentation in the 92nd Annual Meeting of the Transportation Research Board and for publication in Transportation Research Record.

Wang, Y., and Kockelman, K. (2012) A Conditional-Autoregressive Count Model for Pedestrian Crashes across Neighborhoods. Under review for presentation in the 92nd Annual Meeting of the Transportation Research Board and for publication in Transportation Research Record.

Wang, Cara., D. Zhang, and D. Magalhães (2012) Using Bicycles for Daily Commuting in Belo Horizonte, Brazil: Assessment of User Willingness Level with Spatial and Heterogeneity Considerations. Under review for presentation in the 92nd Annual Meeting of the Transportation Research Board and for publication in Transportation Research: Part A.

Wang, Y., Kockelman, K., and Wang, X. (2012) The Impact of Weight Matrices on Parameter Estimation and Inference: A Case Study of Binary Response Using Land Use Data. Under final review for publication in the Journal of Transportation and Land Use.

Wang, X., Kockelman, K., and Lemp, J. (2012) The Dynamic Spatial Multinomial Probit Model: Analysis of Land Use Change Using Parcel-Level Data. Journal of Transport Geography 24: 77-88.

Selby, B., and Kockelman, K. (2011) Spatial Prediction of AADT in Unmeasured Locations by Universal Kriging. Proceedings of the 90th Annual Meeting of the Transportation Research Board, and under final review for publication in the Journal of Transport Geography.

Wang, Y., Kockelman, K., and Wang, X. (2011) Anticipation of Land Use Change through Use of Geographically Weighted Regression Models for Discrete Response. Transportation Research Record No. 2245: 111-123.

Selected Abstract: The Impact of Weight Matrices on Parameter Estimation and Inference: A Case Study of Binary Response Using Land Use Data
Yiyi Wang, Kara Kockelman, and X. Cara Wang

This paper develops two new models and evaluates the impact of using different weight matrices on parameter estimates and inference in three distinct spatial specifications for discrete response. These specifications rely on a conventional, sparse, inverse-distance weight matrix for a spatial autoregressive probit (SARP), a spatial autoregressive approach where the weight matrix includes an endogenous distance-decay parameter (SARPa), and a matrix exponential spatial specification for probit (MESSP). These are applied in a binary choice setting using both simulated data and parcel-level land use data. Parameters of all models are estimated using Bayesian methods.
In simulated tests, adding a distance-decay parameter term to the spatial weight matrix did not alter the quality of estimation and inference, but the added sampling loop required to estimate the distance-decay parameter substantially increased computing times. By contrast, the MESSP model’s obvious advantage is its fast computing time, thanks to elimination of a log-determinant calculation for the weight matrix. In the model tests using actual land use data, the MESSP approach emerged as the clear winner in terms of fit and computing times. Results from all three models offer consistent interpretation of parameter estimates, with locations farther away from the regional CBD and closer to roadways being more prone to (mostly residential) development (as expected). Again, the MESSP model offered the greatest computing-time savings, but all three specifications yielded similar marginal effects estimates, showing that a focus on the spatial interactions and net (direct plus indirect) effects across observational units is more important than a focus on slope-parameter estimates when properly analyzing spatial data.
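
The log-determinant point in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the random coordinates, the distance cutoff, and the values of `rho` and `alpha` are all hypothetical. It builds a row-standardized inverse-distance weight matrix W, then compares the log-determinant of the spatial autoregressive transform (I - rho W), which must be computed, with that of the matrix exponential transform exp(alpha W), which equals alpha times the trace of W and is therefore zero whenever W has a zero diagonal.

```python
import numpy as np


def inverse_distance_weights(coords, cutoff):
    """Row-standardized inverse-distance weights (hypothetical helper).

    Weights are zero on the diagonal and beyond `cutoff`.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    w = np.zeros_like(dist)
    mask = (dist > 0) & (dist <= cutoff)
    w[mask] = 1.0 / dist[mask]
    row_sums = w.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # isolated units keep an all-zero row
    return w / row_sums


def matrix_exponential(a, terms=30):
    """Truncated Taylor series for exp(a); adequate for small-norm matrices."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms):
        term = term @ a / k
        result = result + term
    return result


# Simulated locations (assumption: 50 points on a 10-by-10 square).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
W = inverse_distance_weights(coords, cutoff=3.0)

rho, alpha = 0.5, -0.7  # hypothetical spatial parameters

# SAR-type transform: the log-determinant depends on rho and must be evaluated.
_, sar_logdet = np.linalg.slogdet(np.eye(len(W)) - rho * W)

# MESS-type transform: log det exp(alpha W) = alpha * trace(W) = 0.
_, mess_logdet = np.linalg.slogdet(matrix_exponential(alpha * W))

print("log|I - rho W|    =", sar_logdet)
print("log|exp(alpha W)| =", mess_logdet)
```

In a sampler that updates the spatial parameter at every iteration, the first quantity has to be recomputed or approximated each time, while the second drops out of the likelihood; this is the source of the computing-time advantage the abstract describes.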

Agency: National Science Foundation (NSF)
Institute: Division of Social and Economic Sciences (SES)
Application #: 0818066
Program Officer: Cheryl L. Eavey
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $208,436
Indirect Cost:
Name: University of Texas Austin
Department:
Type:
DUNS #:
City: Austin
State: TX
Country: United States
Zip Code: 78712