Spatio-temporal analyses can support many applications, including reducing traffic congestion, identifying hotspot areas in which to deploy mobile clinics, and informing urban planning. Unfortunately, the data poses many computational challenges. The complex nature of spatio-temporal data violates standard assumptions in machine learning and data mining algorithms. These complexities include spatial and temporal correlation of observations, dynamic and abrupt changes in observations, variability in the length and frequency of measurements, and data drawn from multiple sources of information. In recognition of these challenges, various efforts have been undertaken to develop specialized spatio-temporal models. Yet, to date, these algorithms are predominantly designed to analyze small- to medium-sized datasets. The goal of this project is to develop a comprehensive computational tensor platform to perform automated, data-driven discovery from spatio-temporal data across a broad range of applications. The project also includes a set of integrated educational activities such as a Massive Open Online Course that covers cross-disciplinary topics at the confluence of computer science and geospatial applications, annual spatio-temporal data challenges and hackathons, and an annual event at the Atlanta Science Festival to create public awareness and encourage participation by women and minorities.
The project will contain algorithmic innovations that reflect appropriate assumptions about spatio-temporal data without sacrificing real-time performance, computational scalability, or cross-site learning, even under privacy constraints. The proposed platform will generalize tensor modeling to encompass the complex nature of spatio-temporal data, including time irregularity, spatio-temporal correlations, and evolving distributions. It will enable the integration of data from heterogeneous sources to yield robust and cohesive learned patterns. The novel algorithms will also facilitate learning in decentralized settings while preserving privacy. The computational platform will contain interchangeable modules that can adapt to new spatio-temporal settings and incorporate additional contextual information. The accompanying suite of algorithms will enable predictive learning, pattern mining, and change detection from large-scale spatio-temporal data. The broad applicability of the project will be demonstrated on a diverse range of data, including urban transportation services, real estate market transactions, and population health. The algorithmic innovations introduced can also be used to scale other machine learning models.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.