Technological advances have enabled the collection of large, complex data that evolve over time. Such data also exhibit heterogeneity across multiple entities (e.g., countries, patients) and on many occasions are sampled (collected) at different frequencies. Hence, there is a strong need to develop data analysis techniques tailored to the specific requirements imposed by temporal dependence across multiple variables, and that also address varying sampling frequency and heterogeneity. The statistical learning models and associated analysis methods developed in this project would be applicable across a wide range of fields, including analysis and forecasting with macroeconomic and financial data and neuroscience. Empirical work building on this project would provide insights into the functional connectivity of brain regions, and would also quantify the degree of heterogeneity among subjects suffering from a common disease. The methods would also be useful to policy makers and financial regulators for devising monitoring schemes that assess stress conditions across markets. Further, we expect significant technology transfer to other application areas, such as environmental sciences, where similar types of data, characterized by heterogeneity and mixed frequency sampling, are available.

To address the challenges of temporal dependence, heterogeneity and varying sampling frequency in the data, this project would: (i) develop and investigate Bayesian versions of Vector Autoregressive (VAR) models for high-dimensional time series data, based on novel prior distributions, (ii) introduce structured sparsity in VAR models and also incorporate exogenous variables, (iii) develop approximate dynamic factor models that can accommodate strongly correlated idiosyncratic components, (iv) develop methods for joint estimation of related VAR models and finally (v) develop Bayesian methodology for handling mixed frequency time series data. A strong emphasis is placed on providing uncertainty quantification of the model parameters, which is particularly important in applications and is lacking in many modern methods for large data sets. This project would advance the state of the art for Big Data settings involving a large number of time series on the modeling, computational and inference fronts. Finally, doctoral students would receive mentoring in novel, timely topics in time series modeling, analysis and forecasting, and the course curriculum would be advanced.
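
For readers unfamiliar with aim (i), the following is a minimal illustrative sketch of a Bayesian VAR(1) model with uncertainty quantification. It is not the project's methodology: it assumes a textbook independent Gaussian prior on each coefficient and a known noise variance (the abstract's "novel prior distributions" are not specified), and the function names `simulate_var1` and `bayes_var1_posterior`, the example matrix `A_true`, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_var1(A, n_obs, noise_sd=1.0, seed=0):
    """Simulate x_t = A x_{t-1} + e_t with e_t ~ N(0, noise_sd^2 I)."""
    rng = np.random.default_rng(seed)
    k = A.shape[0]
    x = np.zeros((n_obs, k))
    for t in range(1, n_obs):
        x[t] = A @ x[t - 1] + noise_sd * rng.standard_normal(k)
    return x


def bayes_var1_posterior(x, noise_sd=1.0, prior_sd=1.0):
    """Conjugate posterior for the VAR(1) transition matrix.

    Assumes (for illustration only) independent N(0, prior_sd^2) priors on
    every coefficient and a known noise standard deviation.
    Returns the posterior mean (k x k) and the posterior covariance (k x k)
    of one row of coefficients; it is the same for every row here because
    all equations share the same lagged regressors and noise variance.
    """
    X, Y = x[:-1], x[1:]  # lagged values and their one-step-ahead targets
    k = X.shape[1]
    precision = X.T @ X / noise_sd**2 + np.eye(k) / prior_sd**2
    cov = np.linalg.inv(precision)
    # Column i of (cov @ X.T @ Y) holds the coefficients of equation i,
    # so transpose to get the usual row-per-equation layout.
    mean = (cov @ X.T @ Y / noise_sd**2).T
    return mean, cov


# Hypothetical sparse transition matrix for three series.
A_true = np.array([[0.5, 0.1, 0.0],
                   [0.0, 0.4, 0.0],
                   [0.0, 0.2, 0.3]])
series = simulate_var1(A_true, n_obs=2000)
A_mean, row_cov = bayes_var1_posterior(series)
post_sd = np.sqrt(np.diag(row_cov))  # posterior sd of the coefficients on each lagged series

print("posterior mean:\n", np.round(A_mean, 2))
print("posterior sd per lagged series:", np.round(post_sd, 3))
```

The posterior standard deviations are the simplest form of the uncertainty quantification the abstract emphasizes; a high-dimensional setting would replace the Gaussian prior with a sparsity-inducing one and would typically require sampling or approximation rather than a closed form.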

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1821220
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2018-08-01
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2018
Total Cost: $199,999
Indirect Cost:
Name: University of Florida
Department:
Type:
DUNS #:
City: Gainesville
State: FL
Country: United States
Zip Code: 32611