Undirected graphical models are increasingly used to explore and exploit the dependency structure among the random variables underlying multivariate data from complex systems. A graphical model is a statistical model in which the random variables and the conditional dependencies between them are specified by a graph. Graphical models were originally developed for random vectors with multiple independent realizations, that is, independent and identically distributed (i.i.d.) time series. Such models have been studied extensively and have proved useful in applications such as biological regulatory networks, functional brain networks, and social networks, as well as in clustering, semi-supervised learning, and classification tasks. Graphical modeling of time-dependent data (time series) is more recent. Graphical models of dependent time series have been applied to intensive care monitoring, financial time series, air pollution data, and functional magnetic resonance imaging data, where they provide insight into the functional connectivity of different brain regions. Almost all existing work on dependent time series is limited to low-dimensional series, where the number of variables is much smaller than the sample size. For high-dimensional time series, where the number of variables exceeds or is comparable to the sample size, the series is almost always assumed to be i.i.d. in the choice of objective function and in algorithm design and analysis, for both synthetic and real data. This project aims to fill this gap by focusing on methods for graphical modeling of high-dimensional dependent time series. The project will also provide training and research experiences for graduate students.

This research investigates novel, general statistical signal processing approaches to graphical modeling of real-valued dependent multivariate time series in high-dimensional settings. The emphasis is on frequency-domain approaches that do not require detailed parametric modeling of the underlying time series to capture dependencies in the time domain. The frequency-domain formulation leads to complex-valued Gaussian graphical models for proper Gaussian random vectors, a topic that has received scant attention. The following thrusts form the core of this research: (1) Design, analysis, and optimization of penalized log-likelihood functions to fit graphical models. (2) Analysis of theoretical properties (such as consistency and sparsistency) of the obtained solutions. (3) Application to synthetic and real data to evaluate the efficacy and computational efficiency of the considered approaches.
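
To make the frequency-domain idea concrete, the following is a minimal illustrative sketch and not the project's method. It assumes a textbook smoothed-periodogram estimate of the spectral density matrix at one frequency and a generic lasso-penalized Gaussian negative log-likelihood of the form -log det(Omega) + tr(S Omega) + lam * sum of off-diagonal |Omega_ij|. The function names (`smoothed_periodogram`, `penalized_neg_loglik`, `partial_coherence`), the bin choice, and the penalty weight `lam` are all hypothetical choices made for illustration. Inverting the estimate directly only works in the low-dimensional case shown here; the high-dimensional setting is what motivates optimizing a penalized objective instead.

```python
import numpy as np

def smoothed_periodogram(x, center, half_width):
    """Estimate the spectral density matrix near DFT bin `center`.

    x is an (n_samples, p) real array. The periodogram is averaged over
    2 * half_width + 1 neighboring bins, giving a (p, p) Hermitian matrix.
    """
    n = x.shape[0]
    d = np.fft.fft(x - x.mean(axis=0), axis=0) / np.sqrt(n)  # (n, p) DFT
    bins = np.arange(center - half_width, center + half_width + 1)
    block = d[bins]                       # one complex p-vector per bin
    return block.T @ block.conj() / len(bins)

def penalized_neg_loglik(omega, s_hat, lam):
    """Generic lasso-penalized objective (an assumed form, for illustration)."""
    _, logdet = np.linalg.slogdet(omega)
    off_diag = omega - np.diag(np.diag(omega))
    fit = -logdet + np.trace(s_hat @ omega).real
    return float(fit + lam * np.abs(off_diag).sum())

def partial_coherence(omega):
    """Magnitude of the inverse spectral density, scaled to a unit diagonal."""
    scale = np.sqrt(np.real(np.diag(omega)))
    return np.abs(omega) / np.outer(scale, scale)

# Synthetic example: three series, with series 1 depending on series 0.
rng = np.random.default_rng(0)
x = rng.standard_normal((512, 3))
x[:, 1] += 0.8 * x[:, 0]

s_hat = smoothed_periodogram(x, center=50, half_width=10)
omega_hat = np.linalg.inv(s_hat)          # feasible only because p << n here
print(np.round(partial_coherence(omega_hat), 2))
print(penalized_neg_loglik(omega_hat, s_hat, lam=0.1))
```

In this sketch, a small off-diagonal entry of the partial coherence matrix suggests a missing edge between the two series at that frequency; a graph for the whole series would combine such evidence across frequencies.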

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-09-01
Budget End: 2022-08-31
Fiscal Year: 2020
Total Cost: $180,000
Name: Auburn University
City: Auburn
State: AL
Country: United States
Zip Code: 36832