This research project will develop statistical methodology for complex spatio-temporal data. The project is motivated by features common to many modern federal datasets, such as the U.S. Census Bureau's American Community Survey (ACS) and the Longitudinal Employer-Household Dynamics (LEHD) program. The public-use ACS and LEHD datasets are very large and contain information on many demographic and economic indicators for different U.S. regions and time periods. This project will develop statistical methods tailored to these types of federal data. The project will advance knowledge within the statistical sciences, and the results will be of value to the work of government agencies. Because many subject-matter disciplines, such as neuroscience, demography, and econometrics, also deal with complex data, the results will be broadly useful. Software packages will be developed and made publicly available. The investigators will educate and train both graduate and undergraduate students.
Using a hierarchical approach, this research project will develop Bayesian methodologies built on computationally efficient statistical models for dependent spatio-temporal data that are multi-distributional and multiscale in space and time. The project has three aims. In aim 1, the investigators will develop distribution theory that allows for computationally efficient analysis of high-dimensional datasets containing data from multiple distribution types, such as Gaussian, count, and Bernoulli data. In aim 2, the investigators will develop approaches to small-area estimation in the high-dimensional, multi-distributional, and multivariate spatio-temporal data setting. In aim 3, the investigators will develop approaches to mitigate aggregation error in the high-dimensional, multi-distributional, and multiscale spatio-temporal data setting. The methodologies developed in the project will use basis functions and spatial change of support to facilitate dimension reduction and to aid computation. The project will also use vector autoregressive models, the Karhunen-Loève expansion, and conjugate multivariate distribution theory to develop principled methodologies that are useful to both the scientific and federal communities.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.