There is an acute and increasing need to adapt standard statistical methods and to develop new approaches for the analysis of very large data sets. A data set is very large if it raises very difficult or insurmountable computational problems for standard data analysis using available computing systems. The accelerating growth in the size and complexity of data sets is due in part to increased computational and storage capabilities, new measurement technologies and study designs, and an increasing number of study "units." This proposal is concerned with statistical methods for the analysis of an emerging type of very large data set, where very high-dimensional outcomes and predictors, such as images or densely sampled biosignals, are recorded at multiple visits on hundreds or thousands of subjects. The proposed methods will describe the cross-sectional, longitudinal, and measurement error variability in longitudinal studies where the observed data are functions or images. Methods for scalar-on-function and scalar-on-image regression will also be developed for the case of very high-dimensional predictors. The proposed methodology is inspired by and applied to very large studies of sleep and Diffusion Tensor Imaging (DTI) brain tractography.

Public Health Relevance

The project provides statistical analysis methods for very large data sets where images or densely sampled biological signals are measured at multiple visits. Methods are applied to longitudinal sleep electroencephalogram (EEG) data and to brain tractography obtained from Diffusion Tensor Imaging (DTI) in subjects with Multiple Sclerosis (MS) and in healthy subjects.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Neurological Disorders and Stroke (NINDS)
Type: Research Project (R01)
Project #: 5R01NS060910-05
Application #: 8425037
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Gnadt, James W
Project Start: 2007-11-30
Project End: 2017-03-31
Budget Start: 2013-04-01
Budget End: 2014-03-31
Support Year: 5
Fiscal Year: 2013
Total Cost: $341,943
Indirect Cost: $130,849

Name: Johns Hopkins University
Department: Biostatistics & Other Math Sci
Type: Schools of Public Health
DUNS #: 001910777
City: Baltimore
State: MD
Country: United States
Zip Code: 21218
Airan, Raag D; Vogelstein, Joshua T; Pillai, Jay J et al. (2016) Factors affecting characterization and localization of interindividual differences in functional connectivity using MRI. Hum Brain Mapp 37:1986-97
Fisher, Aaron; Caffo, Brian; Schwartz, Brian et al. (2016) Fast, Exact Bootstrap Principal Component Analysis for p > 1 million. J Am Stat Assoc 111:846-860
Qiu, Huitong; Han, Fang; Liu, Han et al. (2016) Joint Estimation of Multiple Graphical Models from High Dimensional Time Series. J R Stat Soc Series B Stat Methodol 78:487-504
Tudorascu, Dana L; Karim, Helmet T; Maronge, Jacob M et al. (2016) Reproducibility and Bias in Healthy Brain Segmentation: Comparison of Two Popular Neuroimaging Platforms. Front Neurosci 10:503
Sweeney, Elizabeth M; Shinohara, Russell T; Dewey, Blake E et al. (2016) Relating multi-sequence longitudinal intensity profiles and clinical covariates in incident multiple sclerosis lesions. Neuroimage Clin 10:1-17
Xiao, Luo; He, Bing; Koster, Annemarie et al. (2016) Movement prediction using accelerometers in a human population. Biometrics 72:513-24
Xiao, Luo; Zipunnikov, Vadim; Ruppert, David et al. (2016) Fast Covariance Estimation for High-dimensional Functional Data. Stat Comput 26:409-421
Mejia, Amanda F; Sweeney, Elizabeth M; Dewey, Blake et al. (2016) Statistical estimation of T1 relaxation times using conventional magnetic resonance imaging. Neuroimage 133:176-88
Ho, Vu; Crainiceanu, Ciprian M; Punjabi, Naresh M et al. (2015) Calibration Model for Apnea-Hypopnea Indices: Impact of Alternative Criteria for Hypopneas. Sleep 38:1887-92
Swihart, Bruce J; Punjabi, Naresh M; Crainiceanu, Ciprian M (2015) Modeling sleep fragmentation in sleep hypnograms: An instance of fast, scalable discrete-state, discrete-time analyses. Comput Stat Data Anal 89:1-11

Showing the most recent 10 out of 76 publications