- TR&D1: DATA SCIENCE Large-scale data aggregation has generated considerable interest within the neuroscience community, both for its potential to increase the statistical significance of research results and for the reuse of data that has already been collected. New federated approaches are needed to bring together research studies that operate independently from one another and to manage the complex needs of data access, aggregation, harmonization, and analysis.
In Aim 1 we build upon our extensive experience in developing federated database systems and propose a one-time application process that simplifies data access by consolidating disparate applications across multiple research institutions. We also propose a single secure and unified pathway for downloading binary and tabular files from different research studies that would significantly reduce the effort required to retrieve these files.
In Aim 2 we introduce a new approach for harmonizing data collected by different research studies that incrementally applies transformations and provides immediate visual feedback with tabular updates and interactive summaries.
In Aim 3 we propose to integrate recent Docker technologies into our framework and establish an archive for analyses that can be transferred to and executed on any Linux computer. Input and output data will be linked to their respective analyses and used as query criteria when searching the archive.
In Aim 4 we propose a new mediator that connects all the components of our framework. This Analysis Assembler uses the unified pathway of Aim 1 to automatically download all files needed for an analysis. After retrieving the analysis itself from the archive in Aim 3, the Assembler executes the analysis on the data files. Once the analysis has completed, the Assembler records the provenance of all output data, which will be made accessible in visual queries of our federated search system.
In Aim 5 we propose to extend our quality control system to use machine learning to automatically assign "poor" and "good" quality ratings to neuroimaging MRI data. With the goal of locating hard-to-see artifacts, we also propose to implement interactive 3D visualizations to more accurately assess image quality. All five of our aims provide a framework upon which neuroscience can be conducted, shared, and replicated, comprising a foundation for reproducible science.

Public Health Relevance

Large-scale analyses of neuroscientific information require an environment where data from multiple research studies around the world can be easily discovered, aggregated, and reused. Independent and geographically separated data sets must be harmonized into large sample sizes, and scientific analyses must be executed, quality-checked, and accurately recorded so that they can be shared and reproduced by others. Our proposed framework lays a foundation for reproducible science through reliable data provenance and federated data analyses.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Type
Biotechnology Resource Grants (P41)
Project #
5P41EB015922-22
Application #
9700672
Study Section
Special Emphasis Panel (ZEB1)
Project Start
Project End
Budget Start
2019-03-01
Budget End
2020-02-29
Support Year
22
Fiscal Year
2019
Total Cost
Indirect Cost
Name
University of Southern California
Department
Type
DUNS #
072933393
City
Los Angeles
State
CA
Country
United States
Zip Code
90089
Kim, Hosung; Caldairou, Benoit; Bernasconi, Andrea et al. (2018) Multi-Template Mesiotemporal Lobe Segmentation: Effects of Surface and Volume Feature Modeling. Front Neuroinform 12:39
Duncan, Dominique; Vespa, Paul; Toga, Arthur W (2018) Detecting features of epileptogenesis in EEG after TBI using unsupervised diffusion component analysis. Discrete Continuous Dyn Syst Ser B 23:161-172
Azevedo, Christina J; Cen, Steven Y; Khadka, Sankalpa et al. (2018) Thalamic atrophy in multiple sclerosis: A magnetic resonance imaging marker of neurodegeneration throughout disease. Ann Neurol 83:223-234
Ning, Kaida; Chen, Bo; Sun, Fengzhu et al. (2018) Classifying Alzheimer's disease with brain imaging and genetic data using a neural network framework. Neurobiol Aging 68:151-158
Coletti, Amanda M; Singh, Deepinder; Kumar, Saurabh et al. (2018) Characterization of the ventricular-subventricular stem cell niche during human brain development. Development 145:
Aydogan, Dogu Baran; Jacobs, Russell; Dulawa, Stephanie et al. (2018) When tractography meets tracer injections: a systematic study of trends and variation sources of diffusion-based connectivity. Brain Struct Funct 223:2841-2858
Gahm, Jin Kyu; Shi, Yonggang; Alzheimer’s Disease Neuroimaging Initiative (2018) Riemannian metric optimization on surfaces (RMOS) for intrinsic brain mapping in the Laplace-Beltrami embedding space. Med Image Anal 46:189-201
Sepehrband, Farshid; Lynch, Kirsten M; Cabeen, Ryan P et al. (2018) Neuroanatomical morphometric characterization of sex differences in youth using statistical learning. Neuroimage 172:217-227
Tang, Yuchun; Sun, Wei; Toga, Arthur W et al. (2018) A probabilistic atlas of human brainstem pathways based on connectome imaging data. Neuroimage 169:227-239
Li, Junning; Gahm, Jin Kyu; Shi, Yonggang et al. (2018) Topological false discovery rates for brain mapping based on signal height. Neuroimage 167:478-487

Showing the most recent 10 out of 273 publications