DANDI: Distributed Archives for Neurophysiology Data Integration

Ghosh, Satrajit; Halchenko, Yaroslav

Abstract

Neuroscientific data contain information from an incredible diversity of species, are generated by a plethora of devices, and encapsulate the results of scientific thinking and decision making. Most of these data remain confined within laboratories and are not accessible to the broader scientific community. The research projects awarded under the BRAIN Initiative are generating a diverse collection of data that can transform and accelerate the pace of discovery. These datasets are large, ranging in size from GBs to PBs, and represent diverse data types and assorted metadata. To integrate, rather than further isolate, these numerous efforts, there is a need to archive, preserve, share, and process data in a way that is meaningful to neuroscience researchers. Any technological solution should reduce redundancy of storage and computation, allow computing near data, and provide easy access, protected when appropriate, to researchers or citizen scientists. Given the scale of these initiatives and the range of sample sizes and data types, any solution should also consider the broad range of individual technical expertise in the community and therefore allow easy engagement with and ingestion into an archive, while supporting education and training of scientists in using these technologies. To solve these problems, we propose "DANDI: Distributed Archives for Neurophysiology Data Integration." We leverage our team's extensive experience in informatics, standards development, software engineering, and community building, together with a robust open-source software stack, to create this archive. The archive will lower barriers for neuroscientists by using the Neurodata Without Borders (NWB; http://nwb.org) standard as a consistent data format, by providing interoperability with other standards, and by providing robust tools and convenient Web interfaces to interact with the archive.
DANDI will: 1) provide a cloud platform for versioned neurophysiology data storage for the purposes of collaboration, archiving, and preservation; 2) provide easy-to-use tools for neurophysiology data submission and access in the archive; and 3) facilitate adoption of NWB via standardized applications for data ingestion, visualization, and processing. We will work with local investigators, the broader neurophysiology community, and federal and other funders to determine which pieces of data will be stored in DANDI and for how long. The archive will also use state-of-the-art data distribution technologies to increase redundancy and fault tolerance, and to allow distributed computing across cloud and local computing resources. Consequently, the effort will significantly reduce the barrier between laboratories and the cloud, fostering collaboration and data exchange. Overall, we aim to leverage our collective expertise to create and support an NWB-based neurophysiology archive that seamlessly integrates with and enhances current researcher workflows, lowers barriers for scientific inquiry and collaboration, and preserves information for wide reuse.

Public Health Relevance

The proposal will build an easy-to-use infrastructure for scientists to share, collaborate on, and process data from neurophysiology experiments, which form the basis for understanding cellular-level mechanisms of brain function. Open data increases collaboration and benefits researchers, and can also engage students in high schools and colleges. An open archive will facilitate data publishing and improve access by scientific communities, and it has the potential to accelerate scientific discoveries about the nervous system.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of Mental Health (NIMH)
Type: Resource-Related Research Projects (R24)
Project #: 1R24MH117295-01A1
Application #: 9795271
Study Section: Special Emphasis Panel (ZMH1)
Program Officer: Zhan, Ming

Project Start: 2019-08-01
Project End: 2024-04-30
Budget Start: 2019-08-01
Budget End: 2020-04-30
Support Year: 1
Fiscal Year: 2019
Total Cost
Indirect Cost

Institution

Name: Massachusetts Institute of Technology
Department: Miscellaneous
Type: Organized Research Units
DUNS #: 001425594

City: Cambridge
State: MA
Country: United States
Zip Code: 02142

Related projects


NIH 2020 R24 MH	DANDI: Distributed Archives for Neurophysiology Data Integration	Ghosh, Satrajit Sujit; Halchenko, Yaroslav O. / Massachusetts Institute of Technology
NIH 2019 R24 MH	DANDI: Distributed Archives for Neurophysiology Data Integration	Ghosh, Satrajit Sujit; Halchenko, Yaroslav O. / Massachusetts Institute of Technology
