Data Science Resource Core

Paninski, Liam

Abstract

The major theme of this proposal is a tightly closed loop of experiment, theory, and data analysis. Sophisticated, scalable data science methods are a critical component of this loop. The Data Science Core serves two primary purposes. First, we will apply and refine sophisticated data analysis algorithms directly related to the project's scientific goals. This project will generate massive streams of data from multiple recording and simulation modalities: whole-cell electrophysiology and anatomy, large-scale calcium imaging, spatiotemporally complex optogenetic perturbations, and RNA sequencing images, in addition to massive simulations of networks of spiking neurons. A correspondingly major effort is needed to manage this data, to distill it into new scientific knowledge, and to design new experiments, theoretical analyses, and simulations to close the theory-experiment-analysis loop. This will entail the application and iterative refinement of algorithms for preprocessing the data (e.g., taking calcium imaging video and extracting demixed and denoised neural activity from each cell visible in the field of view); aligning, registering, and performing statistical inferences on data across multiple modalities (e.g., calcium imaging, optogenetic stimulation, and seqFISH); functionally characterizing the stimulus preferences and correlation structure of the activity in the observed cells; and developing closed-loop optimal experimental design methods to obtain richer, more informative data. Second, this Core will build a collaborative infrastructure allowing the multiple laboratories in this project to act as one: sharing data and analysis tools, and closely integrating theorists and experimentalists.
This infrastructure will: be completely open source; build on current efforts to standardize neuroscience data; be modular and extensible to allow for rapid iterative improvement of each stage of the algorithmic pipeline; enforce automatic archiving and recording of algorithmic metadata describing versioning and parameter choices for easy searchability and reproducibility; and allow for straightforward benchmarking. As we develop these practices and tools for data and analysis pipeline sharing, we will make them immediately available to the community. Thus we will provide a model platform for vastly improving reproducibility, keeping analysis pipelines up to date as improved methods are developed, and most importantly saving researchers from re-developing and re-implementing analysis software and data storage/sharing solutions.
We aim to make it easy for groups of labs anywhere in the world to unite and crack large-scale neural circuits. This will transform the way neuroscience is done.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of Neurological Disorders and Stroke (NINDS)
Type: Research Program--Cooperative Agreements (U19)
Project #: 5U19NS107613-03
Application #: 9967174
Study Section: Special Emphasis Panel (ZNS1)

Project Start: 2018-09-15
Project End: 2023-06-30
Budget Start: 2020-07-01
Budget End: 2021-06-30
Support Year: 3
Fiscal Year: 2020

Institution

Name: Columbia University (N.Y.)
DUNS #: 621889815

City: New York
State: NY
Country: United States
Zip Code: 10032
