Functional magnetic resonance imaging (fMRI) has become the most common tool for cognitive neuroscience, because it provides a safe, non-invasive, and powerful means to image human brain function. Based on recent rates of publication, there are currently more than 2000 fMRI studies being performed every year worldwide. The aggregation of data across multiple studies can provide the ability to answer questions that cannot be answered based on a single study. For example, using datasets from multiple domains one can start to investigate to what degree a region is selectively engaged in relation to a particular mental process, as opposed to being generally engaged across a broad range of tasks and processes. In addition, it provides the ability to integrate across specific tasks to obtain stronger empirical generalizations about mind-brain relationships, and to better understand the nature of individual variability across different measures. Recent work in neuroimaging analysis has focused on the application of methods such as machine learning techniques to understand the coding of information at the macroscopic level, and network analysis techniques to understand the interactions inherent in large-scale neural systems. The availability of a large testbed of high-quality fMRI data from published studies would also provide an important resource for the development of these and other new analytic techniques for fMRI data. However, sharing of raw fMRI data is challenging due to the large size of the datasets and the complexity of the associated metadata, and there is currently no infrastructure for the open sharing of new fMRI datasets.

This project, OpenfMRI, will provide a new infrastructure for the broad dissemination of raw data within cognitive neuroscience, addressing a critical need by providing an open data sharing resource for neuroimaging. The initial project is already online with a limited number of datasets. The full project will greatly expand this repository by providing access to a large number of fMRI datasets from several prominent neuroimaging labs, spanning a broad range of cognitive domains. Drawing on the substantial computational resources of the Texas Advanced Computing Center, the project will also run standard fMRI analyses on all data in the repository with a common analysis pipeline, thus providing directly comparable analysis results for all of the studies in the database. The OpenfMRI project will also support the development of infrastructural elements to make sharing of data by additional investigators more straightforward.

The repository of data created by the OpenfMRI project will also serve as an important resource for teaching, allowing students to replicate the analyses from published studies using the same data. By giving any researcher in the world access to large fMRI datasets, it will allow all researchers to work with the same state-of-the-art datasets, regardless of institution. By creating the infrastructure for open sharing of research data, the project will also enhance the impact of other NSF-funded neuroimaging research projects, which can use that infrastructure to make their own data available. The planned work has the potential to benefit society by improving education, health, and human productivity through an increased understanding of mental function and its relationship to brain function.

Project Report

The goal of this project was to develop a new online database that allows researchers to openly share the data from brain imaging research studies. The site, which is available online, provides data to any interested researcher for use in research or teaching. During the grant period, the database grew to contain 29 research studies with data from almost 700 individual research subjects. A major part of the work supported by the grant was to develop the infrastructure for processing these large datasets, which involved a collaboration with the Texas Advanced Computing Center to use their supercomputing resources. We have also extended the database beyond magnetic resonance imaging (MRI) data to include data from other brain imaging methods such as electroencephalography (EEG). The database now includes data from healthy individuals as well as individuals with mental health disorders. The shared data have been used by a number of other researchers to ask new questions about brain function. In addition, they are being used to help educate students on how to work with large neuroscience datasets. In this way, the OpenfMRI database will serve as a major resource for training the next generation of neuroscientists to deal with massive datasets. The project has also involved a number of undergraduate and graduate students, giving them important training in working with big neuroscience data. In addition, the project’s members have spoken widely about the need for open science. Altogether, the project has achieved its initial goals of advancing the open availability of neuroscience data and providing a platform for the future growth of data sharing.

National Science Foundation (NSF)
Division of Advanced CyberInfrastructure (ACI)
Standard Grant (Standard)
Program Officer: Robert Chadduck
Washington University
Saint Louis
United States