Functional magnetic resonance imaging (fMRI) has become the most common tool for cognitive neuroscience because it provides a safe, non-invasive, and powerful means to image human brain function. Based on recent rates of publication, more than 2000 fMRI studies are currently being performed every year worldwide. Aggregating data across multiple studies makes it possible to answer questions that cannot be answered by a single study. For example, using datasets from multiple domains, one can start to investigate the degree to which a region is selectively engaged by a particular mental process, as opposed to being generally engaged across a broad range of tasks and processes. Aggregation also makes it possible to integrate across specific tasks to obtain stronger empirical generalizations about mind-brain relationships, and to better understand the nature of individual variability across different measures. Recent work in neuroimaging analysis has focused on applying machine learning techniques to understand the coding of information at the macroscopic level, and network analysis techniques to understand the interactions inherent in large-scale neural systems. A large testbed of high-quality fMRI data from published studies would also provide an important resource for the development of these and other new analytic techniques for fMRI data. However, sharing raw fMRI data is challenging because of the large size of the datasets and the complexity of the associated metadata, and there is currently no infrastructure for the open sharing of new fMRI datasets.
This project, OpenfMRI, will provide a new infrastructure for the broad dissemination of raw data within cognitive neuroscience, addressing a critical need for an open data sharing resource for neuroimaging. The initial project is already online at www.openfmri.org with a limited number of datasets. The full project will greatly expand this repository by providing access to a large number of fMRI datasets from several prominent neuroimaging labs, spanning a broad range of cognitive domains. Using the substantial computational resources of the Texas Advanced Computing Center, the project will also perform standard fMRI analyses on all data in the repository with a common analysis pipeline, thus providing directly comparable analysis results for all of the studies in the database. The OpenfMRI project will also support the development of infrastructural elements that make it more straightforward for additional investigators to share data.
The repository of data created by the OpenfMRI project will also serve as an important resource for teaching, giving students the ability to replicate the analyses from published studies using the same data. By allowing any researcher in the world to acquire large fMRI datasets, it will enable all researchers to work with the same state-of-the-art data, regardless of institution. By creating the infrastructure for open sharing of research data, the project will also enhance the impact of other NSF-funded neuroimaging research projects, which can use that infrastructure to make their own data available. The planned work has the potential to benefit society by improving education, health, and human productivity through an increased understanding of mental function and its relationship to brain function.
The major goals of the project are to develop an open database for sharing of raw fMRI datasets, at openfmri.org, and to populate this database with datasets from collaborating labs.
Major activities in the last year: We now have 24 datasets available, which together include more than 500 subjects' worth of data. We have made all of the data available via Amazon S3, using space and bandwidth donated to the project by Amazon. We have also transitioned to using this space as our primary repository for downloads, though we continue to keep a parallel repository at the Texas Advanced Computing Center. We have made available the first combined EEG/fMRI dataset, and will soon release a combined MEG/fMRI dataset. We have made available the first clinically relevant dataset, which includes both healthy controls and individuals diagnosed with schizophrenia. We have made available a 7T fMRI dataset, which is being used for a decoding contest (www.studyforrest.org/). We have worked with members of the INCF Data Sharing Task Force to develop a data model for OpenfMRI datasets, which will be used as the basis for a future redesign of the site. We have completely redesigned the OpenfMRI.org website to make it more usable and give it a more professional appearance. We worked with the coordinators of the 2014 OHBM Hackathon to ensure that the data were available for use by hackathon participants. We are receiving increasing interest from groups outside our consortium who wish to share data using the site.
Co-investigator Wager's site has contributed datasets from several studies, as agreed. His site has also contributed to conceptualizing the data repository, how its usability can be maximized, and how it fits into the landscape of related research resources. His team also helped to co-author a publication on the OpenfMRI dataset.