Analysis of multiple sets of data, either of the same type as in multi-subject data, or of different types as in multi-modality data, is inherent to many problems in computer science and engineering. Biomedical image analysis figures prominently among these and is particularly challenging because of the rich nature of the data made available by different imaging modalities. Data-driven methods are particularly attractive for the analysis and fusion of such data, as they can achieve useful decompositions while minimizing assumptions on the model and underlying processes, and can also incorporate reliable prior information when available. One such approach recently introduced for medical image analysis and fusion is multi-dataset canonical correlation analysis (MCCA), which has proven especially useful for the analysis and fusion of rather disparate data, owing to its flexibility and its extensibility to a wide array of problem settings.
Intellectual Merit: The main aim of this proposal is twofold. First, a number of powerful methods are developed for multi-subject (multi-set) data analysis and multi-modal data fusion based on canonical dependence analysis, significantly extending the power and flexibility of MCCA. Second, the successful application of these methods is demonstrated on a unique problem that demands these properties, namely the study of brain function and functional associations during simulated driving, a naturalistic task where data-driven methods have proven very useful. The data used in the project are complementary but very different in nature: functional magnetic resonance imaging (fMRI), electroencephalography (EEG), structural MRI (sMRI), genetic array data in the form of single nucleotide polymorphisms (SNPs), and behavioral variables. The rich characteristics of the data and the problem at hand thus provide a special challenge for the methods developed and a unique testbed for the evaluation of their performance.
Broader Impacts: The broader impact of the proposed work lies in its potential to substantially advance science and information technology, as well as in its educational features. Analysis of multiple datasets of the same type, as well as fusion of data from different modalities or sensors, is a key problem in many science and engineering disciplines. The new set of methods proposed thus forms an attractive solution for many problems beyond brain function analysis. The fully integrative nature of the proposed work is also an invaluable asset in the ongoing efforts to cross-train students and researchers and to increase the participation of underrepresented groups in science and technology careers.
For further information, see the project web site at the URL: http://mlsp.umbc.edu/research_projects.html
During this project, we established the theory of a multivariate approach called independent vector analysis (IVA) and generalized much of the previous work on independent component analysis (ICA) and IVA, as well as on joint blind source separation methods such as those using canonical correlation analysis (CCA) and multiset CCA (MCCA). This is a fundamental result that provides a solid basis for the application of IVA to a multitude of problems. We focused on applications in functional magnetic resonance imaging (fMRI) and electroencephalography (EEG), and have published papers and released software implementing our approach. Given the need for the analysis of multiple datasets in many problems in different areas, we expect the results to have a major impact in our research community and beyond. The Fusion ICA Toolbox (FIT) and the Group ICA of fMRI Toolbox (GIFT) are two major efforts that are primarily developed and maintained by Dr. Calhoun’s research group (http://mialab.mrn.org/software). These toolboxes disseminate the results of our project to a wider audience, in particular to the medical imaging community. IVA algorithms are fully integrated into the new release of GIFT, and new fusion models are implemented in the current release of FIT. As of now, GIFT has been downloaded by over 8400 unique individuals and FIT by over 1700. This software is a key component of our dissemination efforts, enabling us to promote the new methods by making them easily available to the wider community.