Artificial intelligence, fueled by recent advances in machine learning, is poised to transform healthcare and biomedical research. Machine learning algorithms allow researchers to analyze complex patterns in large datasets, advancing our understanding of biological mechanisms and supporting the development of clinical tools. This project considers very-large-scale brain imaging studies in which, for example, tens of thousands of individuals contribute head MRI scans and other biomedical data such as whole-genome sequences or clinical records. Such data allow researchers to map the effects of genetic, environmental, and other factors on the structure and function of the brain, which in turn advances our knowledge of disorders like Alzheimer's disease. Today, the primary obstacle to exploiting very-large-scale brain imaging datasets is computational: existing software tools do not scale well and lack quality assurance capabilities. This project will produce a machine-learning-based computational pipeline that fills this gap. In the largest study of its kind, we will showcase the developed software tools by charting the heritability of the shapes of brain structures. In addition, the project will implement a diverse set of educational outreach initiatives, such as a customized research experience for high-school students from underrepresented minority groups.

Neuroimaging is entering a new era of unprecedented scale and complexity. Soon, we will have datasets that include more than 100,000 individuals. The fundamental challenge in analyzing and exploiting these data is computational. Today, widely used neuroimage analysis tools are computationally demanding, produce results that are sensitive to confounds, and offer limited quality control capabilities, making them infeasible at scale. This project will extend recent advances in machine learning to develop an innovative computational pipeline that addresses the drawbacks of existing methods. First, a computationally efficient and flexible brain MRI segmentation framework will be developed that integrates rich neuroanatomical prior models. The segmentation tool will be made robust to confounding effects such as subject motion through an adversarial learning strategy. Learning-based methods will be further investigated to obviate the time-consuming manual quality control of segmentations. Finally, an innovative metric learning approach will be used to study genetic influences on brain morphology in the UK Biobank. The project will also implement an integrated educational plan focused on interdisciplinary, hands-on, and lifelong learning. The researchers will devote significant effort to developing core educational material that will be adapted and used for audiences of various backgrounds and stages.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Cornell University
United States