The goal of this CAREER project is to establish a new robust data mining framework for modeling, understanding, and analyzing brain imaging genomics data, one that combines sparsity-inducing learning models with new, more efficient computational algorithms. The proposed research is innovative and crucial not only for facilitating the development of new data mining techniques, but also for addressing emerging scientific questions in brain imaging genomics and for supporting the BRAIN Initiative, which was recently unveiled by the U.S. Government and has become a national goal. Integrated with the research are educational goals: to create and broadly disseminate new curricular and K-12 outreach materials that focus both on the challenges of processing large-scale, heterogeneous-modal, high-dimensional data and on the principles behind the robust data mining techniques for alleviating them.
This project focuses on designing principled data mining algorithms for analyzing multi-modal brain imaging genomics data to yield a mechanistic understanding of the path from gene to brain function to phenotypic outcome. Of particular interest are (1) large-scale non-convex sparse learning models with linearly convergent algorithms, (2) multi-task, multi-dimensional data integration algorithms with linear computational cost, and (3) evaluation and validation in large-scale brain imaging genomics studies. The research in this project will enable new computational applications in a large number of research areas. The educational materials developed as part of this project will give K-12 students a taste of some of the many fascinating topics in machine learning and data mining while showing them the relevance of their mathematics and science classes to futures in engineering.