New large-scale DNA sequencing and array technologies provide a promising way to study the molecular mechanisms of cancer by generating enormous amounts of data that measure aberrations in the cancer genome. This genomic information can potentially guide the design of drugs against targeted molecules and improve clinical decisions in cancer treatment. One of the main obstacles to further progress is the difficulty of elucidating multiple complex molecular indicators of cancer from these enormous genomic data sets. This proposal tackles the problem with network-based machine-learning frameworks and methods that can model the underlying biological mechanisms, enabling an integrative study of cancer genomic information and relevant biomedical knowledge. As a proof of concept, the developed methods will be applied to study chemoresistance in ovarian cancer treatment.

This proposal aims to create a general computation-driven approach for guiding cancer genomics research and improving genomics-based clinical decisions in cancer treatment. The research activities described in the proposal will deliver a collection of effective and efficient computational tools that combine heterogeneous genomic data with biomedical knowledge for use in clinical practice. The study of the ovarian cancer data will help reveal the crucial pathways driving chemoresistance, and provide useful prediction tools and drug targets for ovarian cancer treatment. The proposal will also integrate the latest research developments in computational cancer genomics into new courses in several training programs, preparing students to meet the workforce needs of the growing biomedical and health informatics industry in the Upper Midwest. The education plan will also focus on recruiting students from minority and under-represented groups in computer science and information technology.

To achieve these goals, the research plan has four components: 1) to formulate graph kernels and subgraph mining algorithms that integrate various types of cancer genome aberrations in order to improve cancer outcome prediction and to discover cancer-causative genome aberration patterns; 2) to formulate semi-supervised matrix factorization methods with Laplacian constraints that predict novel associations between cancer phenotypes and genes, and thereby identify potential drug targets, using known relations in the phenotype network, the gene network, and the network of their associations; 3) to study chemoresistance in ovarian cancer treatment in order to reveal the crucial pathways driving the resistance, and to develop useful prediction tools and drug targets for ovarian cancer treatment; 4) to release the developed methods as both software packages and web tools for public use in academia. The education plan has two major components: 1) to offer a two-week course, titled Cure Cancer with Computers, in the summer academy of the BioSMART program for Minnesota high school students, and 2) to create a new course, Computational Genomics in Biomedical Informatics, to support two graduate programs that train students in biomedical/health informatics with knowledge of genomics and computer science.
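The abstract does not state the exact objective behind "matrix factorization with Laplacian constraints". As a minimal sketch, assuming the common formulation in which a penalty tr(F^T L F) is added to the factorization loss, the code below shows what that penalty measures: it is small when genes linked in a network receive similar latent factors. The function names, the toy network `W`, and the factor matrices are hypothetical and chosen only for illustration.

```python
def laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def smoothness_penalty(W, F):
    """Compute tr(F^T L F), equal to 0.5 * sum_ij W[i][j] * ||F[i] - F[j]||^2."""
    L = laplacian(W)
    n, k = len(F), len(F[0])
    total = 0.0
    for c in range(k):          # one quadratic form per latent dimension
        for i in range(n):
            for j in range(n):
                total += F[i][c] * L[i][j] * F[j][c]
    return total


# Hypothetical toy gene network: genes 0 and 1 are linked, gene 2 is isolated.
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]

# Latent factors in which the two linked genes agree (smooth) or disagree (rough).
F_smooth = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
F_rough = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

print(smoothness_penalty(W, F_smooth))  # 0.0
print(smoothness_penalty(W, F_rough))   # 2.0
```

In a full method this term would be weighted and minimized jointly with a reconstruction loss over the known phenotype-gene associations; that joint optimization is not shown here.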

National Science Foundation (NSF)
Division of Information and Intelligent Systems (IIS)
Standard Grant (Standard)
Program Officer: Sylvia J. Spengler
University of Minnesota Twin Cities
United States