EAGER: Advanced Machine Learning Techniques to Discover Disease Subtypes in Cancer

Chen, Ping; Ding, Wei; Zarringhalam, Kourosh

Abstract

A significant challenge in the analysis of large-scale genomic and molecular profiles of cancer is the identification of distinct, molecularly independent disease subtypes and their association with clinically relevant outcomes. The main barriers to identifying molecularly defined, clinically relevant subtypes have been the high dimensionality of the feature space, limited sample sizes, and the low recurrence rate of mutations across patients. The intellectual merit of this project lies in developing theory, algorithms, and implementations of robust and scalable network-based machine learning and data mining techniques for disease subtype discovery in cancer from high-dimensional gene expression and gene mutation data. The results of the project can help to identify individual-cancer, pan-cancer, and sex-specific subtypes, to better understand the nature of cancer, and to develop more efficacious therapeutic strategies. The mathematical and machine-learning models developed in this study are general biological network-induced regularization models that are applicable to a broad range of supervised, semi-supervised, and unsupervised learning problems.

The goal of this project is to design novel network-based learning models that optimally integrate prior biological knowledge on gene regulatory mechanisms into learning algorithms. New group-based and Laplacian-based regularization techniques and restricted manifold learning in matrix factorization are investigated to design reproducible models for disease subtyping. This is the first study to build an efficient toolkit for cancer subtype discovery that fully integrates discrete mutational profiles and continuous gene expression data. The project provides extensive cross-disciplinary training in Computer Science, Mathematics, and Engineering. The models developed during this study can be broadly applied as more precision genomic medicine data become available.
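As an illustration of what Laplacian-based regularization in matrix factorization can look like, the sketch below implements a graph-regularized non-negative matrix factorization with multiplicative updates. It is not the project's model, which the abstract does not specify. The samples-by-genes layout of X, the gene-network adjacency matrix A, the weight lam, and the function names are all assumptions made for this example.

```python
import numpy as np

def graph_regularized_nmf(X, A, k, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Illustrative sketch: factor X (samples x genes, non-negative) as W @ H
    while penalizing lam * trace(H L H^T), where L = D - A is the Laplacian
    of an assumed gene network with symmetric non-negative adjacency A."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    D = np.diag(A.sum(axis=1))      # degree matrix of the gene network
    W = rng.random((n, k))          # sample-by-subtype weights
    H = rng.random((k, p))          # subtype-by-gene loadings
    for _ in range(n_iter):
        # Standard NMF multiplicative update for W.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Update for H: the network terms pull the loadings of
        # connected genes toward each other.
        H *= (W.T @ X + lam * H @ A) / (W.T @ W @ H + lam * H @ D + eps)
    return W, H

def objective(X, A, W, H, lam=1.0):
    """Reconstruction error plus the Laplacian smoothness penalty."""
    L = np.diag(A.sum(axis=1)) - A
    return np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(H @ L @ H.T)
```

In this sketch a candidate subtype for each sample could be read off as the column of W with the largest weight, for example `W.argmax(axis=1)`. How the project itself assigns subtypes is not stated in the abstract.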

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1743010
Program Officer: Amarda Shehu

Budget Start: 2017-07-01
Budget End: 2021-06-30
Fiscal Year: 2017
Total Cost: $165,881

University of Massachusetts Boston, Dorchester, MA, United States
