Transfer Rule Learning for Knowledge Based Biomarker Discovery and Predictive Bio

Gopalakrishnan, Vanathi

Abstract

Predictive modeling of biomedical data arising from clinical studies for early detection, monitoring and prognosis of diseases is a crucial step in biomarker discovery. Since the data are typically measurements subject to error, and the sample size of any study is very small compared to the number of variables measured, the validity and verification of models arising from such datasets significantly impact the discovery of reliable discriminatory markers for a disease. An important opportunity to make the most of these scarce data is to combine information from multiple related data sets for more effective biomarker discovery. Because the costs of creating large data sets for every disease of interest are likely to remain prohibitive, methods for making more effective use of related biomarker discovery data sets continue to be important.

Solution: This project develops and applies Transfer Rule Learning (TRL), a novel framework for integrative biomarker discovery from related but separate data sets, such as those generated from similar biomarker profiling studies. TRL alleviates the problem of data scarcity by providing automated ways to express, verify and use prior hypotheses generated from one data set while learning new knowledge via a related data set. This is the first study of transfer learning for biomarker discovery. Unlike other transfer learning approaches, TRL takes knowledge in the form of interpretable, modular classification rules, and uses them to seed learning of a rule model on a new data set. Classification rules simplify the extraction of discriminatory markers, and have been used successfully for biomarker discovery and verification in a non-integrative fashion.

Specific Aims: This project tests the main hypothesis that TRL provides a mechanism for transfer learning of classification rules between related source and target data sets that improves performance on the target data, compared to learning without transfer. TRL will be evaluated using cross-validation performance of classification accuracy and transfer measures, on related groups of existing biomarker discovery datasets obtained from multiple experimental platforms for lung cancer detection and prognosis. A new set of independent validation data will be generated for early detection of lung cancer to test the models generated on pilot data. Insights into the impact of different modeling algorithms on transfer outcomes will be gleaned.

Significance: The TRL framework and tool are important for combined analysis and interpretation of clinical data, as they support incremental building, verification and refinement of rule models for predictive biomedicine. The application of TRL to real-world biomarker discovery datasets can yield insights into novel interactions involving known markers, and the most reliable biomarkers for early detection of disease, particularly lung cancer. This project has the potential to help create new diagnostic screening tools for lung cancer detection. It allows foundational understanding of the use of transfer learning for integrative biomarker discovery that could lead to novel technologies for combining information from data and prior knowledge.
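The idea of using rules learned on one data set to seed a rule model on a related data set can be illustrated with a small sketch. This is not the project's TRL implementation, which the abstract does not specify. It is a simplified illustration that assumes single-condition threshold rules, a precision and coverage check to verify prior rules on the target data, and a greedy covering step to learn new rules. The names Rule, rule_precision, candidate_rules and transfer_rule_learning, and the parameters min_precision and min_coverage, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]  # marker name -> measured value


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: IF marker > threshold (or <= threshold) THEN label."""
    marker: str
    threshold: float
    greater: bool
    label: str

    def covers(self, sample: Sample) -> bool:
        if self.marker not in sample:
            return False
        value = sample[self.marker]
        return value > self.threshold if self.greater else value <= self.threshold


def rule_precision(rule: Rule, samples: Sequence[Sample],
                   labels: Sequence[str]) -> Tuple[float, int]:
    """Return (fraction of covered samples with the rule's label, number covered)."""
    covered = [y for x, y in zip(samples, labels) if rule.covers(x)]
    if not covered:
        return 0.0, 0
    return sum(y == rule.label for y in covered) / len(covered), len(covered)


def candidate_rules(samples: Sequence[Sample], labels: Sequence[str]) -> List[Rule]:
    """Enumerate single-condition rules, using observed values as thresholds."""
    rules = []
    for marker in sorted({m for s in samples for m in s}):
        for threshold in sorted({s[marker] for s in samples if marker in s}):
            for greater in (True, False):
                for label in sorted(set(labels)):
                    rules.append(Rule(marker, threshold, greater, label))
    return rules


def transfer_rule_learning(source_rules: Sequence[Rule], samples: Sequence[Sample],
                           labels: Sequence[str], min_precision: float = 0.8,
                           min_coverage: int = 2) -> List[Rule]:
    def reliable(rule: Rule) -> bool:
        precision, coverage = rule_precision(rule, samples, labels)
        return precision >= min_precision and coverage >= min_coverage

    # Step 1: verify prior rules on the target data and keep those that hold up.
    model = [r for r in source_rules if reliable(r)]

    # Step 2: greedily add target-derived rules for samples the seed rules miss.
    candidates = [r for r in candidate_rules(samples, labels) if reliable(r)]
    uncovered = [x for x in samples if not any(r.covers(x) for r in model)]
    while uncovered:
        best = max(candidates,
                   key=lambda r: sum(r.covers(x) for x in uncovered),
                   default=None)
        if best is None or not any(best.covers(x) for x in uncovered):
            break  # no reliable rule explains the remaining samples
        model.append(best)
        uncovered = [x for x in uncovered if not best.covers(x)]
    return model


# Toy target data set (invented values, for illustration only).
target_samples = [
    {"m1": 7.0, "m2": 0.5}, {"m1": 6.0, "m2": 2.0},
    {"m1": 2.0, "m2": 3.0}, {"m1": 1.0, "m2": 2.5}, {"m1": 3.0, "m2": 0.2},
]
target_labels = ["cancer", "cancer", "control", "control", "control"]
prior_rules = [Rule("m1", 5.0, True, "cancer"), Rule("m2", 1.0, True, "cancer")]

model = transfer_rule_learning(prior_rules, target_samples, target_labels)
for rule in model:
    print(rule)
```

In this toy run, the first prior rule survives verification on the target data and the second is discarded for low precision. A new rule is then learned for the control samples that the surviving seed rule leaves uncovered. Comparing such a seeded model against one learned from the target data alone, under cross-validation, is the kind of evaluation the Specific Aims describe.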

Public Health Relevance

This project will develop much-needed computational methods for integrative biomarker discovery from related but separate data sets produced by predictive molecular profiling studies of disease. It will generate new experimental data for early detection of lung cancer, and has the potential to help create new diagnostic screening tools for lung cancer, a leading cause of death from cancer in the United States.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM100387-02
Application #: 8549840
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Lyster, Peter

Project Start: 2012-09-24
Project End: 2015-07-31
Budget Start: 2013-08-01
Budget End: 2014-07-31
Support Year: 2
Fiscal Year: 2013
Total Cost: $289,303
Indirect Cost: $96,303

Institution

Name: University of Pittsburgh
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 004514360

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Related projects


NIH 2018 R01 GM	Transfer Rule Learning with Functional Mapping for Integrative Modeling of Panomics Data Gopalakrishnan, Vanathi / University of Pittsburgh
NIH 2017 R01 GM	Transfer Rule Learning with Functional Mapping for Integrative Modeling of Panomics Data Gopalakrishnan, Vanathi / University of Pittsburgh	$260,079
NIH 2016 R01 GM	Transfer Rule Learning with Functional Mapping for Integrative Modeling of Panomics Data Gopalakrishnan, Vanathi / University of Pittsburgh
NIH 2014 R01 GM	Transfer Rule Learning for Knowledge Based Biomarker Discovery and Predictive Bio Gopalakrishnan, Vanathi / University of Pittsburgh
NIH 2013 R01 GM	Transfer Rule Learning for Knowledge Based Biomarker Discovery and Predictive Bio Gopalakrishnan, Vanathi / University of Pittsburgh	$289,303
NIH 2012 R01 GM	Transfer Rule Learning for Knowledge Based Biomarker Discovery and Predictive Bio Gopalakrishnan, Vanathi / University of Pittsburgh	$299,716

Publications

Balasubramanian, Jeya Balaji; Gopalakrishnan, Vanathi (2018) Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery. World J Clin Oncol 9:98-109

Lustgarten, Jonathan Lyle; Balasubramanian, Jeya Balaji; Visweswaran, Shyam et al. (2017) Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure. Data (Basel) 2:

Liu, Yuzhe; Gopalakrishnan, Vanathi (2017) An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data. Data (Basel) 2:

Pineda, Arturo López; Ogoe, Henry Ato; Balasubramanian, Jeya Balaji et al. (2016) On Predicting lung cancer subtypes using 'omic' data from tumor and tumor-adjacent histologically-normal tissue. BMC Cancer 16:184

Huang, Tianzhi; Alvarez, Angel A; Pangeni, Rajendra P et al. (2016) A regulatory circuit of miR-125b/miR-20b and Wnt signalling controls glioblastoma phenotypes through FZD6-modulated pathways. Nat Commun 7:12885

Torbati, Mahbaneh Eshaghzadeh; Mitreva, Makedonka; Gopalakrishnan, Vanathi (2016) Application of Taxonomic Modeling to Microbiota Data Mining for Detection of Helminth Infection in Global Populations. Data (Basel) 1:

Gopalakrishnan, Vanathi; Menon, Prahlad G; Madan, Shobhit (2015) cMRI-BED: A novel informatics framework for cardiac MRI biomarker extraction and discovery applied to pediatric cardiomyopathy classification. Biomed Eng Online 14 Suppl 2:S7

Pineda, Arturo Lopez; Gopalakrishnan, Vanathi (2015) Novel Application of Junction Trees to the Interpretation of Epigenetic Differences among Lung Cancer Subtypes. AMIA Jt Summits Transl Sci Proc 2015:31-5

Ogoe, Henry A; Visweswaran, Shyam; Lu, Xinghua et al. (2015) Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data. BMC Bioinformatics 16:226

Menon, Prahlad G; Morris, Lailonny; Staines, Mara et al. (2014) Novel MRI-derived quantitative biomarker for cardiac function applied to classifying ischemic cardiomyopathy within a Bayesian rule learning framework. Proc SPIE Int Soc Opt Eng 9034:

Showing the most recent 10 out of 15 publications
