Molecular profiling data from scientific studies aiming for early detection and better management of diseases such as cancer have accumulated at rates far beyond our ability to efficiently extract knowledge of value to the practice of precision medicine. A major challenge is that these data are often generated using multiple high-throughput technologies, giving rise to panomics data, such as gene expression and DNA methylation, for the same or a related classification task. This project will develop critically needed computational methods and tools for the integrative modeling of panomics data to improve disease state classification from related molecular profiling studies. This project will extend Transfer Rule Learning (TRL) methods, which were previously developed to deal with sparse data from biomarker profiling studies by automatically learning classification rules from one dataset, transferring that knowledge, and using it when learning rules from a related dataset. This project will develop, apply and evaluate a novel method for knowledge transfer that involves the use of ontological or taxonomic hierarchies along with classification rule learning. Specifically, this project will test the hypothesis that transfer learning of classification rules using functional mapping (TRL-FM), which uses ontological structure to provide domain-specific relatedness, improves integrative modeling of panomics data over conventional methods, yielding better predictive performance and identifying more robust biomarker panels for disease state classification. The TRL-FM prototype will be applied to existing de-identified panomics data from two diverse domains for the classification of (1) cancer, and (2) parasitic infections in global populations using microbiome profiling.
The TRL-FM models will be validated for precise lung cancer classification and robust biomarker discovery, using an existing set of de-identified panomics data and related nodule size information from a cohort of high-risk CT-screened patients, and comprehensively compared to state-of-the-art classifiers. This project can help create more robust screening tools for the precise classification of lung cancer, the leading cause of death from cancer in the United States. This project will also impact global health with the potential to help improve screening and management of infections caused by helminths, the most common parasites affecting more than a billion people worldwide, using data obtained from fecal microbiome profiling. This project will result in computational tools that can efficiently integrate knowledge from multiple sources when building predictive models from panomics data. The predictive models are highly interpretable, capturing patterns that underlie subpopulations in the data as classification rules with augmented information about the robustness of discriminative biomarkers. This project will create tools to benefit the rapidly growing human microbiome research community by incorporating knowledge specific to the analyses of bacterial species sequenced from ribosomal RNA. The TRL-FM tools will make integrative modeling of microbiome data more efficient, thereby enabling rapid insights into bacterial strains and species that harm or support human health.
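The general idea of transferring rule knowledge between related datasets through a functional mapping can be illustrated with a toy sketch. This is not the TRL-FM implementation described in the abstract. It assumes one-feature threshold rules, a hypothetical feature-to-term dictionary (`FUNCTIONAL_MAP`) standing in for an ontology, made-up feature names and values, and a simple score bonus as the transfer mechanism.

```python
from collections import defaultdict

# Hypothetical functional mapping: feature name -> ontology term (illustrative only).
FUNCTIONAL_MAP = {
    "geneA_expr": "cell_cycle",
    "geneA_meth": "cell_cycle",
    "geneB_expr": "immune_response",
    "geneB_meth": "immune_response",
}


def learn_stump(rows, labels, feature):
    """Return (accuracy, threshold, label_if_above) for the best one-feature rule."""
    best = (0.0, None, None)
    values = sorted({r[feature] for r in rows})
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        for above in (0, 1):
            preds = [above if r[feature] > thr else 1 - above for r in rows]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[0]:
                best = (acc, thr, above)
    return best


def learn_rules(rows, labels, prior=None, bonus=0.1):
    """Rank one-feature rules; a feature whose functional term was useful
    in the source dataset receives a score bonus (the assumed transfer step)."""
    prior = prior or {}
    scored = []
    for feature in rows[0]:
        acc, thr, above = learn_stump(rows, labels, feature)
        term = FUNCTIONAL_MAP.get(feature)
        score = acc + bonus * prior.get(term, 0.0)
        scored.append((score, feature, thr, above, acc))
    return sorted(scored, reverse=True)


def functional_prior(rules, min_acc=0.8):
    """Summarise source rules as term -> best accuracy among rules at or above min_acc."""
    prior = defaultdict(float)
    for _score, feature, _thr, _above, acc in rules:
        term = FUNCTIONAL_MAP.get(feature)
        if term and acc >= min_acc:
            prior[term] = max(prior[term], acc)
    return dict(prior)


# Source dataset (made-up expression values): geneA_expr separates the classes.
source_rows = [
    {"geneA_expr": 5.0, "geneB_expr": 1.0},
    {"geneA_expr": 6.0, "geneB_expr": 3.0},
    {"geneA_expr": 1.0, "geneB_expr": 2.0},
    {"geneA_expr": 2.0, "geneB_expr": 4.0},
]
source_labels = [1, 1, 0, 0]

# Target dataset (made-up methylation values): both features are equally noisy.
target_rows = [
    {"geneA_meth": 0.9, "geneB_meth": 0.1},
    {"geneA_meth": 0.3, "geneB_meth": 0.6},
    {"geneA_meth": 0.2, "geneB_meth": 0.8},
    {"geneA_meth": 0.4, "geneB_meth": 0.5},
]
target_labels = [1, 1, 0, 0]

prior = functional_prior(learn_rules(source_rows, source_labels))
baseline = learn_rules(target_rows, target_labels)
transferred = learn_rules(target_rows, target_labels, prior=prior)

print("Top rule without transfer:", baseline[0][1])
print("Top rule with transfer:   ", transferred[0][1])
```

In this sketch the two target features tie on accuracy, so the knowledge carried over from the source dataset through the shared functional term decides which rule is ranked first. A real system would need a principled way to weight such prior knowledge against evidence in the target data.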

Public Health Relevance

This project will develop critically needed computational methods and tools for the integrative modeling of panomics data to improve disease state classification from related molecular profiling studies. This project can help create more robust screening tools for the precise classification of lung cancer, the leading cause of death from cancer in the United States. This project will also impact global health with the potential to help improve screening and management of infections caused by helminths, the most common parasites affecting more than a billion people worldwide, using data obtained from fecal microbiome profiling.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM100387-05
Application #
9246538
Study Section
Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer
Ravichandran, Veerasamy
Project Start
2012-09-24
Project End
2019-03-31
Budget Start
2017-04-01
Budget End
2018-03-31
Support Year
5
Fiscal Year
2017
Total Cost
$260,079
Indirect Cost
$74,594
Name
University of Pittsburgh
Department
Miscellaneous
Type
Schools of Medicine
DUNS #
004514360
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213
Balasubramanian, Jeya Balaji; Gopalakrishnan, Vanathi (2018) Tunable structure priors for Bayesian rule learning for knowledge integrated biomarker discovery. World J Clin Oncol 9:98-109
Lustgarten, Jonathan Lyle; Balasubramanian, Jeya Balaji; Visweswaran, Shyam et al. (2017) Learning Parsimonious Classification Rules from Gene Expression Data Using Bayesian Networks with Local Structure. Data (Basel) 2:
Liu, Yuzhe; Gopalakrishnan, Vanathi (2017) An Overview and Evaluation of Recent Machine Learning Imputation Methods Using Cardiac Imaging Data. Data (Basel) 2:
Pineda, Arturo López; Ogoe, Henry Ato; Balasubramanian, Jeya Balaji et al. (2016) On Predicting lung cancer subtypes using 'omic' data from tumor and tumor-adjacent histologically-normal tissue. BMC Cancer 16:184
Huang, Tianzhi; Alvarez, Angel A; Pangeni, Rajendra P et al. (2016) A regulatory circuit of miR-125b/miR-20b and Wnt signalling controls glioblastoma phenotypes through FZD6-modulated pathways. Nat Commun 7:12885
Torbati, Mahbaneh Eshaghzadeh; Mitreva, Makedonka; Gopalakrishnan, Vanathi (2016) Application of Taxonomic Modeling to Microbiota Data Mining for Detection of Helminth Infection in Global Populations. Data (Basel) 1:
Gopalakrishnan, Vanathi; Menon, Prahlad G; Madan, Shobhit (2015) cMRI-BED: A novel informatics framework for cardiac MRI biomarker extraction and discovery applied to pediatric cardiomyopathy classification. Biomed Eng Online 14 Suppl 2:S7
Pineda, Arturo Lopez; Gopalakrishnan, Vanathi (2015) Novel Application of Junction Trees to the Interpretation of Epigenetic Differences among Lung Cancer Subtypes. AMIA Jt Summits Transl Sci Proc 2015:31-5
Ogoe, Henry A; Visweswaran, Shyam; Lu, Xinghua et al. (2015) Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data. BMC Bioinformatics 16:226
Menon, Prahlad G; Morris, Lailonny; Staines, Mara et al. (2014) Novel MRI-derived quantitative biomarker for cardiac function applied to classifying ischemic cardiomyopathy within a Bayesian rule learning framework. Proc SPIE Int Soc Opt Eng 9034:

Showing the most recent 10 out of 15 publications