Biomedical Computing and Informatics Strategies for Precision Medicine

Huang, Xiuzhen; Moore, Jason; Nathanson, Katherine

Abstract

The use of genomic measures for precision medicine will depend critically on our ability to identify genes whose expression impacts the initiation, progression, and severity of common diseases such as sporadic cancer. A multitude of powerful computational and statistical methods have been developed over the last 20 years to assist with this endeavor. However, the vast majority of these approaches focus on error or related measures such as sensitivity and specificity as a measure of model quality. These measures are important but do not capture other measures of model quality that may be meaningful to biomedical researchers and physicians. We propose here to develop a comprehensive approach to modeling genomics data that takes into consideration multiple objective and subjective measures of model quality simultaneously. It is our working hypothesis that multiobjective methods will yield results that are more consistent, more reproducible, and with greater clinical impact. Specifically, we will develop a novel Hierarchical Pareto Optimization (HiParOp) algorithm that is capable of integrating multiple criteria for a given computational model of gene expression and clinical outcomes (AIM 1). This approach will first be validated with simulated gene expression data that reflect the hierarchical complexity of cancer. We will then evaluate the HiParOp algorithm by applying it to several well-studied and well-characterized breast cancer data sets that have led to diagnostic tests and new drug targets (AIM 2). Here, we will include a long list of measures of model quality that include traditional objective measures such as the cohesiveness or distinctiveness of tumor clusters as well as subjective measures such as clinical relevance and druggability. Experience applying HiParOp to a well-studied cancer where significant progress has been made will be used to make further refinements to the algorithm. 
We will then apply the HiParOp approach to the genomic analysis of non-small cell lung cancer (NSCLC), where there is substantial opportunity for improved diagnosis and treatment. We will analyze several carefully conducted gene expression studies in NSCLC tissue (AIM 3). Finally, we will develop and release an R package that will allow others to easily implement the HiParOp method (AIM 4).
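
The abstract does not specify how HiParOp combines its criteria, so the following is only a minimal sketch of the general idea behind Pareto optimization: keep every candidate model that no other candidate beats on all quality measures at once. It is not the HiParOp algorithm. The model names, the three criteria (accuracy, cluster cohesiveness, a clinical relevance rating), and all scores are hypothetical values chosen for illustration, with higher assumed to be better on every criterion.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is at least as good as b on every criterion
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Scores]) -> List[str]:
    """Return the names of all models not dominated by any other model."""
    return sorted(
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    )


# Hypothetical candidates scored on
# (accuracy, cluster cohesiveness, clinical relevance rating).
candidate_models = {
    "model_a": (0.90, 0.40, 0.20),
    "model_b": (0.85, 0.60, 0.50),
    "model_c": (0.80, 0.55, 0.45),  # worse than model_b on every criterion
    "model_d": (0.70, 0.70, 0.90),
}

print(pareto_front(candidate_models))
```

In this toy example model_c drops out because model_b is better on all three criteria, while the remaining models each represent a different trade-off. A single-criterion ranking by accuracy alone would have selected only model_a.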

Public Health Relevance

The use of genomic measures for precision medicine will depend critically on our ability to identify genes whose expression impacts the initiation, progression, and severity of common diseases such as sporadic cancer. Current approaches for computational analysis focus on prediction error as a measure of model quality. We propose here to develop a comprehensive approach to modeling genomics data that takes into consideration multiple objective and subjective measures of model quality simultaneously.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM012601-04
Application #: 9999032
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane

Project Start: 2017-09-01
Project End: 2021-08-31
Budget Start: 2020-09-01
Budget End: 2021-08-31
Support Year: 4
Fiscal Year: 2020

Institution

Name: University of Pennsylvania
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 042250712

City: Philadelphia
State: PA
Country: United States
Zip Code: 19104

Related projects


NIH 2020 R01 LM	Biomedical Computing and Informatics Strategies for Precision Medicine Huang, Xiuzhen; Moore, Jason H.; Nathanson, Katherine L. / University of Pennsylvania
NIH 2019 R01 LM	Biomedical Computing and Informatics Strategies for Precision Medicine Huang, Xiuzhen; Moore, Jason H.; Nathanson, Katherine L. / University of Pennsylvania
NIH 2018 R01 LM	Biomedical Computing and Informatics Strategies for Precision Medicine Beer, David George; Huang, Xiuzhen; Moore, Jason H. / University of Pennsylvania
NIH 2017 R01 LM	Biomedical Computing and Informatics Strategies for Precision Medicine Beer, David George; Huang, Xiuzhen; Moore, Jason H. / University of Pennsylvania

Publications

Moore, Jason H; Shestov, Maksim; Schmitt, Peter et al. (2018) A heuristic method for simulating open-data of arbitrary complexity that can be used to compare and evaluate machine learning methods. Pac Symp Biocomput 23:259-267

Piette, Elizabeth R; Moore, Jason H (2018) Improving machine learning reproducibility in genetic association studies with proportional instance cross validation (PICV). BioData Min 11:6

Causey, Jason L; Ashby, Cody; Walker, Karl et al. (2018) DNAp: A Pipeline for DNA-seq Data Analysis. Sci Rep 8:6793

Causey, Jason L; Zhang, Junyu; Ma, Shiqian et al. (2018) Highly accurate model for prediction of lung nodule malignancy with CT scans. Sci Rep 8:9286

Olson, Randal S; La Cava, William; Mustahsan, Zairah et al. (2018) Data-driven advice for applying machine learning to bioinformatics problems. Pac Symp Biocomput 23:192-203
