Systems Level Causal Discovery in Heterogeneous TOPMed Data

Benos, Panagiotis; Sciurba, Frank

Abstract

The advent of new technologies for collecting and analyzing multiple heterogeneous data streams from the same individual makes possible the detailed phenotypic characterization of diseases and paves the way for the development of individualized precision therapies. A major bottleneck in this process is the lack of robust, efficient and truly integrative analytic methods for such multi-modal data. This proposal builds on the ongoing efforts of our group in the area of causal learning in biomedicine. The objective of this application is to extend, modify and tailor our causal probabilistic graphical models to data typically collected by TOPMed projects, such as -omics data (SNPs, metabolomics, RNA-seq, etc.), imaging, patients' history, and clinical data. COPDGene is one of the TOPMed projects and has generated datasets with those modalities for 10,000 patients with chronic obstructive pulmonary disease (COPD), the third leading cause of death and a major cause of disability and health care costs in the US. The prevailing view is that COPD is a syndrome, consisting of multiple diseases with different characteristics. There is currently no satisfactory method for COPD subtyping or prediction of disease progression. In this project we will apply, test and validate our approaches on COPDGene and another large independent COPD cohort. The extension and application of our methods to cross-sectional and longitudinal data will also allow us to investigate a number of important questions and aspects related to COPD. Mechanistically, we will investigate how SNPs, genes and their networks are causally linked to disease phenotypes. In pathology, we will identify conditional biomarkers, which will lead to disease sub-classification and identification of causal components in each subtype. In pathophysiology, we will identify features that are directly linked to lung function decline and outcome.
We will make all our algorithms and results available to the community through web and public cloud interfaces. The deliverables will be (1) new probabilistic approaches for integration and analysis of multi-modal cross-sectional and longitudinal data, including SNPs, blood biomarkers, CT scans and clinical data; (2) a new cloud-based server to make these approaches available to the research community; (3) results on the mechanism, pathology and pathophysiology of COPD facilitation and progression. To guarantee the success of the project we have assembled a team of experts in genomics, machine learning, cloud computing and COPD. This cross-disciplinary team project will have a positive impact beyond the above deliverables, since the generality of our approaches makes them applicable to any disease. We expect that during this U01 we will have the opportunity to collaborate with other teams in the TOPMed consortium to help them investigate the causes of their corresponding disease phenotypes. We believe that data integration in a single probabilistic framework will be at the heart of precision medicine strategies in the future, when massive high-throughput data collection becomes a routine diagnostic and prognostic procedure in all hospitals.

Public Health Relevance

Current technologies for high-throughput biomedical data collection allow the interrogation of multiple modalities from a single patient. Promising new analytical methods have started to emerge that can analyze such multi-modal data in a holistic way. Chronic obstructive pulmonary disease (COPD) is the third leading cause of death and a major cause of disability and health care costs in the US. The prevailing view is that COPD is a syndrome, consisting of multiple diseases with their own characteristics. There is currently no satisfactory method for COPD subtyping. We will apply, test and validate new probabilistic approaches on two cohorts of COPD patients. We will investigate the mechanisms of disease facilitation, identify patient cohorts with specific characteristics (disease subtypes), and investigate risk factors and causal variants for disease progression in each subtype.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Heart, Lung, and Blood Institute (NHLBI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01HL137159-02
Application #: 9473087
Study Section: Special Emphasis Panel (ZHL1)
Program Officer: Gan, Weiniu

Project Start: 2017-04-18
Project End: 2020-03-31
Budget Start: 2018-04-01
Budget End: 2019-03-31
Support Year: 2
Fiscal Year: 2018
Total Cost
Indirect Cost

Institution

Name: University of Pittsburgh
Department: Biology
Type: Schools of Medicine
DUNS #: 004514360

City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Related projects


NIH 2019 U01 HL	Systems Level Causal Discovery in Heterogeneous TOPMed Data Benos, Panagiotis V.; Sciurba, Frank / University of Pittsburgh
NIH 2018 U01 HL	Systems Level Causal Discovery in Heterogeneous TOPMed Data Benos, Panagiotis V.; Sciurba, Frank / University of Pittsburgh
NIH 2017 U01 HL	Systems Level Causal Discovery in Heterogeneous TOPMed Data Benos, Panagiotis V.; Sciurba, Frank / University of Pittsburgh	$607,934

Publications

Raghu, Vineet K; Ramsey, Joseph D; Morris, Alison et al. (2018) Comparison of strategies for scalable causal discovery of latent variable models from mixed data. Int J Data Sci Anal 6:33-45

Kitsios, Georgios D; Fitch, Adam; Manatakis, Dimitris V et al. (2018) Respiratory Microbiome Profiling for Etiologic Diagnosis of Pneumonia in Mechanically Ventilated Patients. Front Microbiol 9:1413

Manatakis, Dimitris V; Raghu, Vineet K; Benos, Panayiotis V (2018) piMGM: incorporating multi-source priors in mixed graphical models for learning disease networks. Bioinformatics 34:i848-i856

Ping, Peipei; Hermjakob, Henning; Polson, Jennifer S et al. (2018) Biomedical Informatics on the Cloud: A Treasure Hunt for Advancing Cardiovascular Medicine. Circ Res 122:1290-1301

Raghu, Vineet K; Beckwitt, Colin H; Warita, Katsuhiko et al. (2018) Biomarker identification for statin sensitivity of cancer cell lines. Biochem Biophys Res Commun 495:659-665

Andrews, Bryan; Ramsey, Joseph; Cooper, Gregory F (2018) Scoring Bayesian Networks of Mixed Variables. Int J Data Sci Anal 6:3-18
