Methods for integrated analysis of multi-level omics data

Foulkes, Andrea

Abstract

Novel analytic paradigms allowing for fully integrated interrogation of independent genomics data resources are expected to reveal substantial new knowledge regarding the mechanistic foundations of genetic associations. In this proposal we aim to develop, evaluate and apply sound statistical methods for leveraging and integrating the vast amount of publicly available transcriptome and genomics resources to improve understanding of the mechanistic relationships among genes and regulatory elements associated with complex traits. Ultimately, methods for uncovering the molecular and physiological underpinnings of complex diseases will have clinically relevant impact on the development of novel prognostic markers and therapeutic targets. The Specific Aims are to:

(1) Develop a likelihood-based framework for integrated analysis of genomic elements, expression profiles and phenotypes. An overarching challenge in this setting is that transcriptomics data, composed of genotypes and expression profiles, and GWA data, composed of genotypes and complex traits, are generally available only for independent cohorts. We propose combining these two data resources and framing the analysis as a missing data problem. The unobserved expression profiles in the GWA data are treated as missing and an expectation-maximization (EM) approach is proposed. Methods for efficient implementation and inference, as well as an alternative Bayesian MCMC approach, are also described.

(2) Extend the methods of Aim 1 to alternative data structures and types. The framework of Aim 1 will be further developed to: (a) account for complex linkage disequilibrium (LD) structures within and across genes; (b) address disparities across genotyping platforms; (c) provide for simultaneous investigation of multiple cell and tissue compartments, multiple isoforms, and multiple genes and regulatory elements; and (d) accommodate time-varying biomarker profiles and time-to-event outcomes.

(3) Apply and evaluate the performance of the methods developed in Aims 1 and 2. In addition to fully vetting the proposed methods and comparing them to alternative strategies using extensive simulation studies, we will further elucidate the mechanisms of gene and regulatory element control of complex traits using multiple publicly available reference transcriptome data resources, repeatedly measured biomarker data arising from the GENE study, and clinical outcomes from the CRIC study (see Section C).

This application launches from an extensive, decade-long and highly productive trans-disciplinary collaboration. Building on a strong research and mentoring record, the proposed research offers novel statistical research addressing pressing challenges in precision medicine.
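
The missing-data framing in Aim 1 can be illustrated with a minimal sketch. This is not the proposal's method, only a toy instance of the general idea under strong assumptions that do not come from the abstract: a single variant, a single gene, a Gaussian linear model for expression given genotype (E = a0 + a1*G + noise) and for trait given expression (Y = b0 + b1*E + noise), and no confounding. The function name `em_missing_expression` and all parameter names are hypothetical. The reference cohort contributes (G, E) pairs, the GWA cohort contributes (G, Y) pairs, and the unobserved expression values in the GWA cohort are handled by EM.

```python
import numpy as np


def em_missing_expression(g_ref, e_ref, g_gwa, y_gwa, n_iter=500):
    """Toy EM for a two-cohort Gaussian model (illustrative assumptions only).

    Reference cohort: genotype g_ref and expression e_ref are observed.
    GWA cohort: genotype g_gwa and trait y_gwa are observed; expression is missing.
    Assumed model: E = a0 + a1*G + N(0, s2e),  Y = b0 + b1*E + N(0, s2y).
    """
    n_ref, n_gwa = len(g_ref), len(g_gwa)
    n_all = n_ref + n_gwa

    # Initialise the expression model from the reference cohort alone.
    x_ref = np.column_stack([np.ones(n_ref), g_ref])
    a = np.linalg.lstsq(x_ref, e_ref, rcond=None)[0]
    s2e = np.mean((e_ref - x_ref @ a) ** 2)

    # Initialise the trait model with no expression effect.
    b0, b1, s2y = y_gwa.mean(), 0.0, y_gwa.var()

    x_all = np.column_stack([np.ones(n_all), np.concatenate([g_ref, g_gwa])])

    for _ in range(n_iter):
        # E-step: E | G, Y in the GWA cohort is Gaussian with mean mu and
        # variance v (a scalar, because the assumed variances are constant).
        prior_mean = a[0] + a[1] * g_gwa
        v = 1.0 / (1.0 / s2e + b1 ** 2 / s2y)
        mu = v * (prior_mean / s2e + b1 * (y_gwa - b0) / s2y)

        # M-step, expression model: least squares on observed expression
        # plus expected expression, with v added to the residual variance.
        e_all = np.concatenate([e_ref, mu])
        a = np.linalg.lstsq(x_all, e_all, rcond=None)[0]
        resid = e_all - x_all @ a
        s2e = (np.sum(resid ** 2) + n_gwa * v) / n_all

        # M-step, trait model: normal equations using E[E] = mu and
        # E[E^2] = mu^2 + v in place of the missing expression values.
        lhs = np.array([[n_gwa, mu.sum()],
                        [mu.sum(), np.sum(mu ** 2) + n_gwa * v]])
        rhs = np.array([y_gwa.sum(), np.sum(y_gwa * mu)])
        b0, b1 = np.linalg.solve(lhs, rhs)
        s2y = (np.sum((y_gwa - b0 - b1 * mu) ** 2) + n_gwa * b1 ** 2 * v) / n_gwa

    return {"a0": a[0], "a1": a[1], "s2e": s2e, "b0": b0, "b1": b1, "s2y": s2y}
```

In this toy model the expression effect b1 is recoverable only when the genotype actually influences expression (a1 is nonzero); the extensions listed in Aim 2, such as LD structure and multiple tissues, are outside what this sketch covers.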

Public Health Relevance

The emerging collections of big data in genomic medicine promise unprecedented opportunities to elucidate complex disease etiology and inform clinical management strategies. Using inflammatory stress as our model system, we propose to develop, evaluate and apply new analytic paradigms for integrated analysis of publicly available transcriptome data and data arising from genome-wide association studies, with the goal of improving understanding of the mechanisms of complex diseases. Ultimately, these methods will allow us to derive information from the vast quantities of genomics data for personalized, clinical decisions and thus serve as a central component of precision medicine.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM127862-02
Application #: 9646365
Study Section: Biostatistical Methods and Research Design Study Section (BMRD)
Program Officer: Brazhnik, Paul

Project Start: 2018-04-01
Project End: 2022-03-31
Budget Start: 2019-04-01
Budget End: 2020-03-31
Support Year: 2
Fiscal Year: 2019

Institution

Name: Mount Holyoke College
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 066985714

City: South Hadley
State: MA
Country: United States
Zip Code: 01075

Related projects


NIH 2020 R01 GM	Methods for integrated analysis of multi-level omics data Foulkes, Andrea S. / Massachusetts General Hospital
NIH 2019 R01 GM	Methods for integrated analysis of multi-level omics data Foulkes, Andrea S. / Mount Holyoke College
NIH 2019 R01 GM	Methods for integrated analysis of multi-level omics data Foulkes, Andrea S. / Massachusetts General Hospital
NIH 2018 R01 GM	Methods for integrated analysis of multi-level omics data Foulkes, Andrea S. / Mount Holyoke College
