Formulating biological models from genetic studies with multidimensional phenotype data requires new analytical methods that capture the complexity of genetic systems while also providing verifiable hypotheses of variant activity. This challenge has become increasingly acute with the advent of genome-scale data resources designed to determine how genetic variation affects biological processes at molecular resolution. This proposal addresses this need by leveraging two complementary aspects of genetic complexity: pleiotropy, in which one variant affects multiple phenotypes; and epistasis, in which multiple variants interact to affect one phenotype. Although widely observed in model organisms and increasingly in human data, these phenomena are rarely distilled into concise models. Our method, called Combined Analysis of Pleiotropy and Epistasis (CAPE), integrates these aspects to mathematically constrain possible genetic models and determine a genetic network that best describes the multiple phenotypes. This is achieved through multivariate linear regression followed by a formal reparametrization that translates interaction coefficients into directed edges between variants, each representing genetic suppression or enhancement. CAPE has proven successful in model systems, and we now aim to extend the approach to complex genetic systems that include greater allelic diversity, sex, and dietary differences. To this end, we will use the Diversity Outbred (DO) mouse population, the Genotype-Tissue Expression (GTEx) project, and the ENCODE and Roadmap projects to develop models of complex gene regulation. We will model regulatory interactions between genetic variants and epigenetic states to interpret genetic networks and uncover rules that govern gene expression. To facilitate collaborations and community use, we will develop open-source software tools.
These tools will include an R-based software library to perform CAPE analysis in human and model populations, and a suite of visualization tools to facilitate researcher exploration and interpretation of results. Our overall goal is to derive new methods to discover complex and novel genetic mechanisms of gene regulation and disease risk and, in the course of this work, release analytical and visualization tools for use in complex trait research. This project is divided into three specific aims.
Aim 1 is to develop methods to infer networks of genetic variants that influence high-dimensional quantitative traits and create analytic and visualization software for community use.
Aim 2 is to apply our methods to model gene expression in DO mice and human tissues.
Aim 3 is to integrate epigenetic and genetic data to model how genetic variation and chromatin modifications interactively affect gene expression.
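
The regression-and-reparametrization step named in the abstract can be illustrated with a minimal sketch for one pair of variants and two phenotypes. The abstract does not state the formulas, so everything here is an assumption for illustration. The sketch fits y = b0 + b1*x1 + b2*x2 + b12*x1*x2 for each phenotype, solves a 2x2 system for two intermediate terms (d1, d2), and converts them to edge weights m12 = d1/(1 + d2) and m21 = d2/(1 + d1). The function names, the toy data, and the sign reading (negative as suppression, positive as enhancement) are hypothetical and are not taken from the CAPE software.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_pair(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2 + b12*x1*x2 (normal equations)."""
    rows = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)  # [b0, b1, b2, b12]


def reparametrize(fit_a, fit_b):
    """Assumed reparametrization: interaction coefficients -> two directed edge weights."""
    _, a1, a2, a12 = fit_a
    _, b1, b2, b12 = fit_b
    # Express the interaction terms as a combination of the main effects.
    d1, d2 = solve([[a1, a2], [b1, b2]], [a12, b12])
    # Undefined when d1 or d2 equals -1; a real analysis would need to handle that.
    return d1 / (1.0 + d2), d2 / (1.0 + d1)


# Toy, noise-free data (hypothetical): two binary variants, two phenotypes.
x1 = [0.0, 1.0, 0.0, 1.0] * 3
x2 = [0.0, 0.0, 1.0, 1.0] * 3
pheno_a = [1.0 + 2.0 * a + 1.0 * b + 0.5 * a * b for a, b in zip(x1, x2)]
pheno_b = [0.0 + 1.0 * a + 3.0 * b - 1.0 * a * b for a, b in zip(x1, x2)]

fit_a = fit_pair(x1, x2, pheno_a)
fit_b = fit_pair(x1, x2, pheno_b)
m12, m21 = reparametrize(fit_a, fit_b)
print("edge weights:", m12, m21)
```

Because the same pair of edge weights must account for the interaction in both phenotypes at once, the two regressions constrain each other. This is the sense in which combining pleiotropy and epistasis narrows the set of possible models. A real analysis would also need significance testing and error propagation, which this sketch omits.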

Public Health Relevance

The clinical success of genomic medicine is contingent upon the development of analytical methods to dissect genetic complexity. Large-scale genotype and phenotype data need to be translated into explicit, testable hypotheses of how multiple gene variants interact to affect specific health outcomes. The proposed research addresses this need through the development, application, and validation of novel computational methods to model genetic effects in a complex mammalian system. The computational tools will be implemented in open-source software designed for a broad range of genetic applications, ranging from engineered screens in model organisms to human genome-wide association data.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM115518-04
Application #
9774178
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Krasnewich, Donna M
Project Start
2016-09-01
Project End
2021-08-31
Budget Start
2019-09-01
Budget End
2020-08-31
Support Year
4
Fiscal Year
2019
Total Cost
Indirect Cost
Name
Jackson Laboratory
Department
Type
DUNS #
042140483
City
Bar Harbor
State
ME
Country
United States
Zip Code
04609