Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation

Sabeti, Pardis

Abstract

The ENCODE project has generated comprehensive maps of cis-regulatory elements (CREs) controlling the transcription of genes within the human genome. These maps have been crucial in our efforts to understand sequence variants linked to human traits and disease, as the majority of these variants are noncoding regulatory changes rather than amino acid substitutions. However, even though we know the locations of thousands of CREs, our understanding of how they operate is derived from a relatively small set of well-described examples. Therefore, we plan to directly characterize the function of ENCODE CREs at a genome-wide scale in multiple cell types. This will transition the field of functional genomics from a simple map of regulatory elements towards a deep understanding of the fundamental rules governing regulatory logic, down to base-pair resolution. Achieving this will dramatically expand ENCODE's utility by strengthening our ability to interpret the effects of natural human variation on gene regulation. We propose to directly measure the regulatory activity of over 3% of the genome, pursuing loci highlighted as important by ENCODE and other functional data. We will first apply computational methods to identify the most biologically informative CREs, representing a diversity of regulatory logic and architecture, and will use machine learning techniques to prioritize functional variants for characterization relevant to common and rare human diseases, traits, and adaptation. Of these, we will select 200,000 CREs and 300,000 variants, representing 100 Mb of genomic sequence, and characterize them using the massively parallel reporter assay (MPRA) to understand each element's regulatory activity. Then, to complement data from the MPRA, we will characterize additional 1 Mb regions across 10 loci using CRISPR-based noncoding screens to build a comprehensive picture of these loci.
This strategy leverages the throughput and flexibility of MPRA while maintaining the connectivity of regulatory logic in the CRISPR-based screens, which perturb elements within their endogenous genomic context. This will help us judge the accuracy and completeness of ENCODE, while also providing data from both approaches to address a wide variety of research questions. These methods are difficult to apply to disease-relevant primary cells at full scale, but we will use the results of our MPRA and CRISPR screens to inform our models and better predict the fundamental rules of regulatory logic. We will then construct smaller, targeted libraries to test disease-specific variants in primary cells and use assays specific for each of three autoimmune diseases: type 1 diabetes, inflammatory bowel disease, and lupus. This approach will inform the research community on the rules governing the activity of the CREs mapped by the ENCODE project, and will simultaneously provide concrete information about the function of hundreds of thousands of sequence variants relevant for human traits, health, and disease.
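
As a rough sanity check on the scale described above, the following sketch works through the arithmetic. It assumes a human genome size of roughly 3.1 Gb, which the abstract does not state, and it treats every CRE and every variant as one equally sized tested element, which is a simplification. The function and variable names are illustrative only.

```python
# Figures taken from the abstract.
N_CRES = 200_000
N_VARIANTS = 300_000
TOTAL_SEQUENCE_BP = 100_000_000  # 100 Mb of genomic sequence

# Assumption (not from the abstract): approximate human genome size in bp.
GENOME_SIZE_BP = 3_100_000_000


def fraction_of_genome(tested_bp, genome_bp=GENOME_SIZE_BP):
    """Return the share of the genome covered by the tested sequence."""
    return tested_bp / genome_bp


def mean_element_length(total_bp, n_elements):
    """Return the average sequence length per tested element, in bp."""
    return total_bp / n_elements


n_elements = N_CRES + N_VARIANTS
genome_fraction = fraction_of_genome(TOTAL_SEQUENCE_BP)
mean_length = mean_element_length(TOTAL_SEQUENCE_BP, n_elements)

print(f"Elements tested: {n_elements:,}")
print(f"Share of genome: {genome_fraction:.1%}")
print(f"Mean sequence per element: {mean_length:.0f} bp")
```

Under these assumptions, 100 Mb comes to about 3.2% of the genome, which is consistent with the "over 3%" figure, and it implies an average of about 200 bp of sequence per tested element.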

Public Health Relevance

In our proposal, we seek to extend the efforts of the ENCODE consortium and others who have made significant strides towards mapping the regulatory landscape of the human genome. We will apply large-scale functional characterization methods to directly test over 3% of the human genome for cis-regulatory activity. In doing so, we will create a resource that will improve our ability to pinpoint regulatory elements in our genome, increase our understanding of how they function, and aid in our ability to link genetic variation to human health and disease.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project with Complex Structure Cooperative Agreement (UM1)
Project #: 1UM1HG009435-01
Application #: 9247640
Study Section: Special Emphasis Panel (ZHG1)
Program Officer: Pazin, Michael J

Project Start: 2017-09-12
Project End: 2021-06-30
Budget Start: 2017-09-12
Budget End: 2018-06-30
Support Year: 1
Fiscal Year: 2017

Institution

Name: Broad Institute, Inc.
DUNS #: 623544785

City: Cambridge
State: MA
Country: United States
Zip Code: 02142

Related projects


NIH 2020 UM1 HG	Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation Sabeti, Pardis Christine / Broad Institute, Inc.
NIH 2019 UM1 HG	Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation Sabeti, Pardis Christine / Broad Institute, Inc.
NIH 2018 UM1 HG	Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation Sabeti, Pardis Christine / Broad Institute, Inc.
NIH 2017 UM1 HG	Comprehensive functional characterization and dissection of noncoding regulatory elements and human genetic variation Sabeti, Pardis Christine / Broad Institute, Inc.
