More than 27 million Americans suffer from Type 2 diabetes (T2D). GWAS have identified 128 lead SNPs associated with T2D and/or fasting hyperglycemia, but little is known about how these variants contribute to T2D pathogenesis. A major challenge in functionally characterizing variants found in GWAS is that each lead SNP directly associated with a trait is in LD with a collection of additional variants, and thus identifying the precise variant(s) underlying the association requires extensive computational and experimental analyses. Additionally, the majority of the associated SNPs are located within non-coding regions, where inferring functional consequences of sequence variants remains challenging. Finally, when associated SNPs are identified as candidate regulatory variants, functional testing is frequently hampered by a lack of appropriate experimental models. To address these challenges, we have assembled a team of highly accomplished researchers in genomics (Frazer), epigenomics (Ren) and T2D biology (Sander). We propose to combine state-of-the-field computational methods, high-throughput molecular assays, and disease modeling in human embryonic stem cells to comprehensively annotate T2D GWAS data and test variants for their gene regulatory function.
In Aim 1 we will analyze 5,150 whole genomes for variants in T2D and fasting hyperglycemia GWAS risk-associated loci. We estimate that these analyses will identify ~1,000,000 variants with a MAF > 1% in the intervals of interest. Additionally, we will identify rare variants that are enriched in T2D patients. We estimate that intersecting these data with existing epigenomic datasets will identify ~99,000 variants in putative regulatory elements in T2D-relevant tissues.
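The intersection step in Aim 1 can be illustrated with a minimal Python sketch. It is not the project's pipeline: the `Variant` record, the function names, and the toy coordinates are hypothetical, and the sketch assumes that regulatory elements are supplied as merged (non-overlapping) half-open intervals in the same coordinate system as the variant positions. It covers only the common-variant case (MAF > 1%); the rare-variant enrichment analysis is not represented.

```python
from bisect import bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    chrom: str
    pos: int     # position, same coordinate system as the intervals
    maf: float   # minor allele frequency, as a fraction


def build_index(elements):
    """Group half-open [start, end) intervals by chromosome, sorted by start.

    Assumes intervals on a chromosome do not overlap (merge them beforehand).
    """
    index = defaultdict(list)
    for chrom, start, end in elements:
        index[chrom].append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def in_regulatory_element(variant, index):
    """Return True if the variant falls inside any indexed interval."""
    intervals = index.get(variant.chrom, [])
    # Find the last interval whose start is <= the variant position.
    i = bisect_right(intervals, (variant.pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= variant.pos < intervals[i][1]


def candidate_regulatory_variants(variants, elements, min_maf=0.01):
    """Keep variants above the MAF threshold that overlap an element."""
    index = build_index(elements)
    return [
        v for v in variants
        if v.maf > min_maf and in_regulatory_element(v, index)
    ]


# Toy data, invented purely for illustration.
toy_variants = [
    Variant("chr10", 150, 0.30),   # common, inside an element
    Variant("chr10", 500, 0.20),   # common, outside any element
    Variant("chr10", 160, 0.005),  # inside an element, but below 1% MAF
    Variant("chr3", 150, 0.10),    # chromosome with no elements
]
toy_elements = [("chr10", 100, 200), ("chr10", 900, 1000)]
hits = candidate_regulatory_variants(toy_variants, toy_elements)
```

With these toy inputs only the first variant is retained. At the scale described above, a dedicated interval tool would likely be a better fit than this in-memory lookup.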
In Aim 2 we will use three high-throughput molecular assays to characterize these 99,000 candidate regulatory variants in T2D-relevant cell types. First, we will carry out massively parallel reporter assays (MPRA) to test the potential of each SNP-harboring sequence element to act as a transcriptional enhancer, and if so, whether enhancer activity is affected by the candidate variant. Second, we will carry out a high-throughput in vitro binding assay (SELEX) to determine whether the candidate variants affect DNA binding of relevant transcription factors. Third, we will predict target genes of the candidate variants using high-throughput chromosome conformation capture (Hi-C) assays.
In Aim 3, results from Aim 1 and Aim 2 will be integrated to prioritize 20 beta cell-relevant SNPs for functional validation. Key criteria include: (1) the variant resides in an active beta cell enhancer, (2) it disrupts transcription factor binding, and (3) it targets T2D-relevant genes in Hi-C assays. We will validate these variants by (1) genetic engineering of an embryonic stem cell-derived cell model of human beta cells, testing how deletion of the cis-regulatory element or introduction of the risk variant affects target gene expression; and (2) genetic engineering of mouse models, testing whether candidate enhancer/target gene pairs control glucose metabolism in vivo. The proposed study will provide key insights into the underpinnings of regulatory variants identified through GWAS in T2D etiology.
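The three prioritization criteria listed for Aim 3 can be sketched as a simple ranking. This is a hedged illustration only: the abstract does not say how the criteria are weighted or combined, so counting the criteria each candidate meets and keeping the top 20 is an assumption, and the `Candidate` record, its field names, and the placeholder gene and SNP identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of the Aim 1 and Aim 2 results for one SNP."""
    snp_id: str
    in_active_beta_cell_enhancer: bool       # criterion 1
    disrupts_tf_binding: bool                # criterion 2
    hic_target_genes: frozenset = frozenset()  # genes linked by Hi-C


def criteria_met(candidate, relevant_genes):
    """Count how many of the three criteria the candidate satisfies."""
    targets_relevant_gene = bool(candidate.hic_target_genes & relevant_genes)
    return sum([
        candidate.in_active_beta_cell_enhancer,
        candidate.disrupts_tf_binding,
        targets_relevant_gene,               # criterion 3
    ])


def prioritize(candidates, relevant_genes, top_n=20):
    """Rank by number of criteria met (assumed equal weights); keep top_n.

    sorted() is stable, so ties keep their input order.
    """
    ranked = sorted(
        candidates,
        key=lambda c: criteria_met(c, relevant_genes),
        reverse=True,
    )
    return ranked[:top_n]


# Placeholder identifiers, invented for illustration.
relevant = frozenset({"GENE_A", "GENE_B"})
pool = [
    Candidate("snp_one_criterion", True, False, frozenset({"GENE_X"})),
    Candidate("snp_all_criteria", True, True, frozenset({"GENE_A"})),
    Candidate("snp_two_criteria", False, True, frozenset({"GENE_B"})),
]
shortlist = [c.snp_id for c in prioritize(pool, relevant, top_n=2)]
```

Here the shortlist contains the candidate meeting all three criteria followed by the one meeting two. A real scheme might instead require all three criteria or weight them by effect size.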

Public Health Relevance

Type 2 diabetes (T2D), which affects more than 27 million Americans, can be largely attributed to genetic factors. In the proposed study we will explore non-coding regulatory variants that affect the levels of protein production as a cause of T2D. Results from the proposed research will greatly improve our understanding of the genetic basis of T2D and facilitate the development of novel therapies.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Type
Research Project--Cooperative Agreements (U01)
Project #
1U01DK105541-01
Application #
8894331
Study Section
Special Emphasis Panel (ZDK1)
Program Officer
Blondel, Olivier
Project Start
2015-05-01
Project End
2020-04-30
Budget Start
2015-05-01
Budget End
2016-04-30
Support Year
1
Fiscal Year
2015
Total Cost
Indirect Cost
Name
Ludwig Institute for Cancer Research Ltd
Department
Type
DUNS #
627922248
City
La Jolla
State
CA
Country
United States
Zip Code
92093
Panopoulos, Athanasia D; D'Antonio, Matteo; Benaglio, Paola et al. (2017) iPSCORE: A Resource of 222 iPSC Lines Enabling Functional Characterization of Genetic Variation across a Variety of Cell Types. Stem Cell Reports 8:1086-1100
Diao, Yarui; Fang, Rongxin; Li, Bin et al. (2017) A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat Methods 14:629-635
Yu, Miao; Ren, Bing (2017) The Three-Dimensional Organization of Mammalian Genomes. Annu Rev Cell Dev Biol 33:265-289
Panopoulos, Athanasia D; Smith, Erin N; Arias, Angelo D et al. (2017) Aberrant DNA Methylation in Human iPSCs Associates with MYC-Binding Motifs in a Clone-Specific Manner Independent of Genetics. Cell Stem Cell 20:505-517.e6
DeBoever, Christopher; Li, He; Jakubosky, David et al. (2017) Large-Scale Profiling Reveals the Influence of Genetic Variation on Gene Expression in Human Induced Pluripotent Stem Cells. Cell Stem Cell 20:533-546.e7
Nariai, Naoki; Greenwald, William W; DeBoever, Christopher et al. (2017) Efficient Prioritization of Multiple Causal eQTL Variants via Sparse Polygenic Modeling. Genetics 207:1301-1312
D'Antonio, Matteo; Woodruff, Grace; Nathanson, Jason L et al. (2017) High-Throughput and Cost-Effective Characterization of Induced Pluripotent Stem Cells. Stem Cell Reports 8:1101-1111
Greenwald, William W; Li, He; Smith, Erin N et al. (2017) Pgltools: a genomic arithmetic tool suite for manipulation of Hi-C peak and other chromatin interaction data. BMC Bioinformatics 18:207