Rare, highly penetrant human genetic variation has the potential to accelerate our understanding of the mechanisms and pathways that contribute to type-2 diabetes (T2D), opening opportunities to translate findings into therapeutic targets and improved individual risk stratification. Motivated by this potential, large-scale sequencing studies have been undertaken to systematically catalog rare variation (frequency <<1%) across the entire genome. These efforts have revealed thousands of rare variants of unclear functional significance. The identification of causal rare variants has been hindered in three ways. First, current analytical practices focus on the coding genome for ease of interpretation, leaving the role of noncoding variation unevaluated despite its clear importance for disease risk. Second, rare variant burden is calculated at the level of individual genes, ignoring the polygenic nature of T2D, rather than across biological networks of genes or potentially functional noncoding regions, because computational methods for credible grouping and systematic collective evaluation are lacking. Finally, existing algorithms to identify pathogenic candidates are underpowered, a problem that is exacerbated by prohibitive replication costs and that impedes efforts to demonstrate compelling statistical association between rare variants and disease. Overcoming these challenges will allow us to evaluate the hypothesis that rare, particularly noncoding, variation contributes to T2D risk, which is the aim of this proposal.
We will: (1) develop algorithms to model expected levels of coding and noncoding polymorphism across human populations using features empirically learned from publicly available data sets (1000 Genomes, NHLBI Exomes), implemented in a new rare variant burden test for association; (2) develop computational informatics and systems-based approaches to uncover pathogenic T2D gene networks based on genetic data from hundreds of loci implicated in T2D risk and related traits; (3) apply our new algorithms and identified gene networks to evaluate rare variant burden for T2D in ~2850 individuals sequenced across the genome; and (4) demonstrate T2D relevance via replication using cost-effective multiplex targeted re-sequencing in >33,000 individuals. Completion of these aims will result in the development and public release of software for the analysis of noncoding variation, and in the identification of networks and rare variants contributing to T2D susceptibility.
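For readers unfamiliar with burden testing, the general idea behind aims (1) and (3) can be illustrated with a minimal sketch. This is not the method proposed here, which would additionally model expected levels of polymorphism; it is a generic collapsing test under stated assumptions: rare variants in a predefined group (a gene, a noncoding region, or a gene network) are collapsed into a single carrier/non-carrier indicator per individual, and carrier counts in cases and controls are compared with a one-sided Fisher exact test. The function names, the genotype encoding (minor-allele counts per variant), and the toy data are all hypothetical.

```python
from math import comb


def carrier_flags(genotypes, variant_idx):
    """Collapse a group of variants: True if an individual carries
    at least one minor allele at any variant in the group."""
    return [any(row[j] > 0 for j in variant_idx) for row in genotypes]


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for enrichment of carriers in cases.

    2x2 table: a = case carriers,    b = case non-carriers,
               c = control carriers, d = control non-carriers.
    Returns P(X >= a) under the hypergeometric null.
    """
    n_cases = a + b
    n_carriers = a + c
    n_total = a + b + c + d
    denom = comb(n_total, n_carriers)
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_total - n_cases, n_carriers - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


def burden_test(genotypes, is_case, variant_idx):
    """Return the 2x2 carrier table and the one-sided p-value."""
    flags = carrier_flags(genotypes, variant_idx)
    a = sum(1 for f, case in zip(flags, is_case) if f and case)
    b = sum(1 for f, case in zip(flags, is_case) if not f and case)
    c = sum(1 for f, case in zip(flags, is_case) if f and not case)
    d = sum(1 for f, case in zip(flags, is_case) if not f and not case)
    return (a, b, c, d), fisher_one_sided(a, b, c, d)


# Toy data (hypothetical): 8 individuals x 3 rare variants,
# entries are minor-allele counts; the tested group is variants 0 and 2.
genotypes = [
    [1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 2],  # cases
    [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0],  # controls
]
is_case = [True] * 4 + [False] * 4
table, p_value = burden_test(genotypes, is_case, variant_idx=[0, 2])
print(table, p_value)
```

Grouping variants across a network rather than a single gene only changes which indices are passed as `variant_idx`; the statistical comparison itself is unchanged in this simplified form.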

Public Health Relevance

Worldwide, the increasing incidence of type-2 diabetes is placing a critical burden on health care, demanding new approaches to treatment and intervention. Rare noncoding mutations with large effects on T2D predisposition hold promise for innovations in clinical practice, though identifying these mutations remains challenging. To overcome this challenge, our proposal develops new computational methodology to pinpoint relevant variation and the pathways in which it falls, together with replication efforts to demonstrate conclusive association.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Type: Research Project (R01)
Project #: 5R01DK101478-03
Application #: 9095353
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Blondel, Olivier
Project Start: 2014-09-06
Project End: 2018-06-30
Budget Start: 2016-07-01
Budget End: 2017-06-30
Support Year: 3
Fiscal Year: 2016
Total Cost:
Indirect Cost:

Name: University of Pennsylvania
Department: Pharmacology
Type: Schools of Medicine
DUNS #: 042250712
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104

Khetarpal, Sumeet A; Babb, Paul L; Zhao, Wei et al. (2018) Multiplexed Targeted Resequencing Identifies Coding and Regulatory Variation Underlying Phenotypic Extremes of High-Density Lipoprotein Cholesterol in Humans. Circ Genom Precis Med 11:e002070
Cousminer, Diana L; Mitchell, Jonathan A; Chesi, Alessandra et al. (2018) Genetically Determined Later Puberty Impacts Lowered Bone Mineral Density in Childhood and Adulthood. J Bone Miner Res 33:430-436
Johnson, Kelsey Elizabeth; Voight, Benjamin F (2018) Patterns of shared signatures of recent positive selection across human populations. Nat Ecol Evol 2:713-720
Lorenz, Kim; Voight, Benjamin F (2018) Dissecting an adiposity locus with an arsenal of genomics. Genome Biol 19:74
Siewert, Katherine M; Voight, Benjamin F (2018) Bivariate Genome-Wide Association Scan Identifies 6 Novel Loci Associated With Lipid Levels and Coronary Artery Disease. Circ Genom Precis Med 11:e002239
Scott, Robert A; Scott, Laura J; Mägi, Reedik et al. (2017) An Expanded Genome-Wide Association Study of Type 2 Diabetes in Europeans. Diabetes 66:2888-2902
Brynedal, Boel; Choi, JinMyung; Raj, Towfique et al. (2017) Large-Scale trans-eQTLs Affect Hundreds of Transcripts and Mediate Patterns of Transcriptional Co-regulation. Am J Hum Genet 100:581-591
Flannick, Jason (see original citation for additional authors) (2017) Sequence data and association statistics from 12,940 type 2 diabetes cases and controls. Sci Data 4:170179
Zhao, Wei; Rasheed, Asif; Tikkanen, Emmi et al. (2017) Identification of new susceptibility loci for type 2 diabetes and shared etiological pathways with coronary heart disease. Nat Genet 49:1450-1457
Yin, Peter; Anttila, Verneri; Siewert, Katherine M et al. (2017) Serum calcium and risk of migraine: a Mendelian randomization study. Hum Mol Genet 26:820-828

Showing the most recent 10 out of 21 publications