We propose to create new infrastructure and methods for genomic analysis and apply these to large, complex datasets for type 2 diabetes (T2D), a leading cause of morbidity and mortality that is driven by diverse genetic and environmental factors. This proposal has three primary scientific goals. (1) We will develop infrastructure and analytical tools to harmonize heterogeneous genomic datasets ascertained for the study of complex disease, as demonstrated on DNA sequencing data from over 50,000 individuals; (2) we will design statistical frameworks to identify functional mutations in T2D and analyze their biological consequences, taking advantage of existing data and resources on genetic variation, transcription, and epigenetics; and finally (3) we will democratize access to genomic data by creating user-friendly portals with automated analytical pipelines and intuitive features for data exploration. The software, methods, and web portals we build will help overcome the barriers that currently inhibit the translation of genomic data into biological knowledge and therapeutic insights for T2D.

Public Health Relevance

A major goal of biomedical research is to identify biological processes that underlie human diseases so that safe, effective therapies can be developed more rapidly and cost-effectively. New technology has made it possible to collect large-scale genomic datasets relevant to the genetic and molecular basis of disease, but interpreting this information is difficult due to challenges in accessing the underlying data, assessing its biological implications, and disseminating results so that biomedical researchers and ultimately patients can benefit. We will assemble a multi-disciplinary team to overcome these challenges and create the methods and tools necessary to translate genomic big data into biological understanding, applying these methods and tools specifically to improving our understanding of type 2 diabetes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Type
Specialized Center--Cooperative Agreements (U54)
Project #
5U54DK105566-04
Application #
9351497
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Pawlyk, Aaron C
Project Start
2014-09-30
Project End
2019-08-31
Budget Start
2017-09-01
Budget End
2019-08-31
Support Year
4
Fiscal Year
2017
Total Cost
Indirect Cost
Name
Broad Institute, Inc.
Department
Type
DUNS #
623544785
City
Cambridge
State
MA
Country
United States
Zip Code
02142
Ganna, Andrea; Satterstrom, F Kyle; Zekavat, Seyedeh M et al. (2018) Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am J Hum Genet 102:1204-1211
Li, Heng; Bloom, Jonathan M; Farjoun, Yossi et al. (2018) A synthetic-diploid benchmark for accurate variant-calling evaluation. Nat Methods 15:595-597
Tukiainen, Taru; Villani, Alexandra-Chloé; Yen, Angela et al. (2017) Landscape of X chromosome inactivation across human tissues. Nature 550:244-248
Carlston, Colleen M; O'Donnell-Luria, Anne H; Underhill, Hunter R et al. (2017) Pathogenic ASXL1 somatic variants in reference databases complicate germline variant interpretation for Bohring-Opitz Syndrome. Hum Mutat 38:517-523
Paludan-Müller, C; Ahlberg, G; Ghouse, J et al. (2017) Integration of 60,000 exomes and ACMG guidelines question the role of Catecholaminergic Polymorphic Ventricular Tachycardia-associated variants. Clin Genet 91:63-72
Kosmicki, Jack A; Samocha, Kaitlin E; Howrigan, Daniel P et al. (2017) Refining the role of de novo protein-truncating variants in neurodevelopmental disorders by using population reference samples. Nat Genet 49:504-510
Zhang, Xiaolei; Minikel, Eric V; O'Donnell-Luria, Anne H et al. (2017) ClinVar data parsing. Wellcome Open Res 2:33
Whiffin, Nicola; Minikel, Eric; Walsh, Roddy et al. (2017) Using high-resolution variant frequencies to empower clinical genome interpretation. Genet Med 19:1151-1158
Karczewski, Konrad J; Weisburd, Ben; Thomas, Brett et al. (2017) The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res 45:D840-D845
Minikel, Eric Vallabh; MacArthur, Daniel G (2016) Publicly Available Data Provide Evidence against NR1H3 R415Q Causing Multiple Sclerosis. Neuron 92:336-338

Showing the most recent 10 out of 16 publications