In this proposal, we address the enormous challenges that common complex diseases pose for genomic analysis and the enormous opportunities that surmounting them offers for advancing healthcare. The common genetic disorders proposed for study here are believed to have extreme locus heterogeneity, requiring the analysis of large numbers of samples to comprehensively identify the genomic variants underlying them. We propose that a combination of deep population studies and joint analysis of SNPs, indels, and structural variants in both coding and noncoding regions will provide the next level of understanding of common genetic disorders. Whole genome sequencing (WGS) will be critical to this next-generation approach to the genomics of complex disease. WGS will need to be accompanied by the technical ability to generate and handle very large datasets, a particular focus and strength of the New York Genome Center (NYGC). WGS will also need to be accompanied by new statistical tools and algorithms, which will be developed by the strong core group committed to this proposal. An overarching goal of this proposal, one that capitalizes on the power of WGS, is to identify disease-associated variants at the individual nucleotide level. In many cases pathogenic mutations fall in noncoding regions of the genome, which can only be fruitfully explored with WGS. A major effort will be put into building new computational strategies to functionally annotate noncoding transcribed sequences, and into building new datasets to enable such strategies, opening new frontiers of understanding of disease-related regulatory variants. We will explore a wide spectrum of human variation using the WGS platform, including rare variants of modest to large effect, de novo variants of large effect, and common variants of small effect. We will combine available RNA and epigenomic datasets to predict modes of action of risk and identify protective alleles.
These results, combined with the integration of environmental and clinical data, will enhance our understanding of genetic risk for common disease and lay the groundwork for the use of personal genomics in disease prevention and treatment, including the delineation of pathways for drug development. Many of the population cohorts proposed for study are from New York, which harbors the most diverse population in the world. Analyzing diverse populations is a critical component of comprehensive common disease analysis, as effect sizes of individual alleles are believed to vary in different populations due to gene-gene interactions. Using the genetic admixture present in different populations from New York and throughout the United States, we will conduct the first systematic study of these interaction effects in many phenotypes.
These aims will be accomplished through widespread collaborations with genomicists, physicians, and patients, organized through a focused team at NYGC. They will be enriched by collaboration with, and support from, independent foundations and industry.

Public Health Relevance

Common and complex diseases present enormous challenges for genomic analysis and enormous opportunities for advancing medical research. We will address these challenges by applying whole genome sequencing to large numbers of ethnically diverse cohorts, applying new methods to identify noncoding variants, and using population genetic approaches to identify variants and assess the comprehensiveness of our analysis. We will foster a collaborative environment that generates synergy between genomicists, physicians, and patients and provides technology and data access, with the goal of making genomics clinically actionable.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project with Complex Structure Cooperative Agreement (UM1)
Project #: 3UM1HG008901-04S1
Application #: 9930374
Study Section: (none listed)
Program Officer: Felsenfeld, Adam
Project Start: 2016-01-14
Project End: 2020-11-30
Budget Start: 2019-08-27
Budget End: 2020-11-30
Support Year: 4
Fiscal Year: 2019
Total Cost: (none listed)
Indirect Cost: (none listed)

Name: New York Genome Center
Department: (none listed)
Type: (none listed)
DUNS #: 078473711
City: New York
State: NY
Country: United States
Zip Code: 10013

Castel, Stephane E; Cervera, Alejandra; Mohammadi, Pejman et al. (2018) Modified penetrance of coding variants by cis-regulatory variation contributes to disease risk. Nat Genet 50:1327-1334
Jereb, Saša; Hwang, Hun-Way; Van Otterloo, Eric et al. (2018) Differential 3' Processing of Specific Transcripts Expands Regulatory and Protein Diversity Across Neuronal Cell Types. Elife 7:
Regier, Allison A; Farjoun, Yossi; Larson, David E et al. (2018) Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat Commun 9:4038
Yuan, Yuan; Xie, Shirley; Darnell, Jennifer C et al. (2018) Cell type-specific CLIP reveals that NOVA regulates cytoskeleton interactions in motoneurons. Genome Biol 19:117
Mak, Angel C Y; White, Marquitta J; Eckalbar, Walter L et al. (2018) Whole-Genome Sequencing of Pharmacogenetic Drug Response in Racially Diverse Children with Asthma. Am J Respir Crit Care Med 197:1552-1564
Stoeckius, Marlon; Hafemeister, Christoph; Stephenson, William et al. (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14:865-868
Hwang, Hun-Way; Saito, Yuhki; Park, Christopher Y et al. (2017) cTag-PAPERCLIP Reveals Alternative Polyadenylation Promotes Cell-Type Specific Protein Diversity and Shifts Araf Isoforms with Microglia Activation. Neuron 95:1334-1349.e5
Willems, Thomas; Zielinski, Dina; Yuan, Jie et al. (2017) Genome-wide profiling of heritable and de novo STR variations. Nat Methods 14:590-592
Huang, Yi-Fei; Gulko, Brad; Siepel, Adam (2017) Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. Nat Genet 49:618-624
Turner, Tychele N; Coe, Bradley P; Dickel, Diane E et al. (2017) Genomic Patterns of De Novo Mutation in Simplex Autism. Cell 171:710-722.e12
