In this proposal, we address the enormous challenges that common complex diseases pose for genomic analysis and the opportunities that surmounting them offers for advancing healthcare. The common genetic disorders proposed for study here are believed to have extreme locus heterogeneity, requiring the analysis of large numbers of samples to comprehensively identify the genomic variants underlying them. We propose that a combination of deep population studies and joint analysis of SNPs, indels, and structural variants in both coding and noncoding regions will provide the next level of understanding of common genetic disorders. Whole genome sequencing (WGS) will be critical to this next-generation approach to the genomics of complex disease. WGS will need to be accompanied by the technical ability to generate and handle very large data sets, a particular focus and strength of NYGC. WGS will also need to be accompanied by new statistical tools and algorithms, which will be developed by the strong core group committed to this proposal.
An overarching goal of this proposal, one that capitalizes on the power of WGS, is to identify disease-associated variants at the individual nucleotide level. In many cases pathogenic mutations fall in noncoding regions of the genome, which can only be fruitfully explored with WGS. A major effort will be put into building new computational strategies to functionally annotate noncoding transcribed sequences, and into building new datasets to enable such strategies, opening new frontiers of understanding of disease-related regulatory variants. We will explore a wide spectrum of human variation using the WGS platform, including rare variants of modest to large effect, de novo variants of large effect, and common variants of small effect. We will combine available RNA and epigenomic datasets to predict modes of action of risk alleles and identify protective alleles.
These results, combined with the integration of environmental and clinical data, will enhance our understanding of genetic risk for common disease and lay the groundwork for utilization of personal genomics in disease prevention and treatment, including the delineation of pathways for drug development. Many of the population cohorts proposed for study are from New York, which harbors the most diverse population in the world. Analyzing diverse populations is a critical component of comprehensive common disease analysis, as effect sizes of individual alleles are believed to vary in different populations due to gene-gene interactions. Using the genetic admixture present in different populations from NY and throughout the United States, we will conduct the first systematic study of these interaction effects in many phenotypes.
These aims will be accomplished through widespread collaborations with genomicists, physicians, and patients, organized through a focused team at NYGC. They will be enriched by collaboration with, and support from, independent foundations.

Public Health Relevance

The diseases that NYGC proposes to study, including autism and Alzheimer's, all have a large public health burden, and often one that differs based on individuals' ethnicity. By studying large, ethnically diverse cohorts, using family-based cohorts when possible, NYGC will uncover disease-associated alleles that can be used for prevention, screening and treatment. Further, the data sets created will serve as a resource to the community and can be mined for other disease associations and ethnicity-specific allele frequencies.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project with Complex Structure Cooperative Agreement (UM1)
Project #: 5UM1HG008901-02
Application #: 9205530
Study Section: Special Emphasis Panel (ZHG1)
Program Officer: Felsenfeld, Adam
Project Start: 2016-01-14
Project End: 2019-11-30
Budget Start: 2016-12-01
Budget End: 2017-11-30
Support Year: 2
Fiscal Year: 2017

Name: New York Genome Center
DUNS #: 078473711
City: New York
State: NY
Country: United States
Zip Code: 10013

Castel, Stephane E; Cervera, Alejandra; Mohammadi, Pejman et al. (2018) Modified penetrance of coding variants by cis-regulatory variation contributes to disease risk. Nat Genet 50:1327-1334
Jereb, Saša; Hwang, Hun-Way; Van Otterloo, Eric et al. (2018) Differential 3' Processing of Specific Transcripts Expands Regulatory and Protein Diversity Across Neuronal Cell Types. Elife 7:
Regier, Allison A; Farjoun, Yossi; Larson, David E et al. (2018) Functional equivalence of genome sequencing analysis pipelines enables harmonized variant calling across human genetics projects. Nat Commun 9:4038
Yuan, Yuan; Xie, Shirley; Darnell, Jennifer C et al. (2018) Cell type-specific CLIP reveals that NOVA regulates cytoskeleton interactions in motoneurons. Genome Biol 19:117
Mak, Angel C Y; White, Marquitta J; Eckalbar, Walter L et al. (2018) Whole-Genome Sequencing of Pharmacogenetic Drug Response in Racially Diverse Children with Asthma. Am J Respir Crit Care Med 197:1552-1564
Willems, Thomas; Zielinski, Dina; Yuan, Jie et al. (2017) Genome-wide profiling of heritable and de novo STR variations. Nat Methods 14:590-592
Huang, Yi-Fei; Gulko, Brad; Siepel, Adam (2017) Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. Nat Genet 49:618-624
Turner, Tychele N; Coe, Bradley P; Dickel, Diane E et al. (2017) Genomic Patterns of De Novo Mutation in Simplex Autism. Cell 171:710-722.e12
Kim-Hellmuth, Sarah; Bechheim, Matthias; Pütz, Benno et al. (2017) Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat Commun 8:266
Mohammadi, Pejman; Castel, Stephane E; Brown, Andrew A et al. (2017) Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change. Genome Res 27:1872-1884
