In this proposal, we address the enormous challenges that common complex diseases pose for genomic analysis and the opportunities that surmounting them offers for advancing healthcare. The common genetic disorders proposed for study here are believed to have extreme locus heterogeneity, requiring the analysis of large numbers of samples to comprehensively identify the genomic variants underlying them. We propose that a combination of deep population studies and joint analysis of SNPs, indels, and structural variants in both coding and noncoding regions will provide the next level of understanding of common genetic disorders. Whole genome sequencing (WGS) will be critical to this next-generation approach to the genomics of complex disease. WGS will need to be accompanied by the technical ability to generate and handle very large data sets, a particular focus and strength of NYGC. WGS will also need to be accompanied by new statistical tools and algorithms, which will be developed by the strong core group committed to this proposal. An overarching goal of this proposal, one that capitalizes on the power of WGS, is to identify disease-associated variants at the individual nucleotide level. In many cases pathogenic mutations fall in noncoding regions of the genome, which can only be fruitfully explored with WGS. A major effort will be put into building new computational strategies to functionally annotate noncoding transcribed sequences, and into building new datasets to enable such strategies, opening new frontiers of understanding of disease-related regulatory variants. We will explore a wide spectrum of human variation using the WGS platform, including rare variants of modest to large effect, de novo variants of large effect, and common variants of small effect. We will combine available RNA and epigenomic datasets to predict the modes of action of risk alleles and to identify protective alleles.
These results, combined with the integration of environmental and clinical data, will enhance our understanding of genetic risk for common disease and lay the groundwork for the use of personal genomics in disease prevention and treatment, including the delineation of pathways for drug development. Many of the population cohorts proposed for study are from New York, which harbors the most diverse population in the world. Analyzing diverse populations is a critical component of comprehensive common disease analysis, as the effect sizes of individual alleles are believed to vary among populations due to gene-gene interactions. Using the genetic admixture present in different populations from New York and throughout the United States, we will conduct the first systematic study of these interaction effects across many phenotypes.
These aims will be accomplished through widespread collaborations with genomicists, physicians, and patients, organized through a focused team at NYGC. They will be enriched by collaboration with, and support from, independent foundations.
The diseases that NYGC proposes to study, including autism and Alzheimer's disease, carry a large public health burden, and often one that differs with an individual's ethnicity. By studying large, ethnically diverse cohorts, using family-based cohorts when possible, NYGC will uncover disease-associated alleles that can be used for prevention, screening, and treatment. Further, the data sets created will serve as a resource to the community and can be mined for other disease associations and ethnicity-specific allele frequencies.