In this proposal, we address the enormous challenges that common complex diseases pose for genomic analysis and the enormous opportunities that surmounting them offers for advancing healthcare. The common genetic disorders proposed for study here are believed to have extreme locus heterogeneity, requiring the analysis of large numbers of samples to comprehensively identify the genomic variants underlying them. We propose that a combination of deep population studies and joint analysis of SNPs, indels, and structural variants in both coding and noncoding regions will provide the next level of understanding of common genetic disorders. Whole genome sequencing (WGS) will be critical to this next-generation approach to the genomics of complex disease. WGS will need to be accompanied by the technical ability to generate and handle very large data sets, a particular focus and strength of NYGC. WGS will also need to be accompanied by new statistical tools and algorithms, which will be developed by the strong core group committed to this proposal. An overarching goal of this proposal, one that capitalizes on the power of WGS, is to identify disease-associated variants at the individual nucleotide level. In many cases pathogenic mutations fall in noncoding regions of the genome, which can only be fruitfully explored with WGS. A major effort will be put into building new computational strategies to functionally annotate noncoding transcribed sequences, and into building new datasets to enable such strategies, opening new frontiers of understanding of disease-related regulatory variants. We will explore a wide spectrum of human variation using the WGS platform, including rare variants of modest to large effect, de novo variants of large effect, and common variants of small effect. We will combine available RNA and epigenomic datasets to predict the modes of action of risk alleles and to identify protective alleles.
These results, combined with the integration of environmental and clinical data, will enhance our understanding of genetic risk for common disease and lay the groundwork for the use of personal genomics in disease prevention and treatment, including the delineation of pathways for drug development. Many of the population cohorts proposed for study are from New York, which harbors the most diverse population in the world. Analyzing diverse populations is a critical component of comprehensive common disease analysis, as the effect sizes of individual alleles are believed to vary among populations due to gene-gene interactions. Using the genetic admixture present in different populations from New York and throughout the United States, we will conduct the first systematic study of these interaction effects across many phenotypes.
These aims will be accomplished through widespread collaborations with genomicists, physicians, and patients, organized through a focused team at NYGC. They will be enriched by collaboration with, and support from, independent foundations and industry.
Common and complex diseases present enormous challenges for genomic analysis and enormous opportunities for advancing medical research. We will address these challenges by performing whole genome sequencing of large numbers of ethnically diverse cohorts, applying new methods to identify noncoding variants, and using population genetic approaches to identify variants and assess the comprehensiveness of our analysis. We will foster a collaborative environment that generates synergy among genomicists, physicians, and patients and provides technology and data access, with the goal of making genomics clinically actionable.