Genome-wide association studies focusing on common variants have explained a fraction of the heritable risk for many complex traits, but for many psychiatric diseases, the majority of heritable risk remains unknown. It is widely believed that rare variants also contribute to disease risk, and we and others have published examples of rare variants that contribute to psychiatric disease. Improvements in technology have now made it possible to generate large, comprehensive data sets on rare variants, using exome sequencing as well as the exome chip that we designed. We propose to assess the overall contribution of rare variants to disease heritability, develop statistical tests, robust to population stratification, that localize these signals, and build a map of mutation rates across the human genome for application to the analysis of de novo mutations and case-only association tests. We will guide our research using >40,000 samples from psychiatric disease data sets.
In Specific Aim 1 we will quantify components of heritability attributable to rare variants. Initial exome sequencing studies in complex traits have had limited success in identifying new disease genes. This leaves the field of genetics at a crossroads. Should even greater resources be invested in sequencing studies with very large sample sizes, or should the focus shift to other approaches? We will explore the idea that even if current sample sizes are not large enough to identify new genes, they are large enough to quantify the components of heritability explained by rare variants. We will develop new methods and apply them to several psychiatric disease data sets. This work will quantify the potential of future sequencing studies in larger sample sizes to identify new disease genes.
In Specific Aim 2 we will extend rare variant tests to account for population stratification. We and others have developed statistical tests for multiple rare variants, including both burden and over-dispersion tests. These tests can succeed in detecting genes containing multiple associated rare variants, but only if sample sizes are very large. Unfortunately, large sample sizes increase the dangers of false-positive associations due to population stratification. Recent work showing differing patterns of population structure in common versus rare variants highlights the dangers of applying standard approaches using information from common variants. We will develop new methods to effectively correct for population stratification in rare variant tests and perform extensive simulations to demonstrate the efficacy of each approach.
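To make the burden-test idea concrete, the following is a minimal sketch, not the method proposed in this aim. It collapses the rare variants in one gene into a per-individual burden score and assesses the case-control difference by permuting case labels only within ancestry strata, which is one simple way to keep a stratum-level confounder from producing a signal. The function names, the mean-difference statistic, and the assumption that each individual already has a discrete stratum label are all illustrative choices, not taken from the proposal.

```python
import random

def burden_scores(genotypes):
    """Collapse rare-allele counts (0/1/2 per variant) into one score per individual."""
    return [sum(g) for g in genotypes]

def burden_statistic(scores, is_case):
    """Difference in mean burden between cases and controls."""
    case = [s for s, c in zip(scores, is_case) if c]
    ctrl = [s for s, c in zip(scores, is_case) if not c]
    return sum(case) / len(case) - sum(ctrl) / len(ctrl)

def stratified_permutation_pvalue(scores, is_case, strata, n_perm=1000, seed=0):
    """One-sided permutation p-value; labels are shuffled only within each stratum.

    `strata` is a hypothetical per-individual ancestry label (an assumption of
    this sketch); how such labels would be derived is not addressed here.
    """
    rng = random.Random(seed)
    observed = burden_statistic(scores, is_case)

    # Group individual indices by stratum.
    groups = {}
    for i, s in enumerate(strata):
        groups.setdefault(s, []).append(i)

    hits = 0
    for _ in range(n_perm):
        permuted = list(is_case)
        for idx in groups.values():
            labels = [is_case[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                permuted[i] = lab
        if burden_statistic(scores, permuted) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

When case status is fully determined by stratum, no within-stratum shuffle changes the labels, so every permuted statistic equals the observed one and the p-value is 1: the apparent excess burden in cases is attributed entirely to stratification.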
In Specific Aim 3 we will build a map of mutation rates across the human genome. We and others have recently shown that de novo mutation screens have the potential to identify genes of interest for neuropsychiatric phenotypes. We will construct a mutation rate map informed by comparative genomics and functional genomics data and will develop new statistical approaches for the analysis of human de novo mutations and their involvement in psychiatric diseases.
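As an illustration of how a mutation rate map can feed into de novo analysis, the sketch below compares the number of de novo mutations observed in a gene with the number expected under a Poisson model. This is a commonly used simplification, not necessarily the approach this aim will develop. The per-gene rate, the trio count, and the function names are hypothetical, and the factor of 2 assumes an autosomal gene with two transmitted chromosomes per child.

```python
import math

def expected_de_novo(mu_per_gene, n_trios):
    """Expected de novo count in a gene: 2 chromosomes per child times the per-chromosome rate."""
    return 2.0 * n_trios * mu_per_gene

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus the CDF at k - 1.

    Adequate for a sketch; very small p-values lose precision in the subtraction.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)

# Hypothetical inputs: a per-chromosome, per-generation rate for one gene
# (as would be read from a mutation rate map) and a cohort of 1,000 trios.
lam = expected_de_novo(mu_per_gene=2e-5, n_trios=1000)
p_value = poisson_upper_tail(k=3, lam=lam)  # 3 observed de novo mutations
```

Because the expected count is derived from the mutation rate alone, a test of this form needs no control samples, which is why an accurate rate map matters for case-only designs.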

Public Health Relevance

Human genetic studies have the potential to reveal biological mechanisms underlying psychiatric diseases and suggest targets for therapeutic intervention. Rapid advances in DNA sequencing technology are propelling studies of newly arising mutations and rare DNA variants and their potential role in mental illness. We will develop new statistical methods to assist these studies and relate DNA sequence data to psychiatric phenotypes.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section: Behavioral Genetics and Epidemiology Study Section (BGES)
Program Officer: Senthil, Geetha
Brigham and Women's Hospital
United States
Reshef, Yakir A; Finucane, Hilary K; Kelley, David R et al. (2018) Detecting genome-wide directional effects of transcription factor binding on polygenic disease risk. Nat Genet 50:1483-1493
Hormozdiari, Farhad; Gazal, Steven; van de Geijn, Bryce et al. (2018) Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat Genet 50:1041-1047
Ganna, Andrea; Satterstrom, F Kyle; Zekavat, Seyedeh M et al. (2018) Quantifying the Impact of Rare and Ultra-rare Coding Variation across the Phenotypic Spectrum. Am J Hum Genet 102:1204-1211
Kaplanis, Joanna; Gordon, Assaf; Shor, Tal et al. (2018) Quantitative analysis of population-scale family trees with millions of relatives. Science 360:171-175
Gazal, Steven; Loh, Po-Ru; Finucane, Hilary K et al. (2018) Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations. Nat Genet 50:1600-1607
Turley, Patrick; Walters, Raymond K; Maghzian, Omeed et al. (2018) Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet 50:229-237
Palamara, Pier Francesco; Terhorst, Jonathan; Song, Yun S et al. (2018) High-throughput inference of pairwise coalescence times identifies signals of selection and enriched disease heritability. Nat Genet 50:1311-1317
Loh, Po-Ru; Genovese, Giulio; Handsaker, Robert E et al. (2018) Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559:350-355
Milne, Roger L (see original citation for additional authors) (2017) Identification of ten variants associated with risk of estrogen-receptor-negative breast cancer. Nat Genet 49:1767-1778
Zheng, Jie; Erzurumluoglu, A Mesut; Elsworth, Benjamin L et al. (2017) LD Hub: a centralized database and web interface to perform LD score regression that maximizes the potential of summary level GWAS data for SNP heritability and genetic correlation analysis. Bioinformatics 33:272-279

Showing the most recent 10 out of 47 publications