Rare variants contribute to schizophrenia risk, including gene-disrupting large insertions and deletions and single nucleotide variants; however, these mutations and the causal genes they disrupt have proven difficult to pinpoint and characterize, and their effect sizes have been estimated only across broad classes of variants and gene sets. New data from next-generation DNA sequencing applications enable rare variant detection and association. We propose to develop statistical models to infer causal genes and variants, and their effects on schizophrenia risk, from large-scale exome sequencing data. Schizophrenia is a common, complex psychiatric disorder that affects as much as 1% of the population, over two million people in the United States. Schizophrenia comes with debilitating comorbidities ranging from unemployment to early death, and there is no cure, only palliative treatment with moderate success rates. Development of genomic models for clinical risk prediction in common, complex diseases is a burgeoning area of research, with long-term potential for
. Here we propose to build an innovative framework for genetic architecture inference that can integrate all available genomic and clinical data, for both discovery genetics and risk prediction. In Specific Aim 1, we will develop and analyze a model for rare functional DNA sequence variants in a hierarchical Bayesian framework that will allow their formal integration as probabilistic determinants of causality for disease risk. The model will be developed in concert with its application to schizophrenia discovery genetics, by analyzing whole exome sequence data from case/control and simplex trio samples and by integrating functional genomic data on brain- and neuron-specific protein complexes and interactions, expression quantitative traits, and chromatin regulation. These analyses will deliver posterior probabilities of causality for individual genes and variants, and posterior distributions of effect sizes of rare risk variants across genes, annotations, and exposures. In Specific Aim 2, we will extend the model to incorporate individual-level data, including common GWAS variants and epidemiological variables, in the same samples. These analyses will refine rare variant effect size estimates and promise to provide insights into rare variant mechanisms of action across implicated genes. In Specific Aim 3, we will apply the method in two contexts that strongly depend on an accurate rare variant genetic architecture for schizophrenia. First, power analyses of next-generation sequencing studies will be conducted to prioritize future study designs and to determine the sample sizes needed. Second, genomic risk prediction models will be assessed, with possible implications for risk stratification in clinical and research contexts.
PUBLIC HEALTH RELEVANCE: Schizophrenia is a common, complex psychiatric disorder that affects 1% of the population, over two million people in the United States, and there is no cure.
This project will develop statistical models for rare, highly penetrant variants and the genes they disrupt, which have proven difficult to characterize, by integrating all available data. The project will highlight putatively causal genes, prioritize future study designs, and pilot risk prediction based on genome sequence data in schizophrenia patients.