Major depressive disorder (MDD) is the leading cause of disability worldwide (WHO). The severe, recurrent form often begins in childhood or adolescence and becomes chronic. Heritability (35-70%) is similar to that of other common disorders for which genetic methodologies have yielded major discoveries. However, genome-wide association (GWA) studies have not yet produced significant results, and success may require much larger samples than are available. Alternative discovery strategies are urgently needed. Here, we will integrate genome-wide gene expression data from fresh blood with GWA data to detect expression quantitative trait loci (eQTLs: SNP loci that influence gene expression), with the goal of discovering new MDD susceptibility loci using two novel analytic strategies, including a gene network analysis method that will be substantially developed in the context of this work. We will also seek to refine and confirm the results with genome-wide expression data from brain tissue. We will recruit a new, population-based sample of 500 recurrent MDD cases and 500 depression-free controls, using a novel strategy in collaboration with a survey research company. Genome-wide gene expression levels will be assayed with the Illumina HT-12 array using RNA extracted from RNA-stabilized whole blood, and common SNP genotypes will be assayed from DNA (Illumina 660W). (We will also assay brain tissue from two regions from 39 case and 27 control specimens for confirmatory studies.) Potential non-genetic sources of variability will be considered and statistically controlled. Two novel analytic strategies will be used to discover MDD susceptibility alleles, genes and gene regulatory networks. (1) A univariate discovery strategy will identify case-control gene expression differences; detect SNP eQTLs for the selected genes using genome-wide SNP data in controls; and test the association of these SNPs with MDD in two independent GWA datasets.
The prior selection of a small number of eQTLs in a separate sample greatly improves power by reducing the burden of multiple testing in GWA analysis. (2) We will extend a novel multivariate method that uses machine learning algorithms to detect gene regulatory networks, and then apply this model to the analysis of the association of co-regulated gene modules with MDD. These algorithms (a) use a background pathway graph, which integrates a wide range of high-throughput and curated bioinformatic data to predict biological relationships between genes, in order to bias the algorithm toward biologically plausible hypotheses; and then (b) use eQTL data to construct a set of gene regulatory modules (co-expressed genes) and infer a sparse regulatory program for each module. The method will then be applied to the GWA data and, using a novel transfer learning method, to brain data. These studies will identify genes and gene networks underlying susceptibility to major depression.
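The power argument for the univariate strategy (testing only a small, pre-selected set of eQTL SNPs instead of every genotyped SNP) can be illustrated with a simple significance-threshold comparison. The sketch below assumes a Bonferroni correction, which the proposal does not specify, and the test counts (600,000 genome-wide SNPs, 100 pre-selected eQTL SNPs) are hypothetical values chosen only for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05                  # family-wise error rate (assumed)
GENOME_WIDE_TESTS = 600_000   # hypothetical number of SNPs tested genome-wide
SELECTED_EQTL_TESTS = 100     # hypothetical number of pre-selected eQTL SNPs

# Threshold each SNP must pass when every genotyped SNP is tested.
genome_wide_threshold = bonferroni_threshold(ALPHA, GENOME_WIDE_TESTS)

# Threshold when only the pre-selected eQTL SNPs are tested.
selected_threshold = bonferroni_threshold(ALPHA, SELECTED_EQTL_TESTS)

# How many times less stringent the per-test threshold becomes.
relaxation_factor = selected_threshold / genome_wide_threshold

print(f"Genome-wide threshold:   {genome_wide_threshold:.2e}")
print(f"Selected-eQTL threshold: {selected_threshold:.2e}")
print(f"Relaxation factor:       {relaxation_factor:.0f}x")
```

Under these assumed counts, a SNP in the pre-selected set needs far weaker evidence to reach significance than it would in an unrestricted genome-wide scan, which is the sense in which prior selection in a separate sample improves power.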
The proposed project will recruit individuals with and without histories of recurrent depression and then study both the DNA sequence and the activity levels of genes in blood cells. Using a new set of statistical methods, the study will attempt to identify genes and networks of genes that are altered in some people who suffer from depression. This information would be useful in finding new treatment or prevention strategies.