Blood expression profiling of major depressive disorder (MDD) patients has the potential to refine disease classification and diagnosis, elucidate molecular mechanisms, and improve drug targeting and clinical trial outcomes. However, the only significant finding in the largest MDD blood RNA study, a recent RNAseq profiling of 922 individuals comparing MDD patients with healthy controls, was a composite increase in the expression of genes associated with the interferon response pathway in MDD. Statistical analysis of such datasets is complicated by disease heterogeneity and by sources of inter-individual gene expression variation, such as large person-to-person differences in blood cell type proportions, which make it difficult to find measurements that robustly distinguish clinical groups. We have recently developed latent-variable-based computational approaches that more effectively model heterogeneity, including cell type proportions, in blood gene expression data and that improve the identification of disease-associated differentially expressed genes and disease-associated differences in cell type proportions. We have demonstrated that our methods increase the power to detect differentially expressed genes and improve agreement among separate studies of disease-associated global RNA expression. The latent variable framework can also be used to interpret the observed changes by attributing them to specific blood cell types. Preliminary analysis of the large recent MDD RNAseq dataset using these approaches finds evidence for additional depression-related signatures and differentially expressed RNAs not detected by the original analysis. Analysis of the association of these signatures with acute symptoms, and of their stability over time within individuals, suggests that they are most likely novel MDD trait markers.
We will apply this enhanced methodology to identify novel MDD-associated genes in this large RNAseq dataset, confirm these signatures by analysis of other public MDD datasets, investigate the state or trait nature and the genetic basis of these signatures, and use machine learning to generate a classification of depression subgroups driven purely by clinical and biological data. This study is expected to result in an improved analysis framework for blood RNA biomarker studies of psychiatric disease, significant new insight into MDD-associated cell type proportions and blood gene expression trait signatures, and the identification of molecularly coherent depression subgroups.
We have developed new computational methods that improve the ability to identify gene expression changes in genome-wide studies of blood in MDD. We will apply enhanced versions of these methods to investigate public MDD datasets. This research has the potential to refine disease classification and diagnosis, elucidate molecular mechanisms, and improve drug targeting and clinical trial outcomes.