Despite great progress in molecular genetic methods, considerably less progress has been made in the refinement of phenotypes for substance dependence (SD) and other psychiatric disorders. SD, as defined by the Diagnostic and Statistical Manual of Mental Disorders (DSM), is clinically and etiologically heterogeneous. The DSM-defined traits are not optimal for gene-finding efforts, which has substantially limited our understanding of the genetic etiology of SD. Thus, the differentiation of homogeneous subtypes of drug use, related behaviors, and co-occurring phenotypes could improve the identification of genetic variation that underlies the risk for SD and other complex traits. Existing methods are not adequate to tackle this task. The most sophisticated subtyping methods available perform unsupervised cluster analysis or latent class analysis of a disorder's clinical features. Without theoretical guidance, blind cluster or latent class analysis can lead to subtypes of little utility in genetic analysis. In this project, we will develop novel statistical methods to subtype SD traits quantitatively. Using data from >11,000 identically assessed subjects aggregated from family-based and case-control genetic studies (including genome-wide association studies (GWAS)) of cocaine, opioid, and alcohol dependence, we will identify clinical subtypes that are optimized with respect to heritability. All subjects underwent thorough phenotyping using a poly-diagnostic instrument that includes 3000 items, yielding reliable demographic, medical, substance use, and substance-related measures, and DSM diagnoses of all major substance use and psychiatric disorders. A majority of the subjects also underwent GWAS. Our preliminary results support the hypothesis that careful subtyping of substance use and related behaviors enhances the detection of genetic variants that contribute to the risk of addiction-related phenotypes and are not detected using a standard diagnostic approach.
The primary aims of the proposed research are to develop: (1) bioinformatics methods to derive quantitative traits that are highly heritable in terms of traditional narrow-sense heritability and recently defined SNP-based heritability; (2) integrative methods to jointly analyze phenotypic features and genetic markers to identify subtypes that are homogeneous phenotypically and genetically; and (3) genetic association approaches that are more efficient for subtype analysis. The derived subtypes and their association findings will be validated using multiple independent samples. An important secondary aim of the project is to develop and disseminate validated methods and software for public use through the PI's website. In summary, the objectives of the project are significant in their potential to enhance the discovery of genetic variants that contribute to the risk of SD using novel methods validated by the interdisciplinary research team. These methods, once applied to understanding the etiology of SD, may be suitable for extension to other complex phenotypes.
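For reference, the heritability criterion named in aim (1) can be written in standard quantitative-genetics notation. This is a minimal sketch under assumed notation, not the project's actual formulation: the weight vector $\mathbf{w}$, the feature vector $\mathbf{x}$, and the covariance matrices $\Sigma_A$ and $\Sigma_P$ are symbols introduced here for illustration only.

```latex
% Narrow-sense heritability of a single trait: the share of total
% phenotypic variance (sigma_P^2) due to additive genetic variance (sigma_A^2).
h^2 = \frac{\sigma_A^2}{\sigma_P^2}

% For a derived quantitative trait y = w^T x formed from phenotypic
% features x (w is a hypothetical weight vector), the same ratio becomes
% a quotient of quadratic forms in the additive-genetic and phenotypic
% covariance matrices of x.
h^2(\mathbf{w}) =
  \frac{\mathbf{w}^{\top} \Sigma_A \, \mathbf{w}}
       {\mathbf{w}^{\top} \Sigma_P \, \mathbf{w}}
```

Under this reading, SNP-based heritability would correspond to replacing the additive genetic (co)variance with the portion captured by genotyped SNPs, and deriving a highly heritable trait would amount to choosing $\mathbf{w}$ so that the ratio is large.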
This project will develop novel statistical and quantitative tools and techniques to refine the phenotypes of substance dependence and other complex disorders to enhance genetic analysis, an important area of genetics research that is underdeveloped. The proposed novel approaches are expected to advance our understanding of genetic contributions to the heterogeneity in disease phenotypes.
Wang, Xin; Bi, Jinbo (2017) Bi-convex Optimization to Learn Classifiers from Multiple Biomedical Annotations. IEEE/ACM Trans Comput Biol Bioinform 14:564-575
Johannesen, Jason K; Bi, Jinbo; Jiang, Ruhua et al. (2016) Machine learning identification of EEG features predicting working memory performance in schizophrenia and healthy adults. Neuropsychiatr Electrophysiol 2:3
Lu, Jin; Liang, Guannan; Sun, Jiangwen et al. (2016) A Sparse Interactive Model for Matrix Completion with Side Information. Adv Neural Inf Process Syst 29:4071-4079
Sun, Jiangwen; Jiang, Zongliang; Tian, Xiuchun et al. (2016) A cross-species bi-clustering approach to identifying conserved co-regulated genes. Bioinformatics 32:i137-i146
Wang, Xin; Bi, Jinbo; Yu, Shipeng et al. (2016) Multiplicative Multitask Feature Learning. J Mach Learn Res 17:
Sun, Jiangwen; Kranzler, Henry R; Bi, Jinbo (2015) Refining multivariate disease phenotypes for high chip heritability. BMC Med Genomics 8 Suppl 3:S3
Sun, Jiangwen; Kranzler, Henry R; Bi, Jinbo (2015) An Effective Method to Identify Heritable Components from Multivariate Phenotypes. PLoS One 10:e0144418