This Independent Scientist Award will significantly enhance my research capabilities, enabling me to become a leading quantitative investigator in the field of substance use disorders (SUDs). Specifically, it will allow me to increase my knowledge of SUD phenotypes, treatment, and genetics. SUDs are clinically and etiologically heterogeneous, and their classification has been difficult. This application reflects my ongoing commitment to developing an innovative, interdisciplinary research program on the classification of SUDs through quantitative analysis of multidimensional data. My extensive training in computational science and prior research in biomedical informatics have given me the skills to design, implement, and evaluate advanced algorithms and sophisticated analyses for challenging problems in classifying SUDs. My ongoing NIDA-funded R01 uses a large sample (n ≈ 12,000) aggregated from multiple genetic studies of cocaine, opioid, and alcohol dependence to develop and evaluate novel statistical models that generate clinical SUD subtypes optimized for gene finding. This K02 proposal extends that work to evaluate treatment outcome in refined subgroups of SUD populations, using data from treatment studies for cocaine, opioid, alcohol, and multiple substance dependence. The project will integrate diagnostic behavioral variables and genotypes with biological/neurobiological features of the disorders and repeated measures of treatment outcome. The primary career development goals of this application are to: (1) understand the reliability, validity, and functional mechanisms of various phenotyping methods; (2) continue training in the genetics of addictions; and (3) gain greater knowledge of different treatment approaches and their efficacy.
A solid foundation in these areas will enhance my ability to realize the full potential of data collected and aggregated across multiple dimensions, to design the most clinically useful analyses, and to generate innovative solutions to diagnostic and predictive challenges in SUD research. Through formal coursework, directed readings, individual tutoring, and intensive multidisciplinary collaboration with a diverse team of world-renowned researchers, I will receive training and collect pilot data for future R01 projects by examining: (Aim I) whether the clinically defined, highly heritable subtypes derived in my current R01 project predict differential treatment response; (Aim II) whether new statistical models that directly combine treatment data with behavioral, biological, and genomic data identify refined subtypes with confirmatory multilevel evidence; and (Aim III) whether genetic and social factors moderate treatment outcome by subtype. The overall goal of this proposal is to further my independent, multidisciplinary research program in the development of statistical methods for refined classification of SUDs. The K02 award will provide the protected time I need to engage fully in the training activities described, enhancing my knowledge and skills so that I can make important, novel contributions to the genetics and treatment of SUDs.

Public Health Relevance

This project will develop novel statistical and quantitative tools to identify homogeneous subtypes of substance use disorders (SUDs) and other complex diseases, in order to enhance gene finding and treatment matching. The project will perform secondary analyses of existing data from treatment studies of cocaine, opioid, alcohol, and mixed SUDs. The proposed approaches are expected to advance precision medicine for SUDs by enabling treatment matching and providing a more refined SUD classification for gene finding.

National Institutes of Health (NIH)
National Institute on Drug Abuse (NIDA)
Research Scientist Development Award - Research (K02)
Study Section
Behavioral Genetics and Epidemiology Study Section (BGES)
Program Officer
Duffy, Sarah Q
University of Connecticut
Engineering (All Types)
Schools of Engineering
United States
Lu, Jin; Sun, Jiangwen; Wang, Xinyu et al. (2018) Inferring phenotypes from substance use via collaborative matrix completion. BMC Syst Biol 12:104
Lu, Jin; Sun, Jiangwen; Wang, Xinyu et al. (2017) Collaborative Phenotype Inference from Comorbid Substance Use Disorders and Genotypes. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2017:392-397
Shang, Chao; Palmer, Aaron; Sun, Jiangwen et al. (2017) VIGAN: Missing View Imputation with Generative Adversarial Networks. Proc IEEE Int Conf Big Data 2017:766-775