Susceptibility to alcohol and substance dependence is influenced by genetic factors. However, few specific genetic variations that alter susceptibility to addiction have been discovered. This is partly because addiction is a polygenic trait, influenced by many genetic variations, each with a small marginal effect. Nevertheless, the collective effects of those genetic variations and their interactions with other genetic and environmental factors may be quite important in predisposition to alcohol and other substance use disorders and related phenotypes. It is still not clear which study designs and analysis methods are most suitable for detecting the interacting risk factors that contribute to complex traits. Unfortunately, the commonly used statistical approaches for analysis of genetic data may not be optimal for this challenging task. The long-term goals of our research are to improve the detection of interacting genetic and environmental risk factors that contribute to the development of substance/alcohol use disorders by applying optimal statistical techniques. The research proposed in this application aims to develop alternative methods for analyzing genetic data, assess the performance of the proposed methods, and, importantly, apply these methods to existing genetic data on substance use disorders. In particular, methods based on random forests and related resampling-based data-mining approaches will be considered. Areas of development will include methods for assessing haplotype and gene-level effects, novel permutation algorithms, and improvements in power to detect interacting factors. We will implement these methods in user-friendly software capable of analyzing the vast amounts of data produced by genome-wide association scans. Simulations will be used to assess the performance of the novel approaches and compare them to traditional genetic association testing methods.
The optimal approaches developed in this research program will then be applied to existing datasets on substance dependence and other addiction-related phenotypes. Specifically, case-control data from the NICSNP project and the Study of Addiction: Genetics and Environment (SAGE), collected by Dr. Laura Bierut and colleagues, will be analyzed. Analysis of existing data using new statistical methods that account for genetic interactions has great potential to identify novel genetic variations that contribute to individual differences in susceptibility to substance abuse and dependence. Discovery of genetic and environmental factors that influence substance dependence and related disorders, or outcomes of treatment for these disorders, has important implications, including improved risk prediction and a better understanding of the pathways by which addiction develops. Perhaps more importantly, this knowledge may help identify subtypes of addiction that require different interventions, leading to personalized treatment with increased success rates.

Public Health Relevance

Although progress has been made in understanding the heritable aspects of substance and alcohol use disorders, few specific genetic risk factors have been identified. This is partly because the small changes in susceptibility to these disorders conferred by relevant genetic variations are individually very difficult to detect, and currently used statistical approaches for analysis of genetic data are not optimal for this challenging task. The research proposed in this application aims to develop alternative methods for analyzing genetic data, assess the performance of the proposed methods, and apply the methods to existing genetic data on substance use disorders to identify genetic risk factors for addiction and related traits.

National Institutes of Health (NIH)
National Institute on Drug Abuse (NIDA)
Exploratory/Developmental Grants (R21)
Study Section: Behavioral Genetics and Epidemiology Study Section (BGES)
Program Officer: Wideroff, Louise
Mayo Clinic, Rochester
United States
Winham, Stacey J; Jenkins, Gregory D; Biernacka, Joanna M (2016) Modeling X Chromosome Data Using Random Forests: Conquering Sex Bias. Genet Epidemiol 40:123-32
Winham, Stacey J; Biernacka, Joanna M (2013) Gene-environment interactions in genome-wide association studies: current approaches and new directions. J Child Psychol Psychiatry 54:1120-34
Winham, Stacey J; Freimuth, Robert R; Biernacka, Joanna M (2013) A Weighted Random Forests Approach to Improve Predictive Performance. Stat Anal Data Min 6:496-505
Winham, Stacey J; Colby, Colin L; Freimuth, Robert R et al. (2012) SNP interaction detection with Random Forests in high-dimensional genetic data. BMC Bioinformatics 13:164