Perennial challenges in the field of substance use include obtaining sufficient sample sizes to test low-base-rate behaviors, comparing findings across a wide array of substance use measures and diagnostic instruments that may even rely on different raters, and evaluating the generalizability of findings across the highly heterogeneous population of substance users (e.g., subgroup comparisons). One methodological approach to addressing these challenges is Integrative Data Analysis (IDA), the simultaneous analysis of data pooled from multiple studies. Recent research using IDA demonstrates the feasibility of this approach for studying substance use and disorder. The promise of IDA, however, rests on the availability of effective techniques for data harmonization, that is, the creation of substance use and substance use disorder measures that are equivalent in scale and meaning across studies despite variation in the primary measures originally used to assess the participants. The overarching goal of this proposal is thus to develop, refine, and evaluate measurement models that will expand the domain of potential measurement contexts to which harmonization techniques can be applied. Measurement contexts include variation in instrumentation, such as scales that differ in item overlap, content, and response format or options across studies, and variation in assessment source, such as self- versus peer-report.
This goal is pursued through four aims: (1) to extend and evaluate psychometric models (e.g., item factor analysis) for harmonizing continuous outcomes, including symptom severity and quantity/frequency of use; (2) to develop and evaluate psychometric models (e.g., latent class and mixture models) for harmonizing categorical outcomes, including substance use diagnoses based on different instruments or versions of the DSM; and (3, 4) to extend these models to harmonize data obtained from multiple sources (self and peers) across studies for both continuous (Aim 3) and categorical (Aim 4) substance use outcomes. We will pursue these aims through a novel combination of computer simulation studies (to evaluate the statistical properties of harmonized scores obtained from data with known population parameters) and laboratory analogue studies (to evaluate the validity of harmonized scores under controlled conditions that permit unknown participant factors to influence responding across item sets or studies), yielding unique information about the conditions under which these psychometric models produce valid harmonized scores in practice. This research will provide both novel methods permitting the broader use of IDA in substance use research and new guidelines regarding the limits of IDA. Resulting public health benefits include advancing domains of substance use research that especially benefit from large sample sizes to increase power (e.g., GWAS) or to observe low-base-rate behaviors (e.g., injection drug use and HIV-related behaviors), from direct tests of the replicability of novel hypotheses across studies (e.g., GxE interactions in BG studies), or from increased population diversity to examine the generalizability of effects across subgroups.
The goal of this application is to develop and evaluate psychometric models for harmonizing substance use outcomes assessed in varying ways across independent studies, permitting the pooling of data for integrative data analysis. Integrative data analysis provides one approach to addressing perennial challenges in the field of substance use, such as the need to obtain sufficient sample sizes to test low-base-rate behaviors, compare findings across a wide array of substance use measures and diagnostic instruments that may even rely on different raters, and evaluate the generalizability of findings across the highly heterogeneous population of substance users (e.g., subgroup comparisons). Thus, the methodological implications of this application are significant for advancing substance use research addressing a range of public health issues.