Missing data occur in nearly every study that evaluates the effectiveness of proposed treatments for substance abuse. The problem is especially apparent in studies carried out over a period of time: patients often miss clinic visits, fail to give urine samples when required, or simply drop out of the study. Until recently, missing data were handled by deletion or by various methods of data imputation, each of which has known problems, such as producing biased results (Rovine and Delaney, 1990). Beginning in the late 1970s and early 1980s, several researchers (Dempster et al., 1977; Little and Rubin, 1987; Marini et al., 1980; and others) began summarizing and extending the theoretical foundations for calculating maximum likelihood estimates under conditions of missing data. The formulations for these general linear model analyses are based on population distribution theory. There has been essentially no evaluation of the validity and robustness of these new methods for small-sample data. This project will use Monte Carlo techniques to evaluate two approaches to handling missing data: complete-case analysis (list-wise deletion) and a Random Regression Model, which implements EM estimation of the covariance matrix and model parameters. Several variables will be assessed for their effects on Type I error rate (when no true treatment effect is present), power (in the presence of a true treatment effect), and bias in estimating model parameters. These variables include the pattern and degree of missing observations, the number of repeated observations, the sample size, and the size of the true treatment effect. The final purpose of this project is to describe how data collected as part of substance abuse research can be appropriately analyzed using Random Regression methods. This description will include a planned re-analysis of several previously collected data sets, which will be used to demonstrate the utility and interpretation of predicting treatment outcome with this analysis method.
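The Monte Carlo design described above can be illustrated with a minimal sketch. It is not the project's actual simulation code. It covers only the complete-case (list-wise deletion) arm and omits the Random Regression Model and its EM estimation entirely. Every function name, the random-intercept data model, the per-visit dropout probability (missing completely at random), and the normal-approximation critical value of 1.96 are assumptions introduced here for illustration.

```python
import math
import random
import statistics


def simulate_trial(rng, n_per_arm, n_visits, effect, dropout_prob):
    """Simulate one two-arm trial and return final-visit scores of completers.

    Hypothetical data model: each subject has a random intercept, and may
    drop out before each later visit with probability dropout_prob,
    independent of arm and outcome (missing completely at random).
    """
    completers = {0: [], 1: []}
    for arm in (0, 1):
        for _ in range(n_per_arm):
            subject_level = rng.gauss(0.0, 1.0)  # random intercept
            dropped = any(
                rng.random() < dropout_prob for _ in range(n_visits - 1)
            )
            if dropped:
                continue  # list-wise deletion: incomplete cases are discarded
            # Treatment effect is added only in arm 1.
            final_score = subject_level + arm * effect + rng.gauss(0.0, 1.0)
            completers[arm].append(final_score)
    return completers[0], completers[1]


def welch_statistic(a, b):
    """Two-sample statistic with unpooled variances (Welch form)."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(b) - statistics.mean(a)) / se


def rejection_rate(n_reps, n_per_arm, n_visits, effect, dropout_prob,
                   seed=0, crit=1.96):
    """Fraction of simulated trials in which the null hypothesis is rejected.

    With effect == 0 this estimates the Type I error rate; with effect != 0
    it estimates power. crit=1.96 is a normal approximation to the two-sided
    5% critical value, assumed here to keep the sketch dependency-free.
    """
    rng = random.Random(seed)
    rejections = 0
    usable = 0
    for _ in range(n_reps):
        a, b = simulate_trial(rng, n_per_arm, n_visits, effect, dropout_prob)
        if len(a) < 2 or len(b) < 2:
            continue  # too few completers to estimate a variance
        usable += 1
        if abs(welch_statistic(a, b)) > crit:
            rejections += 1
    return rejections / usable if usable else float("nan")


# Illustrative settings only; none of these values come from the project.
type1 = rejection_rate(500, n_per_arm=30, n_visits=4, effect=0.0,
                       dropout_prob=0.1, seed=1)
power = rejection_rate(500, n_per_arm=30, n_visits=4, effect=1.0,
                       dropout_prob=0.1, seed=1)
print(f"Estimated Type I error rate: {type1:.3f}")
print(f"Estimated power: {power:.3f}")
```

Varying `n_per_arm`, `n_visits`, `effect`, and `dropout_prob` over a grid corresponds to the factors listed above (sample size, number of repeated observations, size of the true treatment effect, and degree of missingness). A fuller study would also vary the missingness pattern and add the second analysis method for comparison.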