The objective of the proposed research is to develop and apply new procedures for analyzing data with missing covariates in the context of three cancer studies, with the following aims: to develop biostatistical methods that will be generally useful in research on cancer as well as in other research areas; to conduct innovative analyses of the data from our studies; and to develop guidelines for when specific methods are preferable to others. To focus the techniques and illustrate the methodology, we will analyze data from the following studies: a randomized clinical trial to study the efficacy of dietary supplements in preventing polyps of the large bowel; a retrospective study of the recurrence of tonsil cancer following radiation therapy; and a randomized clinical trial and related prospective cohort study that will be used to examine the time to recurrence of lung cancer following surgery and the time to death following surgery. The models used to analyze the three studies will include logistic regression, survival models, and cure models that combine logistic regression and survival models. The procedures that we will develop to fit these models in the presence of missing covariate data include maximum-likelihood methods using variants of the EM algorithm, Bayesian methods using variants of Gibbs sampling, and multiple-imputation techniques. The methods developed will be compared with each other (when applicable) and with alternatives from the literature such as complete-case analysis and single imputation, using both the actual study data and simulated data that are plausible realizations of study results derived from known generating mechanisms. These comparisons will explore the performance of the methods under various conditions involving the sample size, the amount of missing data, the mechanism causing missing data, and the true model for the data. Performance criteria will include validity of inferences, efficiency, bias, and robustness to model violations.
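
The kind of simulation comparison described above can be illustrated with a minimal sketch. It is not the project's methodology: it assumes a simple linear model in place of the logistic, survival, and cure models named in the abstract, and the function names (`ols_slope`, `simulate_bias`), the sample size, the 30% missingness rate, and the logistic missingness mechanism are all hypothetical choices made for illustration. The sketch generates data from a known mechanism, deletes covariate values either completely at random (MCAR) or with a probability that depends on the observed outcome (MAR), and reports the average bias of complete-case analysis and single mean imputation.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    return np.polyfit(x, y, 1)[0]


def simulate_bias(mechanism, n=500, reps=200, true_slope=1.0, seed=0):
    """Average bias of two missing-covariate strategies over simulated data sets.

    All settings here are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    estimates = {"complete_case": [], "mean_imputation": []}
    for _ in range(reps):
        # Known generating mechanism: y = true_slope * x + standard normal noise.
        x = rng.normal(size=n)
        y = true_slope * x + rng.normal(size=n)

        if mechanism == "MCAR":
            # Covariate missing completely at random with probability 0.3.
            p_miss = np.full(n, 0.3)
        elif mechanism == "MAR":
            # Missingness depends only on the fully observed outcome y.
            p_miss = 1.0 / (1.0 + np.exp(-2.0 * y))
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")

        observed = rng.random(n) >= p_miss

        # Complete-case analysis: drop every record with a missing covariate.
        estimates["complete_case"].append(ols_slope(x[observed], y[observed]))

        # Single imputation: fill missing covariates with the observed mean.
        x_filled = np.where(observed, x, x[observed].mean())
        estimates["mean_imputation"].append(ols_slope(x_filled, y))

    return {name: float(np.mean(vals) - true_slope)
            for name, vals in estimates.items()}


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR"):
        print(mechanism, simulate_bias(mechanism))
```

In this toy setting complete-case analysis is roughly unbiased under MCAR but attenuates the slope when missingness depends on the outcome. Varying `n`, the missingness rate, and the mechanism mirrors the conditions the abstract proposes to explore.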

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Research Project (R01)
Project #
5R01CA064235-02
Application #
2106582
Study Section
Special Emphasis Panel (ZRG7-SSS-1 (01))
Project Start
1994-08-15
Project End
1998-05-31
Budget Start
1995-06-01
Budget End
1996-05-31
Support Year
2
Fiscal Year
1995
Name
University of California Los Angeles
Department
Biostatistics & Other Math Sci
Type
Schools of Public Health
DUNS #
119132785
City
Los Angeles
State
CA
Country
United States
Zip Code
90095