Repeated measures data taken in longitudinal studies are increasingly used in both clinical and epidemiological work to make inferences about changes in outcome measures over time. Several new maximum likelihood procedures that permit the analysis of longitudinal data are now commercially available through major software packages. Although this represents a major advance in the last few years, these programs have several limitations. In particular, they assume multivariate normality, linear models, and that any missing data are missing at random (Little and Rubin, 1987). The generalized estimating equations (GEE) approach of Liang and Zeger (1986a,b) has become increasingly popular for analyzing correlated data because it does not require distributional assumptions about the data, permits nonlinear mean functions, and yields consistent estimates of mean parameter vectors and their standard errors under mild conditions. A version of the GEE program currently available through SAS and S-Plus has several limitations which make it awkward for longitudinal data analysis, including restriction to monotone patterns of missingness and 'missing completely at random' data, allowing the variance to vary only as a function of the mean, and estimation of working correlations using an 'all available pairs' approach. The work we propose in the area of longitudinal studies has two specific aims. First, we propose to develop methods for handling informative dropouts, where the missing-at-random assumption may not hold. Our method will permit time of dropout to be informative about rate of change, and can incorporate measurements taken after subjects are withdrawn from the study.
Our second aim is to provide a computer program for implementing the GEE with longitudinal data which eliminates the restrictions on missing data patterns and variance assumptions, and utilizes an alternative method for estimating working variance-covariance matrices. We also propose to incorporate inverse probability weights as suggested by Robins et al. (1994), which extend the validity of the method to include data which are missing at random. Finally, we plan to extend our methods previously developed for the analysis of binary longitudinal outcomes to study familial aggregation. Familial aggregation includes a broad range of studies designed to investigate the clustering of diseases in families, with the intent of separating out environmental and genetic factors. Such studies are increasingly used to study the influence of genetic factors in diseases before undertaking formal segregation analyses, which require the specification of the genetic model. Currently, much interest focuses on the use of a particular auto-logistic type model for binary traits which has many attractive features, including the ease of interpretation of the 'aggregation' parameters and of making inferences about parameters using available software. A major limitation of the model is the dependence of the parameters of interest on family size. We propose to develop a methodology for estimation and inference for a closely related model which is specifically formulated to remove the dependence of parameters on family size.