Longitudinal and survival designs are fundamental to epidemiology [e.g., studies of acquired immune deficiency syndrome (AIDS) and pulmonary function (PF)]. Such designs are the only way to assess changes in health status over time and the factors that determine those changes. Many already-articulated research questions concern whether prognostic subgroups of patients exist such that patients within a subgroup have similar changes in health status, while patients from differing subgroups have distinct patterns of change. Again, determining what factors characterize the differing subgroups is of central interest. However, the current statistical methodology for addressing such questions is inadequate. This project will develop statistical methodology to perform such subgroup identification by extending tree-structured techniques to longitudinal and survival data settings. Tree-structured methodology has already proved adept at subgroup identification in regression and classification problems. We will apply the techniques to at least seven datasets from AIDS and PF epidemiologic studies. The analyses will focus on resolving existing research hypotheses surrounding each dataset. Additionally, comparisons with standard techniques will be made. The development of methodology will focus on the choice and properties of split functions. We will pay particular heed to data analytic issues that arise in epidemiologic studies. These include handling missing values and time-varying covariates. Additionally, problems inherent to HIV seroprevalent cohorts deriving from unknown times of infection will be addressed. The research questions to be studied by this project have also been motivated by many of the issues raised by the 1986 NHLBI workshop on longitudinal data analysis. The results of this research will furnish epidemiologists and statisticians with new analytic tools that will enable investigators to take full advantage of the benefits of longitudinal and survival designs.
The ability to directly contend with prognostic group identification will enable investigators to resolve many outstanding research hypotheses. Tree-structured methods avoid the assumptions of more parametric analyses and therefore also afford a basis for checking such assumptions. Thus the methodologic developments proposed in this project will both expand and enhance the investigators' data analysis options.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
First Independent Research Support & Transition (FIRST) Awards (R29)
Project #
5R29GM045543-04
Application #
2183223
Study Section
Epidemiology and Disease Control Subcommittee 2 (EDC)
Project Start
1991-01-01
Project End
1995-12-31
Budget Start
1994-01-01
Budget End
1994-12-31
Support Year
4
Fiscal Year
1994
Name
University of California San Francisco
Department
Public Health & Preventive Medicine
Type
Schools of Medicine
DUNS #
073133571
City
San Francisco
State
CA
Country
United States
Zip Code
94143
Segal, M R; Wight, S; Hanrahan, J P et al. (1997) Maternal smoking during pregnancy and birth outcomes with weight gain adjustments via varying-coefficient models. Stat Med 16:1603-16
Segal, M R; Neuhaus, J M; James, I R (1997) Dependence estimation for marginal models of multivariate survival data. Lifetime Data Anal 3:251-68
Bacchetti, P; Segal, M R (1995) Survival trees with time-dependent covariates: application to estimating changes in the incubation period of AIDS. Lifetime Data Anal 1:35-47
Segal, M R; Tager, I B (1993) Trees and tracking. Stat Med 12:2153-68
Segal, M R; Neuhaus, J M (1993) Robust inference for multivariate survival data. Stat Med 12:1019-31
Neuhaus, J M; Segal, M R (1993) Design effects for binary regression models fitted to dependent data. Stat Med 12:1259-68