Building on two advanced nonparametric statistical techniques, Multivariate Adaptive Regression Splines (MARS) and Classification and Regression Trees (CART), this project will develop tree-based methods and apply them to data from the Yale Pregnancy Outcome Study (YPOS), which was designed to examine the relationship between pregnancy outcome and a variety of risk factors, including prescription drug and alcohol use, tobacco smoke, caffeine consumption, and contraceptive practice. Data on about 7,000 subjects will be available from two related YPOS studies. Analyses will also be extended to several other important databases, including the 1988 National Health Interview Survey on Child Health. Compared with traditional statistical methods and software, the methods that we will employ and investigate have several advantages: (i) they automatically find the important variables and significant interactions among a large number of variables, making it more likely that new risk factors will be discovered in studies of pregnancy outcome and in other studies; (ii) they identify high-risk individuals; and (iii) they use data efficiently by handling missing data and predictors of mixed (ordinal, nominal, and nested) types appropriately. We will study pregnancy outcomes associated with perinatal death, such as intrauterine growth retardation, small-for-gestational-age birth, and preterm delivery, and determine the relationship between these outcomes and putative risk factors. Although the YPOS database has been extensively analyzed using more traditional methods, the tree-based methods will provide a deeper understanding of risk factors and will therefore inform the development of public health programs to prevent birth defects. The emphasis of this project will be on interactive effects among risk factors in relation to the outcome of interest (e.g., miscarriage or birthweight).
The methods and software developed by this study will offer researchers the opportunity to perform more flexible, realistic, and efficient analyses in epidemiologic studies.

Project Start: 1994-05-01
Project End: 1999-04-30
Budget Start: 1995-05-01
Budget End: 1996-04-30
Support Year: 2
Fiscal Year: 1995
Total Cost: (not given)
Indirect Cost: (not given)
Name: Yale University
Department: Public Health & Prev Medicine
Type: Schools of Medicine
DUNS #: 082359691
City: New Haven
State: CT
Country: United States
Zip Code: 06520
Zhang, H; Triche, E; Leaderer, B (2000) Model for the analysis of binary time series of respiratory symptoms. Am J Epidemiol 151:1206-15
Zhang, H; Bonney, G (2000) Use of classification trees for association studies. Genet Epidemiol 19:323-32
Zhang, H; Merikangas, K (2000) A frailty model of segregation analysis: understanding the familial transmission of alcoholism. Biometrics 56:815-23
Zhang, H (1999) Analysis of infant growth curves using multivariate adaptive splines. Biometrics 55:452-9
Zhang, H; Zhao, H; Merikangas, K (1997) Strategies to identify genes for complex diseases. Ann Med 29:493-8
Zhao, H; Zhang, H; Rotter, J I (1997) Cost-effective sib-pair designs in the mapping of quantitative-trait loci. Am J Hum Genet 60:1211-21
Zhang, H; Bracken, M B (1996) Tree-based, two-stage risk factor analysis for spontaneous abortion. Am J Epidemiol 144:989-96
Zhang, H; Risch, N (1996) Mapping quantitative-trait loci in humans by use of extreme concordant sib pairs: selected sampling by parental phenotypes. Am J Hum Genet 59:951-7
Risch, N J; Zhang, H (1996) Mapping quantitative trait loci with extreme discordant sib pairs: sampling considerations. Am J Hum Genet 58:836-43
Zhang, H; Bracken, M B (1995) Tree-based risk factor analysis of preterm delivery and small-for-gestational-age birth. Am J Epidemiol 141:70-8
