The long-term objective of this research is to create sound methodologies for studying the role of both genetic and environmental factors in cancer etiology. The short-term goal of this proposal is to develop statistical methods for age-at-onset data from population-based family studies. These methods address two important problems in the field: ascertainment and missing data. Methods for the analysis of age at onset are not well established, owing to the complexity of censoring and possible correlations within families. To avoid the ascertainment problem, family studies are conducted using a multistage design, which collects population-based index subjects together with their risk factors and family history at the first stage, collects risk factors for the relatives of index subjects at the second stage, and finally collects blood/tissue samples from selected families at the third stage. Corresponding to each stage, the specific aims of this proposal are to develop methods for the following: 1) aggregation analysis of ages at onset from the first-stage study; 2) combined segregation and aggregation analysis of ages at onset from the second-stage study; and 3) combined linkage, segregation and aggregation analysis of ages at onset from the third-stage study. Methods will also be developed to handle missing data problems, which include the following: missing risk factors for all relatives; missing or mismeasured risk factors for index subjects; missing or mismeasured age at onset when disease onset is known; and misclassification of disease status. Methods (1)-(3) will be applied to three genetic epidemiologic datasets: a population-based case-control family study of early breast cancer; an HMO population-based case-control family study of Alzheimer's disease; and a population-based case-control family study of ovarian cancer.
The age at onset will be modeled semiparametrically, and estimating-equation techniques will be used to model the nature of correlations within families. The methods will be studied rigorously using empirical process theory and other statistical tools. User-friendly programs will be developed and made available to the public.
Sun, Jianping; Zheng, Yingye; Hsu, Li (2013) A unified mixed-effects model for rare-variant association in sequencing studies. Genet Epidemiol 37:334-44
Hsu, Li; Jiao, Shuo; Dai, James Y et al. (2012) Powerful cocktail methods for detecting genome-wide gene-environment interaction. Genet Epidemiol 36:183-94
Surewicz, Witold K; Apostol, Marcin I (2011) Prion protein and its conformational conversion: a structural perspective. Top Curr Chem 305:135-67
Chen, Lin S; Hutter, Carolyn M; Potter, John D et al. (2010) Insights into colon cancer etiology via a regularized approach to gene set analysis of GWAS data. Am J Hum Genet 86:860-71
Young, Janet M; Massa, Hillary F; Hsu, Li et al. (2010) Extreme variability among mammalian V1R gene families. Genome Res 20:10-8