Methods for Genetic Epidemiology

We published methods of analysis for case-control family data in which probands (cases or controls) are genotyped and time-to-disease-onset information is available for first-degree relatives. The methods estimate relative risks, cumulative risks and residual familial aggregation. The work indicates that the case-control family design is robust to misspecification of the copula models used to accommodate residual familial correlation, but that samples with case probands only are not robust to such misspecification.

We published methods for cohort and nested case-control studies to estimate relative risks from haplotypes. Based on a hazard function derived from the observed genotype data, we developed a semiparametric method for joint estimation of relative-risk parameters and the cumulative baseline hazard function. The method performs well in simulations.

We published a class of TDT-type (transmission/disequilibrium test) methods that can jointly analyze haplotypes from multiple linked or unlinked candidate genes. Our approach first uses a linear signed rank statistic to compare, at the level of each individual gene, the structural similarity among transmitted haplotypes with that among non-transmitted haplotypes. The results of the ranked comparisons from all genes considered are then combined into global statistics, which simultaneously test the association of the set of genes with disease. In simulation studies, the proposed tests yielded correct type I error rates in stratified populations. Compared with the gene-by-gene test, the new global tests were more powerful in situations where all candidate genes are associated with the disease.

To take advantage of high-density SNP maps across the genome, various candidate gene association tests have been developed to compare multilocus genotypes or estimated haplotypes between cases and controls. We viewed this two-sample testing problem from the perspective of supervised machine learning and proposed a new association test. The approach adopts the flexible and easy-to-understand classification tree model as the learning machine and uses the estimated prediction error of the resulting prediction rule as the test statistic. The procedure not only provides an association test but also generates a prediction rule that can be useful in understanding the mechanisms underlying complex disease. In related work, we published a study of the performance of various types of prediction error estimators for the stochastic gradient boosting model.

The boundaries of common haplotype blocks in HapMap constructions can be ambiguous, as can the associated tagSNPs. We addressed this issue by defining a marker ambiguity score (MAS) and evaluated it in simulations based on real data. We showed that the MAS method can assess boundary ambiguity caused by ethnic variation, limited sample sizes for HapMap construction, and disease aggregation. We found striking differences in overall patterns of blocks between blacks and whites.

We published a re-sampling approach to control the family-wise error level of multiple testing procedures for detecting genetic associations in case-control data. An omnibus test combines single nucleotide polymorphism (SNP)-based and haplotype-based procedures and has good power whether the genetic disease tendency is conferred by SNPs or by haplotypes. A related two-stage procedure that controls the false discovery rate was also developed.

Methods and procedures were developed to facilitate collaborations among members of consortia working to identify low-penetrance alleles associated with breast and prostate cancer.
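As a rough illustration of how re-sampling can control the family-wise error level across many SNPs in case-control data, the sketch below implements a generic single-step max-statistic permutation adjustment. It is not the published omnibus test: the abstract does not specify the re-sampling scheme or the test statistics, so the statistic used here (absolute case-control difference in mean allele count) and the function names `snp_stats` and `maxt_adjusted_pvalues` are assumptions made for illustration only.

```python
import random


def snp_stats(genotypes, labels):
    """Per-SNP statistic: absolute case-control difference in mean allele count.

    genotypes: list of per-subject lists of allele counts (0, 1 or 2).
    labels:    list of 0 (control) / 1 (case), one per subject.
    """
    n_snps = len(genotypes[0])
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    stats = []
    for j in range(n_snps):
        case_sum = sum(g[j] for g, y in zip(genotypes, labels) if y == 1)
        ctrl_sum = sum(g[j] for g, y in zip(genotypes, labels) if y == 0)
        stats.append(abs(case_sum / n_case - ctrl_sum / n_ctrl))
    return stats


def maxt_adjusted_pvalues(genotypes, labels, n_perm=999, seed=0):
    """Family-wise-error-adjusted p-values from a max-statistic permutation test.

    Case/control labels are shuffled n_perm times; each observed SNP statistic
    is compared with the permutation distribution of the maximum over all SNPs.
    """
    rng = random.Random(seed)
    observed = snp_stats(genotypes, labels)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype-disease association
        max_stat = max(snp_stats(genotypes, shuffled))
        for j, t in enumerate(observed):
            if max_stat >= t:
                exceed[j] += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return [(1 + e) / (n_perm + 1) for e in exceed]
```

Calling `maxt_adjusted_pvalues(genotypes, labels)` returns one adjusted p-value per SNP. Because every SNP is compared with the same permutation distribution of the maximum, correlation among linked SNPs is accounted for without a Bonferroni-style penalty.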

Agency
National Institutes of Health (NIH)
Institute
Division of Cancer Epidemiology and Genetics (NCI)
Type
Intramural Research (Z01)
Project #
1Z01CP010181-04
Application #
7330712
Study Section
(BSB)
Support Year
4
Fiscal Year
2006
Name
Cancer Epidemiology and Genetics
Country
United States
Wacholder, Sholom (2013) Precursors in cancer epidemiology: aligning definition and function. Cancer Epidemiol Biomarkers Prev 22:521-7
Rosenberg, Philip S; Tamary, Hannah; Alter, Blanche P (2011) How high are carrier frequencies of rare recessive syndromes? Contemporary estimates for Fanconi Anemia in the United States and Israel. Am J Med Genet A 155A:1877-83
Khoury, Muin J; Wacholder, Sholom (2009) Invited commentary: from genome-wide association studies to gene-environment-wide interaction studies--challenges and opportunities. Am J Epidemiol 169:227-30; discussion 234-5
Wacholder, Sholom; Rotunno, Melissa (2009) Control selection options for genome-wide association studies in cohorts. Cancer Epidemiol Biomarkers Prev 18:695-7
Nam, Jun-Mo; Kwon, Deukwoo (2009) Non-inferiority tests for clustered matched-pair data. Stat Med 28:1668-79
Gail, Mitchell H (2009) Applying the Lorenz curve to disease risk to optimize health benefits under cost constraints. Stat Interface 2:117-121
Wacholder, Sholom (2009) Bias in full cohort and nested case-control studies? Epidemiology 20:339-40
Nam, Jun-Mo (2009) Efficient interval estimation of a ratio of marginal probabilities in matched-pair data: non-iterative method. Stat Med 28:2929-35
Strasak, A M; Pfeiffer, R M; Brant, L J et al. (2009) Time-dependent association of total serum cholesterol and cancer incidence in a cohort of 172,210 men and women: a prospective 19-year follow-up study. Ann Oncol 20:1113-20
Gail, Mitchell H (2009) Value of adding single-nucleotide polymorphism genotypes to a breast cancer risk model. J Natl Cancer Inst 101:959-63

Showing the most recent 10 out of 21 publications