General epidemiologic methodologic work has progressed in four areas: (1) extension of the resampling methodology to within-cluster paired resampling (WCPR), which permits within-cluster inference for clustered binary outcomes; (2) theoretical development of design and analysis based on outcome-dependent sampling when the outcome is continuous, such as IQ or blood pressure; (3) development of semi-parametric modeling methods for combining information from previously diagnosed cases with that from occult cases who are detected by screening; and (4) application of missing data methods to the setting where molecular markers are used to subclassify cases into subsyndromes, but not all cases can be fully classified.

(1) We had previously developed a method based on sampling one outcome per cluster, carrying out a classical analysis on these now-independent outcomes, and then repeating the resampling many times, ultimately pooling the parameter estimates. "Within Cluster Resampling" (WCR) performs well in simulations, is analytically tractable, and obviates untestable assumptions about the underlying covariance structure. We have now modified the method to permit subject-specific inference. The idea with WCPR is to sample an affected and an unaffected individual from each cluster and compare them with regard to covariate differences by fitting a paired-data logistic regression model. The paired resampling is repeated numerous times and the separate estimates are pooled, as in WCR. In simulations, the method compares very favorably with conditional logistic regression (CLR) in scenarios where the assumptions required by CLR are met. When the response to an exposure varies across clusters, however, the dependency structure becomes complex and the assumptions required for CLR are violated, whereas WCPR remains valid. Thus, WCPR is more broadly applicable than the standard method.
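As a rough illustration of the WCR scheme described in (1), the sketch below draws one observation per cluster, applies a classical 2x2 log odds ratio analysis to the resulting independent sample, and pools over many resamples. The function names, the 0.5 continuity correction, the toy data, and the choice of the log odds ratio as the "classical analysis" are all assumptions made for illustration. The pooled variance is taken as the average within-resample variance minus the between-resample variance of the estimates, which is our reading of the usual WCR estimator; it can come out negative when there are few clusters.

```python
import math
import random


def log_odds_ratio(pairs):
    """Classical 2x2 analysis on independent (exposure, outcome) pairs.

    Returns (log odds ratio, estimated variance). Adding 0.5 to each cell
    is an illustrative choice that keeps the estimate finite.
    """
    cells = {(e, y): 0.5 for e in (0, 1) for y in (0, 1)}
    for e, y in pairs:
        cells[(e, y)] += 1
    est = math.log(
        cells[(1, 1)] * cells[(0, 0)] / (cells[(1, 0)] * cells[(0, 1)])
    )
    var = sum(1.0 / count for count in cells.values())
    return est, var


def within_cluster_resampling(clusters, n_resamples=500, seed=0):
    """Pool estimates over repeated one-per-cluster resamples.

    `clusters` is a list of non-empty lists of (exposure, outcome) pairs.
    """
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(n_resamples):
        # One randomly chosen member per cluster gives independent outcomes.
        sample = [rng.choice(members) for members in clusters]
        est, var = log_odds_ratio(sample)
        estimates.append(est)
        variances.append(var)
    pooled = sum(estimates) / n_resamples
    mean_within = sum(variances) / n_resamples
    between = sum((b - pooled) ** 2 for b in estimates) / n_resamples
    # Assumed WCR variance: average within-resample variance minus the
    # variability of the estimate across resamples.
    return pooled, mean_within - between


# Hypothetical toy data: 200 clusters of 1 to 5 members each.
toy_rng = random.Random(1)
toy_clusters = [
    [(toy_rng.randint(0, 1), toy_rng.randint(0, 1))
     for _ in range(toy_rng.randint(1, 5))]
    for _ in range(200)
]
estimate, variance = within_cluster_resampling(toy_clusters)
print(f"pooled log OR = {estimate:.3f}, variance estimate = {variance:.4f}")
```

A paired variant along the lines of WCPR would instead draw one affected and one unaffected member from each cluster that contains both and fit a paired-data logistic model in place of `log_odds_ratio`; that step is not shown here.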
WCPR may prove most useful in genetic studies where affected and unaffected siblings are to be compared with respect to an allele that may be in linkage disequilibrium with a disease gene.

(2) When studying a continuous biomarker of health, such as blood pressure, we have shown that one can markedly improve the efficiency of a study (over what would be achieved with random sampling) by using our proposed design, which oversamples observations at the extremes of the outcome distribution, i.e., people with unusually high or low values of the outcome, provided the proposed semi-parametric empirical likelihood methods of analysis are then used. The strategy has been applied to studies of IQ and infant neurologic scores in relation to pesticide exposure.

(3) A common condition, such as uterine fibroids, can be studied by recruiting women at ages when they are at risk and ascertaining prior diagnoses of the condition. If the condition is often subclinical, one can supplement case accrual by offering screening as well. We have developed a statistical method to model onset and progression of such conditions, allowing for an initial subclinical phase.

(4) Molecular and genetic markers can be used to classify cases into etiologically distinct subtypes, which may help to clarify inference in a case-control study. Unfortunately, tissue for carrying out the classification may be incompletely available. We showed that an analysis that discards the cases who could not be classified often leads to bias. By contrast, in a scenario where missingness can depend on covariates, but not on the underlying subtype conditional on covariates, missing data methods can both prevent bias and improve precision of estimation by exploiting the information from cases who were enrolled but could not be subtyped.
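To make the point in (4) concrete, here is a minimal sketch with hypothetical counts and a single binary exposure as the only covariate. It assumes, as in the scenario above, that missingness of the subtype may depend on exposure but not on the subtype itself within an exposure stratum, so that the subtype mix among classified cases in a stratum also applies to the unclassified cases in that stratum. The function names and all counts are invented for illustration, and this simple stratified reallocation stands in for the missing data methods actually used; it also gives no valid standard errors, because expected counts are treated as if observed.

```python
def reallocate_unclassified(cases):
    """Expected subtype counts per stratum when subtype is missing for some cases.

    `cases` maps stratum -> {subtype: count, ..., "unclassified": count}.
    Unclassified cases are split in proportion to the subtype mix among
    the classified cases of the same stratum.
    """
    expected = {}
    for stratum, counts in cases.items():
        classified = {k: v for k, v in counts.items() if k != "unclassified"}
        total = sum(classified.values())
        missing = counts.get("unclassified", 0)
        expected[stratum] = {
            subtype: n + missing * n / total
            for subtype, n in classified.items()
        }
    return expected


def odds_ratio(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed):
    """Exposure odds ratio from case and control counts."""
    return (case_exposed / case_unexposed) / (ctrl_exposed / ctrl_unexposed)


# Hypothetical data: subtyping fails more often among exposed cases.
cases = {
    "exposed": {"A": 30, "B": 30, "unclassified": 40},
    "unexposed": {"A": 30, "B": 60, "unclassified": 10},
}
controls = {"exposed": 100, "unexposed": 100}

# Complete-case analysis for subtype A: unclassified cases are discarded.
complete_case_or = odds_ratio(
    cases["exposed"]["A"], cases["unexposed"]["A"],
    controls["exposed"], controls["unexposed"],
)

# Analysis that reallocates unclassified cases within exposure strata.
expected = reallocate_unclassified(cases)
adjusted_or = odds_ratio(
    expected["exposed"]["A"], expected["unexposed"]["A"],
    controls["exposed"], controls["unexposed"],
)

print(f"complete-case OR for subtype A: {complete_case_or:.2f}")
print(f"reallocated OR for subtype A:   {adjusted_or:.2f}")
```

In these invented data the two analyses disagree because exposed cases were less likely to be subtyped, which is the kind of covariate-dependent missingness that discarding unclassified cases cannot handle.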

Agency: National Institutes of Health (NIH)
Institute: National Institute of Environmental Health Sciences (NIEHS)
Type: Intramural Research (Z01)
Project #: 1Z01ES040006-04
Application #: 6432289
Study Section: (BB)
Support Year: 4
Fiscal Year: 2000
Name: U.S. National Inst of Environ Hlth Scis
Country: United States

Weinberg, Clarice R (2017) Invited Commentary: Can Issues With Reproducibility in Science Be Blamed on Hypothesis Testing? Am J Epidemiol 186:636-638
Lyles, Robert H; Mitchell, Emily M; Weinberg, Clarice R et al. (2016) An efficient design strategy for logistic regression using outcome- and covariate-dependent pooling of biospecimens prior to assay. Biometrics 72:965-75
Lyles, Robert H; Van Domelen, Dane; Mitchell, Emily M et al. (2015) A Discriminant Function Approach to Adjust for Processing and Measurement Error When a Biomarker is Assayed in Pooled Samples. Int J Environ Res Public Health 12:14723-40
Saha-Chaudhuri, Paramita; Umbach, David M; Weinberg, Clarice R (2011) Pooled exposure assessment for matched case-control studies. Epidemiology 22:704-12
Weinberg, C R (2009) Less is more, except when less is less: Studying joint effects. Genomics 93:10-2
Weinberg, Clarice R (2007) Can DAGs clarify effect modification? Epidemiology 18:569-72
Weinberg, Clarice R; Shore, David L; Umbach, David M et al. (2007) Using risk-based sampling to enrich cohorts for endpoints, genes, and exposures. Am J Epidemiol 166:447-55
Basso, Olga; Wilcox, Allen J; Weinberg, Clarice R (2006) Birth weight and mortality: causality or confounding? Am J Epidemiol 164:303-11
Howards, Penelope P; Hertz-Picciotto, Irva; Weinberg, Clarice R et al. (2006) Misclassification of gestational age in the study of spontaneous abortion. Am J Epidemiol 164:1126-36
Weinberg, C R (2005) Invited commentary: Barker meets Simpson. Am J Epidemiol 161:33-5; discussion 36-7

Showing the most recent 10 out of 17 publications