Developing diagnostic predictors and assessing agreement between raters or devices are important for many studies in the Division of Epidemiology, Statistics, and Prevention Research. These data have complex features, and new statistical methodology is often required to address the resulting analytic issues adequately.
A goal of the natural driving study is to predict the occurrence of crashes or near crashes among teenagers from poor driving behavior at or before the time of the crash. Developing and assessing the performance of a predictor is complex because measurements are taken in real time over a long period and some individuals have many more crashes than others. In addition, the quality of the predictor may itself be related to the number of near crashes observed for a given teenager (for example, prediction may be worse for those who have more crashes). We plan to use a within-cluster resampling approach to develop this predictor. The methodology for developing the predictor is established, although this would be a novel application of it; further research is needed on methodology for validating the predictor. We will examine different cross-validation procedures for this problem and generalize our work into guidelines for developing and validating predictors from longitudinal data when the number of events is informative.
Predicting the occurrence and time of ovulation is an important goal in reproductive epidemiology. The BioCycle study collects information on the occurrence and time of ovulation for two menstrual cycles on approximately 250 women. We will use multivariate longitudinal biomarker data to develop a predictor of this event. New statistical methodology will be required both to estimate these predictive models and to validate their predictive performance.
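The within-cluster resampling idea proposed for the driving study can be illustrated with a minimal sketch. It assumes the general form of the technique (repeatedly draw one observation per cluster, apply an estimator to each resampled data set, and average the results); the function name, arguments, and toy data are hypothetical and are not taken from the project.

```python
import numpy as np

def within_cluster_resample(cluster_ids, values, estimator, n_resamples=500, seed=0):
    """Average an estimator over data sets holding one draw per cluster.

    Hypothetical helper for illustration only; a real analysis would fit
    a prediction model in place of the simple estimator used here.
    """
    rng = np.random.default_rng(seed)
    cluster_ids = np.asarray(cluster_ids)
    values = np.asarray(values)
    # Row indices belonging to each cluster (e.g. each teenager)
    groups = [np.flatnonzero(cluster_ids == c) for c in np.unique(cluster_ids)]
    estimates = np.empty(n_resamples)
    for q in range(n_resamples):
        # Draw exactly one observation from every cluster
        picked = np.array([rng.choice(idx) for idx in groups])
        estimates[q] = estimator(values[picked])
    # The resampling estimate is the average over all resampled data sets
    return estimates.mean()

# Toy data: cluster "b" contributes nine times as many rows as cluster "a"
ids = ["a"] + ["b"] * 9
vals = [0.0] + [1.0] * 9
naive = float(np.mean(vals))                             # dominated by "b"
wcr = within_cluster_resample(ids, vals, np.mean, n_resamples=50)
```

In this toy example every resampled data set contains one row from each cluster, so each cluster carries equal weight regardless of how many events it contributed, which is the property that matters when cluster size is informative.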
Measures of agreement between raters have been developed for ratings made at a single time as well as for ratings made over time. Recently, the ENDO study encountered an unsolved analytic problem in which the agreement is monotonically increasing. Specifically, a group of expert raters is presented with a case, with an increasing amount of information provided at each of a series of time points. Interest is in testing whether agreement increases over time with this additional information. New statistical methodology will be developed for this problem under this project.
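As a purely descriptive starting point, agreement can be summarized separately at each time point. The sketch below computes Fleiss' kappa per time point; it assumes categorical ratings coded 0, 1, ..., k-1 with the same raters scoring every case, and the toy ratings are invented for illustration. It is not the new methodology proposed here and does not test for a monotone trend.

```python
import numpy as np

def fleiss_kappa(ratings, n_categories):
    """Fleiss' kappa for a (cases x raters) array of integer category codes."""
    ratings = np.asarray(ratings)
    n_cases, n_raters = ratings.shape
    # counts[i, j] = number of raters placing case i in category j
    counts = np.stack(
        [(ratings == c).sum(axis=1) for c in range(n_categories)], axis=1
    )
    # Observed agreement among rater pairs, per case and then averaged
    p_case = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_obs = p_case.mean()
    # Agreement expected by chance from the overall category proportions
    p_cat = counts.sum(axis=0) / (n_cases * n_raters)
    p_exp = np.sum(p_cat ** 2)
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical ratings: 2 cases, 3 raters, 2 categories, at two time points
ratings_by_time = [
    [[0, 0, 1], [1, 1, 0]],  # early: raters split on both cases
    [[0, 0, 0], [1, 1, 1]],  # later: raters agree on both cases
]
kappas = [fleiss_kappa(r, n_categories=2) for r in ratings_by_time]
```

Comparing the entries of `kappas` across time gives a first look at whether agreement rises as more information is presented; a formal test would need to account for the same raters and cases appearing at every time point.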

Xie, Yunlong; Chen, Zhen; Albert, Paul S (2013) A crossed random effects modeling approach for estimating diagnostic accuracy from ordinal ratings without a gold standard. Stat Med 32:3472-85
Tang, Liansheng Larry; Liu, Aiyi; Chen, Zhen et al. (2013) Nonparametric ROC summary statistics for correlated diagnostic marker data. Stat Med 32:2209-20
Chen, Zhen; Zhang, Bo; Albert, Paul S (2011) A joint modeling approach to data with informative cluster size: robustness to the cluster size model. Stat Med 30:1825-36
Shih, Joanna H; Albert, Paul S (2010) Modeling familial association of ages at onset of disease in the presence of competing risk. Biometrics 66:1012-23
Albert, Paul S (2009) Estimating diagnostic accuracy of multiple binary tests with an imperfect reference standard. Stat Med 28:780-97
Laiyemo, Adeyinka O; Murphy, Gwen; Albert, Paul S et al. (2008) Postpolypectomy colonoscopy surveillance guidelines: predictive accuracy for advanced adenoma at 4 years. Ann Intern Med 148:419-26