Methodology for evaluating biomarker studies lags far behind that for evaluating therapeutic and epidemiologic studies. For example, the notion of covariate adjustment is well established in therapeutic and epidemiologic research for proper evaluation of therapeutic and exposure effects. In biomarker research, however, covariate adjustment has not yet been defined. Similarly, epidemiology has a clear understanding of the attributes and limitations of matching controls to cases, but biomarker research does not. Factors that are correlated with the biomarker and/or disease are termed covariates. There are many examples, including subject characteristics (e.g., age is associated with PSA levels and with risk of prostate cancer in men), collection and processing factors (e.g., the biomarker may vary by study site in a multicenter study), and disease characteristics (e.g., the biomarker may vary with histology or stage of the cancer). We believe that without proper adjustment for covariates the receiver operating characteristic (ROC) curve for a biomarker can be biased. Moreover, comparisons between biomarkers can be invalid. Importantly, matching of controls to cases in study design does not obviate the need for covariate adjustment: bias and invalid comparisons of biomarkers can still result. We propose to develop an understanding of the various roles of covariates in biomarker evaluations and to develop simple techniques for including them in data analyses. We will address covariate adjustment in the evaluation of a biomarker's performance with the ROC curve (Aim 1), in making comparisons between biomarkers (Aim 2(i)), and in evaluating variations in the performance of a biomarker (Aim 2(ii)).
In Aim 3 we will undertake a detailed study of the attributes and limitations of matching controls to cases in study design. In many settings existing clinical factors or biomarkers are already predictive, and the goal in studying a new marker is to evaluate its added benefit. Methods for evaluating this incremental value of a marker will be studied in Aim 4. Phase 2 studies conducted by the Early Detection Research Network (EDRN) will provide context for our work. Programs written in Stata will be made available through the Stata archive, the EDRN, and a 'Diagnostics and Biomarkers Statistical Center' web site.
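
To make the claim about biased ROC curves concrete, here is a minimal sketch of how a covariate can distort a pooled analysis. It is written in Python rather than Stata, uses made-up toy data, and all function and variable names (`auc`, `adjusted_auc`, `records`) are hypothetical. The adjustment shown compares each case only with controls that share its covariate value (a placement-value style calculation). That is one common way to formalize covariate adjustment and is an assumption here, not necessarily the technique this project will develop.

```python
from collections import defaultdict

def auc(cases, controls):
    """Empirical AUC: P(case > control) + 0.5 * P(case == control)."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def adjusted_auc(records):
    """Covariate-adjusted AUC (illustrative formulation).

    Each case is ranked only against controls with the same covariate
    value, and the resulting proportions are averaged over cases.
    """
    controls_by_z = defaultdict(list)
    for marker, diseased, z in records:
        if not diseased:
            controls_by_z[z].append(marker)
    placements = []
    for marker, diseased, z in records:
        if diseased:
            ref = controls_by_z[z]  # controls sharing this case's covariate
            below = sum((marker > k) + 0.5 * (marker == k) for k in ref)
            placements.append(below / len(ref))
    return sum(placements) / len(placements)

# Toy data: (marker value, diseased?, study site). Hypothetical numbers.
# Within each site the marker does not separate cases from controls,
# but site B has higher marker values and a higher share of cases.
records = [
    (1.0, False, "A"), (2.0, False, "A"), (3.0, False, "A"), (4.0, False, "A"),
    (2.5, True, "A"),
    (12.5, False, "B"),
    (11.0, True, "B"), (12.0, True, "B"), (13.0, True, "B"), (14.0, True, "B"),
]

case_values = [m for m, d, _ in records if d]
control_values = [m for m, d, _ in records if not d]

pooled = auc(case_values, control_values)   # ignores study site
adjusted = adjusted_auc(records)            # compares within study site

print(f"pooled AUC:   {pooled:.2f}")
print(f"adjusted AUC: {adjusted:.2f}")
```

In this toy example the pooled AUC is 0.80 even though the marker is uninformative within either site, while the site-adjusted AUC is 0.50. The apparent performance in the pooled analysis comes entirely from the site effect.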

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section: Cancer Biomarkers Study Section (CBSS)
Program Officer: Wagner, Paul D
Fred Hutchinson Cancer Research Center
United States
Bansal, Aasthaa; Pepe, Margaret Sullivan (2013) When does combining markers improve classification performance and what are implications for practice? Stat Med 32:1877-92
Seymour, Christopher W; Cooke, Colin R; Wang, Zheyu et al. (2013) Improving risk classification of critical illness with biomarkers: a simulation study. J Crit Care 28:541-8
Pepe, Margaret S (2011) Problems with risk reclassification methods for evaluating prediction models. Am J Epidemiol 173:1327-35
Huang, Y; Pepe, M S (2010) Semiparametric methods for evaluating the covariate-specific predictiveness of continuous markers in matched case-control studies. J R Stat Soc Ser C Appl Stat 59:437-456
Morris, Daryl E; Pepe, Margaret Sullivan; Barlow, William E (2010) Contrasting two frameworks for ROC analysis of ordinal ratings. Med Decis Making 30:484-98
Pepe, Margaret S; Gu, Jessie W; Morris, Daryl E (2010) The potential of genes and other markers to inform about risk. Cancer Epidemiol Biomarkers Prev 19:655-65
Huang, Ying; Pepe, Margaret Sullivan (2009) Semiparametric methods for evaluating risk prediction markers in case-control studies. Biometrika 96:991-997