Conventional power and sample-size (P&SS) methodology for radiological imaging experiments fails to take into account imprecise pilot-study estimates of variability among radiologists, making it likely that studies will be unacceptably overpowered or underpowered. Overestimation of sample sizes results in unneeded radiologist reading time, disease-status verification, and possible risks to subjects. Underestimation of sample sizes results in inconclusive findings. Other problems with current P&SS methodology for multireader multicase (MRMC) studies are that it has been developed for only one study design, has undergone limited evaluation, and lacks free stand-alone software for implementing it. This proposal will significantly advance DBM/OR power and sample-size methodology by contributing a thoroughly validated P&SS methodology that effectively accounts for imprecise pilot reader-variance estimates, can be used with several designs, and can be implemented using free stand-alone software. The long-term goal is to provide a thorough statistical methodology appropriate for diagnostic radiological imaging research that accounts for both patient and reader variability. The objective of this application is to improve the P&SS aspect of this methodology by pursuing the following four specific aims: (1) Develop a realistic and interpretable model, emulating data from clinical studies, for generating ROC decision data with which to evaluate the P&SS methodology. (2) Validate a new approach to power and sample-size calculation, 'confidence-level P&SS,' that takes into account unreliable pilot-study variance estimates. (3) Extend the methodology to include other designs. (4) Develop user-friendly, free stand-alone software for implementing the methodology.
Aim 1 is necessary because current simulation models for radiological imaging data may not accurately reflect clinical studies.
Aim 2 will validate the proposed approach using the simulation model developed in Aim 1.
Aim 3 will extend the P&SS methodology to designs that include an additional factor (e.g., CAD, radiologist experience, or reading condition), covering factorial as well as nested designs (e.g., split-plot designs).
Aim 4 will ensure the essential uniform implementation and widespread use of the methodology. The proposed research will significantly advance P&SS methodology. The positive impact will be to accelerate the application of biomedical technologies as a result of (1) more efficient allocation of financial and human resources for research; (2) less frequent inconclusive underpowered experiments; and (3) the ability to compute sample-size estimates for several additional useful designs. The major innovation will be the development of P&SS methodology that takes into account the lack of precision in reader-variance estimates by allowing researchers to determine sample-size estimates that correspond not only to a specified power but also to a specified confidence level that defines the probability that the specified power will actually be realized.
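The idea of pairing a specified power with a specified confidence level can be illustrated with a small simulation. The sketch below is not the DBM/OR methodology described in this proposal: it assumes a simple two-group comparison of means with a normal approximation, a pilot variance estimate on 10 degrees of freedom, and a chi-square upper confidence bound on the variance as the adjustment. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def planned_n(var_est, delta, alpha=0.05, power=0.80):
    """Per-group n for a two-sided two-sample z-test, given a variance estimate."""
    z = STD_NORMAL.inv_cdf(1 - alpha / 2) + STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * z**2 * var_est / delta**2)


def true_power(n, true_var, delta, alpha=0.05):
    """Power actually realized with n per group (the negligible opposite tail is ignored)."""
    se = math.sqrt(2 * true_var / n)
    return STD_NORMAL.cdf(delta / se - STD_NORMAL.inv_cdf(1 - alpha / 2))


def chi2_draws(df, size, rng):
    """Chi-square(df) draws, generated as Gamma(shape=df/2, scale=2)."""
    return [rng.gammavariate(df / 2, 2.0) for _ in range(size)]


def fraction_adequately_powered(df, confidence=None, n_sim=20000, seed=1,
                                true_var=1.0, delta=0.5, target=0.80):
    """Fraction of simulated pilot studies whose planned n reaches the target power."""
    rng = random.Random(seed)
    inflate = 1.0
    if confidence is not None:
        # Assumed adjustment: replace the pilot variance by its upper confidence
        # bound, df * s^2 / q, where q is the lower (1 - confidence) chi-square
        # quantile (estimated here by Monte Carlo to stay within the stdlib).
        reference = sorted(chi2_draws(df, n_sim, rng))
        q = reference[int((1 - confidence) * n_sim)]
        inflate = df / q
    hits = 0
    for chi2 in chi2_draws(df, n_sim, rng):
        pilot_var = true_var * chi2 / df  # imprecise pilot estimate of the variance
        n = planned_n(inflate * pilot_var, delta, power=target)
        hits += true_power(n, true_var, delta) >= target
    return hits / n_sim


naive = fraction_adequately_powered(df=10)
adjusted = fraction_adequately_powered(df=10, confidence=0.90)
print(f"target power reached, pilot variance used as-is: {naive:.2f}")
print(f"target power reached, 90% confidence-level plan: {adjusted:.2f}")
```

Under these assumptions, plugging the pilot variance directly into the sample-size formula reaches the target power in fewer than half of the simulated studies, whereas planning with the upper confidence bound reaches it in roughly the stated proportion, at the cost of larger sample sizes.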

Public Health Relevance

By allowing radiology researchers to more precisely estimate the numbers of patients and radiologists needed for imaging experiments, the proposed research will result in more efficient utilization of financial and human resources for evaluating new imaging modalities. Thus the proposed research is relevant to the mission of the National Institute of Biomedical Imaging and Bioengineering, namely to improve health by leading the development and accelerating the application of biomedical technologies.

National Institutes of Health (NIH)
National Institute of Biomedical Imaging and Bioengineering (NIBIB)
Research Project (R01)
Study Section
Biomedical Imaging Technology Study Section (BMIT)
Program Officer
Luo, James
University of Iowa
Schools of Medicine
Iowa City
United States
Hillis, Stephen L (2018) Relationship between Roe and Metz simulation model for multireader diagnostic data and Obuchowski-Rockette model parameters. Stat Med 37:2067-2093
Hillis, Stephen L (2016) Equivalence of binormal likelihood-ratio and bi-chi-squared ROC curve models. Stat Med 35:2031-57
Hillis, Stephen L; Schartz, Kevin M (2015) Demonstration of Multi- and Single-Reader Sample Size Program for Diagnostic Studies software. Proc SPIE Int Soc Opt Eng 9416:
Gallas, Brandon D; Hillis, Stephen L (2014) Generalized Roe and Metz receiver operating characteristic model: analytic link between simulated decision scores and empirical AUC variances and covariances. J Med Imaging (Bellingham) 1:031006
Hillis, Stephen L (2014) A marginal-mean ANOVA approach for analyzing multireader multicase radiological imaging data. Stat Med 33:330-60
Hillis, Stephen L (2012) Simulation of unequal-variance binormal multireader ROC decision data: an extension of the Roe and Metz simulation model. Acad Radiol 19:1518-28
Obuchowski, Nancy A; Gallas, Brandon D; Hillis, Stephen L (2012) Multi-reader ROC studies with split-plot designs: a comparison of statistical methods. Acad Radiol 19:1508-17
Zanca, Federica; Hillis, Stephen L; Claus, Filip et al. (2012) Correlation of free-response and receiver-operating-characteristic area-under-the-curve estimates: results from independently conducted FROC/ROC studies in mammography. Med Phys 39:5917-29
Hillis, Stephen L; Metz, Charles E (2012) An analytic expression for the binormal partial area under the ROC curve. Acad Radiol 19:1491-8