Current sample size methods for multilevel and longitudinal data often give the wrong answer. Studies that are too small cannot achieve their scientific goals. Studies that are too large expose human research participants to unnecessary harm. The ethical codes that bind NIH-funded behavioral scientists doing research with human participants demand correct sample size selection. Flaws in current sample size methods are a critical barrier to progress. Behavioral scientists use community trials, longitudinal data, and multilevel models to compare diverse populations. Understanding addiction and gauging knowledge about head and neck cancer screening require measuring outcomes over time. The complex data yield complex variance patterns. Current software (such as Optimal Design) makes simplifying assumptions about the variance patterns, which may lead to the wrong sample size. No available software always gives the correct sample size analysis; the existing paradigm fails. There are three barriers to progress in designing multilevel and longitudinal studies: 1) inadequate tools to evaluate when current sample size methods fail; 2) flawed methods and software to select sample size; and 3) insufficient training to find correct sample sizes. The successful completion of two aims will remove the three barriers.
The aims are to 1) create the new sample size methods and software needed to accurately mirror multilevel and longitudinal models of normal, binary, and Poisson data common in studying health behavior in diverse populations, and 2) train behavioral scientists to use the new methods and software. Training will include template sample size analyses, tutorial publications, short courses, webinars, and online tutorials. Better sample size calculations will produce better designed studies. Better designed studies will speed ethical, efficient, and effective behavioral research. Behavioral research lies at the heart of preventing addiction and cancer. Correct sample size selection is not merely a statistical nicety. The new methods will directly improve the health of millions of Americans.

Public Health Relevance

The ethics and cost of human research demand accurate sample size selection. Current sample size selection methods can fail with multilevel and longitudinal data. We will develop new sample size tools for behavioral scientists conducting prevention research in communities with diverse populations.

National Institutes of Health (NIH)
National Institute of Dental & Craniofacial Research (NIDCR)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1-AARR-F (52))
Program Officer: Clark, David
University of Florida
Other Health Professions
Schools of Medicine
United States
Zhang, Xinrui; Muller, Keith E; Goodenow, Maureen M et al. (2018) Internal pilot design for balanced repeated measures. Stat Med 37:375-389
Ringham, Brandy M; Kreidler, Sarah M; Muller, Keith E et al. (2016) Multivariate test power approximations for balanced linear mixed models in studies with missing data. Stat Med 35:2921-37
Johnson, Jacqueline L; Kreidler, Sarah M; Catellier, Diane J et al. (2015) Recommendations for choosing an analysis method that controls Type I error for unbalanced cluster sample designs with Gaussian outcomes. Stat Med 34:3531-45
Brinton, John T; Ringham, Brandy M; Glueck, Deborah H (2015) An internal pilot design for prospective cancer screening trials with unknown disease prevalence. Trials 16:458
Guo, Yi; Pandis, Nikolaos (2015) Sample-size calculation for repeated-measures and longitudinal studies. Am J Orthod Dentofacial Orthop 147:146-9
Simpson, Sean L; Edwards, Lloyd J; Styner, Martin A et al. (2014) Separability tests for high-dimensional, low sample size multivariate repeated measures data. J Appl Stat 41:2450-2461
Munjal, Aarti; Sakhadeo, Uttara R; Muller, Keith E et al. (2014) GLIMMPSE Lite: calculating power and sample size on smartphone devices. PLoS One 9:e102082
Ringham, Brandy M; Alonzo, Todd A; Brinton, John T et al. (2014) Reducing decision errors in the paired comparison of the diagnostic accuracy of screening tests with Gaussian outcomes. BMC Med Res Methodol 14:37
Chi, Yueh-Yun; Gribbin, Matthew J; Johnson, Jacqueline L et al. (2014) Power calculation for overall hypothesis testing with high-dimensional commensurate outcomes. Stat Med 33:812-27
Andridge, Rebecca R; Shoben, Abigail B; Muller, Keith E et al. (2014) Analytic methods for individually randomized group treatment trials and group-randomized trials when subjects belong to multiple groups. Stat Med 33:2178-90

Showing the most recent 10 out of 18 publications