The overall goal of this proposal is to develop appropriate rank-based tests for clustered data when the cluster size is potentially informative, and to apply the resulting methods to various marginal comparisons (e.g., average condition of teeth before and after treatment) using existing dental database resources, specifically the Piedmont 65+ Dental Study and the Iowa Fluoride Study. Informative cluster size arises when the number of units in a cluster is random (non-constant) and correlated with the outcome of interest. In the context of dental data, all teeth belonging to an individual form a cluster. Since tooth loss in adults is correlated with two of the diseases we plan to study, namely periodontal disease and dental caries, the cluster sizes in the Piedmont data sets are potentially informative. Adapting a classical rank test to such situations is a methodological challenge. For example, the two-sample Wilcoxon rank sum test has difficulty maintaining the correct size (significance level) under informative clustering, even when it is adjusted for cluster dependence through an appropriate variance estimate. This proposal aims to develop proper classes of rank-based tests (and related R estimators) and to study their statistical properties for three classical problems adapted to marginal inference under cluster dependence with informative cluster size: the so-called one-sample location problem (Aim 1), the regression problem (Aim 2), and the association problem (Aim 3). In each of these problems, we will obtain a class of test statistics, based on general score functions, that maintain proper asymptotic size under informative cluster size. We will also study the properties of the related R estimates of marginal parameters. Multivariate extensions of the first two problems will also be considered (Aim 4).
Another significant component of the proposed research will be to extend these procedures to handle missing data where the missingness mechanism can be modeled using observable covariates (Aim 5). Finally, when the cluster size is not informative, as in the Iowa Fluoride Study, which comprises children only, we will be able to increase the power of our tests by incorporating cluster-specific weights in the construction of our test statistics (Aim 6).
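
To make the notion of informative cluster size concrete, the following minimal Python sketch simulates individuals whose number of remaining teeth shrinks as a latent disease severity grows, and then compares a pooled per-tooth average with an average in which each tooth is weighted by the inverse of its cluster size. The data-generating model, the function names, and the parameter values are illustrative assumptions only; they are not taken from the Piedmont or Iowa data, and the inverse-cluster-size weighting shown here is a generic device for targeting a typical unit from a typical cluster, not necessarily the test statistics developed under the Aims.

```python
import random


def pooled_mean(clusters):
    """Average over all units; clusters with many units dominate."""
    values = [y for cluster in clusters for y in cluster]
    return sum(values) / len(values)


def cluster_weighted_mean(clusters):
    """Average of cluster means; each unit effectively gets weight 1/n_i."""
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)


def simulate(num_people=2000, seed=0):
    """Hypothetical model: sicker people keep fewer teeth (assumption)."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(num_people):
        severity = rng.random()  # latent disease severity in [0, 1)
        # Cluster size decreases with severity, so it is informative.
        n_teeth = max(1, round(28 * (1 - 0.8 * severity)))
        # Per-tooth outcome is centered at the person's severity.
        teeth = [severity + rng.gauss(0, 0.1) for _ in range(n_teeth)]
        clusters.append(teeth)
    return clusters


if __name__ == "__main__":
    data = simulate()
    print("pooled per-tooth mean:  ", pooled_mean(data))
    print("cluster-weighted mean:  ", cluster_weighted_mean(data))
```

In this simulated setting the pooled average is pulled toward healthier individuals because they contribute more teeth, whereas the cluster-weighted average treats every individual equally. The same imbalance is what can distort the size of an unweighted rank test even after its variance is corrected for within-cluster dependence.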

Public Health Relevance

The proposed research will lead to novel methodological and theoretical developments in nonparametric (rank-based) tests and estimators for clustered data that will have a direct impact on the analysis of dental data. The results of the proposed research have the potential to transform the way clustered data are handled in practice. Dental researchers and practitioners will become more aware of the informative cluster size issue and will be able to employ robust methods, such as the ones developed here, that account for the non-ignorability of the cluster size. -cluster exchangeability remains an issue.

National Institutes of Health (NIH)
National Institute of Dental & Craniofacial Research (NIDCR)
Small Research Grants (R03)
Study Section
Special Emphasis Panel (ZDE1-LK (25))
Program Officer
Harris, Emily L
University of Louisville
Biostatistics & Other Math Sci
Schools of Public Health
United States
Lorenz, Douglas J; Datta, Somnath; Harkema, Susan J (2011) Marginal association measures for clustered data. Stat Med 30:3181-91