In delimited populations, such as nursing homes, day care centers, prisons, hospitals, and cruise ships, even serious outbreaks of illness generally produce small absolute numbers of cases. Moreover, these groups are often more susceptible to disease than the general population (e.g., Garibaldi et al. 1981, Nimri 1994, March et al. 2000). Outbreaks in these groups can also have a broad effect on the general population, because the groups act as disease reservoirs and lead to increased overall incidence. However, these limited populations cannot be monitored effectively using traditional statistical techniques, because observed incidence is sparse even under epidemic scenarios. The temporal progression of outbreaks and the social-contact-mediated dynamics within these smaller groups instead lend themselves directly to exact combinatorial methods. This project will formulate computational algorithms and develop convenient software that implements ten exact combinatorial statistical tests for real-time use by front-line and drop-in surveillance programs focusing on limited or fixed small populations. These tests are: (1) maximum number of cases, (2) linear discrete scan, (3) the visitors test, (4) range-scan, (5) longest run of empty cells, (6) empty cells, (7) variant max test, (8) extreme values, (9) binomial maximum, and (10) hypergeometric maximum. The tests will be formulated in terms of space-time units, in the sense of the Ederer-Myers-Mantel test, allowing generalizations that account for changes in population over time and across space while maintaining the exactness of the p-values. Although limited tables for a few of these tests have been published, no general algorithms have heretofore been described for any of these methods.
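To make the idea of an exact combinatorial p-value concrete, the following is a minimal sketch of the simplest form of test (1), the maximum number of cases. It is not the project's algorithm; it assumes a fixed population, so that each of n cases is equally likely to fall in any of k time cells, and it does not include the population adjustments described above. The function names `prob_max_at_most` and `max_cases_p_value` are hypothetical. The calculation uses rational arithmetic, so the result is exact rather than approximate.

```python
from fractions import Fraction
from math import factorial


def prob_max_at_most(n, k, c):
    """P(no cell holds more than c cases) when n cases fall independently
    and uniformly into k cells (assumed null model, fixed population).

    Uses the exponential generating function identity:
    the number of allocations with every cell <= c equals
    n! * [x^n] (sum_{j=0..c} x^j / j!)^k, and there are k^n allocations.
    """
    # Truncated series 1 + x + x^2/2! + ... + x^c/c! for a single cell.
    term = [Fraction(1, factorial(j)) for j in range(min(c, n) + 1)]
    # poly[t] holds the coefficient of x^t for the cells processed so far.
    poly = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(k):
        new = [Fraction(0)] * (n + 1)
        for t, a in enumerate(poly):
            if a == 0:
                continue
            for j, b in enumerate(term):
                if t + j > n:
                    break
                new[t + j] += a * b
        poly = new
    return poly[n] * factorial(n) / Fraction(k) ** n


def max_cases_p_value(n, k, observed_max):
    """Exact P(max cell count >= observed_max) under the uniform null."""
    return 1 - prob_max_at_most(n, k, observed_max - 1)


if __name__ == "__main__":
    # Hypothetical example: 8 cases over 12 monthly cells, 4 in one month.
    p = max_cases_p_value(8, 12, 4)
    print(p, float(p))
```

Because the result is a `Fraction`, it can be reported exactly or converted to a float for display. The cost grows roughly as k times n times c, which is small for the sparse counts this setting involves.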
During the Phase I project, feasibility was demonstrated by formulating computational algorithms for four of the ten tests, implementing them in software, and conducting a preliminary study of their sensitivity and power for detecting outbreaks in real and simulated data. The performance of the new algorithms was compared with that of a standard statistical technique that assumes a large sample size. In Phase II, computational algorithms will be developed for the remaining six exact statistics, and all ten will be implemented in a user-friendly software package. The software will be modular in design, allowing new methods to be incorporated as they are developed. More comprehensive sensitivity, specificity, and time-to-detection analyses will be conducted using Monte Carlo methods to generate outbreak scenarios with alternative clustering mechanisms. The results will lead to guidance on which methods are best suited to detecting particular types of outbreaks.
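The kind of Monte Carlo sensitivity analysis described above can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the project's simulation design: the outbreak is modeled as a single time cell with an elevated relative risk, the statistic is the maximum cell count, and, to keep the example self-contained, the critical value is estimated by simulation rather than taken from an exact distribution. All function and parameter names are hypothetical.

```python
import random
from collections import Counter


def simulate_max(n_cases, weights, rng):
    """Place n_cases into cells with the given relative weights and
    return the largest cell count."""
    cells = rng.choices(range(len(weights)), weights=weights, k=n_cases)
    return max(Counter(cells).values())


def critical_value(n_cases, n_cells, alpha, n_sims, rng):
    """Smallest m whose simulated null tail P(max >= m) is <= alpha."""
    null = [simulate_max(n_cases, [1.0] * n_cells, rng) for _ in range(n_sims)]
    for m in range(1, n_cases + 2):
        tail = sum(1 for x in null if x >= m) / n_sims
        if tail <= alpha:
            return m  # always reached: the tail is 0 at m = n_cases + 1


def estimate_power(n_cases, n_cells, relative_risk,
                   alpha=0.05, n_sims=2000, seed=0):
    """Estimated probability that the max-count test rejects when one
    cell carries relative_risk times the risk of every other cell."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    crit = critical_value(n_cases, n_cells, alpha, n_sims, rng)
    weights = [relative_risk] + [1.0] * (n_cells - 1)
    hits = sum(simulate_max(n_cases, weights, rng) >= crit
               for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical scenario: 10 cases over 12 cells, varying outbreak size.
    for rr in (1.0, 3.0, 10.0):
        print(rr, estimate_power(10, 12, rr))
```

Setting `relative_risk` to 1.0 gives an estimate of the false-alarm rate, which serves as a check on the simulated threshold. Other clustering mechanisms could be studied by changing how `weights` is constructed.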
This project will develop exact statistical methods and software for use by public health professionals to detect clusters of disease in temporal incidence data. Unlike traditional cluster detection methods, these exact methods are reliable when sample sizes are very small, or when background incidence rates are very low. They should therefore be useful in monitoring disease outbreaks, or other non-random patterns of events of health concern, amid sparse data associated with institutional settings such as nursing homes, schools, hospitals, prisons, cruise ships, or particular high-risk behavior groups.