Binary logistic regression and its extensions to unordered polytomous, ordered polytomous, and Poisson responses are among the most powerful tools available to the epidemiologist or applied biostatistician analyzing discrete biomedical data. The usual method of inference for such models is unconditional maximum likelihood. For large, well-balanced data sets, or for models with only a few parameters, this approach is satisfactory. However, unconditional maximum likelihood estimation can produce inconsistent point estimates, inaccurate p-values, and inaccurate confidence intervals for small or imbalanced data sets, and for data sets with a large number of parameters relative to the number of observations. Sometimes the method fails entirely because no estimates can be found that maximize the unconditional likelihood function.
A methodologically sound alternative that has none of the above drawbacks is the conditional approach. Here one estimates only the parameters of interest, eliminating the others from the likelihood function by conditioning on their sufficient statistics. The method is amenable to both exact and asymptotic inference; hence it produces reliable inferences no matter how small or imbalanced the data. Although the theoretical basis for conditional inference has been established since the time of R. A. Fisher, numerical algorithms that make conditional inference computationally feasible have been developed only in the past five years. They are published in technical journals not normally read by applied biostatisticians and epidemiologists. Software based on these new conditional methods is scarce, expensive, and difficult to develop.
There is a need for good educational materials and accompanying public domain software to popularize the conditional approach to logistic regression and its extensions. Without these materials the methods will remain of academic importance only, since they will not be accessible to the majority of statisticians and epidemiologists. Under this proposal, the modern methods of conditional inference for binary and categorical data would be made accessible to the general statistical and epidemiological communities in three ways: through the preparation of a detailed workbook suitable for self-study or classroom instruction; through the development of public domain software to accompany the workbook; and through the writing of expository papers on the subject, to be published in applied rather than theoretical journals.
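
The conditioning idea described in the abstract can be illustrated with a minimal sketch, assuming the simplest possible case: a logistic model with an intercept (the nuisance parameter) and one binary covariate with coefficient beta (the parameter of interest). Conditional on the total number of events, which is the sufficient statistic for the intercept, the number of events in the exposed group follows a noncentral hypergeometric distribution that no longer involves the intercept; at beta = 0 this reduces to Fisher's exact test. The function names and the data below are hypothetical and are not taken from the proposal or from any software it describes.

```python
from math import comb, exp


def conditional_pmf(n1, n0, s, beta=0.0):
    """Distribution of T (events among the n1 exposed subjects) given S = s total events.

    n0 is the number of unexposed subjects. The intercept does not appear:
    it has been eliminated by conditioning on its sufficient statistic S.
    """
    lo, hi = max(0, s - n0), min(n1, s)  # support of T given the margins
    weights = {
        t: comb(n1, t) * comb(n0, s - t) * exp(beta * t)
        for t in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}


def exact_p_value(n1, n0, s, t_obs):
    """Two-sided exact p-value for H0: beta = 0.

    Sums the conditional probabilities of all outcomes that are no more
    likely than the observed one (the convention of Fisher's exact test).
    """
    pmf = conditional_pmf(n1, n0, s, beta=0.0)
    p_obs = pmf[t_obs]
    return sum(p for p in pmf.values() if p <= p_obs * (1 + 1e-9))


# Hypothetical data: 5 of 5 exposed subjects and 0 of 5 unexposed subjects
# have the event. The data are completely separated, so the unconditional
# maximum likelihood estimate of beta does not exist, yet the exact
# conditional p-value is still well defined (here 2/252).
print(exact_p_value(n1=5, n0=5, s=5, t_obs=5))
```

The sketch enumerates the conditional distribution directly, which is feasible only for very small problems; the algorithms the abstract refers to address the general case with several covariates, where naive enumeration is impractical.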

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Research Project (R01)
Project #: 5R01CA061050-02
Application #: 2101827
Study Section: Special Emphasis Panel (SRC (53))
Project Start: 1993-08-01
Project End: 1996-07-31
Budget Start: 1994-08-01
Budget End: 1995-07-31
Support Year: 2
Fiscal Year: 1994
Total Cost:
Indirect Cost:
Name: Cytel Software Corporation
Department:
Type:
DUNS #: 183012277
City: Cambridge
State: MA
Country: United States
Zip Code: 02139
Corcoran, C; Ryan, L; Senchaudhuri, P et al. (2001) An exact trend test for correlated binary data. Biometrics 57:941-8
Mehta, C R (1994) The exact analysis of contingency tables in medical research. Stat Methods Med Res 3:135-56