Over the past decade, a variety of alternative computer-based modeling techniques have been introduced that show promise for the construction of clinical decision aids. These techniques include statistical regression approaches such as generalized additive modeling, classification tree induction methods such as ID3 or CART, and multi-layer neural networks. Logistic regression (LR) models are currently central to most probabilistic predictive clinical decision aids and are fundamental to comparative analyses of medical care based on risk-adjusted events. The newer techniques have been applied on a larger scale in the last few years and appear to have unique advantages in selected circumstances. Their successful use, however, depends on understanding their accuracy, performance, and model transportability. A formal assessment of these new techniques is proposed, with four specific aims: (1) to assess and compare the performance of different models and determine the factors that affect performance; (2) to develop automated computer-based procedures for exploratory model development for each method; (3) to develop hybrid models incorporating the strengths of each of the existing techniques; and (4) to determine the situations that restrict the transportability of these models.
These specific aims will be achieved in a three-stage project. In the first stage, four approaches will be pursued: (1) the mathematical properties of the different computational algorithms for the modeling techniques will be studied; (2) automated modeling procedures will be developed and utilized; (3) the factors that affect performance for each modeling technique will be explored; and (4) new hybrid techniques will be developed and assessed. In the second stage, the methods developed in the first stage will be used to create and test models that predict cardiovascular events on data from 15,000 patients in a prospective clinical trial. In the third stage, the factors that affect the generalizability and transportability of models to new datasets will be explored by repeated sampling and model construction on different subsets of the cardiovascular database, including separating the database into subsets from each of ten different hospitals. This work will broaden the understanding of these important modeling techniques and their potential contributions to clinical decision making, health policy research, and medical informatics. New modeling techniques that incorporate elements from different techniques might also be developed.
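The third-stage design, in which models are built on some hospitals and tested on others, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than drawn from the cardiovascular database, the two risk factors and their coefficients are hypothetical, logistic regression stands in for any of the modeling techniques, and the area under the ROC curve is used as one possible performance measure.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=100):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    """Predicted event probabilities for each row of X."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def auc(y, scores):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def make_synthetic(n_hospitals=10, n_per=100, seed=0):
    """Hypothetical data: two risk factors and a hospital-specific baseline risk."""
    rng = random.Random(seed)
    rows = []
    for h in range(n_hospitals):
        shift = rng.gauss(0.0, 0.5)  # assumed between-hospital variation
        for _ in range(n_per):
            x = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
            prob = sigmoid(-1.0 + shift + 1.2 * x[0] - 0.8 * x[1])
            rows.append((h, x, 1 if rng.random() < prob else 0))
    return rows

def leave_one_hospital_out(rows):
    """Train on all hospitals but one, then test on the held-out hospital."""
    results = {}
    for h in sorted({r[0] for r in rows}):
        train = [r for r in rows if r[0] != h]
        test = [r for r in rows if r[0] == h]
        w = fit_logistic([r[1] for r in train], [r[2] for r in train])
        results[h] = auc([r[2] for r in test], predict(w, [r[1] for r in test]))
    return results

results = leave_one_hospital_out(make_synthetic())
for h, a in results.items():
    print(f"hospital {h}: held-out AUC = {a:.3f}")
```

The spread of held-out performance across hospitals gives a rough indication of transportability; the same loop could be rerun with a classification tree or neural network in place of `fit_logistic` to compare techniques under identical splits.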
Terrin, Norma; Schmid, Christopher H; Griffith, John L et al. (2003) External validity of predictive models: a comparison of logistic regression, classification trees, and neural networks. J Clin Epidemiol 56:721-9
Schmid, C H; D'Agostino, R B; Griffith, J L et al. (1997) A logistic regression model when some events precede treatment: the effect of thrombolytic therapy for acute myocardial infarction on the risk of cardiac arrest. J Clin Epidemiol 50:1219-29