A New Class of Model Selection Criteria Based on Kullback's Symmetric Divergence

Joseph E. Cavanaugh, University of Missouri - Columbia

The selection of a statistical model from a collection of candidates can often be facilitated by the use of a model selection criterion, which evaluates a fitted model by assessing whether it offers an optimal balance between "goodness of fit" and parsimony. This research considers the development of model selection criteria based on Kullback's symmetric divergence measure. The symmetric divergence is related to Kullback's directed divergence, better known as the Kullback-Leibler information, which serves as the basis for the well-known Akaike information criterion and its subsequent variants. In the context of model selection, the symmetric and directed divergences can both be used to measure the discrepancy between the model that presumably generated the data and a fitted approximating model. It can be argued, however, that the symmetric divergence is more sensitive to deviations between these two models and therefore functions as a better discriminant. Consequently, an estimator of the symmetric divergence may serve as a more effective model selection criterion than an estimator of the directed divergence, provided that the former estimator is accurate enough to reflect the sensitivity of the targeted measure. This notion is the impetus for the research, which involves the development and investigation of a new class of model selection criteria based on estimation of the symmetric divergence.

Scientists who model phenomena are often faced with the dilemma of how to choose an appropriate model to characterize an underlying set of data. A model selection criterion is a measure that assigns a "score" to each model in a candidate collection, an index reflecting how well the associated model satisfies a certain optimality principle. These scores allow an analyst to simply and objectively choose a final model based on an evaluation of a potentially expansive class of candidates. Thus, model selection criteria provide an ideal means for the computer to occupy a central role as a decision maker in statistical investigations. The importance of this notion is discussed by Cheeseman and Oldford ("Selecting Models from Data: Artificial Intelligence and Statistics IV," Springer-Verlag Lecture Notes in Statistics, 89, page v): "...Computers will increasingly be required to draw robust inferences from data, sometimes very large quantities of data... And, because the scale of the problems arising from large computer databases quickly overwhelms the human analyst, it is desirable to have a computer assume as much of the role of analyst as possible."

In the future evolution of statistical methodologies and practices, model selection criteria will play an increasingly vital role. Thus, it is essential that continuing work be conducted in this area, pertaining not only to the evaluation and improvement of existing criteria, but also to the introduction and investigation of new criteria based on appealing statistical principles. This research focuses on the latter. The results of this investigation will have potential impact on many scientific areas, including engineering (e.g., image and signal processing), economics (e.g., econometric modeling), and computer science (e.g., artificial intelligence).
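
To make the two measures concrete: let f denote the density of the generating model and g_theta a fitted candidate density (this notation is illustrative and does not appear in the award abstract). Kullback's directed divergence (the Kullback-Leibler information) and Kullback's symmetric divergence are, respectively,

    I(f, g_\theta) = E_f[ \log \{ f(X) / g_\theta(X) \} ],
    J(f, g_\theta) = I(f, g_\theta) + I(g_\theta, f).

The Akaike information criterion, AIC = -2 log L(\hat{\theta}) + 2k with k the number of estimated parameters, serves as an approximately unbiased large-sample estimator of a variant of the expected directed divergence. Criteria targeting the expected symmetric divergence proceed in the same spirit but penalize complexity more heavily; a commonly cited form from this line of work is KIC = -2 log L(\hat{\theta}) + 3k, taken here as an illustrative assumption rather than a statement from the abstract. The following minimal sketch (not the author's implementation) scores nested polynomial regression candidates under both penalties:

    # Illustrative sketch only: compare an AIC score (penalty 2k) with a
    # KIC-style score (penalty 3k) across nested polynomial regression models.
    import numpy as np

    def gaussian_loglik(y, fitted):
        """Maximized Gaussian log-likelihood for a linear model."""
        n = len(y)
        sigma2 = np.sum((y - fitted) ** 2) / n      # MLE of the error variance
        return -0.5 * n * (np.log(2 * np.pi) + np.log(sigma2) + 1.0)

    def score_models(x, y, max_degree=5):
        """Return (degree, AIC, KIC-style score) for each polynomial candidate."""
        results = []
        for d in range(1, max_degree + 1):
            X = np.vander(x, d + 1)                 # design matrix for a degree-d fit
            beta, *_ = np.linalg.lstsq(X, y, rcond=None)
            loglik = gaussian_loglik(y, X @ beta)
            k = d + 2                               # d + 1 coefficients plus the error variance
            results.append((d, -2 * loglik + 2 * k, -2 * loglik + 3 * k))
        return results

    rng = np.random.default_rng(0)
    x = np.linspace(-2, 2, 60)
    y = 1.0 + 0.5 * x - 0.8 * x ** 2 + rng.normal(scale=0.4, size=x.size)
    for d, aic, kic in score_models(x, y):
        print(f"degree {d}: AIC = {aic:7.2f}   KIC = {kic:7.2f}")

Under these assumptions, the symmetric-divergence-based score penalizes each added parameter more heavily than AIC does, which is the practical consequence of targeting the more sensitive discrepancy measure.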

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 9704436
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 1997-07-01
Budget End: 2000-12-31
Support Year:
Fiscal Year: 1997
Total Cost: $70,120
Indirect Cost:
Name: University of Missouri-Columbia
Department:
Type:
DUNS #:
City: Columbia
State: MO
Country: United States
Zip Code: 65211