There have been significant developments in the areas of applied regression and classification over the past 10-15 years. Much of the impetus originally came from outside the field of statistics, from areas such as computer science, machine learning and neural networks. These disciplines have brought many fresh ideas to the table, a host of new and exciting models such as neural networks, as well as many interesting areas of application. As the dust settles, we find that these new ideas are best synthesized within a statistical framework, and have a natural place alongside traditional linear and nonlinear models. A key item in this research program is a research monograph with the working title THE ELEMENTS OF STATISTICAL LEARNING (with Jerome Friedman and Rob Tibshirani). This book develops a framework for describing and understanding the new regression and classification techniques from a statistical point of view, and for synthesizing them with existing methods. We strike a natural balance between the classical, well-tested linear and parametric models and the more exotic and adaptive techniques appropriate in data-rich scenarios. The research program includes the development of some new techniques for multiclass classification, each of which expands on existing techniques in novel ways.

Many important problems in data analysis and modeling focus on prediction: computer-assisted diagnosis of disease (e.g. reading digital mammograms), heart disease risk assessment, automatic reading of handwritten digits (e.g. zip codes on envelopes), and speech recognition, to name a few. This research program has two arms. The first is a monograph that synthesizes a collection of well-tested techniques from the many varied contributions, and explains them from a statistical point of view. The second arm is to develop some new techniques for prediction. All these new methods exploit the fast computing facilities we have available, and allow us to develop methods for prediction that would have been infeasible ten years ago.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 9803645
Program Officer: John Stufken
Project Start:
Project End:
Budget Start: 1998-07-15
Budget End: 2002-06-30
Support Year:
Fiscal Year: 1998
Total Cost: $199,074
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Palo Alto
State: CA
Country: United States
Zip Code: 94304