The proposed research will develop and evaluate variable selection procedures for multiple regression and related problems. Both the Bayesian and the frequentist points of view will be considered. From the Bayesian perspective, the focus will be on stochastic search variable selection methods which use a hierarchical mixture model to guide variable selection. These methods will include a procedure for fast Bayes variable selection to handle hundreds of variables at once, a procedure for variable selection in generalized linear models, a procedure for variable selection across exchangeable regressions, and a procedure for performing simultaneous variable selection and outlier removal. From the frequentist perspective, the focus will be on a new criterion for evaluating selection criteria called risk inflation. This research will entail the application and generalization of risk inflation to a broad set of model selection problems which includes generalized linear models, order selection procedures for time series models, and change-point selection for estimation in change-point problems. Risk inflation will also be used to gauge the bias of using variable selection procedures in conjunction with heuristic search methods. A new class of variable selection procedures which use adaptive dimensionality penalties will also be developed.

One of the main goals of modern statistical methods is to provide statistical models which relate input variables to outputs of interest. For example, such models are used to predict and explain annual crop production in agriculture, interest rates in business and economics, cancer incidence in medicine, and crime rates in sociology. A key component in building such models is the selection of input variables which contain strong predictive and explanatory information. This component is especially important because of the recent proliferation of large databases containing vast numbers of potential input variables. The proposed research will provide powerful new methods for selecting such input variables for a wide variety of model building situations.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 9404408
Program Officer: James E. Gentle
Budget Start: 1994-09-01
Budget End: 1997-08-31
Fiscal Year: 1994
Total Cost: $75,000
Name: University of Texas Austin
City: Austin
State: TX
Country: United States
Zip Code: 78712