This research develops computationally practical robust multivariate location and dispersion estimators, robust multiple linear regression estimators, and resistant dimension reduction estimators, along with the corresponding theory. Regression is the study of the conditional distribution of the response variable given a vector of predictor variables. Dimension reduction searches for a lower dimensional vector of predictors that carries all the information relevant to the regression. A 1D regression is a special case of dimension reduction and can be visualized in a plot of the estimated sufficient predictor versus the response. Many of the most widely used statistical procedures, including multiple linear regression and generalized linear models, are special cases of 1D regression. Robust estimators are needed because existing methods for dispersion, regression, and dimension reduction, such as ordinary least squares and sliced inverse regression, often perform poorly in the presence of outliers.
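The claim that ordinary least squares can perform poorly in the presence of outliers can be illustrated with a minimal sketch. The toy data, the function names `ols_fit` and `theil_sen_fit`, and the choice of the Theil-Sen estimator are all assumptions made for illustration; Theil-Sen is a standard textbook robust estimator used here as a stand-in, not one of the estimators developed in this project.

```python
from statistics import median


def ols_fit(x, y):
    """Ordinary least squares for simple linear regression.

    Returns (intercept, slope).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def theil_sen_fit(x, y):
    """Theil-Sen estimator (illustrative stand-in for a robust fit).

    The slope is the median of all pairwise slopes; the intercept is
    the median of y_i - slope * x_i. Returns (intercept, slope).
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(len(x))
        for j in range(i + 1, len(x))
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope


# Hypothetical toy data lying exactly on the line y = 2 + 3x.
x = [float(i) for i in range(10)]
y_clean = [2.0 + 3.0 * xi for xi in x]

# Simulate a single recording error: the last response 29.0 typed as 290.0.
y_bad = y_clean[:-1] + [290.0]

ols_clean = ols_fit(x, y_clean)
ols_bad = ols_fit(x, y_bad)
ts_bad = theil_sen_fit(x, y_bad)

print("OLS, clean data:        ", ols_clean)
print("OLS, one outlier:       ", ols_bad)
print("Theil-Sen, one outlier: ", ts_bad)
```

On this toy data a single mistyped response pulls the least squares slope from 3 to roughly 17.2, while the median-based fit still returns intercept 2 and slope 3. The example covers only a single predictor; it does not address the multivariate or dimension reduction settings described above.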

Statistics is the science of extracting useful information from data and is used by industry and government and in the fields of engineering, biological sciences, geological and environmental sciences, information technology, medicine, physical sciences, and the social sciences. Applications of the methods under investigation include biomedical research, predicting future observations based on previous data, and the analysis of economic and social data. Increasingly complex high dimensional data sets are being collected for scientific, social, and strategic purposes. These data sets tend to contain outliers, which are observations that differ from the bulk of the data. Typing and recording errors (e.g., a woman's weight recorded as 1000 or 10 pounds) may create outliers, but often there is no simple explanation for a group of observations that differs from the bulk of the data. Robust statistics combined with dimension reduction is perhaps the most promising technique for making the most widely used methods of statistics simultaneously easier and more effective. Hence the research is also likely to have a great impact on statistical education.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0600933
Program Officer: Gabor J. Szekely
Project Start:
Project End:
Budget Start: 2006-06-01
Budget End: 2009-05-31
Support Year:
Fiscal Year: 2006
Total Cost: $89,162
Indirect Cost:

Institution
Name: Southern Illinois University at Carbondale
Department:
Type:
DUNS #:
City: Carbondale
State: IL
Country: United States
Zip Code: 62901