This research considers the development of resistant procedures for visualizing regression data and of rigorous statistical theory for practical robust estimators. Regression is the study of the conditional distribution of the response variable given a vector of predictor variables. Many of the most widely used statistical procedures, including multiple linear regression and generalized linear models, are special cases of regression. Existing regression methods often make rather strong assumptions about the predictor distribution. If this distribution is skewed or if outliers are present, then regression graphics methodology such as ordinary least squares, sliced inverse regression, and principal Hessian directions may fail to give useful results. Of primary interest in this research is the question of how to visualize regression when the assumptions on the predictor distribution are violated. Previous research has shown that robust estimators can be useful for visualizing regression. Since robust estimators for which there is rigorous theory are generally impractical to compute, another focus of this research is to develop robust estimators that are both practical to compute and consistent.
This research will lead to a better understanding of the increasingly complex high-dimensional data sets collected for scientific, social, and strategic purposes. Many methods using regression have been developed, and applications include biomedical research, predicting future observations based on previous data, and the analysis of economic and social data. Robust statistics combined with regression graphics has the potential to produce simpler but more accurate models and diagnostics. New robust estimators are needed because robust methods that perform well on textbook-sized problems frequently fail when applied to the larger, high-dimensional data sets that occur in practice.