DMS 9705209 Serfling
In order to provide strengthened foundations for statistical analysis of multidimensional data, this research develops a general theory of statistical depth functions. General theory is developed which unifies and extends the few examples of depth functions presently in the literature. Based on the notion of center-outward ordering of multidimensional data points by "depth," corresponding notions of multidimensional location, spread, quantiles, ranks, and other traditional one-dimensional sample statistics are formulated and studied in this research. In this framework, statistics such as multidimensional L-statistics, rank statistics, and generalized forms of these statistics are investigated (extending previous work of the investigator for the one-dimensional case). Further, corresponding notions of "contours" are investigated. Statistics are developed which perform well overall with respect to robustness criteria (e.g., breakdown points), equivariance (relevant to the geometric structure), computational ease, conceptual consistency (with associated population notions), and theoretical tractability. Tools used, and further developed, for this research include functional analytic and U-statistic methods. Application contexts receiving special attention include robust and nonparametric regression and analysis of variance.
This research develops improved methods for analyzing multidimensional data. For a "cloud" of data points, one wishes to have a sense of where the "center" is located, for example. One can take the average of the points, or one can seek to define a "middle point" that is less influenced by the extremities of the data cloud. Similarly, other representative features of the data cloud need to be defined as analogues or extensions of concepts already in use for analysis of simple one-dimensional data. This research systematically treats such issues and develops new methods to be put into practice.
Such summary statistics enable the main features of a data cloud to be conveyed by means of a few easily interpretable numbers, thus enabling one to describe the data adequately within the confines of a conventional statistical report. Whereas visual methods lose their effectiveness for dimensions greater than three, the summarizing methods developed in this research apply equally well for any number of dimensions. Multidimensional data sets are arising increasingly in the very complex data-gathering activities now pursued in the various arenas of modern society and strategic national concern. This research leads to tools for simplification and reduction of this complexity. Further, this study develops methods for interpreting the data as but a sample from a target population about which one seeks to make statistical inferences. For example, the question of what exactly is being estimated by such data is addressed. Basic mathematical advances needed for development of these new statistical methods are also accomplished as part of this research.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 9705209
Program Officer: William B. Smith
Project Start:
Project End:
Budget Start: 1997-07-15
Budget End: 2001-06-30
Support Year:
Fiscal Year: 1997
Total Cost: $95,774
Indirect Cost:

Name: University of Texas at Dallas
Department:
Type:
DUNS #:
City: Richardson
State: TX
Country: United States
Zip Code: 75080