Nonparametric methodology is widely used in one and two dimensions, but not in high dimensions. This research proposal focuses on the mid-range dimensions in an attempt to foster a deeper understanding of the implications of the curse of dimensionality. Particular emphasis will be given to multivariate regression and density estimation problems, and to closely related applications. Anecdotal evidence suggests that a gap exists between the apparent successes of nonparametric methodology and the poor performance predicted by theory. We will examine new points of view, especially those related to adaptive estimation. Higher-quality estimation has often required the use of negative kernels, but recent research has shown that equivalent gains are possible in regions where the Hessian is indefinite, often in the tails, which dominate in higher dimensions. Other recent work suggests that cross-validation algorithms, which are considered of marginal practical value in one dimension, improve dramatically in the multivariate case. We have found that the many bandwidth selection algorithms cluster into two classes, and we propose to characterize and investigate these classes. Dealing with medium-dimensional data gives rise to many problems in data visualization, which we propose to investigate. Multivariate visualization requires aids and guides such as cognostics. We plan to extend our density estimation visualization capabilities to regression surfaces as well as to applications such as visual clustering and discrimination. We propose to extend univariate ideas of mode estimation and testing, based on the mode tree and simulation, to several dimensions. Algorithmic development for multiprocessor and parallel architectures will be briefly considered. Nonparametric methodology seems to work well in the hands of experts, and this research is designed not only to aid the expert but also to facilitate the use of the methodology by a wider audience.
The proposer has recently completed a book on multivariate density and regression estimation and related applications, focusing particularly on histograms and their logical extensions (Scott, 1992). Difficult theoretical problems with practical consequences still abound. However, widespread application reflects the general acceptance of nonparametric methodology. The growth in the field of scientific visualization provides fertile ground for these exploratory procedures. This project attempts to capitalize on existing investments in large databases by developing flexible techniques that extract the maximum amount of information and structure hidden in high-dimensional data.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 9306658
Program Officer: James E. Gentle
Project Start:
Project End:
Budget Start: 1993-05-15
Budget End: 1996-06-30
Support Year:
Fiscal Year: 1993
Total Cost: $130,500
Indirect Cost:
Name: Rice University
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77005