An important area of mathematical statistics is nonparametric curve estimation in its various forms. One growing approach to these complicated problems is to find asymptotically sufficient statistics that simplify them. These statistics can also be used to find asymptotically equivalent experiments that unify the approaches to inference and estimation for a number of different types of nonparametric problems. One class of nonparametric problems is the class of regression models where the mean of the data is a smooth function of the design points at which the data are observed. It is well known that these models can be thought of as discrete samples of a continuous signal-plus-white-noise model where, as the size of the sample increases, negligible information is lost about the mean function. Specifically, the increments of the continuous process observed by the white-noise model are asymptotically sufficient and have nearly the same distribution as the regression observations. This project finds asymptotically sufficient statistics in some nonparametric regression models that allow estimation of nuisance parameters such as the variance or covariance of the errors or the density of the randomly placed design points. These statistics consist of two parts: one corresponds to the increments of some process, and the second contains information about the value of the nuisance parameter. These asymptotically sufficient statistics lead to an asymptotically equivalent model that observes a continuous Gaussian process where the mean function is first transformed by an observed filter. These white-noise approximations provide a new and unifying approach to a number of nonparametric regression problems. Other nonparametric models, such as density estimation from independent data, have also been shown to be asymptotically equivalent to the white-noise model. The same technique of finding auxiliary estimators of nuisance parameters and then approximating the conditional distribution given these estimators can also be applied to the density estimation experiment. These sufficient statistics are a step toward finding an approximation to the density estimation experiment on two-dimensional sample spaces.
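
For readers who want the white-noise approximation in symbols, the following is a sketch of the simplest textbook case only: a fixed equispaced design on [0, 1] with independent Gaussian errors and a known variance. The notation (f, sigma, xi_i, W, n) is assumed here for illustration and does not come from the abstract, and the models with nuisance parameters studied in the project are more general.

```latex
% Regression experiment: n observations of a smooth mean function f
Y_i = f\!\left(\tfrac{i}{n}\right) + \sigma\,\xi_i,
\qquad \xi_i \overset{\text{iid}}{\sim} N(0,1), \quad i = 1,\dots,n.

% Signal-plus-white-noise experiment: a continuous process on [0,1]
dY(t) = f(t)\,dt + \frac{\sigma}{\sqrt{n}}\,dW(t), \qquad t \in [0,1],

% where W is a standard Brownian motion. The rescaled increments satisfy
n\left[\,Y\!\left(\tfrac{i}{n}\right) - Y\!\left(\tfrac{i-1}{n}\right)\right]
\sim N\!\left( n\int_{(i-1)/n}^{i/n} f(t)\,dt,\; \sigma^2 \right),

% and for smooth f the mean n\int_{(i-1)/n}^{i/n} f(t)\,dt is close to f(i/n),
% so each rescaled increment has nearly the same distribution as Y_i.
```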

This project seeks to find better techniques for estimating a signal in the presence of noise by constructing approximations that are appropriate for large sample sizes. The work will have an impact on the theory of nonparametric curve estimation, the basis for applications such as analyzing financial series or medical images, as well as a broad impact on research and education in mathematical statistics. In many areas of technology today, large sets of data that do not follow a typical pattern are difficult for scientists to analyze. Non-regular estimation problems like this arise in tasks such as filtering noisy medical images or signals, estimating distributions of plant species, or analyzing financial series. In this project, the investigator transforms the problem into a form that can be solved by already existing statistical tools. As a result, existing methods of analysis can be applied to a wider range of technological problems.

Project Report

This project mainly looked into approximations in statistical problems where there are a large number of observations. These approximations serve to simplify the decision-making process or to clarify the issues involved in the problems. A common statistical problem is to estimate the average value of a series of measurements taken over time when there is some kind of noise in the measurements and the average value is itself changing over time. The problem is more difficult when the noise values at different points in time are dependent, especially if this dependence reaches across long periods of time. This project sought to reformulate the problem into simpler pieces by computing important functions of the data (these could be considered estimators of properties of the unknown average function). Reorganizing the data in this way can simplify the problem without significant loss of information about the average. One application of these methods was to a biological experiment that looked at tissue from diseased mice. The hypothesis was that tissue samples from treated mice would react to stimuli in a smoother fashion than samples from untreated mice. The results of the tests on the tissue samples were traces of the responses over time, and these traces could be decomposed to form a measure of the "roughness" of the traces. We found a significant difference in the roughness measures, indicating that the treatment led to less roughness. The measure of roughness compared the dependence and variability of the traces when considered at different time scales. Rougher signals showed greater variability in a short stretch of time relative to the overall variability of the signal.
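
The report does not give the formula for its roughness measure. As a rough illustration of the idea in its last sentence (short-stretch variability relative to overall variability), here is a minimal sketch under stated assumptions: the function name `roughness`, the `lag` parameter, and the specific ratio used are hypothetical choices made for this example, not the investigator's method.

```python
import numpy as np


def roughness(trace, lag=1):
    """Hypothetical roughness score: short-scale vs. overall variability.

    Returns the mean squared increment at the given lag divided by twice
    the overall variance of the trace. Under this (assumed) definition,
    uncorrelated noise scores near 1 and a slowly varying signal scores
    near 0.
    """
    x = np.asarray(trace, dtype=float)
    if x.ndim != 1 or x.size <= lag:
        raise ValueError("trace must be 1-D and longer than lag")
    total_var = np.var(x)
    if total_var == 0.0:
        raise ValueError("trace is constant; roughness is undefined")
    # Differences between points `lag` steps apart capture variability
    # over a short stretch of time.
    increments = x[lag:] - x[:-lag]
    return float(np.mean(increments ** 2) / (2.0 * total_var))


if __name__ == "__main__":
    # Simulated traces only; these are not data from the experiment.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 500)
    smooth_trace = np.sin(2.0 * np.pi * t)
    rough_trace = smooth_trace + rng.normal(scale=0.5, size=t.size)
    for lag in (1, 5, 25):  # compare several time scales
        print(lag, roughness(smooth_trace, lag), roughness(rough_trace, lag))
```

Evaluating the score at several lags gives a crude view of how variability changes across time scales. A formal comparison between treated and untreated groups would still need a proper statistical test, which this sketch does not attempt.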

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0805481
Program Officer: Gabor J. Szekely
Budget Start: 2008-07-01
Budget End: 2011-06-30
Fiscal Year: 2008
Total Cost: $110,101
Name: University of California Santa Barbara
City: Santa Barbara
State: CA
Country: United States
Zip Code: 93106