Nonparametric and Semi-parametric Models for High Dimensional Data

The investigator will focus on statistical models, theory, algorithms and applications geared towards the analysis of high-dimensional and, in particular, functional data. Semiparametric methods are particularly appropriate for such data: usually little is known about their structure, yet a dimension reduction step is necessary to avoid the "curse of dimensionality". Dimension reduction in the form of projections, obtained by fitting single index or multiple index models or by truncating the number of terms included in an expansion of functional data, is therefore a major emphasis of this project. Another emphasis is on statistical methods that take into account that curve data are often random curves subject to individually varying time scales. This leads to models, theory, methodology and algorithms for time warping of functional data. Curve data are abundant in genetics, where dissemination of gene expression profiles is of highest interest, and also in the field of aging and mortality. The investigator will develop methods for functional regression, correlation, discriminant and cluster analysis. These methods will provide tools to establish relationships between random functions and allow classification of observed sample curves into distinct categories.

Large and increasingly complex data collected in scientific and other experimental and observational studies can often be viewed as curves or functions. Such data often contain valuable information about the time dynamics of physical and biological phenomena, and advanced statistical techniques are needed to extract it. For example, recordings of repeated cDNA expression data with genetic microarrays may contain valuable information about the dynamics of gene activation patterns and gene regulation. Other examples where such data play a major role concern the relationship between reproduction and aging, the dynamic structure of aging and longevity, or the relationship between various blood proteins that are recorded continuously. The investigator will develop statistical methods and models specifically designed for the analysis and interpretation of such data.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 0204869
Program Officer: Grace Yang
Project Start:
Project End:
Budget Start: 2002-08-01
Budget End: 2005-07-31
Support Year:
Fiscal Year: 2002
Total Cost: $158,070
Indirect Cost:
Name: University of California Davis
Department:
Type:
DUNS #:
City: Davis
State: CA
Country: United States
Zip Code: 95618