Innovative methodology for Functional Data Analysis facilitates improved data analysis for longitudinal studies, e-commerce online bidding, genomic studies, (bio)demography and many other problems in the social, biological and physical sciences. The proposed functional approaches provide highly flexible ways to characterize such data, and especially to study their time-dynamic aspects. The investigator extends the applicability of Functional Data Analysis to data structures that have not been widely considered within this framework. These include point processes, high-dimensional (large p, small n) data and sparsely observed stochastic processes as they occur in longitudinal and repeated measurements data. Especially for high-dimensional non-functional data and point process data, Functional Data Analysis approaches have the potential to lead to transformative rather than merely incremental improvements. The investigator develops flexible functional and varying-coefficient models for functional regression and correlation. Current modeling approaches are too restrictive to be broadly applicable, and more general models are needed. Similarly, the subarea of curve warping has seen much recent development, but many important open questions remain. The investigator combines theoretical analysis, simulations, and data applications to conduct this research and applies the methods to data from e-commerce, biodemography, longitudinal studies and gene expression.

The investigator develops statistical methodology that is immediately useful for the analysis of large and complex data in genomics, demography and biodemography. These new analysis tools, which belong to the field of Functional Data Analysis, are geared towards gaining a better understanding of time-dependent processes. Such processes include commonly observed phenomena such as growth, aging, bidding during an online auction, or repeated occurrences of a recurring incident such as an asthma attack. The methods developed by the investigator elucidate the underlying dynamics of such phenomena. In particular, applying these methods yields insights into the mechanisms of aging and longevity, the dynamics of online auctions, and other instances of e-commerce. The investigator extends the scope of these methods further so that, for example, improved prediction of specific risks becomes feasible based on a subject's recorded gene expression profile.

Project Report

New statistical methodology was developed in the areas of nonparametric statistics and functional data analysis. Functional data analysis is a rapidly expanding area within statistics. It serves the need to analyze new types of longitudinal or time-dynamic data, where one records one or several functions per subject or experimental unit. Specifically, the following results were achieved in this project. (a) Improved analysis of longitudinal studies, with applications across the sciences. (b) Analysis of data that involve many density or hazard functions, such as those encountered in life-table-based quantification of mortality or the survival of units. (c) Novel methodology for Empirical Dynamics, which aims to quantify patterns of time-dynamic changes such as price changes in online auctions. A closely related problem is the estimation of derivatives from sparse and unevenly spaced data, for which a solution was also found. (d) New methods for the analysis of very high-dimensional data, which are encountered in genomics and other areas where increasingly large amounts of data are generated. In such applications, the number of predictors for outcomes of interest can be very large, while the number of subjects for which predictors and outcomes are observed is small or modest. Current methods cannot overcome the problems that arise when predictors in such settings are highly correlated or when many predictors are simultaneously important. For such situations, new functional approaches were developed. (e) Methods for determining how changes in the shapes of predictor functions are associated with changes in outcome measures, by quantifying such changes in the form of functional gradients. (f) Several other statistical modeling approaches were brought under the umbrella of functional data analysis, demonstrating the strength of the functional approach.
The functional approach was shown to lead to better results than previous methods in common situations. The functional models and the methodology used to achieve these results were carefully studied in terms of their mathematical, computational and data-analytic properties. The new methodology was implemented in software, and the resulting MATLAB code "pace" is publicly available. Several graduate students were trained in both statistical methodology and collaborative applied research. Several students graduated with PhDs on topics in functional data analysis and related areas. The results of the research conducted in this project informed the curriculum of graduate classes. The newly developed methods were successfully applied in a number of interdisciplinary projects, leading to new scientific insights. The results were disseminated in publications and in invited talks at national and international conferences and at universities.
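
To make the derivative-estimation problem in item (c) concrete, the following is a minimal sketch of one generic technique, kernel-weighted local polynomial regression, applied to noisy observations at unevenly spaced time points. It is not the estimator developed in this project, whose details are not given in this report. The function name `local_poly_derivative`, the Gaussian kernel, the bandwidth value and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np


def local_poly_derivative(t_obs, y_obs, t0, bandwidth, degree=2):
    """Estimate f(t0) and f'(t0) from scattered observations (illustrative sketch).

    A polynomial in (t - t0) is fitted by weighted least squares, with
    Gaussian kernel weights that emphasize observations near t0. The
    intercept estimates f(t0) and the linear coefficient estimates f'(t0).
    """
    t_obs = np.asarray(t_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)

    u = t_obs - t0                                # centered time points
    w = np.exp(-0.5 * (u / bandwidth) ** 2)       # Gaussian kernel weights (assumed choice)

    # Design matrix with columns 1, u, u^2, ..., u^degree.
    X = np.vander(u, N=degree + 1, increasing=True)

    # Weighted least squares via row scaling by sqrt(weights).
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y_obs * sw, rcond=None)
    return coef[0], coef[1]


if __name__ == "__main__":
    # Simulated, unevenly spaced noisy observations of sin(2*pi*t).
    rng = np.random.default_rng(0)
    t = np.sort(rng.uniform(0.0, 1.0, size=40))
    y = np.sin(2.0 * np.pi * t) + rng.normal(scale=0.05, size=t.size)

    value, slope = local_poly_derivative(t, y, t0=0.5, bandwidth=0.1)
    print("estimated value at 0.5:", value)
    print("estimated derivative at 0.5:", slope)
    print("true derivative at 0.5:", 2.0 * np.pi * np.cos(np.pi))
```

In this sketch the bandwidth controls the trade-off between bias and variance. With very sparse data per subject, a single-curve fit of this kind becomes unstable, which is one reason methods that pool information across subjects are of interest.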

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0806199
Program Officer: Gabor J. Szekely
Budget Start: 2008-07-15
Budget End: 2011-06-30
Fiscal Year: 2008
Total Cost: $180,015
Name: University of California Davis
City: Davis
State: CA
Country: United States
Zip Code: 95618