Functional data analysis (FDA) deals with infinite-dimensional data in the form of random functions. Such data have become increasingly common as new technology makes it possible to record and store massive data sets. The field has gained much traction and research has accelerated, but many unsolved problems and new opportunities for research remain. This research comprises four projects that address: 1) an open problem regarding the choice of the domain of interest in a regression setting with a functional covariate and a scalar response, 2) implementation of the RKHS (reproducing kernel Hilbert space) approach for conventional functional linear models when the functional covariates are observed sparsely, 3) dynamic modeling for multivariate functional data, and 4) challenges in the analysis of functional snippet data, in which each subject is observed on a different interval that is much shorter than the domain of the functional data. The developed methods will be applied to various data with functional components to evaluate the effect of pollutants on lung cancer mortality and to explore the interaction of these pollutants. The proposed research thus has a direct impact on public health research. In addition, the proposed approaches for functional snippets have broad applications in accelerated longitudinal studies, which are common in the social and health sciences. Code for the developed algorithms will be integrated into fdapace, an existing R package on CRAN. The research findings will be incorporated into graduate curricula, undergraduate and graduate research projects, and short courses at workshops, and will be presented at professional meetings.

Project 1 is important for interpreting the influence of a functional covariate, yet to date there is no algorithm that can reliably identify the relevant domain, and the theory is incomplete. We propose to resolve these open problems through a new framework built on a dynamic RKHS approach. This has the potential to break new ground in the well-established field of RKHS. A weakness of the RKHS approach is that it has difficulty handling sparsely observed functional covariates. In Project 2, we propose a solution that imputes the incomplete functional covariates, and we will show that the regression coefficient function can be recovered from the imputed covariates. A new line of theory will be developed to deal with the approximation errors in the Karhunen-Loève expansion for functional data. These new results will facilitate future research that involves imputation for functional data. Project 3 aims at modeling the derivatives of multivariate functional data using the component processes as covariates. We propose a concurrent approach that avoids an ill-posed inverse problem and has the advantage of accommodating time lags of the predictor component processes. Project 4 deals with another open problem in FDA, the analysis of functional snippets. We propose two nonparametric approaches for functional snippets and will develop supporting theory. These approaches open a new frontier of research in FDA: once the covariance can be estimated accurately, existing FDA methods, such as principal component analysis, classification, or clustering, can be readily adapted for functional snippets.
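
To illustrate why covariance estimation is the central difficulty for functional snippets, the following minimal Python sketch simulates a snippet design and records which pairs of time points (s, t) are ever observed together by the same subject. It is an illustration only, not the method proposed in this award: the function name `snippet_pair_coverage`, the unit domain [0, 1], the uniformly distributed snippet start times, and all parameter values are assumptions made for the example.

```python
import numpy as np

def snippet_pair_coverage(n_subjects=200, snippet_len=0.2, n_grid=50, seed=0):
    """Mark which pairs (s, t) on a grid over [0, 1] are jointly observed.

    Hypothetical helper for illustration: each subject is observed only on
    one short interval [start, start + snippet_len] inside the domain.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_grid)
    covered = np.zeros((n_grid, n_grid), dtype=bool)
    # Assumption: snippet start times are uniform on [0, 1 - snippet_len].
    starts = rng.uniform(0.0, 1.0 - snippet_len, size=n_subjects)
    for a in starts:
        # Grid points falling inside this subject's snippet.
        seen = (grid >= a) & (grid <= a + snippet_len)
        # Every pair of points inside the same snippet is jointly observed.
        covered |= np.outer(seen, seen)
    return grid, covered

grid, covered = snippet_pair_coverage()
gap = np.abs(grid[:, None] - grid[None, :])
# Pairs further apart than the snippet length are never observed together,
# so the raw data carry no direct information on the covariance there.
assert not covered[gap > 0.2].any()
print(f"fraction of covariance surface with data: {covered.mean():.2f}")
```

Under these assumptions, data are available only in a band around the diagonal of the covariance surface, so any estimate outside that band has to come from extrapolation or additional structure.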

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 1914917
Program Officer: Huixia Wang
Project Start:
Project End:
Budget Start: 2019-07-01
Budget End: 2022-06-30
Support Year:
Fiscal Year: 2019
Total Cost: $130,764
Indirect Cost:
Name: University of California Davis
Department:
Type:
DUNS #:
City: Davis
State: CA
Country: United States
Zip Code: 95618