The proposed research concerns Principal Component Analysis (PCA) and two of its regularized variants, Sparse PCA and Functional PCA. The investigator proposes five research projects to address major challenges in these areas. PCA is a ubiquitous multivariate analysis technique for dimension reduction. Regularization of PCA becomes essential when the dimensionality is large, especially in the unconventional "High Dimension, Low Sample Size" (HDLSS) setting. HDLSS data have become common in many diverse fields, such as medical imaging and microarray analysis, but fall outside the domain of classical multivariate analysis. The first project studies the asymptotic properties of PCA and Sparse PCA when the number of variables is much larger than the sample size: regions of consistency and strong inconsistency will be characterized under various asymptotic frameworks. The results offer theoretical insight into how the output of PCA and Sparse PCA should be interpreted. The next four projects add innovative and valuable analysis tools to the field of functional data analysis. The first two of these develop two-way functional PCA techniques for non-standard data, including data with exponential family distributions and hazard rates. The last two deal with dependent functional data, such as spatial-temporal data and time series of curves.
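
As a rough illustration of what consistency and strong inconsistency of PCA can look like when the number of variables far exceeds the sample size, the sketch below simulates a single-spike covariance model and measures the angle between the first sample principal direction and the true one. The model (variance d**alpha along the first coordinate, unit variance elsewhere), the function name `first_pc_angle`, and the parameters `n`, `d`, and `alpha` are assumptions made here for illustration; they are not taken from the proposal and are not the investigator's method.

```python
import numpy as np


def first_pc_angle(n, d, alpha, seed=0):
    """Angle (degrees) between the first sample PC direction and the true one.

    Hypothetical single-spike model, assumed for illustration only:
    the first coordinate has variance d**alpha, all others have variance 1,
    so the true first principal direction is the first coordinate axis e_1.
    """
    rng = np.random.default_rng(seed)
    scales = np.ones(d)
    scales[0] = np.sqrt(float(d) ** alpha)
    X = rng.standard_normal((n, d)) * scales

    # Center the columns, then take the SVD; the rows of vt are the
    # sample principal directions, ordered by decreasing singular value.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)

    # |<v_hat, e_1>| is the absolute first entry of the leading direction.
    cos = min(abs(vt[0, 0]), 1.0)
    return float(np.degrees(np.arccos(cos)))


if __name__ == "__main__":
    n, d = 20, 2000  # far more variables than observations
    for alpha in (0.2, 1.5):
        angle = first_pc_angle(n, d, alpha)
        print(f"alpha={alpha}: angle to true direction = {angle:.1f} degrees")
```

In this toy setting, a strong spike (large `alpha`) should give a small angle, while a weak spike should give an angle close to 90 degrees, meaning the sample direction is nearly orthogonal to the truth. The proposal's asymptotic frameworks may differ from this simplified model.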

The proposed research is motivated by, and will have immediate beneficial impact on, cancer and neurological disorder research, medical imaging, and workforce management in labor-intensive service systems. In addition, the statistical methods developed will be useful in fields far beyond these motivating applications, such as demography, quantitative sociology, financial econometrics, and spatial-temporal modeling. Complementary activities are planned to disseminate the research results quickly and broadly. The problems addressed are also of broad interest to society, touching on pressing issues such as risky human behaviors, health care policy, social security planning, and worker productivity. The research activities are a natural venue for training graduate students in these exciting new research areas. The collaborative projects and the associated data have natural second uses in the classroom, and the proposed methods are well suited to an advanced research-topics course in statistics. Strong mentoring of junior female scientists and minority students is another important component of the proposal.

National Science Foundation (NSF)
Division of Mathematical Sciences (DMS)
Standard Grant (Standard)
Program Officer: Gabor J. Szekely
University of North Carolina Chapel Hill
Chapel Hill
United States