This research is in the related areas of Statistical Learning and Object Oriented Data Analysis (OODA). Major challenges in these areas are addressed by a team of researchers who bring different but complementary skill sets. Statistical Learning is widely recognized as a very active area of interdisciplinary research at the intersection of statistics, computer science, and optimization. Using state-of-the-art optimization tools, this research offers a set of new approaches for statistical learning, including new penalties for regularization, further theoretical and numerical developments of large margin classifiers, and nonparametric probability calibration for hard margin classifiers. In addition, new visualization and analytical tools are developed for "High Dimension-Low Sample Size" (HDLSS) data. Such development is extremely important because HDLSS data have become common in many diverse fields, such as medical imaging and microarray analysis of gene expression, yet lie outside the domain of classical multivariate statistical analysis. OODA is a generalization of the recently very productive area of Functional Data Analysis (FDA). In FDA, curves are the data points, and variation in a family of curves is the focus of analysis. OODA extends this notion to populations where the data points are more complex objects, such as images, shape representations, and even tree-structured objects. The proposed research offers a set of new tools for FDA, including exponential family functional principal components analysis (PCA), robust functional PCA, curve discrimination, and forecasting and dynamic updating of time series of curves. The proposed research will also advance OODA for data on smooth manifolds and for tree-structured objects.

The main application areas of the research are health and medicine and civil infrastructure. The research is motivated by, and will have beneficial impacts on, cancer research, medical imaging, call center management, and network traffic modeling. However, the developed statistical methods will be useful in fields far beyond those motivating this research, such as demography/epidemiology, financial economics, and spatial-temporal modeling. The team consists of a good mix of well-established senior researchers and junior researchers. Strong mentoring at several levels is an important component of this project. First, there is strong training of graduate students in these exciting new research areas, with the goal of giving them the background and skills needed to start their own research careers. Second, there is strong mentoring of the junior researchers by the more experienced members of the research team. In addition to working closely together on research projects, the junior researchers will learn the skills of advising PhD students through joint supervision with the more senior members. The team continues to disseminate the research results quickly and broadly through collaborative work, academic presentations, and journal publications. Web pages are created to enable quick access to user-friendly software implementations of the new methods, as well as technical reports and relevant references.

National Science Foundation (NSF)
Division of Mathematical Sciences (DMS)
Standard Grant (Standard)
Program Officer: Gabor J. Szekely
University of North Carolina Chapel Hill
Chapel Hill
United States