The principal investigator and his collaborators aim to develop effective data modeling paradigms that are simple enough to support statistical inference. Current scientific investigations and industrial applications produce and rely on massive, high-dimensional, and possibly corrupted data sets. A major focus of applied mathematicians and statisticians in this area has been quantitative geometric data modeling. To analyze large data sets effectively and obtain meaningful statistical inference, the underlying geometric models need to be sufficiently simple. The proposal suggests mathematical paradigms for such geometric models. It plans to develop rigorous mathematical theory for these paradigms, combined with carefully designed numerical strategies that address specific and important applications. Despite recent progress in this area, many directions remain open, and this research project addresses several of them.
More specifically, the proposal focuses on several important directions in geometric data modeling. One direction addresses robust single-subspace modeling under new paradigms of learning and computation that have hardly been explored in this setting. Another direction explores modeling data by multiple subspaces or manifolds, again with new paradigms and perspectives. The proposal also emphasizes specific paradigms of low-rank and sparse modeling motivated by important applications, such as approximate nearest subspace for object recognition, improved feature tracking, structure from motion in computer vision, and sparse modeling in the atmospheric sciences.