The objective of this proposal is to analyze existing methods, and to develop new ones, for multi-manifold learning, where the data are assumed to be composed of low-dimensional structures. The main focus will be on studying the potential of spectral methods for clustering and modeling low-dimensional surfaces embedded in high dimensions; on designing new spectral-based approaches to detecting low-dimensional objects in point clouds; and on analyzing popular manifold learning algorithms, especially in terms of robustness to outliers. A number of applications will be specifically addressed, for example, motion segmentation, structure from motion, classification of face images, segmentation of diffusion tensor images and the characterization of cosmological models in astrophysics.

Modern high-dimensional datasets often exhibit low-dimensional structures. Such situations arise in image processing, e.g., in target tracking, where a typical trajectory defines a curve through successive frames, and in medical imaging, e.g., in the examination of vascular networks. The study of the galaxy distribution, which contains filamentary and sheet-like structures, is another example. Traditional methods are known to be ineffective in this context, and the last decade has seen a massive amount of research aimed at improving on these classical tools. A number of approaches to multi-manifold modeling have been suggested, mostly by computer scientists and engineers. Applied mathematicians and statisticians have different perspectives to offer, and their contribution is needed, not only in designing new algorithms but also (and perhaps especially) in providing the theoretical foundations that researchers in the field have been asking for. The research in this proposal will address both issues, developing rigorous mathematical theory combined with carefully designed numerical strategies for specific applications, such as motion segmentation, structure from motion, classification of face images, medical imaging and the characterization of cosmological models in astrophysics. The PIs will share their findings through publications and software, all available online to the scientific and engineering communities.

Project Report

Current scientific investigations as well as industrial applications produce and rely on massive, high-dimensional and possibly corrupted data sets. It is often observed that many common high-dimensional data sets are intrinsically low-dimensional. The main focus of the work supported by this award was quantitative geometric data modeling, and in particular, modeling data by mixtures of manifolds or merely subspaces. Such modeling arises in important practical data sets, for example, in motion segmentation of videos and in cold dark matter structure formation. It often provides a more accurate framework than the common single-subspace modeling. The increase in modeling accuracy brings an increase in theoretical and computational complexity because there are many more quantities to account for, such as the number of subspaces or manifolds, the angles of intersection and the smoothness of the manifolds. The computational strategies for these recent models and their mathematical analysis are thus typically challenging.

A major achievement of the work supported by this award was the theoretical and numerical study of modeling data by multiple manifolds. The PI and his collaborators developed practical algorithms for this purpose and theoretically justified them (that is, they showed that the algorithms could correctly recover the underlying manifolds). The theory guided the development of the algorithms and indicated limitations of other proposals. It has implications for the analysis of the well-known spectral clustering algorithm in the case where the underlying clusters lie on low-dimensional manifolds. In particular, the theory indicated how to utilize local low-dimensional structure to improve the accuracy of the common spectral clustering in this case. The established theory can be easily utilized and extended by newcomers to this important area of research.

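
As a rough illustration of how local low-dimensional structure can enter spectral clustering, the sketch below weights a Gaussian affinity by the alignment of locally estimated tangent directions, so that nearby points on differently oriented manifolds are only weakly connected. It is a minimal sketch under stated assumptions and not the algorithm developed under this award: the function names, the neighborhood size `k`, the bandwidth `sigma`, the exponent `power` and the toy data (two perpendicular lines in the plane) are all hypothetical choices made for illustration.

```python
import numpy as np

def local_tangents(X, k):
    """Estimate a unit tangent direction at each point by PCA on its k nearest neighbors."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # squared pairwise distances
    tangents = np.empty_like(X)
    for i in range(X.shape[0]):
        idx = np.argsort(d2[i])[: k + 1]            # the point itself plus k neighbors
        nbrs = X[idx] - X[idx].mean(axis=0)         # center the neighborhood
        _, _, vt = np.linalg.svd(nbrs, full_matrices=False)
        tangents[i] = vt[0]                         # dominant local direction
    return tangents, d2

def tangent_aware_affinity(X, k=4, sigma=0.3, power=4):
    """Gaussian affinity damped by |cos(angle between local tangents)| ** power."""
    tangents, d2 = local_tangents(X, k)
    gauss = np.exp(-d2 / sigma**2)
    align = np.abs(tangents @ tangents.T) ** power
    W = gauss * align
    np.fill_diagonal(W, 0.0)
    return W

def spectral_embedding(W, dim=2):
    """Top eigenvectors of the symmetrically normalized affinity D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    S = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(S)                     # eigenvalues in ascending order
    return vecs[:, -dim:]

# Toy data (assumption): two perpendicular lines crossing at the origin.
t = np.linspace(-1.0, 1.0, 30)
A = np.column_stack([t, np.zeros_like(t)])
B = np.column_stack([np.zeros_like(t), t])
X = np.vstack([A, B])

W = tangent_aware_affinity(X)
emb = spectral_embedding(W, dim=2)   # rows would then be clustered, e.g. by k-means
```

Away from the intersection, the alignment factor suppresses affinities between the two lines even where the plain Gaussian affinity is not negligible. Close to the intersection the local PCA mixes both lines, which is one of the difficulties that the angle of intersection creates for such models.
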
The PI and his collaborators also proposed algorithms for modeling data by multiple subspaces, that is, when the manifolds are linear (this case appears in problems of motion segmentation, face clustering and imaging). They even proposed specialized algorithms for the case where the manifolds are quadratic (this case appears in problems of two-view geometry in computer vision). The PI and his collaborators also carefully studied the robustness of all of these algorithms to outliers. Here outliers are data points arising from a model different from that of the underlying manifolds. In addition to establishing substantial theoretical guarantees for robust multi-manifold modeling, the work supported by this award also identified various difficulties and obstacles of such modeling. In particular, it indicated the impossibility of a fully convex framework for it. However, the PI and his collaborators suggested a convex framework for robust modeling of data by a single subspace. They numerically demonstrated and theoretically justified the competitive speed and robustness (both to outliers and to noise) of this method.

In parallel to the rigorous mathematical efforts, the PI and his collaborators applied their methods to various problems in computer vision, machine learning and the atmospheric sciences. In particular, state-of-the-art results have been demonstrated on real data for motion segmentation, two-view geometry and face recognition. Graduate and undergraduate students were actively involved in these developments and have become young experts in these areas, benefiting our community with their knowledge and skills. All codes and preprints have been shared with the public.
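
To give a concrete sense of what a convex formulation of robust single-subspace modeling can look like, the sketch below minimizes the sum of norms ||Q x_i|| over symmetric matrices Q with trace 1 by iteratively reweighted least squares, and reads the subspace off the directions that Q shrinks the most. This is an illustrative sketch under stated assumptions and may differ from the method the report refers to: the function name, the iteration count `n_iter`, the regularization `delta` and the synthetic data are hypothetical.

```python
import numpy as np

def robust_subspace_irls(X, d, n_iter=50, delta=1e-10):
    """IRLS sketch for: minimize sum_i ||Q x_i|| over symmetric Q with trace(Q) = 1.

    X is an (n, D) array of centered data points. Returns a (D, d) orthonormal
    basis for the d directions along which the final Q is smallest.
    """
    D = X.shape[1]
    Q = np.eye(D) / D
    for _ in range(n_iter):
        # Current residual norms ||Q x_i||, floored at delta to avoid division by zero.
        resid = np.maximum(np.linalg.norm(X @ Q, axis=1), delta)
        # Weighted scatter matrix: sum_i x_i x_i^T / ||Q x_i||.
        C = (X / resid[:, None]).T @ X
        evals, evecs = np.linalg.eigh(C)            # ascending eigenvalues
        inv = 1.0 / evals
        # Weighted least-squares minimizer: Q = C^{-1} / trace(C^{-1}).
        Q = (evecs * (inv / inv.sum())) @ evecs.T
    # Largest eigenvalues of C correspond to the smallest eigenvalues of Q.
    return evecs[:, -d:]

# Synthetic data (assumption): 100 inliers on a line in R^3 plus 30 Gaussian outliers.
rng = np.random.default_rng(0)
direction = np.array([1.0, 2.0, 2.0]) / 3.0         # unit vector spanning the true line
inliers = rng.standard_normal((100, 1)) * direction
outliers = 2.0 * rng.standard_normal((30, 3))
X = np.vstack([inliers, outliers])

basis = robust_subspace_irls(X, d=1)
alignment = abs(float(basis[:, 0] @ direction))     # 1.0 means perfect recovery
```

Each iteration solves a weighted least-squares problem in closed form, which is why the update reduces to one eigendecomposition of a D x D matrix. The sketch assumes that the data span the ambient space, so that the weighted scatter matrix is invertible.
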

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0915064
Program Officer: Leland M. Jameson
Project Start:
Project End:
Budget Start: 2009-09-15
Budget End: 2013-08-31
Support Year:
Fiscal Year: 2009
Total Cost: $362,384
Indirect Cost:
Name: University of Minnesota Twin Cities
Department:
Type:
DUNS #:
City: Minneapolis
State: MN
Country: United States
Zip Code: 55455