The investigators will study the problem of computing informative, structure-preserving mappings between related data sets, especially data sets of a geometric character, such as images, videos, GPS traces, 3D scans, or microarray data. Unlike classical data fusion, where the goal is to find exact correspondences, the main emphasis here is on mapping data at different levels of abstraction and on incorporating uncertainty and ambiguity directly into the map formulation. This raises challenging issues at both the representational and the algorithmic levels. The aim is to develop efficient multi-resolution techniques through which data set relationships can be compactly encoded and compared, making data relationships into precise, tangible objects that can be explored just like the data themselves. The work will involve tools from a wide variety of mathematical disciplines, including aspects of differential geometry and topology, functional and harmonic analysis, algebraic and computational topology, machine learning and statistics, discrete and continuous optimization, scientific computing, and discrete algorithms and data structures.

Across all human activities, from science and engineering to medicine, commerce, and defense, massive data sets are becoming increasingly available and increasingly crucial to improved efficiency and enhanced functionality. As our data sets grow in size and number, they become increasingly interconnected and inter-related. This is both because data are captured about the same or related entities in the physical or virtual worlds (e.g., different images of the same building, different logs of the same user) and because the data themselves reflect an underlying reality that has symmetries, regularities, and other shared structure. It therefore makes sense to analyze data sets jointly, exploiting this shared structure to perform individual operations on each data set more effectively. Through the above mapping machinery it will be possible to organize data collections into (possibly overlapping) groups of related sets or parts thereof, separating what is common from what is variable within each group and across groups, and understanding the main axes of variability.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1228304
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2012-09-01
Budget End: 2017-08-31
Support Year:
Fiscal Year: 2012
Total Cost: $784,996
Indirect Cost:
Name: Stanford University
Department:
Type:
DUNS #:
City: Stanford
State: CA
Country: United States
Zip Code: 94305