In the modern data-centric era, one constantly faces the task of extracting intelligent summaries from diverse, complex data, and the task grows harder as the data grows more complex. Recent work has demonstrated that topological ideas and concepts can be powerful in extracting essential structures and features hidden in data. Although existing topological methods are promising, they are limited when analyzing data that carries complex maps (e.g., non-real-valued functions) and temporal components.

This project aims to broaden the scope of topological techniques and methodologies for analyzing such complex data. Specifically, the PIs will investigate novel methodologies and computational issues to address key challenges caused by complexity in modern data: the diverse properties/information associated with data, the dynamic/time-varying behavior of data, and the sheer volume of the data. The project will provide a theoretical understanding of a recently proposed framework, called Mapper, and its extension to a multiscale formulation. It will explore the use of persistence methodologies, including zigzag constructions, for understanding the time-varying aspects of data.

The geometric and topological ideas behind this project will bring new perspectives to the important field of computational data analysis. A successful algorithmic theory for summarizing and characterizing complex and dynamic data with topological techniques can provide a powerful tool for data exploration and analysis in various fields of science and engineering. The educational impact of this project lies in the strong synergy between mathematics and computer science, motivated by real applications. The PIs plan to incorporate findings from the project into the course materials they develop. The project will train graduate students to develop skills in mathematics and theoretical computer science, most notably in algorithms and topology, as well as in writing efficient software and applying it to the analysis of data sets. The combination of such skills is becoming increasingly essential in modern data science.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Type: Standard Grant (Standard)
Application #: 1526513
Program Officer: Joseph Maurice Rojas
Project Start:
Project End:
Budget Start: 2015-09-01
Budget End: 2019-08-31
Support Year:
Fiscal Year: 2015
Total Cost: $399,999
Indirect Cost:

Name: Ohio State University
Department:
Type:
DUNS #:
City: Columbus
State: OH
Country: United States
Zip Code: 43210