Many data sets undergo transformations either before, after, during the collection process which are most often corrected by smoothing, de-noising, or registration. Topological Data Analysis (TDA) makes it possible to extract robust signatures from data that are invariant to these transformations. However, in TDA, even a relatively small data set can easily blow up to fill memory when considering the space needed for edges, triangles, and other simplices that represent (or estimate) the connectivity of the underlying data. There is a need for fast, parallel, and distributed algorithms that partition the input in a principled way that leads to both strong theoretical guarantees and also practical performance. This project aims to fill this need by combining nested dissection, a well-studied theory from numerical analysis (NA) with persistent homology, the main technique of TDA, with the expectation that both fields will benefit. A potential broader impact of the project is to improve communication between researchers in TDA and NA. The PI will train both undergraduate and graduate researchers and incorporate advanced concepts in combinatorial topology in undergraduate and graduate curricula.

The specific aim is to develop a theory of nested dissection on simplicial complexes that allows for fast, parallel computation of persistent homology. A second specific aim is to develop and analyze new efficient data representations for the partially reduced simplicial complexes that appear in the course of persistent homology computation. This will involve a topological generalization of the Union-Find problem, a new approach to combining persistent homology and discrete Morse theory, an extension of the theory of nested dissection to work directly over quotient vector spaces (such as homology groups over fields), and also a separator theory that applies to filtrations or other situations where the underlying graph or complex is changing in time. Preliminary examples indicate that these extensions may produce significantly better theoretical guarantees. A third specific aim is to implement this approach, compare it with, and possibly integrate it with existing open source codes.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Type
Standard Grant (Standard)
Application #
1464379
Program Officer
Jack S. Snoeyink
Project Start
Project End
Budget Start
2015-02-01
Budget End
2018-01-31
Support Year
Fiscal Year
2014
Total Cost
$173,034
Indirect Cost
Name
University of Connecticut
Department
Type
DUNS #
City
Storrs
State
CT
Country
United States
Zip Code
06269