A central problem in data-driven scientific inquiry is how to interpret the structures that modern tools uncover in large data sets. The field of topological data analysis offers a potential solution through the language of homology, which encodes features of interest as cycles. In principle, these cycles can be located and understood as generators, which reveal explicit structure in the original data. However, fundamental mathematical and computational challenges have restricted most topological analyses to the study of persistence diagrams, numerical summaries that omit generators and thus sharply limit modeling power and explainability. This project draws on diverse ideas from the mathematical domains of algebraic topology, numerical linear algebra, category and order-lattice theory, computation, and combinatorics, and from the scientific and engineering domains of biological aggregations, brain, and medical imaging. It provides ample opportunities to train mathematical scientists to master these tools and to develop new, exploratory methods in STEM teaching and learning.
The ExHACT project will provide the tools needed to realize the full modeling and explanatory capability of generating cycles by creating a unified theoretical and computational tool set for persistent homological algebra. Recent results in matroid theory and exact categories (from which the project draws its name), developed by one of the PIs, provide the foundation for performing the necessary computations efficiently using well-understood matrix manipulations. The PIs will capitalize on this new opportunity by developing theoretical and computational tools for the study of persistent generators, induced homomorphisms of persistence modules, exact and spectral sequences, and relative persistent homology, among other methods. They will augment this computational core with data visualization capabilities that let scientists without extensive mathematical background explore homological data graphically and intuitively, and they will provide new tools for existing research groups that currently apply topological methods in materials science, neuroscience, biochemistry, and biological aggregations. ExHACT will also allow more experienced users to build custom functionality and workflows, providing a stable community platform for the development of new methodologies in topological data analysis. All software functionality will be extensively documented, with both technical specifications and detailed use cases, so that the full suite of computation and visualization capabilities is accessible to a broad audience.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.