Technical advances constantly give rise to new data collection mechanisms that allow scientists to ask questions of ever-increasing complexity, and the field of statistics is pressed to develop methodologies that can answer them. Many of these scientific challenges can be addressed in two steps. The first step reduces the size and complexity of the data through what is called feature extraction. The second step uses the extracted features to devise sound statistical methodologies. This project considers features that capture information about the shape of the data, and it develops and studies statistical and machine learning methodologies based on these features. The outcomes of this project are expected to impact statistics, machine learning, and various fields of application. Existing collaborations of the PI with biologists, animal scientists, and material scientists will also facilitate dissemination to these fields. This project will also provide training opportunities for both graduate and undergraduate students.

The contributions of this project will lie at the intersection of statistics and machine learning. They will enhance toolboxes and advance knowledge by deriving novel methodologies, by developing a deep understanding of these methodologies through relevant theoretical analyses, and by providing implementations of the methodologies that are useful for practical purposes. More specifically, this project will (i) study the role of the Hodge Laplacian in the analysis of higher-relational data, including its role in clustering; (ii) derive theoretical results about stabilizing statistics, such as Betti curves, that allow the construction of asymptotically exact bootstrap-based confidence intervals for functionals of persistent Betti curves; (iii) develop a novel, statistically enhanced persistence diagram through the use of multiple testing; and (iv) provide implementations that allow the exploration of finite-sample properties of the new methods.
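
As an illustration of the object named in aim (i), the following is a minimal sketch of the Hodge 1-Laplacian, L1 = B1^T B1 + B2 B2^T, on a small simplicial complex. The complex, the helper names (`boundary_1`, `boundary_2`, `rank`), and the use of exact rational arithmetic are assumptions made for this example and are not taken from the project. The dimension of the kernel of L1 equals the first Betti number, which counts the unfilled loops in the complex.

```python
from fractions import Fraction

# Toy complex (an assumption for illustration): four vertices, five edges,
# and one filled triangle (0, 1, 2). The loop 1-2-3 is left unfilled.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
triangles = [(0, 1, 2)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def rank(m):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            if f != 0:
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def boundary_1(vertices, edges):
    """Vertex-by-edge boundary matrix: edge (u, v) maps to v - u."""
    b = [[0] * len(edges) for _ in vertices]
    for j, (u, v) in enumerate(edges):
        b[vertices.index(u)][j] = -1
        b[vertices.index(v)][j] = 1
    return b

def boundary_2(edges, triangles):
    """Edge-by-triangle boundary matrix: (a, b, c) maps to (b,c) - (a,c) + (a,b)."""
    idx = {e: i for i, e in enumerate(edges)}
    b = [[0] * len(triangles) for _ in edges]
    for j, (p, q, s) in enumerate(triangles):
        b[idx[(q, s)]][j] = 1
        b[idx[(p, s)]][j] = -1
        b[idx[(p, q)]][j] = 1
    return b

B1 = boundary_1(vertices, edges)
B2 = boundary_2(edges, triangles)

# Hodge Laplacians: L0 is the usual graph Laplacian, L1 acts on edges.
L0 = matmul(B1, transpose(B1))
L1 = add(matmul(transpose(B1), B1), matmul(B2, transpose(B2)))

# Betti numbers are the kernel dimensions of the Hodge Laplacians.
betti_0 = len(vertices) - rank(L0)
betti_1 = len(edges) - rank(L1)
print(betti_0, betti_1)
```

For this toy complex the sketch reports one connected component and one loop. Exact fractions are used only to keep the rank computation free of floating-point tolerance choices; a numerical linear algebra library would be the natural choice for data of realistic size.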

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 2015575
Program Officer: Pena Edsel
Budget Start: 2020-07-01
Budget End: 2023-06-30
Fiscal Year: 2020
Total Cost: $200,000

Name: University of California Davis
City: Davis
State: CA
Country: United States
Zip Code: 95618