Diffusion (or inference) geometries provide tools for the organization of massive digital data sets. Like differential calculus, they are used to build global inference relations between objects by combining "infinitesimal" (linear) models. Harmonic analysis on such structures leads to multiscale folder-building paradigms, which in turn yield powerful tools for functional regression and the analysis of massive, complex data. These geometric methods also provide new insights into classical differential geometry, enabling explicit embedding theorems and coordinate systems for Riemannian manifolds.
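
As a rough illustration of how purely local models can be combined into a global relation between objects, the sketch below builds a random walk from local Gaussian affinities and compares points by how similarly the walk spreads from each of them after a few steps (a diffusion distance). This is a minimal sketch under stated assumptions, not the investigators' implementation: the function names, the Gaussian kernel, the scale `eps`, the number of steps `t`, and the toy points are all choices made here for illustration.

```python
import math


def gaussian_affinity(points, eps):
    """Local model: affinity decays quickly with squared distance (assumed kernel)."""
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / eps)
             for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Plain dense matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_distance(points, eps, t):
    """Pairwise diffusion distances after t steps of the random walk."""
    W = gaussian_affinity(points, eps)
    degrees = [sum(row) for row in W]
    total = sum(degrees)
    pi = [d / total for d in degrees]  # stationary distribution of the walk
    n = len(points)
    # One-step transition probabilities: row-normalized affinities.
    P = [[W[i][j] / degrees[i] for j in range(n)] for i in range(n)]
    Pt = P
    for _ in range(t - 1):
        Pt = mat_mul(Pt, P)  # t-step transition probabilities
    # Compare the t-step distributions started at i and j, weighted by 1/pi.
    return [[math.sqrt(sum((Pt[i][k] - Pt[j][k]) ** 2 / pi[k] for k in range(n)))
             for j in range(n)] for i in range(n)]


# Toy data (hypothetical): two well-separated clusters in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
D = diffusion_distance(points, eps=0.5, t=3)
print("within cluster:", D[0][1])
print("across clusters:", D[0][3])
```

In this toy setting, points in the same cluster end up close in diffusion distance while points in different clusters stay far apart, even though only local affinities were ever specified.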

The multiscale analytic methods lead to a systematic analysis tool for seemingly unstructured databases and enable the automatic generation of ontologies and data-induced languages. These methods are broadly applicable in medical diagnostics, in the organization and analysis of psychological questionnaires, and in all aspects of machine learning, from data mining to machine vision.

Project Report

Professors Coifman and Jones have developed an array of mathematical methods, based on harmonic analysis, to process complex high-dimensional data clouds. These methods involve building organizations, or geometries, of points, structuring the data while simultaneously building tools to process queries (or functions) on the data. The tools developed by the PIs are natural extensions of ideas in mathematical analysis, in which complex transformations or operators on functions need to be processed efficiently with complete precision.

Among the applications obtained in this project, Professor Jones, in collaboration with V. Rokhlin, developed new, faster numerical algorithms for solving nearest neighbor problems (typically a bottleneck in retrieving and organizing massive data). He has also proved (with Maggioni and Schul) that complex data can be effectively parameterized by eigenvectors of appropriate Laplace operators. These results are tied to theoretical advances in combinatorial geometry (see references).

Professor Coifman has developed data-agnostic analytic tools to process large arrays of data (databases) so as to reduce their complexity and to enable automated data processing. In particular, the tensor product system developed enables automated data learning. The methods are a version of harmonic analysis on tensor products, which enables systematic, data-driven learning of both the data and the features of the data. This applies directly to the organization of medical databases into demographic profiles, as well as to a corresponding grouping of patients' symptomatic features. Applications to the analysis of questionnaires, to the organization of neuronal cells in the brain, and to numerical analysis have also been developed.
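
The idea of parameterizing data by eigenvectors of a Laplace-type operator can be illustrated on a toy example. The sketch below is not the construction from the work with Maggioni and Schul; it is a minimal illustration, assuming a Gaussian kernel, a random-walk normalization, and plain power iteration, with all names and parameter values (`eps`, `iterations`, the scrambled arc of 20 points) chosen here for demonstration. Points sampled along an arc are supplied in scrambled order, and the first nontrivial eigenvector of the random-walk operator is used as a one-dimensional coordinate.

```python
import math
import random


def random_walk_matrix(points, eps):
    """Row-normalized Gaussian affinities and the walk's stationary distribution."""
    n = len(points)
    W = [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / eps)
          for j in range(n)] for i in range(n)]
    degrees = [sum(row) for row in W]
    P = [[W[i][j] / degrees[i] for j in range(n)] for i in range(n)]
    total = sum(degrees)
    pi = [d / total for d in degrees]
    return P, pi


def first_nontrivial_eigenvector(P, pi, iterations=500, seed=0):
    """Power iteration with the constant (trivial) eigenvector projected out.

    The walk is reversible, so its eigenvectors are orthogonal in the
    pi-weighted inner product; subtracting the pi-weighted mean removes
    the component along the constant eigenvector.
    """
    rng = random.Random(seed)
    n = len(P)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        v = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        mean = sum(p * x for p, x in zip(pi, v))
        v = [x - mean for x in v]
        norm = math.sqrt(sum(p * x * x for p, x in zip(pi, v)))
        v = [x / norm for x in v]
    return v


# Toy data (hypothetical): 20 points on a half circle, given in scrambled order.
n = 20
angles = [((7 * i) % n) * math.pi / (n - 1) for i in range(n)]
points = [(math.cos(a), math.sin(a)) for a in angles]

P, pi = random_walk_matrix(points, eps=0.1)
coord = first_nontrivial_eigenvector(P, pi)

# Sorting by the eigenvector coordinate should recover the order along the arc
# (possibly reversed, since an eigenvector is only defined up to sign).
by_coord = sorted(range(n), key=lambda i: coord[i])
by_angle = sorted(range(n), key=lambda i: angles[i])
print(by_coord == by_angle or by_coord == by_angle[::-1])
```

The eigenvector is computed from pairwise affinities alone, without access to the angles, yet it orders the points along the curve; this is the sense in which it acts as a data-driven coordinate in this small example.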

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 0802635
Program Officer: Bruce P. Palka
Budget Start: 2008-07-01
Budget End: 2012-06-30
Fiscal Year: 2008
Total Cost: $435,436
Name: Yale University
City: New Haven
State: CT
Country: United States
Zip Code: 06520