In a query-by-humming system, a user sings part of a melody and the computer identifies the songs that contain that melody. In a sign spotting system, a sign language user searches for occurrences of specific signs in a video database of sign language content. These are two example applications in which users want to retrieve the best-matching subsequences in a time-series database given a query sequence. This project is developing methods for efficient subsequence matching in large time-series databases using the popular Dynamic Time Warping (DTW) distance measure. Embeddings are being designed that partially convert the subsequence matching problem into the much more manageable problem of similarity search in a vector space. This conversion makes it possible to apply the full arsenal of vector indexing and metric indexing methods to speed up subsequence matching. The proposed methods will be applicable in a wide variety of time-series domains, including stock market modeling, seismic activity analysis, and sensor-based health monitoring. To showcase the commercial, social, and educational impact of the research, the project will produce three demonstration systems: a query-by-humming system, a handwritten document search-by-keyword system, and a sign spotting system. The results of the research are being integrated into these systems to achieve efficient retrieval in the presence of large amounts of data. The creation and dissemination of large, real-world datasets for these three systems will be an additional contribution of the project.
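
For readers unfamiliar with the underlying problem, the following is a minimal sketch of brute-force subsequence matching under DTW, the baseline computation that indexing and embedding methods aim to speed up. It is not the project's method. It assumes one-dimensional numeric sequences and an absolute-difference local cost, and the function name `subsequence_dtw` is hypothetical.

```python
def subsequence_dtw(query, series):
    """Find the subsequence of `series` that best matches `query` under DTW.

    Returns (cost, start, end), where series[start:end + 1] is the match.
    Illustrative assumptions: 1-D numeric sequences, absolute-difference
    local cost, and no warping-window constraint.
    """
    n, m = len(query), len(series)
    if n == 0 or m == 0:
        raise ValueError("query and series must be non-empty")

    # Row for query[0]: a match may begin at any position in the series,
    # so no cost is accumulated from earlier series elements.
    cost = [abs(query[0] - s) for s in series]
    start = list(range(m))  # start[j]: where the path ending at j began

    for i in range(1, n):
        new_cost = [0.0] * m
        new_start = [0] * m
        # Column 0 can only be reached by a vertical step.
        new_cost[0] = cost[0] + abs(query[i] - series[0])
        new_start[0] = start[0]
        for j in range(1, m):
            # Cheapest predecessor: diagonal, vertical, or horizontal step.
            best_cost, best_start = min(
                (cost[j - 1], start[j - 1]),
                (cost[j], start[j]),
                (new_cost[j - 1], new_start[j - 1]),
            )
            new_cost[j] = best_cost + abs(query[i] - series[j])
            new_start[j] = best_start
        cost, start = new_cost, new_start

    # The match may end anywhere, so take the cheapest cell in the last row.
    end = min(range(m), key=cost.__getitem__)
    return cost[end], start[end], end
```

This sketch takes time proportional to the query length times the series length for every query, which is the cost that motivates converting the problem into vector-space similarity search for large databases.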

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0812309
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2008
Total Cost: $224,998
Indirect Cost:
Name: Boston University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215