PI: Xindong Wu; Co-PIs: Abdullah N. Arslan and Xingquan Zhu U of Vermont & State Agricultural College

Pattern Matching with Wildcards and Length Constraints

This research defines a unique problem of pattern matching with wildcards and length constraints, and aims to design efficient algorithms for the problem. Given a pattern P and a text T, a substring S in T is a matching string of P if (1) the number of wildcards between each two consecutive pattern letters in S and (2) the length of S are both bounded by the user's specifications. The project seeks to find the maximum number of ``distinct'' occurrences of P in T. This is a complex problem that integrates both local constraints (in the form of gaps between consecutive pattern letters) and global length constraints in pattern matching.

The research team will start with an existing preliminary design and further investigate the pattern matching problem, by (1) exploring the time complexity of the problem, (2) designing new, efficient algorithms to deal with some special cases, and (3) applying these efficient algorithms in practical problems in text indexing, gene sequence analysis, network security and stream data mining.

Project Start
Project End
Budget Start
2005-07-15
Budget End
2008-06-30
Support Year
Fiscal Year
2005
Total Cost
$200,000
Indirect Cost
Name
University of Vermont & State Agricultural College
Department
Type
DUNS #
City
Burlington
State
VT
Country
United States
Zip Code
05405