Multi-pattern matching is the problem of finding whether and where a set of patterns appear in a data stream. An example of such a problem is scanning Internet traffic for malware signature patterns. Important considerations in multi-pattern matching are the amount of memory required to store the patterns and the speed with which the stream can be scanned through. Storage space can be reduced either by compressing the patterns or by approximately 'sketching' them. This project aims to advance this field by seeking ways to enable multi-pattern search with (1) minimum possible space, (2) fastest possible time, and (3) high accuracy. The project combines theoretical advances with practical implementations. Theoretical approaches involve sketching, streaming, hashing and succinct data structures. Practical goals include developing faster hardware-based implementations and cache-aware implementations.

Multi-pattern matching is central to applications such as computer virus detection, network intrusion detection and information extraction. It resides at the cross-section of several fields of computer science, including theory, networking, databases and bioinformatics. Recent development of tools in pattern matching such as succinct data structures and streaming algorithms make this an exciting time to advance the state-of-the-art in this field. Several research opportunities for undergraduate students will evolve from this project. Results will be disseminated through publications in conferences and journals.

Project Start
Project End
Budget Start
2012-09-01
Budget End
2016-08-31
Support Year
Fiscal Year
2012
Total Cost
$184,261
Indirect Cost
Name
Louisiana State University
Department
Type
DUNS #
City
Baton Rouge
State
LA
Country
United States
Zip Code
70803