Sequential pattern mining and structured pattern mining (e.g., over trees, lattices, and graphs) are important data mining tasks with broad applications, including the analysis of customer purchase sequences and Web page structures; the understanding of disease treatments, scientific experiments, and transportation and production processes; the detection of anomalies and unusual patterns; and the discovery of biomolecular sequences and structures.

This project performs a systematic investigation of the principles, algorithms, and applications of scalable sequential and structured pattern mining. It covers four issues: (1) development of highly scalable mining algorithms, including algorithms for mining max-patterns, closed patterns, and top-k patterns; (2) investigation of highly flexible mining methodologies, including constraint-based mining and the mining of multi-dimensional, multi-level sequential and structured patterns; (3) extension of the scope to cover clustering based on sequential or structured patterns; and (4) application of multi-dimensional, multi-level sequential or structured pattern mining to intrusion detection, Web mining, and other important applications. The expected result is a set of efficient, scalable, and flexible sequential and structured pattern mining methods for scientific and industrial applications.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0209199
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2002-08-15
Budget End: 2006-07-31
Support Year:
Fiscal Year: 2002
Total Cost: $165,000
Indirect Cost:
Name: University of Illinois Urbana-Champaign
Department:
Type:
DUNS #:
City: Champaign
State: IL
Country: United States
Zip Code: 61820