Sequential pattern mining and structured pattern mining (e.g., of trees, lattices, and graphs) are important data mining tasks with broad applications, including the analysis of customer purchase sequences and Web page structures; the understanding of disease treatments, scientific experiments, and transportation and production processes; the detection of anomalies and unusual patterns; and the discovery of bio-molecule sequences and structures.
This project performs a systematic investigation of the principles, algorithms, and applications of scalable sequential and structured pattern mining. It covers the following issues: (1) development of highly scalable mining algorithms, including the mining of max-patterns, closed patterns, and top-k patterns; (2) investigation of highly flexible mining methodologies, including constraint-based mining and the mining of multi-dimensional, multi-level sequential and structured patterns; (3) extension of the scope to cover clustering based on sequential or structured patterns; and (4) application of multi-dimensional, multi-level sequential or structured pattern mining to intrusion detection, Web mining, and other important domains. The work is expected to yield a set of efficient, scalable, and flexible sequential and structured pattern mining methods for scientific and industrial applications.