Network intrusion prevention systems (IPSes) play an important role in protecting computers against attacks originating from the network. Signature matching is a performance-critical operation that each IPS must perform: after storing a reassembled TCP-level byte stream or a field of a higher level protocol in a buffer, the IPS needs to decide whether it matches any of the signatures that describe known attacks. This project investigates methods for representing signatures that allow fast matching, require little memory, and can support complex signatures expressed as regular expressions.
Currently used representations, such as deterministic finite-state automata (DFAs) and non-deterministic finite-state automata (NFAs), have severe drawbacks. In general, DFAs enable fast matching but are space inefficient and NFAs are concise but are slow to match against. Solutions based on multiple DFAs have intermediary matching speeds and memory requirements. One of the core reasons why such solutions provide unfavorable speed versus memory tradeoffs is the state-space explosion problem.
This project focuses on a novel signature representation that neutralizes statespace explosion: extended finite automata (XFAs). XFAs extend DFAs with a few bytes of ""scratch memory"" used to store bits which record auxiliary information during the matching, or counters which record progress. When an accepting state is reached, the scratch memory is checked and a match is declared only if it holds suitable values. Preliminary results on signature sets from Snort and Cisco IPS show that compared to solutions using multiple DFAs, XFAs can be 10 times smaller and at the same time 5 times faster in software.