Software is a common target of attacks on the current computing and communications infrastructure. Software continues to be vulnerable to attacks that exploit obscure or misunderstood language and program features. Detection of these software exploits (also called "malware") will therefore be needed for the foreseeable future as one part of an effective defense. Virus checkers detect many known exploits and are now widely used, but attackers have adapted by obfuscating and mutating their code to evade them.

Such techniques make precise identification of malware extremely difficult. This project will use key characteristics of attack code for identification purposes. Important features of this approach include: advanced disassembly techniques; translation of code into an intermediate form that is more amenable to analysis and more resistant to obfuscation; static reconstruction of program control flow and data flow; and extraction of properties of interest, followed by analysis of these properties. The properties of interest include the characteristic behaviors of encryption and compression, and the system calls executed by the code. Rather than relying on exact matching of these properties for malware identification, approximate matching will be used. Static analysis will be the focus, to avoid the performance penalties of dynamic execution monitoring. The application of data mining to identify important malware features, and to construct high-level patterns or signatures in a completely automated way, will also be investigated. The method will additionally help identify relationships among malware samples, with applications to forensics, recovery of attack strategies, and identification of new classes of attacks (including zero-day attacks).
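
The idea of approximate rather than exact matching on extracted properties can be illustrated with a minimal sketch. The abstract does not specify a matching algorithm, so the choice here (Jaccard similarity over n-grams of a system-call sequence) is an assumption made purely for illustration, and the names `ngrams`, `jaccard`, `best_match`, the signature name `dropper_a`, and the call sequences are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Gram = Tuple[str, ...]


def ngrams(calls: List[str], n: int = 2) -> Set[Gram]:
    """Return the set of length-n windows over a system-call sequence."""
    if len(calls) < n:
        # Too short for a full window: treat the whole sequence as one gram.
        return {tuple(calls)} if calls else set()
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a: Set[Gram], b: Set[Gram]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def best_match(
    sample: List[str],
    signatures: Dict[str, List[str]],
    n: int = 2,
    threshold: float = 0.5,  # illustrative cutoff, not taken from the source
) -> Tuple[Optional[str], float]:
    """Return (signature name, score) for the closest signature,
    or (None, score) when no signature reaches the threshold."""
    sample_grams = ngrams(sample, n)
    best_name: Optional[str] = None
    best_score = 0.0
    for name, sig_calls in signatures.items():
        score = jaccard(sample_grams, ngrams(sig_calls, n))
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Hypothetical signature database: name -> statically extracted call sequence.
SIGNATURES = {
    "dropper_a": ["open", "read", "mprotect", "write", "connect", "send"],
}

# A variant with one inserted noise call still shares most 2-grams.
variant = ["open", "read", "getpid", "mprotect", "write", "connect", "send"]
# An unrelated sequence shares too few 2-grams to match.
unrelated = ["open", "read", "close"]

variant_result = best_match(variant, SIGNATURES)
unrelated_result = best_match(unrelated, SIGNATURES)
```

In this sketch an exact comparison of the two sequences would fail for the variant, while the set-overlap score stays above the cutoff. A real system would presumably match over richer properties (control flow, data flow, encryption or compression behavior) than a flat call list.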

The method will resist the introduction of noise and targeted evasion by malware writers, and will provide much better protection against polymorphic and metamorphic exploit code and new attack variations. A database of patterns and characteristics for known software exploits will be maintained and made public. Educational materials about malware detection will be developed and disseminated, and training of female researchers will continue to be a priority.

Project Start:
Project End:
Budget Start: 2008-09-01
Budget End: 2012-01-31
Support Year:
Fiscal Year: 2008
Total Cost: $268,510
Indirect Cost:
Name: North Carolina State University Raleigh
Department:
Type:
DUNS #:
City: Raleigh
State: NC
Country: United States
Zip Code: 27695