Malicious Code Fingerprinting: A Resilient and Efficient Method for Detecting Text-Based Malicious Code

IP.com Disclosure Number: IPCOM000238140D
Publication Date: 2014-Aug-05
Document File: 3 page(s) / 40K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a resilient and efficient method for detecting text-based malicious code for protecting networks using high-performance tools such as Intrusion Prevention Systems (IPS).

This is the abbreviated version, containing approximately 51% of the total text.

Problem Statement

Because text-based malicious code can be represented in many ways, detecting it at high speed with limited computing resources, and without compromising detection capabilities, is a challenge. Known solutions, including exact string searching and regular expression matching, have known drawbacks, such as increased use of CPU and/or memory resources as the number of strings to search for or regular expressions to match grows. New methods need to be developed to solve this problem.

Method Description

The Malicious Code Fingerprinting method is a resilient and efficient method for detecting text-based malicious code. It is resilient to minor changes to the malicious code: adding garbage code/data/markup or modifying unimportant parts of the code still results in a detection. The method can also be optimized to run with minimal memory usage, in streaming mode, and at very high speeds. It is therefore well suited to Intrusion Detection/Prevention Systems (IDS/IPS) and antivirus software, which are expected to run at very high speeds without compromising detection capabilities.

Method Details

The Malicious Code Fingerprinting method breaks text-based malicious code, such as JavaScript* and HTML, into tokens. For each token that is deemed "interesting", meaning the token is part of the detection, a value called the "fingerprint" is updated using the token's content and a small mathematical transformation such as a hash function. This is a type of rolling transformation, intended to identify malicious code interpolated within a greater amount of code. Every time the fingerprint is updated, it is matched against a list of known malicious fingerprints. If the resulting fingerprint is found in that list, a detection is triggered.

The described fingerprinting method is resilient against the addition, removal, or modification of any non-interesting tokens in the malicious code (e.g., addition of garbage code/data/markup, or modification of easily replaceable variable/function names). Equally important, the operations needed for fingerprinting are simple ones, such as hash functions, which can be easily optimized to run with minimal computing resources.
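
The steps described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the disclosure does not specify the tokenizer, the set of interesting tokens, the hash function, or the exact form of the rolling transformation. The sketch assumes a simple regular-expression tokenizer, a hand-picked set of interesting JavaScript-like tokens, CRC32 as the per-token hash, and a polynomial rolling hash over the last three interesting tokens. All names (INTERESTING, WINDOW, fingerprints, build_known, is_malicious) are hypothetical.

```python
import re
import zlib
from collections import deque

# Assumption: tokens considered "interesting" for detection. A real
# deployment would derive these from the malicious samples themselves.
INTERESTING = frozenset(
    {"document", "write", "unescape", "eval", "String", "fromCharCode"}
)

# Assumption: a crude tokenizer. Identifiers become one token each and every
# other non-space character becomes its own token.
TOKEN_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*|\S")

WINDOW = 3            # assumed number of interesting tokens per fingerprint
BASE = 257            # assumed polynomial base
MOD = (1 << 61) - 1   # assumed modulus (a Mersenne prime)


def token_hash(token):
    """Hash one token's content (CRC32 is an assumed choice)."""
    return zlib.crc32(token.encode("utf-8"))


def fingerprints(text, window=WINDOW):
    """Yield the rolling fingerprint after each interesting token,
    starting once `window` interesting tokens have been seen."""
    recent = deque()                   # hashes currently inside the window
    top = pow(BASE, window - 1, MOD)   # weight of the oldest hash
    fp = 0
    for match in TOKEN_RE.finditer(text):
        token = match.group()
        if token not in INTERESTING:
            continue                   # non-interesting tokens never affect fp
        h = token_hash(token)
        if len(recent) == window:
            # Slide the window: drop the oldest hash's contribution.
            fp = (fp - recent.popleft() * top) % MOD
        fp = (fp * BASE + h) % MOD
        recent.append(h)
        if len(recent) == window:
            yield fp


def build_known(samples):
    """Collect every window fingerprint of the given malicious samples."""
    known = set()
    for sample in samples:
        known.update(fingerprints(sample))
    return known


def is_malicious(text, known):
    """Report a detection as soon as any rolling fingerprint is known."""
    return any(fp in known for fp in fingerprints(text))


KNOWN = build_known(['document.write(unescape("%3Cscript%3E"))'])

# Same interesting tokens, surrounded by filler and unrelated code.
variant = ("var q = 1; String.fromCharCode(65); /* filler */ "
           "document . write ( unescape ( payload ) );")
benign = 'document.write("hello")'

print(is_malicious(variant, KNOWN))
print(is_malicious(benign, KNOWN))
```

In this sketch the variant is flagged because its last three interesting tokens match those of the sample, while the benign line never fills the window. The state kept between tokens is one integer plus a window of three hashes, which is what makes a streaming, low-memory implementation plausible. A limitation of this simplified tokenizer is that filler containing interesting tokens would change the fingerprint.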

Method Flow Chart

The flow chart below describes how the Malicious Code Fingerprinting...