Optimal Tri-State Classifier Using ROC Analysis

IP.com Disclosure Number: IPCOM000031662D
Original Publication Date: 2004-Oct-04
Included in the Prior Art Database: 2004-Oct-04

Publishing Venue

IBM

Abstract

Classifiers that in certain cases abstain from classification can significantly reduce the total misclassification cost. However, setting the parameters for such abstaining classifiers is often done in an ad-hoc manner. Our invention disclosure proposes a method to improve the selection of parameters of an abstaining classifier using ROC analysis. Our method builds an improved classifier in two models: a cost-based model and a bounded one. We also show an embodiment of our method and how it can be used with cross-validation and a ROC curve building algorithm to reduce the misclassification cost in classification systems.


A classifier is a function that assigns a class label from a finite and discrete set of labels to an instance. For example, given information about a news article, a classifier might specify the topic it deals with; given an image, a classifier might decide which letter of an alphabet it depicts. Classifiers can be constructed automatically or built by human experts. In this invention disclosure we focus on the former.

One way to build classifiers automatically is to use supervised machine learning techniques, which consist of two stages: a learning stage and a testing stage. In the learning stage, the classifier is given a set of instances together with correct class labels, which allows it to modify its internal structure. In the testing stage, the classifier is presented with unlabeled instances and assigns class labels to them.
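The two stages above can be sketched with a hypothetical one-dimensional threshold classifier (the class and its learning rule are illustrative assumptions, not part of the disclosure):

```python
class ThresholdClassifier:
    """Labels an instance '+' if its feature value exceeds a learned threshold."""

    def fit(self, instances, labels):
        # Learning stage: set the threshold to the midpoint between the
        # mean feature value of the positive and negative training instances.
        pos = [x for x, y in zip(instances, labels) if y == "+"]
        neg = [x for x, y in zip(instances, labels) if y == "-"]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, instances):
        # Testing stage: assign a label to each unlabeled instance.
        return ["+" if x > self.threshold else "-" for x in instances]

clf = ThresholdClassifier().fit([1.0, 2.0, 8.0, 9.0], ["-", "-", "+", "+"])
print(clf.predict([0.5, 7.5]))  # ['-', '+']
```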

A large subset of classification tasks are binary classification tasks, in which the classifier assigns one of two classes. In most of these cases the classifier tests an instance with respect to a particular property (e.g., an object is an enemy plane; an e-mail is a legitimate email or not; a security event is an intrusion). Binary classification might seem restrictive, but it is very well understood and has properties that make it particularly appealing. Without loss of generality, it is assumed that a binary classifier uses two classes, a positive one (denoted by +) and a negative one (denoted by -).

Table 1: Confusion and cost matrices for alert classification. The positive class (+) denotes true alerts and the negative class (-) denotes false alerts. The columns represent classes assigned by the system; the rows represent actual classes.

More formally, a classifier C is defined as a function C : i -> l, where i belongs to {i1, i2, . . . , in} and l belongs to {l1, l2, . . . , lk}. A binary classifier Cb is defined as Cb : i -> l, where i belongs to {i1, i2, . . . , in} and l belongs to {+, -}.
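These definitions can be written down as type aliases, purely as an illustration (the names and the example rule below are my own, not from the disclosure):

```python
from typing import Callable, Hashable

Instance = Hashable                       # i in {i1, i2, ..., in}
Label = str                               # l in {l1, l2, ..., lk}
Classifier = Callable[[Instance], Label]  # C : i -> l

# A binary classifier Cb restricts the label set to {"+", "-"}.
def example_binary_classifier(i: Instance) -> Label:
    # Hypothetical rule for illustration: instances with an even hash
    # are labeled positive.
    return "+" if hash(i) % 2 == 0 else "-"

print(example_binary_classifier(2))  # +
```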

The performance of a classifier using k class labels is described by means of a k × k dimensional matrix C, known as the confusion matrix. Rows in C represent actual class labels and columns represent class labels assigned by the classifier. Element C[i, j] represents the number of instances of class i classified as class j by the system. For a binary classifier Cb, the elements of the matrix are called true positives (tp), false negatives (fn), false positives (fp) and true negatives (tn) as shown in Table 1(a).
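Computing a confusion matrix from actual and assigned labels can be sketched as follows (a minimal implementation for the binary case; the helper name is an assumption):

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels=("+", "-")):
    """Element [i][j] counts instances of actual class i assigned class j."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

actual    = ["+", "+", "-", "-", "+"]
predicted = ["+", "-", "-", "+", "+"]
[[tp, fn], [fp, tn]] = confusion_matrix(actual, predicted)
print(tp, fn, fp, tn)  # 2 1 1 1
```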

Many real-world classification problems are asymmetrical, which means that some misclassifications are more "costly" than others. This can be modelled in a cost-sensitive setup by introducing a so-called cost matrix Co with identical meaning of rows and columns as the confusion matrix. The value of Co[i, j] represents the cost...
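In the standard cost-sensitive setup, the total misclassification cost is the sum over all cells of the confusion matrix weighted by the corresponding cost matrix entries. A minimal sketch, with cost values that are illustrative assumptions rather than figures from the disclosure:

```python
def total_cost(conf, cost):
    # Sum of C[i, j] * Co[i, j] over all cells: each count of instances of
    # actual class i assigned class j is weighted by its per-instance cost.
    return sum(conf[i][j] * cost[i][j]
               for i in range(len(conf)) for j in range(len(conf[i])))

conf = [[90, 10],   # actual +: tp, fn
        [ 5, 95]]   # actual -: fp, tn
cost = [[0, 10],    # missing a true alert (fn) is assumed expensive
        [1,  0]]    # a false alarm (fp) is assumed cheaper
print(total_cost(conf, cost))  # 105
```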