Procedure for Partitioning Classes Into Sub-Classes for Classification Purposes

IP.com Disclosure Number: IPCOM000037116D
Original Publication Date: 1989-Nov-01
Included in the Prior Art Database: 2005-Jan-29
Document File: 4 page(s) / 43K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

The problem of classification is to allocate a given parameter vector to one of several predetermined classes. The most prominent approach to classification is to construct linear classifiers which partition the parameter space into piecewise linear cells in 1-1 correspondence with the classes. A parameter vector may then be classified by determining in which cell it is located. This approach may not succeed, however, if any of the classes is non-homogeneous, i.e., if any of the classes is actually a super-class consisting of several unrelated sub-classes. This invention provides a means of determining how many sub-classes a given class should be decomposed into. The sub-class locations in the parameter space are then determined via standard cluster-analytic techniques.

This is the abbreviated version, containing approximately 40% of the total text.


The problem of classification is to allocate a given parameter vector to one of several predetermined classes. The most prominent approach to classification is to construct linear classifiers which partition the parameter space into piecewise linear cells in 1-1 correspondence with the classes. A parameter vector may then be classified by determining in which cell it is located. This approach may not succeed, however, if any of the classes is non-homogeneous, i.e., if any of the classes is actually a super-class consisting of several unrelated sub-classes. This invention provides a means of determining how many sub-classes a given class should be decomposed into. The sub-class locations in the parameter space are then determined via standard cluster-analytic techniques. Thereafter, the parameter vectors may be allotted to their nearest sub-class, so as to relabel the training data in terms of sub-classes rather than classes.
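
As a concrete illustration of this relabeling step (ours, not part of the original disclosure), the nearest-sub-class assignment can be sketched in Python as follows, assuming the sub-class centroids have already been located by the cluster analysis; all names here are hypothetical.

    import numpy as np

    def relabel(vectors, centroids):
        """Assign each parameter vector to its nearest sub-class centroid.

        vectors:   (N, D) array of training parameter vectors
        centroids: (K, D) array of sub-class centroids
        Returns an (N,) array of sub-class indices in [0, K).
        """
        # Squared Euclidean distance from every vector to every centroid.
        d2 = ((vectors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)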

To facilitate the subsequent construction of linear classifiers, sub-classes are sought which are distributed spherically in the parameter space. The convenience of spherical clusters stems from the fact that the covariance matrix of a spherically distributed population is proportional to the identity matrix, and hence simple-minded distance measures are adequate throughout.
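
The algebra behind this convenience is easy to check: when the covariance matrix is sigma^2 times the identity, the Mahalanobis distance collapses to a scaled Euclidean distance. A minimal numerical check (our illustration, not from the disclosure) follows.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(5)
    mu = rng.standard_normal(5)
    sigma2 = 4.0

    # With covariance sigma^2 * I, the inverse covariance is I / sigma^2 ...
    cov_inv = np.linalg.inv(sigma2 * np.eye(5))
    mahalanobis2 = (x - mu) @ cov_inv @ (x - mu)

    # ... so the quadratic form is just squared Euclidean distance over sigma^2.
    euclidean2 = ((x - mu) ** 2).sum() / sigma2
    assert np.isclose(mahalanobis2, euclidean2)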

The principle of the invention is as follows. For any given class, we wish to position a set of spheres in the parameter space such that any randomly selected parameter vector from the given class will be inside at least one sphere with high probability, and such that the total volume of the spheres is minimized. Sub-classes are then defined in 1-1 correspondence with the resulting spheres.
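
Both quantities in this objective are straightforward to estimate for any candidate set of spheres. The sketch below (our illustration; the disclosure's exact criterion lies in the omitted text) measures the coverage probability empirically on held-out vectors and computes the total volume from the standard formula for the volume of a D-dimensional ball, pi^(D/2) r^D / Gamma(D/2 + 1).

    import math
    import numpy as np

    def coverage_and_volume(vectors, centers, radii):
        """Fraction of vectors inside at least one sphere, plus total sphere volume.

        vectors: (N, D) held-out parameter vectors
        centers: (K, D) sphere centers; radii: (K,) sphere radii
        """
        dist = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        covered = (dist <= radii[None, :]).any(axis=1).mean()
        D = centers.shape[1]
        # Volume of a D-ball of radius r: pi^(D/2) * r^D / Gamma(D/2 + 1).
        unit_ball = math.pi ** (D / 2) / math.gamma(D / 2 + 1)
        return covered, unit_ball * (radii ** D).sum()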

We will assume the existence of some training data consisting of N parameter vectors drawn from the class under analysis. We will also assume that N has been chosen large enough so that an upper bound on the actual number of sub-classes is given by U = N/max(10, D), where D is the dimension of the parameter vectors. For larger values of U, the clusters may be too sparse and the results will become increasingly unreliable. A lower value of U than N/max(10, D) may be specified by the experimenter.
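
For example, with N = 2000 training vectors of dimension D = 50, the bound is U = 2000/max(10, 50) = 40 sub-classes; with D = 5 the divisor is clamped at 10, giving U = 200.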

The procedure begins by picking random seeds and clustering the training data with the K-means algorithm. The number of clusters is then reduced one at a time until only one cluster remains. Since there could be as many as U clusters, the procedure picks S > U initial seeds to increase the probability that every cluster is represented among the seeds. In addition, various statistics are computed while the number of clusters declines from S to U, so it is important that S be chosen large enough for this purpose. An illustrative sketch of this loop is given after Step 1 below. The procedure is as follows:

Step 1. If U has not been specified by the experimenter, set U = N/max(10, D). Set S = max(2U, U + R), where R = 20 is a reasonable value...
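
The remainder of the procedure lies in the omitted portion of the text. The following Python sketch is a rough reconstruction (ours, under stated assumptions) of Step 1 together with the clustering-and-reduction loop described above. The assumptions: integer division for U, since a count of sub-classes must be whole; scikit-learn's KMeans for the initial clustering; and merging the two clusters with the closest centroids at each reduction, since the disclosure's actual merge rule and decision statistics are not included here.

    import numpy as np
    from sklearn.cluster import KMeans

    def initial_counts(N, D, U=None, R=20):
        """Step 1: upper bound U on the number of sub-classes and seed count S."""
        if U is None:              # U not specified by the experimenter
            U = N // max(10, D)    # disclosure writes N/max(10, D); a count must be whole
        S = max(2 * U, U + R)      # R = 20 is the suggested value
        return U, S

    def cluster_and_reduce(X, S):
        """Cluster X into S clusters, then merge pairs down to one cluster.

        Returns snapshots (k, centers) for k = S, S-1, ..., 1.  The disclosure
        computes its decision statistics while k declines from S to U; those
        statistics appear in the omitted text.
        """
        km = KMeans(n_clusters=S, n_init=10).fit(X)
        centers = km.cluster_centers_.copy()
        sizes = np.bincount(km.labels_, minlength=S).astype(float)
        history = [(S, centers.copy())]
        while len(centers) > 1:
            # Find the pair of clusters whose centroids are closest
            # (an assumed merge rule, not the disclosure's).
            d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
            np.fill_diagonal(d, np.inf)
            i, j = np.unravel_index(np.argmin(d), d.shape)
            # Replace cluster i by the size-weighted mean, then drop cluster j.
            centers[i] = (sizes[i] * centers[i] + sizes[j] * centers[j]) / (sizes[i] + sizes[j])
            sizes[i] += sizes[j]
            centers = np.delete(centers, j, axis=0)
            sizes = np.delete(sizes, j)
            history.append((len(centers), centers.copy()))
        return history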