
Fast Runtime Implementation of K Nearest Neighbor Classifiers

IP.com Disclosure Number: IPCOM000110419D
Original Publication Date: 1992-Nov-01
Included in the Prior Art Database: 2005-Mar-25
Document File: 5 page(s) / 256K

Publishing Venue

IBM

Related People

Alexopoulos, S: AUTHOR [+3]

Abstract

This invention provides a fast runtime implementation of a K nearest neighbor [1,2] classifier for classification (e.g., pattern recognition) applications using fixed-point features (common in real applications). The runtime algorithm is feasible for both software and real-time hardware implementation, and the training algorithm has a feasible runtime as well.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 33% of the total text.

Fast Runtime Implementation of K Nearest Neighbor Classifiers

       This invention provides a fast runtime implementation of a
K nearest neighbor [1,2] classifier for classification (e.g., pattern
recognition) applications using fixed-point features (common in real
applications).  The runtime algorithm is feasible for both software
and real-time hardware implementation, and the training algorithm has
a feasible runtime as well.
NN to 2^n Tree Mapping Algorithm

      In what follows we will describe an algorithm that results in a
classifier (tree) that returns the index (1 to N) of the nearest
training vector, under the simplifying assumption that each training
vector is from a different class and that there are N classes.  In
the usual case the number of classes, M, is much less than the number
of training vectors, N.  We choose this simple case because it is
easier to describe.  Treating each training vector as a separate
class will result in a large tree, but it can be deterministically
pruned, as described in [3,4], to combine nodes of the same class and
so handle the usual situation where one has many more training
vectors than classes (M < N).
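
      As a point of reference for what the tree is meant to compute,
the following brute-force sketch (illustrative only, not part of the
original disclosure; the names nearest_index and training_vectors are
ours) returns the index (1 to N) of the nearest training vector for a
query feature vector.  The 2^n tree described below is intended to
produce the same answer by tree traversal rather than by N distance
computations.

    import numpy as np

    def nearest_index(query, training_vectors):
        # Squared Euclidean distance from the query to every training
        # vector; the nearest one defines the class in the simplified
        # "one training vector per class" case described above.
        q = np.asarray(query, dtype=float)
        t = np.asarray(training_vectors, dtype=float)
        d2 = np.sum((t - q) ** 2, axis=1)
        return int(np.argmin(d2)) + 1     # 1-based index, as in the text

    # Example: three training vectors in a two-feature space.
    train = [[0, 0], [10, 0], [0, 10]]
    print(nearest_index([2, 1], train))   # -> 1
    print(nearest_index([8, 3], train))   # -> 2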

      For each of the C(N,2) = N(N - 1)/2 pairs of training vectors
there is a hyperplane that divides the feature space into two
regions, each corresponding to all the points closer to one vector
than the other.  The set of all N(N - 1)/2 of these hyperplanes
divides feature space into a set of hyperpolyhedra, where all the
points in a given hyperpolyhedron are closest to the same training
vector.  We will refer to this set of hyperpolyhedra as the elemental
set corresponding to the training set.  In a Voronoi tessellation of
feature space corresponding to the training set, one such
hyperpolyhedron (referred to as a Voronoi region) is formed around
each training vector; it contains all the points that are closer to
that vector than to any other.  The set of all the hyperpolyhedra
formed by the above-mentioned separating hyperplanes constitutes a
finer subdivision of feature space than the Voronoi tessellation,
because each Voronoi region will, in general, be a union of several of
the elemental hyperpolyhedra.  See Fig. 1.
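
      To make the separating hyperplanes explicit: the points
equidistant from two training vectors u and v satisfy
||x - u||^2 = ||x - v||^2, which reduces to the linear equation
2(v - u) . x = ||v||^2 - ||u||^2.  The sketch below (ours, not from
the original text) tests which side of this bisecting hyperplane a
point falls on, and hence which of the two vectors it is nearer to.
Intersecting the half-spaces from the N - 1 hyperplanes involving a
given training vector yields its Voronoi region, while all C(N,2)
hyperplanes together cut out the finer elemental set.

    import numpy as np

    def bisector(u, v):
        # Hyperplane w . x = b of points equidistant from u and v.
        u = np.asarray(u, dtype=float)
        v = np.asarray(v, dtype=float)
        w = 2.0 * (v - u)
        b = np.dot(v, v) - np.dot(u, u)
        return w, b

    def nearer_of_pair(x, u, v):
        # w . x < b  is equivalent to  ||x - u|| < ||x - v||.
        w, b = bisector(u, v)
        return u if np.dot(w, np.asarray(x, dtype=float)) < b else v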

      Now consider the 2^n tree that will be used to implement the NN
classifier.  The 2^n tree is the n-dimensional generalization of the
quad tree, which corresponds to n = 2.  Corresponding to every node
in the tree there is a hypercube in feature space and the hypercubes
corresponding to the 2^n children of any node are the 2^n hypercubes
formed by bisecting every side of the hypercube corresponding to the
parent.  Imagine starting with the complete-tree representation of
the entire feature space, where the leaves correspond to the atomic
cells (or "buckets") of feature space.  Thus there are 2^(nh) leaves and

                            (Image Omitted)

total nodes, where n is the number of features and h is the number of
bits to which the features are encoded (e.g., 8).  To reiterate,
there is a hype...
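
      As a rough illustration of the 2^n-tree structure just described
(a sketch under our own assumptions, not the pruned classifier of the
disclosure), bisecting every side of a hypercube at each level
corresponds to reading off one more bit of each h-bit fixed-point
feature, so h levels of descent select one of the 2^(nh) atomic
buckets; the complete tree then has (2^(n(h+1)) - 1)/(2^n - 1) nodes
in total, which is presumably what the omitted expression above gives.

    def bucket_path(features, h):
        # Path of child indices from the root to the atomic bucket holding
        # an n-dimensional vector of h-bit fixed-point features.  At depth
        # d the child index is built from bit (h - 1 - d) of every feature,
        # since bisecting each side of the hypercube reads off the next
        # most significant bit of each coordinate.  Illustrative only.
        n = len(features)
        path = []
        for d in range(h):
            bit_pos = h - 1 - d
            child = 0
            for i, f in enumerate(features):
                child |= ((f >> bit_pos) & 1) << i
            path.append(child)       # one of 2**n children at this level
        return path                  # h steps -> one of 2**(n*h) buckets

    # Example: n = 2 features of h = 3 bits each (a quad tree of depth 3).
    print(bucket_path([0b101, 0b011], h=3))   # -> [1, 2, 3]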