Publishing Venue
The IP.com Prior Art Database
Abstract
A method of identifying pertinent transform coefficients in an object classification method for classifying objects in a digital image of the type that includes the steps of performing a frequency domain transformation on the digital image to produce transform coefficients, extracting a feature vector from the transform coefficients, and classifying objects based on the extracted feature vector, includes the steps of: collecting first and second samples of digital images of a first class of objects and a second class of objects, respectively; performing a frequency domain transform on the digital images in the first and second samples to produce first and second samples of transform coefficients; equalizing the band energy of the transform coefficients in the first and second samples of transform coefficients; computing a discriminant measure of the band energy equalized transform coefficients; and selecting a subset of transform coefficient locations based on their discriminant measure.
OBJECT CLASSIFICATION USING SPACE-FREQUENCY DOMAIN TRANSFORM COEFFICIENT FEATURE VECTORS
As discussed in Zhu's paper ("Fast Face Detection Using Subspace Discriminant Wavelet Features," IEEE CVPR 2000, pp. 636-641), computational complexity is a real problem for all object classifiers. Small window sizes using down-sampled image regions are often adopted to reduce feature dimensionality. Essentially, these classifiers rely on the low-frequency information of an object pattern because of the low-resolution representation associated with a small template size. High-frequency features such as edges are not fully captured by these classifiers, even though such features are the more prominent ones for characterizing an object. Another common way to further reduce computational complexity is a low-dimensional representation such as eigenspace decomposition or principal component analysis, which is, in general, not optimal for discrimination purposes. To capture adequate spatial and frequency information about objects in an image, Zhu's paper proposes a discriminant subspace algorithm that uses wavelet packet analysis to find a low-dimensional subspace with maximum discrimination information among all possible subspaces.
For a set of orthonormal basis functions that spans a wavelet packet space or subspace, Zhu defines a discriminant measure calculated by averaging the normalized squared distance in that space or subspace over all training signals from two classes. The discriminant measure defines the class separability associated with the wavelet packet space or subspace for the two classes. The maximum value of the discriminant measure is 1, because the wavelet packet decomposition is designed to preserve the original signal information when a full binary tree structure is used. If the decomposition is performed in a subspace, the discriminant measure is less than 1. By allowing the discriminant measure to be less than 1, the classification problem can be reduced to a subspace, or partial binary tree structure, thus reducing the dimensionality.
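The behavior described above can be illustrated with a minimal sketch. It is not Zhu's exact formulation, which is not reproduced here. It assumes one plausible reading of the definition: for every pair of training signals taken from the two classes, compute the fraction of their squared distance that survives in the kept sub-bands, then average over all pairs. The Haar filter, the one-dimensional signals, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import product


def haar_step(signal):
    """One level of the orthonormal Haar transform: (low-pass, high-pass) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    high = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return low, high


def wavelet_packet(signal, levels):
    """Full binary-tree wavelet packet decomposition; returns the leaf sub-bands.

    Band 0 is the sub-band produced by low-pass filtering at every level.
    """
    bands = [list(signal)]
    for _ in range(levels):
        bands = [half for band in bands for half in haar_step(band)]
    return bands


def discriminant_measure(class_a, class_b, levels, kept_bands):
    """Average fraction of cross-class squared distance retained in kept_bands.

    Assumed reading of the measure: normalized squared distance, averaged
    over all pairs (x, y) with x from class_a and y from class_b.
    """
    total, count = 0.0, 0
    for x, y in product(class_a, class_b):
        diff = [xi - yi for xi, yi in zip(x, y)]
        full = sum(d * d for d in diff)
        if full == 0.0:
            continue  # identical signals carry no distance to normalize
        # The transform is linear, so transforming the difference is
        # equivalent to differencing the transformed signals.
        bands = wavelet_packet(diff, levels)
        kept = sum(c * c for b in kept_bands for c in bands[b])
        total += kept / full
        count += 1
    return total / count


# Toy training signals (made up): two classes that differ mainly in level.
class_a = [[5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1, 5.0],
           [5.3, 5.1, 5.0, 5.2, 4.9, 5.0, 5.2, 5.1]]
class_b = [[1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.1, 1.0],
           [1.1, 0.9, 1.0, 1.3, 1.2, 1.0, 0.9, 1.1]]

full = discriminant_measure(class_a, class_b, 2, [0, 1, 2, 3])  # full tree
low_only = discriminant_measure(class_a, class_b, 2, [0])       # subspace
print(full, low_only)
```

Because the Haar basis is orthonormal, keeping every leaf of the full tree preserves the squared distance, so the measure equals 1 up to rounding. Keeping only a subset of sub-bands can only lower it.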
As reported in Zhu's paper, by giving up 5% of the discriminant power, the dimensionality can be reduced by 78%. The remaining 23% of discriminant information, which retains 95% of the discriminant power, comes from the locations containing most of the energy in the wavelet packet space. Examining the reported results reveals that, when image data are used, the preserved discriminant information is contributed mostly by a few subspaces (sub-bands) associated with low-pass filter operations. This is understandable, since for most images the energy is concentrated in low-frequency signals.
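The trade-off between discriminant power and dimensionality can be sketched as a simple greedy selection. The example below is illustrative only: the per-sub-band contributions are made-up numbers, not Zhu's reported results, and the function name `select_bands` and the greedy rule are assumptions rather than the method used in the paper.

```python
def select_bands(contributions, retain=0.95):
    """Keep the largest contributions until `retain` of the total is covered.

    Returns the kept band indices and the fraction of the total they cover.
    """
    total = sum(contributions)
    order = sorted(range(len(contributions)),
                   key=lambda i: contributions[i], reverse=True)
    kept, covered = [], 0.0
    for i in order:
        if covered >= retain * total:
            break
        kept.append(i)
        covered += contributions[i]
    return kept, covered / total


# Hypothetical share of the discriminant measure per sub-band, with the
# low-pass sub-band (index 0) dominating, as is typical for image energy.
contributions = [0.80, 0.10, 0.06, 0.02, 0.01, 0.005, 0.003, 0.002]

kept, fraction = select_bands(contributions, retain=0.95)
reduction = 1.0 - len(kept) / len(contributions)
print(kept, fraction, reduction)
```

With these made-up numbers, three of the eight sub-bands already cover the 95% target, so most of the dimensionality is discarded, and every kept sub-band sits at the high-contribution end of the list.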
In classifying objects from different classes using wavelet packet analysis, the motivation is to search all possible frequency and spatial locations for a few dominant features with strong discriminability. However, the fact that the wavelet packet decomposition keeps an uneven energy distribution among the subspaces (sub-bands) prevents exploring features in high frequency...