OBJECT CLASSIFICATION USING SPACE-FREQUENCY DOMAIN TRANSFORM COEFFICIENT FEATURE VECTORS
Publication Date: 2002-Apr-30
The IP.com Prior Art Database
A method of identifying pertinent transform coefficients in an object classification method for classifying objects in a digital image of the type that includes the steps of performing a frequency domain transformation on the digital image to produce transform coefficients, extracting a feature vector from the transform coefficients, and classifying objects based on the extracted feature vector, includes the steps of: collecting first and second samples of digital images of a first class of objects and a second class of objects, respectively; performing a frequency domain transform on the digital images in the first and second samples to produce first and second samples of transform coefficients; equalizing the band energy of the transform coefficients in the first and second samples of transform coefficients; computing a discriminant measure of the band energy equalized transform coefficients; and selecting a subset of transform coefficient locations based on their discriminant measure.
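The steps listed above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the disclosure: a one-level 2D Haar transform stands in for the frequency domain transform, band energy is equalized by scaling each sub-band to unit mean energy over the pooled samples, and the discriminant measure is the squared difference of the class means at each coefficient location. The function names (haar2d, equalize_bands, select_locations) are hypothetical.

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar transform (assumed stand-in for the transform).
    Expects even height and width; returns sub-bands stacked as
    [ll, lh, hl, hh] with shape (4, H/2, W/2)."""
    s = np.sqrt(2.0)
    lo = (img[0::2, :] + img[1::2, :]) / s   # low-pass along axis 0
    hi = (img[0::2, :] - img[1::2, :]) / s   # high-pass along axis 0
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s
    lh = (lo[:, 0::2] - lo[:, 1::2]) / s
    hl = (hi[:, 0::2] + hi[:, 1::2]) / s
    hh = (hi[:, 0::2] - hi[:, 1::2]) / s
    return np.stack([ll, lh, hl, hh])

def equalize_bands(ca, cb, eps=1e-12):
    """Scale each sub-band so its mean energy over both samples is 1.
    ca, cb: arrays of shape (N, bands, h, w). Assumed equalization rule."""
    pooled = np.concatenate([ca, cb])
    energy = np.mean(pooled ** 2, axis=(0, 2, 3))        # one value per band
    scale = (1.0 / np.sqrt(energy + eps))[None, :, None, None]
    return ca * scale, cb * scale

def select_locations(imgs_a, imgs_b, k):
    """Return the k (band, row, col) locations with the largest
    squared class-mean difference after band energy equalization."""
    ca = np.stack([haar2d(im) for im in imgs_a])
    cb = np.stack([haar2d(im) for im in imgs_b])
    ca, cb = equalize_bands(ca, cb)
    score = (ca.mean(axis=0) - cb.mean(axis=0)) ** 2     # assumed measure
    top = np.argsort(score, axis=None)[::-1][:k]         # best first
    return np.stack(np.unravel_index(top, score.shape), axis=1)
```

Calling select_locations(imgs_a, imgs_b, k) on two stacks of equally sized grayscale images returns a (k, 3) integer array of band, row, and column indices. Because the assumed measure depends on coefficient magnitude, the equalization step changes the ranking between sub-bands; a scale-invariant measure would not show that effect.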
As discussed in Zhu's paper ("Fast Face Detection Using Subspace Discriminant Wavelet Features," IEEE CVPR 2000, pp. 636-641), computational complexity is a real problem for all object classifiers. Small window sizes using down-sampled image regions are often adopted to reduce feature dimensionality. Essentially, these classifiers rely on the low-frequency information of an object pattern because of the low-resolution representation associated with a small template size. High-frequency features such as edges are not fully captured by these classifiers, even though such features are the more prominent ones for characterizing an object. Another common way to further reduce computational complexity is a low-dimensional representation such as eigenspace decomposition or principal component analysis, which are, in general, not optimal for discrimination purposes. To capture adequate spatial and frequency information about objects in an image, Zhu's paper proposes a discriminant subspace algorithm that uses wavelet packet analysis to find the low-dimensional subspace with maximum discrimination information among all possible subspaces.
For a set of orthonormal basis functions spanning a wavelet packet space or subspace, Zhu defines a discriminant measure calculated by averaging the normalized squared distance in that space or subspace over all training signals from the two classes. The discriminant measure defines the class separability associated with the wavelet packet space or subspace for the two classes. Its maximum value is 1, because the wavelet packet decomposition is designed to preserve the original signal information when a full binary tree structure is used. If the decomposition is performed in a subspace, the discriminant measure is less than one. By allowing the discriminant measure to be less than one, the classification problem can be reduced to a subspace, or partial binary tree structure, thus reducing the dimensionality. As reported in Zhu's paper, by giving up 5% of the discriminant power, the dimensionality can be reduced by 78%. The remaining 23% discriminant information, which retains 95% of the discriminant power, comes from the locations containing most of the energy in the wavelet packet space. Examining the reported results reveals that, when image data are used, the preserved discriminant information is contributed mostly by a few subspaces (sub-bands) associated with low-pass filtering operations. This is understandable, since most image energy is contained in low-frequency signals.
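To make the bound on the discriminant measure concrete, the following sketch computes one plausible reading of it: the squared distance between each cross-class pair of training signals after projection onto the chosen basis functions, divided by the squared distance in the original space, averaged over all pairs. This exact form is an assumption and may differ from the definition in Zhu's paper; the function name discriminant_measure is hypothetical.

```python
import numpy as np

def discriminant_measure(xa, xb, basis):
    """Average normalized squared distance between the two classes.

    xa: (Na, n) signals of class 1, xb: (Nb, n) signals of class 2.
    basis: (n, m) matrix with orthonormal columns, m <= n.
    Assumes no cross-class pair is identical (nonzero denominator).
    """
    diff = xa[:, None, :] - xb[None, :, :]     # all cross-class pairs
    proj = diff @ basis                        # coordinates in the subspace
    num = np.sum(proj ** 2, axis=-1)           # squared distance in subspace
    den = np.sum(diff ** 2, axis=-1)           # squared distance in full space
    return float(np.mean(num / den))
```

With a complete orthonormal basis (m = n) the projection preserves every distance, so the measure equals 1 up to rounding. Dropping basis functions can only shrink the numerator, so any proper subspace gives a value no greater than 1, which matches the trade-off between dimensionality and discriminant power described above.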
In classifying objects from different classes using wavelet packet analysis, the motivation is to search for a few dominant features with strong discriminability in all possible frequency and spatial locations. However, the fact that the wavelet packet decomposition keeps an uneven energy distribution among the subspaces (sub-bands) prevents exploring features in high frequency...