Browse Prior Art Database

Method and System for Utilizing an Expectation Maximization Algorithm for Clustering Rows of a Matrix for TrueNorth Implementation

IP.com Disclosure Number: IPCOM000236882D
Publication Date: 2014-May-21
Document File: 5 page(s) / 192K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system is disclosed for utilizing an expectation maximization algorithm for clustering rows of a matrix for Truenorth implementation.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 42% of the total text.

Page 01 of 5

Method and System for Utilizing an Expectation Maximization Algorithm for Clustering Rows of a Matrix for

Succinct Representation

Abstract

A method and system is disclosed for utilizing an expectation maximization algorithm for clustering rows of a matrix for succinct representation.

Description

Disclosed is a method and system for Expectation Maximization (EM) algorithm for clustering rows of a matrix for succinctly representing a given matrix in hardware, or to minimize the memory required to store such a matrix for use by software.

The method and system utilizes a known clustering technique to initialize a solution and thereafter utilizes the EM algorithm to iteratively optimize the initial solution.

The method and system clusters rows of ܣ into b buckets and represent each element by a combination of a binary value. Additionally, each element is represented by representative weight of the bucket to which the corresponding row belongs thereby optimizing storage memory and providing efficient approximation. The method includes a step of initializing a number of centroids equivalent to a number of rows and use agglomerative clustering to reduce total number of clusters to ܾ. Here, the EM algorithm is utilized for iteratively refining the clusters until the EM algorithm converges to a local minimum.

The method and system utilizes the EM algorithm that is guaranteed to converge to a local optimum for a problem of clustering rows of the given real valued matrix ܣ with ݉ rows and ݊ columns into ܾ buckets, such that the ݅ th row falls into a bucket.

ܩ(݅) ∈ {1,2, . ܾ}

1


Page 02 of 5

For each column ݆, weights b൛ݏ௝ଵ, ݏ௝ଶ, … … , ݏ

                                      ൟcorresponding to each of the buckets are computed and binary values ݓ௜௝ are selected for each ݅and ݆, such that for a distance measure ݀(ܽ, ܾ) = (ܽ − ܾ), a following sum is minimized:

෍ ෍ ݀(ܣ௜௝, ݏ௝ ீ()ݓ௜௝)

௜ୀଵ

௝ୀଵ

The EM algorithm initializes with a maximum possible number of buckets. For a matrix with ݉ rows, the algorithm starts with ݉ buckets each containing a unique row of the matrix. A centroid is also computed for each bucket, wherein a centroid is a vector containing s-values of all columns for a particular bucket. For a given set of rows, the s-values are computed based on an expectation step of the EM algorithm.

Thereafter, the distance between a pair of centroids is computed as the sum of square of the difference between the corresponding vectors wherein both the vectors are non-zero in a particular column. At every iteration step, buckets with closest centroids are merged to reduce the total number of buckets.

After merging the buckets, the centroid of the merged buckets is recomputed subsequently. The merging process is continued until the total number of buckets is less than ܾ. The resulting set of ܾ buckets are then transferred to the EM algorithm for further optimization.

The expectation step a...