Method and System for Fast Retrieval of Entries from a Service Catalog Using Similarity Preserving Semantic Hashing

IP.com Disclosure Number: IPCOM000238113D
Publication Date: 2014-Aug-01
Document File: 4 page(s) / 188K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system is disclosed for fast retrieval of entries from a service catalog using similarity preserving semantic hashing assisted by domain knowledge.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 47% of the total text.

An organization often needs to find, in a service catalog, a set of Information Technology (IT) solutions that provide required capabilities. Traditional solutions rely either on a manual process or on a relational database into which all the entries of the service catalog are loaded and against which queries are executed. Relational databases typically perform well for exact-match queries, but finding a similar match with them is usually difficult.

Disclosed is a method and system for fast retrieval of entries from a service catalog using similarity preserving semantic hashing assisted by domain knowledge.

The method and system uses semantic hashing to encode each entry in the service catalog into a binary code. Here, each entry in the service catalog corresponds to a configuration and is converted into a much simpler binary code by a semantic hashing algorithm. The configuration is one of, but not limited to, a computing, networking, or storage configuration.

In addition, the method and system preserves similarity among different configurations during the encoding: the semantic hashing keeps the binary distance between the codes of similar configurations to a minimum. Here, the binary distance is, but need not be limited to, a Hamming distance.
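The Hamming distance mentioned above counts the bit positions in which two equal-length binary codes differ. The following is a minimal Python sketch of that measure only; the 8-bit codes are invented for illustration and are not produced by the disclosed hashing, and the function name `hamming_distance` is an assumption.

```python
def hamming_distance(a: int, b: int) -> int:
    """Return the number of bit positions in which two binary codes differ."""
    # XOR leaves a 1 in every position where the codes disagree.
    return bin(a ^ b).count("1")


# Hypothetical 8-bit codes, invented for illustration only.
code_a = 0b10110010
code_b = 0b10110110  # differs from code_a in a single bit
code_c = 0b01001101  # differs from code_a in every bit

print(hamming_distance(code_a, code_b))  # small distance: similar codes
print(hamming_distance(code_a, code_c))  # large distance: dissimilar codes
```

Under a similarity-preserving encoding, two similar configurations would be expected to behave like `code_a` and `code_b`, and two dissimilar ones like `code_a` and `code_c`.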

When the entries in the service catalog are encoded into compact binary codes, the method and system receives domain knowledge as an input to the semantic hashing. This domain knowledge is used to determine the similarity between different pairs of configurations.

For example, consider two entries in the service catalog having the following format:

Entry 1: Server: x-series; Networking: 10Gbps; Storage: ABC PQR; Software: XYZ

Entry 2: Server: p-series; Networking: 100Gbps; Storage: ABC D1K; Software: DEF

First, the method and system divides the property fields of each entry into different interest groups. In an example, the server and networking property fields are in interest group 1, the storage property field is in interest group 2, and the software property field is in interest group 3. In another example, each property field is an individual interest group.
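The grouping step can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `parse_entry` and `group_entry` and the `INTEREST_GROUPS` table are hypothetical, and the entry is assumed to use the "Field: value; Field: value" layout of the example entries. The grouping mirrors the first example in the text.

```python
# Hypothetical mapping that mirrors the first example in the text:
# server and networking together, storage alone, software alone.
INTEREST_GROUPS = {
    1: ("Server", "Networking"),
    2: ("Storage",),
    3: ("Software",),
}


def parse_entry(text: str) -> dict:
    """Split a 'Field: value; Field: value' string into a field dictionary."""
    fields = {}
    for part in text.split(";"):
        if not part.strip():
            continue  # skip empty segments such as a trailing separator
        name, _, value = part.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def group_entry(fields: dict, groups: dict = INTEREST_GROUPS) -> dict:
    """Collect the values of each interest group's property fields."""
    return {
        group_id: tuple(fields[name] for name in names)
        for group_id, names in groups.items()
    }


entry_1 = "Server: x-series; Networking: 10Gbps; Storage: ABC PQR; Software: XYZ"
grouped = group_entry(parse_entry(entry_1))
print(grouped)
```

For the second example in the text, where each property field is its own interest group, the mapping would simply hold one field name per group.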

Next, the method and system creates a similarity matrix. The similarity matrix is, but need not be limited to, a square matrix. In the similarity matrix, each row is a configuration for one interest group. In an example, the similarity matrix is as shown below:

The method and system embeds domain knowledge in the similarity matrix S, wherein each value in the similarity matrix denotes the similarity of two configurations. Here, the values in the similarity matrix are provided by a domain expert. The values in the similarity matrix range from 0 to 1. The closer the value is to 1, the more similar the two configurations are. For example, for two configurations "Storage = ABC PQR" and "Storage = ABC D1K", from the same vendor ABC, the similarity is higher compared to two conf...