A system and method of cross-lingual tagging and searching

IP.com Disclosure Number: IPCOM000216999D
Publication Date: 2012-Apr-27
Document File: 4 page(s) / 59K

Publishing Venue

The IP.com Prior Art Database

Abstract

The present invention relates to web search, and more particularly to a system and method for visually establishing bilingual tag relationships and translating query keywords by looking up bilingual tag pairs, thereby enhancing cross-lingual search results. The system presents a cluster of bilingual tag clouds for a document and lets users incrementally and visually establish, over time, the relationships between tags in the source language and tags in the user's language. The system receives a query request, translates the keywords into the source language based on the stored bilingual tag pairs, and then uses the transformed keywords to match the documents one by one. The transformation between the original keywords and the transformed source-language keywords is performed on a per-document basis. This invention has the following major advantages in improving the efficiency of cross-lingual search. The system provides an easy-to-use cluster of paired tag clouds that enables users to incrementally build and update the bilingual tag mapping visually. The transformation between the user's language and the document's language makes the search results more accurate. Performing the transformation on a per-document basis means that matching takes the document context into consideration.

This is the abbreviated version, containing approximately 53% of the total text.

The present invention relates to web search, and more particularly to a system and method for visually establishing bilingual tag relationships and translating query keywords by looking up bilingual tag pairs, thereby enhancing cross-lingual search results.

Nowadays, people surfing the internet often visit websites whose content is written in other languages. Keywords are commonly used to retrieve the documents a user is interested in. When processing query requests, search engines usually perform some preprocessing to handle the difference between the language in which the documents were written and the language of the keywords.

Cross-lingual preprocessing is usually achieved using one of the following approaches:

A) The search engine looks up a dictionary and converts the keywords to the language of the documents.

This approach is sometimes not very effective, because the meaning of a keyword may vary between documents, and a keyword usually has several synonyms, which lowers accuracy as well.


B) Machine translation is conducted on either the search keywords or the documents.

Translation itself is sophisticated and involves understanding natural language. Machine translation is usually computationally intensive and consumes a great deal of resources.

This invention introduces a system and method to establish relationships between tags in the document's language and tags in other languages, and to translate users' queries into the document's language based on bilingual tag pairs.

In one aspect, the system presents a cluster of bilingual tag clouds for a document and lets users incrementally and visually establish, over time, the relationships between tags in the source language and tags in the user's language.
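
As an illustration only, the incremental building of tag relationships might be sketched in Python as follows. The disclosure does not specify a data model, so the class name `BilingualTagStore`, its methods, and the idea of counting how often users link the same pair are all assumptions introduced for this sketch; each call to `link` stands in for a user visually connecting two tags in a document's paired tag clouds.

```python
from collections import defaultdict


class BilingualTagStore:
    """Hypothetical per-document store of bilingual tag pairs (not from the disclosure)."""

    def __init__(self):
        # doc_id -> user-language tag -> {source-language tag: times linked}
        self._pairs = defaultdict(lambda: defaultdict(dict))

    def link(self, doc_id, user_tag, source_tag):
        """Record one user action linking a user-language tag to a source-language tag."""
        votes = self._pairs[doc_id][user_tag.lower()]
        # Assumption: repeated links by users strengthen a pair over time.
        votes[source_tag] = votes.get(source_tag, 0) + 1

    def source_tags(self, doc_id, user_tag):
        """Return linked source-language tags, most frequently linked first."""
        votes = self._pairs.get(doc_id, {}).get(user_tag.lower(), {})
        return sorted(votes, key=lambda tag: (-votes[tag], tag))
```

Because the pairs are keyed by document, the same user-language tag (for example, English "bank") can be linked to "banco" in a document about finance and to "orilla" in a document about rivers, which is the document-context behavior the disclosure describes.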

In another aspect, the system receives a query request, translates the keywords into the source language based on the stored bilingual tag pairs, and then uses the transformed keywords to match the documents one by one. The transformation between the original keywords and the transformed source-language keywords is performed on a per-document basis.
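
A minimal Python sketch of this per-document translation and matching is given below. It is not the disclosed implementation: the function name `search`, the dictionary layout of `tag_pairs`, and the matching rule (a document matches when every keyword has at least one linked source-language tag occurring in the document text) are assumptions chosen to keep the example small.

```python
def search(keywords, documents, tag_pairs):
    """Match documents one by one, translating keywords per document.

    documents: doc_id -> document text in the source language.
    tag_pairs: doc_id -> {user-language tag: [source-language tags]} (hypothetical layout).
    """
    hits = []
    for doc_id, text in documents.items():
        pairs = tag_pairs.get(doc_id, {})
        lowered = text.lower()
        # Translate each keyword using only this document's own tag pairs.
        translated = [pairs.get(kw.lower(), []) for kw in keywords]
        # Assumed rule: every keyword needs at least one translated tag in the text.
        if translated and all(
            any(tag.lower() in lowered for tag in tags) for tags in translated
        ):
            hits.append(doc_id)
    return hits
```

With this rule, a keyword that has no stored pair for a given document simply fails to match that document, so the quality of results depends on how completely users have built the tag pairs for it.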

This invention has the following major advantages in improving the efficiency in...