
System and method for matching keyword and image text

IP.com Disclosure Number: IPCOM000198088D
Publication Date: 2010-Jul-26
Document File: 3 page(s) / 139K

Publishing Venue

The IP.com Prior Art Database

Abstract

Filtering sensitive words out of images is of increasing importance. Traditional approaches first extract the text from the image, and many existing methods can do so, such as optical character recognition (OCR). However, extracting text from an image is difficult when the text sits on a cluttered background. This invention proposes a new method that does not need to extract the text from the image. Instead, it converts the sensitive keyword into an image, then matches the two images by shape information to check whether the image contains the keyword.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.



1. Background:

[...] we compare these two images by shape information to check whether [...] keyword w or not.

  Figure 2 shows an example of this invention. The goal is to detect the keyword "A" (figure 2.a) in a given image (figure 2.c). Because the given image is heavily cluttered, the keyword cannot be recognized by extracting text from it. Our invention first converts the keyword into image text (figure 2.b), then applies a shape matching algorithm to detect the keyword in the given image.
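The two steps of the Figure 2 example (render the keyword as an image, then search the cluttered image for that shape) can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not say which shape matching algorithm it uses, so plain sliding-window overlap scoring is swapped in here, and the 5x3 bitmap glyph in `GLYPHS`, the function names, and the sample `cluttered` image are all hypothetical.

```python
# Hypothetical 5x3 bitmap font. A real system would rasterize the keyword
# with an actual font; this table exists only for illustration.
GLYPHS = {
    "A": ["010",
          "101",
          "111",
          "101",
          "101"],
}

def render_keyword(keyword):
    """Convert a keyword into a binary image (list of rows of 0/1)."""
    rows = [[] for _ in range(5)]
    for i, ch in enumerate(keyword):
        glyph = GLYPHS[ch]
        for r in range(5):
            if i > 0:
                rows[r].append(0)  # one blank column between characters
            rows[r].extend(int(c) for c in glyph[r])
    return rows

def match_score(image, template, top, left):
    """Fraction of template foreground pixels also set in the image window."""
    hits = total = 0
    for r, row in enumerate(template):
        for c, v in enumerate(row):
            if v:
                total += 1
                hits += image[top + r][left + c]
    return hits / total

def find_keyword(image, keyword, threshold=0.9):
    """Slide the rendered keyword over the image.

    Returns (score, top, left) of the best window, or None if no window
    reaches the (assumed) threshold.
    """
    template = render_keyword(keyword)
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = match_score(image, template, top, left)
            if best is None or score > best[0]:
                best = (score, top, left)
    return best if best and best[0] >= threshold else None

# Sample binary image: the glyph "A" placed at row 1, column 3,
# surrounded by unrelated clutter pixels.
cluttered = [
    [1, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0],
    [1, 0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]

print(find_keyword(cluttered, "A"))
```

The score counts only the keyword's foreground pixels, so extra clutter in the window does not lower it. The cost of that choice is that a fully filled region would also score highly, which a practical matcher would need to guard against.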

Claim:

Figure 1. Examples of image text (text in an image) which cannot be recognized by existing methods.

  Filtering sensitive words out of web documents (webpages, email, etc.) is an important task. It has two branches: filtering sensitive words out of text documents, and filtering them out of image and video documents. The former has been solved very well, but the latter has not.

   Traditional approaches to the latter branch first extract the text from the images. Many existing methods can extract text from images, such as the built-in OCR engine of Google Search. However, extracting text from an image is difficult when the text is affected by a cluttered background (as shown in figure 1).

   In this invention, we propose a new method to solve this problem. Our invention does not need to extract the text from the image. Instead, it converts the sensitive keyword into an image, then matches the two images by shape information to check whether the image contains the sensitive keyword.
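One way to see why matching by shape can tolerate clutter is a one-directional distance from the keyword's pixels to the image's pixels. The sketch below uses a directed chamfer distance, which is an assumption made for illustration and not the measure named in the disclosure. The keyword is assumed to be already rendered as a small binary bitmap; `keyword_bitmap`, `clean`, `cluttered`, and the tolerance value are hypothetical.

```python
import math

def foreground_points(bitmap):
    """Return (row, col) coordinates of all set pixels in a binary image."""
    return [(r, c) for r, row in enumerate(bitmap)
            for c, v in enumerate(row) if v]

def directed_chamfer(template_pts, image_pts, offset):
    """Mean distance from each shifted template point to its nearest image point."""
    dr, dc = offset
    total = 0.0
    for r, c in template_pts:
        total += min(math.hypot(r + dr - ir, c + dc - ic)
                     for ir, ic in image_pts)
    return total / len(template_pts)

def contains_keyword_shape(keyword_bitmap, image, tolerance=0.5):
    """True if some placement of the keyword shape lies close to image pixels."""
    tpl = foreground_points(keyword_bitmap)
    img = foreground_points(image)
    if not tpl or not img:
        return False
    max_dr = len(image) - len(keyword_bitmap)
    max_dc = len(image[0]) - len(keyword_bitmap[0])
    return any(directed_chamfer(tpl, img, (dr, dc)) <= tolerance
               for dr in range(max_dr + 1)
               for dc in range(max_dc + 1))

# Hypothetical rendered keyword shape (an "L"-like stroke).
keyword_bitmap = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]

# The same shape placed at offset (1, 1), without and with clutter.
clean = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cluttered = [
    [0, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
]

tpl = foreground_points(keyword_bitmap)
print(directed_chamfer(tpl, foreground_points(clean), (1, 1)))
print(directed_chamfer(tpl, foreground_points(cluttered), (1, 1)))
print(contains_keyword_shape(keyword_bitmap, cluttered))
```

Because the distance is measured only from keyword pixels to image pixels, adding clutter to the image can never increase it at the correct placement. Dense clutter can, however, produce false matches, so the tolerance would need tuning in practice.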

2. Summary of Invention:

Main Idea:

   Image text is usually affected by a cluttered background, which makes it very hard to detect and recognize. Our invention proposes a method and apparatus to measure...