A method to detect pattern in documents and add annotation to documents to implement intelligent question and answer system.

IP.com Disclosure Number: IPCOM000236999D
Publication Date: 2014-May-26
Document File: 6 page(s) / 91K

Publishing Venue

The IP.com Prior Art Database

Abstract

The disclosure addresses a method to detect patterns in documents and add annotations to them, so that a search engine can be enhanced into an intelligent question-and-answer system by ingesting both the annotations and the original documents into the document processing (parse and index) pipeline of a typical search engine.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 39% of the total text.

This invention aims at solving a problem in the area of mining facts from large repositories of text. In particular, it explains a method to detect patterns in documents and add annotations to them in order to implement an intelligent question-and-answer system.

A user can find an answer with any typical search engine if he or she knows some keywords in the answer. But if the user does not know any keywords in the answer, it is difficult to get the answer unless both the question and the answer appear in the documents. In addition, many irrelevant documents will be returned.

So there is a need for a lightweight question-and-answer system. This disclosure addresses a method to detect patterns in a document and add annotations to it. The method can be used to implement a lightweight question-and-answer system by ingesting the annotations together with the original document into the document processing (parse and index) pipeline of a typical search engine.

The method is based on the segmentation result produced by the natural language processing of a typical search engine.

The method involves analyzing the document by asking questions beginning with "Where", "What", "When", "Who", "How", etc. The claimed points include extracting the pattern, identifying the candidates for the different kinds of dictionaries, building the dictionaries used in the pattern, and adding annotations to documents at the places where the original text occurs. The annotation includes both the general words in the pattern and the words in the original text.
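The steps above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation described in the disclosure: the dictionary contents, the single "was born in" pattern, the regular-expression matching, and the names `DICTIONARIES`, `PATTERNS`, and `annotate` are all hypothetical, and the dictionaries are filled in by hand rather than built from candidates found in segmented text.

```python
import re

# Hypothetical dictionaries. The disclosure builds them from candidates
# identified in the document; here they are hard-coded for illustration.
DICTIONARIES = {
    "PERSON": {"Alice", "Bob"},
    "PLACE": {"Paris", "Beijing"},
}

# One hypothetical "Where" pattern. Each slot names the dictionary that
# the captured word must belong to.
PATTERNS = [
    {
        "regex": re.compile(r"(?P<person>\w+) was born in (?P<place>\w+)"),
        "slots": {"person": "PERSON", "place": "PLACE"},
        "template": "Where was {person} born? {place}",
    },
]


def annotate(text):
    """Return annotations with the character span of the text they attach to."""
    annotations = []
    for pattern in PATTERNS:
        for match in pattern["regex"].finditer(text):
            words = match.groupdict()
            # Keep the match only if every slot word is in its dictionary.
            if all(words[slot] in DICTIONARIES[name]
                   for slot, name in pattern["slots"].items()):
                annotations.append({
                    "start": match.start(),
                    "end": match.end(),
                    # General pattern words plus words from the original text.
                    "text": pattern["template"].format(**words),
                })
    return annotations


if __name__ == "__main__":
    for annotation in annotate("Alice was born in Paris. Carol was born in Rome."):
        print(annotation)
```

In this sketch the second sentence matches the pattern textually but is skipped, because "Carol" and "Rome" are not in the assumed dictionaries.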

A typical search engine can then parse and index both the original text and the annotations, which lets it answer questions intelligently. In fact, this method helps build an intelligent question-and-answer system by leveraging the keyword match function of a typical search engine; it is not necessary to train or build a model using a large amount of data. Moreover, if the user asks a question that appears in an annotation, the result with the high relevance score will be at the top, which greatly improves the user experience.
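To illustrate why indexing the annotation alongside the original text helps, here is a minimal sketch of keyword matching over both fields. It rests on assumptions: the scoring is a plain count of matching query terms, not the relevance function of any real search engine, and `tokenize`, `build_index`, `search`, and the sample documents are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs):
    """Map each doc id to the set of terms in its original text and annotation."""
    return {
        doc_id: set(tokenize(original + " " + annotation))
        for doc_id, (original, annotation) in docs.items()
    }


def search(index, query):
    """Rank documents by how many distinct query terms they contain."""
    terms = set(tokenize(query))
    scores = {doc_id: len(terms & doc_terms) for doc_id, doc_terms in index.items()}
    hits = [(doc_id, score) for doc_id, score in scores.items() if score > 0]
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    docs = {
        # (original text, annotation added by the method)
        "d1": ("Alice was born in Paris.", "Where was Alice born? Paris"),
        "d2": ("Alice visited Paris last year.", ""),
    }
    print(search(build_index(docs), "Where was Alice born"))
```

Here the annotated document matches all four query terms, including "where", which occurs only in the annotation, so it ranks above the unannotated document that merely mentions the same names.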

Because this method is based on basic natural language processing, it can be applied to documents in both English and Chinese, provided that the search engine supports these languages.

The advantage of this method is clear: it helps a keyword match search engine implement an intelligent question-and-answer system, provided that the search engine allows annotations to be added to the searched documents. It enhances the search experience of the end user and does not require complicated technology.

Referring to the flowchart above, the elements in light green are the claimed points of this disclosure. To analyze the document, questions beginning with "Where", "What", "When", "How", or "Who" can first be asked. Ca...