A design of an algorithm to match users’ queries with frequently asked questions
Publication Date: 2002-Jan-02
The IP.com Prior Art Database
Disclosed is a design of an algorithm to match users’ queries with frequently asked questions (FAQs). Benefits include improved relevancy in searches for information about a topic.
The disclosed algorithm provides a natural-language interface for FAQs. It calculates the similarity between the user’s question and the questions in the FAQ. After the best match is found, the corresponding answer can be output to the user. Though originally designed for English text, the algorithm is not limited to any specific language.
The algorithm consists of the following components:
- Knowledge Base and Dictionaries
▪ Build a Knowledge Base that includes the structure of English question sentences and the definition of a standard question set. All input questions can be transformed into questions in this set.
▪ Construct a domain-related keyword dictionary and concept dictionary.
- Pre-processing of the FAQ
▪ Tokenize and stem the question-answer pairs of the FAQ. Use a domain concept dictionary to map words to concepts.
▪ Build two inverted index tables: one for all verbs and the other for the remaining words. Both tables are built automatically and then revised by a human.
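The pre-processing step above can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the verb list, the concept dictionary entries, the toy suffix-stripping stemmer, and the function names are all hypothetical stand-ins, since the disclosure does not specify them. Only the FAQ questions are indexed here; answers could be handled the same way.

```python
import re
from collections import defaultdict

# Hypothetical verb list and concept dictionary (assumptions for illustration;
# the disclosure does not give their contents).
VERBS = {"install", "remove", "reset"}
CONCEPTS = {"delete": "remove", "uninstall": "remove", "setup": "install"}


def stem(word):
    # Toy suffix stripper standing in for a real stemmer (assumption).
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    # Tokenize, stem, then map each stem to its concept when one is defined.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [CONCEPTS.get(stem(t), stem(t)) for t in tokens]


def build_indexes(faq_questions):
    # One table for verbs, one for all remaining words; each maps a term
    # to the set of FAQ entry ids in which it occurs.
    verb_index = defaultdict(set)
    other_index = defaultdict(set)
    for faq_id, question in enumerate(faq_questions):
        for term in normalize(question):
            target = verb_index if term in VERBS else other_index
            target[term].add(faq_id)
    return verb_index, other_index


faq = ["How do I install the driver?", "How can I delete my password?"]
verb_index, other_index = build_indexes(faq)
```

In this sketch "delete" is mapped to the concept "remove" before indexing, so a later query using either word would reach the same FAQ entry. The human revision mentioned above would then amount to editing the generated tables by hand.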
• Question Analysis
1. Stem and tokenize the user’s question and use a domain concept dictionary to map the word set to concepts. Extract the main features, such as the WH-word, interrogative key words (for example, often, long, big, soon), pronouns, helping verbs, and domain key words.
2. Determine the abstract question type using the interrogative words and interrogative key words. Abstract question types are represented as a vector (Q-HOWLONG, Q-HOWMANY, etc.) in which each value is a Boolean. The definition of the question types depends on the completeness of the parser.
3. Identify the question’s structure pattern using the features acquired in step 1. Use T-rules to transform the question into a standard question pattern and, at the same time, extract the predicates. Use the concept dictionary to map all verbs to concepts.
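As a rough illustration of the Boolean question-type vector described above, the following sketch derives a vector from a WH-word and the interrogative key word that follows it. The type inventory beyond Q-HOWLONG and Q-HOWMANY, the rule table, and the requirement that the key word directly follow the WH-word are assumptions made for this example. The disclosure ties the actual definition to its parser, which is not shown.

```python
import re

# Hypothetical type inventory; the disclosure names only Q-HOWLONG and
# Q-HOWMANY, the other entries are assumed for illustration.
QUESTION_TYPES = ["Q-HOWLONG", "Q-HOWMANY", "Q-HOWOFTEN", "Q-WHAT"]

# Hypothetical rules: (WH-word, interrogative key word or None) -> type.
RULES = {
    ("how", "long"): "Q-HOWLONG",
    ("how", "many"): "Q-HOWMANY",
    ("how", "often"): "Q-HOWOFTEN",
    ("what", None): "Q-WHAT",
}


def question_type_vector(question):
    # Return one Boolean per entry of QUESTION_TYPES.
    tokens = re.findall(r"[a-z]+", question.lower())
    matched = set()
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if (tok, nxt) in RULES:
            matched.add(RULES[(tok, nxt)])
        elif (tok, None) in RULES:
            matched.add(RULES[(tok, None)])
    return [t in matched for t in QUESTION_TYPES]


vector = question_type_vector("How long does the installation take?")
```

A question that matches no rule yields an all-False vector, which a fuller system would presumably pass to a fallback rule or to the parser.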
• Acquisition of Question Focus
Use the question type mark list to scan the question and obtain the question focus (see Figure 1).
• Question Matching
1. Use the Q & A pairs in the FAQ to construct the discrete FAQ question space. Each predicate list and non-predicate list is defined as:
2. Map a new question to the predicate and non-predicate keyw...