Improving search containing country names and country adjectives - country name conversion
Publication Date: 2015-Dec-03
The IP.com Prior Art Database
Described is a country adjective conversion method that converts text and reconstructs sentences, including the step of determining the adjective word for a country. This method eliminates unrelated documents from search results and generates correct candidate answers for a deep question answering system.
This invention is a method that treats country names and country adjectives (nationalities) in a unified way for searches. In practice, this method can improve both recall and precision of searches by 30-50% when dealing with country terms. In a deep question answering system, search queries find only source documents that contain matching words, which often causes a problem when a source document contains a nationality word but not the country name.
For users, it is desirable for the system to recognize the nationality as well as the country name and to eliminate unrelated sources that do not reflect the original question. However, the current deep question answering system has no way to recognize a nationality as the adjective of a specific country. In addition to effectively providing automatically generated country names, this method also improves answer ranking by returning the correct country name instead of a nationality when users are looking for a country name.
In a current deep question answering system, questions with country terms are subject to an overly restrictive form of matching. For the examples below, the search query built from Question 1 returns only Document 1, and the search query built from Question 2 returns only Document 2.
Question 1: Who is the Canadian representative to the United Nations?
Question 2: Who is Canada's representative to the United Nations?
Document 1: "The Canadian representative to the United Nations is Guillermo Rishchynski."
Document 2: "Canada's permanent Representative of Canada to the United Nations is Guillermo Rishchynski."
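As an illustration only, the unified treatment of country names and country adjectives could be sketched as follows. This is a minimal sketch, not the disclosed implementation: the lookup table ADJECTIVE_TO_COUNTRY and the functions country_terms and expand_terms are hypothetical names introduced here, the table holds only a few sample entries, and the word-by-word lookup is assumed to be enough for single-word country names.

```python
import re

# Hypothetical, deliberately tiny lookup table (an assumption for this sketch).
# A real system would need a complete list of country adjectives/nationalities.
ADJECTIVE_TO_COUNTRY = {
    "canadian": "Canada",
    "french": "France",
    "japanese": "Japan",
}

# Lower-cased country name -> canonical spelling, derived from the table above.
NAME_TO_COUNTRY = {name.lower(): name for name in ADJECTIVE_TO_COUNTRY.values()}


def country_terms(text):
    """Return the set of countries mentioned in text by name or by adjective."""
    found = set()
    # Alphabetic tokens only, so "Canada's" yields "Canada" and "s".
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        if word in ADJECTIVE_TO_COUNTRY:
            found.add(ADJECTIVE_TO_COUNTRY[word])
        elif word in NAME_TO_COUNTRY:
            found.add(NAME_TO_COUNTRY[word])
    return found


def expand_terms(question):
    """Return both the name and adjective forms for each country in question."""
    countries = country_terms(question)
    terms = set(countries)
    for adjective, country in ADJECTIVE_TO_COUNTRY.items():
        if country in countries:
            terms.add(adjective.capitalize())
    return terms


question_1 = "Who is the Canadian representative to the United Nations?"
question_2 = "Who is Canada's representative to the United Nations?"

# Both phrasings resolve to the same country, so either question can be
# matched against a document that uses either form.
print(country_terms(question_1) == country_terms(question_2))
print(sorted(expand_terms(question_1)))
```

With this kind of normalization, Question 1 and Question 2 resolve to the same country term, so a query built from either one could retrieve both Document 1 and Document 2. Multi-word country names and adjectives that are ambiguous in context are not handled by this sketch.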
This restrictive matching is problematic when only one of the two documents is present in the corpus. Users have no way of knowing whether to phrase a question with a country name or a country adjective. As a result, the system fails to answer questions that it actually has the data to answer. When a user does try both versions of a question, large differences in passage retrieval hurt confidence in the system. Further differences in downstream processes (e.g., answer extraction and ranking) hurt confidence as well. A system and method for improving searches related to c...