
Method and System for Systematically Invoking Crowd-Sourcing Platform to Expand Classes of a Natural Language Classification System

IP.com Disclosure Number: IPCOM000249602D
Publication Date: 2017-Mar-07
Document File: 4 page(s) / 42K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system are disclosed for systematically invoking a crowd-sourcing platform to expand the classes of a natural language classification system. The method supports semi-automatic addition of classes to a natural-language-based classification system by identifying unrecognized queries and clustering them. Once a sufficient number of requests has been identified for a cluster, the cluster is automatically sent to the crowd-sourcing platform for resolution.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 30% of the total text.

Disclosed is a method and system for systematically invoking a crowd-sourcing platform to expand the classes of a natural language classification system. The method supports semi-automatic addition of classes to a natural-language-based classification system by identifying unrecognized queries and clustering them. Once a sufficient number of requests has been identified for a cluster, the cluster is automatically sent to the crowd-sourcing platform for resolution.

The method and system use a natural language classifier to group utterances that express the same query with different word choices into a single class. The resulting query classes form a set referred to as Natural Language Classifier Question (NLC-Question), and the answers to those classes are stored in a separate set of classes referred to as NLC-Answer. Each class in NLC-Answer corresponds to the associated query class in NLC-Question and may hold one or more valid responses.

Further, the method and system define two class confidence thresholds, which may include, but need not be limited to, ACCEPT_THRESH and REJECT_THRESH. These thresholds can be the same for all classes or different for each class. Here, ACCEPT_THRESH is the confidence above which the system believes a query to be a member of the class, and REJECT_THRESH is the confidence below which the system believes a query is definitely not a member of the class. The method and system also include a counter for each class, which indicates the number of times the class is accessed.
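
As an illustration only, the per-class thresholds and counter described above could be held in a small record such as the following Python sketch. The class name QuestionClass, the field names, the default threshold values, and the "uncertain" label for confidences that fall between the two thresholds are assumptions made for this example and are not taken from the disclosure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class QuestionClass:
    """One class in NLC-Question (hypothetical representation)."""

    name: str
    utterances: list[str] = field(default_factory=list)
    accept_thresh: float = 0.8  # ACCEPT_THRESH; value is illustrative
    reject_thresh: float = 0.3  # REJECT_THRESH; value is illustrative
    access_count: int = 0       # number of times the class is accessed

    def decide(self, confidence: float) -> str:
        """Map a classifier confidence onto the two thresholds."""
        if confidence > self.accept_thresh:
            return "member"
        if confidence < self.reject_thresh:
            return "not_member"
        # The disclosure does not name this middle band; the label is assumed.
        return "uncertain"

    def record_access(self) -> None:
        """Increment the per-class counter."""
        self.access_count += 1
```

Keeping the thresholds on each record allows them to be shared defaults or per-class overrides, which matches the statement that they can be the same for all classes or different for each class.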

The method and system enable the natural language classifier to receive queries from users, compare each query with NLC-Question to determine whether it matches any class in that set, and respond to the query. If a query does not match any class in NLC-Question, in other words its confidence level is below the associated REJECT_THRESH value for every class, the query is placed into a separate set of classes, herein referred to as the NLC-Question-Orphan set.
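
The routing rule in the preceding paragraph can be sketched as follows. This is a minimal Python example under stated assumptions: a simple token-overlap (Jaccard) score stands in for the confidence that a real natural language classifier would return, and the function names, the dictionary layout of NLC-Question, and the sample utterances are all hypothetical.

```python
from __future__ import annotations


def jaccard(a: str, b: str) -> float:
    """Toy stand-in for classifier confidence: token-set overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def class_confidence(query: str, utterances: list[str]) -> float:
    """Confidence for a class = best score against any of its utterances."""
    return max((jaccard(query, u) for u in utterances), default=0.0)


def route(
    query: str,
    nlc_question: dict[str, list[str]],
    reject_thresh: dict[str, float],
    orphans: list[list[str]],
) -> str | None:
    """Return the best-matching class name, or None if the query is orphaned."""
    scores = {
        name: class_confidence(query, utts)
        for name, utts in nlc_question.items()
    }
    # Below REJECT_THRESH for every class: place the query in the orphan set.
    if all(score < reject_thresh[name] for name, score in scores.items()):
        orphans.append([query])
        return None
    return max(scores, key=scores.get)


nlc_question = {
    "reset_password": ["how do I reset my password"],
    "store_hours": ["when does the store open"],
}
reject_thresh = {"reset_password": 0.3, "store_hours": 0.3}
orphans: list[list[str]] = []

matched = route("how can I reset my password", nlc_question, reject_thresh, orphans)
unmatched = route("what is the refund policy", nlc_question, reject_thresh, orphans)
```

In this sketch the first query is routed to reset_password, while the second falls below the reject threshold of both classes and becomes a new single-utterance entry in the orphan list.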

Subsequently, when the natural language classifier receives a query that matches a query in NLC-Question-Orphan at a confidence exceeding the average upper confidence defined in NLC-Question, the utterance is merged with the existing class instances to form a common class. The method then uses the counter associated with each NLC-Question-Orphan class to determine whether the number of times the class has been accessed exceeds the threshold, and if so sends that class of queries to the crowd-sourcing platform to receive an answer. Here, the crowd-sour...
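
The merge-and-escalate behavior for orphan classes might look like the Python sketch below. It rests on several assumptions that the abbreviated text does not settle: the "average upper confidence" is taken to be the mean of the ACCEPT_THRESH values, a token-overlap score again stands in for classifier confidence, the crowd-sourcing platform is represented by a caller-supplied callback, and an escalated flag (not mentioned in the disclosure) prevents the same class from being sent twice. All names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean
from typing import Callable


def similarity(a: str, b: str) -> float:
    """Toy stand-in for classifier confidence: token-set overlap."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class OrphanClass:
    utterances: list[str]
    access_count: int = 1    # counter for this NLC-Question-Orphan class
    escalated: bool = False  # assumed flag: already sent to the crowd


def handle_orphan_query(
    query: str,
    orphans: list[OrphanClass],
    accept_thresholds: list[float],
    escalate_after: int,
    send_to_crowd: Callable[[list[str]], None],
) -> OrphanClass:
    """Merge the query into a matching orphan class or start a new one."""
    merge_thresh = mean(accept_thresholds)  # assumed reading of the text

    def score(o: OrphanClass) -> float:
        return max(similarity(query, u) for u in o.utterances)

    best = max(orphans, key=score, default=None)
    if best is not None and score(best) > merge_thresh:
        best.utterances.append(query)
        best.access_count += 1
    else:
        best = OrphanClass([query])
        orphans.append(best)

    # Escalate once the counter exceeds the threshold.
    if best.access_count > escalate_after and not best.escalated:
        send_to_crowd(list(best.utterances))
        best.escalated = True
    return best


sent: list[list[str]] = []
orphans: list[OrphanClass] = []
for q in (
    "what is the refund policy",
    "what is your refund policy",
    "what is the refund policy please",
):
    handle_orphan_query(q, orphans, [0.6, 0.6], 2, sent.append)
```

With these inputs the three phrasings merge into one orphan class, and the class is handed to the callback once, after its counter passes the escalation threshold of 2.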