Improving Named Entity Recognition in Speech through Multi-pass Search Space Pruning Utilizing Redundant Information in Partially Recognized Co-occuring Entities

IP.com Disclosure Number: IPCOM000179615D
Original Publication Date: 2009-Feb-19
Included in the Prior Art Database: 2009-Feb-19
Document File: 2 page(s) / 176K

Publishing Venue

IBM

Abstract

Named Entity Recognition (NER) for voice data is very important because it forms the basis of many important applications like search in audio data, retrieving important audio documents, mining audio content and deriving business intelligence from audio conversations stored in call centers. In this invention we disclose a method to improve the accuracy of named entity recognition in audio signals by exploiting the partially recognized results and the structured information present in the database.

This is the abbreviated version, containing approximately 55% of the total text.

Authors:
Tanveer A Faruquie, Shajith I Mohamed, L V Subramaniam, Mukesh K Mohania, Shantanu Godbole.

Motivation:

Named Entity Recognition (NER) for voice data is very important because it forms the basis of many important applications like search in audio data, retrieving important audio documents, mining audio content and deriving business intelligence from audio conversations stored in call centers.

Named Entity Recognition also finds application in other voice applications such as self-service applications and voice portals.

Problem:

In this invention we disclose a method to improve the accuracy of named entity recognition in audio signals by exploiting the partially recognized results and the structured information present in the database.
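
The general idea of using partially recognized entities together with structured data can be illustrated with a minimal sketch. It is not the disclosed method, whose embodiment is not reproduced in this abbreviated text. The record schema, the `prune_candidates` and `candidate_values` helpers, and the sample data are all hypothetical and chosen only for illustration: entities recognized in an earlier pass filter the database records, and the surviving records define a much smaller set of candidates for a later pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical schema for a structured customer database.
    name: str
    phone: str
    date_of_birth: str


def prune_candidates(records, partial):
    """Keep only records consistent with every entity recognized so far.

    `partial` maps a field name to the value recognized in an earlier pass.
    """
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in partial.items())
    ]


def candidate_values(records, field):
    """Collect the distinct values of `field` among the surviving records."""
    return sorted({getattr(r, field) for r in records})


# Illustrative data only; not taken from the disclosure.
RECORDS = [
    Record("john smith", "5550101", "1970-01-01"),
    Record("joan smythe", "5550102", "1982-03-15"),
    Record("john smith", "5550103", "1991-07-30"),
]

# Earlier pass: assume only the name was recognized reliably.
survivors = prune_candidates(RECORDS, {"name": "john smith"})

# Later pass: the recognizer would only need to choose among these values.
phones = candidate_values(survivors, "phone")
```

In this sketch the co-occurring entities (name, phone, date of birth) are redundant in the sense that knowing one narrows the possibilities for the others. Exact matching is a simplification; a real system would likely need to tolerate recognition errors in the partial result.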

Background:
Typically, the recognition accuracy for telephony conversational speech is 60-70%. The recognition models have about 50 phones with 3000-8000 context-dependent phones and 100k-200k Gaussians. The recognition model handles a vocabulary of 100L words and is trained over large corpora. This large amount of data makes the search space for decoding very large. The same holds when we want to recognize named entities, such as names, contact numbers and dates of birth, present in conversational speech. The accuracy of named entity recognition is low, again because of the large search space: there is a relatively large number of branches in the search graph at any given point in time.
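
A back-of-the-envelope sketch shows why the number of branches matters. It assumes a simplified graph in which every point offers the same number of branches, which real decoding graphs do not; the `path_count` helper and the numbers are illustrative assumptions, not figures from the disclosure.

```python
def path_count(branching_factor: int, steps: int) -> int:
    """Number of distinct paths through a graph with uniform branching."""
    return branching_factor ** steps


# Hypothetical numbers: 1000 branches per step versus 10 after pruning.
wide = path_count(1000, 3)
narrow = path_count(10, 3)
reduction = wide // narrow
```

Under these assumptions, cutting the branches per step by a factor of 100 cuts the number of three-step paths by a factor of one million, which is why reducing the branching at each point has such a large effect on the search space.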

Embodimen...