Mixed language text alignment by phone similarity table

IP.com Disclosure Number: IPCOM000198082D
Publication Date: 2010-Jul-26
Document File: 2 page(s) / 93K

Publishing Venue

The IP.com Prior Art Database

Abstract

In recent years, mixed-language audio content has become increasingly common. However, many speech recognition systems handle only one language well, so recognition errors occur when the input mixes languages. Although the recognized text may be spelled completely differently from the correct word in the other language, its pronunciation often sounds similar to that of the correct result. In the past, this similarity in sound has been neglected. Here, we propose a phone similarity table that describes the similarity between the phonemes of different languages. This similarity can then be used to perform cross-lingual text alignment and other tasks.

1. Background: What is the problem solved by your invention? Describe known solutions to this problem (if any). What are the drawbacks of such known solutions, or why is an additional solution required? Cite any relevant technical documents or references.

In recent years, mixed-language audio content has become increasingly common. However, many speech recognition systems handle only one language well, so recognition errors occur when the input mixes languages. Although the recognized text may be spelled completely differently from the correct word in the other language, its pronunciation often sounds similar to that of the correct result. In the past, this similarity in sound has been neglected. Here, we propose a phone similarity table that describes the similarity between the phonemes of different languages. This similarity can then be used to perform cross-lingual text alignment and other tasks.

2. Summary of Invention: Briefly describe the core idea of your invention (saving the details for question #3 below). Describe the advantage(s) of using your invention instead of the known solutions described above.

a: Model the similarity between phonemes of different languages.

Here, we propose a similarity table built by calculating the similarity between the phonemes of different languages.

First, we generate a Gaussian model for each phoneme. The Gaussian models can be obtained from the training process of a speech recognition system.

Then, we calculate the distance between the Gaussian models of phonemes from different languages.

[Table: only the column header "Chinese Phoneme" and the entry "AN1" are recoverable.]
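The two steps above (one Gaussian model per phoneme, then a distance between models of different languages) can be illustrated with a minimal sketch. The disclosure does not say which distance is used or how a distance becomes a similarity, so everything specific here is an assumption: each phoneme is modeled by a single diagonal-covariance Gaussian, the distance is the symmetrized Kullback-Leibler divergence, and similarity is taken as exp(-distance). The English labels "AE_N" and "S" and all numeric values are invented for illustration; only "AN1" appears in the source.

```python
import math

def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians (assumed model form)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total

def symmetric_kl(p, q):
    """Symmetrized KL distance; p and q are (mean, variance) pairs."""
    return kl_diag(*p, *q) + kl_diag(*q, *p)

def build_similarity_table(models_a, models_b):
    """Map (phoneme_a, phoneme_b) -> similarity in (0, 1].

    Similarity = exp(-distance) is an assumed mapping, not taken
    from the disclosure.
    """
    return {
        (a, b): math.exp(-symmetric_kl(ga, gb))
        for a, ga in models_a.items()
        for b, gb in models_b.items()
    }

# Toy 2-dimensional models: (mean vector, variance vector).
# The numbers and the English labels are hypothetical.
chinese = {"AN1": ([1.0, 0.0], [1.0, 1.0])}
english = {
    "AE_N": ([1.2, 0.1], [1.0, 1.0]),
    "S": ([-3.0, 2.0], [1.0, 1.0]),
}

table = build_similarity_table(chinese, english)
for pair, sim in sorted(table.items()):
    print(pair, round(sim, 4))
```

With these toy values, the acoustically close pair ("AN1", "AE_N") receives a much higher similarity than ("AN1", "S"). A real system would more likely hold mixtures of Gaussians per phoneme state, in which case a different distance would be needed.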

b: Use the phone similarity table to perform the cross-language alignment task


In the text alignment process, we have a reference text and the speech-recognized text. The recognized text carries timing information for each syllable. If we can align the reference text with the recognized text, we can generate time-synchronized closed captions that combine the correct wording of the reference text with the timing information of the recognized text. This time synch...
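As an illustration of the alignment step described above, the sketch below aligns a reference sequence with a timed recognized sequence and copies the recognizer's timestamps onto the reference units. The disclosure text available here does not specify the alignment algorithm, so the dynamic-programming edit-distance alignment, the substitution cost of 1 minus similarity, the gap cost, and all labels other than "AN1" are assumptions made for this example.

```python
def align(reference, recognized, similarity, gap_cost=1.0):
    """Align reference units to timed recognized units.

    reference:  list of labels from the correct text
    recognized: list of (label, start_sec, end_sec) from the recognizer
    similarity: dict mapping (ref_label, rec_label) -> value in [0, 1]
    Returns a list of (ref_label, start_sec, end_sec); the times are None
    for reference units that have no recognized counterpart.
    """
    def sub_cost(ref_label, rec_label):
        if ref_label == rec_label:
            return 0.0
        # Similar-sounding pairs are cheap to substitute (assumed cost).
        return 1.0 - similarity.get((ref_label, rec_label), 0.0)

    n, m = len(reference), len(recognized)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + gap_cost
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub_cost(reference[i - 1], recognized[j - 1][0]),
                cost[i - 1][j] + gap_cost,   # reference unit left unmatched
                cost[i][j - 1] + gap_cost,   # recognized unit left unmatched
            )

    # Trace back from the bottom-right corner to recover the alignment.
    aligned = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + sub_cost(reference[i - 1], recognized[j - 1][0])):
            _, start, end = recognized[j - 1]
            aligned.append((reference[i - 1], start, end))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or cost[i][j] == cost[i - 1][j] + gap_cost):
            aligned.append((reference[i - 1], None, None))
            i -= 1
        else:
            j -= 1  # extra recognized unit; it contributes no caption text
    aligned.reverse()
    return aligned

# Hypothetical example: the reference contains an English unit "AE_N" that a
# Chinese-only recognizer transcribed as the similar-sounding "AN1".
reference = ["NI3", "AE_N", "HAO3"]
recognized = [("NI3", 0.0, 0.3), ("AN1", 0.3, 0.6), ("HAO3", 0.6, 1.0)]
similarity = {("AE_N", "AN1"): 0.95}

captions = align(reference, recognized, similarity)
print(captions)
```

In this toy run each reference unit inherits the time span of the recognized unit it is matched to, so the misrecognized "AN1" still supplies timing for the correct "AE_N". Without the similarity table, that pair would cost as much as any unrelated substitution, and longer sequences could drift out of alignment.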