Method of Interval Prosody Target Prediction

IP.com Disclosure Number: IPCOM000016292D
Original Publication Date: 2002-Oct-21
Included in the Prior Art Database: 2003-Jun-21

Publishing Venue

IBM

Abstract

Problem Solved: The prosody model is a very important component of Text-to-Speech (TTS) technology and is strongly related to the naturalness of synthetic voices. Because some influencing factors are unknown, and because known and unknown factors interact, it is unreasonable to predict prosody parameters with precise point estimation algorithms. On the other hand, in current TTS technology, a TTS system given the same input text always generates the same synthesized voice, so the output sounds too uniform, whereas a human being never speaks the same sentence in the same way twice. This disclosure describes a method of interval prosody target prediction, which enables a TTS system to generate different voices from the same input text while remaining good for perception. This makes a TTS system behave more like a real speaker, and hence sound more natural.

Novelty: A method to generate interval prosody target predictions instead of precise point approximations of the prosody targets. A method to generate different voices, still good for perception, from the same input text in a TTS system using the interval target predictions.
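The abstract does not say how an interval prediction is turned into a concrete prosody value. As a minimal sketch of the general idea only, assume each prosody target (for example pitch or duration) is predicted as a lower and an upper bound, and the synthesizer draws a value from within that interval on every run. The names `IntervalTarget` and `sample_targets`, the uniform sampling, and the example numbers are all illustrative assumptions, not part of the disclosure.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalTarget:
    """Hypothetical interval prediction for one prosody parameter."""
    name: str
    low: float
    high: float

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError(f"{self.name}: low bound exceeds high bound")


def sample_targets(targets, rng):
    """Draw one concrete value per target from inside its interval.

    Uniform sampling is an assumption made for illustration; the
    disclosure abstract does not specify a sampling scheme.
    """
    return {t.name: rng.uniform(t.low, t.high) for t in targets}


# Made-up interval predictions for a single syllable.
targets = [
    IntervalTarget("pitch_hz", 110.0, 130.0),
    IntervalTarget("duration_ms", 180.0, 220.0),
]

# Two renditions of the same input text: different random states
# give different values, but each stays inside its predicted interval.
first = sample_targets(targets, random.Random(1))
second = sample_targets(targets, random.Random(2))
```

Under these assumptions, the same input yields varied prosody across runs, while the interval bounds are what keep every rendition within the range taken to be acceptable for perception.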