
Method for Automatic Evaluation of Connection Model Parameters for Speech Synthesis Units

IP.com Disclosure Number: IPCOM000062379D
Original Publication Date: 1986-Nov-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 2 page(s) / 48K

Publishing Venue

IBM

Related People

Kaneko, T: AUTHOR [+3]

Abstract

This article describes a method for automatically evaluating the parameters of a connection model for speech synthesis units using DP (dynamic programming)-based matching. In speech synthesis by rule, the choice of synthesis units (e.g., phonemes, syllables, or morae) and the way they are connected are very important for speech quality. Once the synthesis units and the type of connecting method are given, the remaining problem is to optimize the parameters of the connection model. The most widely used approach is evaluation by the human ear. Although listening is the most desirable form of final evaluation, it is both fairly listener-dependent and time-consuming. A more efficient way of evaluating the connection, without using the human ear, is therefore proposed.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 59% of the total text.


The proposed method uses the amount of residual error from DP-based matching as the evaluation measure. The drawing shows the procedure for the case where speech units S1 and S2 are connected by a connection function F, which has a set of parameters to be determined. For each candidate parameter set in turn, a feature extracted from the synthesized speech is matched, using DP-based matching, to the feature extracted from the corresponding natural speech. After the matching is completed for all parameter sets, the parameter set that gives the least amount of residual error is selec...
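
The selection procedure described above can be illustrated with a minimal Python sketch. It is not the disclosure's implementation: the text does not specify the connection function F, the extracted feature, or the local distance used in the DP matching. Here F is assumed, purely for illustration, to be a linear interpolation that inserts `n_frames` transition frames between S1 and S2. Features are assumed to be lists of numeric frames, and the local distance is assumed to be the squared Euclidean distance. The names `dtw_residual`, `connect`, and `select_parameters` are hypothetical.

```python
from math import inf


def dtw_residual(synth, natural):
    """Accumulated distance along the best DP alignment of two feature sequences.

    Each sequence is a list of frames; each frame is a list of floats.
    The squared Euclidean local distance is an assumption, not from the source.
    """
    n, m = len(synth), len(natural)
    # cost[i][j] = minimum accumulated distance aligning synth[:i] with natural[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((a - b) ** 2 for a, b in zip(synth[i - 1], natural[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def connect(s1, s2, n_frames):
    """Hypothetical connection function F (assumed, not from the source).

    Inserts n_frames linearly interpolated frames between the last frame
    of s1 and the first frame of s2.
    """
    first, last = s1[-1], s2[0]
    bridge = []
    for k in range(1, n_frames + 1):
        t = k / (n_frames + 1)
        bridge.append([x + t * (y - x) for x, y in zip(first, last)])
    return s1 + bridge + s2


def select_parameters(s1, s2, natural, candidates):
    """Return the candidate parameter set with the least DP residual error."""
    best, best_err = None, inf
    for params in candidates:
        synth = connect(s1, s2, **params)      # synthesize with this parameter set
        err = dtw_residual(synth, natural)     # match against natural speech
        if err < best_err:
            best, best_err = params, err
    return best, best_err


# Toy one-dimensional features standing in for real speech features.
s1 = [[0.0], [0.0]]
s2 = [[3.0], [3.0]]
natural = [[0.0], [0.0], [1.0], [2.0], [3.0], [3.0]]
candidates = [{"n_frames": k} for k in range(4)]
best, best_err = select_parameters(s1, s2, natural, candidates)
```

In this toy setup the search is exhaustive over a small candidate list, which mirrors the description of matching every parameter set before selecting one. A real system would substitute its own connection model and feature extraction for the placeholders used here.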