
Weighted Phonetic Distance Metrics

IP.com Disclosure Number: IPCOM000124572D
Original Publication Date: 2005-Apr-28
Included in the Prior Art Database: 2005-Apr-28
Document File: 6 page(s) / 189K

Publishing Venue

Motorola

Related People

Chen Liu: AUTHOR [+2]

Abstract

This paper presents our work in developing a series of weighted phonetic distance metrics, a major part of our research project “Build acoustic models for a language without native data.” A binary-valued distinctive feature system is employed to quantitatively represent phones. The feature weights are derived statistically from lexica. Weighted phonetic distance metrics are defined. Our cross-language model transfer experiment shows that the performance of the new phonetic distance-based candidates is comparable to that of acoustic distance-based candidates.

This is the abbreviated version, containing approximately 15% of the total text.


Chen Liu, Lynette Melnar


Introduction

There are many reasons why researchers in language and speech want to measure the degree of phonetic similarity between members of a group of languages or dialects. Phonetic similarity measurement has served as a basic tool in research on language relationships [Ladefoged, 1969], dialectal relationships [Kessler, 1995; Nerbonne and Heeringa, 1997], historical sound change [Covington, 1996; Hartman, 2003], and assessment of children’s articulation [Somers, 1998], among other areas, and a number of phonetic distance measures have been developed. More recently, practitioners in application areas such as ASR and TTS have also adopted phonetic distance measures for cross-language model transfer or model sharing [Köhler, 1996; Daalsgaard et al., 1993]. Another application of phonetic distance metrics is in speech coding. For example, PGPfone used a special list of words as a codebook, and the words in the list were selected so that their phonetic distances were maximized [Zimmermann, 1996].

A necessary step before distance calculation is to quantify the phones to be compared. A phone can naturally be represented by a series of phonetic features that characterize its articulatory detail in such aspects as voicing, place of articulation, and manner of articulation [e.g., Chomsky and Halle, 1968; Hartman, 1981; IPA, 1999]. Typical phonetic features include ‘sonorant’, ‘continuant’, ‘voicing’, ‘labial’, ‘nasal’, ‘front’, and ‘syllabic’, to name a few. Hence most phonetic distance measures are feature-based metrics.

A phonetic feature is traditionally represented as a binary variable: the value 0 means absence of the feature in the phone, and 1 means its presence. Each phone is thus represented quantitatively by a vector of 0’s and 1’s. Phonetic distance is then calculated in the same way as distance in a vector space; the Manhattan distance and the Euclidean distance are two widely used choices, among other types [Gildea and Jurafsky, 1996; Nerbonne and Heeringa, 1997; Kondrak, 2003].
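
As a rough illustration of the unweighted baseline described above, the following Python sketch represents a few phones as binary feature vectors and computes the Manhattan and Euclidean distances between them. The feature inventory is taken from the examples listed earlier, but the specific feature values assigned to /p/, /b/, and /m/, and the names FEATURES, PHONES, manhattan, and euclidean, are assumptions made for this example only. They are not the feature system or the weighted metrics developed in this paper.

```python
from math import sqrt

# Feature order for each vector (illustrative subset, not the paper's system).
FEATURES = ("sonorant", "continuant", "voicing", "labial",
            "nasal", "front", "syllabic")

# Hypothetical binary feature assignments: 1 = feature present, 0 = absent.
PHONES = {
    "p": (0, 0, 0, 1, 0, 0, 0),
    "b": (0, 0, 1, 1, 0, 0, 0),
    "m": (1, 0, 1, 1, 1, 0, 0),
}


def manhattan(a, b):
    """Sum of absolute per-feature differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of the sum of squared per-feature differences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    for first, second in (("p", "b"), ("p", "m"), ("b", "m")):
        d1 = manhattan(PHONES[first], PHONES[second])
        d2 = euclidean(PHONES[first], PHONES[second])
        print(f"{first}-{second}: Manhattan={d1}, Euclidean={d2:.3f}")
```

With binary vectors, the Manhattan distance simply counts the features on which two phones differ, and the Euclidean distance is the square root of that count. Every feature therefore contributes equally to the result.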

A traditional binary-valued feature system suffers in quantitatively characterizing the phonetic differences precisely,...