Trapezoidal DP Matching for Speech Recognition

IP.com Disclosure Number: IPCOM000049912D
Original Publication Date: 1982-Aug-01
Included in the Prior Art Database: 2005-Feb-09
Document File: 3 page(s) / 79K

Publishing Venue

IBM

Related People

Okochi, M: AUTHOR

Abstract

This article describes the concept of a new nonlinear pattern matching method for spoken word recognition. The method, named Trapezoidal DP Matching, preserves time reversibility (the matching measure is unchanged when the time order of the feature-vector series is reversed), gives a better approximation of the continuous-time pattern distance than the corresponding conventional DP matching, and is expected to improve recognition accuracy.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.

In spoken word recognition, Dynamic Programming (DP) is widely used for calculating a matching measure (distance or similarity) of patterns of time-series feature vectors, as shown in the drawing.

For two patterns expressed by time-series feature vectors, A: a(1), a(2), ..., a(I) and B: b(1), b(2), ..., b(J), the pattern distance D(A,B) is usually defined as the minimum, over all matching paths, of the weighted mean of the local distances along the path (see original), where F: (i(k),j(k)), k=1, 2, ..., K is a matching path, d(i,j) is the local distance between a(i) and b(j), and w(k) is a weighting coefficient.
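The equation itself is not reproduced in this extracted text. Read literally, the verbal definition (a minimum over matching paths F of a w(k)-weighted mean of local distances) would correspond to something like the following; this is a reconstruction from the wording, not a copy of the original formula:

```latex
D(A,B) = \min_{F} \frac{\sum_{k=1}^{K} d\bigl(i(k), j(k)\bigr)\, w(k)}{\sum_{k=1}^{K} w(k)}
```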

In the conventional formulations (Rectangular DP Matching), the weighting coefficient w(k) is defined so as to satisfy the following:
(1) The denominator is a constant which does not depend on the selection of the matching path F (see original).
(2) w(k) depends upon the unit path p(k) which comes to the node (i(k),j(k)), where p(k)=(i(k)-i(k-1), j(k)-j(k-1)).

Then the pattern distance (see original) can be calculated by Dynamic Programming.
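As an illustration of how conditions (1) and (2) make the minimization tractable, the following is a minimal Python sketch of one rectangular formulation. Table 1 is not available in this text, so the step weights used here (1 for a horizontal or vertical unit path, 2 for a diagonal one, which makes the weights on every path sum to I + J) are an assumption, as are the scalar features, the absolute-difference local distance, and the function name.

```python
from typing import Sequence

# Assumed weights per incoming unit path p(k); not taken from Table 1.
STEP_WEIGHTS = {(1, 0): 1.0, (0, 1): 1.0, (1, 1): 2.0}


def rectangular_dp_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Weighted-mean pattern distance with w(k) set by the incoming path.

    Both sequences must be non-empty. Features are scalars here and the
    local distance d(i, j) is |a(i) - b(j)| (both are assumptions).
    """
    n_a, n_b = len(a), len(b)
    inf = float("inf")
    # g[i][j]: minimum accumulated weighted distance of any path to (i, j).
    g = [[inf] * n_b for _ in range(n_a)]
    # The first node has no incoming path; it is given the diagonal weight
    # so that the weights on every path sum to the constant n_a + n_b.
    g[0][0] = STEP_WEIGHTS[(1, 1)] * abs(a[0] - b[0])
    for i in range(n_a):
        for j in range(n_b):
            if i == 0 and j == 0:
                continue
            d = abs(a[i] - b[j])
            for (di, dj), w in STEP_WEIGHTS.items():
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    g[i][j] = min(g[i][j], g[pi][pj] + w * d)
    # Condition (1): the denominator does not depend on the path chosen.
    return g[n_a - 1][n_b - 1] / (n_a + n_b)


if __name__ == "__main__":
    print(rectangular_dp_distance([0, 2], [0, 1, 2]))
```

Because the denominator is the same for every path, only the numerator has to be minimized, and because each weight depends on a single incoming step, the minimum at a node can be built from the minima at its predecessors.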

Various formulations have been proposed which satisfy conditions (1) and (2), as shown in Table 1. However, none of them has all of the following properties, which seem natural for a pattern distance:

(P1) Pattern Commutative. The matching measure is identical even if the two patterns are commuted: D(A,B) = D(B,A).

(P2) Time Reversible. The matching measure is identical even if the time sequence is reversed: D(A',B') = D(A,B), where A': a(I), a(I-1), ..., a(1) and B': b(J), b(J-1), ..., b(1).

(P3) Linear Matching Consistent. In case both time-series feature vectors are the same in length (I=J) and the optimal matching path is linear, the matching result is the same as that obtained by linear matching: (see original).
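Properties (P1) and (P2) can be spot-checked numerically for any candidate matching measure. The sketch below is only an illustration: the helper names are hypothetical, the features are assumed to be scalars, and the linear matching formula (the mean of d(i,i) for I=J) is an assumed reading, since the original equation is not reproduced here. (P3) is left out because checking it requires knowing that the optimal path is linear.

```python
from typing import Callable, Dict, Sequence

Pattern = Sequence[float]
Distance = Callable[[Pattern, Pattern], float]


def linear_matching(a: Pattern, b: Pattern) -> float:
    """Assumed linear matching for I = J: the mean of d(i, i)."""
    if len(a) != len(b):
        raise ValueError("linear matching needs patterns of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def check_properties(dist: Distance, a: Pattern, b: Pattern,
                     tol: float = 1e-9) -> Dict[str, bool]:
    """Test (P1) and (P2) for one pair of patterns only."""
    base = dist(a, b)
    commuted = dist(b, a)                                      # D(B, A)
    time_reversed = dist(list(reversed(a)), list(reversed(b)))  # D(A', B')
    return {
        "P1": abs(commuted - base) <= tol,
        "P2": abs(time_reversed - base) <= tol,
    }


if __name__ == "__main__":
    print(check_properties(linear_matching, [0, 1, 3], [1, 1, 5]))
```

A passing result for one pair does not prove a property; a single failing pair, however, is enough to show that a measure lacks it.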

In the Trapezoidal DP Matching, the weighting coefficient w(k) is defined by both the in-coming and out-going paths of the node: w(k) = (s(k) + s(k+1))/2 for k=1, 2, ..., K, where s(k) depends upon the unit path p(k), and p(1)=p(K+1)=(1,1) (see original). The pattern distance is then formulated as follows (see original), where p(k)=(pi(k),pj(k)). Equation 3 can be calculated by Dynamic Programming if the denominator is a constant (=2N) which does not depend upon the selection of the matching path. Because of the relationship in Equation 2, the unit path p(k) and weighting coefficient w(k) for Rectangular DP Matching are applicable to the p(k) and s(k) for Trapezoidal DP Matching. Table 2 shows typical...
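One way to see why this weighting still fits Dynamic Programming: splitting each w(k) into its two halves lets every step between consecutive nodes carry the cost s(k) times the average of the local distances at its two ends (a trapezoid), with half-weight terms left over at the first and last nodes for p(1) and p(K+1). The Python sketch below follows that reading. Table 2 is not available in this text, so the values of s(k) (1 for a horizontal or vertical unit path, 2 for a diagonal one), the scalar features, the absolute-difference local distance, and the function name are all assumptions. The result is normalized by the sum of w(k), which equals I + J for these assumed values.

```python
from typing import Sequence

# Assumed s(k) per unit path p(k); not taken from Table 2.
S = {(1, 0): 1.0, (0, 1): 1.0, (1, 1): 2.0}


def trapezoidal_dp_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Pattern distance with w(k) = (s(k) + s(k+1)) / 2.

    Both sequences must be non-empty. Features are scalars and the local
    distance d(i, j) is |a(i) - b(j)| (both are assumptions).
    """
    n_a, n_b = len(a), len(b)
    d = [[abs(x - y) for y in b] for x in a]
    inf = float("inf")
    g = [[inf] * n_b for _ in range(n_a)]
    s_end = S[(1, 1)]  # p(1) = p(K+1) = (1, 1), as stated in the text
    # Half of the first node's weight comes from the virtual path p(1).
    g[0][0] = 0.5 * s_end * d[0][0]
    for i in range(n_a):
        for j in range(n_b):
            if i == 0 and j == 0:
                continue
            for (di, dj), s in S.items():
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    # Trapezoid: s(k) times the mean of the two end distances.
                    step_cost = 0.5 * s * (d[pi][pj] + d[i][j])
                    g[i][j] = min(g[i][j], g[pi][pj] + step_cost)
    # Half of the last node's weight comes from the virtual path p(K+1).
    total = g[n_a - 1][n_b - 1] + 0.5 * s_end * d[n_a - 1][n_b - 1]
    return total / (n_a + n_b)


if __name__ == "__main__":
    a, b = [0, 1], [0, 0, 2]
    print(trapezoidal_dp_distance(a, b))
    print(trapezoidal_dp_distance(a[::-1], b[::-1]))
```

In this sketch each step cost is unchanged when the step is traversed in the opposite direction, and the two end terms simply swap, so reversing both patterns yields the same set of path costs and therefore the same minimum.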