Optimal Covariance Propagating Linear Transformation for Speech Signal Adaption

IP.com Disclosure Number: IPCOM000111242D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 4 page(s) / 89K

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

Disclosed is a modified adaptation algorithm which eliminates an unneeded sub-word specific linear map from the acoustic space of reference speech to the acoustic space of new speech.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 49% of the total text.

Disclosed is a modified adaptation algorithm which eliminates
an unneeded sub-word specific linear map from the acoustic space of
reference speech to the acoustic space of new speech.

Described in [*] is an adaptation procedure based on sub-word
dependent linear transformations.  A typical transformation was based
on a set $\{(X_t, Y_t)\}$ of matched pairs of (spectral) vectors
together with centroids $\mu_X, \mu_Y$ and covariance matrices
$\Sigma_X, \Sigma_Y$.  The transformation was constructed in two
stages.  The first stage is a linear mapping $Z = A_1 X$, where $A_1$
is determined by standard (unconstrained) least squares from the set
of matched vector pairs.  This defines a centroid $\mu_Z = A_1 \mu_X$
and a covariance matrix $\Sigma_Z = A_1 \Sigma_X A_1^T$ in the
Z-space.  The second stage is a "metamorphic" mapping
$Y = \mu_Y + A_2 (Z - \mu_Z)$, which produces the correct moments
$\mu_Y, \Sigma_Y$ and which also exploits some additional degrees of
freedom by inserting a least-squares optimal rotation $R$, to wit:

(1)    $A_2 = \Sigma_Y^{1/2} \, R \, \Sigma_Z^{-1/2}$.

The complete adaptation mapping for a fixed sub-word unit is
the composite linear transformation defined by the matrix product
$A = A_2 A_1$ (the first-stage matrix $A_1$ acts first), which
correctly propagates the covariance matrix for the sub-word unit,
i.e.,

(2)    $\Sigma_Y = A \, \Sigma_X \, A^T$.
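
The covariance propagation claimed here can be checked in a few lines.  The following is only a sketch: it assumes the stages are composed in the order they are applied ($A_1$ first, then $A_2$), that $R$ is orthogonal ($R R^T = I$), that the matrix square roots are the symmetric ones, and that $\Sigma_Z$ is invertible.  None of these assumptions is spelled out in the text.

```latex
\begin{align*}
A \, \Sigma_X \, A^T
  &= A_2 A_1 \, \Sigma_X \, A_1^T A_2^T \\
  &= A_2 \, \Sigma_Z \, A_2^T \\
  &= \Sigma_Y^{1/2} R \, \Sigma_Z^{-1/2} \, \Sigma_Z \, \Sigma_Z^{-1/2} R^T \Sigma_Y^{1/2} \\
  &= \Sigma_Y^{1/2} R R^T \Sigma_Y^{1/2} \\
  &= \Sigma_Y .
\end{align*}
```

The identity holds for any orthogonal $R$, which is why the choice of $R$ remains available as additional degrees of freedom.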

The algorithm is that described in [*], with the modification
that we now replace the matrix $A$ above with the best (in the
least-squares sense) possible such matrix.  That is, we construct a
transformation which propagates the first two moments correctly and
is optimal among all linear transformations that do so.  We thus have
the constrained optimization problem:

(3)    $\min_A \sum_t \| y_t - \mu_Y - A (x_t - \mu_X) \|^2$
where $A$ is an arbitrary square matrix of the appropriate dimension
which satisfies the covariance constraint (2).  The solution turns
out to be the metamorphic transformation without the preliminary
least squares step, i.e., the metamorphic transformation from the
X-space directly to the Y-space of any given sub-word unit, to wit:

(4)    $Y = \mu_Y + A (X - \mu_X)$
where

(5)    $A = \Sigma_Y^{1/2} \, R \, \Sigma_X^{-1/2}$.

Here R is the product of the matrix of left eigenvectors times
the transposed matrix of the right eigenvectors belonging to the
matrix of average cross produc...
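
A rough numerical sketch of the direct mapping in (4) and (5) may help make the construction concrete.  Because the description of R is cut off in this abbreviated text, the sketch makes an assumption: it takes R = U V^T from the singular value decomposition of Sigma_Y^{1/2} C_YX Sigma_X^{-1/2}, where C_YX is the average cross product of the centered pairs.  This is the choice that minimizes (3) over orthogonal R, but it may not be the exact matrix the authors used.  The function names (sym_power, metamorphic_map, apply_map), the use of NumPy, and the synthetic data are all illustrative and do not come from the disclosure.

```python
import numpy as np


def sym_power(S, p):
    """Return S**p for a symmetric positive-definite matrix S."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T  # scale eigenvector columns by w**p


def metamorphic_map(X, Y):
    """Fit Y ~ mu_Y + A (X - mu_X) with A = Sigma_Y^{1/2} R Sigma_X^{-1/2}.

    X, Y: arrays of shape (T, d) whose rows are matched vector pairs.
    """
    T = X.shape[0]
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    sigma_x = Xc.T @ Xc / T
    sigma_y = Yc.T @ Yc / T
    c_yx = Yc.T @ Xc / T  # average cross product of centered pairs
    sy_half = sym_power(sigma_y, 0.5)
    sx_inv_half = sym_power(sigma_x, -0.5)
    # ASSUMPTION: the source text is truncated here; this matrix is the one
    # whose singular vectors give the least-squares optimal orthogonal R.
    M = sy_half @ c_yx @ sx_inv_half
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    A = sy_half @ R @ sx_inv_half
    return mu_x, mu_y, A


def apply_map(X, mu_x, mu_y, A):
    """Apply equation (4) to every row of X."""
    return mu_y + (X - mu_x) @ A.T


# Synthetic matched pairs (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
B = np.array([[1.0, 0.3, 0.0], [0.0, 0.8, -0.2], [0.1, 0.0, 1.2]])
Y = X @ B.T + 0.1 * rng.normal(size=(500, 3)) + np.array([1.0, -2.0, 0.5])

mu_x, mu_y, A = metamorphic_map(X, Y)
Y_hat = apply_map(X, mu_x, mu_y, A)

# The mapped vectors should reproduce the first two moments of Y.
print(np.allclose(Y_hat.mean(axis=0), Y.mean(axis=0)))
print(np.allclose(np.cov(Y_hat.T, bias=True), np.cov(Y.T, bias=True)))
```

The two checks at the end compare the mean and the (biased) sample covariance of the mapped vectors with those of Y, which is the moment-propagation property stated in the text.  The sketch assumes both sample covariance matrices are nonsingular, so it needs more matched pairs than dimensions.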