Adaptive Noise Removal in the Power Spectrum

IP.com Disclosure Number: IPCOM000115376D
Original Publication Date: 1995-Apr-01
Included in the Prior Art Database: 2005-Mar-30
Document File: 2 page(s) / 60K

Publishing Venue

IBM

Related People

Das, S: AUTHOR [+4]

Abstract

A number of noise-removing transformations $\hat{x} = T(y)$ exist for making noisy speech resemble training speech recorded in a quiet environment. Most methods that we are aware of, such as spectral subtraction, are linear in the spectrum. In (*), a nonlinear transformation was proposed that exploits inter-utterance segments to sample the background noise and can be applied at any stage of signal processing prior to labeling. The present invention defines a mapping that does not require recording background noise but is restricted to the power domain.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      A number of noise-removing transformations
  $\hat{x} = T(y)$
  exist for making noisy speech resemble training speech recorded in a
quiet environment.  Most methods that we are aware of, such as
spectral subtraction, are linear in the spectrum.  In (*), a
nonlinear transformation was proposed that exploits inter-utterance
segments to sample the background noise and can be applied at any
stage of signal processing prior to labeling.  The present invention
defines a mapping that does not require recording background noise
but is restricted to the power domain.

      Let $I$, with values $i \in \{1, 2, \ldots, k\}$, denote the
random class index corresponding to different types of speech
signals.  Denote by $X$ training data obtained in a (relatively)
noise-free environment, and by $Y = X + N$ the corrupted version of
it, where $N$ denotes the noise.  More precisely, $X = (X_1, X_2)$ is
a vector of length two containing the real and imaginary parts of the
Fourier transform of the signal at a fixed frequency, while
$Y = (Y_1, Y_2)$ contains the real and imaginary parts of the Fourier
transform of the corrupted signal.
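
      The construction of the two-component vectors can be illustrated
with a minimal sketch.  It assumes a single discrete Fourier transform
bin stands in for the "fixed frequency"; the frame length, bin index,
test signal and noise level are hypothetical values chosen only for
illustration and do not come from the disclosure.  The sketch also
shows that the additive relation between noisy and clean data carries
over to the Fourier domain, because the transform is linear.

```python
import cmath
import math
import random


def dft_bin(frame, k):
    """Return (real, imag) of the DFT of `frame` at bin index k."""
    n = len(frame)
    acc = sum(s * cmath.exp(-2j * math.pi * k * t / n)
              for t, s in enumerate(frame))
    return (acc.real, acc.imag)


# Hypothetical frame length and bin index (not from the disclosure).
n, k = 64, 5
rng = random.Random(0)

# Illustrative clean signal and additive time-domain noise.
clean = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
noise = [rng.gauss(0.0, 0.1) for _ in range(n)]
noisy = [c + e for c, e in zip(clean, noise)]

X = dft_bin(clean, k)   # (X_1, X_2): clean signal at the chosen bin
N = dft_bin(noise, k)   # (N_1, N_2): noise at the chosen bin
Y = dft_bin(noisy, k)   # (Y_1, Y_2): corrupted signal at the chosen bin

# Linearity of the DFT gives Y = X + N component by component.
print(Y, (X[0] + N[0], X[1] + N[1]))
```

Each frame and each frequency bin would yield one such pair of
vectors; the model in the text is stated for one fixed frequency.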

      Approximate the joint distribution of $(X, N)$ by a Gaussian
mixture as follows: given $I = i$, the random variables $X_1$, $X_2$,
$N_1$, $N_2$ are independent scalar Gaussians with zero means, with
variance $\sigma^2$ for each of the signal coordinates $X_1$ and $X_2$
and variance $\gamma^2$ for each of the noise coordinates $N_1$ and
$N_2$.  Then the conditional joint distribution of $(X, Y)$ given
$I = i$ is four-dimensional Gaussian with mean zero and a covariance
matrix whose diagonal is
  $\operatorname{diag}(\sigma^2, \sigma^2, \sigma^2 + \gamma^2, \sigma^2 + \gamma^2)$.
Because $Y = X + N$, the matrix is not strictly diagonal: the only
nonzero off-diagonal entries are
$\operatorname{Cov}(X_1, Y_1) = \operatorname{Cov}(X_2, Y_2) = \sigma^2$.
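
      The class-conditional model can be checked numerically.  The
following is a minimal sketch, not the disclosure's implementation: it
draws samples of $(X_1, X_2, Y_1, Y_2)$ for a single class and
estimates their covariance matrix.  The values of `sigma` and `gamma`,
the sample count and the helper names are hypothetical.  The estimated
variances should come out near $\sigma^2$ for the $X$ coordinates and
near $\sigma^2 + \gamma^2$ for the $Y$ coordinates; since $Y$ contains
$X$, the sample covariance between $X_j$ and $Y_j$ should come out
near $\sigma^2$ rather than zero.

```python
import random


def sample_xy(sigma, gamma, rng):
    """Draw one (X_1, X_2, Y_1, Y_2) for a fixed class, with Y = X + N."""
    x = (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))   # signal coordinates
    n = (rng.gauss(0.0, gamma), rng.gauss(0.0, gamma))   # noise coordinates
    y = (x[0] + n[0], x[1] + n[1])
    return x + y


def empirical_cov(samples):
    """Unbiased sample covariance matrix of a list of equal-length tuples."""
    m, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / m for j in range(d)]
    return [[sum((s[a] - means[a]) * (s[b] - means[b]) for s in samples) / (m - 1)
             for b in range(d)]
            for a in range(d)]


# Hypothetical standard deviations for one class (not from the disclosure).
sigma, gamma = 2.0, 1.0
rng = random.Random(1)
samples = [sample_xy(sigma, gamma, rng) for _ in range(20000)]
cov = empirical_cov(samples)

for row in cov:
    print(["%.2f" % v for v in row])
```

A full mixture would repeat this for each class index, with its own
variances and a class prior; those quantities are not given in the
abbreviated text, so the sketch stays with a single class.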

      Let $|X|^2 = X_1^2 + X_2^2$ den...