Discrimination of Multiple Sound Sources

IP.com Disclosure Number: IPCOM000105444D
Original Publication Date: 1993-Aug-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 68K

Publishing Venue

IBM

Related People

Feig, E: AUTHOR [+2]

Abstract

Disclosed is a method for processing a waveform that is formed by the superposition of sounds from two sources, so as to recover the sound patterns produced by each source separately. The sources are spatially localized and separated from each other, and the sound patterns received at two microphones are used for the reconstruction. The sound patterns from the two sources are assumed to be statistically uncorrelated, or weakly correlated, with each other. The method uses this assumption to infer the time-delay and attenuation properties of the paths from each source to each microphone, and thereby to reconstruct the sound patterns produced by each source.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

     The method consists of selecting trial values of the time delays
and attenuation factors, determining (for those values) the source
waveforms that would produce the observed waveforms at the
microphones, and computing the correlation between these inferred
source waveforms.  Since the actual source waveforms are assumed to
be uncorrelated with each other, the correlation between the inferred
source waveforms provides an indicator of the validity of the
assumed values of time delay and attenuation.  A search in parameter
space is performed to minimize the correlation between the inferred
source waveforms.  Details of the method are given below.
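
     As an illustration of why the correlation can serve as a validity
indicator, the following Python sketch (not part of the original
disclosure) compares two independent waveforms with a pair in which
one waveform has leaked into the other, as would happen with wrong
trial values.  The function name normalized_correlation, the
white-noise test signals, and the leakage factor of 0.5 are all
assumptions chosen for illustration.

```python
import numpy as np

def normalized_correlation(x, y):
    """Zero-lag correlation coefficient of two equal-length waveforms."""
    x = x - np.mean(x)
    y = y - np.mean(y)
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

# Hypothetical stand-ins for the two source waveforms: independent white noise.
rng = np.random.default_rng(1)
source_a = rng.standard_normal(10_000)
source_b = rng.standard_normal(10_000)

# Correctly separated sources: correlation is close to zero.
clean = normalized_correlation(source_a, source_b)

# Imperfect separation: part of source B remains in the estimate of source A,
# so the two "inferred" waveforms are visibly correlated.
leaky = normalized_correlation(source_a + 0.5 * source_b, source_b)

print(f"clean: {clean:+.3f}   leaky: {leaky:+.3f}")
```

     The residual of one source in the estimate of the other raises the
correlation well above the near-zero value obtained for the
independent pair, which is the property the parameter search relies
on.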

     The two sources, denoted A and B, produce waveforms $A(t)$ and
$B(t)$ at microphone #1, and waveforms $a A(t - \tau_A)$ and
$b B(t - \tau_B)$ at microphone #2.  The relative attenuation factors
$a$, $b$ and time delays $\tau_A$, $\tau_B$ are assumed constant in
time.  The combined waveforms are therefore

  $$R_1(t) = A(t) + B(t) \tag{1}$$

  $$R_2(t) = a A(t - \tau_A) + b B(t - \tau_B) \tag{2}$$

at microphones #1 and #2 respectively.  The waveforms $R_j(t)$ are
sampled, multiplied by a windowing function (e.g., a Hamming or
Hanning window), and Fourier-transformed in a manner familiar in the
speech processing art.  In an approximation in which finite-window
effects are ignored, the discrete Fourier transforms of $R_j(t)$
are

  eqno(3)
  R...
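
     The abbreviated text breaks off here.  As a rough illustration of
the procedure described so far, the following Python sketch builds
the two microphone signals from Eqs. (1) and (2), infers the source
waveforms for trial parameters, and searches over one delay for the
minimum correlation.  It is a sketch under stated assumptions, not
the disclosure's own implementation: delays are whole samples applied
circularly, no window is applied (the whole record is transformed at
once), the per-frequency 2x2 inversion is derived here directly from
Eqs. (1) and (2), the correlation measure is the zero-lag correlation
coefficient, and only the delay of source B is searched while the
other three parameters are held at their true values.  All function
and variable names are hypothetical.

```python
import numpy as np

def mix(src_a, src_b, a, b, tau_a, tau_b):
    """Forward model of Eqs. (1)-(2) with circular, integer-sample delays."""
    r1 = src_a + src_b
    r2 = a * np.roll(src_a, tau_a) + b * np.roll(src_b, tau_b)
    return r1, r2

def infer_sources(r1, r2, a, b, tau_a, tau_b):
    """Source waveforms that would produce r1, r2 for the trial parameters."""
    omega = 2.0 * np.pi * np.fft.fftfreq(len(r1))  # radians per sample
    R1, R2 = np.fft.fft(r1), np.fft.fft(r2)
    path_a = a * np.exp(-1j * omega * tau_a)  # source A -> microphone #2
    path_b = b * np.exp(-1j * omega * tau_b)  # source B -> microphone #2
    det = path_b - path_a  # nonzero in every bin as long as |a| != |b|
    A_hat = (path_b * R1 - R2) / det
    B_hat = (R2 - path_a * R1) / det
    return np.fft.ifft(A_hat).real, np.fft.ifft(B_hat).real

def correlation(x, y):
    """Magnitude of the zero-lag correlation coefficient."""
    x = x - x.mean()
    y = y - y.mean()
    return abs(np.dot(x, y)) / np.sqrt(np.dot(x, x) * np.dot(y, y))

# Synthetic, statistically independent sources (assumed white noise).
rng = np.random.default_rng(0)
src_a = rng.standard_normal(4096)
src_b = rng.standard_normal(4096)

true_params = dict(a=0.8, b=0.5, tau_a=3, tau_b=-5)
r1, r2 = mix(src_a, src_b, **true_params)

# One-dimensional search: try each candidate delay for source B and keep
# the one whose inferred sources are least correlated.
scores = {}
for trial in range(-8, 9):
    params = dict(true_params, tau_b=trial)
    a_hat, b_hat = infer_sources(r1, r2, **params)
    scores[trial] = correlation(a_hat, b_hat)

best_tau_b = min(scores, key=scores.get)
print("estimated tau_b:", best_tau_b)
```

     In this simplified setting the minimum of the correlation should
fall at the true delay, because any other trial value leaves part of
source B in the estimate of source A.  A search over all four
parameters would likely need a richer correlation measure than a
single zero-lag coefficient; the abbreviated text does not show which
measure or search strategy the disclosure uses.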