Acoustic Signal Processing Method

IP.com Disclosure Number: IPCOM000105668D
Original Publication Date: 1993-Aug-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 81K

Publishing Venue

IBM

Related People

Linsker, R: AUTHOR

Abstract

Interaural delay (the difference in arrival time of an acoustic signal at two ears, or at two microphones) provides an important cue for spatial localization of a sound source. In addition to being useful in its own right for certain applications, spatial localization of a source can aid in decomposing the superposed sound streams from multiple sources (e.g., speakers) at different locations. This can be useful as a front end to a speech recognition system, and also as part of an enhanced hearing aid for hearing-impaired persons. Disclosed is an improved signal processing method for computing values of interaural delay. In the disclosed method, the cepstrum instead of the cross-correlation is used to infer interaural delay.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

     A prior-art method for inferring interaural delay [1,2]
comprises the following steps:

1.  The input at each receiver (L and R) is processed by a bank of
    bandpass filters indexed by (i,L) and (i,R) respectively.  Here
    i=1,...,N indexes the set of bandpass characteristics, which are
    taken to be the same for (i,L) as for (i,R).

2.  The output from each filter is half-wave rectified.  This
    produces outputs denoted L(t,i) and R(t,i) respectively.

3.  Compute best-match delays as follows:

    a.  For each pair of corresponding filters (i,L) and (i,R), the
        cross-correlation is computed for each of a set of possible
        time delays.  The value of the cross-correlation Q(tau,i),
        for time delay tau and band i, is essentially the running
        average, over a specified recent time interval, of the
        product L(t,i) times R(t+tau,i).

    b.  For each i, the delay tau that maximizes Q(tau,i) is
        computed and referred to as the "best-match" delay for band
        i.

4.  The set of best-match delays (for various i and possibly for
    various time windows over which the running cross-correlation is
    computed) is used to generate information about the spatial
    position of the source relative to the two receivers.
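
     Steps 1 through 3 of the prior-art method can be sketched for a
single band i as follows.  This is a minimal illustration under
stated assumptions, not the implementation of [1,2]: the biquad
bandpass, the 500 Hz center frequency, the 8 kHz sampling rate, the
search range of +/- 8 samples, and all function names are
hypothetical choices made for the example, and the "running average"
is simplified to an average over the whole synthetic signal.  Step 4
(mapping delays to spatial position) is not shown.

```python
import math
import random


def bandpass(x, f0, fs, q=2.0):
    """Step 1 (one band): second-order bandpass centered at f0 Hz.

    The biquad design is an assumption; the text does not specify
    the filter type used in the bank.
    """
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b2 * x2 - a1 * y1 - a2 * y2) / a0
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def half_wave_rectify(x):
    """Step 2: keep positive values, set negative values to zero."""
    return [v if v > 0.0 else 0.0 for v in x]


def cross_correlation(left, right, tau):
    """Step 3a: Q(tau), the average of L(t) * R(t + tau) over all t
    for which both samples exist."""
    n = len(left)
    t_lo, t_hi = max(0, -tau), min(n, n - tau)
    total = sum(left[t] * right[t + tau] for t in range(t_lo, t_hi))
    return total / (t_hi - t_lo)


def best_match_delay(left, right, max_delay):
    """Step 3b: the delay tau (in samples) that maximizes Q(tau)."""
    candidates = range(-max_delay, max_delay + 1)
    return max(candidates, key=lambda tau: cross_correlation(left, right, tau))


# Synthetic test input: the right receiver hears the same noise source
# as the left receiver, true_delay samples later.
rng = random.Random(0)
fs = 8000
true_delay = 5
source = [rng.gauss(0.0, 1.0) for _ in range(4000)]
left_in = source
right_in = [0.0] * true_delay + source[:-true_delay]

L = half_wave_rectify(bandpass(left_in, 500.0, fs))
R = half_wave_rectify(bandpass(right_in, 500.0, fs))
estimated_delay = best_match_delay(L, R, max_delay=8)
print("best-match delay (samples):", estimated_delay)
```

     For this synthetic input the estimate would be expected to
coincide with the imposed shift.  The search range is deliberately
kept shorter than one period of the band's center frequency (16
samples here), since a narrow band makes Q(tau) nearly periodic in
tau and a wider search can admit competing peaks.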

     The cepstrum [3] of a set of time series data X(t) is defined as
the power spectrum of the logarithm of the power spectrum of X(t).
When a time series X(t) contains an echo [i.e., it is the sum of a
term Y(t) and a time-delayed, attenuated term alpha Y(t-tau)], the
cepstrum of X(t) tends to reveal a clear peak at tau, even when the
autocorrelation of X(t) reveals no such clear peak [3].
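
     The cepstrum definition above can be illustrated with a short
sketch.  The echo multiplies the power spectrum of Y by a factor
that ripples with frequency at a rate set by tau; taking the
logarithm turns that factor into an additive ripple, and the second
power spectrum shows it as a peak at tau.  The code below is an
illustration under assumptions, not the disclosed method: the white
noise source, the values n = 1024, tau = 37 samples, and alpha = 0.6,
the small constant eps guarding the logarithm, and all function
names are hypothetical.

```python
import cmath
import math
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def power_spectrum(x):
    """Squared magnitude of the discrete Fourier transform of x."""
    return [abs(c) ** 2 for c in fft(x)]


def cepstrum(x, eps=1e-12):
    """Power spectrum of the logarithm of the power spectrum of x.

    eps is an assumption added to avoid log(0).
    """
    log_power = [math.log(p + eps) for p in power_spectrum(x)]
    return power_spectrum(log_power)


# Synthetic input: X(t) = Y(t) + alpha * Y(t - tau), Y white noise.
rng = random.Random(1)
n, true_delay, alpha = 1024, 37, 0.6
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
x = [y[t] + (alpha * y[t - true_delay] if t >= true_delay else 0.0)
     for t in range(n)]

c = cepstrum(x)
# Skip index 0 (the mean of the log spectrum) and the mirrored half.
peak = max(range(1, n // 2), key=lambda k: c[k])
print("cepstral peak at delay (samples):", peak)
```

     Index 0 is excluded from the search because it only reflects
the mean of the log spectrum, and the upper half of the output is
excluded because it mirrors the lower half for real input.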

     The disclosed improvement consists of replacing step 3 in the
above algorithm by the following:

    (3'):  For each pair of correspond...