Speaker Verification using Discrete Parameter Hidden Markov Models

IP.com Disclosure Number: IPCOM000113356D
Original Publication Date: 1994-Aug-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 91K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+3]

Abstract

For reasons of security, it is desirable to be able to automatically confirm that a speaker really is the person claimed. This is the problem of speaker verification. The following paragraphs specify a successful speaker verification procedure using techniques which are well established in the field of speech recognition.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      For reasons of security, it is desirable to be able to
automatically confirm that a speaker really is the person claimed.
This is the problem of speaker verification.  The following
paragraphs specify a successful speaker verification procedure using
techniques which are well established in the field of speech
recognition.

      The ideal speech recognition system would be
speaker-independent, but in fact, speaker-dependent systems perform
better than speaker-independent systems.  The reason for this is that
certain parts of a recognition system are tailored to the voice of a
particular user and do not work well (if at all) on the voice of
another.  Although this is an undesirable obstacle in speech
recognition, we can take advantage of it when speaker-dependency is
required: namely, in speaker verification or identification.

      It will be assumed that some training data is available from
each speaker whose presence has to be verified.

The procedure is as follows:

1.  Select a phonetically-balanced "passphrase".  (A passphrase is
    like a password, but consists of more than one word.)  One
    possible passphrase is "our business has to undergo changes
    constantly".  The passphrase is spoken to the
    speaker-verification system, which processes it to verify the
    speaker.  The passphrase may be spoken as isolated words or as a
    continuous utterance.

2.  Create detailed Markov models for the words in the passphrase and
    in the training script using the methods of (5) (isolated speech
    only) or (6) (isolated or continuous speech).  These models,
    which are created automatically from labelled speech, should use
    the same type of labels as employed in Step 4 below.

3.  Perform Steps 4-10 for each speaker separately using the
    available training data.

4.  Compute speaker-dependent discriminating eigenvectors and
    Gaussian prototypes for labelling, as described in (1).
    Alternatively, the methods of (1-3), (4) or related methods may
    be used instead.

5.  Compute the Markov model parameters in the usual way (7).

6.  Record the passphrase one or more times, and label each such
    utterance using the eigenvectors and prototypes of the current
    speaker.

7.  Perform Viterbi alignment (7) of each passphrase against the
    labels from Step 6.

8.  Discard all labels aligned with silence models.

9.  Determine the Viterbi-path output probabilities of all labels
    remaining after Step 8.

10. For each separate recording of the passphrase, sort the output
    probabilities of Step 9 and compute the Pth percentile; a
 ...
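
      Steps 7-10 can be illustrated with a minimal Python sketch.
It rests on several assumptions that are not in the text above: the
Markov models are simplified to a state-emission discrete HMM, the
percentile is taken by the nearest-rank rule, and the names
viterbi_align, percentile_score and silence_states, as well as the
toy model and the value of P, are hypothetical.  The sketch stops at
the percentile because the available text does not show how that
value is used afterwards.

```python
import math


def viterbi_align(labels, init, trans, emit):
    """Step 7: most likely state sequence for a sequence of discrete labels.

    init[s], trans[r][s] and emit[s][label] are probabilities.  The search
    runs in the log domain to avoid numerical underflow.
    """
    def log(p):
        return math.log(p) if p > 0.0 else float("-inf")

    n_states = len(init)
    score = [log(init[s]) + log(emit[s][labels[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for label in labels[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans[best][s]) + log(emit[s][label]))
        back.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def percentile_score(labels, path, emit, silence_states, p):
    """Steps 8-10: drop silence-aligned labels, sort, take the p-th percentile."""
    probs = sorted(
        emit[s][label]
        for s, label in zip(path, labels)
        if s not in silence_states  # Step 8: discard silence alignments
    )
    if not probs:
        raise ValueError("no non-silence labels to score")
    # Nearest-rank percentile (an assumption; the source does not say which
    # percentile definition is intended).
    rank = max(1, math.ceil(p / 100.0 * len(probs)))
    return probs[rank - 1]


# Toy left-to-right model: state 0 is silence, states 1 and 2 are speech.
init = [1.0, 0.0, 0.0]
trans = [[0.5, 0.5, 0.0],
         [0.0, 0.5, 0.5],
         [0.0, 0.0, 1.0]]
emit = [[0.8, 0.1, 0.1],
        [0.1, 0.7, 0.2],
        [0.1, 0.3, 0.6]]
labels = [0, 0, 1, 1, 2, 2]  # one labelled passphrase recording (Step 6)

path = viterbi_align(labels, init, trans, emit)
score = percentile_score(labels, path, emit, silence_states={0}, p=50)
print(path, score)
```

      In this sketch each recording of the passphrase would be
passed through the two functions separately, giving one percentile
value per recording.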