Process for Speech Login to E-mail

IP.com Disclosure Number: IPCOM000123891D
Original Publication Date: 1999-Jun-01
Included in the Prior Art Database: 2005-Apr-05
Document File: 2 page(s) / 69K

Publishing Venue

IBM

Related People

King, RA: AUTHOR [+2]

Abstract

Overview: The problem is to enable users to access their e-mail via voice-only commands from an arbitrary voice device (e.g., a phone). The first problem one encounters is logging in using only one's voice. Since arbitrary (free-form) voice recognition is notoriously bad, one cannot simply speak his/her user ID and hope that it is recognized. Spelling the userID is possible, but still cumbersome and error-prone.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

   Overview:

   The problem is to enable users to access their e-mail via
voice-only commands from an arbitrary voice device (e.g., a phone).
The first problem one encounters is logging in using only one's
voice.  Since arbitrary (free-form) voice recognition is notoriously
bad, one cannot simply speak his/her user ID and hope that it is
recognized.  Spelling the userID is possible, but still cumbersome
and error-prone.

   The second problem is that voice logins are subject to
listening/replay attacks.  That is, someone can stand next to you at
a pay phone, and hear your user ID and password.

   We address both problems below.

   Login:

   Voice recognition becomes far more accurate when a restricted
grammar is used.  With such a grammar, the recognizer is given a
limited number of words (or phrases) that it must recognize, and the
user's speech is compared against these fixed words.  (Note that this
disclosure does NOT claim voice recognition technology, only the
[novel?] use of grammars in voice systems to solve a problem.)

   The strategy for the first part of this invention is to
dynamically create a restricted grammar based on the valid user ID
entries.  (E.g., in Unix these are the entries in /etc/passwd.)
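
   As an illustration of where the valid user IDs might come from,
the following is a minimal sketch that pulls login names out of
passwd-style lines (colon-separated fields, login name first).  The
class name PasswdUserIds and the method parse are hypothetical and do
not appear in the original disclosure; a real deployment may well get
its user list from some other directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original disclosure.
public class PasswdUserIds {

    /** Returns the first colon-separated field of each passwd-style line. */
    public static List<String> parse(List<String> lines) {
        List<String> ids = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and comment lines.
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int colon = trimmed.indexOf(':');
            // Skip lines with no colon or an empty login field.
            if (colon <= 0) {
                continue;
            }
            ids.add(trimmed.substring(0, colon));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample data for illustration only.
        List<String> sample = Arrays.asList(
            "root:x:0:0:root:/root:/bin/sh",
            "rking:x:1001:1001::/home/rking:/bin/sh");
        System.out.println(parse(sample));
    }
}
```

   On a Unix system the input lines could be obtained with
java.nio.file.Files.readAllLines on /etc/passwd; that step is left out
here so the sketch runs anywhere.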

   We use JSGF (the Java Speech Grammar Format) running over
ViaVoice for our preferred embodiment.  The canonical format for our
grammar is:
    public <user1> = userID1 {userID1};
    ...
    public <userN> = userIDN {userIDN};
    public <logon> = logon (<user1> | ... | <userN>) {logon};
    Where userIDi is the i-th valid ID (e.g., in /etc/passwd).
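
   The grammar above can be generated mechanically from the list of
valid IDs.  The sketch below shows one way to build it as a Java
String, assuming the Java Speech Grammar Format (JSGF) syntax that the
rules above follow.  The class name LoginGrammarBuilder, the method
build, and the grammar name "login" are hypothetical; the sketch also
assumes each user ID is a plain alphanumeric token that needs no
quoting or escaping.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original disclosure.
public class LoginGrammarBuilder {

    /** Builds a restricted login grammar with one rule per user ID. */
    public static String build(List<String> userIds) {
        if (userIds.isEmpty()) {
            throw new IllegalArgumentException("at least one user ID is required");
        }
        StringBuilder rules = new StringBuilder();
        StringBuilder alternatives = new StringBuilder();
        for (int i = 0; i < userIds.size(); i++) {
            String id = userIds.get(i);
            String ruleName = "user" + (i + 1);
            // public <userI> = userIDi {userIDi};
            rules.append("public <").append(ruleName).append("> = ")
                 .append(id).append(" {").append(id).append("};\n");
            // Join the rule references with '|' for the <logon> rule.
            if (i > 0) {
                alternatives.append(" | ");
            }
            alternatives.append('<').append(ruleName).append('>');
        }
        // JSGF header and grammar name; "login" is an assumed name.
        return "#JSGF V1.0;\n"
             + "grammar login;\n"
             + rules
             + "public <logon> = logon (" + alternatives + ") {logon};\n";
    }

    public static void main(String[] args) {
        // Sample IDs for illustration only.
        System.out.print(build(Arrays.asList("alice", "bob")));
    }
}
```

   Because the grammar is rebuilt from the current ID list, adding or
removing a user only requires regenerating the string before it is
handed to the recognizer.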

   Once this grammar is constructed as a Java String, it is
fed to the voic...