
Automation test client for speech application

IP.com Disclosure Number: IPCOM000181676D
Original Publication Date: 2009-Apr-09
Included in the Prior Art Database: 2009-Apr-09
Document File: 7 page(s) / 293K

Publishing Venue

IBM

Abstract

Demand for developing and testing voice applications is increasing as the voice market grows. To date, there is still no unified test framework for quickly measuring the accuracy of a voice solution. This article designs an automated speech accuracy test framework, which will speed up most voice application development and testing activities.

This is the abbreviated version, containing approximately 22% of the total text.


1. Background

Speech interfaces will play an increasingly important role in everyday life. Currently, some call centers leverage voice technology to improve the customer experience, and more and more web pages are expected to communicate with end users through a voice interface within the next five years.

If you want to set up a voice-enabled system, you need to do the following:

1. Deploy voice server products. The voice server provides speech recognition and text-to-speech: it recognizes end users' speech input and then generates a suitable audio response. Speech recognition is a model-based pattern recognition technology, in which the "model" determines the final accuracy of your applications. Different voice server products provide their own acoustic models, which result in different accuracies.
2. Create your voice applications. To use speech recognition in your application, you need to develop recognition rules for your particular application area. These rules tell the voice server which words you expect, and are usually called grammars or lexicons (dictionaries). Naturally, different grammars/lexicons will result in different accuracies.

This is a typical voice system structure in call centers today:

[Figure: Voice System. Labeled components: VoiceXML Browser; Voice Server with "Speech Recognition" and "Text to Speech" (ASR/TTS) services; HTTP Document Server; Voice Application File; Lexicon/Grammar File.]

The question, then, is how to measure voice system accuracy quickly across different voice server products or different voice application designs.

Currently, you have 2 solutions to test accuracy:
1) Collect pre-recorded audio and predefined grammar files as a corpus, then test.


This test is always run offline and can be automated with a scripting language if you like.

The problem is that it is very hard to reflect real-life accuracy, which involves different speech contexts. In an online real-life application, the speech context changes dynamically, while offline pre-recorded audio tests only part of your application under a fixed context (normally "zero" context, meaning the audio is treated as the first input).

Also, most pre-recorded audio does not exactly reflect real-world voices; for example, your voice may be slower and clearer than usual (and therefore easier to recognize). As a result, offline accuracy reports are often considerably better than the actual system performance in daily work.
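The offline approach in 1) can be sketched as a small script. This is only a minimal illustration under stated assumptions, not the framework this article designs: the `Utterance` record, the `recognize(audio_path, grammar_path)` callable, and the file names are hypothetical stand-ins for whatever interface a particular voice server exposes, and the article does not say how accuracy is scored, so the sketch reports two common choices (exact-match sentence accuracy and word error rate).

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Utterance:
    audio_path: str    # pre-recorded audio file (hypothetical path)
    grammar_path: str  # grammar file assumed active for this utterance
    expected: str      # reference transcript for the audio


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance: substitutions + insertions + deletions."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def run_offline_test(
    corpus: Iterable[Utterance],
    recognize: Callable[[str, str], str],
) -> Dict[str, float]:
    """Run every corpus entry through `recognize` and summarize accuracy.

    `recognize` is an assumed adapter around a voice server; each call is
    independent, so every utterance is tested in "zero" context.
    """
    items = list(corpus)
    exact = errors = ref_words = 0
    for utt in items:
        hypothesis = recognize(utt.audio_path, utt.grammar_path)
        if hypothesis.lower().split() == utt.expected.lower().split():
            exact += 1
        errors += word_errors(utt.expected, hypothesis)
        ref_words += len(utt.expected.split())
    return {
        "utterances": len(items),
        "sentence_accuracy": exact / len(items) if items else 0.0,
        "word_error_rate": errors / ref_words if ref_words else 0.0,
    }


if __name__ == "__main__":
    # Stub recognizer with canned results, standing in for a real voice server.
    canned = {"a.wav": "check my balance", "b.wav": "transfer funds to saving"}
    corpus = [
        Utterance("a.wav", "bank.grxml", "check my balance"),
        Utterance("b.wav", "bank.grxml", "transfer funds to savings"),
    ]
    print(run_offline_test(corpus, lambda audio, grammar: canned[audio]))
```

Because each utterance is recognized in isolation against a fixed grammar, a script like this shares the limitation described above: it cannot reproduce the changing speech context of a live dialog.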

2) Test the application yourself by making a manual call (assuming it is a call centre voice application). With this method, you can interact with the application online and go through many branches quickly.

time. Since "speech" is dynamic,...