Requirements for Distributed Control of Automatic Speech Recognition (ASR), Speaker Identification/Speaker Verification (SI/SV), and Text-to-Speech (TTS) Resources (RFC4313)

IP.com Disclosure Number: IPCOM000132358D
Original Publication Date: 2005-Dec-01
Included in the Prior Art Database: 2005-Dec-09
Document File: 21 page(s) / 47K

Publishing Venue

Internet Society Requests For Comment (RFCs)

Related People

D. Oran: AUTHOR

Abstract

This document outlines the needs and requirements for a protocol to control distributed speech processing of audio streams. By speech processing, this document specifically means automatic speech recognition (ASR), speaker recognition -- which includes both speaker identification (SI) and speaker verification (SV) -- and text-to-speech (TTS). Other IETF protocols, such as SIP and Real Time Streaming Protocol (RTSP), address rendezvous and control for generalized media streams. However, speech processing presents additional requirements that none of the extant IETF protocols address.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 6% of the total text.

Network Working Group                                            D. Oran
Request for Comments: 4313                           Cisco Systems, Inc.
Category: Informational                                    December 2005


                Requirements for Distributed Control of
                  Automatic Speech Recognition (ASR),
       Speaker Identification/Speaker Verification (SI/SV), and
                     Text-to-Speech (TTS) Resources

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard of any kind.  Distribution of this
   memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Table of Contents

   1. Introduction
      1.1. Document Conventions
   2. SPEECHSC Framework
      2.1. TTS Example
      2.2. Automatic Speech Recognition Example
      2.3. Speaker Identification example
   3. General Requirements
      3.1. Reuse Existing Protocols
      3.2. Maintain Existing Protocol Integrity
      3.3. Avoid Duplicating Existing Protocols
      3.4. Efficiency
      3.5. Invocation of Services
      3.6. Location and Load Balancing
      3.7. Multiple Services
      3.8. Multiple Media Sessions
      3.9. Users with Disabilities
      3.10....