Enhanced output facility of conversational systems in mobile phone communication
Disclosure Number: IPCOM000014847D
Original Publication Date: 2000-Apr-01
Included in the Prior Art Database: 2003-Jun-20
Document File: 6 page(s) / 97K

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 25% of the total text.

  Enhanced output facility of conversational systems in mobile phone communication


    The hotel information application provides access to information by telephone through a dialog that resembles a conversation between two people. The backend is connected to a database containing information on hotels. The user can ask for:

        Hotels in a specific city
        Information on these hotels, e.g. address, phone number
        Facilities, e.g. restaurant, pool
        Prices of rooms

    Availability information and reservations are handled by the dialog, but are not yet implemented in the backend.
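    The kinds of request listed above can be illustrated with a minimal sketch of a backend lookup. Everything in it is an assumption made for illustration: the `Hotel` record, the `find_hotels` function and the sample entries are hypothetical and do not come from the original system, whose database schema is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class Hotel:
    # Hypothetical record layout covering the request types in the text.
    name: str
    city: str
    address: str
    phone: str
    facilities: set = field(default_factory=set)
    room_price: float = 0.0


# Invented sample data, for illustration only.
HOTELS = [
    Hotel("Example Inn", "Berlin", "1 Sample Street", "000-0001",
          {"restaurant"}, 80.0),
    Hotel("Demo Palace", "Berlin", "2 Sample Street", "000-0002",
          {"restaurant", "pool"}, 140.0),
    Hotel("Test Lodge", "Munich", "3 Sample Street", "000-0003",
          {"pool"}, 95.0),
]


def find_hotels(city, facility=None, max_price=None):
    """Return hotels in a city, optionally filtered by facility and price."""
    result = []
    for hotel in HOTELS:
        if hotel.city.lower() != city.lower():
            continue
        if facility is not None and facility not in hotel.facilities:
            continue
        if max_price is not None and hotel.room_price > max_price:
            continue
        result.append(hotel)
    return result
```

    A call such as `find_hotels("Berlin", facility="pool")` would correspond to a user asking for hotels with a pool in a given city.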

    Information given during the dialog by the system is summarized and sent to the user's mobile phone as an SMS message.
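    One possible way to build such a summary is sketched below. It is not the method of the original system, which is not detailed in this abbreviated text. The function name `summarize_for_sms` and the "key: value" layout are assumptions. The 160-character limit is that of a single SMS in the GSM 7-bit alphabet; the sketch counts Python characters, which only approximates that limit for text outside the basic character set.

```python
SMS_MAX_CHARS = 160  # single-message limit for the GSM 7-bit alphabet


def summarize_for_sms(facts, limit=SMS_MAX_CHARS):
    """Join (key, value) facts into one SMS text.

    Facts are added in order; the summary stops at the first fact
    that would push the text past the limit.
    """
    parts = []
    length = 0
    for key, value in facts:
        piece = f"{key}: {value}"
        # Account for the "; " separator before every piece but the first.
        extra = len(piece) + (2 if parts else 0)
        if length + extra > limit:
            break
        parts.append(piece)
        length += extra
    return "; ".join(parts)
```

    Stopping at the first fact that does not fit keeps the message within one SMS; a real system might instead split the summary over several messages.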

    This paper is organized as follows. Section 2 describes the system architecture. Sections 2 to 7 describe the components: speech recognition, NLU (natural language understanding), FDM (form-based dialog manager), backend and answer generation. Finally, we summarize our major findings and outline our future work.

2.1 Architecture NLU/Telephony System

    The system described here is based on the IBM ViaVoice Telephony NLU toolkit. Communication between its components is done via a hub. Figure 1 shows the overall system architecture. The hub works as a dispatcher, calling the modules involved and routing information between them. The telephony interface handles basic telephony functions, such as accepting and disconnecting calls, detecting hang-ups, recording and playing back audio, and detecting DTMF tones. After an utterance has been recorded, the speech recognition module is invoked. The decoded text is passed to the classer, where simple concepts are identified. The canonicaliser extracts canonical values for these basic concepts. The statistical parser computes the semantic parse from the classed sentence. The dialog manager interprets the parser result in the dialog context, requests the required backend information and produces the system reaction to the user utterance, which is then passed to the TTS as well as to a module that generates an SMS message.
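    The flow of one utterance through the hub can be sketched as follows. This is a toy model under stated assumptions, not the IBM ViaVoice Telephony NLU toolkit API: the `Hub` class, the module names and the stub handlers in `build_demo_hub` are all hypothetical, and each stub returns a fixed or trivially derived value in place of a real engine.

```python
class Hub:
    """Dispatcher that calls registered modules and routes data between them."""

    def __init__(self):
        self._modules = {}

    def register(self, name, handler):
        self._modules[name] = handler

    def call(self, name, payload):
        return self._modules[name](payload)

    def handle_utterance(self, audio):
        text = self.call("recognizer", audio)         # decoded text
        classed = self.call("classer", text)          # simple concepts identified
        values = self.call("canonicaliser", classed)  # canonical concept values
        parse = self.call("parser", classed)          # parse of the classed sentence
        reaction = self.call("dialog_manager",
                             {"parse": parse, "values": values})
        # The system reaction is passed to both output channels.
        return {"tts": self.call("tts", reaction),
                "sms": self.call("sms_generator", reaction)}


def build_demo_hub():
    """Wire up stub modules (all hypothetical) to show the routing order."""
    hub = Hub()
    hub.register("recognizer", lambda audio: "hotels in berlin")
    hub.register("classer", lambda text: {
        "sentence": text.replace("berlin", "CITY"),
        "concepts": {"CITY": "berlin"},
    })
    hub.register("canonicaliser", lambda classed: {
        name: value.title() for name, value in classed["concepts"].items()
    })
    hub.register("parser", lambda classed: {
        "intent": "list_hotels" if "hotels" in classed["sentence"] else "unknown"
    })
    hub.register("dialog_manager", lambda ctx:
                 f"{ctx['parse']['intent']} {ctx['values'].get('CITY', '')}".strip())
    hub.register("tts", lambda reaction: f"SPEAK: {reaction}")
    hub.register("sms_generator", lambda reaction: f"SMS: {reaction}")
    return hub
```

    Because modules only talk to the hub, an additional output module such as the SMS generator can be registered without changing the recognizer, parser or dialog manager.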



[Figure 1: System architecture. Component labels: SMS Generator, TTS Engine, Telephony Interface, Dialog Parser, Reco Engine]

Infrastructure for sending SMS messages and faxes

    Conversational systems are used as natural human-machine interfaces to information systems or enterprise data. Usually the information flow in telephone-based conversational systems consists of spoken input and spoken output. The speaker utters a request and receives a spoken system response. For longer or more complicated messages, a spoken response soon becomes cumbersome. One approach to overcoming this shortcoming is a fax callback solution, in which one has to call a dedicated fax server from a fax device. This fax server then sends back the requested information as a fax. However, this means the loss of the conversational interface. Furthermore, this solution is hardly applicable in a mobile communication situ...