
Method and Apparatus for Customizing Text-to-Audio Translation

IP.com Disclosure Number: IPCOM000131128D
Original Publication Date: 2005-Nov-07
Included in the Prior Art Database: 2005-Nov-07
Document File: 1 page(s) / 21K

Publishing Venue

IBM

Abstract

This article describes a method for improving the quality of messages that are read aloud by computer text readers. It addresses two main problems with these systems. The first involves acronyms and abbreviations that would normally be read as a string of letters but would more usefully be read in expanded form. The second involves sequences of punctuation, or large blocks of pasted material, that should either not be read at all or be read in summarized form. Examples are a Java stack trace (which could simply be read as "Exception stack trace" rather than as a sequence of class names and line numbers) and an ellipsis ("dot dot dot", which should not be read at all).

This is the abbreviated version, containing approximately 64% of the total text.


Method and Apparatus for Customizing Text-to-Audio Translation

Several real-time chat clients (such as NotesBuddy) provide a convenient feature whereby incoming text is read aloud by a synthesized voice.

For the avid multitasker, this reduces context switches, since work can continue while incoming chats are read. However, several pitfalls have yet to be worked out. One is the use of industry-specific acronyms such as "WAH" (working at home) or "OTP" (on the phone). It would be convenient to have these acronyms read in their expanded form. Additionally, content that is copied and pasted into a chat window (such as text art, a Java stack trace, or source code) can result in extremely annoying sequences of synthesized monologue such as "ampersand, period, comma, period." The ability to skip over this type of content would be very helpful.

Accomplishing both of these tasks involves string parsing and pattern recognition. Our particular implementation adds a chain of filters to the front of the text-to-audio translator. Each filter is intended to make the input string easier to listen to, either by translating acronyms into phrases or by suppressing a string so that it is not read aloud. Filters may be implemented in any of several ways, but regular expression substitution (s///) statements are the first that come to mind. A configuration panel for the application in question would provide two text fields: one labeled "match string" a...
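The filter chain described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from the disclosure: the names make_substitution_filter, FILTERS, and prepare_for_speech are hypothetical, the stack-trace pattern is only a rough guess at what such a rule might look like, and pairing each match string with a replacement string is an assumption, since the source text is cut off before it names the second configuration field.

```python
import re
from typing import Callable, List

# A filter takes the text about to be spoken and returns a cleaned-up version.
Filter = Callable[[str], str]


def make_substitution_filter(pattern: str, replacement: str, flags: int = 0) -> Filter:
    """Build one filter from a (match string, replacement) pair, like s///g.

    The replacement half of the pair is an assumption made for this sketch.
    """
    compiled = re.compile(pattern, flags)

    def apply(text: str) -> str:
        return compiled.sub(replacement, text)

    return apply


# Hypothetical rules; a real client would load these from user configuration.
FILTERS: List[Filter] = [
    # Expand acronyms into phrases.
    make_substitution_filter(r"\bWAH\b", "working at home"),
    make_substitution_filter(r"\bOTP\b", "on the phone"),
    # Collapse a run of Java-style stack frames ("at pkg.Class.method(File.java:N)")
    # into a short spoken summary.
    make_substitution_filter(
        r"(?:^[ \t]*at [\w.$<>]+\(.*\)[ \t]*\n?)+",
        "Exception stack trace\n",
        re.MULTILINE,
    ),
    # Drop an ellipsis so it is not read as "dot dot dot".
    make_substitution_filter(r"\.{3,}", ""),
]


def prepare_for_speech(text: str, filters: List[Filter] = FILTERS) -> str:
    """Run the text through every filter, in order, before speech synthesis."""
    for apply_filter in filters:
        text = apply_filter(text)
    return text
```

A call such as prepare_for_speech("I am WAH today... ping me") would hand the string "I am working at home today ping me" to the text-to-audio translator. Because each filter is just a function from string to string, rules can be added, removed, or reordered without touching the translator itself.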