Speech Editor

IP.com Disclosure Number: IPCOM000038897D
Original Publication Date: 1987-Mar-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 3 page(s) / 76K

Publishing Venue

IBM

Related People

Kuroda, A: AUTHOR [+2]

Abstract

A speech data management scheme for efficient speech data editing is proposed. A speech editor provides a user with a means of editing speech data on a personal computer. For fine-grained, interactive editing, the speech editor shows visualized speech, such as a power profile, on its screen. In an edit session, the user looks at the visualized speech, points to the location where a modification is to be applied, and then issues an edit command by selecting it from the icon menu. In a speech editor that lets a user edit a huge amount of speech data, such as PCM, ADPCM, or LSP data, edit command execution time is a key factor for a good user interface. The data management method proposed here provides a better user interface for speech editing.

This is the abbreviated version, containing approximately 54% of the total text.

A structure descriptor makes it possible to edit speech without physically moving a huge amount of speech data, and it greatly reduces command execution time. It is a linear list that represents the logical structure of the speech data file. One node of a structure descriptor is shown in Fig. 1. A pair of pointers, L and R, delimits a section of the visualized speech data held in processor storage. Since one point of visualized speech data corresponds to one frame of speech data, the visualized speech data serves as a map of the speech data file. This means that one node of a structure descriptor indirectly represents a section of the speech data file.

(Image Omitted)

During execution of an edit command, an original node is divided into two or three nodes, and a new node is introduced if required, so that they represent sections of the visualized speech data; the links between the nodes are then updated to reflect the new logical structure of the speech data. No physical movement of the speech data is required. The user need not be aware of this indirect editing and can edit speech as if modifying the speech data directly. Moreover, this method can be implemented under standard operating systems, such as MS-DOS, and speech data edited by this method can coexist with other data files (text, image, graphics, etc.).

Indirect Speech Data Editing with a Structure Descriptor

The execution of a MOVE command, which moves a section of speech data to another location, is described as an example of indirect speech data editing with a structure descriptor, under the following assumptions.

. Speech data : 12-bit PCM (8 kHz sampling)

. Visualized speech data : 8-bit power (128 PCM samples per frame)

. Operating system : MS-DOS on IBM PC-XT

In Fig. 2, the speech data represented by the section between the arrows "a" and "b" is logically moved to the location pointed by t...
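The node splitting and relinking described for the structure descriptor can be illustrated with a minimal Python sketch. It is not the disclosed implementation. Several things here are assumptions made for illustration: a Python list stands in for the linked linear list, each node is an (L, R) pair of inclusive frame indexes into the original data, and the names `split_at`, `move`, and `sample_ranges` are hypothetical. Only the figures of 128 PCM samples per frame and 8 kHz sampling come from the stated assumptions.

```python
from typing import List, Tuple

SAMPLES_PER_FRAME = 128  # from the stated assumptions: 128 PCM samples per frame
SAMPLE_RATE_HZ = 8000    # from the stated assumptions: 8 kHz sampling

# One node: (L, R), the first and last frame index of a section of the
# ORIGINAL speech data. Inclusive bounds are an assumption of this sketch.
Node = Tuple[int, int]


def split_at(nodes: List[Node], pos: int) -> int:
    """Ensure a node boundary exists at logical frame position `pos`.

    Returns the index of the node that starts at `pos` (len(nodes) if `pos`
    is the end of the data). A node is divided in two only when `pos` falls
    strictly inside it.
    """
    if pos < 0:
        raise IndexError("position before start of speech data")
    offset = 0  # logical position of the first frame of nodes[i]
    for i, (left, right) in enumerate(nodes):
        if pos == offset:
            return i
        length = right - left + 1
        if pos < offset + length:
            k = pos - offset  # number of frames kept in the first half
            nodes[i:i + 1] = [(left, left + k - 1), (left + k, right)]
            return i + 1
        offset += length
    if pos == offset:
        return len(nodes)
    raise IndexError("position past end of speech data")


def move(nodes: List[Node], a: int, b: int, t: int) -> None:
    """Logically move frames [a, b) to position t by reordering nodes only.

    `a`, `b`, and `t` are logical frame positions before the move. No speech
    samples are read or written.
    """
    if a >= b or a < t < b:
        raise ValueError("need a < b and a target outside the moved section")
    for pos in (a, b, t):          # create all boundaries first
        split_at(nodes, pos)
    i, j, k = (split_at(nodes, p) for p in (a, b, t))  # now just lookups
    section = nodes[i:j]
    if k <= i:   # target lies before the moved section
        nodes[:] = nodes[:k] + section + nodes[k:i] + nodes[j:]
    else:        # target lies after the moved section (k >= j)
        nodes[:] = nodes[:i] + nodes[j:k] + section + nodes[k:]


def sample_ranges(nodes: List[Node]) -> List[Tuple[int, int]]:
    """Map each node to a half-open range of PCM sample indexes, in play order."""
    return [(l * SAMPLES_PER_FRAME, (r + 1) * SAMPLES_PER_FRAME) for l, r in nodes]


if __name__ == "__main__":
    nodes = [(0, 99)]            # 100 frames = 1.6 s at 16 ms per frame
    move(nodes, 20, 40, 70)      # move frames 20..39 to just before frame 70
    print(nodes)                 # [(0, 19), (40, 69), (20, 39), (70, 99)]
    print(sample_ranges(nodes))  # [(0, 2560), (5120, 8960), (2560, 5120), (8960, 12800)]
```

In this sketch the single original node ends up as four nodes, and the edit touches only a few small tuples regardless of how many samples the moved section covers. The sample ranges are left as sample indexes rather than byte offsets because the text does not state how the 12-bit samples are stored in the file.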