SURROUND VIDEO/MULTI-TRACK AUDIO FOR REAL-TIME GEOLOCATION OF SOUND

IP.com Disclosure Number: IPCOM000008002D
Original Publication Date: 1997-Mar-01
Included in the Prior Art Database: 2002-May-10
Document File: 2 page(s) / 92K

Publishing Venue

Motorola

Related People

M. Landis: AUTHOR [+2]

Abstract

Transcriptions from Dictaphone™ tapes lack information on the speaker's identity during interviews. Knowing the identity of the speaker is critical when creating transcripts for the purpose of a QFD (Quality Function Deployment) analysis. Since QFD interviews typically involve a group of interviewees, multiple video cameras and post-analysis of the tapes would be required to identify the speaker(s) during the transcription phase. This process is prohibitively expensive. Additionally, conventional solutions (video recording, lapel microphones, etc.) are intrusive to the interview process.

This is the abbreviated version, containing approximately 50% of the total text.

MOTOROLA Technical Developments

by M. Landis and A. Hirsbrunner

INTRODUCTION

  Transcriptions from Dictaphone™ tapes lack information on the speaker's identity during interviews. Knowing the identity of the speaker is critical when creating transcripts for the purpose of a QFD (Quality Function Deployment) analysis. Since QFD interviews typically involve a group of interviewees, multiple video cameras and post-analysis of the tapes would be required to identify the speaker(s) during the transcription phase. This process is prohibitively expensive. Additionally, conventional solutions (video recording, lapel microphones, etc.) are intrusive to the interview process.

DETAILED DESCRIPTION

  One solution to this problem is to build a device that sits between the interviewer and all of the interviewees, so that a hemispherical or fish-eye lens can record the interview. Since most interviews take place with the interviewee(s) across a table from the interviewer(s), the entire scene above the plane of the table can be recorded at once.
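
  As an illustration of how a direction above the table relates to a point in the hemispherical image, the following minimal sketch maps an azimuth and elevation to pixel coordinates. It assumes an upward-facing lens with an equidistant fish-eye projection (image radius proportional to the angle from straight up) and a square image; the function name, the `image_size` parameter and the projection model are assumptions made for illustration and are not specified in the disclosure.

```python
import math


def direction_to_pixel(azimuth_deg, elevation_deg, image_size=480):
    """Map a direction above the table to (x, y) pixel coordinates.

    Assumed model: equidistant fish-eye lens pointing straight up, so the
    distance from the image centre grows linearly with the zenith angle and
    the table plane (elevation 0) falls on the edge of the image circle.
    """
    if not 0.0 <= elevation_deg <= 90.0:
        raise ValueError("elevation must lie between 0 (table plane) and 90 (straight up)")

    center = image_size / 2.0
    max_radius = image_size / 2.0            # radius of the image circle in pixels
    zenith_deg = 90.0 - elevation_deg        # angle measured from straight up
    radius = max_radius * zenith_deg / 90.0  # equidistant projection (assumed)

    azimuth = math.radians(azimuth_deg)
    x = center + radius * math.cos(azimuth)
    y = center - radius * math.sin(azimuth)  # image y axis grows downward
    return x, y
```

  Under these assumptions a point directly above the device lands at the image centre, and a speaker seated at table height lands on the rim of the image circle at an angle equal to the speaker's azimuth.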

  By simultaneously processing multiple tracks of audio from a number of sources placed around the periphery of the device with a DSP system, it is possible to calculate the Mercator or polar position(s) of the audio source(s) corresponding to the recorded hemispherical image. This location data can be modulated onto one of the two channels of a stereo video camcorder by the DSP system. The actual (best) audio track can be recorded on the remaining cha...
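
  The disclosure does not state how the DSP system computes the position of a sound source. One common approach, sketched below purely as an assumption, estimates the azimuth of a distant speaker from the differences in arrival time of the sound at microphones placed around the device. The circular microphone layout, the far-field (plane-wave) model, the speed-of-sound constant and all function names are hypothetical; in a real system the arrival-time differences would have to be measured from the audio tracks, for example by cross-correlation, which this sketch does not cover.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature


def mic_positions(count, radius):
    """Hypothetical layout: microphones evenly spaced on a circle around the device."""
    return [(radius * math.cos(2 * math.pi * k / count),
             radius * math.sin(2 * math.pi * k / count)) for k in range(count)]


def simulate_arrival_times(mics, azimuth_deg):
    """Far-field model: a plane wave reaches microphones nearer the speaker earlier."""
    ux = math.cos(math.radians(azimuth_deg))
    uy = math.sin(math.radians(azimuth_deg))
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in mics]


def estimate_azimuth(mics, arrival_times):
    """Least-squares fit of the source direction from arrival-time differences.

    For each microphone i, (p_i - p_0) . u = -c * (t_i - t_0), where u points
    from the device toward the speaker. The 2x2 normal equations are solved
    directly, and the azimuth is returned in degrees within [0, 360).
    """
    x0, y0 = mics[0]
    t0 = arrival_times[0]
    sxx = sxy = syy = bx = by = 0.0
    for (x, y), t in zip(mics[1:], arrival_times[1:]):
        dx, dy = x - x0, y - y0
        path_diff = -SPEED_OF_SOUND * (t - t0)
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * path_diff
        by += dy * path_diff

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("microphones must not all lie on one line")

    ux = (syy * bx - sxy * by) / det
    uy = (sxx * by - sxy * bx) / det
    return math.degrees(math.atan2(uy, ux)) % 360.0
```

  With, say, four microphones on a 10 cm circle, `estimate_azimuth(mics, simulate_arrival_times(mics, 135.0))` recovers the simulated direction to within floating-point error. The resulting azimuth is the kind of polar position that could then be matched against the hemispherical image.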