UNIFIED AUDIO MIXER

IP.com Disclosure Number: IPCOM000248112D
Publication Date: 2016-Oct-26
Document File: 4 page(s) / 792K

Publishing Venue

The IP.com Prior Art Database

Related People

Haohai Sun: AUTHOR

Abstract

A unified automatic audio mixer for both a collaboration digital assistant and meeting participants in a video conference. The mixer has two outputs when used for teleconferencing: one output is generated based on signal-to-noise ratio (SNR), sound level (SL), or signal-to-reverberation ratio (SRR) comparisons, and the other output is generated based on a wake-up word (WUW) detection confidence level comparison. When the mixer is not used for teleconferencing, it has only one output, which is generated based on a WUW detection confidence level comparison.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 39% of the total text.

CISCO SYSTEMS, INC.

DETAILED DESCRIPTION

    Room-based video conferencing endpoint audio mixers can automatically mix all microphone channels in a room (e.g., a meeting room), including table microphones (mics), ceiling mics, wireless mics, beams created by a microphone array, mobile phone mics, and other Internet of Things (IoT) equipment mics in the room. This allows the audio signal to be clearly heard at other endpoints participating in a video conference.

    Audio mixers are also deployed via a multipoint control unit (MCU)/server/cloud, and can mix all the audio channels in a multi-party meeting, including audio signals from meeting rooms, personal endpoints, mobile phone clients, personal computer clients, and other sources.

    Whether deployed at an endpoint or via an MCU/server/cloud, these audio mixers have similar properties. They mix the audio channel(s) with the highest signal-to-noise ratio (SNR), sound level (SL), or signal-to-reverberation ratio (SRR), and output a mono, stereo, or multichannel signal. The mixer output is transmitted to the other endpoints of the video conference and reproduced through their speakers.
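
    The selection step described above can be illustrated with a minimal sketch. It assumes the per-channel SNR values have already been estimated elsewhere, and it simply averages the top-ranked channels into a mono signal. The function name select_and_mix, the top_n parameter, and the sample values are hypothetical and do not come from the disclosure; SL or SRR could be substituted for SNR as the ranking key.

```python
from typing import Dict, List, Sequence


def select_and_mix(channels: Dict[str, Sequence[float]],
                   snr_db: Dict[str, float],
                   top_n: int = 1) -> List[float]:
    """Average the top_n channels, ranked by SNR, into one mono signal.

    Illustrative only: a real mixer would also smooth gain changes over
    time instead of switching channels abruptly.
    """
    if not channels or top_n < 1:
        raise ValueError("need at least one channel and top_n >= 1")
    # Rank channel names from highest to lowest SNR.
    ranked = sorted(channels, key=lambda name: snr_db[name], reverse=True)
    chosen = ranked[:top_n]
    # Mix only as many samples as every chosen channel provides.
    length = min(len(channels[name]) for name in chosen)
    return [sum(channels[name][i] for name in chosen) / len(chosen)
            for i in range(length)]


# Hypothetical sample data: two microphones, three samples each.
frames = {
    "table_mic": [0.5, -0.5, 0.25],
    "ceiling_mic": [0.1, -0.1, 0.05],
}
snr = {"table_mic": 24.0, "ceiling_mic": 9.0}
mono = select_and_mix(frames, snr, top_n=1)
```

    With top_n=1 the output is the table microphone alone, because it has the higher SNR; a larger top_n blends several of the best channels.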

Copyright 2016 Cisco Systems, Inc.

    Some digital collaboration assistants are based on a voice control interface. See, e.g., http://www.lightreading.com/enterprise-cloud/cisco-developing-monica-digital-assistant/d/d-id/725561. The mixer output referred to above may be sent not only to the meeting participants (i.e., humans) but also, when a wake-up word (WUW) is detected, to a collaboration digital assistant (i.e., a machine). However, traditional mixers optimized for communications may not be suitable for simultaneous use by a digital collaboration assistant.

    Often, in a meeting with multiple attendees (in real or virtual rooms), a first user may speak relatively loudly, while a second user may speak relatively softly to a digital collaboration assistant (e.g., "Monica, where is my next meeting?"; "Monica, start recording"; "Monica, show the PowerPoint on my laptop"; or "Monica, turn off the air conditioning"). In this case, if a traditional audio mixer is employed, the speech signal of the first user may always be dominant because, as mentioned above, all captured audio channels are mixed and sent to both the meeting participants and the digital collaboration assistant when a...
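
    The loud-talker and soft-talker scenario can be sketched with the two comparisons named in the abstract: a level-based comparison for the participant output and a WUW detection confidence comparison for the assistant output. This is a minimal sketch under stated assumptions, not the disclosed implementation. The Channel fields, the pick_outputs function, the confidence scale of 0 to 1, and the wuw_threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Channel:
    name: str
    sound_level_db: float   # assumed per-channel level estimate
    wuw_confidence: float   # assumed WUW detector score in [0, 1]


def pick_outputs(channels: List[Channel],
                 in_conference: bool,
                 wuw_threshold: float = 0.5
                 ) -> Tuple[Optional[str], Optional[str]]:
    """Return (participant_channel, assistant_channel) names.

    The participant output exists only during a conference and follows
    the loudest channel. The assistant output follows the channel with
    the highest WUW confidence, if that confidence reaches the
    (assumed) threshold.
    """
    participant = None
    if in_conference and channels:
        participant = max(channels, key=lambda c: c.sound_level_db).name
    best = max(channels, key=lambda c: c.wuw_confidence, default=None)
    assistant = None
    if best is not None and best.wuw_confidence >= wuw_threshold:
        assistant = best.name
    return participant, assistant


# Hypothetical scenario: user 1 talks loudly, user 2 softly says the WUW.
loud = Channel("user1_mic", sound_level_db=-10.0, wuw_confidence=0.05)
soft = Channel("user2_mic", sound_level_db=-30.0, wuw_confidence=0.90)

meeting_outputs = pick_outputs([loud, soft], in_conference=True)
standalone_outputs = pick_outputs([loud, soft], in_conference=False)
```

    In this sketch the participants hear the loud talker's channel, while the assistant receives the soft talker's channel because the selection for that output ignores level and compares WUW confidence only.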