Cognitive Audio Management in 360 Degree Video

IP.com Disclosure Number: IPCOM000247265D
Publication Date: 2016-Aug-18
Document File: 6 page(s) / 111K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a system and method to improve the sound delivered to users watching video captured in 360 degrees. The 360-degree video camera is composed of multiple cameras installed on different sides, each with its own unidirectional microphone. This design not only allows the volume to increase or decrease automatically with the zoom feature, but also accounts for configurations in which the speakers are not perfectly aligned.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 28% of the total text.

When a videographer captures a 360-degree video and then plays it back, the appropriate sound level does not always accompany the video, regardless of whether the user has a perfectly symmetrical sound system. For example, background noise might obscure the sound of the desired video object (e.g., upon playback, the viewer cannot hear a waterfall because of a lawn mower operating in the background).

Known methods enable the attaching of sound to objects such that the sound emits in a geometrically correct fashion, even if the object is not in view. However, these do not enable the adjustment of the sound of the zoomed-in object (e.g., waterfall) in addition to the surrounding objects (e.g., lawn mower). Another technology enables the automatic increase or decrease of volume in association with the zoom feature; however, this does not account for configurations in which the speakers are not perfectly aligned.

The novel contribution is a 360-degree video camera composed of multiple cameras installed on different sides. Each camera has a unidirectional microphone. This design allows the videographer to zoom in or out on any side while capturing directional audio in the video. At playback, the system accounts for non-symmetrical speaker placement.
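As a rough illustration, tying each unidirectional microphone's contribution to the zoom control could be sketched as follows. The angle convention, the cosine weighting, and the use of zoom level as a sharpening exponent are assumptions for illustration, not details from the disclosure:

```python
import math

def directional_gains(mic_angles_deg, zoom_angle_deg, zoom_level):
    """Weight each unidirectional microphone by how well its axis aligns
    with the zoom direction; a higher zoom level narrows the emphasis.
    The weighting scheme here is an illustrative assumption."""
    gains = []
    for mic in mic_angles_deg:
        # Smallest angular difference between the mic axis and the zoom direction.
        diff = abs((mic - zoom_angle_deg + 180) % 360 - 180)
        # Cosine-shaped alignment, clipped at zero, sharpened by zoom level.
        alignment = max(math.cos(math.radians(diff)), 0.0)
        gains.append(alignment ** zoom_level)
    return gains
```

With four cameras at 0, 90, 180, and 270 degrees and a zoom toward 0 degrees, the forward microphone keeps full gain while the rear microphone is silenced; side microphones are attenuated progressively as the zoom level rises.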

This 360-degree video camera with multiple cameras can adjust the sound of the zoomed object as well as the surrounding objects, just as if the videographer were physically positioned at the zoomed-in point (even though the user is not actually in that position). The associated system adjusts every sound from every direction to impart the feeling that the user is physically at the zoomed-in point/location.
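A minimal sketch of remixing the scene as if the listener stood at the zoomed-in point might look like the following. The 2-D coordinates, the simple 1/r attenuation model, and the `remix_at_point` helper are illustrative assumptions rather than the disclosure's actual method:

```python
import math

def remix_at_point(sources, listener_pos, ref_dist=1.0):
    """Scale each captured source by inverse distance from a virtual
    listener placed at the zoomed-in point.

    sources: {name: (x, y, captured_level)} -- positions are assumptions.
    """
    lx, ly = listener_pos
    mixed = {}
    for name, (x, y, level) in sources.items():
        # Distance from the virtual listener, floored at a reference distance.
        d = max(math.hypot(x - lx, y - ly), ref_dist)
        mixed[name] = level * (ref_dist / d)  # simple 1/r attenuation
    return mixed
```

Placing the virtual listener at the waterfall raises the waterfall's relative level and pushes the distant lawn mower into the background, approximating the "physically at the zoomed-in point" effect described above.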

If the speakers are not perfectly aligned, then the system makes adjustments based on the actual alignment of the speakers relative to the videographer's sitting position. The system takes the sounds of all objects surrounding the zoomed-in point and makes further appropriate adjustments based on the speaker alignment. If the surround sound is not perfectly symmetrical, then the system might automatically move some sound to another speaker or adjust volumes based on the asymmetrical distances of the speakers.
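One possible reading of the distance-based volume adjustment is sketched below, under the assumption of a simple inverse-distance loudness model; the speaker coordinates and the `speaker_trims` helper are hypothetical:

```python
import math

def speaker_trims(speaker_positions, listener_pos):
    """Compute a per-speaker gain trim so that speakers farther from the
    listening position play proportionally louder, compensating for
    asymmetric placement. A 1/r loudness model is assumed."""
    lx, ly = listener_pos
    dists = [math.hypot(x - lx, y - ly) for x, y in speaker_positions]
    nearest = min(dists)
    # Boost each speaker in proportion to its extra distance.
    return [d / nearest for d in dists]
```

For example, with one speaker at 1 m and another at 2 m from the seat, the farther speaker receives twice the trim so both arrive at comparable levels.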

In addition, the system can make adjustments based on cognitive analysis, per user configuration. To implement the cognitive-based adjustment, the system must be aware of the videographer's preferences for hearing certain sounds. For example, a user is recording a video of a waterfall, but there is a lawn mower in the background. Based on the zoom spot on the waterfall, users who prefer a real-life experience configure the system to increase the lawn mower sound. Users who prefer to focus on the desirable sounds close to the zoomed-in subject configure the system to increase the sound close to the waterfall and decrease (or not increase) the sound of the lawn mower. Thus, the system can make an adjustment only if desired, based on its understanding of the us...
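The preference-driven adjustment could be configured along these lines. The profile names ("real_life", "focus") and the gain factors are invented for illustration and are not specified in the disclosure:

```python
def apply_preference(object_levels, zoom_target, profile):
    """Adjust per-object sound levels according to a user preference profile.

    object_levels: {name: level}; zoom_target: name of the zoomed-in object.
    Profile names and gain factors are illustrative assumptions.
    """
    adjusted = {}
    for name, level in object_levels.items():
        if profile == "focus":
            # Emphasize the zoomed subject, attenuate everything else.
            adjusted[name] = level * (1.5 if name == zoom_target else 0.5)
        else:
            # "real_life": preserve the full ambient scene as captured.
            adjusted[name] = level
    return adjusted
```

A "focus" user zoomed on the waterfall would hear the waterfall boosted and the lawn mower attenuated; a "real_life" user would hear the scene unchanged.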