Publishing Venue
IBM
Abstract
Disclosed is a system and method for analyzing multimedia using a
novel approach called Computational Media Aesthetics. Media aesthetics
is a process of examination of media elements such as lighting, picture
composition, and sound by themselves and a study of their role in
manipulating our perceptual reactions, in communicating messages
aesthetically, and in synthesizing effective productions. We define
computational media aesthetics as the algorithmic study of a number of
image and aural elements in media and the computational analysis of the
principles that have emerged underlying their use and manipulation,
individually or jointly, in the creative art of clarifying, intensifying,
and interpreting some event for the audience. This field enables
distilling techniques and criteria to create efficient, effective, and
predictable messages the first time around in media communications, and to
provide a handle on evaluating relative communication effectiveness of
media elements during TV/film production. Whilst the area of affective
computing aims to understand and enable computers to respond to emotions of
users, computational media aesthetics aims to understand how directors use
visual and aural elements to enhance the emotional experience for the
audience.
System and Method for Computational Media Aesthetics: An Algorithmic Study of the Use of Sound and Images in Digital Video and TV/Film for Augmented Content Annotation and Production, and Mass Communication
Disclosed is a system and method for analyzing multimedia to determine high-level semantic descriptions of its content.
With the explosion of online media and media-based services, a key challenge in media management is automating content annotation, indexing, and organization for efficient access, search, retrieval, and browsing. One of the major failings of current media annotation systems is the semantic gap, which refers to the discontinuity between the simplicity of the features or content descriptions that can currently be computed automatically and the richness of the semantics in user queries posed for media search and retrieval. This paper proposes an approach that aims to bridge the semantic gap and to build innovative content annotation and navigation services. The approach is founded upon an understanding of media elements and their role in the synthesis and manipulation of program content, gained through a systematic study of media productions. It proposes a framework for computational understanding of the dynamic nature of narrative structure and techniques via analysis of the integration and sequencing of audio/visual elements. The resulting system will lead to automatic content organization and interpretation that provides high-level, high-quality content descriptions to aid in search, retrieval, and browsing, and also to an objective and consistent distillation of the common features of successful audio-visual strategies.
To address this issue, we depart from existing approaches to deriving video content descriptions. Motivated and directed by video production principles, we propose an approach that goes beyond representing what is directly shown in a video or a movie, and aims to understand the semantics of the content portrayed and to harness the emotional, visual appeal of the content seen. It focuses on deriving a computational scheme to analyze and understand the content of video and its form. Accepted rules and techniques in video production are used by directors worldwide to solve the problems presented by the task of transforming a story from a written script into a captivating narration [Arijon, 1976]. These rules, termed film grammar in the movie domain, refer to the repeated use of certain objects, visual imagery, and patterns in many films to instantly evoke a specific cinematic experience in viewers [Zettl, 1999; Sobchack and Sobchack, 1987]. The rules and icons serve as shorthand for compressing story information, characters, and themes into familiar formulae, often becoming the elements of a genre production. They constitute a style or form of artistic expression that is characteristic of the content portrayed, and can be considered to be almost idiomatic in the language of a...