
System and Method for Computational Media Aesthetics: An Algorithmic Study of the Use of Sound and Images in Digital Video and TV/Film for Augmented Content Annotation and Production, and Mass Communication

IP.com Disclosure Number: IPCOM000010281D
Original Publication Date: 2002-Nov-15
Included in the Prior Art Database: 2002-Nov-15
Document File: 5 page(s) / 53K

Publishing Venue

IBM

Abstract

Disclosed is a system and method for analyzing multimedia using a novel approach called Computational Media Aesthetics. Media aesthetics is the examination of media elements such as lighting, picture composition, and sound in their own right, together with the study of their role in manipulating our perceptual reactions, communicating messages aesthetically, and synthesizing effective productions. We define computational media aesthetics as the algorithmic study of image and aural elements in media and the computational analysis of the principles that have emerged underlying their use and manipulation, individually or jointly, in the creative art of clarifying, intensifying, and interpreting an event for the audience. This field makes it possible to distill techniques and criteria for creating efficient, effective, and predictable messages the first time around in media communications, and provides a handle on evaluating the relative communication effectiveness of media elements during TV/film production. Whereas affective computing aims to understand users' emotions and enable computers to respond to them, computational media aesthetics aims to understand how directors use visual and aural elements to enhance the emotional experience of the audience.

This is the abbreviated version, containing approximately 22% of the total text.



Disclosed is a system and method for analyzing multimedia to determine high-level semantic descriptions of its content.

With the explosion of online media and media-based services, a key challenge in media management is automating content annotation, indexing, and organization for efficient access, search, retrieval, and browsing. One of the major failings of current media annotation systems is the semantic gap: the discontinuity between the simplicity of the features or content descriptions that can currently be computed automatically and the richness of the semantics in user queries posed for media search and retrieval. This paper proposes an approach that aims to bridge the semantic gap and to build innovative content annotation and navigation services. The approach is founded on an understanding of media elements and their role in synthesizing and manipulating program content, together with a systematic study of media productions. It proposes a framework for computational understanding of the dynamic nature of narrative structure and techniques via analysis of the integration and sequencing of audio/visual elements. The resulting system will lead to automatic content organization and interpretation that provides high-level, high-quality content descriptions to aid search, retrieval, and browsing, and to an objective and consistent distillation of the common features of successful audio-visual strategies.

To address this issue, we depart from existing approaches to deriving video content descriptions. Motivated and directed by video production principles, we propose an approach that goes beyond representing what is directly shown in a video or movie: it aims to understand the semantics of the content portrayed and to harness the emotional and visual appeal of the content seen. It focuses on deriving a computational scheme to analyze and understand both the content of video and its form. Directors worldwide use accepted rules and techniques of video production to solve the problems posed by transforming a story from a written script into a captivating narration [Arijon, 1976]. These rules, termed film grammar in the movie domain, refer to the repeated use of certain objects, visual imagery, and patterns across many films to instantly evoke a specific cinematic experience in viewers [Zettl, 1999; Sobchack and Sobchack, 1987]. The rules and icons serve as shorthand for compressing story information, characters, and themes into familiar formulae, often becoming the elements of a genre production. They constitute a style or form of artistic expression that is characteristic of content portrayed, and can be considered to be almost idiomatic in the language of a...