A method to improve the user experience of presentation video

IP.com Disclosure Number: IPCOM000239566D
Publication Date: 2014-Nov-17
Document File: 8 page(s) / 112K

Publishing Venue

The IP.com Prior Art Database

Abstract

This invention discloses a method to improve the user experience of presentation-style videos (videos that are based on presentations or other educational material). It retrieves the video content using OCR techniques and voice recognition mechanisms, then automatically generates a table of contents.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 43% of the total text.

Methods for content-based search and information retrieval of multimedia data are less prevalent than those for text information.

We can use tags to classify main characteristics and short descriptions/titles to provide a summary for video files.

However, users still need to watch the whole video to get the gist of it and to locate the parts they are interested in. If users do not record the time stamp, they need to go through the video again to find a specific place. There is no quick way to search and retrieve content in the video.

We propose this method to improve the experience of users when they watch videos and search for video content.

This invention discloses a method to improve the user experience of presentation-style videos (videos that are based on presentations or other educational material). It retrieves the video content using OCR techniques and voice recognition mechanisms, then automatically generates a table of contents. Users can use the table of contents to better understand the video content and structure, and can also use it to navigate through the video. Each link in the table of contents also maps to the scroll bar of the video, so that users can see the elapsed time of each segment.
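The mapping from table-of-contents entries to the scroll bar can be illustrated with a minimal sketch. The names `TocEntry` and `scroll_bar_marks` are hypothetical, and the sketch assumes that each segment runs until the start of the next entry (or the end of the video); the disclosure does not specify how segment boundaries are stored.

```python
from dataclasses import dataclass


@dataclass
class TocEntry:
    """Hypothetical TOC entry: a title and the time where its segment starts."""
    title: str
    start_s: float  # segment start time in seconds


def scroll_bar_marks(entries, video_duration_s):
    """Return (title, start_fraction, length_s) for each TOC entry.

    start_fraction is the entry's position along the scroll bar in [0, 1].
    length_s is the elapsed time of the segment, assumed here to be the gap
    to the next entry (or to the end of the video for the last entry).
    """
    ordered = sorted(entries, key=lambda e: e.start_s)
    marks = []
    for i, entry in enumerate(ordered):
        if i + 1 < len(ordered):
            end_s = ordered[i + 1].start_s
        else:
            end_s = video_duration_s
        marks.append(
            (entry.title, entry.start_s / video_duration_s, end_s - entry.start_s)
        )
    return marks
```

A player could draw one marker per returned fraction on the scroll bar and show `length_s` next to the matching link in the table of contents.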

The method retrieves keywords from the presentation content and generates an index for quick searching.

It also analyzes the video to show important content by highlighting the corresponding segments in the table of contents and on the scroll bar. Video segments are linked to each other based on keywords and content. Users can jump between linked video segments to focus on the same content.
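The keyword index and the linkage between segments can be sketched as an inverted index. This is only one possible reading: the function names are hypothetical, keywords are assumed to be already extracted per segment, and two segments are treated as linked when they share at least one keyword.

```python
from collections import defaultdict


def build_keyword_index(segment_keywords):
    """Map each lower-cased keyword to the sorted list of segment ids containing it.

    segment_keywords: dict of segment id -> iterable of keyword strings.
    """
    index = defaultdict(set)
    for segment_id, keywords in segment_keywords.items():
        for keyword in keywords:
            index[keyword.lower()].add(segment_id)
    return {keyword: sorted(ids) for keyword, ids in index.items()}


def linked_segments(index, segment_id):
    """Return the other segments that share at least one keyword with segment_id."""
    linked = set()
    for ids in index.values():
        if segment_id in ids:
            linked.update(ids)
    linked.discard(segment_id)
    return sorted(linked)
```

Searching for a keyword is then a dictionary lookup in the index, and the result of `linked_segments` gives the jump targets for the segment currently being watched.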

Advantages:

1. By using the generated table of contents with highlights and indexes, users can easily locate and retrieve the content and search for any content they are interested in.

2. Segmented video contents can be sorted and reused.

3. Video structure can be reorganized by moving the segments around.

How to generate the TOC (Table of contents):

Step 1: Information Recognizor retrieves the frames and text in the video

The text with its visual meta info (time stamp, position in the frame, font size, style and color) is recognized by OCR from each frame (frame text).

The text with its audio meta info (time stamp, speech speed) is recognized from the video by speech-to-text technology (speech text).

The frame text and speech text are then pushed to the Index Recognizor.
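The two kinds of records produced in Step 1 can be sketched as simple data classes. The field names, units and the `merge_by_time` helper are assumptions made for illustration; the disclosure only lists which meta info accompanies each kind of text, not how it is represented or ordered.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FrameText:
    """Text recognized by OCR in one frame, with its visual meta info."""
    text: str
    timestamp_s: float
    position: Tuple[float, float]  # assumed: (x, y) of the text, normalized to [0, 1]
    font_size: float
    style: str
    color: str


@dataclass
class SpeechText:
    """Text recognized by speech-to-text, with its audio meta info."""
    text: str
    timestamp_s: float
    speech_speed_wpm: float  # assumed unit: words per minute


def merge_by_time(frame_texts, speech_texts):
    """Combine both record lists into one stream ordered by time stamp.

    Ordering by time before handing the records to the next step is an
    assumption of this sketch, not something the disclosure states.
    """
    return sorted([*frame_texts, *speech_texts], key=lambda r: r.timestamp_s)
```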

Step 2: Index Recognizor analyses the index information from the frame text and speech text

If the frame text can be recognized as a top-down list by its position in the frame, this text might be the agenda. This frame should be indexed with its time stamp.
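The agenda rule above can be sketched as a simple layout heuristic. Everything specific here is an assumption: the function names, the thresholds `min_items` and `x_tolerance`, and the reading of "top-down list" as several text items that share roughly the same left edge and sit on distinct rows.

```python
def looks_like_agenda(items, min_items=3, x_tolerance=0.05):
    """Guess whether the text items of one frame form a top-down list.

    items: list of (text, x, y) tuples with normalized coordinates in [0, 1].
    The frame qualifies when it has at least min_items items, their left
    edges differ by no more than x_tolerance, and no two share the same row.
    """
    if len(items) < min_items:
        return False
    xs = [x for _, x, _ in items]
    ys = [y for _, _, y in items]
    left_aligned = max(xs) - min(xs) <= x_tolerance
    distinct_rows = len(set(ys)) == len(ys)
    return left_aligned and distinct_rows


def agenda_timestamps(frames, **kwargs):
    """Return the time stamps of frames that look like an agenda.

    frames: list of (timestamp_s, items) pairs, items as in looks_like_agenda.
    """
    return [ts for ts, items in frames if looks_like_agenda(items, **kwargs)]
```

A real recognizer would likely also use bullet characters, numbering or line spacing; the thresholds here would need tuning against actual slides.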

If the frame text is highlighted at the top of the frame (larger font, different color, etc.), this text might be the title of some content and should...