Method and System for Generating Thumbnails for a File

IP.com Disclosure Number: IPCOM000202522D
Publication Date: 2010-Dec-20
Document File: 1 page(s) / 18K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system are disclosed for generating a thumbnail for a file based on the content of the file. The thumbnail may be generated using one or more of, but not limited to, the first page of the file and text in the file that captures the content of the file.

This is the abbreviated version, containing approximately 64% of the total text.

Disclosed is a method and system for automatically generating a thumbnail for a file based on the content of the file. The thumbnail may be generated using one or more of, but not limited to, the first page of the file and text in the file that captures the content of the file.

To generate the thumbnail, the file is first uploaded to a content management system. After the upload, the type of the file is checked. In an embodiment, the Multipurpose Internet Mail Extensions (MIME) type is checked. Thereafter, the file is streamed to memory and loaded into a file format reader chosen according to the type of the file. For example, if the file has an OpenDocument Format MIME type, it is loaded into the OpenDocument Format File Object Model (ODFDOM). In another scenario, if the file has a Microsoft Office* MIME type, it is loaded into Apache** POI. If the file has any other MIME type, it may be loaded into a filter.
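The reader selection described above can be sketched as a simple lookup on the MIME type. This is only an illustrative sketch, not the disclosed implementation: the function name select_reader, the returned labels, and the particular sets of MIME types are assumptions made for the example, and the disclosure does not say which MIME types each reader handles.

```python
# Illustrative sketch only: names and MIME-type sets are assumptions,
# not taken from the disclosure.

# Common OpenDocument Format MIME types.
ODF_MIME_TYPES = {
    "application/vnd.oasis.opendocument.text",
    "application/vnd.oasis.opendocument.spreadsheet",
    "application/vnd.oasis.opendocument.presentation",
}

# Common Microsoft Office MIME types (binary and Office Open XML formats).
MS_OFFICE_MIME_TYPES = {
    "application/msword",
    "application/vnd.ms-excel",
    "application/vnd.ms-powerpoint",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def select_reader(mime_type: str) -> str:
    """Return a label for the reader family that should load the file."""
    # Drop parameters such as "; charset=..." and normalize case.
    normalized = mime_type.split(";", 1)[0].strip().lower()
    if normalized in ODF_MIME_TYPES:
        return "ODFDOM"
    if normalized in MS_OFFICE_MIME_TYPES:
        return "Apache POI"
    # Any other MIME type goes to the generic filter path.
    return "filter"
```

In a real system the returned label would be replaced by a call into the corresponding library; the string labels here only stand in for that step.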

Once the file is loaded into the file format reader, one or more pages of the file are analyzed, starting from the first page. If the first page contains a heading or a title, the heading or title is captured. If the first page contains no text, the process is repeated for the subsequent pages. If no text is found on any page of the file, metadata describing the images in the page is extracted....
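The page scan described above might look roughly like the following sketch. The Page class, its fields, and the function pick_thumbnail_text are hypothetical stand-ins for whatever the file format reader exposes. Two behaviors are assumptions, because the available text does not specify them: a page with body text but no heading contributes its first non-empty line, and image metadata is gathered from all pages in order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Page:
    """Hypothetical, simplified view of one page returned by a reader."""
    heading: Optional[str] = None
    text: str = ""
    image_metadata: List[str] = field(default_factory=list)


def pick_thumbnail_text(pages: List[Page]) -> Optional[str]:
    """Choose the text used for the thumbnail, scanning pages in order."""
    for page in pages:
        # A heading or title on the page is captured first.
        if page.heading and page.heading.strip():
            return page.heading.strip()
        # Assumption: with body text but no heading, use the first
        # non-empty line of that text.
        for line in page.text.splitlines():
            if line.strip():
                return line.strip()
        # Page has no text at all: continue with the next page.

    # No text on any page: fall back to metadata describing the images.
    # Assumption: metadata from all pages is combined in page order.
    metadata = [m.strip() for p in pages for m in p.image_metadata if m.strip()]
    return "; ".join(metadata) if metadata else None
```

For example, a document whose first page is blank and whose second page carries the heading "Quarterly Report" yields that heading, while a document made up only of images yields the combined image metadata.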