Method and System for Automatic Metadata Extraction and Question Answering in a File Sharing Application

IP.com Disclosure Number: IPCOM000202537D
Publication Date: 2010-Dec-21
Document File: 2 page(s) / 100K

Publishing Venue

The IP.com Prior Art Database

Abstract

This article describes a system and a method for automatically extracting metadata in a file sharing application. The system can also answer questions based on the hosted files.

This is the abbreviated version, containing approximately 55% of the total text.

Existing file sharing applications (such as Cattail) provide file storage, but they have an important drawback: they are not aware of the semantics of the uploaded files. In other words, such an application is merely a repository of unstructured data (unstructured meaning, in this context, without metadata or with only a few insignificant metadata items). It would be beneficial if the file sharing application could also extract information about the content of these files. Using techniques from the Natural Language Processing domain, such as Information Extraction and Question Answering, an additional module plugged into the file sharing application could extract metadata automatically.

This metadata is later used to answer user-specific queries, thereby capturing the large volume of knowledge encapsulated in the uploaded files. Furthermore, information extraction allows logical reasoning to draw inferences from the logical content of the input data.
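
The flow described above (extract metadata when a file is uploaded, then answer queries from that metadata) can be illustrated with a minimal sketch. It is not the disclosed implementation: the class MetadataIndex, its methods on_upload and files_with, and the two regular-expression patterns are hypothetical names chosen for illustration, and simple pattern matching stands in for a real Information Extraction pipeline.

```python
import re

# Hypothetical patterns; a real IE module would use NLP techniques
# (entity recognition, relation extraction) instead of regular expressions.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}


class MetadataIndex:
    """Stores metadata extracted from uploaded files (illustrative only)."""

    def __init__(self):
        # Maps file name -> {field name: sorted list of extracted values}.
        self.metadata = {}

    def on_upload(self, name, text):
        """Extract metadata from the file text and remember it."""
        fields = {}
        for field, pattern in PATTERNS.items():
            values = pattern.findall(text)
            if values:
                fields[field] = sorted(set(values))
        self.metadata[name] = fields
        return fields

    def files_with(self, field, value):
        """Answer a simple query: which files contain this value?"""
        return sorted(
            name
            for name, fields in self.metadata.items()
            if value in fields.get(field, [])
        )


if __name__ == "__main__":
    index = MetadataIndex()
    index.on_upload("report.txt", "Contact alice@example.com about 2010-12-21.")
    print(index.files_with("date", "2010-12-21"))
```

Keeping extraction (on_upload) separate from querying (files_with) mirrors the text: metadata is computed once at upload time and reused for later user queries.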

One possible embodiment of the solution is presented below:

An additional IE (information extraction) application is plugged into the file sharing application. This IE module can be configured so that the user can set which language is used for parsing text docum...