Method and System for Providing Relevancy and Dominance based Text Summarization

IP.com Disclosure Number: IPCOM000241959D
Publication Date: 2015-Jun-11
Document File: 6 page(s) / 146K

Publishing Venue

The IP.com Prior Art Database

Related People

Vinayakumar Kolli: INVENTOR

Abstract

A method and system are disclosed for providing relevancy and dominance based text summarization. The method and system provide more accurate, higher-quality summarization by assigning an importance factor to sentences: the best sentences are extracted based on local relevancy and are then adjusted by each sentence’s importance, which derives from its place-region combination with respect to the document.

This is the abbreviated version, containing approximately 29% of the total text.

Description

The growing amount of information available electronically requires tools for quickly assessing the extent and content of information resources. This has led to ways of presenting that information in a concise and accurate form while staying close to the original content. The goal of a text summarization system is to produce a concise representation of a document or set of documents submitted to it. Depending on the use case, the summary can be a generic summary, which gives an overall sense of the document(s), or a query-relevant summary, which highlights the content that most closely corresponds to a search query or user question. These use cases have motivated an increasing amount of work in the field over the last few years. With the huge and continuously growing volume of online text, it has become necessary to help users get quick answers to their queries. Single-document text summarization can be coupled with conventional search engines to help users quickly evaluate the relevance of documents or navigate through a corpus. There is therefore a need for optimized summarization techniques that consider the relevancy of sentences with respect to their region and position within the whole document.

Disclosed is a method and system for providing relevancy and dominance based text summarization. The method and system utilize an algorithm that provides more accurate, higher-quality summarization based on a novel combination of sentence relevancy computations, which include concepts such as word ranking, sentence similarity, relevancy factor, positional dominance, and regional dominance.

In an implementation, for a given content C of length N, where N is the number of valid sentences available for the summary, a set of representative sentences S is presented with a granularity g, where g is the granularity chosen for C. Each sentence s in S contains the largest number of words most relevant to C, with the sentences taken in decreasing order. Alternatively, each sentence s in S is similar to the largest number of sentences in C, again in decreasing order.
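The two selection criteria above can be illustrated with a minimal Python sketch. It is not the disclosed algorithm: the scoring formulas (summed word frequency and a Jaccard similarity count), the `threshold` parameter, the reading of g as the number of sentences to keep, and all function names are assumptions introduced here for illustration.

```python
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def word_relevance_scores(sentences):
    """Criterion 1 (assumed form): score a sentence by the summed
    content-wide frequency of its distinct words."""
    freq = Counter(w for s in sentences for w in tokenize(s))
    return [sum(freq[w] for w in set(tokenize(s))) for s in sentences]


def jaccard(a, b):
    """Jaccard overlap between two token sets (assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_scores(sentences, threshold=0.2):
    """Criterion 2 (assumed form): count how many other sentences each
    sentence is similar to, using a hypothetical Jaccard threshold."""
    token_sets = [set(tokenize(s)) for s in sentences]
    return [
        sum(1 for j, other in enumerate(token_sets)
            if j != i and jaccard(current, other) >= threshold)
        for i, current in enumerate(token_sets)
    ]


def select_representatives(sentences, scores, g):
    """Return the g highest-scoring sentences in decreasing score order.
    Treating g as a sentence count is an assumption of this sketch."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in order[:g]]
```

Calling `select_representatives(sentences, word_relevance_scores(sentences), g)` applies the word-based criterion, and passing `similarity_scores(sentences)` instead applies the similarity-based alternative. Keeping the scoring separate from the selection lets either criterion be swapped without touching the ordering step.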

Design and analysis of text using the summarization algorithm involves processing content C with Natural Language Processing (NLP) to remove grammatical and linguistic noise and to split the content into valid sentences. The group S of sentences, of size N, is fed into a sentence rank compute system. The output from this system will be sorted by ranks using m...
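As a rough illustration of the flow described so far (clean and split the content, rank each sentence, sort by rank), the following Python sketch may help. The regex-based splitting, the `min_words` cutoff, and the placeholder `length_rank` function are hypothetical stand-ins; the sorting method named in the source is cut off in this abbreviated text, so Python's built-in `sorted` is used here purely as a substitute.

```python
import re


def split_valid_sentences(content, min_words=3):
    """Split raw text into sentences and drop fragments that are too short.

    Both the regex split and the min_words cutoff are illustrative
    stand-ins for the NLP cleanup step, not the disclosed method.
    """
    collapsed = re.sub(r"\s+", " ", content).strip()
    candidates = re.split(r"(?<=[.!?])\s+", collapsed)
    return [c for c in candidates if len(re.findall(r"\w+", c)) >= min_words]


def rank_sentences(sentences, rank_fn):
    """Apply a caller-supplied rank function and sort by rank, highest first.

    Returns (rank, original_index, sentence) tuples so a summary can
    later be restored to document order if desired.
    """
    ranked = [(rank_fn(s, sentences), i, s) for i, s in enumerate(sentences)]
    return sorted(ranked, key=lambda item: item[0], reverse=True)


def length_rank(sentence, _all_sentences):
    """Placeholder rank: number of word tokens (hypothetical, for the demo)."""
    return len(re.findall(r"\w+", sentence))


if __name__ == "__main__":
    text = "Hello.  This is a longer valid sentence!   Ok. Another fairly short one here?"
    for rank, index, sentence in rank_sentences(split_valid_sentences(text), length_rank):
        print(rank, index, sentence)
```

The rank function is passed in as a parameter so that a real relevancy computation could replace the placeholder without changing the splitting or sorting steps.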