An automatic method to logically and objectively retrieve information from world wide web.

IP.com Disclosure Number: IPCOM000242265D
Publication Date: 2015-Jun-30
Document File: 7 page(s) / 178K

Publishing Venue

The IP.com Prior Art Database

Abstract

Retrieving information from the internet is something almost everyone needs to do in the current age of 'information explosion'. However, every person has his or her own characteristics, and therefore different points of interest. Common information providers are not able to provide highly customized information for particular people. As a result, normally not all content on a web page is interesting to a given person, and not all of the content he or she is interested in comes from one web page. Considering this, a new method and system should be built to help resolve this issue and help people get the information that interests them most, faster and more accurately.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

By utilizing cloud capabilities, accessible web pages can easily be downloaded from the internet to cloud server storage. Each web page is parsed and filtered according to a configuration file that the user provides to declare his or her topics or keywords of interest. These contents are first grouped by topic in the system. After that, the system counts the emotion information (positive and negative words) it acquired. Content that has more positive words is grouped into the positive part of the topic, and content that has more negative words is grouped into the negative part of the topic. Content that has equal numbers of positive and negative words is grouped into the neutral part of the topic.
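
The grouping step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists, the function names `classify` and `group_by_topic`, and the simple substring match on keywords are all hypothetical, since the disclosure does not name a lexicon or a matching rule.

```python
from collections import defaultdict

# Hypothetical emotion lexicons; the disclosure does not specify any word list.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "broken"}


def classify(text):
    """Label text as positive, negative, or neutral by counting emotion words."""
    words = [w.strip(".,!?;:\"'()").lower() for w in text.split()]
    positive = sum(w in POSITIVE_WORDS for w in words)
    negative = sum(w in NEGATIVE_WORDS for w in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"  # equal counts, including zero of each


def group_by_topic(items, keywords):
    """Group each text under every configured keyword it mentions,
    split into positive, negative, and neutral parts."""
    groups = defaultdict(lambda: {"positive": [], "negative": [], "neutral": []})
    for text in items:
        lowered = text.lower()
        for keyword in keywords:
            if keyword.lower() in lowered:  # assumed matching rule
                groups[keyword][classify(text)].append(text)
    return groups
```

A real system would likely need a much larger lexicon and some handling of negation; the sketch only mirrors the majority-count rule described in the text.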

The content collected under this disclosure is not limited to words but also includes multimedia content. This is done by analysing the audio information of the multimedia. The audio is played and converted to words, which can then be analysed like any other text.

A topic enters the ready state when it meets one of three conditions:

(1) The topic has at least one positive piece of information and one negative piece of information.
(2) The topic has at least one neutral piece of information.
(3) The topic has reached a time limit.
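
The three conditions above can be expressed as a single check. The following is a minimal sketch, assuming a hypothetical `Topic` record with one list per sentiment part and a creation timestamp; the disclosure does not say how the time limit is measured, so measuring it from topic creation is an assumption.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Topic:
    # Hypothetical structure; field names are not taken from the disclosure.
    name: str
    positive: list = field(default_factory=list)
    negative: list = field(default_factory=list)
    neutral: list = field(default_factory=list)
    created_at: float = field(default_factory=time.time)


def is_ready(topic, time_limit_seconds, now=None):
    """Return True when the topic meets any of the three ready conditions."""
    now = time.time() if now is None else now
    has_both = bool(topic.positive) and bool(topic.negative)   # condition (1)
    has_neutral = bool(topic.neutral)                          # condition (2)
    timed_out = now - topic.created_at >= time_limit_seconds   # condition (3), assumed to count from creation
    return has_both or has_neutral or timed_out
```

Passing `now` explicitly keeps the check deterministic, which makes the time-limit condition easy to exercise without waiting.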

All topics in the ready state are then placed at a RESTful resource location, and their URIs are put into a newly generated web page. Before the topic URIs are pushed to users, an analysis is made among all contents that denote the same topic. Each content is compared with the others, and a similarity is calculated for them.
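
The disclosure does not state which similarity measure is used. As one possible illustration, the sketch below uses Jaccard similarity over word sets, which is an assumption and not the author's stated method; the function names and the URI-to-text mapping are likewise hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two texts: shared tokens divided by all tokens."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0  # two empty texts: treated as no similarity by convention here
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def similarity_to_selected(selected_text, contents_by_uri):
    """Map each sub-URI to its similarity ratio against the selected content."""
    return {uri: jaccard(selected_text, text) for uri, text in contents_by_uri.items()}
```

The resulting values lie between 0 and 1 and could be displayed as a ratio next to each URI. Other measures, such as cosine similarity over term frequencies, would fit the same interface.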

Once a web page is generated, the similarity is calculated, and the user's condition is met, the page is pushed to the device on which the user is currently active. Because the web page is organized only as a group of sub-URIs, it does not consume much time. Each sub-URI shows a ratio representing how similar it is to the selected content. The user can then click on those URIs to view the detailed information.

The user starts with a configuration file. This file is used to filter the most interesting information on the internet. It could be an XML file, a plain text file, or any format that fits. The configuration file should contain some keywords, the URIs of web sites of interest, and message delivery rules such as the pushing strategy.


XML FILE SAMPLE:

Company

Tomorrow Land

http://xxx.com

<!--

-->
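
The markup of the sample above did not survive extraction, so its original element names are unknown. The sketch below shows how such a configuration might be read in Python, using an entirely hypothetical layout: the element names `config`, `keywords`, `keyword`, `sites`, `site`, and `delivery`, the `strategy` attribute, and the reading of the surviving values as two keywords and one site URI are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; only the three text values come from the sample above.
SAMPLE_CONFIG = """\
<config>
  <keywords>
    <keyword>Company</keyword>
    <keyword>Tomorrow Land</keyword>
  </keywords>
  <sites>
    <site>http://xxx.com</site>
  </sites>
  <delivery strategy="push"/>
</config>
"""


def load_config(xml_text):
    """Parse the assumed configuration layout into a plain dictionary."""
    root = ET.fromstring(xml_text)
    delivery = root.find("delivery")
    return {
        "keywords": [k.text.strip() for k in root.findall("./keywords/keyword") if k.text],
        "sites": [s.text.strip() for s in root.findall("./sites/site") if s.text],
        "strategy": delivery.get("strategy") if delivery is not None else None,
    }
```

Calling `load_config(SAMPLE_CONFIG)` yields the keywords, site URIs, and delivery strategy that the later query and push steps would consume.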

When the system receives the file, it first starts an instance that queries the search engines, social networks, and the special web sites assigned in the file to locate any pages or paragraphs that contain the keywords or similar topics. Once there is a match, it then queries the comment sections on those web sites. After gathering all the raw data, the corresponding topics are generated and categorized by three em...