
Identifying Graph Context in Retrieval with Applications to Diversifying Search Results

IP.com Disclosure Number: IPCOM000242671D
Publication Date: 2015-Aug-04
Document File: 3 page(s) / 352K

Publishing Venue

The IP.com Prior Art Database

Abstract

•A system and method for exploiting an entity graph in a text retrieval system for an enhanced search experience, where:
–The entity sub-graph associated with each retrieved document is displayed as the graph context of each search result
–The search results are prioritized to ensure that the graph contexts of the top-ranked documents are diverse enough and cover as much as possible of the neighborhood of the entities associated with the query

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.


Identifying Graph Context in Retrieval with Applications to Diversifying Search Results

Motivation:

Identifying entities and constructing an entity graph is a well-studied problem. How can such a graph be leveraged for better information retrieval?

Example
Query: "Turing AI"

Best document for this query: "Alan Turing is known for building an early system for Game Playing which was designed for playing chess."

Our goal is to retrieve this document, since chess and game playing are both related to the query term AI, but this relation cannot be understood without leveraging the graph context.

Prior Work:

Existing work in entity linking can be grouped on multiple facets: supervised vs. unsupervised, and document-targeted vs. text-snippet-targeted. No work exists on unsupervised entity disambiguation in short text snippets such as queries; our method addresses this void. The majority of the existing works [4], [5], [6] focus on linking entities to a document. The challenges of the same task in the short-text scenario, and more specifically in search queries, are different, and methods designed for documents fail to perform well there. Many techniques use natural language processing, such as topic modeling [1], to link entities to documents. These strategies do not transfer to search queries, since queries often do not have the redundancy needed to yield good topic models. The proposed unsupervised approach, on the other hand, is more robust, since it can scale to a large entity corpus and can automatically adapt to the inevitable evolution of the entity database. Also, [2] and [3] focus on web page snippets using a supervised model.

Challenges:

Identifying the entity graph context for each document, conditioned on the query context. This involves solving two sub-problems, as follows:

-Identify the graph context that is most coherent with the query context

-Identify the graph context that is most coherent (in the graph space)

Use the graph context of the documents most related to the query:

-Find the subset of documents such that:

•The documents are highly related to the query, AND
•The documents together cover the neighborhood of the query context as much as possible

Our Solution Details:

First, the document context is identified, conditioned on the query context.
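Before turning to entity scoring, the document-selection challenge stated above (documents that are highly related to the query and that together cover the query neighborhood) can be made concrete with a small sketch. The greedy procedure below is an assumption chosen for illustration, not the method of this disclosure: it repeatedly picks the document with the best weighted sum of relevance and newly covered neighborhood nodes. The names `select_diverse`, `doc_relevance`, `doc_context`, `neighborhood`, and the weight `lam` are all hypothetical.

```python
def select_diverse(doc_relevance, doc_context, neighborhood, k, lam=0.5):
    """Greedily pick up to k documents, trading relevance against new coverage.

    doc_relevance: dict doc_id -> relevance to the query, assumed to lie in [0, 1]
    doc_context:   dict doc_id -> set of entity nodes in that document's graph context
    neighborhood:  set of entity nodes around the query context
    lam:           assumed weight on relevance; (1 - lam) goes to coverage gain
    """
    selected, covered = [], set()
    candidates = set(doc_relevance)

    while candidates and len(selected) < k:
        def gain(doc):
            # Only neighborhood nodes not yet covered count as new coverage.
            new_nodes = (doc_context[doc] & neighborhood) - covered
            coverage = len(new_nodes) / len(neighborhood) if neighborhood else 0.0
            return lam * doc_relevance[doc] + (1 - lam) * coverage

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= doc_context[best] & neighborhood
        candidates.remove(best)

    return selected, covered


# Hypothetical toy data: d1 and d2 have the same graph context, d3 a different one.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.6}
contexts = {
    "d1": {"Chess", "Game Playing"},
    "d2": {"Chess", "Game Playing"},
    "d3": {"Machine Learning", "Logic"},
}
query_neighborhood = {"Chess", "Game Playing", "Machine Learning", "Logic"}
picked, covered_nodes = select_diverse(relevance, contexts, query_neighborhood, k=2)
```

In this toy data the second pick favors `d3` over the more relevant `d2`, because `d2` adds no neighborhood nodes beyond those `d1` already covers. A different trade-off, or an exact rather than greedy selection, would be equally compatible with the stated goal.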


Fig 1: An example where adding the graph context improves retrieval performance

Let the Query Context be the set of nodes, Q. Score every entity e that appears in the document using the following:

S(e) = α · Simil(e.article, document) + (1 − α) · 1 / (Average Distance to nodes in Q)
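As a rough illustration of the scoring formula above, the following Python sketch computes S(e) on a toy graph. Several pieces are assumptions that the text does not specify: Simil is approximated by Jaccard overlap of word sets, the entity graph is an undirected adjacency dictionary, distance is the breadth-first hop count, query nodes that cannot be reached are left out of the average, and an average distance of zero is guarded against. The function names and the toy data are hypothetical.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def simil(text_a, text_b):
    """Assumed stand-in for Simil: Jaccard overlap of lowercase word sets."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_entity(entity, article, document, graph, query_nodes, alpha=0.5):
    """S(e) = alpha * Simil(e.article, document) + (1 - alpha) / avg distance to Q."""
    dist = hop_distances(graph, entity)
    reachable = [dist[q] for q in query_nodes if q in dist]
    if not reachable:
        proximity = 0.0  # assumption: no path to Q earns no graph credit
    else:
        avg = sum(reachable) / len(reachable)
        proximity = 1.0 / avg if avg > 0 else 1.0  # assumption: guard for avg == 0
    return alpha * simil(article, document) + (1 - alpha) * proximity


# Hypothetical toy graph loosely based on the "Turing AI" example.
graph = {
    "Alan Turing": ["AI"],
    "AI": ["Alan Turing", "Chess", "Game Playing"],
    "Chess": ["AI"],
    "Game Playing": ["AI"],
}
chess_score = score_entity(
    "Chess",
    article="chess is a board game",
    document="a system designed for playing chess",
    graph=graph,
    query_nodes={"Alan Turing", "AI"},
)
```

With α closer to 1 the score leans on textual similarity between the entity's article and the document; with α closer to 0 it leans on graph proximity to the query context, which is what lets an entity such as Chess score well for a query that never mentions it.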

Until (there are no entities with score above a threshold). Choose the entity with the highest...