Browse Prior Art Database

Method and System for Determining Compact Communities Containing Query Nodes

IP.com Disclosure Number: IPCOM000197958D
Publication Date: 2010-Jul-23
Document File: 6 page(s) / 143K

Publishing Venue

The IP.com Prior Art Database

Related People

Mauro Sozio: INVENTOR [+2]

Abstract

A method and system for determining compact communities containing query nodes is disclosed. A densely connected subgraph of an input graph is determined that contains a set of query nodes.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 28% of the total text.

Method and System for Determining Compact Communities Containing Query Nodes

Abstract

A method and system for determining compact communities containing query nodes is disclosed.  A densely connected subgraph of an input graph is determined that contains a set of query nodes.

Description

Disclosed is a method and system for determining compact communities containing query nodes.  An undirected graph G = (V, E) is provided in which the set of edges E represents binary relationships among the items V.  Example of the undirected graph may include collaboration network amongst scientists, a protein-interaction network, social networks, email networks, query graphs, etc.  The number of nodes in G may be represented as “n” and the number of edges as “m”.  Further, importance of a node may be denoted by assigning a weight to the node and strength between nodes in G may be denoted by assigning a weight to the edges that connect the nodes.  Additionally, a set of query nodes Q is provided that correspond to the seed of a community that needs to be extracted.  Thus, given an induced subgraph H of G, a function “f” may be defined that measures the goodness of H as an extracted community.  If H is densely connected, the function “f” should ideally take large values.

In order to determine an induced subgraph H = (VH, EH) of graph G such that VH contains Q (Q  VH), H is connected, and f (H) is maximized among all feasible choices for H, a density measure function fm (H) is used.  The density measure function fm (H) is defined as the minimum degree of any node of VH in the induced subgraph H = (VH, EH).  Further, when a distance constraint “d” is provided, an induced subgraph H = (VH, EH) of graph G can be determined such that such that VH contains Q (Q  VH), H is connected, DQ (H) ≤ d, and the minimum degree function fm (H) is maximized among all feasible choices for H.  The minimum degree measure function and the distance constraint are monotone.  A function f is said to be monotone non-increasing if for every graph G and for every induced subgraph H of G, f (H) ≤ f (G).  Similarly, a function f is said to be monotone non-decreasing when f (H) ≥ f (G).

The method and system formulate an objective based on the density of the subgraph, and then optimize this objective using combinatorial algorithms, thereby finding a meaningful community of query nodes.  A greedy algorithm is provided for determining a meaningful community of query nodes for any monotone function and that satisfy any number of monotone constraints.

Initially, the greedy algorithm sets G0 = G as the input graph and thereafter deletes one node at each step.  At the t-th step, a node u is considered that has a minimum degree in Gt-1.  Consequently, at the t-th step, a graph Gt is obtained by deleting the node u and all the edges that are incident to u from Gt-1.  Thereafter, the greedy algorithm terminates at the t-th step if at least one of the que...