Browse Prior Art Database

Approach and System to Enable Large Scale Social Analysis

IP.com Disclosure Number: IPCOM000184998D
Original Publication Date: 2009-Jul-07
Included in the Prior Art Database: 2009-Jul-07
Document File: 2 page(s) / 23K

Publishing Venue

IBM

Abstract

As one of the major components in the Web 2.0 wave, social network sites have gained lots of attention in recent years. Actually, social networks exist everywhere in our daily life. Besides those networks built through all kinds of social network sites, for example, in telecom domain, social network is constructed by the relationship of mobile phone users drawn from call or SMS log.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Page 1 of 2

Approach and System to Enable Large Scale Social Analysis

The core idea of our invention is to build a hybrid data structure to maintain the relationship/connection information in social networks. This data structure is based on dynamic partition-able table structure like Google BigTable and space-efficient randomized data structure like bloom filter (Figure 1).

Figure 1. The hybrid data structure used to maintain social network data.

As shown in above figure, social relationships starting from a node (for example, the phone call relationship starting from 13901234567) are stored in a single line of this " table". Within a line, each peer-to-peer relationship corresponds to a column and we choose to differentiate "from"and "to"relationships with 2 column families. Each column family could hold any number of relationships, which enables the structure to accommodate the most unsociable and the most socialable nodes.

Besides those columns directly allocated for relationships, a special column named "bloomfilter" is used in each column family to maintain a space-efficient summary of the relationships in the family. By the nature of bloom filters, this column will be concise and could be used calculate many interesting characteristics, such as checking the existence of relationships, the number of common friends, etc.

Social network analysis is the mapping and measuring of relationships and flows between people, groups, organizations, animals, computers or other information/knowledge processing entities. The nodes in the network are the people and groups while the links show relationships or flows between the nodes. A sociometric chart plots the structure of interpersonal relations in a group situation. A sociogram can be drawn on the basis of many different criteria: Social relations, channels of influence, lines of communication etc. In sociometric chart, cohesion analysis is the first concern of social network analysis. A very important query is whether there is a relation between two entities, especially in a sociogram with billions of entities.

For those who are familiar with Google BigTable or Apache Hadoop, it's easy for them to realize the data model of previous section. Taking telecom social network as an example, mobile user's phone number could be used as the row key and column name (except the special "bloom filter"column). The cells in the "table"could be used to store information like SMS content, call time, call length, and so on. Timestamps will be used to keep the history of SMS and calls.

As the size of data stored in the "table" grows, the "table" could be partitioned dynamically leveraging the natural order on mobile phone numbers. Thosewho are familiar with Google BigTalbe or Ap...