System and Method for Recognizing Similar Products using Social Media Driven Content Analytics

IP.com Disclosure Number: IPCOM000238247D
Publication Date: 2014-Aug-12
Document File: 2 page(s) / 27K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a system that uses a combination of text and graphic product recognition to distinguish between products and to aggregate social media from different sources for the same product. The system runs the gathered social media through a text and graphic recognition engine and determines whether the graphics, text, and social media content refer to the same product.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

In the realm of social media, a single product can have multiple (perhaps thousands of) websites dedicated to it. Although these websites share a common subject, they have different origins, so their social media content is neither uniformly aggregated nor uniformly presented to the manufacturer and customers. When manufacturers and customers research social media for product information, it is difficult to determine whether the content is consistent across the sites.

Because no standardized master identifiers (IDs) exist for aggregating a product's social media, it is non-trivial for information technology (IT) systems to combine social media from different sources for the same product. Text data matching can be problematic because a title such as "Brand X shoe" may or may not include a model number, and could instead be a category label; automated graphic matching has the same problem. This non-uniformity makes it challenging for IT systems to aggregate social media content by product.
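The title-matching problem described above can be illustrated with a minimal sketch. It is not part of the disclosed system; the function names, the 0.5 overlap threshold, the example titles, and the rule that any token containing a digit counts as a model number are all assumptions made for illustration only.

```python
import re


def tokens(title):
    """Lowercase a title and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def model_numbers(title):
    """Tokens containing a digit: an assumed, rough proxy for model numbers."""
    return {t for t in tokens(title) if any(c.isdigit() for c in t)}


def title_match(a, b, threshold=0.5):
    """Classify two product titles as 'same', 'different', or 'ambiguous'."""
    models_a, models_b = model_numbers(a), model_numbers(b)
    words_a = tokens(a) - models_a
    words_b = tokens(b) - models_b

    # Jaccard overlap of the descriptive (non-model) words.
    union = words_a | words_b
    overlap = len(words_a & words_b) / len(union) if union else 0.0
    if overlap < threshold:
        return "different"

    # Both titles carry a model number, so the numbers decide the match.
    if models_a and models_b:
        return "same" if models_a & models_b else "different"

    # At least one title lacks a model number: it may name a specific
    # product or a whole category, and text alone cannot tell which.
    return "ambiguous"


if __name__ == "__main__":
    print(title_match("Brand X shoe", "Brand X shoe 123"))
    print(title_match("Brand X shoe 123", "Brand X shoe 456"))
    print(title_match("Brand X Shoe 123", "brand x shoe 123"))
```

Under these assumptions, a bare title such as "Brand X shoe" is equally compatible with every model in the line, so the comparison returns "ambiguous" rather than a match. This is the gap that a second signal, such as graphic recognition, is meant to close.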

No robust method exists to aggregate the social media data: there are no master IDs, and current approaches rely on very simplistic matching (by product title) and brute force (multiple site crawlers). A scalable and accurate approach for recognizing similar products using social media driven content analytics is needed.

The novel idea is to build a system that uses a combination of text and graphic product recognition to distinguish between products and aggregate social media from different sources for the same product. The method consists of five core components, listed below.

Social Media collector/crawler/aggregator. This component is commercially avail...