Method and System for Efficiently Representing Large Data Sets in a Supply Forecasting System

IP.com Disclosure Number: IPCOM000212326D
Publication Date: 2011-Nov-07
Document File: 3 page(s) / 210K

Publishing Venue

The IP.com Prior Art Database

Related People

Prashant Baronia: INVENTOR [+2]

Abstract

A method and system are disclosed for efficiently representing large volumes of supply forecasting data using a minimal amount of space while providing low query latency.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.

Description

Disclosed is a method and system for efficiently representing large volumes of supply forecasting data using a minimal amount of space while providing low query latency. The forecasted data includes ad impression data and a trend file. The ad impression data is a sample of the ad opportunities seen by the system over a period of 7 days. Each ad opportunity consists of many user attributes (such as age and gender) and page attributes (such as content id and ad size). Each ad opportunity also records the hour of the week in which it was generated, but not the actual date. The trend file holds the forecast generated for each base profile, where a base profile is a (content id, ad size) pair. These forecasts are predictions for the next few years for each base profile in the data.
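The data model described above can be sketched roughly as follows. This is only an illustration: the names AdOpportunity, BaseProfile and hour_of_week are hypothetical, and the convention that hour 0 falls on Monday at midnight is an assumption, since the disclosure does not say where the week starts.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Tuple

# A base profile is a (content id, ad size) pair, as in the description.
BaseProfile = Tuple[str, str]


def hour_of_week(ts: datetime) -> int:
    """Map a timestamp to an hour-of-week index in the range 0..167.

    Assumption: the week starts on Monday at 00:00.
    """
    return ts.weekday() * 24 + ts.hour


@dataclass
class AdOpportunity:
    """One sampled ad opportunity; the actual date is deliberately not stored."""
    content_id: str
    ad_size: str
    hour_of_week: int
    # User attributes such as age or gender; values are lists so that
    # multi-valued attributes can be represented.
    user_attrs: Dict[str, List[str]] = field(default_factory=dict)

    @property
    def base_profile(self) -> BaseProfile:
        return (self.content_id, self.ad_size)
```

Under these assumptions, a trend file could be held in memory as a mapping from BaseProfile to a series of forecast values, so that each sampled opportunity is joined to its forecast through base_profile.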

The forecasted data is divided into 128 partitions for scalability. Each partition is independent of the others. Each partition contains an impression file that stores one ad impression per line, with the attributes separated by a special delimiter. A different delimiter separates the values of a multi-valued attribute. The impression file is dictionary compressed to save space. A dictionary is prepared from all the ad impression data (in the form of “key=valu...
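The available text breaks off before the dictionary is fully described, so the following is only a minimal sketch of one way dictionary compression of an impression line could work. It assumes that each distinct key=value token is mapped to a small integer id; the delimiters "|" and "," and the function names are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List

ATTR_DELIM = "|"   # hypothetical delimiter between attributes
MULTI_DELIM = ","  # hypothetical delimiter between values of one attribute


def tokens(line: str) -> List[str]:
    """Split an impression line into key=value tokens, one per value."""
    out = []
    for attr in line.split(ATTR_DELIM):
        key, _, values = attr.partition("=")
        for value in values.split(MULTI_DELIM):
            out.append(f"{key}={value}")
    return out


def build_dictionary(lines: List[str]) -> Dict[str, int]:
    """Assign an integer id to every distinct token, in first-seen order."""
    dictionary: Dict[str, int] = {}
    for line in lines:
        for token in tokens(line):
            dictionary.setdefault(token, len(dictionary))
    return dictionary


def encode_line(line: str, dictionary: Dict[str, int]) -> List[int]:
    """Replace each token in the line with its dictionary id."""
    return [dictionary[token] for token in tokens(line)]


def decode_line(ids: List[int], dictionary: Dict[str, int]) -> List[str]:
    """Recover the key=value tokens from a list of ids."""
    reverse = {i: token for token, i in dictionary.items()}
    return [reverse[i] for i in ids]
```

Because the same attribute values recur across many impressions, each repeated string is stored once in the dictionary and referenced by id everywhere else, which is where the space saving in this sketch comes from.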