Visualization of Long Tail Summations for Large Data Series

IP.com Disclosure Number: IPCOM000235945D
Publication Date: 2014-Mar-31
Document File: 3 page(s) / 65K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a new viewing technique applied to a standard bar chart with vertical bar elements, wherein the elements in the data series are pre-sorted from high to low. The technique allows the viewer to directly see the major data elements, while providing an overview of the cumulative effect of the smaller data elements.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 51% of the total text.

Many data series contain a very large number of values, more than can easily be shown in a standard bar chart on a conventional monitor.

Among this class of data series, a very common distribution is one in which the high values near the beginning of the series dominate the vertical axis (when viewed in a standard vertical bar chart), followed by a huge number of much lower values that fall off in a logarithmic reduction. Data series with this property are said to have a "long tail distribution", which in special cases can also be characterized as a "Zipf Distribution". When the data values in the series are sorted from high to low, the typical view shows bar size decaying from left to right across the chart.

A problem that occurs when trying to visualize these very large series is that the cumulative contribution of the smaller data items is easy to overlook because of the items' shrinking significance on the vertical axis. Yet in many analytics applications (such as Search Engine Optimization), the sum total of these minor data entries is of critical significance (http://www.bruceclay.com/newsletter/volume44/longtail.html).
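
The point about overlooked tail contributions can be illustrated with a small numeric sketch. It is not part of the disclosure: the series is a hypothetical Zipf-like one in which the k-th largest value is proportional to 1/k, and the names zipf_like_series and head_tail_totals, the scale of 1000, and the cut-off of 20 head items are all assumptions chosen for illustration.

```python
# Illustrative only: a hypothetical Zipf-like series, not data from the disclosure.
def zipf_like_series(n, scale=1000.0):
    """Return n values where the k-th largest value is scale / k."""
    return [scale / k for k in range(1, n + 1)]


def head_tail_totals(values, head_count):
    """Sort high to low, then sum the first head_count items and the rest."""
    ordered = sorted(values, reverse=True)
    head_total = sum(ordered[:head_count])
    tail_total = sum(ordered[head_count:])
    return head_total, tail_total


series = zipf_like_series(10_000)
head_total, tail_total = head_tail_totals(series, head_count=20)
tail_share = tail_total / (head_total + tail_total)

# Each tail bar is tiny next to the largest bar, yet the tail as a whole
# outweighs the 20 head items combined.
print(f"largest bar: {series[0]:.1f}, largest tail bar: {series[20]:.1f}")
print(f"head total: {head_total:.1f}, tail total: {tail_total:.1f}")
print(f"tail share of grand total: {tail_share:.1%}")
```

Under these assumptions, every bar in the tail is less than one twentieth the height of the first bar, while the tail accounts for more than half of the grand total. This is the situation in which a plain bar chart understates the importance of the minor entries.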

No known prior work addresses this problem through visualization. The closest current approach is in the area of "Histogram Distributions" (http://en.wikipedia.org/wiki/Histogram), in which items are binned based on values; however, this approach applies the histogram uniformly across the entire data series.

The proposed technique begins with a standard bar chart with vertical bar elements, having the elements in the data series pre-sorted from high to low. The technique then provides a way of visually consolidating the small data items at the right end of the chart into a single "summary data item" that is distinct from the other data items and that shows the sum total of the individual entries.

The width of the "summary data item" expands so that it is visually distinct from the other standard bars for individual data elements. In addition, other visual cues are described that enhance the information display in the summary bar.
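
The consolidation step described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name consolidate_tail, the head_count cut-off, the dictionary layout, and the summary_width factor of 3.0 are all hypothetical, since the available text does not say how the cut-off or the width of the summary bar is chosen.

```python
def consolidate_tail(values, head_count, summary_width=3.0):
    """Build bar descriptions: individual head bars plus one summary bar.

    head_count and summary_width are hypothetical parameters; the available
    text does not specify how either is determined.
    """
    if head_count < 0:
        raise ValueError("head_count must be non-negative")

    # Pre-sort from high to low, as the technique requires.
    ordered = sorted(values, reverse=True)
    head, tail = ordered[:head_count], ordered[head_count:]

    # One standard-width bar per major data element.
    bars = [
        {"label": f"item {i + 1}", "value": v, "width": 1.0, "is_summary": False}
        for i, v in enumerate(head)
    ]

    # A single wider bar at the right end showing the sum of the small items.
    if tail:
        bars.append({
            "label": f"sum of {len(tail)} smaller items",
            "value": sum(tail),
            "width": summary_width,
            "is_summary": True,
        })
    return bars


bars = consolidate_tail([5, 120, 3, 40, 2, 1, 75, 4], head_count=3)
for bar in bars:
    print(bar["label"], bar["value"], bar["width"])
```

The resulting list can be handed to any charting library that accepts per-bar widths. Keeping the summary bar as a separate flagged entry leaves room for the additional visual cues mentioned in the text.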

This approach is novel in that it allows the viewer to directly see the major data elements, while providing an overview of the cumulative effect of the smaller data

...