
AUTOMATIC CREATION OF KEYWORD LISTS FOR DETECTION OF ILLEGAL CONTENT STREAMING

IP.com Disclosure Number: IPCOM000245744D
Publication Date: 2016-Apr-04
Document File: 6 page(s) / 91K

Publishing Venue

The IP.com Prior Art Database

Related People

Shalom Mitz: AUTHOR

Abstract

When searching for web sites that illegally stream TV content, keyword lists are used to zoom in on sites for in-depth checks. The number of monitored events for large broadcasters can reach thousands every day. It is therefore impractical to create these lists manually. A methodology is presented herein that automates the difficult and expensive manual process of creating the keyword lists.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 40% of the total text.


CISCO SYSTEMS, INC.


DETAILED DESCRIPTION

    A major threat to the revenue stream of TV broadcasters is the illegal streaming of TV channels. An anti-piracy operation may be employed to eliminate or minimize illegal streaming. Finding the Internet sites that carry and/or point to the illegal streaming is the main challenge faced when implementing the anti-piracy effort.

    When looking for web sites that illegally stream TV content, keyword lists are used to focus on sites for more detailed analysis. Currently, the keyword lists are created in a manual, per-event manner. This manual process is expensive and difficult to implement. For example, knowledge of a local language might be needed by the person creating the list.

    The process of finding sites that illegally stream content is typically divided into two distinct phases:

    First phase: Establish a list of suspected sites. During this phase, a large number of sites are superficially examined.

    Second phase: The suspected sites are examined in-depth, typically by acquiring the video streamed by the site and comparing the video to the legitimate TV signals.

Copyright 2016 Cisco Systems, Inc.


    This process is shown in Figure 1 below.

Figure 1

    The two-phase process is necessary because the second phase is very resource-intensive and can therefore be performed on only a very small fraction of the sites present on the public Internet. The first phase, which is much less resource-intensive, is therefore used to process a large number of sites and create a "short-list" of suspected sites.
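    The funnel described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name, the numeric threshold, and the fixed budget for in-depth checks are assumptions introduced only for illustration, and the two callables stand in for whatever superficial scoring and video comparison a real system would use.

```python
from typing import Callable, Iterable, List


def two_phase_scan(
    sites: Iterable[str],
    cheap_score: Callable[[str], float],   # hypothetical phase-1 scorer
    deep_check: Callable[[str], bool],     # hypothetical phase-2 video comparison
    threshold: float,                      # assumed cut-off for "suspected"
    deep_check_budget: int,                # assumed cap on expensive checks
) -> List[str]:
    """Return the sites that the expensive phase-2 check confirms."""
    # Phase 1: superficially score every site; keep those at or above the threshold.
    scored = [(cheap_score(site), site) for site in sites]
    suspects = sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)

    # Only the highest-scoring suspects fit into the phase-2 budget.
    short_list = [site for _, site in suspects[:deep_check_budget]]

    # Phase 2: run the resource-intensive check on the short list only.
    return [site for site in short_list if deep_check(site)]
```

    In this sketch the expensive callable is invoked at most `deep_check_budget` times, regardless of how many sites enter the first phase.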

    The first phase relies mainly on a list of keywords in order to locate the suspected sites. The keywords might be compared to text that is collected from sites of interest or by querying existing databases of sites, such as search engines. Sites of interest might be, for example, sites that are referred to by other suspected sites.

A process is presented herein for the automatic creation of the keyword lists.

An example implementation of the first phase involves two procedures:

    1) Automatic derivation of a weighted keyword list.

    2) Use of the weighted keyword list to derive a list of suspected web sites from a larger list of web sites.

    Figure 2 below illustrates an example of a procedure that implements the first phase.

Figure 2
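    The second procedure, using a weighted keyword list to reduce a larger list of web sites to a list of suspected ones, might be sketched in Python as follows. The source does not state how keyword weights are combined or how the cut-off is chosen, so summing the weights of matched keywords and applying a fixed threshold are assumptions, as are the function names and the sample data in the usage note.

```python
import re
from typing import Dict, List, Tuple


def score_text(text: str, weighted_keywords: Dict[str, float]) -> float:
    """Sum the weights of keywords found in the text (assumed scoring rule)."""
    lowered = text.lower()
    total = 0.0
    for keyword, weight in weighted_keywords.items():
        # Case-insensitive whole-word (or whole-phrase) match; assumes the
        # keyword starts and ends with a word character.
        pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
        if re.search(pattern, lowered):
            total += weight
    return total


def suspected_sites(
    site_texts: Dict[str, str],            # URL -> text collected from that site
    weighted_keywords: Dict[str, float],   # keyword -> weight
    threshold: float,                      # assumed cut-off, not from the source
) -> List[Tuple[str, float]]:
    """Return (URL, score) pairs at or above the threshold, highest score first."""
    scored = [(url, score_text(text, weighted_keywords)) for url, text in site_texts.items()]
    hits = [(url, score) for url, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

    For example, with hypothetical keywords `{"live stream": 2.0, "free": 1.0, "derby": 3.0}`, a page containing "Watch the Derby LIVE STREAM free" matches all three keywords, while a page containing only "Freedom news" matches none because of the whole-word rule.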


1) Example procedure for the automatic creation of the keyword lists:

1.1) For each monitored T...