Task posting strategy based on characterization of online crowd workers

IP.com Disclosure Number: IPCOM000238648D
Publication Date: 2014-Sep-09
Document File: 9 page(s) / 501K

Publishing Venue

The IP.com Prior Art Database

Abstract

Crowdsourcing has emerged as a promising mechanism for getting an online pool of labor to work on a wide range of tasks, from micro-tasks (e.g., image tagging, event annotation, digitization) to complex tasks (e.g., translation, proofreading, programming, design). In a common platform model, such as Amazon Mechanical Turk (AMT), requesters post tasks on a crowdsourcing platform, and subscribed crowd workers come online to accept and complete tasks of their choice at their convenience. Modeling the behavior of crowd workers in such a ‘nonstandard organization’ is challenging because of the distributed and private nature of the workers. Further, the flexible framework of crowd work makes it difficult to ensure quality, accuracy, and completion of the posted tasks. It is therefore natural for requesters to try to determine the ‘best’ time to post their tasks, where ‘best’ may be judged by any of several factors relevant to the requester (e.g., time to complete, accuracy, cost). In this proposal, our focus is on completion rate and completion time, which are crucial for many businesses that harness the crowd for tasks such as form processing (digitization) and must meet strict completion times under their service-level agreements (SLAs).

This is the abbreviated version, containing approximately 15% of the total text.

The focus of this proposal is two-fold:

1.  Characterize online crowd workers using a stochastic process.

2.  Apply the statistical characteristics obtained in step 1 to propose a task posting strategy that improves timely completion of the tasks.

We characterize the online workers on a platform as a random process and provide probabilistic guarantees on the number of workers online on the platform at a particular time. Interestingly, it has been observed that workers tend to pick up the most recently posted jobs on the platform. Thus, using the online worker statistics, we can identify busy periods during which a newly posted task has higher visibility, and leverage this information to design intelligent posting strategies for improved completion times.
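
As a rough illustration of this characterization, the sketch below simulates worker sessions and estimates the probability that at least k workers are online at a given time. It is a minimal sketch under stated assumptions, not the authors' method: the arrival rate function `rate`, its bound `RATE_MAX`, the log-normal session length, and the assumption that no workers are online at time 0 are all hypothetical choices made for the example. Arrivals are generated with the standard thinning technique for non-homogeneous Poisson processes.

```python
import math
import random


def rate(t_hours):
    """Hypothetical arrival rate (workers per hour) with a 24-hour cycle."""
    return 20.0 + 15.0 * math.sin(2.0 * math.pi * t_hours / 24.0)


RATE_MAX = 35.0  # upper bound on rate(t); thinning requires such a bound


def simulate_sessions(horizon_hours, rng):
    """Return (arrival, departure) pairs for one simulated run.

    Arrivals follow a non-homogeneous Poisson process generated by thinning.
    Session lengths are log-normal, an assumed stand-in for a generic
    duration distribution. The platform is assumed empty at time 0.
    """
    sessions = []
    t = 0.0
    while True:
        # Candidate arrivals come from a homogeneous process at RATE_MAX.
        t += rng.expovariate(RATE_MAX)
        if t > horizon_hours:
            return sessions
        # Keep each candidate with probability rate(t) / RATE_MAX.
        if rng.random() < rate(t) / RATE_MAX:
            duration = rng.lognormvariate(-0.5, 0.8)  # hours (assumed)
            sessions.append((t, t + duration))


def workers_online(sessions, t):
    """Count the sessions that cover time t."""
    return sum(1 for start, end in sessions if start <= t < end)


def estimate_at_least_k(t, k, runs=2000, seed=0):
    """Monte Carlo estimate of P(at least k workers online at time t)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        if workers_online(simulate_sessions(t, rng), t) >= k:
            hits += 1
    return hits / runs
```

Calling `estimate_at_least_k(30.0, 20)` would estimate the chance that 20 or more workers are online 30 hours after the start of the simulated period. With real platform logs, `rate` and the duration distribution would be fitted to data rather than chosen by hand.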

In this proposal, we characterize the online crowd, i.e., we obtain temporal statistics of the workers who are logged in to a crowdsourcing platform. This characterization of online crowd workers is then used to propose task posting strategies for improved completion times (and completion rates).
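
One way such a characterization could feed a posting strategy is sketched below. The selection rule is an assumption made for illustration, not a rule given in this disclosure: within the window in which the requester is able to post, pick either the earliest time whose visibility meets a threshold or, failing that, the time of highest visibility. The `visibility` function is a hypothetical placeholder for the probability that enough workers are online. In practice it would come from the fitted worker model.

```python
import math


def visibility(t_hours):
    """Placeholder for P(enough workers are online at time t).

    Hypothetical daily curve that peaks at hour 14 by construction.
    """
    return 0.5 + 0.4 * math.sin(2.0 * math.pi * (t_hours - 8.0) / 24.0)


def choose_posting_time(earliest, latest, step=0.25, threshold=None):
    """Pick a posting time on a grid between earliest and latest (hours).

    If a threshold is given, return the earliest grid time whose visibility
    reaches it. Otherwise, or if no grid time qualifies, return the grid
    time with the highest visibility.
    """
    n = int((latest - earliest) / step)
    candidates = [earliest + i * step for i in range(n + 1)]
    if threshold is not None:
        for t in candidates:
            if visibility(t) >= threshold:
                return t
    return max(candidates, key=visibility)
```

A requester with a tight deadline might pass a threshold to post as early as acceptable, while one with slack might omit it and wait for the peak.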

Novelty:

The primary novelties in our approach are the following.

  1. To the best of our knowledge, this is the first characterization of the temporal statistics of online crowd workers in which arrivals are modeled as a non-homogeneous Poisson point process (NHPP) and duration is modeled by a generic distribution.
  2. We present a novel methodology, based on k-coverage analysis, that provides the probability that at least k workers are online on a platform at time t. This in turn captures the notion of visibility on a platform, which can be leveraged for intelligent posting strategies.
  3. The above anal...
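
To make the "at least k workers online at time t" probability concrete, the sketch below uses a standard result for Poisson arrivals with independent durations. It is not necessarily the derivation used in this disclosure. If arrivals follow an NHPP with rate lambda(s), durations are independent with survival function S(x) = P(duration > x), and no workers are online at time 0, then the number online at time t is Poisson distributed with mean m(t) equal to the integral from 0 to t of lambda(s) * S(t - s) ds. The probability of at least k workers is then one minus the Poisson probability of fewer than k. The rate function and the log-normal duration parameters are hypothetical.

```python
import math


def rate(t_hours):
    """Hypothetical arrival rate (workers per hour) with a 24-hour cycle."""
    return 20.0 + 15.0 * math.sin(2.0 * math.pi * t_hours / 24.0)


def survival(x_hours, mu=-0.5, sigma=0.8):
    """P(session lasts longer than x) for an assumed log-normal duration."""
    if x_hours <= 0.0:
        return 1.0
    z = (math.log(x_hours) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def mean_online(t, steps=2000):
    """Midpoint-rule approximation of m(t) = integral of rate(s) * S(t - s)."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += rate(s) * survival(t - s)
    return total * h


def prob_at_least_k(t, k):
    """P(at least k online at t) when the count is Poisson with mean m(t)."""
    m = mean_online(t)
    term = math.exp(-m)  # Poisson probability of exactly 0 workers
    cdf = 0.0
    for j in range(k):
        cdf += term          # add P(exactly j workers)
        term *= m / (j + 1)  # advance to P(exactly j + 1 workers)
    return max(0.0, 1.0 - cdf)
```

Evaluating `prob_at_least_k(t, k)` over a grid of times gives a visibility curve for a chosen k. Under these assumptions, its peaks mark the busy periods that a posting strategy could target.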