The CloudFactory workforce and why we use it with VELOS

Chrissie Cormack Wood

Blog 1

A study on quality data processing at scale (for text annotation)

Unlike crowdsourced workers, managed workforces are vetted for skill and character, paid by the hour (not by the number of tasks they perform), and are given opportunities for growth. We believe all of these factors naturally result in a higher quality of work, which is why VELOS has an integrated managed workforce.

Tried and tested

In 2019, we performed a study comparing a leading crowdsourcing platform’s anonymous workforce with a workforce managed by our partner, CloudFactory. We asked both teams to complete the same series of tasks, to determine which team delivered the higher quality structured datasets, and at what relative cost. To avoid potential bias, the workers didn’t know they were participating in an experiment.

Below, we share the part of the study which specifically focuses on extracting information from unstructured text.

Task: extracting information from unstructured text

Workers were presented with 2,555 descriptions of product recalls issued by the U.S. Consumer Product Safety Commission, where the hazard type was either explicitly stated in the title or buried in the text.

The workers were asked to determine what the hazard type was by choosing from a drop-down menu of nine hazard-type classifications used by the Commission. We provided two additional options: “other” and “not enough information provided.”

The results

Irrespective of the word count of the recall, the crowdsourced workers achieved an accuracy of 50% to 60%, while the managed workers achieved a higher accuracy rate of 75% to 85%.

Understanding why the accuracy of the crowdsourced workers is lower

If we look at the distribution of responses from each of the workforces, we see that while both workforces chose the “not enough information” response with the same frequency, the crowdsourced workers were much more likely to answer “other” – in fact, 4 times more likely.

If we take just the 322 cases where the crowdsourced workers answered “other”, we find that the managed workers classed only 10% of these as “other” and correctly classified 74% of them. This implies that, in the vast majority of these cases, the information required to make a correct classification was present in the recall text.

If we break down the time spent on each instance by the response given, we see the crowdsourced workers took an average of less than 50 seconds before responding “other”, while managed workers spent over twice as long before resorting to this response. It appears that overuse of the “other” category explains some of the roughly 25-percentage-point accuracy gap between the workforces. However, even when we remove these cases, the managed workers still classify 16% more cases correctly than the crowdsourced ones.
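The comparison above, scoring each workforce on all cases and then again with the “other” responses removed, can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used in the study: the function names, the record layout (a reference label paired with the worker’s response) and the toy labels are all assumptions made for the example, and the toy data is not study data.

```python
def accuracy(records):
    """Fraction of (reference_label, response) pairs where the response is correct."""
    if not records:
        raise ValueError("no records to score")
    correct = sum(1 for reference, response in records if reference == response)
    return correct / len(records)


def without_response(records, excluded="other"):
    """Drop every record where the worker gave the excluded response."""
    return [(ref, resp) for ref, resp in records if resp != excluded]


# Made-up toy data for illustration only (hypothetical labels, not study results).
toy_records = [
    ("fire", "fire"),
    ("fire", "other"),
    ("choking", "choking"),
    ("laceration", "other"),
    ("burn", "fire"),
]

overall = accuracy(toy_records)
excluding_other = accuracy(without_response(toy_records))
print(f"accuracy on all cases: {overall:.2f}")
print(f"accuracy excluding 'other' responses: {excluding_other:.2f}")
```

Running the same two calculations for each workforce separates the errors caused by overusing “other” from the errors that remain once those cases are set aside.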

How these results affect cost

There was little dependency between the length of the recall text and the amount of time either workforce spent. There also did not appear to be a meaningful difference between the time it took each workforce to do the task. Both workforces took an average of about 50 seconds to classify each recall. As a result, the managed workers, who were paid by the hour, cost the equivalent of 0.87 units per iteration, slightly higher than the cost of the crowdsourced workers.

This study demonstrates that there can be large differences in the accuracy of data analysed by different workforce types. In this particular task, the managed workforce outperformed the crowdsourced workers in terms of accuracy, even when the effective costs per task were similar for each workforce.

At Hivemind, we have found it can cost up to two times more to use crowdsourced workers, because it often requires a consensus model with multiple people completing or reviewing tasks to achieve passable quality.
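To see why a consensus model raises the effective cost, here is a minimal sketch that assumes a simple majority vote, where several workers answer each task and the most common answer wins. The function names and the prices are hypothetical, chosen only to show the arithmetic; real consensus schemes, including any review steps, can be more involved.

```python
from collections import Counter


def majority_label(responses):
    """Return the most common response, or None if first place is tied."""
    ranked = Counter(responses).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no consensus; the task would need another response
    return ranked[0][0]


def effective_cost_per_task(cost_per_response, responses_per_task):
    """Every extra worker on a task multiplies what that task costs."""
    return cost_per_response * responses_per_task


# Hypothetical figures: 0.5 units per response, three workers per task.
single = effective_cost_per_task(0.5, 1)
consensus = effective_cost_per_task(0.5, 3)
print(majority_label(["fire", "fire", "other"]))
print(f"single response: {single} units, three-way consensus: {consensus} units")
```

A low price per response can therefore turn into a higher price per finished task once each task needs several responses to reach an acceptable answer.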

So, while crowdsourcing can look like the cheaper option, it’s rarely as inexpensive as it seems. A managed team is a better choice when quality is important, and you want to be able to iterate or evolve the work.