PUBLICATION

Web User Interface as a Message: Power Law for Fraud Detection in Crowdsourced Labeling

Type

Conference Paper

Year

2021

Authors

Dr.-Ing. Sebastian Heil

senior researcher

Room: 1/B204

Phone: +49 371 531 32861

Fax: +49 371 531 8 32861

Email: sebastian.heil@informatik.tu-chemnitz.de

Prof. Dr.-Ing. Martin Gaedke

professor

Room: 1/B319

Phone: +49 371 531 25530

Fax: +49 371 531 25539

Email: gaedke@informatik.tu-chemnitz.de

Research Area

Web Engineering

Event

21st International Conference on Web Engineering

Published in

Proceedings of the 21st International Conference on Web Engineering

Download

PDF

Abstract

Web Engineering is becoming increasingly hungry for training data as the application of machine learning (ML) methods in the field intensifies. Human-labeled datasets are particularly indispensable for ML-based validation and design of user interfaces (UIs). The production of such datasets is often outsourced to crowdworkers, who typically have lower motivation and pay than in-house staff, so the quality of their work becomes the paramount concern. In our paper, we explore the applicability of the trending fraud detection approach based on fit to a power law in crowdsourced web UI labeling. On Amazon Mechanical Turk, 298 crowdworkers labeled over 30,000 UI elements in about 500 university homepage screenshots. We found a significant correlation between workers’ precisions and the Kolmogorov-Smirnov statistic-based goodness-of-fit between the frequencies of UI elements in a worker’s output and a power law. The obtained R² = 0.504 was higher than the R² = 0.432 baseline for the popular time-on-task parameter. Moreover, the distribution of UI elements’ frequencies is much less prone to manipulation by malicious crowdworkers, which is advantageous for a crowdsourced data quality control measure. The findings of our study suggest a certain resemblance between web UIs and natural language texts, in which word frequencies are known to comply with Zipf’s law.
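
The goodness-of-fit idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `ks_power_law`, the rank-frequency formulation, and the least-squares fit of the exponent in log-log space are all assumptions made for illustration. The sketch counts how often each UI element label occurs in one worker's output, fits a Zipf-style power law to the ranked frequencies, and reports the Kolmogorov-Smirnov statistic, that is, the largest gap between the empirical and fitted cumulative distributions.

```python
import math
from collections import Counter


def ks_power_law(labels):
    """Return (exponent, ks_statistic) for a rank-frequency power-law fit.

    Hypothetical helper for illustration; the fitting method is an
    assumption and is not taken from the paper.
    """
    # Frequencies of each distinct label, largest first (rank 1 = most common).
    counts = sorted(Counter(labels).values(), reverse=True)
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two distinct labels")
    total = sum(counts)

    # Fit log(freq) = a - s * log(rank) by ordinary least squares.
    xs = [math.log(rank) for rank in range(1, n + 1)]
    ys = [math.log(c / total) for c in counts]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s = -sxy / sxx

    # Model probabilities rank^(-s), normalised over the observed ranks.
    weights = [rank ** -s for rank in range(1, n + 1)]
    norm = sum(weights)

    # KS statistic: largest absolute gap between the two cumulative sums.
    ks = 0.0
    empirical_cdf = model_cdf = 0.0
    for c, w in zip(counts, weights):
        empirical_cdf += c / total
        model_cdf += w / norm
        ks = max(ks, abs(empirical_cdf - model_cdf))
    return s, ks


# Made-up label counts for two hypothetical workers.
zipf_like = ["link"] * 60 + ["image"] * 30 + ["text"] * 20 + ["button"] * 15 + ["input"] * 12
stepped = ["link"] * 50 + ["image"] * 50 + ["text"] * 50 + ["button"] * 50 + ["input"] * 1

for name, labels in (("zipf_like", zipf_like), ("stepped", stepped)):
    exponent, ks = ks_power_law(labels)
    print(f"{name}: exponent={exponent:.3f}, KS={ks:.3f}")
```

A smaller KS value means the worker's label frequencies sit closer to the fitted power law. Under the approach described in the abstract, such a per-worker score would then be related to the worker's labeling precision; the label counts above are invented purely to exercise the function.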

Reference

Heil, Sebastian; Bakaev, Maxim; Gaedke, Martin: Web User Interface as a Message: Power Law for Fraud Detection in Crowdsourced Labeling. Proceedings of the 21st International Conference on Web Engineering, 2021.