PUBLICATION

Assessing Completeness in Training Data for Image-Based Analysis of Web User Interfaces

Type

Conference Paper

Year

2019

Authors

Dr.-Ing. Sebastian Heil

senior researcher

Room: 1/B204

Phone: +49 371 531 32861

Fax: +49 371 531 8 32861

Email: sebastian.heil@informatik.tu-chemnitz.de

Prof. Dr.-Ing. Martin Gaedke

professor

Room: 1/B319

Phone: +49 371 531 25530

Fax: +49 371 531 25539

Email: gaedke@informatik.tu-chemnitz.de

Research Area

Web Engineering

Event

Young Scientist's Third International Workshop on Trends in Information Processing

Published in

Proceedings of the YSIP-3 Workshop

Abstract

Analysis of user interfaces (UIs) based on their visual representation (screenshots) is gaining popularity, institutionalizing the HCI vision field. Seeing a UI the way a human user sees it has the advantage of taking into account layouts, whitespace, graphical content, etc., independent of the concrete platform and framework used. However, visual analysis requires significant amounts of training data, particularly for the classifiers that identify UI elements and their types. In this paper, we demonstrate how data completeness can be assessed in training datasets produced by crowdworkers without the need to duplicate the extensive work. In the experimental session, 11 annotators labeled more than 42,000 UI elements in nearly 500 web UI screenshots using the LabelImg tool with a predefined set of classes corresponding to visually identifiable web page element types. We identify metrics that can be automatically extracted from UI screenshots and construct regression models that predict the expected number of labeled elements in a screenshot. The results can be used for outlier analysis of crowdworkers on any existing microtasking platform.
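The idea of predicting the expected number of labels and then looking for outliers can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single hypothetical screenshot metric, a plain least-squares line, and a cutoff of two residual standard deviations, and the numbers are made up for illustration. The function names fit_line and flag_incomplete are likewise invented here.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def flag_incomplete(xs, ys, k=2.0):
    """Return indices of screenshots whose label count lies more than
    k residual standard deviations BELOW the regression prediction.

    The cutoff k is an assumption of this sketch, not a value from the paper.
    """
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = stdev(residuals)
    return [i for i, r in enumerate(residuals) if r < -k * spread]


# Made-up illustrative values, not data from the paper.
# metric: one automatically extracted value per screenshot (hypothetical).
# labels: number of UI elements an annotator labeled in that screenshot.
metric = [10, 20, 30, 40, 50, 60, 70, 80]
labels = [21, 39, 62, 80, 15, 119, 141, 160]

suspicious = flag_incomplete(metric, labels)
print(suspicious)
```

In this toy data, the fifth screenshot has far fewer labels than its metric value suggests, so it is the one flagged for review. A real setup would presumably use several metrics and a validated model, as the abstract indicates.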

Reference

Heil, Sebastian; Bakaev, Maxim; Gaedke, Martin: Assessing Completeness in Training Data for Image-Based Analysis of Web User Interfaces. Proceedings of the YSIP-3 Workshop, 2019.