Assessing Completeness in Training Data for Image-Based Analysis of Web User Interfaces
Dr.-Ing. Sebastian Heil
Prof. Dr.-Ing. Martin Gaedke
Young Scientist's Third International Workshop on Trends in Information Processing
Proceedings of the YSIP-3 Workshop
Analysis of user interfaces (UIs) based on their visual representation (screenshots) is becoming increasingly popular and is establishing HCI vision as a field in its own right. Seeing a UI as a human user sees it makes it possible to take into account layouts, whitespace, graphical content, etc., independently of the concrete platform and framework used. However, visual analysis requires significant amounts of training data, particularly for the classifiers that identify UI elements and their types. In this paper, we demonstrate how data completeness can be assessed in training datasets produced by crowdworkers, without the need to duplicate the extensive labeling work. In the experimental session, 11 annotators labeled more than 42,000 UI elements in nearly 500 web UI screenshots using the LabelImg tool with a predefined set of classes corresponding to visually identifiable web page element types. We identify metrics that can be extracted automatically from UI screenshots and construct regression models that predict the expected number of labeled elements in a screenshot. The results can be used for outlier analysis of crowdworkers on any existing microtasking platform.
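The completeness check described in the abstract can be illustrated with a minimal sketch. The abstract does not state which screenshot metrics or which regression form the paper uses, so everything here is an assumption made for illustration: a single hypothetical metric per screenshot, an ordinary least-squares line, and a standardized-residual cutoff of 2.0. The function names and the sample numbers are likewise invented and are not the paper's data.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single predictor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def flag_incomplete(xs, ys, threshold=2.0):
    """Return indices of screenshots labeled far below the predicted count.

    xs: hypothetical automatically extracted metric, one value per screenshot
    ys: number of UI elements the annotator actually labeled
    threshold: assumed cutoff on the standardized residual
    """
    a, b = fit_line(xs, ys)
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = pstdev(residuals)
    if sd == 0:
        return []  # perfect fit, nothing to flag
    # Only strongly negative residuals suggest missing labels.
    return [i for i, r in enumerate(residuals) if r / sd < -threshold]


# Invented sample data: the seventh screenshot has far fewer labels
# than its metric value would suggest.
metric = [10, 20, 30, 40, 50, 60, 70, 80]
labels = [21, 39, 61, 80, 99, 121, 30, 160]
suspect = flag_incomplete(metric, labels)
```

Only negative residuals are flagged because the concern is incompleteness, that is, fewer labels than expected. In practice the flagged screenshots would be grouped by annotator and reviewed manually rather than rejected automatically.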
Heil, Sebastian; Bakaev, Maxim; Gaedke, Martin: Assessing Completeness in Training Data for Image-Based Analysis of Web User Interfaces. Proceedings of the YSIP-3 Workshop, 2019.