Benchmarking Neural Networks-Based Approaches for Predicting Visual Perception of User Interfaces
Deep Learning techniques have become the mainstream and largely unquestioned standard in many fields, e.g. convolutional neural networks (CNNs) for image analysis and recognition tasks. As testing and validation of graphical user interfaces (GUIs) increasingly rely on computer vision, CNN models that predict such subjective and informal dimensions of user experience as perceived aesthetics or complexity are starting to achieve reasonable accuracy. They do, however, require huge amounts of human-labeled training data, which are costly or unavailable in the field of Human-Computer Interaction (HCI). More traditional approaches rely on manually engineered features that are extracted from UI images with domain-specific algorithms and fed into "traditional" Machine Learning models, such as feedforward artificial neural networks (ANNs), which generally need less data. In our paper, we compare the prediction quality of a CNN (a modified GoogLeNet architecture) and an ANN model in predicting visual perception on the Aesthetics, Complexity, and Orderliness scales for about 2700 web UIs assessed by 137 users. Our results suggest that the ANN architecture produces a smaller Mean Squared Error (MSE) for the training dataset size (N) available in our study, but that the CNN should become superior with N > 2912. We also propose a regression model that can help HCI researchers estimate the expected MSE in their ML experiments.
Bakaev, Maxim; Heil, Sebastian; Chirkov, Leonid; Gaedke, Martin: Benchmarking Neural Networks-Based Approaches for Predicting Visual Perception of User Interfaces, pp. 217-231, 2022.