Improving Visual DNN-based Page Segmentation
Dr.-Ing. Sebastian Heil
Prof. Dr.-Ing. Martin Gaedke
User interfaces (UIs) change over time. Software vendors need to control the number and extent of these changes to manage customer impact, which calls for a solution that analyzes UI changes automatically. However, the same HTML/CSS/JS can render with differing appearances in different browsers. Existing solutions for detecting Cross-Browser Inconsistencies (XBI) rely on the DOM as input. The DOM is not available for source systems in web migration scenarios, and parsing it is framework- and style-dependent. Recent technologies like Web Components allow the definition of custom elements, rendering DOM-based analysis approaches infeasible. Thus, a computer-vision-based approach that is independent of the DOM and uses image segmentation techniques is required.
For visual analysis of UIs, parts of the UI need to be not only located but also classified by the type of UI element they contain. UI elements can be atomic or composite. Building on a previous solution, this thesis focuses on improving the recognition rate, measured in terms of precision, recall, accuracy and F-measure, by addressing the following problems. First, Deep Neural Networks need significant amounts of training data, so the thesis has to provide a suitable approach for training data creation. Second, training on large datasets is computationally expensive and thus time-consuming, which calls for a solution that avoids complete re-training when new samples are added to the training dataset. Third, user interface screenshots can have different sizes and screen resolutions, requiring a suitable pre-processing technique that does not degrade the recognition rate. The target F1-measure is 0.75 or better. The solution created in this thesis must provide appropriate mechanisms to integrate with further processing and analysis steps.
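One common way to handle screenshots of different sizes is to scale them to a fixed network input while preserving the aspect ratio and padding the remainder ("letterboxing"). The following is only a minimal sketch of the coordinate arithmetic involved, not the pre-processing technique chosen in the thesis. The names `Letterbox`, `letterbox_params`, `to_network` and `to_screenshot`, as well as the square input size of 512 pixels, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Letterbox:
    """Parameters of an aspect-preserving resize into a square canvas."""
    scale: float  # factor applied to both axes
    pad_x: int    # left padding in pixels
    pad_y: int    # top padding in pixels
    new_w: int    # width of the scaled screenshot
    new_h: int    # height of the scaled screenshot


def letterbox_params(width: int, height: int, target: int = 512) -> Letterbox:
    """Compute scale and padding so a screenshot fits a target x target input."""
    scale = min(target / width, target / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Center the scaled screenshot; the rest of the canvas is padding.
    pad_x = (target - new_w) // 2
    pad_y = (target - new_h) // 2
    return Letterbox(scale, pad_x, pad_y, new_w, new_h)


def to_network(box, lb: Letterbox):
    """Map an (x1, y1, x2, y2) box from screenshot to network coordinates."""
    x1, y1, x2, y2 = box
    return (x1 * lb.scale + lb.pad_x, y1 * lb.scale + lb.pad_y,
            x2 * lb.scale + lb.pad_x, y2 * lb.scale + lb.pad_y)


def to_screenshot(box, lb: Letterbox):
    """Map a predicted box back to the original screenshot coordinates."""
    x1, y1, x2, y2 = box
    return ((x1 - lb.pad_x) / lb.scale, (y1 - lb.pad_y) / lb.scale,
            (x2 - lb.pad_x) / lb.scale, (y2 - lb.pad_y) / lb.scale)


if __name__ == "__main__":
    lb = letterbox_params(1920, 1080)
    print(lb)
    print(to_network((100, 200, 300, 260), lb))
```

Because the same scale is applied to both axes, UI elements keep their proportions, and predicted boxes can be mapped back to the original screenshot for the later processing and analysis steps.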
The objective of this thesis is to find an approach or a combination of approaches that solves the above problems in the context of UI object detection based on computer vision and deep learning. This particularly includes an analysis of the state of the art regarding object detection, deep learning, and page and image segmentation. Demonstrating feasibility through a prototype implementation of the concept is part of this thesis, as is a suitable evaluation on a representative test dataset, including performance and quality measurements.
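The quality measures named above (precision, recall, accuracy and F-measure) can be derived from the counts of true positives, false positives, false negatives and true negatives. The following is a minimal sketch under the assumption that detections have already been matched against ground-truth UI elements. The function name `detection_scores` and the example counts are hypothetical and are not results of the thesis.

```python
def detection_scores(tp: int, fp: int, fn: int, tn: int = 0) -> dict:
    """Compute precision, recall, F1 and accuracy from match counts.

    tp: detections matching a ground-truth UI element of the same type
    fp: detections without a matching ground-truth element
    fn: ground-truth elements that were not detected
    tn: true negatives (often undefined for detection, hence default 0)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    scores = detection_scores(tp=80, fp=20, fn=30)
    print(scores)
    print("meets target F1 of 0.75:", scores["f1"] >= 0.75)
```

With these illustrative counts, precision is 0.8, recall is about 0.727 and F1 is about 0.762, which would satisfy the target of 0.75.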