Developing a Chatbot-like Conversational User Interface to Describe Research Datasets
Intelligent Information Management
Dipl.-Inf. André Langer
Prof. Dr.-Ing. Martin Gaedke
When publishing scientific artifacts such as recorded files from an experiment, researchers are encouraged to provide additional structured metadata about certain characteristics of these datasets, as they are normally not self-descriptive. Several metadata standards and schemas already exist for that purpose. Such a metadata description commonly comprises a title, information about the author and institution, further administrative or citation metadata, a few simple and possibly ambiguous keywords, and an unstructured description of the main content. However, especially for early-career researchers, getting started with research data publishing is an obstacle: they are often unaware of relevant existing standards, find extensive submission forms tedious to fill out, or regard the task as a time-consuming activity without support or interaction.
The objective of this internship is to investigate an alternative user input interface for gathering structured metadata about a scientific dataset that is to be published. After briefly describing the current situation and the vision with a motivating scenario, an analysis chapter has to identify requirements, relevant guidelines such as the FAIR and CARE principles, and existing approaches. A concept has to be designed that shows how a conversational user interface system can be realized and which possibilities exist for an interactive, adaptive dialog with the user. The solution shall then be implemented in a web application as a proof of concept based on an appropriate framework, such as Rasa. After interacting with the user, it shall offer the possibility to download the collected and prepared metadata in a structured file format such as RDF/XML or JSON-LD. Finally, the correctness of the solution and the improvement in user experience compared to a traditional static web form for research dataset meta-description could be tested briefly.
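To illustrate the export step, the following minimal sketch shows how answers collected during a dialog could be serialized as JSON-LD. It is not a prescribed design: the use of the schema.org `Dataset` vocabulary, the function name `slots_to_jsonld`, the slot names (`title`, `author`, `institution`, `keywords`, `description`) and the sample values are assumptions made for illustration only. The topic description itself fixes neither a vocabulary nor a data model.

```python
import json


def slots_to_jsonld(slots):
    """Map collected dialog answers to a JSON-LD document.

    Assumption: `slots` is a plain dict with hypothetical keys; the
    schema.org Dataset vocabulary is one possible target, not a requirement.
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": slots["title"],
        "description": slots["description"],
        "creator": {"@type": "Person", "name": slots["author"]},
    }
    # Optional answers are only emitted if the user actually provided them.
    if slots.get("institution"):
        doc["creator"]["affiliation"] = {
            "@type": "Organization",
            "name": slots["institution"],
        }
    if slots.get("keywords"):
        doc["keywords"] = list(slots["keywords"])
    return doc


# Placeholder values standing in for answers gathered by the chatbot.
example_slots = {
    "title": "Example measurement series",
    "description": "Placeholder description of the dataset content.",
    "author": "Jane Doe",
    "institution": "Example University",
    "keywords": ["example", "placeholder"],
}

# Serialized text that a web application could offer as a file download.
jsonld_text = json.dumps(slots_to_jsonld(example_slots), indent=2)
print(jsonld_text)
```

In a Rasa-based prototype, the dictionary would presumably be filled from the slots of the conversation tracker; an RDF/XML export could be derived from the same intermediate structure with an RDF library.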