PUBLICATION

VISH: Does Your Smart Home Dialogue System Also Need Training Data?

Type

Conference Paper

Year

2020

Authors

Dr.-Ing. Sebastian Heil

senior researcher

Room: A11.320

Phone: +49 371 531 32861

Email: sebastian.heil@informatik.tu-chemnitz.de

Prof. Dr.-Ing. Martin Gaedke

professor

Room: 1/B319

Phone: +49 371 531 25530

Email: gaedke@informatik.tu-chemnitz.de

Research Area

Web Engineering

Event

20th International Conference on Web Engineering

Published in

Web Engineering. ICWE 2020. Lecture Notes in Computer Science, vol 12128

ISBN/ISSN

978-3-030-50577-6

Abstract

The main objective of smart homes is to improve the quality of life and comfort of their inhabitants through automation systems and ambient intelligence. Voice-based interaction, such as dialogue systems, is an emerging trend in these systems. A Natural Language Understanding (NLU) model identifies the end users’ intentions in the utterances provided to a spoken dialogue system. The utility of a dialogue system depends on the quality of its NLU model, which in turn depends significantly on the availability of a high-quality, sufficiently large training corpus containing diverse utterance structures. However, building such corpora is a complex task, even for companies with significant human and infrastructure resources. Moreover, the existing corpora for the smart home domain are either concerned with web services, focus on direct goals only, follow a static command structure, or are not publicly available in English, which limits the development of goal-oriented dialogue systems for smart homes. In this paper, we propose a generic method to create training data for the NLU component using a generative grammar-based approach. Our method outputs the Voice Interaction in Smart Home (VISH) dataset, consisting of five million unique utterances for the smart home domain. This dataset can greatly facilitate research in the area of voice-based dialogue systems for smart homes. We evaluate the approach by using VISH to train several state-of-the-art NLU models. Our experimental results demonstrate the capability of the corpus to support the development of goal-oriented voice-based dialogue systems in the context of smart homes.
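To make the idea of grammar-based utterance generation concrete, the following is a minimal sketch and not the authors' implementation. The grammar, its non-terminal names (`UTTERANCE`, `POLITE`, `ACTION`, `DEVICE`, `LOCATION`), and the `expand` helper are all hypothetical and chosen only for illustration; the actual VISH grammar is not given in this abstract.

```python
from itertools import product

# Hypothetical toy grammar (an assumption, not the VISH grammar).
# Each non-terminal maps to a list of alternatives; each alternative
# is a sequence of terminals and/or non-terminals.
GRAMMAR = {
    "UTTERANCE": [["POLITE", "ACTION", "DEVICE", "LOCATION"]],
    "POLITE": [[""], ["please"], ["could you"]],
    "ACTION": [["turn on"], ["switch off"]],
    "DEVICE": [["the light"], ["the heater"]],
    "LOCATION": [[""], ["in the kitchen"], ["in the bedroom"]],
}


def expand(symbol, grammar):
    """Return every terminal string derivable from `symbol`."""
    if symbol not in grammar:
        # Anything that is not a grammar key is a terminal and yields itself.
        return [symbol]
    results = []
    for alternative in grammar[symbol]:
        # Expand each symbol of the alternative, then combine all options.
        parts = [expand(s, grammar) for s in alternative]
        for combo in product(*parts):
            # Skip empty terminals so optional slots leave no stray spaces.
            results.append(" ".join(p for p in combo if p))
    return results


# Deduplicate and sort so the output is stable across runs.
utterances = sorted(set(expand("UTTERANCE", GRAMMAR)))
print(len(utterances))
for utterance in utterances[:3]:
    print(utterance)
```

With 3 politeness options, 2 actions, 2 devices, and 3 locations, this toy grammar yields 3 × 2 × 2 × 3 = 36 distinct utterances. The count grows multiplicatively with every added alternative, which suggests how a larger grammar could reach corpus sizes in the millions.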

Reference

Noura, Mahda; Heil, Sebastian; Gaedke, Martin: VISH: Does Your Smart Home Dialogue System Also Need Training Data? Web Engineering. ICWE 2020. Lecture Notes in Computer Science, vol 12128, pp. 171–187, 2020.