ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot

Dimitra Anastasiou


Abstract
Machine Translation (MT)-empowered chatbots are not yet established; however, we see an amazing future in which they break language barriers and enable conversation in multiple languages without time-consuming language model building and training, particularly for under-resourced languages. In this paper we focus on the under-resourced Luxembourgish language. We describe experiments with a manually created dataset of administrative questions, which we use to offer BERT QA capabilities to a multilingual chatbot. The chatbot supports the creation of visual dialog flow diagrams (through an interface called BotStudio), in which each dialog node manages the user's question at a specific step. Dialog nodes can be matched to the user's question by a BERT classification model that labels the question with a dialog node label.
Anthology ID:
2022.sigul-1.27
Volume:
Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Maite Melero, Sakriani Sakti, Claudia Soria
Venue:
SIGUL
SIG:
SIGUL
Publisher:
European Language Resources Association
Pages:
207–212
URL:
https://aclanthology.org/2022.sigul-1.27
Cite (ACL):
Dimitra Anastasiou. 2022. ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot. In Proceedings of the 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages, pages 207–212, Marseille, France. European Language Resources Association.
Cite (Informal):
ENRICH4ALL: A First Luxembourgish BERT Model for a Multilingual Chatbot (Anastasiou, SIGUL 2022)
PDF:
https://aclanthology.org/2022.sigul-1.27.pdf