German Dialect Identification and Mapping for Preservation and Recovery

Aynalem Tesfaye Misganaw, Sabine Roller


Abstract
Many linguistic projects that focus on dialects collect audio data, analyze it, and interpret it linguistically. The outcomes of such projects are valuable language resources, because dialects are among the less-resourced languages: most of them are oral traditions. Our project, Dialektatlas Mittleres Westdeutschland (DMW), studies German language varieties by collecting audio recordings of words and phrases that linguistic experts selected for their significance in distinguishing dialects from one another. We used a total of 7,814 audio snippets of these words and phrases from eight different dialects of central western Germany. We addressed the problem of dialect mapping with a multilabel classification approach using the Support Vector Machine (SVM) algorithm. The experiments showed a promising accuracy of 87%.
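As a rough illustration of the kind of pipeline the abstract describes, the sketch below trains an SVM to separate eight classes of fixed-length feature vectors. It is not the authors' implementation. Everything specific in it is an assumption: the use of scikit-learn, the synthetic 13-dimensional vectors standing in for acoustic features extracted from the audio snippets, the RBF kernel, the 75/25 split, and the simplification that each snippet carries exactly one dialect label.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed sizes: eight dialects as in the abstract; the other numbers are illustrative.
N_DIALECTS = 8
SNIPPETS_PER_DIALECT = 40
N_FEATURES = 13

# Synthetic stand-in for per-snippet acoustic features (NOT the DMW data):
# each dialect is a Gaussian cloud around its own randomly drawn center.
rng = np.random.default_rng(0)
centers = rng.normal(scale=3.0, size=(N_DIALECTS, N_FEATURES))
X = np.vstack(
    [c + rng.normal(size=(SNIPPETS_PER_DIALECT, N_FEATURES)) for c in centers]
)
y = np.repeat(np.arange(N_DIALECTS), SNIPPETS_PER_DIALECT)

# Hold out a stratified test set so every dialect appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit an SVM classifier (kernel choice is an assumption).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_train, y_train)

# Fraction of held-out snippets assigned to the correct dialect.
accuracy = clf.score(X_test, y_test)
predicted = clf.predict(X_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

With real recordings, the synthetic block would be replaced by a feature extraction step over the audio files, and the reported accuracy would depend on that choice as well as on the SVM settings.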
Anthology ID:
2022.eurali-1.10
Volume:
Proceedings of the Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Atul Kr. Ojha, Sina Ahmadi, Chao-Hong Liu, John P. McCrae
Venue:
EURALI
Publisher:
European Language Resources Association
Pages:
65–69
URL:
https://aclanthology.org/2022.eurali-1.10
Cite (ACL):
Aynalem Tesfaye Misganaw and Sabine Roller. 2022. German Dialect Identification and Mapping for Preservation and Recovery. In Proceedings of the Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia within the 13th Language Resources and Evaluation Conference, pages 65–69, Marseille, France. European Language Resources Association.
Cite (Informal):
German Dialect Identification and Mapping for Preservation and Recovery (Misganaw & Roller, EURALI 2022)
PDF:
https://aclanthology.org/2022.eurali-1.10.pdf