Are modern neural ASR architectures robust for polysynthetic languages?

Eric Le Ferrand, Zoey Liu, Antti Arppe, Emily Prud’hommeaux


Abstract
Automatic speech recognition (ASR) technology is frequently proposed as a means of preservation and documentation of endangered languages, with promising results thus far. Among the endangered languages spoken today, a significant number exhibit complex morphology. The models employed in contemporary language documentation pipelines that utilize ASR, however, are predominantly based on isolating or inflectional languages, often from the Indo-European family. This raises a critical concern: building models exclusively on such languages may introduce a bias, resulting in better performance with simpler morphological structures. In this paper, we investigate the performance of modern ASR architectures on morphologically complex languages. Results indicate that modern ASR architectures appear less robust in managing high OOV rates for morphologically complex languages in terms of word error rate, while character error rates are consistently higher for isolating languages.
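The abstract's contrast between word error rate (WER), character error rate (CER), and out-of-vocabulary (OOV) rate can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names (`edit_distance`, `wer`, `cer`, `oov_rate`) and the toy strings are assumptions made for illustration, and the strings are placeholder tokens rather than data from any real language. The sketch assumes whitespace tokenization and a standard Levenshtein distance.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def cer(ref, hyp):
    """Character error rate: character-level edit distance (spaces included)
    over reference length."""
    return edit_distance(ref, hyp) / len(ref)


def oov_rate(train_texts, test_texts):
    """Share of test tokens that never occur in the training transcripts."""
    vocab = {w for t in train_texts for w in t.split()}
    test_tokens = [w for t in test_texts for w in t.split()]
    return sum(w not in vocab for w in test_tokens) / len(test_tokens)


# Hypothetical placeholder strings: one long "word" and one short one.
reference = "abcdefghij kl"
hypothesis = "abcdefghix kl"  # a single wrong character in the long word

print(wer(reference, hypothesis))       # 1 of 2 words is wrong
print(cer(reference, hypothesis))       # 1 of 13 characters is wrong
print(oov_rate(["kl mn"], [reference])) # 1 of 2 test tokens is unseen
```

In this toy case a single character error makes half of the words wrong while the character error rate stays below 8%, which is one way long, morphologically complex word forms can produce a high WER alongside a low CER.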
Anthology ID:
2024.findings-emnlp.166
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2953–2963
URL:
https://aclanthology.org/2024.findings-emnlp.166/
DOI:
10.18653/v1/2024.findings-emnlp.166
Cite (ACL):
Eric Le Ferrand, Zoey Liu, Antti Arppe, and Emily Prud’hommeaux. 2024. Are modern neural ASR architectures robust for polysynthetic languages?. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2953–2963, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Are modern neural ASR architectures robust for polysynthetic languages? (Le Ferrand et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.166.pdf