Probing of pretrained multilingual models on the knowledge of discourse

Mary Godunova, Ekaterina Voloshina


Abstract
With the rise of large language models (LLMs), different evaluation methods, including probing methods, are gaining more attention. Probing methods are meant to evaluate LLMs on their linguistic abilities. However, most studies focus on morphology and syntax, leaving discourse research out of scope. At the same time, understanding discourse and pragmatics is crucial to building up the conversational abilities of models. In this paper, we address the problem of probing several models for discourse knowledge in 10 languages. We present an algorithm to automatically adapt existing discourse tasks to other languages based on the Universal Dependencies (UD) annotation. We find that models perform similarly on high- and low-resourced languages. However, the models’ overall low performance shows that they do not acquire discourse knowledge well enough.
Anthology ID:
2024.codi-1.8
Volume:
Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024)
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Michael Strube, Chloe Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loaiciga, Amir Zeldes, Chuyuan Li
Venues:
CODI | WS
Publisher:
Association for Computational Linguistics
Pages:
78–90
URL:
https://aclanthology.org/2024.codi-1.8
Cite (ACL):
Mary Godunova and Ekaterina Voloshina. 2024. Probing of pretrained multilingual models on the knowledge of discourse. In Proceedings of the 5th Workshop on Computational Approaches to Discourse (CODI 2024), pages 78–90, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Probing of pretrained multilingual models on the knowledge of discourse (Godunova & Voloshina, CODI-WS 2024)
PDF:
https://aclanthology.org/2024.codi-1.8.pdf
Supplementary material:
 2024.codi-1.8.SupplementaryMaterial.zip
Video:
 https://aclanthology.org/2024.codi-1.8.mp4