Large Language Models for Multilingual Slavic Named Entity Linking

Rinalds Vīksna, Inguna Skadiņa, Daiga Deksne, Roberts Rozis


Abstract
This paper describes our submission to the 4th Shared Task on SlavNER, covering three Slavic languages: Czech, Polish, and Russian. We fine-tune the pre-trained multilingual XLM-R language model (Conneau et al., 2020) for these three languages on the datasets provided by the organizers. Our multilingual NER model achieves an F-score of 0.896 across all corpora, with the best result for Czech (0.914) and the worst for Russian (0.880). Our cross-language entity linking module achieves an F-score of 0.669 in the official SlavNER 2023 evaluation.
Anthology ID:
2023.bsnlp-1.20
Volume:
Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023)
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Jakub Piskorski, Michał Marcińczuk, Preslav Nakov, Maciej Ogrodniczuk, Senja Pollak, Pavel Přibáň, Piotr Rybak, Josef Steinberger, Roman Yangarber
Venue:
BSNLP
Publisher:
Association for Computational Linguistics
Pages:
172–178
URL:
https://aclanthology.org/2023.bsnlp-1.20
DOI:
10.18653/v1/2023.bsnlp-1.20
Cite (ACL):
Rinalds Vīksna, Inguna Skadiņa, Daiga Deksne, and Roberts Rozis. 2023. Large Language Models for Multilingual Slavic Named Entity Linking. In Proceedings of the 9th Workshop on Slavic Natural Language Processing 2023 (SlavicNLP 2023), pages 172–178, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Large Language Models for Multilingual Slavic Named Entity Linking (Vīksna et al., BSNLP 2023)
PDF:
https://aclanthology.org/2023.bsnlp-1.20.pdf
Video:
https://aclanthology.org/2023.bsnlp-1.20.mp4