ÚFAL LatinPipe at EvaLatin 2024: Morphosyntactic Analysis of Latin

Milan Straka, Jana Straková, Federica Gamba


Abstract
We present LatinPipe, the winning submission to the EvaLatin 2024 Dependency Parsing shared task. Our system consists of a fine-tuned concatenation of base and large pre-trained LMs, with a dot-product attention head for parsing and softmax classification heads for morphology, so that dependency parsing and morphological analysis are learned jointly. It is trained by sampling from seven publicly available Latin corpora, with annotations harmonized to achieve a more unified annotation style. Before fine-tuning, we train the system for a few initial epochs with frozen weights. We also add local relative contextualization by stacking BiLSTM layers on top of the Transformer(s). Finally, we ensemble output probability distributions from seven randomly instantiated networks for the final submission. The code is available at https://github.com/ufal/evalatin2024-latinpipe.
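The ensembling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the distributions are combined by an unweighted arithmetic mean and that each token is then labeled by a plain argmax; the abstract states neither detail, and the function names `ensemble_average` and `predict` are hypothetical, not taken from the LatinPipe repository.

```python
# Illustrative sketch only: unweighted averaging of per-token probability
# distributions from several models. The averaging rule and the argmax
# decoding are assumptions, not the authors' documented procedure.

def ensemble_average(distributions):
    """Average per-token class distributions.

    distributions[m][t][c] is the probability that model m assigns
    to class c for token t. All models must cover the same tokens
    and the same classes.
    """
    n_models = len(distributions)
    n_tokens = len(distributions[0])
    averaged = []
    for t in range(n_tokens):
        n_classes = len(distributions[0][t])
        averaged.append([
            sum(model[t][c] for model in distributions) / n_models
            for c in range(n_classes)
        ])
    return averaged


def predict(averaged):
    """Pick the highest-probability class index for each token."""
    return [max(range(len(p)), key=p.__getitem__) for p in averaged]


# Two toy models, two tokens, two classes.
model_a = [[0.75, 0.25], [0.25, 0.75]]
model_b = [[0.5, 0.5], [0.25, 0.75]]
averaged = ensemble_average([model_a, model_b])
labels = predict(averaged)
```

The same averaging would apply to each morphological classification head separately. For dependency heads, a real parser would typically decode a tree from the averaged scores rather than take an independent argmax per token.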
Anthology ID:
2024.lt4hala-1.24
Volume:
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Rachele Sprugnoli, Marco Passarotti
Venues:
LT4HALA | WS
Publisher:
ELRA and ICCL
Pages:
207–214
URL:
https://aclanthology.org/2024.lt4hala-1.24
Cite (ACL):
Milan Straka, Jana Straková, and Federica Gamba. 2024. ÚFAL LatinPipe at EvaLatin 2024: Morphosyntactic Analysis of Latin. In Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024, pages 207–214, Torino, Italia. ELRA and ICCL.
Cite (Informal):
ÚFAL LatinPipe at EvaLatin 2024: Morphosyntactic Analysis of Latin (Straka et al., LT4HALA-WS 2024)
PDF:
https://aclanthology.org/2024.lt4hala-1.24.pdf