Automated Extraction of Hypo-Hypernym Relations for the Ukrainian WordNet

Nataliia Romanyshyn, Dmytro Chaplynskyi, Mariana Romanyshyn


Abstract
WordNet is a crucial resource in linguistics and natural language processing, providing a detailed and expansive set of lexico-semantic relationships among words in a language. Automated construction and expansion of WordNets has become increasingly popular due to the high cost of manual development. This study aims to automate the development of the Ukrainian WordNet, concentrating specifically on hypo-hypernym relations, which are the building blocks of WordNet's hierarchical structure. Using the links between Princeton WordNet (PWN), Wikidata, and multilingual resources from Wikipedia, the proposed approach mapped 17% of PWN content to Ukrainian Wikipedia. Furthermore, the study introduces three strategies for generating new entries to fill the gaps in the Ukrainian WordNet: machine translation, the Hypernym Discovery model, and the Hypernym Instruction-Following LLaMA model. The last of these proves the most effective, scoring 41.61% on the Mean Overlap Coefficient (MOC) metric. By combining automated techniques with expert human input, the proposed approach provides a reliable basis for creating the Ukrainian WordNet.
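The abstract reports the MOC score without defining it. As a rough illustration only, the sketch below assumes MOC is the standard overlap coefficient (intersection size divided by the size of the smaller set) averaged over all queries; the paper's exact definition may differ, and the function names `overlap_coefficient` and `mean_overlap_coefficient` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def overlap_coefficient(predicted: Iterable[str], gold: Iterable[str]) -> float:
    """Overlap coefficient |P ∩ G| / min(|P|, |G|) of two hypernym sets.

    Assumption: this is the standard overlap coefficient; the paper may
    normalize or truncate the predictions differently.
    """
    p, g = set(predicted), set(gold)
    if not p or not g:
        # Treat an empty prediction or empty gold set as no overlap.
        return 0.0
    return len(p & g) / min(len(p), len(g))


def mean_overlap_coefficient(
    pairs: Sequence[Tuple[Iterable[str], Iterable[str]]]
) -> float:
    """Average the overlap coefficient over (predicted, gold) pairs."""
    if not pairs:
        return 0.0
    return sum(overlap_coefficient(p, g) for p, g in pairs) / len(pairs)
```

For example, a query whose two predicted hypernyms include the single gold hypernym scores 1.0, while a query with one of two gold hypernyms matched by two predictions scores 0.5; the mean over both is 0.75.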
Anthology ID:
2024.unlp-1.7
Volume:
Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Mariana Romanyshyn, Nataliia Romanyshyn, Andrii Hlybovets, Oleksii Ignatenko
Venue:
UNLP
Publisher:
ELRA and ICCL
Pages:
51–60
URL:
https://aclanthology.org/2024.unlp-1.7
Cite (ACL):
Nataliia Romanyshyn, Dmytro Chaplynskyi, and Mariana Romanyshyn. 2024. Automated Extraction of Hypo-Hypernym Relations for the Ukrainian WordNet. In Proceedings of the Third Ukrainian Natural Language Processing Workshop (UNLP) @ LREC-COLING 2024, pages 51–60, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Automated Extraction of Hypo-Hypernym Relations for the Ukrainian WordNet (Romanyshyn et al., UNLP 2024)
PDF:
https://aclanthology.org/2024.unlp-1.7.pdf