Reconstruction of Cuneiform Literary Texts as Text Matching

Fabian Simonjetz, Jussi Laasonen, Yunus Cobanoglu, Alexander Fraser, Enrique Jiménez


Abstract
Ancient Mesopotamian literature is riddled with gaps, caused by the decay and fragmentation of its writing material, clay tablets. The discovery of overlaps between fragments allows reconstruction to advance, but it is a slow and unsystematic process. Since new pieces are found and digitized constantly, NLP techniques can help to identify fragments and match them with existing text collections to restore complete literary works. We compare a number of approaches and determine that a character-level n-gram-based similarity matching approach works well for this problem, leading to a large speed-up for researchers in Assyriology.
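
The character-level n-gram similarity matching named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the n-gram size, the similarity measure, or the input representation, so the trigram default, the cosine measure, and the function names `char_ngrams`, `cosine_similarity`, and `rank_candidates` are all assumptions made for illustration. The sketch assumes fragments and known texts are available as plain transliteration strings.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count the overlapping character n-grams of a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # One of the strings is shorter than n, so it has no n-grams.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(fragment, corpus, n=3):
    """Rank known texts by n-gram similarity to a fragment.

    `corpus` maps a text identifier to its transliteration string
    (a hypothetical structure chosen for this sketch).
    Returns (identifier, score) pairs, best match first.
    """
    fragment_grams = char_ngrams(fragment, n)
    scores = [
        (name, cosine_similarity(fragment_grams, char_ngrams(text, n)))
        for name, text in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In use, a researcher would call `rank_candidates` with a newly digitized fragment and inspect only the top-ranked texts by hand. Working on characters rather than whole words is one plausible way to tolerate broken or partially preserved words, although the abstract itself does not give the reason for the choice.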
Anthology ID:
2024.lrec-main.1197
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
13712–13721
URL:
https://aclanthology.org/2024.lrec-main.1197
Cite (ACL):
Fabian Simonjetz, Jussi Laasonen, Yunus Cobanoglu, Alexander Fraser, and Enrique Jiménez. 2024. Reconstruction of Cuneiform Literary Texts as Text Matching. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 13712–13721, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Reconstruction of Cuneiform Literary Texts as Text Matching (Simonjetz et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.1197.pdf