Surveying the FAIRness of Annotation Tools: Difficult to find, difficult to reuse

Ekaterina Borisova, Raia Abu Ahmad, Leyla Garcia-Castro, Ricardo Usbeck, Georg Rehm


Abstract
In the realm of Machine Learning and Deep Learning, high-quality annotated data is needed to train and evaluate supervised models. An extensive number of annotation tools have been developed to facilitate the data labelling process. However, finding the right tool is a demanding task that involves thorough searching and testing. Hence, to navigate the multitude of tools effectively, it is essential to ensure their findability, accessibility, interoperability, and reusability (FAIR). This survey addresses the FAIRness of existing annotation software by evaluating 50 different tools against the FAIR principles for research software (FAIR4RS). The study indicates that, while annotation tools are largely accessible and interoperable, they are difficult to find and reuse. In addition, there is a need to establish community standards for annotation software development, documentation, and distribution.
Anthology ID:
2024.law-1.4
Volume:
Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)
Month:
March
Year:
2024
Address:
St. Julians, Malta
Editors:
Sophie Henning, Manfred Stede
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
29–45
URL:
https://aclanthology.org/2024.law-1.4
Cite (ACL):
Ekaterina Borisova, Raia Abu Ahmad, Leyla Garcia-Castro, Ricardo Usbeck, and Georg Rehm. 2024. Surveying the FAIRness of Annotation Tools: Difficult to find, difficult to reuse. In Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII), pages 29–45, St. Julians, Malta. Association for Computational Linguistics.
Cite (Informal):
Surveying the FAIRness of Annotation Tools: Difficult to find, difficult to reuse (Borisova et al., LAW-WS 2024)
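Cite (BibTeX):
The following entry is assembled from the metadata on this page. The citation key is an assumption based on the Anthology's usual first-author/etal/year/first-word naming convention and should be verified against the official Anthology export.
% NOTE: citation key assumed from the Anthology naming convention; verify before use
@inproceedings{borisova-etal-2024-surveying,
    title = "Surveying the {FAIR}ness of Annotation Tools: Difficult to find, difficult to reuse",
    author = "Borisova, Ekaterina and Abu Ahmad, Raia and Garcia-Castro, Leyla and Usbeck, Ricardo and Rehm, Georg",
    editor = "Henning, Sophie and Stede, Manfred",
    booktitle = "Proceedings of The 18th Linguistic Annotation Workshop (LAW-XVIII)",
    month = mar,
    year = "2024",
    address = "St. Julians, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.law-1.4",
    pages = "29--45",
}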
PDF:
https://aclanthology.org/2024.law-1.4.pdf
Video:
https://aclanthology.org/2024.law-1.4.mp4