OLALA: Object-Level Active Learning for Efficient Document Layout Annotation

Zejiang Shen, Weining Li, Jian Zhao, Yaoliang Yu, Melissa Dell


Abstract
Layout detection is an essential step for accurately extracting structured contents from historical documents. The intricate and varied layouts present in these document images make it expensive to label the numerous layout regions that can be densely arranged on each page. Current active learning methods typically rank and label samples at the image level, where the annotation budget is not optimally spent due to the overexposure of common objects per image. Inspired by recent progress in semi-supervised learning and self-training, we propose OLALA, an Object-Level Active Learning framework for efficient document layout Annotation. OLALA aims to optimize the annotation process by selectively annotating only the most ambiguous regions within an image, while using automatically generated labels for the rest. Central to OLALA is a perturbation-based scoring function that determines which objects require manual annotation. Extensive experiments show that OLALA can significantly boost model performance and improve annotation efficiency, facilitating the extraction of masses of structured text for downstream NLP applications.
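The object-level selection idea in the abstract can be illustrated with a small sketch. The abstract does not specify the scoring function, so everything below is an assumption made for illustration: each predicted box is nudged by a few small offsets, a hypothetical `refine` callable (standing in for the detector re-predicting a box from a perturbed proposal) is applied, and the score is one minus the mean IoU between the original and perturbed predictions. The names `perturbation_score`, `split_for_annotation`, `toy_refine`, and `OFFSETS` are hypothetical and do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Refine = Callable[[Box], Box]            # hypothetical stand-in for the detector

# Assumed perturbations: small horizontal and vertical shifts, in pixels.
OFFSETS: List[Tuple[float, float]] = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def perturbation_score(box: Box, refine: Refine,
                       offsets: Sequence[Tuple[float, float]] = OFFSETS) -> float:
    """Return 1 - mean IoU between the prediction and its perturbed variants.

    A higher score means the prediction is less stable, i.e. more ambiguous.
    """
    base = refine(box)
    ious = []
    for dx, dy in offsets:
        shifted = (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
        ious.append(iou(base, refine(shifted)))
    return 1.0 - sum(ious) / len(ious)


def split_for_annotation(boxes: Sequence[Box], refine: Refine,
                         budget: int) -> Tuple[List[int], List[int]]:
    """Send the `budget` most ambiguous objects to a human; keep model labels for the rest."""
    ranked = sorted(range(len(boxes)),
                    key=lambda i: perturbation_score(boxes[i], refine),
                    reverse=True)
    manual = sorted(ranked[:budget])  # indices to annotate by hand
    auto = sorted(ranked[budget:])    # indices that keep the predicted label
    return manual, auto


def toy_refine(box: Box) -> Box:
    """Toy detector for demonstration only: snaps each coordinate to a 10-pixel grid."""
    return tuple(round(v / 10) * 10 for v in box)  # type: ignore[return-value]


if __name__ == "__main__":
    page = [(0, 0, 20, 20), (4, 4, 24, 24), (40, 40, 60, 60)]
    manual, auto = split_for_annotation(page, toy_refine, budget=1)
    print("annotate manually:", manual)
    print("use model labels:", auto)
```

In this toy setup, a box whose edges sit close to a grid boundary flips between two snapped positions under small shifts, so it receives a higher score than boxes that snap to the same position every time. A real system would replace `toy_refine` with calls to the trained layout detector.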
Anthology ID:
2022.nlpcss-1.19
Volume:
Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)
Month:
November
Year:
2022
Address:
Abu Dhabi, UAE
Editors:
David Bamman, Dirk Hovy, David Jurgens, Katherine Keith, Brendan O'Connor, Svitlana Volkova
Venue:
NLP+CSS
Publisher:
Association for Computational Linguistics
Pages:
170–182
URL:
https://aclanthology.org/2022.nlpcss-1.19
DOI:
10.18653/v1/2022.nlpcss-1.19
Cite (ACL):
Zejiang Shen, Weining Li, Jian Zhao, Yaoliang Yu, and Melissa Dell. 2022. OLALA: Object-Level Active Learning for Efficient Document Layout Annotation. In Proceedings of the Fifth Workshop on Natural Language Processing and Computational Social Science (NLP+CSS), pages 170–182, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal):
OLALA: Object-Level Active Learning for Efficient Document Layout Annotation (Shen et al., NLP+CSS 2022)
PDF:
https://aclanthology.org/2022.nlpcss-1.19.pdf