Enhancing Extreme Multi-Label Text Classification: Addressing Challenges in Model, Data, and Evaluation

Dan Li, Zi Long Zhu, Janneke van de Loo, Agnes Masip Gomez, Vikrant Yadav, Georgios Tsatsaronis, Zubair Afzal


Abstract
Extreme multi-label text classification is a prevalent task in industry, but it frequently faces machine learning challenges, including model limitations, data scarcity, and time-consuming evaluation. This paper aims to mitigate these issues with three novel approaches. First, we propose a label ranking model as an alternative to the conventional SciBERT-based classification model, enabling efficient handling of large-scale label sets and accommodating new labels. Second, we present an active learning-based pipeline that addresses the scarcity of data for new labels when a classification system is updated. Finally, we use ChatGPT to assist with model evaluation. Our experiments demonstrate the effectiveness of these techniques in enhancing the extreme multi-label text classification task.
Anthology ID:
2023.emnlp-industry.30
Volume:
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
Month:
December
Year:
2023
Address:
Singapore
Editors:
Mingxuan Wang, Imed Zitouni
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
313–321
URL:
https://aclanthology.org/2023.emnlp-industry.30
DOI:
10.18653/v1/2023.emnlp-industry.30
Cite (ACL):
Dan Li, Zi Long Zhu, Janneke van de Loo, Agnes Masip Gomez, Vikrant Yadav, Georgios Tsatsaronis, and Zubair Afzal. 2023. Enhancing Extreme Multi-Label Text Classification: Addressing Challenges in Model, Data, and Evaluation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track, pages 313–321, Singapore. Association for Computational Linguistics.
Cite (Informal):
Enhancing Extreme Multi-Label Text Classification: Addressing Challenges in Model, Data, and Evaluation (Li et al., EMNLP 2023)
PDF:
https://aclanthology.org/2023.emnlp-industry.30.pdf
Video:
https://aclanthology.org/2023.emnlp-industry.30.mp4