A Joint Framework for Ancient Chinese WS and POS Tagging Based on Adversarial Ensemble Learning

Shuxun Yang


Abstract
Ancient Chinese word segmentation (WS) and part-of-speech (POS) tagging are crucial for the study of ancient Chinese and the dissemination of traditional Chinese culture. Current methods face problems such as the lack of large-scale labeled data, error propagation between the individual tasks, and limited model robustness and generalization. We therefore propose a joint framework for ancient Chinese WS and POS tagging based on adversarial ensemble learning, called AENet. Built on a pre-training and fine-tuning paradigm, AENet couples WS and POS tagging into a single joint sequence tagging task. AENet further incorporates adversarial training and ensemble learning, which improves recognition performance while enhancing the model's robustness and generalization. Our experiments demonstrate that, compared with the baseline, AENet improves the F1 score of word segmentation by 4.48% and that of part-of-speech tagging by 2.29% on the test dataset, showing high performance and strong generalization.
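The abstract's central idea, collapsing WS and POS tagging into one sequence tagging task, is commonly realized by fusing a segmentation-position tag with a POS tag into a single per-character label, and adversarial training is often implemented as an FGM-style perturbation of the input embeddings. The sketch below illustrates both ideas under those assumptions; the exact label scheme, perturbation method, and function names here are illustrative, not AENet's verified implementation.

```python
import torch

# A minimal sketch, assuming a BMES-style joint label scheme and an
# FGM-style embedding perturbation; neither detail is confirmed by the
# abstract, so treat this as illustration rather than AENet itself.

def joint_labels(tagged_words):
    """Fuse each character's segmentation position (B/M/E/S) with its
    word's POS tag, so one sequence tagger predicts both tasks at once
    and pipeline error propagation between WS and POS is avoided."""
    labels = []
    for word, pos in tagged_words:
        if len(word) == 1:
            labels.append(f"S-{pos}")                      # single-char word
        else:
            labels.append(f"B-{pos}")                      # word-initial char
            labels.extend(f"M-{pos}" for _ in word[1:-1])  # word-internal chars
            labels.append(f"E-{pos}")                      # word-final char
    return labels

def fgm_perturbation(embeddings, loss, epsilon=1.0):
    """One FGM-style adversarial step: shift the input embeddings along
    the normalized loss gradient; training also runs on the perturbed
    input, which tends to improve robustness and generalization."""
    grad, = torch.autograd.grad(loss, embeddings, retain_graph=True)
    norm = grad.norm()
    return embeddings + epsilon * grad / norm if norm > 0 else embeddings

# Hypothetical usage: (word, POS) pairs for a toy ancient Chinese input.
print(joint_labels([("天下", "n"), ("治", "v")]))
# -> ['B-n', 'E-n', 'S-v']
```

With such a fused tag set, an ensemble can simply average or vote over the per-character label distributions of several independently trained taggers, which is one straightforward reading of the ensemble learning component mentioned in the abstract.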
Anthology ID:
2022.lt4hala-1.27
Volume:
Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Rachele Sprugnoli, Marco Passarotti
Venue:
LT4HALA
Publisher:
European Language Resources Association
Pages:
174–177
URL:
https://aclanthology.org/2022.lt4hala-1.27
Cite (ACL):
Shuxun Yang. 2022. A Joint Framework for Ancient Chinese WS and POS Tagging Based on Adversarial Ensemble Learning. In Proceedings of the Second Workshop on Language Technologies for Historical and Ancient Languages, pages 174–177, Marseille, France. European Language Resources Association.
Cite (Informal):
A Joint Framework for Ancient Chinese WS and POS Tagging Based on Adversarial Ensemble Learning (Yang, LT4HALA 2022)
PDF:
https://aclanthology.org/2022.lt4hala-1.27.pdf