Enhancing Distantly Supervised Named Entity Recognition with Strong Label Guided Lottery Training

Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang


Abstract
In low-resource Named Entity Recognition (NER) scenarios, only a limited quantity of strongly labeled data is available, while a vast amount of weakly labeled data can be easily acquired through distant supervision. However, weakly labeled data may fail to improve model performance, or may even harm it, because of the noise it inevitably contains. When training on noisy data, only certain parameters, termed safe parameters, are essential for model learning, whereas the remaining parameters tend to fit noise. In this paper, we propose a noise-robust learning framework in which safe parameters are identified with guidance from the small set of strongly labeled data, and non-safe parameters are suppressed during training on weakly labeled data for better generalization. Our method effectively mitigates the impact of noise in weakly labeled data, and it can be easily integrated with data-level noise-robust learning methods for NER. We conduct extensive experiments on multiple datasets, and the results show that our approach outperforms state-of-the-art methods.
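The two-stage idea described in the abstract (identify safe parameters using strongly labeled data, then suppress the rest while training on weakly labeled data) can be illustrated with a small sketch. The abstract does not state how safe parameters are selected or how suppression is applied, so everything below is an assumption made for illustration: ranking parameters by gradient magnitude on strong data, the `keep_ratio` threshold, damping updates by a `suppress` factor, and the function names `select_safe_mask` and `masked_sgd_step` are all hypothetical and are not taken from the paper.

```python
from typing import List


def select_safe_mask(strong_grads: List[float], keep_ratio: float) -> List[bool]:
    """Mark the top `keep_ratio` fraction of parameters as safe.

    ASSUMPTION: parameters are ranked by |gradient| measured on the
    strongly labeled data; the paper's actual criterion may differ.
    """
    n_keep = max(1, round(keep_ratio * len(strong_grads)))
    ranked = sorted(
        range(len(strong_grads)),
        key=lambda i: abs(strong_grads[i]),
        reverse=True,
    )
    safe = set(ranked[:n_keep])
    return [i in safe for i in range(len(strong_grads))]


def masked_sgd_step(
    params: List[float],
    weak_grads: List[float],
    safe_mask: List[bool],
    lr: float = 0.1,
    suppress: float = 0.0,
) -> List[float]:
    """One SGD step on weakly labeled data.

    ASSUMPTION: "suppression" is modeled by scaling the update of each
    non-safe parameter by `suppress` (0.0 freezes it entirely).
    """
    return [
        p - lr * (g if is_safe else suppress * g)
        for p, g, is_safe in zip(params, weak_grads, safe_mask)
    ]


# Toy values, chosen only to show the mechanics.
params = [0.5, -0.2, 0.1, 0.3]
strong_grads = [0.9, -0.05, 0.4, 0.01]  # gradients from strongly labeled data
weak_grads = [1.0, 1.0, 1.0, 1.0]       # gradients from weakly labeled data

mask = select_safe_mask(strong_grads, keep_ratio=0.5)
updated = masked_sgd_step(params, weak_grads, mask, lr=0.1)
print(mask)
print(updated)
```

In this toy run, the two parameters with the largest strong-data gradients are treated as safe and receive the weak-data update, while the other two stay at their previous values. A real implementation would operate on model tensors rather than Python lists and would follow the selection rule given in the paper itself.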
Anthology ID:
2024.lrec-main.524
Volume:
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Venues:
LREC | COLING
Publisher:
ELRA and ICCL
Pages:
5922–5929
URL:
https://aclanthology.org/2024.lrec-main.524
Cite (ACL):
Zhiyuan Ma, Jintao Du, Changhua Meng, and Weiqiang Wang. 2024. Enhancing Distantly Supervised Named Entity Recognition with Strong Label Guided Lottery Training. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 5922–5929, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Enhancing Distantly Supervised Named Entity Recognition with Strong Label Guided Lottery Training (Ma et al., LREC-COLING 2024)
PDF:
https://aclanthology.org/2024.lrec-main.524.pdf