SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation

Sadat Shahriar, Thamar Solorio


Abstract
Subjectivity and difference of opinion are key social phenomena, and it is crucial to take them into account when annotating and detecting derogatory textual content. In this paper, we use the four datasets provided by SemEval-2023 Task 11 and fine-tune a BERT model to capture annotator disagreement. We find that modeling individual annotators and aggregating their predictions lowers the cross-entropy score by an average of 0.21 compared to training directly on the soft labels. Our findings further show that annotator metadata contributes an average reduction of 0.029 in the cross-entropy score.
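The two strategies compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 0.5 voting threshold, the choice of hard voting as the aggregation step, and all numbers are assumptions made for illustration, and the fixed probabilities stand in for outputs of fine-tuned BERT models.

```python
import math


def soft_label(votes):
    """Fraction of annotators who marked the text as derogatory (1) rather than not (0)."""
    return sum(votes) / len(votes)


def cross_entropy(target, predicted, eps=1e-12):
    """Binary cross-entropy between a soft target and a predicted positive-class probability."""
    p = min(max(predicted, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


def aggregate(annotator_probs, threshold=0.5):
    """Assumed aggregation: threshold each annotator-specific prediction, then average the votes."""
    votes = [1 if p >= threshold else 0 for p in annotator_probs]
    return soft_label(votes)


# Hypothetical item labelled by four annotators.
gold_votes = [1, 1, 0, 0]
gold_soft = soft_label(gold_votes)

# Direct training: a single model predicts the soft label itself (made-up value).
direct_prediction = 0.7

# Individual annotator modeling: one prediction per annotator (made-up values),
# aggregated into a soft label afterwards.
per_annotator_predictions = [0.9, 0.6, 0.2, 0.4]
aggregated_prediction = aggregate(per_annotator_predictions)

print("direct CE:    ", cross_entropy(gold_soft, direct_prediction))
print("aggregated CE:", cross_entropy(gold_soft, aggregated_prediction))
```

A lower cross-entropy means the predicted distribution is closer to the distribution of annotator votes. In this toy example the aggregated prediction matches the gold soft label exactly, so its score equals the entropy of the label, which is the smallest value attainable for that item.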
Anthology ID:
2023.semeval-1.12
Volume:
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Giovanni Da San Martino, Harish Tayyar Madabushi, Ritesh Kumar, Elisa Sartori
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
94–100
URL:
https://aclanthology.org/2023.semeval-1.12
DOI:
10.18653/v1/2023.semeval-1.12
Cite (ACL):
Sadat Shahriar and Thamar Solorio. 2023. SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation. In Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 94–100, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
SafeWebUH at SemEval-2023 Task 11: Learning Annotator Disagreement in Derogatory Text: Comparison of Direct Training vs Aggregation (Shahriar & Solorio, SemEval 2023)
PDF:
https://aclanthology.org/2023.semeval-1.12.pdf
Video:
https://aclanthology.org/2023.semeval-1.12.mp4