How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments

Won Ik Cho, Jihyung Moon


Abstract
Social consensus has been established on the severity of online hate speech since it not only causes mental harm to the target, but also gives displeasure to the people who read it. For Korean, the definition and scope of hate speech have been discussed widely in researches, but such considerations were hardly extended to the construction of hate speech corpus. Therefore, we create a Korean online hate speech dataset with concrete annotation guideline to see how real world toxic expressions concern sociolinguistic discussions. This inductive observation reveals that hate speech in online news comments is mainly composed of social bias and toxicity. Furthermore, we check how the final corpus corresponds with the definition and scope of hate speech, and confirm that the overall procedure and outcome is in concurrence with the sociolinguistic discussions.
Anthology ID:
2021.nlp4dh-1.3
Volume:
Proceedings of the Workshop on Natural Language Processing for Digital Humanities
Month:
December
Year:
2021
Address:
NIT Silchar, India
Editors:
Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
Venue:
NLP4DH
SIG:
Publisher:
NLP Association of India (NLPAI)
Note:
Pages:
13–22
Language:
URL:
https://aclanthology.org/2021.nlp4dh-1.3
DOI:
Bibkey:
Cite (ACL):
Won Ik Cho and Jihyung Moon. 2021. How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments. In Proceedings of the Workshop on Natural Language Processing for Digital Humanities, pages 13–22, NIT Silchar, India. NLP Association of India (NLPAI).
Cite (Informal):
How Does the Hate Speech Corpus Concern Sociolinguistic Discussions? A Case Study on Korean Online News Comments (Cho & Moon, NLP4DH 2021)
Copy Citation:
PDF:
https://aclanthology.org/2021.nlp4dh-1.3.pdf