Korean-Specific Emotion Annotation Procedure Using N-Gram-Based Distant Supervision and Korean-Specific-Feature-Based Distant Supervision

Young-Jun Lee, Chae-Gyun Lim, Ho-Jin Choi


Abstract
Detecting emotions in text is an important NLP task, but it is limited by the scarcity of manually labeled data. To overcome this limitation, many researchers have annotated unlabeled data using frequently used annotation procedures. However, most of these studies focus on English and do not consider the characteristics of the Korean language. In this paper, we present a Korean-specific annotation procedure that consists of two parts: n-gram-based distant supervision and Korean-specific-feature-based distant supervision. We first apply distant supervision using n-grams and Korean emotion lexicons, and then take Korean-specific emotion features into account. Through experiments, we show the effectiveness of our procedure by comparing it with the KTEA dataset. Additionally, we construct a large-scale emotion-labeled dataset, the Korean Movie Review Emotion (KMRE) dataset, using our procedure. To construct this dataset, we use a large-scale sentiment movie review corpus as the unlabeled data, together with a Korean emotion lexicon provided by KTEA. We also perform an emotion classification task and a human evaluation on the KMRE dataset.
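As a rough illustration of the general idea behind n-gram-based distant supervision with an emotion lexicon, the sketch below assigns an emotion label to an unlabeled sentence by counting lexicon n-grams that occur in it. It is not the procedure from the paper: the toy lexicon entries, the whitespace tokenization, the majority-vote rule, and the names `TOY_LEXICON`, `ngrams`, and `distant_label` are all assumptions made for this example, and the Korean-specific feature step is not modeled at all.

```python
from collections import Counter

# Hypothetical toy lexicon (NOT the KTEA lexicon): n-gram -> emotion label.
TOY_LEXICON = {
    ("정말", "행복"): "happiness",
    ("행복",): "happiness",
    ("슬프다",): "sadness",
    ("화가", "난다"): "anger",
}


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distant_label(text, lexicon, max_n=2):
    """Assign the emotion whose lexicon n-grams occur most often in `text`.

    Returns None when no lexicon entry matches, so the sentence stays unlabeled.
    Assumption: whitespace tokenization, which is a simplification for Korean;
    a morphological analyzer would normally be used instead.
    """
    tokens = text.split()
    votes = Counter()
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            emotion = lexicon.get(gram)
            if emotion is not None:
                votes[emotion] += 1
    if not votes:
        return None
    # Majority vote over matched n-grams (an assumed aggregation rule).
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(distant_label("오늘 정말 행복 하다", TOY_LEXICON))
    print(distant_label("영화 가 길다", TOY_LEXICON))
```

Sentences that match no lexicon entry are left unlabeled rather than forced into a class, which keeps the automatically labeled set smaller but less noisy.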
Anthology ID:
2020.lrec-1.199
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1603–1610
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.199
Cite (ACL):
Young-Jun Lee, Chae-Gyun Lim, and Ho-Jin Choi. 2020. Korean-Specific Emotion Annotation Procedure Using N-Gram-Based Distant Supervision and Korean-Specific-Feature-Based Distant Supervision. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1603–1610, Marseille, France. European Language Resources Association.
Cite (Informal):
Korean-Specific Emotion Annotation Procedure Using N-Gram-Based Distant Supervision and Korean-Specific-Feature-Based Distant Supervision (Lee et al., LREC 2020)
PDF:
https://aclanthology.org/2020.lrec-1.199.pdf