Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts

Atnafu Lambebo Tonja, Mesay Gemeda Yigezu, Olga Kolesnikova, Moein Shahiki Tash, Grigori Sidorov, Alexander Gelbukh


Abstract
This paper is the system description for our submission to the CoLI-Kanglish 2022 shared task on word-level language identification in code-mixed Kannada-English texts. The goal of the task is to identify the language of each word in the CoLI-Kanglish 2022 dataset. Words in the dataset are labeled with one of six categories: Kannada, English, Mixed-Language, Location, Name, and Others. The code-mixed dataset was compiled by the CoLI-Kanglish 2022 organizers from social media posts. We use two classification techniques, KNN and SVM, achieving an F1-score of 0.58 and placing third out of nine competitors.
Anthology ID:
2022.icon-wlli.4
Volume:
Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts
Month:
December
Year:
2022
Address:
IIIT Delhi, New Delhi, India
Editors:
Bharathi Raja Chakravarthi, Abirami Murugappan, Dhivya Chinnappa, Adeep Hane, Prasanna Kumar Kumeresan, Rahul Ponnusamy
Venue:
ICON
Publisher:
Association for Computational Linguistics
Pages:
18–24
URL:
https://aclanthology.org/2022.icon-wlli.4
Cite (ACL):
Atnafu Lambebo Tonja, Mesay Gemeda Yigezu, Olga Kolesnikova, Moein Shahiki Tash, Grigori Sidorov, and Alexander Gelbukh. 2022. Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts. In Proceedings of the 19th International Conference on Natural Language Processing (ICON): Shared Task on Word Level Language Identification in Code-mixed Kannada-English Texts, pages 18–24, IIIT Delhi, New Delhi, India. Association for Computational Linguistics.
Cite (Informal):
Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts (Lambebo Tonja et al., ICON 2022)
PDF:
https://aclanthology.org/2022.icon-wlli.4.pdf