2024
Overview of the Third Shared Task on Speech Recognition for Vulnerable Individuals in Tamil
Bharathi B
|
Bharathi Raja Chakravarthi
|
Sripriya N
|
Rajeswari Natarajan
|
Suhasini S
Proceedings of the Fourth Workshop on Language Technology for Equality, Diversity, Inclusion
This paper presents an overview of the shared task on speech recognition for vulnerable individuals in Tamil (LT-EDI-2024). The task provides a Tamil dataset gathered from elderly individuals who identify as male, female, or transgender. The audio samples were recorded in public places such as marketplaces, vegetable shops, and hospitals. The dataset was released in two phases, training and testing. Participants were asked to process the audio signals using various models and techniques and to submit their results as transcriptions of the provided test samples. The participants' results were assessed using WER (Word Error Rate). The participants employed transformer-based approaches for automatic speech recognition. This overview paper discusses the findings and the various pre-trained transformer-based models that the participants employed.
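As an illustration of the WER metric named in the abstract, the following is a minimal sketch of the standard definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; the abstract does not describe the organizers' actual scoring script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace; no text normalization
    (punctuation, casing) is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a lower score is better, and the value can exceed 1.0 when the hypothesis contains many inserted words.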
Findings of the Shared Task on Multimodal Social Media Data Analysis in Dravidian Languages (MSMDA-DL)@DravidianLangTech 2024
Premjith B
|
Jyothish G
|
Sowmya V
|
Bharathi Raja Chakravarthi
|
K Nandhini
|
Rajeswari Natarajan
|
Abirami Murugappan
|
Bharathi B
|
Saranya Rajiakodi
|
Rahul Ponnusamy
|
Jayanth Mohan
|
Mekapati Reddy
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
This paper presents the findings of the shared task on multimodal sentiment analysis, abusive language detection, and hate speech detection in Dravidian languages. Through this shared task, researchers worldwide can submit models for three crucial social media data analysis challenges in Dravidian languages: sentiment analysis, abusive language detection, and hate speech detection. The aim is to build models that derive fine-grained sentiment from multimodal data in Tamil and Malayalam and that identify abusive and hate content in multimodal data in Tamil. Three modalities make up the multimodal data: text, audio, and video. YouTube videos were gathered to create the datasets for the tasks. Thirty-nine teams took part in the competition. However, only two teams turned in their findings. The macro F1-score was used to assess the submissions.
Overview of Second Shared Task on Sentiment Analysis in Code-mixed Tamil and Tulu
Lavanya Sambath Kumar
|
Asha Hegde
|
Bharathi Raja Chakravarthi
|
Hosahalli Shashirekha
|
Rajeswari Natarajan
|
Sajeetha Thavareesan
|
Ratnasingam Sakuntharaj
|
Thenmozhi Durairaj
|
Prasanna Kumar Kumaresan
|
Charmathi Rajkumar
Proceedings of the Fourth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages
Sentiment Analysis (SA) in Dravidian code-mixed text is an active research area. In this regard, the “Second Shared Task on SA in Code-mixed Tamil and Tulu” was organized at DravidianLangTech (EACL-2024). The shared task comprises two tasks: SA in Tamil-English and SA in Tulu-English code-mixed data. In total, 64 teams registered for the shared task, and 19 and 17 systems were received for Tamil and Tulu, respectively. The performance of the submitted systems was evaluated using the macro F1-score. The best method obtained macro F1-scores of 0.260 and 0.584 for code-mixed Tamil and Tulu texts, respectively.
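For readers unfamiliar with the macro F1-score used to rank submissions, here is a minimal sketch of the usual definition: compute F1 separately for each class, then take the unweighted mean, so that rare classes count as much as frequent ones. The function name `macro_f1` and the example labels are hypothetical; the abstract does not specify the organizers' evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Assumption: the class set is the union of labels seen in y_true
    and y_pred, so every class has at least one true or predicted instance.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in set(y_true) | set(y_pred):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is never zero
        # because each label occurs in y_true or y_pred at least once.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only.
gold = ["Positive", "Positive", "Negative", "Negative"]
pred = ["Positive", "Negative", "Negative", "Negative"]
score = macro_f1(gold, pred)
```

Because every class contributes equally, a system that ignores a minority class is penalized heavily even when its overall accuracy is high.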
2023
Findings of the Shared Task on Multimodal Abusive Language Detection and Sentiment Analysis in Tamil and Malayalam
Premjith B
|
Jyothish Lal G
|
Sowmya V
|
Bharathi Raja Chakravarthi
|
Rajeswari Natarajan
|
Nandhini K
|
Abirami Murugappan
|
Bharathi B
|
Kaushik M
|
Prasanth Sn
|
Aswin Raj R
|
Vijai Simmon S
Proceedings of the Third Workshop on Speech and Language Technologies for Dravidian Languages
This paper summarizes the shared task on multimodal abusive language detection and sentiment analysis in Dravidian languages as part of the third Workshop on Speech and Language Technologies for Dravidian Languages at RANLP 2023. This shared task provides a platform for researchers worldwide to submit their models on two crucial social media data analysis problems in Dravidian languages - abusive language detection and sentiment analysis. Abusive language detection identifies social media content with abusive information, whereas sentiment analysis refers to the problem of determining the sentiments expressed in a text. This task aims to build models for detecting abusive content and analyzing fine-grained sentiment from multimodal data in Tamil and Malayalam. The multimodal data consists of three modalities - video, audio and text. The datasets for both tasks were prepared by collecting videos from YouTube. Sixty teams participated in both tasks. However, only two teams submitted their results. The submissions were evaluated using macro F1-score.
Overview of the Second Shared Task on Speech Recognition for Vulnerable Individuals in Tamil
Bharathi B
|
Bharathi Raja Chakravarthi
|
Subalalitha Cn
|
Sripriya Natarajan
|
Rajeswari Natarajan
|
S Suhasini
|
Swetha Valli
Proceedings of the Third Workshop on Language Technology for Equality, Diversity and Inclusion
This paper presents an overview of the shared task on Speech Recognition for Vulnerable Individuals in Tamil (LT-EDI-ACL2023). The task provides a Tamil dataset collected from elderly people of three genders: male, female, and transgender. The audio samples were recorded in public locations such as hospitals, markets, and vegetable shops. The dataset was released in two phases, training and testing. The participants were asked to use different models and methods to process the audio signals and to submit their results as transcriptions of the given test samples. The submitted results were evaluated using WER (Word Error Rate). The participants used transformer-based models for automatic speech recognition. The results and the different pre-trained transformer-based models used by the participants are discussed in this overview paper.