SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish

Mariia Chizhikova, Pilar López-Úbeda, Manuel C. Díaz-Galiano, L. Alfonso Ureña-López, M. Teresa Martín-Valdivia


Abstract
This paper covers participation of the SINAI team in Tasks 5 and 10 of the Social Media Mining for Health (#SSM4H) workshop at COLING-2022. These tasks focus on leveraging Twitter posts written in Spanish for healthcare research. The objective of Task 5 was to classify tweets reporting COVID-19 symptoms, while Task 10 required identifying disease mentions in Twitter posts. The presented systems explore large RoBERTa language models pre-trained on Twitter data in the case of tweet classification task and general-domain data for the disease recognition task. We also present a text pre-processing methodology implemented in both systems and describe an initial weakly-supervised fine-tuning phase alongside with a submission post-processing procedure designed for Task 10. The systems obtained 0.84 F1-score on the Task 5 and 0.77 F1-score on Task 10.
Anthology ID:
2022.smm4h-1.8
Volume:
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Graciela Gonzalez-Hernandez, Davy Weissenbacher
Venue:
SMM4H
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
27–30
Language:
URL:
https://aclanthology.org/2022.smm4h-1.8
DOI:
Bibkey:
Cite (ACL):
Mariia Chizhikova, Pilar López-Úbeda, Manuel C. Díaz-Galiano, L. Alfonso Ureña-López, and M. Teresa Martín-Valdivia. 2022. SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish. In Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, Workshop & Shared Task, pages 27–30, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
SINAI@SMM4H’22: Transformers for biomedical social media text mining in Spanish (Chizhikova et al., SMM4H 2022)
Copy Citation:
PDF:
https://aclanthology.org/2022.smm4h-1.8.pdf