KU-DMIS at MEDIQA-CORR 2024: Exploring the Reasoning Capabilities of Small Language Models in Medical Error Correction

Hyeon Hwang, Taewhoo Lee, Hyunjae Kim, Jaewoo Kang


Abstract
Recent advancements in large language models (LMs) like OpenAI’s GPT-4 have shown promise in healthcare, particularly in medical question answering and clinical applications. However, their deployment raises privacy concerns and their size limits use in resource-constrained environments. Smaller open-source LMs have emerged as alternatives, but their reliability in medicine remains underexplored. This study evaluates small LMs in the medical field using the MEDIQA-CORR 2024 task, which assesses the ability of models to identify and correct errors in clinical notes. Initially, zero-shot inference and simple fine-tuning of small models resulted in poor performance. When the models were fine-tuned with chain-of-thought (CoT) reasoning on synthetic data generated by GPT-4, their performance improved significantly. Meerkat-7B, a small LM trained with medical CoT reasoning, demonstrated notable performance gains. Our model outperforms other small non-commercial LMs and some larger models, achieving a 73.36 aggregate score on MEDIQA-CORR 2024.
Anthology ID:
2024.clinicalnlp-1.51
Volume:
Proceedings of the 6th Clinical Natural Language Processing Workshop
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Tristan Naumann, Asma Ben Abacha, Steven Bethard, Kirk Roberts, Danielle Bitterman
Venues:
ClinicalNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
526–536
URL:
https://aclanthology.org/2024.clinicalnlp-1.51
DOI:
10.18653/v1/2024.clinicalnlp-1.51
Cite (ACL):
Hyeon Hwang, Taewhoo Lee, Hyunjae Kim, and Jaewoo Kang. 2024. KU-DMIS at MEDIQA-CORR 2024: Exploring the Reasoning Capabilities of Small Language Models in Medical Error Correction. In Proceedings of the 6th Clinical Natural Language Processing Workshop, pages 526–536, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
KU-DMIS at MEDIQA-CORR 2024: Exploring the Reasoning Capabilities of Small Language Models in Medical Error Correction (Hwang et al., ClinicalNLP-WS 2024)
PDF:
https://aclanthology.org/2024.clinicalnlp-1.51.pdf