Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs.

Clement Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveenkumar Kanithi, Marco AF Pimentel, Shadab Khan

Abstract
Large Language Models (LLMs) have demonstrated significant potential in revolutionizing clinical applications. In this study, we investigate the efficacy of four techniques in adapting LLMs for clinical use-cases: continuous pretraining, instruct fine-tuning, NEFTune, and prompt engineering. We employ these methods on Mistral 7B and Mixtral 8x7B models, leveraging a large-scale clinical pretraining dataset of 50 billion tokens and an instruct fine-tuning dataset of 500 million tokens. Our evaluation across various clinical tasks reveals nuanced insights. While continuous pretraining beyond 250 billion tokens yields marginal improvements, instruct fine-tuning emerges as a more influential factor. Notably, NEFTune, designed primarily to enhance generation quality, surprisingly demonstrates additional gains on our benchmark. These findings underscore the importance of tailoring fine-tuning strategies and exploring innovative techniques to optimize LLM performance in the clinical domain.
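For readers unfamiliar with NEFTune (noisy embedding fine-tuning), the technique adds uniform random noise to the input token embeddings during training only. Below is a minimal PyTorch-style sketch of that idea, not the authors' implementation: it assumes a model exposing get_input_embeddings() (as in Hugging Face Transformers models such as Mistral 7B), and the alpha / sqrt(L * d) scaling follows the original NEFTune formulation; function and variable names are illustrative.

import torch

def neftune_hook(module, inputs, output, alpha=5.0):
    # Perturb embeddings only while training; leave evaluation untouched.
    if module.training:
        seq_len, dim = output.shape[-2], output.shape[-1]
        # Uniform noise in [-1, 1], scaled by alpha / sqrt(L * d) as in NEFTune.
        scale = alpha / torch.sqrt(torch.tensor(seq_len * dim, dtype=output.dtype))
        return output + torch.empty_like(output).uniform_(-1.0, 1.0) * scale
    return output

def attach_neftune(model, alpha=5.0):
    # Register a forward hook on the embedding layer; returning a value from
    # the hook replaces the layer's output with the noised embeddings.
    emb = model.get_input_embeddings()
    return emb.register_forward_hook(
        lambda m, i, o: neftune_hook(m, i, o, alpha=alpha)
    )

Detaching the returned hook handle (handle.remove()) restores standard fine-tuning behavior, so the same training loop can be run with and without NEFTune for comparison.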
Anthology ID: 2024.findings-emnlp.618
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 10549–10561
URL: https://aclanthology.org/2024.findings-emnlp.618/
DOI: 10.18653/v1/2024.findings-emnlp.618
Cite (ACL): Clement Christophe, Tathagata Raha, Svetlana Maslenkova, Muhammad Umar Salman, Praveenkumar Kanithi, Marco AF Pimentel, and Shadab Khan. 2024. Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 10549–10561, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs. (Christophe et al., Findings 2024)
PDF: https://aclanthology.org/2024.findings-emnlp.618.pdf