Can LLMs Replace Clinical Doctors? Exploring Bias in Disease Diagnosis by Large Language Models

Yutian Zhao, Huimin Wang, Yuqi Liu, Wu Suhuang, Xian Wu, Yefeng Zheng


Abstract
Bias in disease prediction by Large Language Models (LLMs) is a critical yet underexplored issue, with potential implications for healthcare outcomes and equity. As LLMs are increasingly applied in healthcare, understanding and addressing their biases becomes paramount. This study investigates disease prediction bias in GPT-4, ChatGPT, and Qwen1.5-72b across gender, age range, and disease judgment behaviors. Using a dataset of over 330,000 real clinical health records, we find that all three models exhibit distinct biases, indicating a pervasive issue of unfairness. To measure this, we introduce a novel metric, the diagnosis bias score, which reflects the ratio of prediction numbers to label numbers. Our in-depth analysis based on this score sheds light on the inherent biases in these models. In response to these findings, we propose a simple yet effective prompt-based solution to alleviate the observed bias in disease prediction with LLMs. This research underscores the importance of fairness in AI, particularly in healthcare applications, and offers a practical approach to improving the equity of disease prediction models.
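The abstract describes the diagnosis bias score only as a ratio of prediction numbers to label numbers. The following is a minimal sketch of one plausible reading of that description, not the paper's actual definition: for each disease, the number of times a model predicts it is divided by the number of times it appears as a ground-truth label. The function name `diagnosis_bias_scores` and the toy disease names are hypothetical.

```python
from collections import Counter


def diagnosis_bias_scores(predictions, labels):
    """Per-disease ratio of prediction count to label count.

    Assumption: this is one reading of "ratio of prediction numbers to
    label numbers"; the paper may normalize or aggregate differently.
    A ratio above 1 means the disease is predicted more often than it
    occurs in the labels; below 1 means it is predicted less often.
    """
    pred_counts = Counter(predictions)
    label_counts = Counter(labels)
    # Only diseases that occur in the labels get a score, which avoids
    # division by zero for diseases that are predicted but never labeled.
    return {
        disease: pred_counts[disease] / count
        for disease, count in label_counts.items()
    }


# Toy data (hypothetical): "flu" is over-predicted, "asthma" under-predicted.
labels = ["flu", "flu", "asthma", "asthma"]
predictions = ["flu", "flu", "flu", "asthma"]
scores = diagnosis_bias_scores(predictions, labels)
print(scores)
```

Under this reading, computing the score separately for subgroups (for example by gender or age range) would allow the kind of comparison the abstract mentions, though how the authors actually split and compare groups is not stated here.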
Anthology ID:
2024.findings-emnlp.814
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
13914–13935
URL:
https://aclanthology.org/2024.findings-emnlp.814/
DOI:
10.18653/v1/2024.findings-emnlp.814
Cite (ACL):
Yutian Zhao, Huimin Wang, Yuqi Liu, Wu Suhuang, Xian Wu, and Yefeng Zheng. 2024. Can LLMs Replace Clinical Doctors? Exploring Bias in Disease Diagnosis by Large Language Models. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 13914–13935, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Can LLMs Replace Clinical Doctors? Exploring Bias in Disease Diagnosis by Large Language Models (Zhao et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.814.pdf