Mehdi Parviz


2025

Bias in Danish Medical Notes: Infection Classification of Long Texts Using Transformer and LSTM Architectures Coupled with BERT
Mehdi Parviz | Rudi Agius | Carsten Niemann | Rob Van Der Goot
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)

Medical notes contain a wealth of information related to diagnosis, prognosis, and overall patient care that can help physicians make informed decisions. However, like any dataset drawn from diverse demographics, they may be biased toward certain subgroups or subpopulations, and any such bias will be reflected in the output of machine learning models trained on them. In this paper, we investigate the existence of such biases in Danish medical notes related to three types of blood cancer, with the goal of classifying whether a medical note indicates severe infection. By employing a hierarchical architecture that combines a sequence model (Transformer and LSTM) with a BERT model to classify long notes, we uncover biases related to demographics and cancer types. Furthermore, we observe performance differences between hospitals. These findings underscore the importance of investigating bias in critical settings such as healthcare and the urgency of monitoring and mitigating it when developing AI-based systems.
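The abstract describes a hierarchical setup in which BERT encodes fixed-length chunks of a long note and a sequence model aggregates the chunk embeddings. A minimal sketch of that general idea is below; it is not the authors' code, and the checkpoint name, hidden sizes, and chunking strategy are illustrative assumptions.

```python
# Sketch of a hierarchical long-note classifier (illustrative assumptions):
# BERT produces one [CLS] embedding per 512-token chunk; an LSTM aggregates
# the chunk embeddings and a linear head predicts severe infection.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class HierarchicalNoteClassifier(nn.Module):
    def __init__(self, bert_name="bert-base-multilingual-cased",
                 lstm_hidden=256, num_labels=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # input_ids: (num_chunks, seq_len) -- one long note split into chunks.
        cls = self.bert(input_ids=input_ids,
                        attention_mask=attention_mask).last_hidden_state[:, 0]
        seq_out, _ = self.lstm(cls.unsqueeze(0))  # (1, num_chunks, 2*hidden)
        return self.head(seq_out[:, -1])          # logits for the whole note

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
chunks = ["first chunk of the note ...", "second chunk of the note ..."]
enc = tokenizer(chunks, padding=True, truncation=True,
                max_length=512, return_tensors="pt")
model = HierarchicalNoteClassifier()
logits = model(enc["input_ids"], enc["attention_mask"])
```

Swapping the LSTM for a Transformer encoder over the chunk embeddings would correspond to the other sequence-model variant named in the title.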

2011

Using Language Models and Latent Semantic Analysis to Characterise the N400m Neural Response
Mehdi Parviz | Mark Johnson | Blake Johnson | Jon Brock
Proceedings of the Australasian Language Technology Association Workshop 2011