%0 Conference Proceedings
%T Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks
%A Siska, Charlotte
%A Marazopoulou, Katerina
%A Ailem, Melissa
%A Bono, James
%Y Ku, Lun-Wei
%Y Martins, Andre
%Y Srikumar, Vivek
%S Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
%D 2024
%8 August
%I Association for Computational Linguistics
%C Bangkok, Thailand
%F siska-etal-2024-examining
%R 10.18653/v1/2024.acl-long.560
%U https://aclanthology.org/2024.acl-long.560/
%U https://doi.org/10.18653/v1/2024.acl-long.560
%P 10406-10421