InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation

Anwen Hu, Shizhe Chen, Liang Zhang, Qin Jin


Abstract
Automatic image captioning evaluation is critical for benchmarking and promoting advances in image captioning research. Existing metrics provide only a single score to measure caption quality, which makes them less explainable and informative. In contrast, humans can easily identify the problems of a caption in detail, e.g., which words are inaccurate and which salient objects are not described, and then rate the caption quality. To support such informative feedback, we propose an Informative Metric for Reference-free Image Caption evaluation (InfoMetIC). Given an image and a caption, InfoMetIC is able to report incorrect words and unmentioned image regions at the fine-grained level, and also provides a text precision score, a vision recall score and an overall quality score at the coarse-grained level. The coarse-grained score of InfoMetIC achieves significantly better correlation with human judgements than existing metrics on multiple benchmarks. We also construct a token-level evaluation dataset and demonstrate the effectiveness of InfoMetIC in fine-grained evaluation. Our code and datasets are publicly available at https://github.com/HAWLYQ/InfoMetIC.
Anthology ID:
2023.acl-long.178
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
3171–3185
URL:
https://aclanthology.org/2023.acl-long.178
DOI:
10.18653/v1/2023.acl-long.178
Cite (ACL):
Anwen Hu, Shizhe Chen, Liang Zhang, and Qin Jin. 2023. InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3171–3185, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation (Hu et al., ACL 2023)
PDF:
https://aclanthology.org/2023.acl-long.178.pdf
Video:
https://aclanthology.org/2023.acl-long.178.mp4