Hiding in Plain Sight: Insights into Abstractive Text Summarization

Vivek Srivastava, Savita Bhat, Niranjan Pedanekar


Abstract
In recent years, there has been growing interest in abstractive text summarization, with focused contributions in model architectures, datasets, and evaluation metrics. Despite notable research advances, previous works have identified limitations in the quality of datasets and the effectiveness of evaluation techniques for generated summaries. In this context, we examine these limitations further with the help of three quality measures: Information Coverage, Entity Hallucination, and Summarization Complexity. As part of this work, we investigate two widely used datasets (XSUM and CNNDM) and three existing models (BART, PEGASUS, and BRIO) and report our findings. Some key insights are: 1) the cumulative ROUGE score is an inappropriate evaluation measure, since a few high-scoring samples dominate the overall performance; 2) existing summarization models have limited capability for information coverage and hallucinate when generating factual information; and 3) compared to the model-generated summaries, the reference summaries have the lowest information coverage and the highest entity hallucination, reiterating the need for new and better reference summaries.
Anthology ID:
2023.insights-1.8
Volume:
Proceedings of the Fourth Workshop on Insights from Negative Results in NLP
Month:
May
Year:
2023
Address:
Dubrovnik, Croatia
Editors:
Shabnam Tafreshi, Arjun Akula, João Sedoc, Aleksandr Drozd, Anna Rogers, Anna Rumshisky
Venues:
insights | WS
Publisher:
Association for Computational Linguistics
Pages:
67–74
URL:
https://aclanthology.org/2023.insights-1.8
DOI:
10.18653/v1/2023.insights-1.8
Cite (ACL):
Vivek Srivastava, Savita Bhat, and Niranjan Pedanekar. 2023. Hiding in Plain Sight: Insights into Abstractive Text Summarization. In Proceedings of the Fourth Workshop on Insights from Negative Results in NLP, pages 67–74, Dubrovnik, Croatia. Association for Computational Linguistics.
Cite (Informal):
Hiding in Plain Sight: Insights into Abstractive Text Summarization (Srivastava et al., insights-WS 2023)
PDF:
https://aclanthology.org/2023.insights-1.8.pdf
Video:
https://aclanthology.org/2023.insights-1.8.mp4