SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism

Mehwish Fatima, Tim Kolber, Katja Markert, Michael Strube


Abstract
Cross-lingual science journalism is a recently introduced task that generates popular science summaries of scientific articles, written for non-expert readers in a language different from the source language. A popular science summary must contain the salient content of the input document while focusing on coherence and comprehensibility. Generating such a cross-lingual summary from scientific texts in a local language for the targeted audience is challenging. Existing research on cross-lingual science journalism investigates the task with a pipeline model that combines text simplification and cross-lingual summarization. We extend this research by introducing a novel multi-task learning architecture that combines the two NLP tasks. Our approach, SimCSum, jointly trains these two high-level tasks to generate cross-lingual popular science summaries. We compare SimCSum against the pipeline model and several other strong baselines using several evaluation metrics and human evaluation. Overall, SimCSum demonstrates statistically significant improvements over the state-of-the-art on two non-synthetic cross-lingual scientific datasets. Furthermore, we conduct an in-depth investigation into the linguistic properties of the generated summaries and an error analysis.
Anthology ID:
2023.newsum-1.3
Volume:
Proceedings of the 4th New Frontiers in Summarization Workshop
Month:
December
Year:
2023
Address:
Singapore
Editors:
Yue Dong, Wen Xiao, Lu Wang, Fei Liu, Giuseppe Carenini
Venue:
NewSum
Publisher:
Association for Computational Linguistics
Pages:
24–40
URL:
https://aclanthology.org/2023.newsum-1.3
DOI:
10.18653/v1/2023.newsum-1.3
Cite (ACL):
Mehwish Fatima, Tim Kolber, Katja Markert, and Michael Strube. 2023. SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism. In Proceedings of the 4th New Frontiers in Summarization Workshop, pages 24–40, Singapore. Association for Computational Linguistics.
Cite (Informal):
SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism (Fatima et al., NewSum 2023)
PDF:
https://aclanthology.org/2023.newsum-1.3.pdf
Supplementary material:
 2023.newsum-1.3.SupplementaryMaterial.txt