Recent Trends in Linear Text Segmentation: A Survey

Iacopo Ghinassi, Lin Wang, Chris Newell, Matthew Purver


Abstract
Linear Text Segmentation is the task of automatically tagging text documents with topic shifts, i.e. the places in the text where the topic changes. A well-established area of research in Natural Language Processing, drawing on well-understood concepts from linguistic and computational linguistic research, the field has recently attracted considerable interest as a result of the surge of text, video, and audio available on the web, which in turn requires ways of summarising and categorising the mass of content, for which linear text segmentation is a fundamental step. In this survey, we provide an extensive overview of current advances in linear text segmentation, describing the state of the art in terms of resources and approaches for the task. Finally, we highlight the limitations of available resources and of the task itself, while indicating ways forward based on the most recent literature and under-explored research directions.
Anthology ID:
2024.findings-emnlp.174
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3084–3095
URL:
https://aclanthology.org/2024.findings-emnlp.174/
DOI:
10.18653/v1/2024.findings-emnlp.174
Cite (ACL):
Iacopo Ghinassi, Lin Wang, Chris Newell, and Matthew Purver. 2024. Recent Trends in Linear Text Segmentation: A Survey. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 3084–3095, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Recent Trends in Linear Text Segmentation: A Survey (Ghinassi et al., Findings 2024)
PDF:
https://aclanthology.org/2024.findings-emnlp.174.pdf