Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing

Behzad Shayegh, Yuqiao Wen, Lili Mou


Abstract
We address unsupervised discontinuous constituency parsing, where we observe high variance in the performance of the only previous model in the literature. To stabilize and boost performance, we propose to build an ensemble of different runs of the existing discontinuous parser by averaging the predicted trees. We first provide a comprehensive computational complexity analysis (in terms of P and NP-complete) of tree averaging under different setups of binarity and continuity. We then develop an efficient exact algorithm to tackle the task, which runs in a reasonable time for all samples in our experiments. Results on three datasets show that our method outperforms all baselines in all metrics; we also provide in-depth analyses of our approach.
Anthology ID:
2024.acl-long.808
Volume:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
August
Year:
2024
Address:
Bangkok, Thailand
Editors:
Lun-Wei Ku, Andre Martins, Vivek Srikumar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
15135–15156
URL:
https://aclanthology.org/2024.acl-long.808
DOI:
10.18653/v1/2024.acl-long.808
Cite (ACL):
Behzad Shayegh, Yuqiao Wen, and Lili Mou. 2024. Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15135–15156, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal):
Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing (Shayegh et al., ACL 2024)
PDF:
https://aclanthology.org/2024.acl-long.808.pdf