LIG approach for IWSLT09

Fethi Bougares, Laurent Besacier, Hervé Blanchon


Abstract
This paper describes the LIG experiments in the context of the IWSLT09 evaluation (Arabic to English Statistical Machine Translation task). Arabic is a morphologically rich language, and recent experiments in our laboratory have shown that the performance of Arabic to English SMT systems varies greatly depending on the Arabic morphological segmenter applied. Based on this observation, we propose to use multiple segmentations simultaneously for machine translation of Arabic. The core idea is to keep the ambiguity of the Arabic segmentation in the system input (using confusion networks or lattices), so that the best segmentation can be chosen during MT decoding. The mathematics of this multiple segmentation approach are given. Practical implementations are proposed for verbatim text translation as well as for speech translation (outside the scope of IWSLT09 this year). Experiments conducted in the framework of the IWSLT evaluation campaign show the potential of the multiple segmentation approach. The last part of this paper explains in detail the different systems submitted by LIG to IWSLT09 and the results obtained.
Anthology ID:
2009.iwslt-evaluation.9
Volume:
Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 1-2
Year:
2009
Address:
Tokyo, Japan
Venue:
IWSLT
SIG:
SIGSLT
Pages:
60–64
URL:
https://aclanthology.org/2009.iwslt-evaluation.9
Cite (ACL):
Fethi Bougares, Laurent Besacier, and Hervé Blanchon. 2009. LIG approach for IWSLT09. In Proceedings of the 6th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 60–64, Tokyo, Japan.
Cite (Informal):
LIG approach for IWSLT09 (Bougares et al., IWSLT 2009)
PDF:
https://aclanthology.org/2009.iwslt-evaluation.9.pdf
Presentation:
2009.iwslt-evaluation.9.Presentation.pdf