HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task
Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, Ying Qin
Abstract
This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition, using only the officially provided data. Using the Transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performance. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiments, with an average improvement of 2.6 BLEU per language pair for the En->Multi system and 4.6 BLEU for the Multi->En system. In addition, we employed language-independent adapters to further improve system performance. Our submission obtains competitive results in the final evaluation.
- Anthology ID: 2020.wat-1.8
- Volume: Proceedings of the 7th Workshop on Asian Translation
- Month: December
- Year: 2020
- Address: Suzhou, China
- Editors: Toshiaki Nakazawa, Hideki Nakayama, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Win Pa Pa, Ondřej Bojar, Shantipriya Parida, Isao Goto, Hidaya Mino, Hiroshi Manabe, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
- Venue: WAT
- Publisher: Association for Computational Linguistics
- Pages: 92–97
- URL: https://aclanthology.org/2020.wat-1.8/
- DOI: 10.18653/v1/2020.wat-1.8
- Bibkey: yu-etal-2020-hw
- Cite (ACL): Zhengzhe Yu, Zhanglin Wu, Xiaoyu Chen, Daimeng Wei, Hengchao Shang, Jiaxin Guo, Zongyao Li, Minghan Wang, Liangyou Li, Lizhi Lei, Hao Yang, and Ying Qin. 2020. HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task. In Proceedings of the 7th Workshop on Asian Translation, pages 92–97, Suzhou, China. Association for Computational Linguistics.
- Cite (Informal): HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task (Yu et al., WAT 2020)
- PDF: https://aclanthology.org/2020.wat-1.8.pdf
Export citation
@inproceedings{yu-etal-2020-hw,
    title = "{HW}-{TSC}'s Participation in the {WAT} 2020 Indic Languages Multilingual Task",
    author = "Yu, Zhengzhe  and
      Wu, Zhanglin  and
      Chen, Xiaoyu  and
      Wei, Daimeng  and
      Shang, Hengchao  and
      Guo, Jiaxin  and
      Li, Zongyao  and
      Wang, Minghan  and
      Li, Liangyou  and
      Lei, Lizhi  and
      Yang, Hao  and
      Qin, Ying",
    editor = "Nakazawa, Toshiaki  and
      Nakayama, Hideki  and
      Ding, Chenchen  and
      Dabre, Raj  and
      Kunchukuttan, Anoop  and
      Pa, Win Pa  and
      Bojar, Ond{\v{r}}ej  and
      Parida, Shantipriya  and
      Goto, Isao  and
      Mino, Hidaya  and
      Manabe, Hiroshi  and
      Sudoh, Katsuhito  and
      Kurohashi, Sadao  and
      Bhattacharyya, Pushpak",
    booktitle = "Proceedings of the 7th Workshop on Asian Translation",
    month = dec,
    year = "2020",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.wat-1.8/",
    doi = "10.18653/v1/2020.wat-1.8",
    pages = "92--97",
    abstract = "This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En{\ensuremath{<}}-{\ensuremath{>}}Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition{---}using only the officially provided data. Using transformer as a baseline, our Multi-{\ensuremath{>}}En and En-{\ensuremath{>}}Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En-{\ensuremath{>}}Multi system and an average improvement of 4.6 BLEU scores regarding the Multi-{\ensuremath{>}}En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation."
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="yu-etal-2020-hw">
    <titleInfo>
        <title>HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task</title>
    </titleInfo>
    <name type="personal"><namePart type="given">Zhengzhe</namePart><namePart type="family">Yu</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Zhanglin</namePart><namePart type="family">Wu</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Xiaoyu</namePart><namePart type="family">Chen</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Daimeng</namePart><namePart type="family">Wei</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Hengchao</namePart><namePart type="family">Shang</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Jiaxin</namePart><namePart type="family">Guo</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Zongyao</namePart><namePart type="family">Li</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Minghan</namePart><namePart type="family">Wang</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Liangyou</namePart><namePart type="family">Li</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Lizhi</namePart><namePart type="family">Lei</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Hao</namePart><namePart type="family">Yang</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <name type="personal"><namePart type="given">Ying</namePart><namePart type="family">Qin</namePart><role><roleTerm authority="marcrelator" type="text">author</roleTerm></role></name>
    <originInfo>
        <dateIssued>2020-12</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 7th Workshop on Asian Translation</title>
        </titleInfo>
        <name type="personal"><namePart type="given">Toshiaki</namePart><namePart type="family">Nakazawa</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Hideki</namePart><namePart type="family">Nakayama</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Chenchen</namePart><namePart type="family">Ding</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Raj</namePart><namePart type="family">Dabre</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Anoop</namePart><namePart type="family">Kunchukuttan</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Win</namePart><namePart type="given">Pa</namePart><namePart type="family">Pa</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Ondřej</namePart><namePart type="family">Bojar</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Shantipriya</namePart><namePart type="family">Parida</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Isao</namePart><namePart type="family">Goto</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Hidaya</namePart><namePart type="family">Mino</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Hiroshi</namePart><namePart type="family">Manabe</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Katsuhito</namePart><namePart type="family">Sudoh</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Sadao</namePart><namePart type="family">Kurohashi</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <name type="personal"><namePart type="given">Pushpak</namePart><namePart type="family">Bhattacharyya</namePart><role><roleTerm authority="marcrelator" type="text">editor</roleTerm></role></name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Suzhou, China</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En&lt;-&gt;Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition—using only the officially provided data. Using transformer as a baseline, our Multi-&gt;En and En-&gt;Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En-&gt;Multi system and an average improvement of 4.6 BLEU scores regarding the Multi-&gt;En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation.</abstract>
    <identifier type="citekey">yu-etal-2020-hw</identifier>
    <identifier type="doi">10.18653/v1/2020.wat-1.8</identifier>
    <location>
        <url>https://aclanthology.org/2020.wat-1.8/</url>
    </location>
    <part>
        <date>2020-12</date>
        <extent unit="page">
            <start>92</start>
            <end>97</end>
        </extent>
    </part>
</mods>
</modsCollection>
%0 Conference Proceedings
%T HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task
%A Yu, Zhengzhe
%A Wu, Zhanglin
%A Chen, Xiaoyu
%A Wei, Daimeng
%A Shang, Hengchao
%A Guo, Jiaxin
%A Li, Zongyao
%A Wang, Minghan
%A Li, Liangyou
%A Lei, Lizhi
%A Yang, Hao
%A Qin, Ying
%Y Nakazawa, Toshiaki
%Y Nakayama, Hideki
%Y Ding, Chenchen
%Y Dabre, Raj
%Y Kunchukuttan, Anoop
%Y Pa, Win Pa
%Y Bojar, Ondřej
%Y Parida, Shantipriya
%Y Goto, Isao
%Y Mino, Hidaya
%Y Manabe, Hiroshi
%Y Sudoh, Katsuhito
%Y Kurohashi, Sadao
%Y Bhattacharyya, Pushpak
%S Proceedings of the 7th Workshop on Asian Translation
%D 2020
%8 December
%I Association for Computational Linguistics
%C Suzhou, China
%F yu-etal-2020-hw
%X This paper describes our work in the WAT 2020 Indic Multilingual Translation Task. We participated in all 7 language pairs (En<->Bn/Hi/Gu/Ml/Mr/Ta/Te) in both directions under the constrained condition—using only the officially provided data. Using transformer as a baseline, our Multi->En and En->Multi translation systems achieve the best performances. Detailed data filtering and data domain selection are the keys to performance enhancement in our experiment, with an average improvement of 2.6 BLEU scores for each language pair in the En->Multi system and an average improvement of 4.6 BLEU scores regarding the Multi->En. In addition, we employed language independent adapter to further improve the system performances. Our submission obtains competitive results in the final evaluation.
%R 10.18653/v1/2020.wat-1.8
%U https://aclanthology.org/2020.wat-1.8/
%U https://doi.org/10.18653/v1/2020.wat-1.8
%P 92-97
Markdown (Informal)
[HW-TSC’s Participation in the WAT 2020 Indic Languages Multilingual Task](https://aclanthology.org/2020.wat-1.8/) (Yu et al., WAT 2020)