How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task
Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Miryam de Lhoneux, Daniel Hershcovich, Marcel Bollmann, Anders Søgaard
Abstract
This work shows that competitive translation results can be obtained in a constrained setting by incorporating the latest advances in memory and compute optimization. We train and evaluate large multilingual translation models using a single GPU for a maximum of 100 hours and get within 4-5 BLEU points of the top submission on the leaderboard. We also benchmark standard baselines on the PMI corpus and re-discover well-known shortcomings of translation systems and metrics.
- Anthology ID:
- 2021.wat-1.24
- Volume:
- Proceedings of the 8th Workshop on Asian Translation (WAT2021)
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Toshiaki Nakazawa, Hideki Nakayama, Isao Goto, Hideya Mino, Chenchen Ding, Raj Dabre, Anoop Kunchukuttan, Shohei Higashiyama, Hiroshi Manabe, Win Pa Pa, Shantipriya Parida, Ondřej Bojar, Chenhui Chu, Akiko Eriguchi, Kaori Abe, Yusuke Oda, Katsuhito Sudoh, Sadao Kurohashi, Pushpak Bhattacharyya
- Venue:
- WAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 205–211
- URL:
- https://aclanthology.org/2021.wat-1.24
- DOI:
- 10.18653/v1/2021.wat-1.24
- Bibkey:
- aralikatte-etal-2021-far
- Cite (ACL):
- Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Miryam de Lhoneux, Daniel Hershcovich, Marcel Bollmann, and Anders Søgaard. 2021. How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task. In Proceedings of the 8th Workshop on Asian Translation (WAT2021), pages 205–211, Online. Association for Computational Linguistics.
- Cite (Informal):
- How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task (Aralikatte et al., WAT 2021)
- PDF:
- https://aclanthology.org/2021.wat-1.24.pdf
- Data
- PMIndia, mC4