TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula
Antoni Oliver, Mercè Vàzquez, Marta Coll-Florit, Sergi Álvarez, Víctor Suárez, Claudi Aventín-Boya, Cristina Valdés, Mar Font, Alejandro Pardos
Abstract
The main goal of this project is to explore the techniques for training NMT systems applied to Spanish, Portuguese, Catalan, Galician, Asturian, Aragonese and Aranese. These languages belong to the same Romance family, but they are very different in terms of the linguistic resources available. Asturian, Aragonese and Aranese can be considered low resource languages. These characteristics make this setting an excellent place to explore training techniques for low-resource languages: transfer learning and multilingual systems, among others. The first months of the project have been dedicated to the compilation of monolingual and parallel corpora for Asturian, Aragonese and Aranese.
- Anthology ID:
- 2023.eamt-1.50
- Volume:
- Proceedings of the 24th Annual Conference of the European Association for Machine Translation
- Month:
- June
- Year:
- 2023
- Address:
- Tampere, Finland
- Editors:
- Mary Nurminen, Judith Brenner, Maarit Koponen, Sirkku Latomaa, Mikhail Mikhailov, Frederike Schierl, Tharindu Ranasinghe, Eva Vanmassenhove, Sergi Alvarez Vidal, Nora Aranberri, Mara Nunziatini, Carla Parra Escartín, Mikel Forcada, Maja Popovic, Carolina Scarton, Helena Moniz
- Venue:
- EAMT
- Publisher:
- European Association for Machine Translation
- Pages:
- 495–496
- URL:
- https://aclanthology.org/2023.eamt-1.50
- Bibkey:
- oliver-etal-2023-tan
- Cite (ACL):
- Antoni Oliver, Mercè Vàzquez, Marta Coll-Florit, Sergi Álvarez, Víctor Suárez, Claudi Aventín-Boya, Cristina Valdés, Mar Font, and Alejandro Pardos. 2023. TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula. In Proceedings of the 24th Annual Conference of the European Association for Machine Translation, pages 495–496, Tampere, Finland. European Association for Machine Translation.
- Cite (Informal):
- TAN-IBE: Neural Machine Translation for the romance languages of the Iberian Peninsula (Oliver et al., EAMT 2023)
- PDF:
- https://aclanthology.org/2023.eamt-1.50.pdf