SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages
Tiago Pimentel, Maria Ryskina, Sabrina J. Mielke, Shijie Wu, Eleanor Chodroff, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Salam Khalifa, Nizar Habash, Charbel El-Khaissi, Omer Goldman, Michael Gasser, William Lane, Matt Coler, Arturo Oncevay, Jaime Rafael Montoya Samame, Gema Celeste Silva Villegas, Adam Ek, Jean-Philippe Bernardy, Andrey Shcherbakov, Aziyana Bayyr-ool, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Andrew Krizhanovsky, Natalia Krizhanovsky, Clara Vania, Sardana Ivanova, Aelita Salchak, Christopher Straughn, Zoey Liu, Jonathan North Washington, Duygu Ataman, Witold Kieraś, Marcin Woliński, Totok Suhardijanto, Niklas Stoehr, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Richard J. Hatcher, Emily Prud’hommeaux, Ritesh Kumar, Mans Hulden, Botond Barta, Dorina Lakatos, Gábor Szolnok, Judit Ács, Mohit Raj, David Yarowsky, Ryan Cotterell, Ben Ambridge, Ekaterina Vylomova
Abstract
This year’s iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features. In terms of the task, we enrich UniMorph with new data for 32 languages from 13 language families, with most of them being under-resourced: Kunwinjku, Classical Syriac, Arabic (Modern Standard, Egyptian, Gulf), Hebrew, Amharic, Aymara, Magahi, Braj, Kurdish (Central, Northern, Southern), Polish, Karelian, Livvi, Ludic, Veps, Võro, Evenki, Xibe, Tuvan, Sakha, Turkish, Indonesian, Kodi, Seneca, Asháninka, Yanesha, Chukchi, Itelmen, Eibela. We evaluate six systems on the new data and conduct an extensive error analysis of the systems’ predictions. Transformer-based models generally demonstrate superior performance on the majority of languages, achieving >90% accuracy on 65% of them. The languages on which systems yielded low accuracy are mainly under-resourced, with a limited amount of data. Most errors made by the systems are due to allomorphy, honorificity, and form variation. In addition, we observe that systems especially struggle to inflect multiword lemmas. The systems also produce misspelled forms or end up in repetitive loops (e.g., RNN-based models). Finally, we report a large drop in systems’ performance on previously unseen lemmas.
- Anthology ID:
- 2021.sigmorphon-1.25
- Volume:
- Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Garrett Nicolai, Kyle Gorman, Ryan Cotterell
- Venue:
- SIGMORPHON
- SIG:
- SIGMORPHON
- Publisher:
- Association for Computational Linguistics
- Pages:
- 229–259
- URL:
- https://aclanthology.org/2021.sigmorphon-1.25/
- DOI:
- 10.18653/v1/2021.sigmorphon-1.25
- Bibkey:
- pimentel-ryskina-etal-2021-sigmorphon
- Cite (ACL):
- Tiago Pimentel, Maria Ryskina, Sabrina J. Mielke, Shijie Wu, Eleanor Chodroff, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Salam Khalifa, Nizar Habash, Charbel El-Khaissi, Omer Goldman, Michael Gasser, William Lane, Matt Coler, Arturo Oncevay, Jaime Rafael Montoya Samame, Gema Celeste Silva Villegas, Adam Ek, Jean-Philippe Bernardy, Andrey Shcherbakov, Aziyana Bayyr-ool, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, Elena Klyachko, Ali Salehi, Andrew Krizhanovsky, Natalia Krizhanovsky, Clara Vania, Sardana Ivanova, Aelita Salchak, Christopher Straughn, Zoey Liu, Jonathan North Washington, Duygu Ataman, Witold Kieraś, Marcin Woliński, Totok Suhardijanto, Niklas Stoehr, Zahroh Nuriah, Shyam Ratan, Francis M. Tyers, Edoardo M. Ponti, Grant Aiton, Richard J. Hatcher, Emily Prud’hommeaux, Ritesh Kumar, Mans Hulden, Botond Barta, Dorina Lakatos, Gábor Szolnok, Judit Ács, Mohit Raj, David Yarowsky, Ryan Cotterell, Ben Ambridge, and Ekaterina Vylomova. 2021. SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages. In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 229–259, Online. Association for Computational Linguistics.
- Cite (Informal):
- SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages (Pimentel et al., SIGMORPHON 2021)
- PDF:
- https://aclanthology.org/2021.sigmorphon-1.25.pdf
Export citation
@inproceedings{pimentel-ryskina-etal-2021-sigmorphon,
    title = "{SIGMORPHON} 2021 Shared Task on Morphological Reinflection: Generalization Across Languages",
    author = "Pimentel, Tiago and Ryskina, Maria and Mielke, Sabrina J. and Wu, Shijie and Chodroff, Eleanor and Leonard, Brian and Nicolai, Garrett and Ghanggo Ate, Yustinus and Khalifa, Salam and Habash, Nizar and El-Khaissi, Charbel and Goldman, Omer and Gasser, Michael and Lane, William and Coler, Matt and Oncevay, Arturo and Montoya Samame, Jaime Rafael and Silva Villegas, Gema Celeste and Ek, Adam and Bernardy, Jean-Philippe and Shcherbakov, Andrey and Bayyr-ool, Aziyana and Sheifer, Karina and Ganieva, Sofya and Plugaryov, Matvey and Klyachko, Elena and Salehi, Ali and Krizhanovsky, Andrew and Krizhanovsky, Natalia and Vania, Clara and Ivanova, Sardana and Salchak, Aelita and Straughn, Christopher and Liu, Zoey and Washington, Jonathan North and Ataman, Duygu and Kiera{\'s}, Witold and Woli{\'n}ski, Marcin and Suhardijanto, Totok and Stoehr, Niklas and Nuriah, Zahroh and Ratan, Shyam and Tyers, Francis M. and Ponti, Edoardo M. and Aiton, Grant and Hatcher, Richard J. and Prud{'}hommeaux, Emily and Kumar, Ritesh and Hulden, Mans and Barta, Botond and Lakatos, Dorina and Szolnok, G{\'a}bor and {\'A}cs, Judit and Raj, Mohit and Yarowsky, David and Cotterell, Ryan and Ambridge, Ben and Vylomova, Ekaterina",
    editor = "Nicolai, Garrett and Gorman, Kyle and Cotterell, Ryan",
    booktitle = "Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.sigmorphon-1.25/",
    doi = "10.18653/v1/2021.sigmorphon-1.25",
    pages = "229--259",
    abstract = "This year's iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features. In terms of the task, we enrich UniMorph with new data for 32 languages from 13 language families, with most of them being under-resourced: Kunwinjku, Classical Syriac, Arabic (Modern Standard, Egyptian, Gulf), Hebrew, Amharic, Aymara, Magahi, Braj, Kurdish (Central, Northern, Southern), Polish, Karelian, Livvi, Ludic, Veps, V{\~o}ro, Evenki, Xibe, Tuvan, Sakha, Turkish, Indonesian, Kodi, Seneca, Ash{\'a}ninka, Yanesha, Chukchi, Itelmen, Eibela. We evaluate six systems on the new data and conduct an extensive error analysis of the systems' predictions. Transformer-based models generally demonstrate superior performance on the majority of languages, achieving {\ensuremath{>}}90{\%} accuracy on 65{\%} of them. The languages on which systems yielded low accuracy are mainly under-resourced, with a limited amount of data. Most errors made by the systems are due to allomorphy, honorificity, and form variation. In addition, we observe that systems especially struggle to inflect multiword lemmas. The systems also produce misspelled forms or end up in repetitive loops (e.g., RNN-based models). Finally, we report a large drop in systems' performance on previously unseen lemmas."
}