@article{khashabi-etal-2021-parsinlu,
title = "{P}arsi{NLU}: A Suite of Language Understanding Challenges for {P}ersian",
author = "Khashabi, Daniel and
Cohan, Arman and
Shakeri, Siamak and
Hosseini, Pedram and
Pezeshkpour, Pouya and
Alikhani, Malihe and
Aminnaseri, Moin and
Bitaab, Marzieh and
Brahman, Faeze and
Ghazarian, Sarik and
Gheini, Mozhdeh and
Kabiri, Arman and
Mahabagdi, Rabeeh Karimi and
Memarrast, Omid and
Mosallanezhad, Ahmadreza and
Noury, Erfan and
Raji, Shahab and
Rasooli, Mohammad Sadegh and
Sadeghi, Sepideh and
Azer, Erfan Sadeqi and
Samghabadi, Niloofar Safi and
Shafaei, Mahsa and
Sheybani, Saber and
Tazarv, Ali and
Yaghoobzadeh, Yadollah",
editor = "Roark, Brian and
Nenkova, Ani",
journal = "Transactions of the Association for Computational Linguistics",
volume = "9",
year = "2021",
address = "Cambridge, MA",
publisher = "MIT Press",
url = "https://aclanthology.org/2021.tacl-1.68",
doi = "10.1162/tacl_a_00419",
pages = "1147--1162",
abstract = "Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this language. The availability of high-quality evaluation datasets is a necessity for reliable assessment of the progress on different NLU tasks and domains. We introduce ParsiNLU, the first benchmark in Persian language that includes a range of language understanding tasks{---}reading comprehension, textual entailment, and so on. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. Additionally, we present the first results on state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.1",
}
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="khashabi-etal-2021-parsinlu">
<titleInfo>
<title>ParsiNLU: A Suite of Language Understanding Challenges for Persian</title>
</titleInfo>
<name type="personal">
<namePart type="given">Daniel</namePart>
<namePart type="family">Khashabi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Arman</namePart>
<namePart type="family">Cohan</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Siamak</namePart>
<namePart type="family">Shakeri</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Pedram</namePart>
<namePart type="family">Hosseini</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Pouya</namePart>
<namePart type="family">Pezeshkpour</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Malihe</namePart>
<namePart type="family">Alikhani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Moin</namePart>
<namePart type="family">Aminnaseri</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Marzieh</namePart>
<namePart type="family">Bitaab</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Faeze</namePart>
<namePart type="family">Brahman</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sarik</namePart>
<namePart type="family">Ghazarian</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mozhdeh</namePart>
<namePart type="family">Gheini</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Arman</namePart>
<namePart type="family">Kabiri</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Rabeeh</namePart>
<namePart type="given">Karimi</namePart>
<namePart type="family">Mahabagdi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Omid</namePart>
<namePart type="family">Memarrast</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ahmadreza</namePart>
<namePart type="family">Mosallanezhad</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Erfan</namePart>
<namePart type="family">Noury</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Shahab</namePart>
<namePart type="family">Raji</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mohammad</namePart>
<namePart type="given">Sadegh</namePart>
<namePart type="family">Rasooli</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Sepideh</namePart>
<namePart type="family">Sadeghi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Erfan</namePart>
<namePart type="given">Sadeqi</namePart>
<namePart type="family">Azer</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Niloofar</namePart>
<namePart type="given">Safi</namePart>
<namePart type="family">Samghabadi</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Mahsa</namePart>
<namePart type="family">Shafaei</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Saber</namePart>
<namePart type="family">Sheybani</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Ali</namePart>
<namePart type="family">Tazarv</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<name type="personal">
<namePart type="given">Yadollah</namePart>
<namePart type="family">Yaghoobzadeh</namePart>
<role>
<roleTerm authority="marcrelator" type="text">author</roleTerm>
</role>
</name>
<originInfo>
<dateIssued>2021</dateIssued>
</originInfo>
<typeOfResource>text</typeOfResource>
<genre authority="bibutilsgt">journal article</genre>
<relatedItem type="host">
<titleInfo>
<title>Transactions of the Association for Computational Linguistics</title>
</titleInfo>
<originInfo>
<issuance>continuing</issuance>
<publisher>MIT Press</publisher>
<place>
<placeTerm type="text">Cambridge, MA</placeTerm>
</place>
</originInfo>
<genre authority="marcgt">periodical</genre>
<genre authority="bibutilsgt">academic journal</genre>
</relatedItem>
<abstract>Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this language. The availability of high-quality evaluation datasets is a necessity for reliable assessment of the progress on different NLU tasks and domains. We introduce ParsiNLU, the first benchmark in Persian language that includes a range of language understanding tasks—reading comprehension, textual entailment, and so on. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. Additionally, we present the first results on state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.1</abstract>
<identifier type="citekey">khashabi-etal-2021-parsinlu</identifier>
<identifier type="doi">10.1162/tacl_a_00419</identifier>
<location>
<url>https://aclanthology.org/2021.tacl-1.68</url>
</location>
<part>
<date>2021</date>
<detail type="volume"><number>9</number></detail>
<extent unit="page">
<start>1147</start>
<end>1162</end>
</extent>
</part>
</mods>
</modsCollection>
%0 Journal Article
%T ParsiNLU: A Suite of Language Understanding Challenges for Persian
%A Khashabi, Daniel
%A Cohan, Arman
%A Shakeri, Siamak
%A Hosseini, Pedram
%A Pezeshkpour, Pouya
%A Alikhani, Malihe
%A Aminnaseri, Moin
%A Bitaab, Marzieh
%A Brahman, Faeze
%A Ghazarian, Sarik
%A Gheini, Mozhdeh
%A Kabiri, Arman
%A Mahabagdi, Rabeeh Karimi
%A Memarrast, Omid
%A Mosallanezhad, Ahmadreza
%A Noury, Erfan
%A Raji, Shahab
%A Rasooli, Mohammad Sadegh
%A Sadeghi, Sepideh
%A Azer, Erfan Sadeqi
%A Samghabadi, Niloofar Safi
%A Shafaei, Mahsa
%A Sheybani, Saber
%A Tazarv, Ali
%A Yaghoobzadeh, Yadollah
%J Transactions of the Association for Computational Linguistics
%D 2021
%V 9
%I MIT Press
%C Cambridge, MA
%F khashabi-etal-2021-parsinlu
%X Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this language. The availability of high-quality evaluation datasets is a necessity for reliable assessment of the progress on different NLU tasks and domains. We introduce ParsiNLU, the first benchmark in Persian language that includes a range of language understanding tasks—reading comprehension, textual entailment, and so on. These datasets are collected in a multitude of ways, often involving manual annotations by native speakers. This results in over 14.5k new instances across 6 distinct NLU tasks. Additionally, we present the first results on state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compare them with human performance, which provides valuable insights into our ability to tackle natural language understanding challenges in Persian. We hope ParsiNLU fosters further research and advances in Persian language understanding.1
%R 10.1162/tacl_a_00419
%U https://aclanthology.org/2021.tacl-1.68
%U https://doi.org/10.1162/tacl_a_00419
%P 1147-1162
Markdown (Informal)
[ParsiNLU: A Suite of Language Understanding Challenges for Persian](https://aclanthology.org/2021.tacl-1.68) (Khashabi et al., TACL 2021)
ACL
- Daniel Khashabi, Arman Cohan, Siamak Shakeri, Pedram Hosseini, Pouya Pezeshkpour, Malihe Alikhani, Moin Aminnaseri, Marzieh Bitaab, Faeze Brahman, Sarik Ghazarian, Mozhdeh Gheini, Arman Kabiri, Rabeeh Karimi Mahabagdi, Omid Memarrast, Ahmadreza Mosallanezhad, Erfan Noury, Shahab Raji, Mohammad Sadegh Rasooli, Sepideh Sadeghi, et al.. 2021. ParsiNLU: A Suite of Language Understanding Challenges for Persian. Transactions of the Association for Computational Linguistics, 9:1147–1162.