TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
Xiao Wang, Qin Liu, Tao Gui, Qi Zhang, Yicheng Zou, Xin Zhou, Jiacheng Ye, Yongxin Zhang, Rui Zheng, Zexiong Pang, Qinzhuo Wu, Zhengyan Li, Chong Zhang, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Shan Qin, Bolin Zhu, Xiaoyu Xing, Jinlan Fu, Yue Zhang, Minlong Peng, Xiaoqing Zheng, Yaqian Zhou, Zhongyu Wei, Xipeng Qiu, Xuanjing Huang
Abstract
TextFlint is a multilingual robustness evaluation toolkit for NLP tasks that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analyses. This enables practitioners to automatically evaluate their models from various aspects or to customize their evaluations as desired with just a few lines of code. TextFlint also generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model in terms of its robustness. To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under human evaluation. To validate the utility, we performed large-scale empirical evaluations (over 67,000) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. The toolkit is already available at https://github.com/textflint with all the evaluation results demonstrated at textflint.io.
- Anthology ID:
- 2021.acl-demo.41
- Volume:
- Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations
- Month:
- August
- Year:
- 2021
- Address:
- Online
- Editors:
- Heng Ji, Jong C. Park, Rui Xia
- Venues:
- ACL | IJCNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 347–355
- URL:
- https://aclanthology.org/2021.acl-demo.41
- DOI:
- 10.18653/v1/2021.acl-demo.41
- Bibkey:
- wang-etal-2021-textflint
- Cite (ACL):
- Xiao Wang, Qin Liu, Tao Gui, Qi Zhang, Yicheng Zou, Xin Zhou, Jiacheng Ye, Yongxin Zhang, Rui Zheng, Zexiong Pang, Qinzhuo Wu, Zhengyan Li, Chong Zhang, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, et al. 2021. TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, pages 347–355, Online. Association for Computational Linguistics.
- Cite (Informal):
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing (Wang et al., ACL-IJCNLP 2021)
- PDF:
- https://aclanthology.org/2021.acl-demo.41.pdf
- Video:
- https://aclanthology.org/2021.acl-demo.41.mp4
- Data
- MultiNLI, SQuAD
Export citation
@inproceedings{wang-etal-2021-textflint,
    title = "{T}ext{F}lint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing",
    author = "Wang, Xiao and Liu, Qin and Gui, Tao and Zhang, Qi and Zou, Yicheng and Zhou, Xin and Ye, Jiacheng and Zhang, Yongxin and Zheng, Rui and Pang, Zexiong and Wu, Qinzhuo and Li, Zhengyan and Zhang, Chong and Ma, Ruotian and Fei, Zichu and Cai, Ruijian and Zhao, Jun and Hu, Xingwu and Yan, Zhiheng and Tan, Yiding and Hu, Yuan and Bian, Qiyuan and Liu, Zhihua and Qin, Shan and Zhu, Bolin and Xing, Xiaoyu and Fu, Jinlan and Zhang, Yue and Peng, Minlong and Zheng, Xiaoqing and Zhou, Yaqian and Wei, Zhongyu and Qiu, Xipeng and Huang, Xuanjing",
    editor = "Ji, Heng and Park, Jong C. and Xia, Rui",
    booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.acl-demo.41",
    doi = "10.18653/v1/2021.acl-demo.41",
    pages = "347--355",
    abstract = "TextFlint is a multilingual robustness evaluation toolkit for NLP tasks that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analyses. This enables practitioners to automatically evaluate their models from various aspects or to customize their evaluations as desired with just a few lines of code. TextFlint also generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model in terms of its robustness. To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under human evaluation. To validate the utility, we performed large-scale empirical evaluations (over 67,000) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. The toolkit is already available at \url{https://github.com/textflint} with all the evaluation results demonstrated at textflint.io.",
}