Modeling Easiness for Training Transformers with Curriculum Learning

Leonardo Ranaldi, Giulia Pucci, Fabio Massimo Zanzotto


Abstract
Directly learning from complex examples is generally problematic for humans and machines alike. A better strategy is to expose learners to examples in a reasonable, pedagogically motivated order. Curriculum Learning (CL) has been proposed to bring this strategy to the training of machine learning models. In this paper, building on Curriculum Learning, we propose a novel, linguistically motivated measure of example complexity for ordering examples during learning. Our complexity measure, LRC, is based on length, rarity, and comprehensibility. The resulting learning model is CL-LRC, that is, CL with LRC. Experiments on downstream tasks show that CL-LRC outperforms existing CL and non-CL methods for training BERT and RoBERTa from scratch. Furthermore, we analyzed several measures, including the perplexity, loss, and learning curves of different models pre-trained from scratch, showing that CL-LRC performs better than the state of the art.
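The abstract does not give the exact formula for LRC, so the following is only a minimal sketch of the general idea: score each training example by surface difficulty, then present examples to the model easiest-first. Here length is approximated by token count and rarity by mean negative log unigram frequency; the comprehensibility component, the equal weighting of the two terms, and all function names are illustrative assumptions, not the paper's actual method.

```python
import math
from collections import Counter

def difficulty_scores(sentences):
    """Illustrative difficulty score per sentence (higher = harder).

    Assumption: length is the token count and rarity is the mean negative
    log unigram frequency over the corpus; the paper's third component,
    comprehensibility (e.g., a readability score), is omitted here.
    """
    tokenized = [s.split() for s in sentences]
    counts = Counter(tok for toks in tokenized for tok in toks)
    total = sum(counts.values())

    scores = []
    for toks in tokenized:
        length = len(toks)
        rarity = sum(-math.log(counts[t] / total) for t in toks) / max(length, 1)
        scores.append(length + rarity)  # equal weighting is an assumption
    return scores

# Curriculum ordering: present the easiest (lowest-scoring) examples first.
corpus = [
    "the cat sat on the mat",
    "the dog ran",
    "quantum chromodynamics perplexes novices",
]
ordered = [s for _, s in sorted(zip(difficulty_scores(corpus), corpus))]
print(ordered)  # easiest examples first
```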
Anthology ID: 2023.ranlp-1.101
Volume: Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing
Month: September
Year: 2023
Address: Varna, Bulgaria
Editors: Ruslan Mitkov, Galia Angelova
Venue: RANLP
Publisher: INCOMA Ltd., Shoumen, Bulgaria
Pages: 937–948
URL: https://aclanthology.org/2023.ranlp-1.101
Cite (ACL): Leonardo Ranaldi, Giulia Pucci, and Fabio Massimo Zanzotto. 2023. Modeling Easiness for Training Transformers with Curriculum Learning. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, pages 937–948, Varna, Bulgaria. INCOMA Ltd., Shoumen, Bulgaria.
Cite (Informal): Modeling Easiness for Training Transformers with Curriculum Learning (Ranaldi et al., RANLP 2023)
PDF: https://aclanthology.org/2023.ranlp-1.101.pdf