PSC: Extending Context Window of Large Language Models via Phase Shift Calibration

Wenqiao Zhu, Chao Xu, Lulu Wang, Jun Wu


Abstract
Rotary Position Embedding (RoPE) is an efficient position encoding approach that is widely used in numerous large language models (LLMs). Recently, many methods have been proposed to further expand the context window based on RoPE. The core idea of these methods is to predefine or search for a set of factors that rescale the base frequencies of RoPE. However, predefining optimal factors is challenging for existing methods because the search space grows exponentially. In view of this, we introduce PSC (Phase Shift Calibration), a small module that calibrates the frequencies predefined by existing methods. With PSC, we demonstrate that many existing methods, such as PI, YaRN, and LongRoPE, can be further enhanced. We conducted extensive experiments across multiple models and tasks. The results show that (1) when PSC is enabled, the relative reductions in perplexity grow as the context window is extended from 16k to 32k and up to 64k, and (2) our approach is broadly applicable and robust across a variety of models and tasks.
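As a rough illustration of the setting the abstract describes, the sketch below shows how a PI-style method divides RoPE's base frequencies by a single fixed factor (extended length / original length), and how a small learnable per-frequency phase offset could then calibrate the rescaled phases. This is a minimal sketch only: the class name PhaseShiftCalibration and the additive delta parameter are illustrative assumptions standing in for the actual PSC module, whose parameterization is given in the paper itself.

```python
import torch

def rope_frequencies(dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)."""
    return 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))

def rescaled_frequencies(dim: int, scale: float, base: float = 10000.0) -> torch.Tensor:
    """PI-style rescaling: every frequency is divided by the same factor
    scale = (extended context length / original context length),
    which compresses positions into the original range."""
    return rope_frequencies(dim, base) / scale

class PhaseShiftCalibration(torch.nn.Module):
    """Hypothetical stand-in for PSC: a learnable per-frequency phase
    offset added on top of the (predefined) rescaled frequencies.
    Not the authors' implementation; purely for illustration."""

    def __init__(self, dim: int):
        super().__init__()
        # One offset per rotary frequency pair, initialized to zero
        # so training starts from the uncalibrated baseline.
        self.delta = torch.nn.Parameter(torch.zeros(dim // 2))

    def forward(self, positions: torch.Tensor, freqs: torch.Tensor) -> torch.Tensor:
        # angles[t, i] = t * freq_i + delta_i  (calibrated phase)
        return torch.outer(positions.float(), freqs) + self.delta

# Usage sketch: extending a 4k model to 64k gives scale = 16.
dim = 128
freqs = rescaled_frequencies(dim, scale=16.0)
psc = PhaseShiftCalibration(dim)
angles = psc(torch.arange(65536), freqs)  # [65536, dim // 2] phase matrix
```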
Anthology ID:
2024.emnlp-main.341
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
5958–5970
URL:
https://aclanthology.org/2024.emnlp-main.341/
DOI:
10.18653/v1/2024.emnlp-main.341
Cite (ACL):
Wenqiao Zhu, Chao Xu, Lulu Wang, and Jun Wu. 2024. PSC: Extending Context Window of Large Language Models via Phase Shift Calibration. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 5958–5970, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
PSC: Extending Context Window of Large Language Models via Phase Shift Calibration (Zhu et al., EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.341.pdf
Software:
2024.emnlp-main.341.software.zip