%0 Conference Proceedings
%T Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
%A Dai, Damai
%A Sun, Yutao
%A Dong, Li
%A Hao, Yaru
%A Ma, Shuming
%A Sui, Zhifang
%A Wei, Furu
%Y Rogers, Anna
%Y Boyd-Graber, Jordan
%Y Okazaki, Naoaki
%S Findings of the Association for Computational Linguistics: ACL 2023
%D 2023
%8 July
%I Association for Computational Linguistics
%C Toronto, Canada
%F dai-etal-2023-gpt
%R 10.18653/v1/2023.findings-acl.247
%U https://aclanthology.org/2023.findings-acl.247/
%U https://doi.org/10.18653/v1/2023.findings-acl.247
%P 4005-4019