%0 Conference Proceedings
%T An Architecture for Accelerated Large-Scale Inference of Transformer-Based Language Models
%A Ganiev, Amir
%A Chapin, Colton
%A De Andrade, Anderson
%A Liu, Chen
%Y Kim, Young-bum
%Y Li, Yunyao
%Y Rambow, Owen
%S Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers
%D 2021
%8 June
%I Association for Computational Linguistics
%C Online
%F ganiev-etal-2021-architecture
%R 10.18653/v1/2021.naacl-industry.21
%U https://aclanthology.org/2021.naacl-industry.21/
%U https://doi.org/10.18653/v1/2021.naacl-industry.21
%P 163-169