Unsupervised Discrete Representations of American Sign Language

Artem Abzaliev, Rada Mihalcea


Abstract
Many modalities are naturally represented as continuous signals, making it difficult to use them with models that expect discrete units, such as LLMs. In this paper, we explore the use of audio compression techniques for the discrete representation of the gestures used in sign language. We train a tokenizer for American Sign Language (ASL) fingerspelling, which discretizes sequences of fingerspelling signs into tokens. We also propose a loss function to improve the interpretability of these tokens such that they preserve both the semantic and the visual information of the signal. We show that the proposed method improves the performance of the discretized sequence on downstream tasks.
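The paper does not specify its tokenizer architecture in the abstract, but audio-compression codecs of this kind typically rest on vector quantization: each continuous frame embedding is replaced by the index of its nearest codebook vector. The sketch below illustrates that core lookup only; the codebook size, feature dimension, and function names are illustrative assumptions, not the authors' actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical codebook: 256 discrete tokens, 64-dim embeddings.
codebook = rng.normal(size=(256, 64))

def quantize(frames: np.ndarray) -> np.ndarray:
    """Map each 64-dim frame to the index of its nearest codebook entry."""
    # Squared Euclidean distance between every frame and every code,
    # computed via broadcasting: (n_frames, 1, 64) - (1, 256, 64).
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)  # one discrete token id per frame

# A short continuous sequence of 10 frames becomes 10 token ids.
frames = rng.normal(size=(10, 64))
tokens = quantize(frames)
print(tokens.shape)  # (10,)
```

In a trained codec the codebook is learned jointly with an encoder and decoder; here it is random purely to show the discretization step that turns a continuous signal into LLM-compatible tokens.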
Anthology ID:
2024.emnlp-main.1104
Volume:
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
19786–19793
URL:
https://aclanthology.org/2024.emnlp-main.1104/
DOI:
10.18653/v1/2024.emnlp-main.1104
Cite (ACL):
Artem Abzaliev and Rada Mihalcea. 2024. Unsupervised Discrete Representations of American Sign Language. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 19786–19793, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Discrete Representations of American Sign Language (Abzaliev & Mihalcea, EMNLP 2024)
PDF:
https://aclanthology.org/2024.emnlp-main.1104.pdf