HypMix: Hyperbolic Interpolative Data Augmentation

Ramit Sawhney, Megh Thakkar, Shivam Agarwal, Di Jin, Diyi Yang, Lucie Flek


Abstract
Interpolation-based regularisation methods for data augmentation have proven to be effective for various tasks and modalities. These methods involve performing mathematical operations over the raw input samples or their latent state representations, vectors that often possess complex hierarchical geometries. However, these operations are performed in Euclidean space, which simplifies these representations and may lead to distorted and noisy interpolations. We propose HypMix, a novel model-, data-, and modality-agnostic interpolative data augmentation technique operating in hyperbolic space, which captures the complex geometry of input and hidden state hierarchies better than its contemporaries. We evaluate HypMix on benchmark and low-resource datasets across speech, text, and vision modalities, showing that HypMix consistently outperforms state-of-the-art data augmentation techniques. In addition, we demonstrate the use of HypMix in semi-supervised settings. We further probe the adversarial robustness of HypMix and the qualitative inferences we draw from it, which elucidate the efficacy of Riemannian hyperbolic manifolds for interpolation-based data augmentation.
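The abstract does not state the interpolation formula, so the following is only an illustrative sketch of the contrast it describes: a Euclidean Mixup-style convex combination versus an interpolation along a hyperbolic geodesic. It assumes the Poincaré ball model with curvature -1 and uses Möbius addition and Möbius scalar multiplication; the function names (`mobius_add`, `mobius_scale`, `euclidean_mix`, `hyperbolic_mix`) are hypothetical and are not taken from the paper or its released code.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mobius_add(x, y):
    """Mobius addition on the Poincare ball (curvature -1 assumed)."""
    xy, xx, yy = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * xy + xx * yy
    return [((1 + 2 * xy + yy) * a + (1 - xx) * b) / denom
            for a, b in zip(x, y)]


def mobius_scale(r, x):
    """Mobius scalar multiplication r (x) x for a point x inside the unit ball."""
    norm = math.sqrt(dot(x, x))
    if norm == 0.0:
        return list(x)  # the origin is fixed by scaling
    factor = math.tanh(r * math.atanh(norm)) / norm
    return [factor * a for a in x]


def euclidean_mix(x, y, lam):
    """Standard Mixup-style interpolation: lam * x + (1 - lam) * y."""
    return [lam * a + (1 - lam) * b for a, b in zip(x, y)]


def hyperbolic_mix(x, y, lam):
    """Point on the geodesic from x to y; lam = 1 gives x, lam = 0 gives y.

    Assumption: this geodesic form is one common choice, used here for
    illustration only; the paper's exact operator may differ.
    """
    neg_x = [-a for a in x]
    direction = mobius_add(neg_x, y)  # y as seen from x
    return mobius_add(x, mobius_scale(1 - lam, direction))


if __name__ == "__main__":
    x, y = [0.1, 0.2], [-0.3, 0.4]  # both points lie inside the unit ball
    print("Euclidean :", euclidean_mix(x, y, 0.5))
    print("Hyperbolic:", hyperbolic_mix(x, y, 0.5))
```

In this sketch the mixing coefficient `lam` plays the same role as in Euclidean Mixup, but the interpolated point follows the curved geodesic of the ball rather than a straight line, so it stays inside the unit ball for any `lam` in [0, 1].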
Anthology ID:
2021.emnlp-main.776
Volume:
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2021
Address:
Online and Punta Cana, Dominican Republic
Editors:
Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
9858–9868
URL:
https://aclanthology.org/2021.emnlp-main.776
DOI:
10.18653/v1/2021.emnlp-main.776
Cite (ACL):
Ramit Sawhney, Megh Thakkar, Shivam Agarwal, Di Jin, Diyi Yang, and Lucie Flek. 2021. HypMix: Hyperbolic Interpolative Data Augmentation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 9858–9868, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal):
HypMix: Hyperbolic Interpolative Data Augmentation (Sawhney et al., EMNLP 2021)
PDF:
https://aclanthology.org/2021.emnlp-main.776.pdf
Video:
https://aclanthology.org/2021.emnlp-main.776.mp4
Code:
caisa-lab/hypmix-emnlp
Data:
CIFAR-10, CIFAR-100