Exploring Transfer Learning for Urdu Speech Synthesis

Sahar Jamal, Sadaf Abdul Rauf, Quratulain Majid


Abstract
Neural methods in text-to-speech (TTS) synthesis have brought substantial advances in the naturalness and intelligibility of synthesized speech. In this paper we present a neural speech synthesis system for Urdu, a low-resource language. The main challenge in this study was the lack of any publicly available Urdu speech synthesis corpus. We therefore created an Urdu speech corpus from audiobooks and synthetically generated speech. To cope with the low-resource setting, we adopted transfer learning in our experiments, in which previously extracted knowledge is used to further train the model on a relatively small Urdu training set. The model yields satisfactory results, though considerable room for improvement remains, and we are working to improve it further.
Anthology ID:
2022.eurali-1.11
Volume:
Proceedings of the Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia within the 13th Language Resources and Evaluation Conference
Month:
June
Year:
2022
Address:
Marseille, France
Editors:
Atul Kr. Ojha, Sina Ahmadi, Chao-Hong Liu, John P. McCrae
Venue:
EURALI
Publisher:
European Language Resources Association
Pages:
70–74
URL:
https://aclanthology.org/2022.eurali-1.11
Cite (ACL):
Sahar Jamal, Sadaf Abdul Rauf, and Quratulain Majid. 2022. Exploring Transfer Learning for Urdu Speech Synthesis. In Proceedings of the Workshop on Resources and Technologies for Indigenous, Endangered and Lesser-resourced Languages in Eurasia within the 13th Language Resources and Evaluation Conference, pages 70–74, Marseille, France. European Language Resources Association.
Cite (Informal):
Exploring Transfer Learning for Urdu Speech Synthesis (Jamal et al., EURALI 2022)
PDF:
https://aclanthology.org/2022.eurali-1.11.pdf
Data
LJSpeech