Advancing Arabic ASR for Disordered Speech: Fine-Tuning Wav2Vec2 on Egyptian Dysarthric Speech
DOI:
https://doi.org/10.58445/rars.2975
Keywords:
Automatic Speech Recognition, Arabic ASR, Wav2Vec2, Dysarthric Speech, Egyptian Arabic, Low-Resource Languages, Speech Recognition Fine-Tuning
Abstract
Despite significant advances in Automatic Speech Recognition (ASR), its application to low-resource languages such as Arabic, especially for speakers with speech disorders, remains underdeveloped. This study presents a novel approach to Arabic ASR for disordered speech by fine-tuning a Wav2Vec2 model on a personalized dataset comprising approximately 1,300 utterances from an Egyptian Arabic speaker with speech impairments. Building on the comparative foundation set by Alsohby (2025), which evaluated four state-of-the-art ASR models across general, dysarthric, and accented speech, we extend the analysis through specialized model adaptation. Our methodology encompasses data preprocessing, fine-tuning, and evaluation using Word Error Rate (WER) and Character Error Rate (CER). Results indicate a substantial performance gain: fine-tuning reduces WER from 0.8516 to 0.3736 and CER from 0.5756 to 0.3478. These findings demonstrate the effectiveness of personalized fine-tuning and underscore the critical need for diverse, domain-specific datasets to improve ASR accessibility for Arabic speakers with speech impairments.
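For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of how WER and CER are conventionally computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length in words or characters. It is not the study's evaluation code; the function names and the toy English strings are illustrative assumptions, and the study's text normalization (for example, handling of Arabic diacritics) is not described in the abstract and is not modeled here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(before, after):
    """Fraction by which an error rate dropped relative to its starting value."""
    return (before - after) / before


# Toy example (hypothetical strings, not taken from the study's dataset).
print(wer("the cat sat", "the cat sit"))
print(cer("the cat sat", "the cat sit"))

# Relative reductions implied by the figures reported in the abstract.
print(relative_reduction(0.8516, 0.3736))  # WER
print(relative_reduction(0.5756, 0.3478))  # CER
```

Applied to the reported figures, the drop in WER from 0.8516 to 0.3736 is a relative reduction of about 56%, and the drop in CER from 0.5756 to 0.3478 is a relative reduction of about 40%.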
References
Alsohby, I. (2025). Comprehensive Analysis of Foundation ASR Model Performance: A Comparative Study of Conformer, HuBERT, Wav2Vec2, and Whisper with Insights into Dysarthric, Accented, and General Speech Recognition. Zenodo. https://doi.org/10.5281/zenodo.15459146
Alotaibi, Y., & Alotaibi, M. (2022). Arabic Automatic Speech Recognition: A Systematic Literature Review. MDPI. https://www.mdpi.com/2076-3417/12/17/8898
Abushariah, M. et al. (2024). Modern Standard Arabic Speech Disorders Corpus. International Journal of Speech Technology. https://link.springer.com/article/10.1007/s10772-023-10093-0
Qian, Z. et al. (2023). A survey of technologies for automatic dysarthric speech recognition. EURASIP Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-023-00318-2
MacDonald, B. et al. (2021). Personalized ASR Models from a Large and Diverse Disordered Speech Dataset. Google Research. https://blog.research.google/2021/08/personalized-asr-models-from-large-and.html
License
Copyright (c) 2025 Islam Alsohby

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.