Arabic Sign Language Recognition with Deep Learning Models and Keypoint Landmarks

Alshanik, Farah; Aljunidi, Saif; Qawasmeh, Ethar

doi:10.5565/rev/elcvia.2317

Permanent link: https://ddd.uab.cat/record/328339


Alshanik, Farah (Jordan University of Science and Technology)
Aljunidi, Saif (Yarmouk University, Irbid, Jordan)
Qawasmeh, Ethar (Yarmouk University, Irbid, Jordan)

Date:	2026
Abstract:	Communication is a fundamental aspect of human interaction, essential for expressing emotions and building relationships. While individuals with typical hearing rely on spoken language, the deaf and mute community communicates through visual gestures and facial expressions, commonly known as sign language. However, communication barriers persist between hearing and non-hearing individuals, especially in regions with limited assistive technologies. To address this gap, we developed a real-time sign language recognition system that converts Arabic sign gestures into text. Unlike most existing systems, which are limited to individual letters or numbers, our model recognizes complete, meaningful words. It was trained on a curated dataset of 112 Arabic sign language words extracted from the KArSL dataset. Using OpenCV and the MediaPipe framework, multimodal keypoints from the hands, face, and upper-body pose were extracted. MediaPipe Hands generated a 255-dimensional feature vector for each video frame, capturing real-time hand movements. These features were used to train four deep learning models: CNN, GRU, LSTM, and Bi-LSTM. Among these, the Bi-LSTM model achieved the highest performance, with a training accuracy of 99.89% and a testing accuracy of 99.61%. These results highlight the potential of MediaPipe-based landmark extraction combined with deep learning to support accessible communication for Arabic-speaking deaf communities.
Note:	The authors would like to acknowledge the Deanship of Research at the Jordan University of Science and Technology for supporting this research (grant 552/2025).
Rights:	This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that they are not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted.
Language:	English
Document:	Article ; research ; Published version
Subject:	Arabic sign language ; KArSL dataset ; MediaPipe ; LSTM ; GRU ; Sign language recognition ; Deep learning
Published in:	ELCVIA, Vol. 25, No. 2 (2026), pp. 1-19 (Regular Issue), ISSN 1577-5097

Original address: https://elcvia.cvc.uab.cat/article/view/2317
Alternative address: https://raco.cat/index.php/ELCVIA/article/view/980000008487

19 p, 614.3 KB


Record created 2026-06-16, last modified 2026-06-17
