Arabic Sign Language Recognition with Deep Learning Models and Keypoint Landmarks
Alshanik, Farah 
(Jordan University of Science and Technology)
Aljunidi, Saif 
(Yarmouk University, Irbid, Jordan)
Qawasmeh, Ethar 
(Yarmouk University, Irbid, Jordan)
| Date: |
2026 |
| Abstract: |
Communication is a fundamental aspect of human interaction, essential for expressing emotions and building relationships. While individuals with typical hearing rely on spoken language, the deaf and mute community communicates through visual gestures and facial expressions, commonly known as sign language. However, communication barriers persist between hearing and non-hearing individuals, especially in regions with limited assistive technologies. To address this gap, we developed a real-time sign language recognition system that converts Arabic sign gestures into text. Unlike most existing systems, which are limited to individual letters or numbers, our model recognizes complete, meaningful words. It was trained on a curated set of 112 Arabic sign language words drawn from the KArSL dataset. Using OpenCV and the MediaPipe framework, multimodal keypoints were extracted from the hands, face, and upper-body pose. MediaPipe Hands generated a 255-dimensional feature vector for each video frame, capturing real-time hand movements. These features were used to train four deep learning models: CNN, GRU, LSTM, and Bi-LSTM. Among these, the Bi-LSTM model performed best, with a training accuracy of 99.89% and a testing accuracy of 99.61%. These results highlight the potential of MediaPipe-based landmark extraction combined with deep learning to support accessible communication for Arabic-speaking deaf communities. |
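The per-frame feature step described in the abstract can be illustrated with a minimal sketch. It only shows how variable landmark detections might be turned into fixed-size inputs for a sequence model such as a Bi-LSTM. The abstract does not say how the 255 values are composed, how missing detections are handled, or how long each clip is, so the zero-padding strategy, the clip length `NUM_FRAMES`, and the helper names `frame_to_vector` and `clip_to_sequence` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple

# One landmark as an (x, y, z) triple, the shape MediaPipe uses for keypoints.
Landmark = Tuple[float, float, float]

FEATURE_DIM = 255  # per-frame vector size stated in the abstract
NUM_FRAMES = 30    # hypothetical fixed clip length; not given in the source


def frame_to_vector(landmarks: Optional[Sequence[Landmark]],
                    dim: int = FEATURE_DIM) -> List[float]:
    """Flatten one frame's landmarks into a fixed-length vector.

    Assumption: a frame with no detection (None) becomes all zeros,
    short inputs are zero-padded, and long inputs are truncated.
    """
    flat: List[float] = []
    if landmarks is not None:
        for x, y, z in landmarks:
            flat.extend((x, y, z))
    flat = flat[:dim]
    flat.extend([0.0] * (dim - len(flat)))
    return flat


def clip_to_sequence(frames: Sequence[Optional[Sequence[Landmark]]],
                     num_frames: int = NUM_FRAMES,
                     dim: int = FEATURE_DIM) -> List[List[float]]:
    """Turn a clip into a (num_frames x dim) sequence for a recurrent model."""
    vectors = [frame_to_vector(f, dim) for f in frames[:num_frames]]
    # Pad short clips with all-zero frames so every clip has the same shape.
    while len(vectors) < num_frames:
        vectors.append([0.0] * dim)
    return vectors


# Usage: a two-frame clip, one frame with 21 hand landmarks and one with none.
clip = [[(0.1, 0.2, 0.3)] * 21, None]
sequence = clip_to_sequence(clip)
```

A fixed shape per clip is what lets recurrent models be trained in batches; whether the authors padded, truncated, or resampled their clips is not stated in the abstract.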
| Note: |
The authors would like to acknowledge the Deanship of Research at the Jordan University of Science and Technology for supporting this research (grant 552/2025). |
| Rights: |
This document is subject to a Creative Commons license. Total or partial reproduction, distribution, and public communication of the work are permitted, provided that it is not for commercial purposes and that the authorship of the original work is acknowledged. The creation of derivative works is not permitted. |
| Language: |
English |
| Document: |
Article ; research ; Published version |
| Subject: |
Arabic sign language ;
KArSL dataset ;
MediaPipe ;
LSTM ;
GRU ;
Sign language recognition ;
Deep learning |
| Published in: |
ELCVIA, Vol. 25, Num. 2 (2026) , p. 1-19 (Regular Issue) , ISSN 1577-5097 |
Original URL: https://elcvia.cvc.uab.cat/article/view/2317
Alternative URL: https://raco.cat/index.php/ELCVIA/article/view/980000008487
DOI: 10.5565/rev/elcvia.2317
Record created 2026-06-16, last modified 2026-06-17