Preprint / Version 1

Leveraging LLMs, computer vision, and ChatGPT to ease communication between deaf and hearing individuals

Authors

  • Amogh Khaparde, Writer

DOI:

https://doi.org/10.58445/rars.1653

Keywords:

ASL Interpreter, CNN model

Abstract

Deaf and hard-of-hearing individuals may struggle to communicate with hearing individuals without the help of a human interpreter. Machine learning, however, can help ease communication between the two groups. This project presents an application that uses a convolutional neural network (CNN) to address this problem. The CNN model was custom-built from 46,000 augmented training images and tested on 8,000 images. Within the application, it takes camera frames of ASL fingerspelling as input and translates them into text. Each recognized letter is appended to a sentence variable, where it is stored. Once the user has completed a sentence, the ChatGPT API corrects any grammatical errors or misclassifications the model made. A Play.HT API then converts the ChatGPT-edited sentence to speech, further reducing the language barrier. A background-cropping feature was also added: it takes frames from the camera, uses the MediaPipe module to locate the hand, and then removes every part of the image except the hand to reduce noise for the model. The final model achieved 97.5% accuracy and 0.1% loss on the testing dataset, and its confusion matrix was strong, with only a few mistakes on similar-looking characters such as M and N. The model also performed better across different rooms, since background noise was removed from its input entirely. Overall, this application is an innovative step toward using AI and machine learning to gradually remove the language barrier between deaf or hard-of-hearing and hearing individuals.
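The background-cropping step described in the abstract (locating the hand with MediaPipe, then discarding everything outside it) can be sketched roughly as follows. The helper below only computes the padded pixel bounding box from normalized (x, y) landmark coordinates, which is the geometry MediaPipe's hand landmarks provide; the function name and the padding fraction are illustrative assumptions, not details from the paper.

```python
def hand_crop_box(landmarks, frame_w, frame_h, pad=0.1):
    """Compute a padded pixel bounding box around normalized hand landmarks.

    `landmarks` is a list of (x, y) pairs in [0, 1], as produced by hand
    trackers such as MediaPipe Hands. `pad` expands the box by a fraction
    of the frame size so fingertips are not clipped; the box is clamped
    to the frame edges. The frame can then be cropped to this box before
    being fed to the CNN, removing background noise.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x0 = max(0, round((min(xs) - pad) * frame_w))
    y0 = max(0, round((min(ys) - pad) * frame_h))
    x1 = min(frame_w, round((max(xs) + pad) * frame_w))
    y1 = min(frame_h, round((max(ys) + pad) * frame_h))
    return x0, y0, x1, y1

# Two hypothetical landmarks in a 640x480 frame:
print(hand_crop_box([(0.4, 0.5), (0.6, 0.7)], 640, 480))
# → (192, 192, 448, 384)
```

In the full pipeline, the landmark list would come from MediaPipe's hand detector on each camera frame, and the cropped region would replace the raw frame as the model's input.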

References

"American Sign Language alphabet recognition using Microsoft Kinect." ScholarsMine, scholarsmine.mst.edu/cgi/viewcontent.cgi?article=8391&context=masters_theses. Accessed 20 Nov. 2023.

Bowden, Richard. "Spelling it out: Real-time ASL fingerspelling recognition." ieeexplore.ieee.org, ieeexplore.ieee.org/document/6130290. Accessed Nov. 2023.

Starner, Thad. "American sign language recognition with the kinect." dl.acm.org, dl.acm.org/doi/10.1145/2070481.2070532. Accessed 20 Nov. 2023.

Yang, Hee Deok, editor. "Sign Language Recognition with the Kinect Sensor Based on Conditional Random Fields." ncbi.nlm.nih.gov, www.ncbi.nlm.nih.gov/pmc/articles/PMC4327011/. Accessed 16 Nov. 2023.

"Wearable-tech glove translates sign language into speech in real time." UCLA, newsroom.ucla.edu/releases/glove-translates-sign-language-to-speech. Accessed 20 Nov. 2023.

"Sign Language Glove." Cornell, people.ece.cornell.edu/land/courses/ece4760/FinalProjects/f2014/rdv28_mjl256/webpage/. Accessed 20 Nov. 2023.

"Recognition of Finger Spelling of American Sign Language with Artificial Neural Network Using Position/Orientation Sensors and Data Glove." SpringerLink, link.springer.com/chapter/10.1007/11427445_25. Accessed 20 Nov. 2023.

"Dataglove for Sign Language Recognition of People with Hearing and Speech Impairment via Wearable Inertial Sensors." MDPI, www.mdpi.com/1424-8220/23/15/6693. Accessed 20 Nov. 2023.

R. Fatmi, S. Rashad and R. Integlia, "Comparing ANN, SVM, and HMM based Machine Learning Methods for American Sign Language Recognition using Wearable Motion Sensors," 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 2019, pp. 0290-0297, doi: 10.1109/CCWC.2019.8666491.

Chong, Taek Wei. "American Sign Language Recognition Using Leap Motion Controller with Machine Learning Approach." National Library of Medicine, www.ncbi.nlm.nih.gov/pmc/articles/PMC6210690/. Accessed 20 Nov. 2023.

"The Leap Motion controller: A view on sign language." Griffith University, research-repository.griffith.edu.au/bitstream/handle/10072/59247/89839_1.pdf. Accessed 20 Nov. 2023.

"Sign language recognition through Leap Motion controller and input prediction algorithm." Journal of Physics, iopscience.iop.org/article/10.1088/1742-6596/1715/1/012008/pdf#:~:text=Leap%20motion%20controller%20(LMC)%20is,language%20letters%20and%20digits%20recognition. Accessed 20 Nov. 2023.

"Sign Language Recognition with Advanced Computer Vision." Towardsdatascience, towardsdatascience.com/sign-language-recognition-with-advanced-computer-vision-7b74f20f3442. Accessed 20 Nov. 2023.

Science Direct. 16 Nov. 2021, www.sciencedirect.com/science/article/pii/S2667305321000454. Accessed 12 Oct. 2023.

S. Mhatre, S. Joshi and H. B. Kulkarni, "Sign Language Detection using LSTM," 2022 IEEE International Conference on Current Development in Engineering and Technology (CCET), Bhopal, India, 2022, pp. 1-6, doi: 10.1109/CCET56606.2022.10080705.

Parades, Brian. "American Sign Language Interpret using web camera and deep learning." Ieomsociety.org, ieomsociety.org/proceedings/2022rome/89.pdf?CFID=20565289-bddb-402a-93a8-93cf81c3e18e&CFTOKEN=0. Accessed 20 Nov. 2023.

"DeepASLR: A CNN based human computer interface for American Sign Language recognition for hearing-impaired individuals." Science Direct, www.sciencedirect.com/science/article/pii/S2666990021000471#:~:text=Jain%20et%20al.,for%20a%20two%2Dlayer%20CNN. Accessed 12 Oct. 2023.

"ASL Detection - 99% Accuracy." Kaggle, www.kaggle.com/code/namanmanchanda/asl-detection-99-accuracy. Accessed 17 Nov. 2023.

"MiCT-RANet-ASL-FingerSpelling." Github, github.com/fmahoudeau/MiCT-RANet-ASL-FingerSpelling. Accessed 17 Nov. 2023.

Shi, Bowen. "Fingerspelling Detection in American Sign Language." openaccess.thecvf.com, openaccess.thecvf.com/content/CVPR2021/papers/Shi_Fingerspelling_Detection_in_American_Sign_Language_CVPR_2021_paper.pdf. Accessed 21 Nov. 2023.

Posted

2024-09-21