Preprint / Version 1

Sign Language Recognition from Video using Geometrical and Transfer Learning Techniques

Authors

  • Thomas Li, Unaffiliated
  • David Chen

DOI:

https://doi.org/10.58445/rars.486

Keywords:

ASL, Learning techniques, 2-D video stream recognition

Abstract

We aim to develop an American Sign Language (ASL) recognition system to bridge the communication barrier for the deaf and hard-of-hearing communities. Whereas some previous projects relied on specialized hardware, this study focuses purely on 2-D video stream recognition because of its accessibility. In this paper, we use a fine-tuning approach: a neural network model trained on public datasets is adapted to specific individuals in a data-efficient manner. Challenges such as image background interference and occlusion are discussed. The algorithms are tested on the hands of teenage, adult, and senior males and females, and the accuracies compare favorably with previous models, with an average testing accuracy of 96.696%.
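
As a rough illustration of the transfer-learning setup described above, the sketch below fine-tunes only the classifier head of an ImageNet-pre-trained MobileNetV2 in PyTorch on a small, person-specific set of ASL hand images. The backbone choice, the 26-letter class count, the optimizer, and the hyperparameters are assumptions for illustration and are not necessarily the paper's exact configuration.

```python
# Minimal transfer-learning sketch (illustrative only): freeze a
# pre-trained MobileNetV2 backbone and fine-tune a fresh classifier
# head on a small, user-specific ASL dataset.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 26  # assumption: one class per ASL letter

# Load an ImageNet-pre-trained backbone and freeze its feature extractor.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
for param in model.features.parameters():
    param.requires_grad = False

# Replace the classifier head with a new layer sized for ASL letters.
model.classifier[1] = nn.Linear(model.last_channel, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

def fine_tune(loader, epochs=5):
    """Run a few epochs over a small, person-specific dataset
    (loader yields batches of 224x224 RGB images and letter labels)."""
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```

Because the backbone stays frozen, only the final linear layer is trained, which is what makes the adaptation to a specific signer data-efficient.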



Posted

2023-09-23