Abstract
Dynamic hand gestures have attracted considerable interest and are used in many fields. One of these is human-machine interaction, where the hand provides a natural means of communication between the user and the machine. This paper proposes a dynamic hand gesture recognition system for performing control operations in applications such as music players and video games. The key motivation of this research is to provide a simple, touch-free system for effortless and faster human-computer interaction (HCI). Because the system works with dynamic hand gestures, the model combines a convolutional neural network (CNN) with long short-term memory (LSTM) networks: the CNN extracts important features from the individual frames, and the LSTM captures the motion information between frames. Several models are constructed by varying the CNN and LSTM layers. The proposed system is tested on the existing EgoGesture dataset, which contains several classes of gestures, of which the dynamic gestures are used. This dataset was chosen because it offers more data, with complex backgrounds, gestures performed at varying speeds, different lighting conditions, etc. The proposed hand gesture recognition system attained an accuracy of 93%, which, subject to certain limitations, is better than that of other existing systems.
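The CNN-plus-LSTM idea described above can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions, not the architecture evaluated in the paper: the layer sizes, the single convolution layer, the random weights, and all function names (`conv_features`, `lstm_step`, `classify_clip`, `init_params`) are hypothetical and chosen for illustration. Each frame is reduced to a feature vector by a small convolution, and an LSTM cell then accumulates those vectors across the frames of a clip before a softmax layer scores the gesture classes.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv_features(frame, kernels):
    """Per-frame CNN stage (assumed: one conv layer, ReLU, global average pool).

    frame: (H, W) grayscale image; kernels: (K, kh, kw). Returns a (K,) vector.
    """
    k, kh, kw = kernels.shape
    out_h, out_w = frame.shape[0] - kh + 1, frame.shape[1] - kw + 1
    maps = np.zeros((k, out_h, out_w))
    # "Valid" cross-correlation, accumulated one kernel offset at a time.
    for a in range(kh):
        for b in range(kw):
            maps += kernels[:, a, b, None, None] * frame[a:a + out_h, b:b + out_w]
    maps = np.maximum(maps, 0.0)      # ReLU
    return maps.mean(axis=(1, 2))     # global average pooling

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell update; gates are stacked as input, forget, candidate, output."""
    i, f, g, o = np.split(W @ x + U @ h + b, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def init_params(rng, n_kernels=4, ksize=3, hidden=8, n_classes=5):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    s = 0.1
    return {
        "kernels": rng.normal(0, s, (n_kernels, ksize, ksize)),
        "W": rng.normal(0, s, (4 * hidden, n_kernels)),
        "U": rng.normal(0, s, (4 * hidden, hidden)),
        "b": np.zeros(4 * hidden),
        "W_out": rng.normal(0, s, (n_classes, hidden)),
        "b_out": np.zeros(n_classes),
    }

def classify_clip(frames, p):
    """frames: (T, H, W) clip. Returns softmax probabilities over gesture classes."""
    hidden = p["U"].shape[1]
    h, c = np.zeros(hidden), np.zeros(hidden)
    for frame in frames:                           # temporal order matters
        x = conv_features(frame, p["kernels"])     # spatial features per frame
        h, c = lstm_step(x, h, c, p["W"], p["U"], p["b"])  # motion across frames
    logits = p["W_out"] @ h + p["b_out"]
    logits = logits - logits.max()                 # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

rng = np.random.default_rng(0)
params = init_params(rng)
clip = rng.random((6, 16, 16))                     # 6 synthetic 16x16 frames
probs = classify_clip(clip, params)
print(probs.shape, probs.sum())
```

Because the weights here are random and untrained, the resulting probabilities carry no meaning; the sketch only shows how per-frame spatial features and cross-frame temporal state fit together.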
Keywords:
dynamic hand gesture, human-computer interaction, long short-term memory, convolutional neural network

