Attention-Based Sign Language Recognition Network Utilizing Keyframe Sampling and Skeletal Features

Sign language recognition (SLR) is a multidisciplinary research topic in pattern recognition and computer vision. Because of the large amount of data in the continuous frames of sign language videos, selecting representative data and eliminating irrelevant information has long been a challenging problem in the preprocessing of sign language samples. In recent years, skeletal data have emerged as a new data type but have received insufficient attention. Meanwhile, as sign language features grow more diverse, making full use of them has also become an important research topic. In this paper, we improve keyframe-centered clips (KCC) sampling to obtain a new sampling method, optimized keyframe-centered clips (OptimKCC) sampling, which selects key actions from sign language videos. In addition, we design a new skeletal feature, the Multi-Plane Vector Relation (MPVR), to describe the video samples. Finally, combined with the attention mechanism, we use attention-based networks to distribute weights over the temporal and spatial features extracted from the skeletal data. We run comparison experiments on our own dataset and on a public sign language dataset, under both Signer-Independent and Signer-Dependent settings, to show the advantages of our methods.
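The attention step the abstract describes — scoring per-frame features and pooling them by learned weights — can be illustrated with a minimal sketch. This is a hypothetical stand-in, not the paper's actual network: the record gives no architecture details, and `attention_pool`, the query vector `w`, and the toy feature vectors are all assumptions.

```python
import math

def attention_pool(frame_feats, w):
    """Score each per-frame feature vector against a query vector w,
    softmax-normalize the scores into attention weights, and return
    the weights plus the attention-weighted sum of the features."""
    scores = [sum(f_i * w_i for f_i, w_i in zip(f, w)) for f in frame_feats]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]   # numerically stable softmax
    total = sum(exps)
    alphas = [e / total for e in exps]         # attention weights, sum to 1
    dim = len(frame_feats[0])
    pooled = [sum(a * f[d] for a, f in zip(alphas, frame_feats))
              for d in range(dim)]
    return alphas, pooled

# Toy usage: three 2-D "frame features"; the query favors the second frame,
# so it should receive the largest attention weight.
feats = [[1.0, 0.0], [2.0, 0.0], [0.5, 0.0]]
alphas, pooled = attention_pool(feats, w=[1.0, 0.0])
```

In the paper's setting the frame features would come from a BLSTM over the sampled keyframes, and `w` would be learned jointly with the rest of the network; here both are fixed toy values.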


Saved in:
Bibliographic Details
Main Authors: Wei Pan, Xiongquan Zhang, Zhongfu Ye
Format: article
Language: EN
Published: IEEE, 2020
Published in: IEEE Access, Vol 8, Pp 215592-215602 (2020)
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.3041115
Subjects: Sign language recognition; keyframe sampling; skeletal features; attention-based BLSTM; Electrical engineering. Electronics. Nuclear engineering (TK1-9971)
Online Access: https://doaj.org/article/9b8a7ec5aaa042b2af6924d8452efd31
https://ieeexplore.ieee.org/document/9272801/