RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor

Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which h...

Bibliographic Details
Main authors: Shih-Hung Yang, Yao-Mao Cheng, Jyun-We Huang, Yon-Ping Chen
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
Online access: https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452
id oai:doaj.org-article:2dd77c35306a42a194de0a472a3a3452
record_format dspace
spelling oai:doaj.org-article:2dd77c35306a42a194de0a472a3a3452
2021-11-11T18:20:53Z
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor
10.3390/math9212815
2227-7390
https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452
2021-11-01T00:00:00Z
https://www.mdpi.com/2227-7390/9/21/2815
https://doaj.org/toc/2227-7390
Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.
Shih-Hung Yang
Yao-Mao Cheng
Jyun-We Huang
Yon-Ping Chen
MDPI AG
article
fingerspelling recognition
depth sensor
finger attention
receptive field
inter-finger relation
Mathematics
QA1-939
EN
Mathematics, Vol 9, Iss 2815, p 2815 (2021)
institution DOAJ
collection DOAJ
language EN
topic fingerspelling recognition
depth sensor
finger attention
receptive field
inter-finger relation
Mathematics
QA1-939
description Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.
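The abstract's remark that atrous convolution enlarges the receptive field at multiple scales can be illustrated with a small calculation. The sketch below is not taken from the paper: the 3x3 kernel size and the dilation rates 1, 2, and 4 are assumptions chosen only for illustration, and the helper names are hypothetical. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input pixels along one axis.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, covered by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(layers):
    """Receptive field of stride-1 convolutions applied one after another.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective span - 1).
        rf += effective_kernel_size(kernel_size, dilation) - 1
    return rf


# Hypothetical dilation rates; the record does not state the ones RFaNet uses.
rates = [1, 2, 4]

# Parallel branches: each branch sees a different span with the same 9 weights.
spans = {d: effective_kernel_size(3, d) for d in rates}
print(spans)

# Sequential use of the same three layers, for comparison.
print(stacked_receptive_field([(3, d) for d in rates]))
```

Under these assumptions, a dilated 3x3 kernel covers a wider area without adding parameters, which is why parallel branches with different rates yield features at several scales.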
format article
author Shih-Hung Yang
Yao-Mao Cheng
Jyun-We Huang
Yon-Ping Chen
author_facet Shih-Hung Yang
Yao-Mao Cheng
Jyun-We Huang
Yon-Ping Chen
author_sort Shih-Hung Yang
title RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor
title_sort rfanet: receptive field-aware network with finger attention for fingerspelling recognition using a depth sensor
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452