RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor
Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which h...
Main authors: | Shih-Hung Yang, Yao-Mao Cheng, Jyun-We Huang, Yon-Ping Chen |
---|---|
Format: | article |
Language: | EN |
Published: |
MDPI AG
2021
|
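Subjects: | fingerspelling recognition; depth sensor; finger attention; receptive field; inter-finger relation; Mathematics; QA1-939 |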
Online access: | https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452 |
id |
oai:doaj.org-article:2dd77c35306a42a194de0a472a3a3452 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:2dd77c35306a42a194de0a472a3a3452
2021-11-11T18:20:53Z
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor
10.3390/math9212815
2227-7390
https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452
2021-11-01T00:00:00Z
https://www.mdpi.com/2227-7390/9/21/2815
https://doaj.org/toc/2227-7390
Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations.
Shih-Hung Yang; Yao-Mao Cheng; Jyun-We Huang; Yon-Ping Chen
MDPI AG
article
fingerspelling recognition; depth sensor; finger attention; receptive field; inter-finger relation
Mathematics
QA1-939
EN
Mathematics, Vol 9, Iss 2815, p 2815 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
fingerspelling recognition; depth sensor; finger attention; receptive field; inter-finger relation; Mathematics; QA1-939 |
spellingShingle |
fingerspelling recognition; depth sensor; finger attention; receptive field; inter-finger relation; Mathematics; QA1-939; Shih-Hung Yang; Yao-Mao Cheng; Jyun-We Huang; Yon-Ping Chen; RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
description |
Automatic fingerspelling recognition tackles the communication barrier between deaf and hearing individuals. However, the accuracy of fingerspelling recognition is reduced by high intra-class variability and low inter-class variability. In the existing methods, regular convolutional kernels, which have limited receptive fields (RFs) and often cannot detect subtle discriminative details, are applied to learn features. In this study, we propose a receptive field-aware network with finger attention (RFaNet) that highlights the finger regions and builds inter-finger relations. To highlight the discriminative details of these fingers, RFaNet reweights the low-level features of the hand depth image with those of the non-forearm image and improves finger localization, even when the wrist is occluded. RFaNet captures neighboring and inter-region dependencies between fingers in high-level features. An atrous convolution procedure enlarges the RFs at multiple scales and a non-local operation computes the interactions between multi-scale feature maps, thereby facilitating the building of inter-finger relations. Thus, the representation of a sign is invariant to viewpoint changes, which are primarily responsible for intra-class variability. On an American Sign Language fingerspelling dataset, RFaNet achieved 1.77% higher classification accuracy than state-of-the-art methods. RFaNet achieved effective transfer learning when the number of labeled depth images was insufficient. The fingerspelling representation of a depth image can be effectively transferred from large- to small-scale datasets via highlighting the finger regions and building inter-finger relations, thereby reducing the requirement for expensive fingerspelling annotations. |
format |
article |
author |
Shih-Hung Yang; Yao-Mao Cheng; Jyun-We Huang; Yon-Ping Chen |
author_facet |
Shih-Hung Yang; Yao-Mao Cheng; Jyun-We Huang; Yon-Ping Chen |
author_sort |
Shih-Hung Yang |
title |
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
title_short |
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
title_full |
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
title_fullStr |
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
title_full_unstemmed |
RFaNet: Receptive Field-Aware Network with Finger Attention for Fingerspelling Recognition Using a Depth Sensor |
title_sort |
rfanet: receptive field-aware network with finger attention for fingerspelling recognition using a depth sensor |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/2dd77c35306a42a194de0a472a3a3452 |
work_keys_str_mv |
AT shihhungyang rfanetreceptivefieldawarenetworkwithfingerattentionforfingerspellingrecognitionusingadepthsensor AT yaomaocheng rfanetreceptivefieldawarenetworkwithfingerattentionforfingerspellingrecognitionusingadepthsensor AT jyunwehuang rfanetreceptivefieldawarenetworkwithfingerattentionforfingerspellingrecognitionusingadepthsensor AT yonpingchen rfanetreceptivefieldawarenetworkwithfingerattentionforfingerspellingrecognitionusingadepthsensor |
_version_ |
1718431873008402432 |
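
Note on the description field: the abstract states that atrous convolution enlarges the receptive fields at multiple scales. The sketch below illustrates only the general arithmetic behind that statement, namely that a kernel of size k with dilation rate d spans k + (k - 1)(d - 1) input pixels along one axis. The 3x3 kernel size and the rates 1, 2 and 4 are assumptions chosen for illustration; the record does not give the values used in RFaNet, and the function name is hypothetical.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels along one axis, of a dilated (atrous) kernel.

    A dilation rate d places (d - 1) gaps between adjacent kernel taps,
    so the span grows while the number of weights stays the same.
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical parallel branches: same 3-tap kernel, different dilation rates.
# These rates are illustrative assumptions, not values taken from the article.
for rate in (1, 2, 4):
    span = effective_kernel_size(3, rate)
    print(f"dilation={rate}: covers {span} pixels with 3 weights per axis")
```

Each branch keeps three weights per axis but covers a wider span as the rate grows, which is the sense in which dilation enlarges the receptive field without adding parameters.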