An innovative network based on double receptive field and Recursive Bi-directional Long Short-Term Memory

Abstract Sequence recognition of natural scene images has always been an important research topic in the field of computer vision. CRNN has been proven to be a popular end-to-end character sequence recognition network. However, the problem of wide characters is not considered under the setting of CR...

Bibliographic Details
Main authors: Pengfei Meng, Shuangcheng Jia, Qian Li
Format: article
Language: EN
Published: Nature Portfolio 2021
Subjects:
Medicine (R)
Science (Q)
Online access: https://doaj.org/article/da4d000f1ad24142a21e258db08b060d
DOI: https://doi.org/10.1038/s41598-021-01520-y
ISSN: 2045-2322 (https://doaj.org/toc/2045-2322)
Source: Scientific Reports, Vol 11, Iss 1, Pp 1-9 (2021)
Institution: DOAJ
Collection: DOAJ
Description: Abstract. Sequence recognition of natural scene images has long been an important research topic in computer vision. CRNN has proven to be a popular end-to-end character sequence recognition network. However, CRNN does not account for wide characters, and it is less effective at recognizing long, dense, small characters. To address these shortcomings, we propose an improved CRNN network, named CRNN-RES, based on BiLSTM and multiple receptive fields. Specifically, CRNN-RES uses a dual pooling core to enhance the CNN's ability to extract features. In addition, the last RNN layer is improved: the BiLSTM is changed to a shared-parameter BiLSTM network using recursive residuals, which reduces the number of network parameters and improves accuracy. We also designed a structure, called the CRFC layer, that can flexibly configure the length of the input data sequence in the RNN layer. Extensive experiments comparing CRNN-RES with the original CRNN show that, when recognizing English characters and numbers, CRNN-RES has 8,197,549 parameters, 133,752 fewer than CRNN. On the public datasets ICDAR 2003 (IC03), ICDAR 2013 (IC13), IIIT 5k-word (IIIT5k), and Street View Text (SVT), CRNN-RES achieves accuracies of 96.90%, 89.85%, 83.63%, and 82.96%, which are higher than CRNN by 1.40%, 3.15%, 5.43%, and 2.16%, respectively.
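The abstract reports only the CRNN-RES figures and their differences from CRNN, so the baseline values have to be worked out. The following is a minimal sketch of that arithmetic in Python. It assumes the accuracy gains are absolute percentage points (the abstract does not say so explicitly), and all names in it are illustrative rather than taken from the paper.

```python
# Figures quoted in the abstract for CRNN-RES.
# All identifiers here are illustrative, not from the paper.
CRNN_RES_PARAMS = 8_197_549
PARAM_REDUCTION = 133_752  # parameters saved relative to CRNN

CRNN_RES_ACCURACY = {"IC03": 96.90, "IC13": 89.85, "IIIT5k": 83.63, "SVT": 82.96}
GAIN_OVER_CRNN = {"IC03": 1.40, "IC13": 3.15, "IIIT5k": 5.43, "SVT": 2.16}


def implied_crnn_params():
    """Baseline CRNN parameter count implied by the stated reduction."""
    return CRNN_RES_PARAMS + PARAM_REDUCTION


def implied_crnn_accuracy():
    """Baseline CRNN accuracy per dataset.

    Assumption: the reported gains are absolute percentage points.
    """
    return {
        name: round(CRNN_RES_ACCURACY[name] - GAIN_OVER_CRNN[name], 2)
        for name in CRNN_RES_ACCURACY
    }


if __name__ == "__main__":
    baseline = implied_crnn_params()
    print("Implied CRNN parameters:", baseline)
    print("Relative reduction: {:.2%}".format(PARAM_REDUCTION / baseline))
    for name, acc in implied_crnn_accuracy().items():
        print(name, "implied CRNN accuracy:", acc)
```

Under these assumptions the parameter saving is under 2% of the baseline, so most of the reported benefit is in accuracy rather than model size.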