Subword Recognition in Historical Arabic Documents using C-GRUs

Bibliographic Details
Main authors: Hanadi Hassen, Somaya Al-Madeed, Ahmed Bouridane
Format: article
Language: EN
Published: UIKTEN 2021
Subjects: Education (L), Technology (T)
Online access: https://doaj.org/article/ca1500247a4d4c0881175e0aacbae990
id oai:doaj.org-article:ca1500247a4d4c0881175e0aacbae990
record_format dspace
datestamp 2021-12-01T10:50:02Z
doi 10.18421/TEM104-19
issn 2217-8309
2217-8333
date 2021-11-01T00:00:00Z
fulltext https://www.temjournal.com/content/104/TEMJournalNovember2021_1630_1637.pdf
toc https://doaj.org/toc/2217-8309
https://doaj.org/toc/2217-8333
source TEM Journal, Vol 10, Iss 4, Pp 1630-1637 (2021)
institution DOAJ
collection DOAJ
language EN
topic handwriting recognition
arabic historical documents
cnns
grus
classification
Education
L
Technology
T
description Recent years have seen a growing effort to digitize historical manuscripts, which both preserves these collections and gives researchers and end users direct access to the images. Recognition of Arabic handwriting is challenging because of the highly cursive nature of the script and the additional problems associated with historical documents, such as degradation. This paper presents an end-to-end system for recognizing handwritten Arabic subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, reporting recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords.
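The description mentions GRU layers that model the sequence of features coming out of the convolutional network. The sketch below is only an illustration of how a single GRU layer updates its hidden state over such a sequence; it is not the authors' implementation. It uses the standard GRU gate equations in plain Python, and every name in it (`gru_step`, `run_gru`, `init_params`, the toy feature columns, the sizes and the random weights) is a hypothetical choice made for this example. The convolutional stage and the final transcription layer are left out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def gate(W, U, b, x, h, activation):
    # activation(W x + U h + b), one value per hidden unit.
    return [activation(wx + uh + bi)
            for wx, uh, bi in zip(matvec(W, x), matvec(U, h), b)]


def gru_step(x, h, p):
    """One GRU update: input vector x, previous hidden state h, parameters p."""
    z = gate(p["Wz"], p["Uz"], p["bz"], x, h, sigmoid)  # update gate
    r = gate(p["Wr"], p["Ur"], p["br"], x, h, sigmoid)  # reset gate
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    n = gate(p["Wn"], p["Un"], p["bn"], x, reset_h, math.tanh)  # candidate state
    # Blend the candidate with the previous state, weighted by the update gate.
    return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def init_params(input_size, hidden_size, seed=0):
    # Small random weights and zero biases; illustrative values, not trained.
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for name in "zrn":
        params["W" + name] = rand_matrix(hidden_size, input_size)
        params["U" + name] = rand_matrix(hidden_size, hidden_size)
        params["b" + name] = [0.0] * hidden_size
    return params


def run_gru(columns, params, hidden_size):
    """Feed a sequence of feature columns through the GRU, collecting each state."""
    h = [0.0] * hidden_size
    states = []
    for x in columns:
        h = gru_step(x, h, params)
        states.append(h)
    return states


if __name__ == "__main__":
    # Four made-up feature columns of size 3, standing in for CNN output.
    columns = [[0.1, 0.9, 0.0], [0.8, 0.2, 0.4], [0.0, 0.5, 0.7], [0.3, 0.3, 0.3]]
    params = init_params(input_size=3, hidden_size=2)
    for t, state in enumerate(run_gru(columns, params, hidden_size=2)):
        print(t, [round(v, 4) for v in state])
```

In a trained recognizer, each hidden state would typically be passed to an output layer that scores the possible labels. How the paper's model does this is not stated in this record.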
format article
author Hanadi Hassen
Somaya Al-Madeed
Ahmed Bouridane
title Subword Recognition in Historical Arabic Documents using C-GRUs
publisher UIKTEN
publishDate 2021
url https://doaj.org/article/ca1500247a4d4c0881175e0aacbae990