Subword Recognition in Historical Arabic Documents using C-GRUs
Recent years have seen a growing tendency to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end users direct access to the images. Recognition of Arabic handwriting is challenging due to the highly cursive na...
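Main authors: | Hanadi Hassen, Somaya Al-Madeed, Ahmed Bouridane |
---|---|
Format: | article |
Language: | EN |
Published: | UIKTEN, 2021 |
Subjects: | handwriting recognition, Arabic historical documents, CNNs, GRUs, classification, Education (L), Technology (T) |
Online access: | https://doaj.org/article/ca1500247a4d4c0881175e0aacbae990 |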
id | oai:doaj.org-article:ca1500247a4d4c0881175e0aacbae990 |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:ca1500247a4d4c0881175e0aacbae990; 2021-12-01T10:50:02Z; Subword Recognition in Historical Arabic Documents using C-GRUs; DOI 10.18421/TEM104-19; ISSN 2217-8309, 2217-8333; https://doaj.org/article/ca1500247a4d4c0881175e0aacbae990; 2021-11-01T00:00:00Z; https://www.temjournal.com/content/104/TEMJournalNovember2021_1630_1637.pdf; https://doaj.org/toc/2217-8309; https://doaj.org/toc/2217-8333; Hanadi Hassen; Somaya Al-Madeed; Ahmed Bouridane; UIKTEN; article; handwriting recognition; arabic historical documents; cnns; grus; classification; Education; L; Technology; T; EN; TEM Journal, Vol 10, Iss 4, Pp 1630-1637 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | handwriting recognition; arabic historical documents; cnns; grus; classification; Education; L; Technology; T |
description | Recent years have seen a growing tendency to digitize historical manuscripts, which not only ensures the preservation of these collections but also gives researchers and end users direct access to the images. Recognition of Arabic handwriting is challenging because of the highly cursive nature of the script and the additional difficulties posed by historical documents, such as degradation. This paper presents an end-to-end system to recognize Arabic handwritten subwords in historical documents. More specifically, we introduce a hybrid CNN-GRU model in which a shallow convolutional network learns robust feature representations while the GRU layers carry out the sequence modelling and generate the transcription of the text. The proposed system is evaluated on two datasets, IBN SINA and VML-HD, achieving recognition rates of 96.10% and 98.60%, respectively. A comparison with existing techniques evaluated on the same datasets validates the effectiveness of the proposed model in characterizing Arabic subwords. |
format | article |
author | Hanadi Hassen; Somaya Al-Madeed; Ahmed Bouridane |
title | Subword Recognition in Historical Arabic Documents using C-GRUs |
publisher | UIKTEN |
publishDate | 2021 |
url | https://doaj.org/article/ca1500247a4d4c0881175e0aacbae990 |
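
The hybrid CNN-GRU idea summarized in the record's description (a shallow convolutional network that extracts features, followed by GRU layers that model the sequence) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the record gives no layer sizes, output layer, or training procedure, so every name and dimension below (`conv_features`, `gru_step`, `cgru_forward`, `init_params`, the random weights, the five hypothetical subword classes) is an assumption made purely for illustration. It uses only the Python standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def conv_features(image, kernels):
    """Shallow convolution: slide each (height x k) kernel along the image width.

    Returns one ReLU feature vector per horizontal position (one time step each).
    """
    height, width = len(image), len(image[0])
    k = len(kernels[0][0])
    steps = []
    for x in range(width - k + 1):
        feats = []
        for kern in kernels:
            s = sum(kern[r][c] * image[r][x + c]
                    for r in range(height) for c in range(k))
            feats.append(max(0.0, s))  # ReLU
        steps.append(feats)
    return steps


def gru_step(x, h, p):
    """One GRU update (biases omitted): update gate z, reset gate r, candidate n."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b)
         for a, b in zip(matvec(p["Wn"], x), matvec(p["Un"], reset_h))]
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def cgru_forward(image, p):
    """Convolutional features -> GRU over time steps -> softmax over classes."""
    h = [0.0] * p["hidden"]
    for x in conv_features(image, p["kernels"]):
        h = gru_step(x, h, p)
    scores = matvec(p["Wout"], h)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(height, k, n_kernels, hidden, n_classes, seed=0):
    """Random, untrained weights; a real system would learn these from data."""
    rng = random.Random(seed)
    p = {
        "hidden": hidden,
        "kernels": [rand_matrix(height, k, rng) for _ in range(n_kernels)],
        "Wout": rand_matrix(n_classes, hidden, rng),
    }
    for gate in "zrn":
        p["W" + gate] = rand_matrix(hidden, n_kernels, rng)
        p["U" + gate] = rand_matrix(hidden, hidden, rng)
    return p


# Toy 8x12 "subword image" with random pixel intensities in [0, 1).
rng = random.Random(1)
image = [[rng.random() for _ in range(12)] for _ in range(8)]
params = init_params(height=8, k=3, n_kernels=4, hidden=6, n_classes=5)
probs = cgru_forward(image, params)
predicted = max(range(len(probs)), key=probs.__getitem__)
```

With untrained weights the predicted class is meaningless; the sketch only shows how image columns become a feature sequence that the recurrent layer consumes step by step. Classifying the final hidden state is one possible output design chosen here for brevity.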