Towards Developing a Comprehensive Tag Set for the Arabic Language

This paper presents a comprehensive tag set as a fundamental component for developing an automated Word Class/Part-of-Speech (PoS) tagging system for the Arabic language. The aim is to develop a standard and comprehensive PoS tag set that is based upon PoS classes and Arabic inflectional morphology use...

Bibliographic Details
Main authors: Alqrainy Shihadeh, Alawairdhi Muhammed
Format: article
Language: EN
Published: De Gruyter 2020
Subjects:
Q
Online access: https://doaj.org/article/bddd0e27f2fe474f9ec71882524e770e
id oai:doaj.org-article:bddd0e27f2fe474f9ec71882524e770e
record_format dspace
spelling oai:doaj.org-article:bddd0e27f2fe474f9ec71882524e770e
2021-12-05T14:10:51Z
2191-026X
10.1515/jisys-2019-0256
https://doaj.org/article/bddd0e27f2fe474f9ec71882524e770e
2020-09-01T00:00:00Z
https://doi.org/10.1515/jisys-2019-0256
https://doaj.org/toc/2191-026X
Journal of Intelligent Systems, Vol 30, Iss 1, Pp 287-296 (2020)
institution DOAJ
collection DOAJ
language EN
topic arabic language
part-of-speech (pos)
tag set
natural language processing (nlp)
68-xx
Science
Q
Electronic computers. Computer science
QA75.5-76.95
description This paper presents a comprehensive tag set as a fundamental component for developing an automated Word Class/Part-of-Speech (PoS) tagging system for the Arabic language. The aim is to develop a standard and comprehensive PoS tag set that is based upon PoS classes and Arabic inflectional morphology and from which linguists and Natural Language Processing (NLP) developers can extract more linguistic information. The tag names in the developed tag set use terminology from traditional Arabic grammar rather than English grammar. The usability of the presented tag set was tested in manual tagging, which built up a set of tagged text to serve as a goal corpus against which the results obtained from the tagger were compared. The tagger achieved an average accuracy of 90% using the developed detailed tag set.
format article
author Alqrainy Shihadeh
Alawairdhi Muhammed
title Towards Developing a Comprehensive Tag Set for the Arabic Language
publisher De Gruyter
publishDate 2020
url https://doaj.org/article/bddd0e27f2fe474f9ec71882524e770e