Towards Developing a Comprehensive Tag Set for the Arabic Language

This paper presents a comprehensive Tag set as a fundamental component for developing an automated Word Class/Part-of-Speech (PoS) tagging system for the Arabic language. The aim is to develop a standard and comprehensive PoS tag set that based upon PoS classes and Arabic inflectional morphology use...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Alqrainy Shihadeh, Alawairdhi Muhammed
Formato: article
Lenguaje:EN
Publicado: De Gruyter 2020
Materias:
Q
Acceso en línea:https://doaj.org/article/bddd0e27f2fe474f9ec71882524e770e
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:This paper presents a comprehensive Tag set as a fundamental component for developing an automated Word Class/Part-of-Speech (PoS) tagging system for the Arabic language. The aim is to develop a standard and comprehensive PoS tag set that based upon PoS classes and Arabic inflectional morphology useful for Linguistics and Natural Language Processing (NLP) developers to extract more linguistic information from it. The tag names in the developed tag set uses terminology from Arabic tradition grammar rather than English grammar. The usability of the presented Tag set has been tested in manual tagging and built up a set of tagged text to serve as a goal corpus used to compare it with the results obtained from the tagger. The tagger has achieved an average accuracy of 90% using the developed detailed tag set.