Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages

Bibliographic Details
Main authors: Zygmunt Vetulani, Grażyna Vetulani, Panchanan Mohanty
Format: article
Language: EN
Published: Taylor & Francis Group, 2021
Online access: https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1
Description
Summary: In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of our projects aiming at language resources and tools we had to create in order to be able to process texts in Polish and develop real-scale systems with language understanding competence. The case study we present here is the rule-based system POLINT-112-SMS for improving information management in emergency situations. We argue in favour of the lexicon-grammar approach to the formal description of inflecting languages and present our current work on this grammatical paradigm. Our current work is on the implementation of the ideas presented in the first part of the paper on three prominent Indian languages, that is, Hindi, Odia, and Bengali.