Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages

In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of...

Bibliographic Details
Main authors: Zygmunt Vetulani, Grażyna Vetulani, Panchanan Mohanty
Format: article
Language: EN
Published: Taylor & Francis Group 2021
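Subjects: it systems with nl competence; language resources and tools; language technology; less-resourced languages; lexicon-grammar; Telecommunication (TK5101-6720); Information technology (T58.5-58.64)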
Online access: https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1
id oai:doaj.org-article:c6f2d551e22545159d0d736c814cdab1
record_format dspace
spelling oai:doaj.org-article:c6f2d551e22545159d0d736c814cdab1
2021-11-17T14:22:00Z
2475-1839
2475-1847
10.1080/24751839.2021.1966236
https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1
2021-10-01T00:00:00Z
http://dx.doi.org/10.1080/24751839.2021.1966236
https://doaj.org/toc/2475-1839
https://doaj.org/toc/2475-1847
Journal of Information and Telecommunication, Vol 5, Iss 4, Pp 514-535 (2021)
institution DOAJ
collection DOAJ
language EN
topic it systems with nl competence
language resources and tools
language technology
less-resourced languages
lexicon-grammar
Telecommunication
TK5101-6720
Information technology
T58.5-58.64
description In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of our projects aiming at language resources and tools we had to create in order to be able to process texts in Polish and develop real-scale systems with language understanding competence. The case study we present here is the rule-based system POLINT-112-SMS for improving information management in emergency situations. We argue in favour of the lexicon-grammar approach to the formal description of inflecting languages and present our current work on this grammatical paradigm. Our current work is on the implementation of the ideas presented in the first part of the paper on three prominent Indian languages, that is, Hindi, Odia, and Bengali.
format article
author Zygmunt Vetulani
Grażyna Vetulani
Panchanan Mohanty
title Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages
publisher Taylor & Francis Group
publishDate 2021
url https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1