Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages
In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of...
Main Authors: | Zygmunt Vetulani, Grażyna Vetulani, Panchanan Mohanty |
---|---|
Format: | article |
Language: | EN |
Published: | Taylor & Francis Group, 2021 |
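Subjects: | it systems with nl competence; language resources and tools; language technology; less-resourced languages; lexicon-grammar; Telecommunication; Information technology |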
Online Access: | https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1 |
id | oai:doaj.org-article:c6f2d551e22545159d0d736c814cdab1 |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:c6f2d551e22545159d0d736c814cdab1; 2021-11-17T14:22:00Z; Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages; 2475-1839; 2475-1847; 10.1080/24751839.2021.1966236; https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1; 2021-10-01T00:00:00Z; http://dx.doi.org/10.1080/24751839.2021.1966236; https://doaj.org/toc/2475-1839; https://doaj.org/toc/2475-1847; In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of our projects aiming at language resources and tools we had to create in order to be able to process texts in Polish and develop real-scale systems with language understanding competence. The case study we present here is the rule-based system POLINT-112-SMS for improving information management in emergency situations. We argue in favour of the lexicon-grammar approach to the formal description of inflecting languages and present our current work on this grammatical paradigm. Our current work is on the implementation of the ideas presented in the first part of the paper on three prominent Indian languages, that is, Hindi, Odia, and Bengali.; Zygmunt Vetulani; Grażyna Vetulani; Panchanan Mohanty; Taylor & Francis Group; article; it systems with nl competence; language resources and tools; language technology; less-resourced languages; lexicon-grammar; Telecommunication; TK5101-6720; Information technology; T58.5-58.64; EN; Journal of Information and Telecommunication, Vol 5, Iss 4, Pp 514-535 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | it systems with nl competence; language resources and tools; language technology; less-resourced languages; lexicon-grammar; Telecommunication; TK5101-6720; Information technology; T58.5-58.64 |
spellingShingle | it systems with nl competence; language resources and tools; language technology; less-resourced languages; lexicon-grammar; Telecommunication; TK5101-6720; Information technology; T58.5-58.64; Zygmunt Vetulani; Grażyna Vetulani; Panchanan Mohanty; Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
description | In this paper, based on the example of our early works for Polish, we want to share our experience in the challenging task of developing NLP-based technologies in the situation of initial scarcity of digital language resources that ranked Polish among the Less-Resourced Languages. We present some of our projects aiming at language resources and tools we had to create in order to be able to process texts in Polish and develop real-scale systems with language understanding competence. The case study we present here is the rule-based system POLINT-112-SMS for improving information management in emergency situations. We argue in favour of the lexicon-grammar approach to the formal description of inflecting languages and present our current work on this grammatical paradigm. Our current work is on the implementation of the ideas presented in the first part of the paper on three prominent Indian languages, that is, Hindi, Odia, and Bengali. |
format | article |
author | Zygmunt Vetulani; Grażyna Vetulani; Panchanan Mohanty |
author_facet | Zygmunt Vetulani; Grażyna Vetulani; Panchanan Mohanty |
author_sort | Zygmunt Vetulani |
title | Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
title_short | Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
title_full | Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
title_fullStr | Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
title_full_unstemmed | Development of real size IT systems with language competence as a challenge for a Less-Resourced Language: a methodological proposal for Indo-Aryan languages |
title_sort | development of real size it systems with language competence as a challenge for a less-resourced language: a methodological proposal for indo-aryan languages |
publisher | Taylor & Francis Group |
publishDate | 2021 |
url | https://doaj.org/article/c6f2d551e22545159d0d736c814cdab1 |
work_keys_str_mv | AT zygmuntvetulani developmentofrealsizeitsystemswithlanguagecompetenceasachallengeforalessresourcedlanguageamethodologicalproposalforindoaryanlanguages AT grazynavetulani developmentofrealsizeitsystemswithlanguagecompetenceasachallengeforalessresourcedlanguageamethodologicalproposalforindoaryanlanguages AT panchananmohanty developmentofrealsizeitsystemswithlanguagecompetenceasachallengeforalessresourcedlanguageamethodologicalproposalforindoaryanlanguages |
_version_ | 1718425444778246144 |