Tiomsú Corpais don Taighde Foclóireachta: Corpas Foclóireachta na Gaeilge (CFG2020)

This paper sets out the steps followed in the compilation of Corpas Foclóireachta na Gaeilge 2020 (CFG2020), a monolingual 77.3 million word corpus. The context of the project is explained, along with the requirements that motivated the decisions made during it. The compilation stage is then described, and...

Bibliographic Details
Main authors: Mícheál J. Ó Meachair, Brian Ó Raghallaigh, Úna Bhreathnach, Gearóid Ó Cleircín, Kevin Scannell
Format: article
Language: EN, GA, GD
Published: The Irish Association for Applied Linguistics, 2021
Online access: https://doaj.org/article/36d9b8106f2340e58d872333161b70dc
id oai:doaj.org-article:36d9b8106f2340e58d872333161b70dc
record_format dspace
spelling oai:doaj.org-article:36d9b8106f2340e58d872333161b70dc
2021-11-27T16:07:03Z
Tiomsú Corpais don Taighde Foclóireachta: Corpas Foclóireachta na Gaeilge (CFG2020)
0332-205X
2565-6325
https://doaj.org/article/36d9b8106f2340e58d872333161b70dc
2021-11-01T00:00:00Z
https://journal.iraal.ie/index.php/teanga/article/view/726
https://doaj.org/toc/0332-205X
https://doaj.org/toc/2565-6325
Teanga: The Journal of the Irish Association for Applied Linguistics, Vol 28 (2021)
institution DOAJ
collection DOAJ
language EN
GA
GD
topic Corpas foclóireachta
Foclóireacht
Corpais
Gaeilge
Lexicographic corpus
Lexicography
Philology. Linguistics
P1-1091
description This paper sets out the steps followed in the compilation of Corpas Foclóireachta na Gaeilge 2020 (CFG2020), a monolingual 77.3 million word Irish-language corpus. The context and circumstances of the project are explained, along with the motivation for various decisions made. The compilation and processing stages are described in detail. The contents of the corpus are outlined and the resource created to query CFG2020 is presented, along with reference to the kinds of analysis and research which it enables. CFG2020 was created as a first step towards a proposed larger corpus project, and suggestions for improvement and expansion are therefore proposed.
format article
author Mícheál J. Ó Meachair
Brian Ó Raghallaigh
Úna Bhreathnach
Gearóid Ó Cleircín
Kevin Scannell
author_sort Mícheál J. Ó Meachair
title Tiomsú Corpais don Taighde Foclóireachta: Corpas Foclóireachta na Gaeilge (CFG2020)
publisher The Irish Association for Applied Linguistics
publishDate 2021
url https://doaj.org/article/36d9b8106f2340e58d872333161b70dc