Creating Lexical Resources in TEI P5

Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of va...

Bibliographic Details
Main authors: Gerhard Budin, Stefan Majewski, Karlheinz Mörth
Format: article
Language: DE, EN, ES, FR, IT
Published: OpenEdition 2012
Subjects: P5, NLP
Online access: https://doaj.org/article/3d9cdf4730ac43b08863c3f128baaf23
id oai:doaj.org-article:3d9cdf4730ac43b08863c3f128baaf23
record_format dspace
spelling oai:doaj.org-article:3d9cdf4730ac43b08863c3f128baaf23
2021-12-02T11:29:24Z
Creating Lexical Resources in TEI P5
2162-5603
10.4000/jtei.522
https://doaj.org/article/3d9cdf4730ac43b08863c3f128baaf23
2012-10-01T00:00:00Z
http://journals.openedition.org/jtei/522
https://doaj.org/toc/2162-5603
Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of varied lexicographic projects, both digitising print dictionaries and working on the creation of genuinely digital lexicographic data. This data was designed to serve varying purposes: machine-readability was only one. A second goal was interoperability with digital NLP tools. To achieve this end, a uniform encoding system applicable across all the projects was developed. The paper describes the constraints imposed on the content models of the various elements of the TEI dictionary module and provides arguments in favour of TEI P5 as an encoding system not only being used to represent digitised print dictionaries but also for NLP purposes.
Gerhard Budin
Stefan Majewski
Karlheinz Mörth
OpenEdition
article
P5
dictionaries
digital lexicography
NLP
Computer engineering. Computer hardware
TK7885-7895
DE
EN
ES
FR
IT
Journal of the Text Encoding Initiative, Vol 3 (2012)
institution DOAJ
collection DOAJ
language DE
EN
ES
FR
IT
topic P5
dictionaries
digital lexicography
NLP
Computer engineering. Computer hardware
TK7885-7895
spellingShingle P5
dictionaries
digital lexicography
NLP
Computer engineering. Computer hardware
TK7885-7895
Gerhard Budin
Stefan Majewski
Karlheinz Mörth
Creating Lexical Resources in TEI P5
description Although most of the relevant dictionary productions of the recent past have relied on digital data and methods, there is little consensus on formats and standards. The Institute for Corpus Linguistics and Text Technology (ICLTT) of the Austrian Academy of Sciences has been conducting a number of varied lexicographic projects, both digitising print dictionaries and working on the creation of genuinely digital lexicographic data. This data was designed to serve varying purposes: machine-readability was only one. A second goal was interoperability with digital NLP tools. To achieve this end, a uniform encoding system applicable across all the projects was developed. The paper describes the constraints imposed on the content models of the various elements of the TEI dictionary module and argues that TEI P5 is suitable as an encoding system not only for representing digitised print dictionaries but also for NLP purposes.
format article
author Gerhard Budin
Stefan Majewski
Karlheinz Mörth
author_facet Gerhard Budin
Stefan Majewski
Karlheinz Mörth
author_sort Gerhard Budin
title Creating Lexical Resources in TEI P5
title_short Creating Lexical Resources in TEI P5
title_full Creating Lexical Resources in TEI P5
title_fullStr Creating Lexical Resources in TEI P5
title_full_unstemmed Creating Lexical Resources in TEI P5
title_sort creating lexical resources in tei p5
publisher OpenEdition
publishDate 2012
url https://doaj.org/article/3d9cdf4730ac43b08863c3f128baaf23
work_keys_str_mv AT gerhardbudin creatinglexicalresourcesinteip5
AT stefanmajewski creatinglexicalresourcesinteip5
AT karlheinzmorth creatinglexicalresourcesinteip5
_version_ 1718395872492912640