Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format

This paper presents work in progress on the DTA “Base Format” for Manuscripts (DTABf-M), an extension to the DTA “Base Format” (DTABf) for the TEI-conformant annotation of manuscripts. The DTABf is a TEI-subset for the consistent, yet unambiguous, annotation of large amounts of historical text. Duri...

Bibliographic Details
Main authors:	Susanne Haaf, Christian Thomas
Format:	article
Language:	DE EN ES FR IT
Published:	OpenEdition 2017
Subjects:	annotation of manuscripts; TEI corpora; chaining ODDs; interoperability; interchange; TEI customization; Computer engineering. Computer hardware; TK7885-7895
Online access:	https://doaj.org/article/d68c836aa48b4cabadf1336be87759d7

id	oai:doaj.org-article:d68c836aa48b4cabadf1336be87759d7
record_format	dspace
spelling	oai:doaj.org-article:d68c836aa48b4cabadf1336be87759d7 | 2021-12-02T11:29:11Z | Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format | 2162-5603 | 10.4000/jtei.1650 | https://doaj.org/article/d68c836aa48b4cabadf1336be87759d7 | 2017-08-01T00:00:00Z | http://journals.openedition.org/jtei/1650 | https://doaj.org/toc/2162-5603 | This paper presents work in progress on the DTA “Base Format” for Manuscripts (DTABf-M), an extension to the DTA “Base Format” (DTABf) for the TEI-conformant annotation of manuscripts. The DTABf is a TEI-subset for the consistent, yet unambiguous, annotation of large amounts of historical text. During our work on the DTA corpora, the DTABf has continuously been subject to further adaptations to specific annotation needs. The latest addition, the DTABf-M, contains elements, attributes, and values necessary for the annotation of (historical) handwritten documents. The goal is to provide a TEI format for diverse manuscripts in large text corpora. While the DTABf covers a wide range of phenomena found not only in printed texts but also in manuscripts, there are certain manuscript-specific features which have to be additionally represented by the DTABf-M. There are several prerequisites for DTABf-M to be suitable for the DTA and its workflows and processes: First, it should be based on the original DTABf tagset, and only extend it if unavoidable. Second, like the DTABf, the DTABf-M should be created in a bottom-up approach, that is, based on actual phenomena found in handwritten texts which are transcribed and encoded using the DTABf. Third, the format should complement the DTABf, not replace it. Hence, it is necessary to find a modular way of integrating the DTABf-M into the DTABf. This paper describes how we deal with these issues in the process of developing the DTABf-M. | Susanne Haaf | Christian Thomas | OpenEdition | article | annotation of manuscripts | TEI corpora | chaining ODDs | interoperability | interchange | TEI customization | Computer engineering. Computer hardware | TK7885-7895 | DE | EN | ES | FR | IT | Journal of the Text Encoding Initiative, Vol 10 (2017)
institution	DOAJ
collection	DOAJ
language	DE EN ES FR IT
topic	annotation of manuscripts; TEI corpora; chaining ODDs; interoperability; interchange; TEI customization; Computer engineering. Computer hardware; TK7885-7895
spellingShingle	annotation of manuscripts | TEI corpora | chaining ODDs | interoperability | interchange | TEI customization | Computer engineering. Computer hardware | TK7885-7895 | Susanne Haaf | Christian Thomas | Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
description	This paper presents work in progress on the DTA “Base Format” for Manuscripts (DTABf-M), an extension to the DTA “Base Format” (DTABf) for the TEI-conformant annotation of manuscripts. The DTABf is a TEI-subset for the consistent, yet unambiguous, annotation of large amounts of historical text. During our work on the DTA corpora, the DTABf has continuously been subject to further adaptations to specific annotation needs. The latest addition, the DTABf-M, contains elements, attributes, and values necessary for the annotation of (historical) handwritten documents. The goal is to provide a TEI format for diverse manuscripts in large text corpora. While the DTABf covers a wide range of phenomena found not only in printed texts but also in manuscripts, there are certain manuscript-specific features which have to be additionally represented by the DTABf-M. There are several prerequisites for DTABf-M to be suitable for the DTA and its workflows and processes: First, it should be based on the original DTABf tagset, and only extend it if unavoidable. Second, like the DTABf, the DTABf-M should be created in a bottom-up approach, that is, based on actual phenomena found in handwritten texts which are transcribed and encoded using the DTABf. Third, the format should complement the DTABf, not replace it. Hence, it is necessary to find a modular way of integrating the DTABf-M into the DTABf. This paper describes how we deal with these issues in the process of developing the DTABf-M.
format	article
author	Susanne Haaf; Christian Thomas
author_facet	Susanne Haaf; Christian Thomas
author_sort	Susanne Haaf
title	Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
title_short	Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
title_full	Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
title_fullStr	Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
title_full_unstemmed	Enabling the Encoding of Manuscripts within the DTABf: Extension and Modularization of the Format
title_sort	enabling the encoding of manuscripts within the dtabf: extension and modularization of the format
publisher	OpenEdition
publishDate	2017
url	https://doaj.org/article/d68c836aa48b4cabadf1336be87759d7
work_keys_str_mv	AT susannehaaf enablingtheencodingofmanuscriptswithinthedtabfextensionandmodularizationoftheformat AT christianthomas enablingtheencodingofmanuscriptswithinthedtabfextensionandmodularizationoftheformat
_version_	1718395908542955520
