A TEI Schema for the Representation of Computer-mediated Communication

The paper presents an XML schema for the representation of genres of computer-mediated communication (CMC) that is compliant with the encoding framework defined by the TEI. It was designed for the annotation of CMC documents in the project Deutsches Referenzkorpus zur internetbasierten Kommunikation...

Bibliographic Details
Main authors: Michael Beißwenger, Maria Ermakova, Alexander Geyken, Lothar Lemnitzer, Angelika Storrer
Format: article
Language: DE, EN, ES, FR, IT
Published: OpenEdition 2012
Subjects: CMC
Online access: https://doaj.org/article/47a049392c9243fb8944e414c9bb56e5
id oai:doaj.org-article:47a049392c9243fb8944e414c9bb56e5
record_format dspace
spelling oai:doaj.org-article:47a049392c9243fb8944e414c9bb56e5
2021-12-02T11:29:20Z
A TEI Schema for the Representation of Computer-mediated Communication
2162-5603
10.4000/jtei.476
https://doaj.org/article/47a049392c9243fb8944e414c9bb56e5
2012-10-01T00:00:00Z
http://journals.openedition.org/jtei/476
https://doaj.org/toc/2162-5603
Journal of the Text Encoding Initiative, Vol 3 (2012)
institution DOAJ
collection DOAJ
language DE
EN
ES
FR
IT
topic computer-mediated communication
CMC
web genres
thread
logfile
forum
Computer engineering. Computer hardware
TK7885-7895
description The paper presents an XML schema for the representation of genres of computer-mediated communication (CMC) that is compliant with the encoding framework defined by the TEI. It was designed for the annotation of CMC documents in the project Deutsches Referenzkorpus zur internetbasierten Kommunikation (DeRiK), which aims at building a corpus on language use in the most popular CMC genres on the German-speaking Internet. The focus of the schema is on those CMC genres which are written and dialogic, such as forums, bulletin boards, chats, instant messaging, wiki and weblog discussions, microblogging on Twitter, and conversation on “social network” sites. The schema provides a representation format for the main structural features of CMC discourse as well as elements for the annotation of those units regarded as “typical” for language use on the Internet. The schema introduces an element <posting>, which describes stretches of text that are sent to the server by a user at a certain point in time. Postings are the main constituting elements of threads and logfiles, which, in our schema, are the two main types of CMC macrostructures. For the microlevel of CMC documents (that is, the structure of the <posting> content), the schema introduces elements for selected features of Internet jargon such as emoticons, interaction words, and addressing terms. It allows for easy anonymization of CMC data in cases where the annotated data are made publicly available, and it includes the metadata necessary for referencing random excerpts from the data, whether as citations in dictionary entries or as results of corpus queries. Documentation of the schema as well as encoding examples can be retrieved from the web at http://www.empirikom.net/bin/view/Themen/CmcTEI. The schema is meant to be a core model for representing CMC that can be modified and extended by others according to their own specific perspectives on CMC data. It could be a first step towards an integration of features for the representation of CMC genres into a future new version of the TEI Guidelines.
format article
author Michael Beißwenger
Maria Ermakova
Alexander Geyken
Lothar Lemnitzer
Angelika Storrer
title A TEI Schema for the Representation of Computer-mediated Communication
publisher OpenEdition
publishDate 2012
url https://doaj.org/article/47a049392c9243fb8944e414c9bb56e5