Native Language Identification Across Text Types: How Special Are Scientists?

Native Language Identification (NLI) is the task of recognizing the native language of an author from text that they wrote in another language. In this paper, we investigate the generalizability of NLI models among learner corpora, and from learner corpora to a new text type, namely scientific artic...

Bibliographic details
Main authors:	Sabrina Stehwien, Sebastian Padó
Format:	article
Language:	EN
Published:	Accademia University Press, 2016
Subjects:	Social Sciences (H); Computational linguistics. Natural language processing (P98-98.5)
Online access:	https://doaj.org/article/ad3c03ee29b045bcb83cade3d11d93f2

id	oai:doaj.org-article:ad3c03ee29b045bcb83cade3d11d93f2
record_format	dspace
spelling	oai:doaj.org-article:ad3c03ee29b045bcb83cade3d11d93f2 | 2021-12-02T09:52:25Z | Native Language Identification Across Text Types: How Special Are Scientists? | 2499-4553 | 10.4000/ijcol.348 | https://doaj.org/article/ad3c03ee29b045bcb83cade3d11d93f2 | 2016-06-01T00:00:00Z | http://journals.openedition.org/ijcol/348 | https://doaj.org/toc/2499-4553 | Native Language Identification (NLI) is the task of recognizing the native language of an author from text that they wrote in another language. In this paper, we investigate the generalizability of NLI models among learner corpora, and from learner corpora to a new text type, namely scientific articles. Our main results are: (a) the science corpus is not harder to model than some learner corpora; (b) it cannot profit as much as learner corpora from corpus combination via domain adaptation; (c) this pattern can be explained in terms of the respective models focusing on language transfer and topic indicators to different extents. | Sabrina Stehwien | Sebastian Padó | Accademia University Press | article | Social Sciences | H | Computational linguistics. Natural language processing | P98-98.5 | EN | IJCoL, Vol 2, Iss 1 (2016)
institution	DOAJ
collection	DOAJ
language	EN
topic	Social Sciences (H); Computational linguistics. Natural language processing (P98-98.5)
spellingShingle	Social Sciences | H | Computational linguistics. Natural language processing | P98-98.5 | Sabrina Stehwien | Sebastian Padó | Native Language Identification Across Text Types: How Special Are Scientists?
description	Native Language Identification (NLI) is the task of recognizing the native language of an author from text that they wrote in another language. In this paper, we investigate the generalizability of NLI models among learner corpora, and from learner corpora to a new text type, namely scientific articles. Our main results are: (a) the science corpus is not harder to model than some learner corpora; (b) it cannot profit as much as learner corpora from corpus combination via domain adaptation; (c) this pattern can be explained in terms of the respective models focusing on language transfer and topic indicators to different extents.
format	article
author	Sabrina Stehwien; Sebastian Padó
author_facet	Sabrina Stehwien; Sebastian Padó
author_sort	Sabrina Stehwien
title	Native Language Identification Across Text Types: How Special Are Scientists?
title_short	Native Language Identification Across Text Types: How Special Are Scientists?
title_full	Native Language Identification Across Text Types: How Special Are Scientists?
title_fullStr	Native Language Identification Across Text Types: How Special Are Scientists?
title_full_unstemmed	Native Language Identification Across Text Types: How Special Are Scientists?
title_sort	native language identification across text types: how special are scientists?
publisher	Accademia University Press
publishDate	2016
url	https://doaj.org/article/ad3c03ee29b045bcb83cade3d11d93f2
work_keys_str_mv	AT sabrinastehwien nativelanguageidentificationacrosstexttypeshowspecialarescientists AT sebastianpado nativelanguageidentificationacrosstexttypeshowspecialarescientists
_version_	1718397963380719616
