Unsupervised cross-lingual model transfer for named entity recognition with contextualized word representations.

Named entity recognition (NER) is a fundamental task in the natural language processing (NLP) community. Supervised neural network models based on contextualized word representations can achieve highly competitive performance, but they require a large-scale manually annotated corpus for training. For resource-scarce languages, however, constructing such a corpus is expensive and time-consuming. Unsupervised cross-lingual transfer is thus a good solution to this problem. In this work, we investigate unsupervised cross-lingual NER via model transfer based on contextualized word representations, which greatly advances cross-lingual NER performance. We study several model transfer settings for unsupervised cross-lingual NER, including (1) different types of pretrained transformer-based language models as input, (2) exploration strategies for multilingual contextualized word representations, and (3) multi-source adaptation. In particular, we propose an adapter-based word representation method combined with a parameter generation network (PGN) to better capture the relationship between the source and target languages. We conduct experiments on the benchmark CoNLL dataset involving four languages to simulate the cross-lingual setting. Results show that we can obtain highly competitive performance through cross-lingual model transfer. In particular, our proposed adapter-based PGN model leads to significant improvements for cross-lingual NER.
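The abstract's core idea pairs a bottleneck adapter with a parameter generation network (PGN) that produces the adapter's weights from a language embedding, so one shared backbone can specialize per language. The following is a minimal sketch of that idea, not the authors' implementation; all sizes, function names, and the use of NumPy are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pgn(lang_dim, hidden, bottleneck):
    """Parameter generation network: a linear map from a language embedding
    to the flattened weights of a bottleneck adapter (illustrative sizes)."""
    n_params = 2 * hidden * bottleneck           # down- and up-projections
    W = rng.standard_normal((n_params, lang_dim)) * 0.1
    def generate(lang_emb):
        flat = W @ lang_emb                      # language-specific weights
        w_down = flat[:hidden * bottleneck].reshape(bottleneck, hidden)
        w_up = flat[hidden * bottleneck:].reshape(hidden, bottleneck)
        return w_down, w_up
    return generate

def adapter(x, w_down, w_up):
    """Residual bottleneck adapter applied to contextual representations x."""
    h = np.maximum(0.0, x @ w_down.T)            # down-project + ReLU
    return x + h @ w_up.T                        # up-project, residual add

gen = make_pgn(lang_dim=3, hidden=16, bottleneck=4)
x = rng.standard_normal((5, 16))                 # 5 tokens, hidden size 16
w_down, w_up = gen(rng.standard_normal(3))       # weights for one target language
out = adapter(x, w_down, w_up)
print(out.shape)  # (5, 16)
```

Because the adapter weights are a function of the language embedding rather than free parameters, interpolating or swapping language embeddings is what enables transfer to an unseen target language.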


Saved in:
Bibliographic Details
Main Authors: Huijiong Yan, Tao Qian, Liang Xie, Shanguang Chen
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online Access: https://doaj.org/article/7f90d99991f446ffb3c4044a3dca981b
id oai:doaj.org-article:7f90d99991f446ffb3c4044a3dca981b
record_format dspace
issn 1932-6203
doi 10.1371/journal.pone.0257230
publishDate 2021-01-01
url https://doi.org/10.1371/journal.pone.0257230
https://doaj.org/article/7f90d99991f446ffb3c4044a3dca981b
https://doaj.org/toc/1932-6203
citation PLoS ONE, Vol 16, Iss 9, p e0257230 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Named entity recognition (NER) is a fundamental task in the natural language processing (NLP) community. Supervised neural network models based on contextualized word representations can achieve highly competitive performance, but they require a large-scale manually annotated corpus for training. For resource-scarce languages, however, constructing such a corpus is expensive and time-consuming. Unsupervised cross-lingual transfer is thus a good solution to this problem. In this work, we investigate unsupervised cross-lingual NER via model transfer based on contextualized word representations, which greatly advances cross-lingual NER performance. We study several model transfer settings for unsupervised cross-lingual NER, including (1) different types of pretrained transformer-based language models as input, (2) exploration strategies for multilingual contextualized word representations, and (3) multi-source adaptation. In particular, we propose an adapter-based word representation method combined with a parameter generation network (PGN) to better capture the relationship between the source and target languages. We conduct experiments on the benchmark CoNLL dataset involving four languages to simulate the cross-lingual setting. Results show that we can obtain highly competitive performance through cross-lingual model transfer. In particular, our proposed adapter-based PGN model leads to significant improvements for cross-lingual NER.
format article
author Huijiong Yan
Tao Qian
Liang Xie
Shanguang Chen
title Unsupervised cross-lingual model transfer for named entity recognition with contextualized word representations.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/7f90d99991f446ffb3c4044a3dca981b