Machine Translation in Low-Resource Languages by an Adversarial Neural Network
Existing Sequence-to-Sequence (Seq2Seq) Neural Machine Translation (NMT) shows strong capability with High-Resource Languages (HRLs). However, this approach poses serious challenges when processing Low-Resource Languages (LRLs), because the model expression is limited by the training scale of parallel sentence pairs. This study utilizes adversary and transfer learning techniques to mitigate the lack of sentence pairs in LRL corpora. We propose a new Low resource, Adversarial, Cross-lingual (LAC) model for NMT. In terms of the adversary technique, the LAC model consists of a generator and a discriminator. The generator is a Seq2Seq model that produces translations from the source to the target language, while the discriminator measures the gap between machine and human translations. In addition, we introduce transfer learning on the LAC model to help capture features from rare resources, because some languages share the same subject-verb-object grammatical structure. Rather than using the entire pretrained LAC model, we separately utilize the pretrained generator and discriminator. The pretrained discriminator exhibited better performance in all experiments. Experimental results demonstrate that the LAC model achieves higher Bilingual Evaluation Understudy (BLEU) scores and has good potential to augment LRL translations.
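The adversarial setup described in the abstract (a Seq2Seq generator whose training signal comes from a discriminator that compares machine output against human translations) can be caricatured with stub components. Everything below is an illustrative sketch: the class names, the scalar "quality" proxy, and the update rule are assumptions for exposition, not the authors' LAC implementation.

```python
import random

# Toy stand-ins for the generator/discriminator pair described in the
# abstract. All names and update rules here are illustrative assumptions.

class Generator:
    """Stub Seq2Seq generator: maps a source sentence to a translation."""
    def __init__(self):
        self.quality = 0.1  # scalar proxy for translation quality

    def translate(self, src):
        # A real model would decode token by token; the stub tags its
        # output with the current quality so the discriminator can read it.
        return f"<mt:{self.quality:.2f}> {src}"

    def update(self, disc_score, lr=0.5):
        # Adversarial signal: the further the discriminator's score is
        # from "looks human" (1.0), the larger the generator's step.
        self.quality += lr * (1.0 - disc_score)

class Discriminator:
    """Stub discriminator: scores how human-like a translation looks."""
    def score(self, translation):
        q = float(translation.split(">")[0].split(":")[1])
        return min(1.0, q + random.uniform(-0.05, 0.05))

random.seed(0)
gen, disc = Generator(), Discriminator()
corpus = [("kildekode", "source code")] * 20  # tiny parallel "corpus"

for src, ref in corpus:
    mt = gen.translate(src)        # generator proposes a translation
    gen.update(disc.score(mt))     # discriminator's verdict drives training

print(round(gen.quality, 2))  # quality climbs toward 1.0 over the loop
```

In the paper's actual setting the generator and discriminator are neural networks trained on parallel sentence pairs, translation quality is measured with BLEU rather than a scalar tag, and (per the abstract) the pretrained discriminator is the component that transfers best across related languages.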
Saved in:
Main Authors: Mengtao Sun; Hao Wang; Mark Pasquine; Ibrahim A. Hameed
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects: machine learning; adversarial machine learning; imbalanced datasets; transfer learning
Online Access: https://doaj.org/article/c042e7e12b1748c3830acce4f976d5e0
id: oai:doaj.org-article:c042e7e12b1748c3830acce4f976d5e0
Title: Machine Translation in Low-Resource Languages by an Adversarial Neural Network
Authors: Mengtao Sun; Hao Wang; Mark Pasquine; Ibrahim A. Hameed
DOI: 10.3390/app112210860
ISSN: 2076-3417
Full text: https://www.mdpi.com/2076-3417/11/22/10860
Subjects: machine learning; adversarial machine learning; imbalanced datasets; transfer learning
Journal: Applied Sciences, Vol 11, Iss 22, 10860 (2021)