High-Performance English–Chinese Machine Translation Based on GPU-Enabled Deep Neural Networks with Domain Corpus

The ability to automate machine translation has various applications in international commerce, medicine, travel, education, and text digitization. Due to the different grammar and lack of clear word boundaries in Chinese, it is challenging to conduct translation from word-based languages (e.g., Eng...

Bibliographic Details
Main authors:	Lanxin Zhao, Wanrong Gao, Jianbin Fang
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	neural machine translation; transformer; GPUs; multi-domain corpus; Technology; T; Engineering (General). Civil engineering (General); TA1-2040; Biology (General); QH301-705.5; Physics; QC1-999; Chemistry; QD1-999
Online access:	https://doaj.org/article/7d69fc6909834514a76851fe72bfceb4

id	oai:doaj.org-article:7d69fc6909834514a76851fe72bfceb4
record_format	dspace
spelling	oai:doaj.org-article:7d69fc6909834514a76851fe72bfceb4; 2021-11-25T16:40:43Z; High-Performance English–Chinese Machine Translation Based on GPU-Enabled Deep Neural Networks with Domain Corpus; 10.3390/app112210915; 2076-3417; https://doaj.org/article/7d69fc6909834514a76851fe72bfceb4; 2021-11-01T00:00:00Z; https://www.mdpi.com/2076-3417/11/22/10915; https://doaj.org/toc/2076-3417; Lanxin Zhao; Wanrong Gao; Jianbin Fang; MDPI AG; article; neural machine translation; transformer; GPUs; multi-domain corpus; Technology; T; Engineering (General). Civil engineering (General); TA1-2040; Biology (General); QH301-705.5; Physics; QC1-999; Chemistry; QD1-999; EN; Applied Sciences, Vol 11, Iss 22, p 10915 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	neural machine translation; transformer; GPUs; multi-domain corpus; Technology; T; Engineering (General). Civil engineering (General); TA1-2040; Biology (General); QH301-705.5; Physics; QC1-999; Chemistry; QD1-999
description	The ability to automate machine translation has various applications in international commerce, medicine, travel, education, and text digitization. Because Chinese grammar differs from that of word-based languages (e.g., English) and Chinese text lacks clear word boundaries, translating from such languages to Chinese is challenging. This article implements a GPU-enabled deep learning machine translation system based on a domain-specific corpus. The system takes English text as input and uses an encoder-decoder model with an attention mechanism, based on Google’s Transformer, to translate the text into Chinese. The model was trained using a simple self-designed entropy loss function and an Adam optimizer on English–Chinese bilingual sentences from the News area of the UM-Corpus. Parallel training of the model can be performed on common laptops, desktops, and servers with one or more GPUs. During training, we not only track loss over training epochs but also measure the quality of the model’s translations with the BLEU score. We also provide an easy-to-use web interface for managing corpora, training projects, and trained models. The experimental results show a maximum BLEU score of 29.2. This score can be improved further by tuning other hyperparameters. GPU-enabled model training runs over 15x faster than training on a multi-core CPU, which shortens turnaround time. As a case study, we compare the performance of our model with that of Baidu’s system; the comparison shows that our model can compete with an industry-level translation system. We argue that our deep-learning-based translation system is particularly suitable for teaching purposes and small/medium-sized enterprises.
format	article
author	Lanxin Zhao, Wanrong Gao, Jianbin Fang
author_facet	Lanxin Zhao, Wanrong Gao, Jianbin Fang
author_sort	Lanxin Zhao
title	High-Performance English–Chinese Machine Translation Based on GPU-Enabled Deep Neural Networks with Domain Corpus
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/7d69fc6909834514a76851fe72bfceb4
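
The description in this record reports translation quality as a BLEU score (maximum 29.2). As a rough illustration of what that metric measures, here is a minimal sketch of corpus-level BLEU: clipped n-gram precision for orders 1 to 4, combined by geometric mean and multiplied by a brevity penalty. It is not the authors' evaluation code. The function names `ngrams` and `corpus_bleu`, the single-reference setup, the absence of smoothing, and the character-level tokenization of Chinese in the usage lines are all assumptions made for this example; the record does not say how the article tokenized its output.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Corpus-level BLEU with one reference per candidate and no smoothing.

    candidates, references: equally long lists of token lists.
    Returns a score in [0, 1]; multiply by 100 for the usual reporting scale.
    (Hypothetical helper written for this example.)
    """
    matches = [0] * max_n  # clipped n-gram matches, one slot per order
    totals = [0] * max_n   # candidate n-gram counts, one slot per order
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Clip each candidate n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            totals[n - 1] += sum(cand_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if cand_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize candidates shorter than their references.
    brevity_penalty = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity_penalty * math.exp(log_precision)


# Usage: Chinese text split into characters (an assumed tokenization).
hypothesis = [list("今天天气很好")]
reference = [list("今天的天气很好")]
print(round(100 * corpus_bleu(hypothesis, reference), 1))
```

Scores from a sketch like this are only comparable with published figures when tokenization, smoothing, and the number of references match those used in the original evaluation.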
