Table to text generation with accurate content copying

Abstract: Generating fluent, coherent, and informative text from structured data is known as table-to-text generation. Copying words from the table is a common way to address the "out-of-vocabulary" problem, but accurate copying is difficult to achieve. To overcome this problem, we propose an auto-regressive, transformer-based framework that combines a copying mechanism with language modeling to generate the target text. First, to help the model learn the semantic relevance between table and text, we apply a word-transformation method that incorporates field and position information into the target text, so the model learns where to copy from. We then propose two auxiliary learning objectives, a table-text constraint loss and a copy loss: the table-text constraint loss helps the model represent the table input effectively, while the copy loss encourages precise copying of word fragments from the table. Furthermore, we improve the text search strategy to reduce the probability of generating incoherent and repetitive sentences. Experiments on two datasets show that the model outperforms the baseline. On WIKIBIO, BLEU improves from 45.47 to 46.87 and ROUGE from 41.54 to 42.28. On ROTOWIRE, the CO metric increases by 4.29% and BLEU is 1.93 points higher.
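The abstract describes combining a copying mechanism with language modeling so the decoder can either generate a word from the vocabulary or copy it from the table. As a rough illustration only, the sketch below shows a generic pointer-generator-style mixture of a vocabulary distribution with attention over table tokens; it is a minimal PyTorch assumption of that idea, not the paper's actual framework, and the names (CopyGenerator, copy_gate, etc.) are hypothetical. The word-transformation step, the two auxiliary losses, and the improved search strategy are not modeled here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CopyGenerator(nn.Module):
    """Mixes a generation distribution with a copy distribution over table tokens."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.vocab_proj = nn.Linear(hidden_size, vocab_size)  # generation head
        self.copy_gate = nn.Linear(hidden_size, 1)             # scalar copy gate

    def forward(self, dec_state, attn_weights, src_token_ids):
        # dec_state:     (batch, hidden)   decoder state at the current step
        # attn_weights:  (batch, src_len)  attention over the linearised table
        # src_token_ids: (batch, src_len)  vocabulary ids of the table tokens
        p_vocab = F.softmax(self.vocab_proj(dec_state), dim=-1)   # generate from vocabulary
        p_copy = torch.sigmoid(self.copy_gate(dec_state))         # probability of copying
        # Scatter attention mass onto the vocabulary positions of the source tokens.
        copy_dist = torch.zeros_like(p_vocab).scatter_add(1, src_token_ids, attn_weights)
        # Final next-token distribution: weighted mix of copying and generating.
        return p_copy * copy_dist + (1 - p_copy) * p_vocab


if __name__ == "__main__":
    # Shape check with random tensors (not real data).
    gen = CopyGenerator(hidden_size=512, vocab_size=30000)
    dist = gen(torch.randn(2, 512),
               torch.softmax(torch.randn(2, 20), dim=-1),
               torch.randint(0, 30000, (2, 20)))
    print(dist.shape)  # torch.Size([2, 30000])
```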

Bibliographic Details
Main Authors: Yang Yang, Juan Cao, Yujun Wen, Pengzhou Zhang
Format: Article
Language: English (EN)
Published: Nature Portfolio, 2021
Subjects: Medicine (R); Science (Q)
Online Access: https://doaj.org/article/591e684fd0834775a12f7ccd1c239198
Published in: Scientific Reports, Vol 11, Iss 1, Pp 1-12 (2021)
DOI: 10.1038/s41598-021-00813-6 (https://doi.org/10.1038/s41598-021-00813-6)
ISSN: 2045-2322
Publication Date: 2021-11-01