Investigation of Pre-Trained Bidirectional Encoder Representations from Transformers Checkpoints for Indonesian Abstractive Text Summarization

Text summarization aims to reduce text by removing less useful information so that information can be obtained quickly and precisely. In Indonesian abstractive text summarization, research has mostly focused on multi-document summarization, whose methods do not work optimally for single-document summarization. As the public summarization datasets and works in English focus on single-document summarization, this study emphasized Indonesian single-document summarization. Abstractive text summarization studies in English frequently use Bidirectional Encoder Representations from Transformers (BERT), and since an Indonesian BERT checkpoint is available, it was employed in this study. This study investigated the use of Indonesian BERT in abstractive text summarization on the IndoSum dataset using the BERTSum model. The investigation proceeded by using various combinations of model encoders, model embedding sizes, and model decoders. Evaluation results showed that models with a larger embedding size and a Generative Pre-Training (GPT)-like decoder could improve the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score and BERTScore of the model results.
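As a rough illustration of the pipeline the abstract describes (warm-starting a summarizer from a pre-trained Indonesian BERT checkpoint, generating an abstractive summary, and scoring it with ROUGE and BERTScore), here is a minimal sketch using the Hugging Face transformers, rouge-score, and bert-score packages. The checkpoint name, the encoder-decoder warm-start, and the metric tooling are illustrative assumptions, not the paper's actual BERTSum configuration or training setup.

    # Illustrative sketch only: warm-start an encoder-decoder summarizer from an
    # Indonesian BERT checkpoint and evaluate a generated summary with ROUGE and
    # BERTScore. Checkpoint name and tooling are assumptions, not the paper's setup.
    from transformers import BertTokenizer, EncoderDecoderModel
    from rouge_score import rouge_scorer
    from bert_score import score as bert_score

    checkpoint = "cahya/bert-base-indonesian-1.5G"  # assumed Indonesian BERT checkpoint

    tokenizer = BertTokenizer.from_pretrained(checkpoint)
    # Initialise both encoder and decoder from BERT; the decoder gains cross-attention
    # and generates autoregressively, i.e. it plays the role of a GPT-like decoder.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    article = "Teks artikel berita berbahasa Indonesia yang akan diringkas ..."
    inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
    summary_ids = model.generate(
        inputs["input_ids"],
        max_length=64,
        num_beams=4,
        decoder_start_token_id=tokenizer.cls_token_id,
    )
    candidate = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

    reference = "Ringkasan acuan dari dataset IndoSum."  # placeholder reference summary

    # ROUGE-1/2/L F-measures between candidate and reference summaries.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=False)
    rouge = {name: s.fmeasure for name, s in scorer.score(reference, candidate).items()}

    # BERTScore compares contextual token embeddings; lang="id" selects a multilingual model.
    _, _, f1 = bert_score([candidate], [reference], lang="id")

    print(rouge, float(f1.mean()))

Fine-tuning on IndoSum would of course precede any meaningful evaluation; the snippet only wires the components together.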

Bibliographic Details
Main Authors: Henry Lucky, Derwin Suhartono
Format: article
Language: EN
Published: UUM Press 2021
Subjects: abstractive text summarization; bertsum model; bert score; gpt-like decoder; rouge score; Information technology
Online Access: https://doaj.org/article/2fc02cae383e408c998494a27a231d56
id oai:doaj.org-article:2fc02cae383e408c998494a27a231d56
record_format dspace
spelling oai:doaj.org-article:2fc02cae383e408c998494a27a231d56 2021-11-14T08:29:57Z
DOI: 10.32890/jict2022.21.1.4
ISSN: 1675-414X, 2180-3862
Article URL: https://doaj.org/article/2fc02cae383e408c998494a27a231d56
Published online: 2021-11-01
Full text: http://e-journal.uum.edu.my/index.php/jict/article/view/jict2022.21.1.4
Journal TOC: https://doaj.org/toc/1675-414X , https://doaj.org/toc/2180-3862
Source: Journal of ICT, Vol 21, Iss 1, Pp 71-94 (2021)
institution DOAJ
collection DOAJ
language EN
topic abstractive text summarization
bertsum model
bert score
gpt-like decoder
rouge score
Information technology
T58.5-58.64
description Text summarization aims to reduce text by removing less useful information so that information can be obtained quickly and precisely. In Indonesian abstractive text summarization, research has mostly focused on multi-document summarization, whose methods do not work optimally for single-document summarization. As the public summarization datasets and works in English focus on single-document summarization, this study emphasized Indonesian single-document summarization. Abstractive text summarization studies in English frequently use Bidirectional Encoder Representations from Transformers (BERT), and since an Indonesian BERT checkpoint is available, it was employed in this study. This study investigated the use of Indonesian BERT in abstractive text summarization on the IndoSum dataset using the BERTSum model. The investigation proceeded by using various combinations of model encoders, model embedding sizes, and model decoders. Evaluation results showed that models with a larger embedding size and a Generative Pre-Training (GPT)-like decoder could improve the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) score and BERTScore of the model results.
format article
author Henry Lucky
Derwin Suhartono
title Investigation of Pre-Trained Bidirectional Encoder Representations from Transformers Checkpoints for Indonesian Abstractive Text Summarization
publisher UUM Press
publishDate 2021
url https://doaj.org/article/2fc02cae383e408c998494a27a231d56
_version_ 1718429768407318528