Analysis of Gradient Vanishing of RNNs and Performance Comparison

A recurrent neural network (RNN) combines variable-length input data with a hidden state that depends on previous time steps to generate output data. RNNs have been widely used in time-series data analysis, and various RNN algorithms have been proposed, such as the standard RNN, long short-term memo...

Bibliographic Details
Main author:	Seol-Hyun Noh
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	RNN; LSTM; GRU; gradient vanishing; accuracy; Information technology; T58.5-58.64
Online access:	https://doaj.org/article/c707bedefb6643e791954367081254ca

id	oai:doaj.org-article:c707bedefb6643e791954367081254ca
record_format	dspace
spelling	oai:doaj.org-article:c707bedefb6643e791954367081254ca | 2021-11-25T17:58:25Z | Analysis of Gradient Vanishing of RNNs and Performance Comparison | 10.3390/info12110442 | 2078-2489 | https://doaj.org/article/c707bedefb6643e791954367081254ca | 2021-10-01T00:00:00Z | https://www.mdpi.com/2078-2489/12/11/442 | https://doaj.org/toc/2078-2489 | Seol-Hyun Noh | MDPI AG | article | RNN | LSTM | GRU | gradient vanishing | accuracy | Information technology | T58.5-58.64 | EN | Information, Vol 12, Iss 442, p 442 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	RNN; LSTM; GRU; gradient vanishing; accuracy; Information technology; T58.5-58.64
description	A recurrent neural network (RNN) combines variable-length input data with a hidden state that depends on previous time steps to generate output data. RNNs have been widely used in time-series data analysis, and various RNN algorithms have been proposed, such as the standard RNN, long short-term memory (LSTM), and gated recurrent units (GRUs). In particular, experiments have shown that LSTM and GRU achieve higher validation accuracy and prediction accuracy than the standard RNN. Learning ability is a measure of how effectively the gradient of the error information is backpropagated. This study provides a theoretical and experimental basis for the result that LSTM and GRU have more efficient gradient descent than the standard RNN, by analyzing gradient vanishing in the standard RNN, LSTM, and GRU and testing it experimentally. The analysis shows that LSTM and GRU are robust to the degradation of gradient descent even when they learn long-range input data, which means that their learning ability is greater than that of the standard RNN on such data. Therefore, LSTM and GRU have higher validation accuracy and prediction accuracy than the standard RNN. In addition, the study verifies whether the experimental results of river-level prediction models, solar power generation prediction models, and speech signal models using the standard RNN, LSTM, and GRU are consistent with the analysis results of gradient vanishing.
format	article
author	Seol-Hyun Noh
title	Analysis of Gradient Vanishing of RNNs and Performance Comparison
title_sort	analysis of gradient vanishing of rnns and performance comparison
publisher	MDPI AG
publishDate	2021
url	https://doaj.org/article/c707bedefb6643e791954367081254ca
work_keys_str_mv	AT seolhyunnoh analysisofgradientvanishingofrnnsandperformancecomparison
_version_	1718411829139472384

Similar Items