Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.

Deep Reinforcement Learning (DRL) enables agents to make decisions based on a well-designed reward function that suits a particular environment without any prior knowledge related to a given environment. The adaptation of hyperparameters has a great impact on the overall learning process and the le...

Bibliographic Details
Main authors:	Nesma M Ashraf, Reham R Mostafa, Rasha H Sakr, M Z Rashad
Format:	article
Language:	EN
Published:	Public Library of Science (PLoS) 2021
Subjects:	Medicine R Science Q
Online access:	https://doaj.org/article/17aa2e0f666d4f07ba3b14861f6590be

id	oai:doaj.org-article:17aa2e0f666d4f07ba3b14861f6590be
record_format	dspace
spelling	oai:doaj.org-article:17aa2e0f666d4f07ba3b14861f6590be 2021-12-02T20:07:10Z Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm. 1932-6203 10.1371/journal.pone.0252754 https://doaj.org/article/17aa2e0f666d4f07ba3b14861f6590be 2021-01-01T00:00:00Z https://doi.org/10.1371/journal.pone.0252754 https://doaj.org/toc/1932-6203 Deep Reinforcement Learning (DRL) enables agents to make decisions based on a well-designed reward function that suits a particular environment, without any prior knowledge of that environment. The adaptation of hyperparameters has a great impact on the overall learning process and on the learning time. Hyperparameters should be accurately estimated while training DRL algorithms, which is one of the key challenges that we attempt to address. This paper employs a swarm-based optimization algorithm, namely the Whale Optimization Algorithm (WOA), to optimize the hyperparameters of the Deep Deterministic Policy Gradient (DDPG) algorithm and achieve the optimum control strategy in an autonomous driving control problem. DDPG is capable of handling complex environments that contain continuous action spaces. To evaluate the proposed algorithm, The Open Racing Car Simulator (TORCS), a realistic autonomous driving simulation environment, was chosen for its ease of design and implementation. Using TORCS, the DDPG agent with optimized hyperparameters was compared with a DDPG agent with reference hyperparameters. The experimental results showed that optimizing the DDPG hyperparameters maximizes the total rewards across the testing episodes while maintaining a stable driving policy. Nesma M Ashraf Reham R Mostafa Rasha H Sakr M Z Rashad Public Library of Science (PLoS) article Medicine R Science Q EN PLoS ONE, Vol 16, Iss 6, p e0252754 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Medicine R Science Q
spellingShingle	Medicine R Science Q Nesma M Ashraf Reham R Mostafa Rasha H Sakr M Z Rashad Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
description	Deep Reinforcement Learning (DRL) enables agents to make decisions based on a well-designed reward function that suits a particular environment, without any prior knowledge of that environment. The adaptation of hyperparameters has a great impact on the overall learning process and on the learning time. Hyperparameters should be accurately estimated while training DRL algorithms, which is one of the key challenges that we attempt to address. This paper employs a swarm-based optimization algorithm, namely the Whale Optimization Algorithm (WOA), to optimize the hyperparameters of the Deep Deterministic Policy Gradient (DDPG) algorithm and achieve the optimum control strategy in an autonomous driving control problem. DDPG is capable of handling complex environments that contain continuous action spaces. To evaluate the proposed algorithm, The Open Racing Car Simulator (TORCS), a realistic autonomous driving simulation environment, was chosen for its ease of design and implementation. Using TORCS, the DDPG agent with optimized hyperparameters was compared with a DDPG agent with reference hyperparameters. The experimental results showed that optimizing the DDPG hyperparameters maximizes the total rewards across the testing episodes while maintaining a stable driving policy.
format	article
author	Nesma M Ashraf Reham R Mostafa Rasha H Sakr M Z Rashad
author_facet	Nesma M Ashraf Reham R Mostafa Rasha H Sakr M Z Rashad
author_sort	Nesma M Ashraf
title	Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
title_short	Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
title_full	Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
title_fullStr	Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
title_full_unstemmed	Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
title_sort	optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm.
publisher	Public Library of Science (PLoS)
publishDate	2021
url	https://doaj.org/article/17aa2e0f666d4f07ba3b14861f6590be
work_keys_str_mv	AT nesmamashraf optimizinghyperparametersofdeepreinforcementlearningforautonomousdrivingbasedonwhaleoptimizationalgorithm AT rehamrmostafa optimizinghyperparametersofdeepreinforcementlearningforautonomousdrivingbasedonwhaleoptimizationalgorithm AT rashahsakr optimizinghyperparametersofdeepreinforcementlearningforautonomousdrivingbasedonwhaleoptimizationalgorithm AT mzrashad optimizinghyperparametersofdeepreinforcementlearningforautonomousdrivingbasedonwhaleoptimizationalgorithm
_version_	1718375286802743296
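
The description above names the Whale Optimization Algorithm as the hyperparameter search method but the record does not spell out its update rules. As a rough illustration only, the sketch below follows the commonly published WOA scheme (shrinking encirclement of the best solution, random exploration, and a spiral move), simplified to one scalar coefficient per whale. The search space (`log10_lr`, `gamma`), the bounds, and `toy_objective` are hypothetical stand-ins, not the paper's setup: in the paper's setting the objective would presumably involve training a DDPG agent in TORCS and scoring its total reward, which is not reproduced here.

```python
import math
import random


def woa_minimize(objective, bounds, n_whales=10, n_iters=50, seed=0):
    """Minimal Whale Optimization Algorithm sketch (minimization)."""
    rng = random.Random(seed)

    def clip(x):
        # Keep every coordinate inside its (low, high) bound.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    whales = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_whales)]
    scores = [objective(w) for w in whales]
    best_i = min(range(n_whales), key=scores.__getitem__)
    best, best_score = list(whales[best_i]), scores[best_i]

    for t in range(n_iters):
        a = 2.0 - 2.0 * t / n_iters  # decreases linearly from 2 towards 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore
                # around a randomly chosen whale.
                ref = best if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral move towards the best whale (shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new = [abs(b - v) * spiral + b for b, v in zip(best, x)]
            whales[i] = clip(new)
        # Re-evaluate the population and keep the best solution seen so far.
        for w in whales:
            score = objective(w)
            if score < best_score:
                best, best_score = list(w), score
    return best, best_score


# Hypothetical two-dimensional search space: log10 of a learning rate and a
# discount factor. These names and ranges are assumptions for illustration.
SEARCH_BOUNDS = [(-5.0, -2.0), (0.90, 0.999)]


def toy_objective(params):
    # Stand-in for "negative total reward of a trained agent": a smooth bowl
    # with an arbitrary minimum, used only so the sketch runs quickly.
    log10_lr, gamma = params
    return (log10_lr + 3.5) ** 2 + 100.0 * (gamma - 0.99) ** 2


best_params, best_score = woa_minimize(toy_objective, SEARCH_BOUNDS)
print(best_params, best_score)
```

Swapping `toy_objective` for a function that trains and evaluates an agent would turn each call into a full training run, so population size and iteration count would need to be kept small in practice.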

Similar items