MarsExplorer: Exploration of Unknown Terrains via Deep Reinforcement Learning and Procedurally Generated Environments

This paper is an initial endeavor to bridge the gap between powerful Deep Reinforcement Learning methodologies and the problem of exploration/coverage of unknown terrains. Within this scope, MarsExplorer, an OpenAI-Gym compatible environment tailored to the exploration/coverage of unknown areas, is presented. MarsExplorer translates the original robotics problem into a Reinforcement Learning setup that various off-the-shelf algorithms can tackle. Any learned policy can be applied directly to a robotic platform, without requiring an elaborate simulation model of the robot's dynamics or an additional learning/adaptation phase. One of its core features is the controllable, multi-dimensional procedural generation of terrains, which is key to producing policies with strong generalization capabilities. Four state-of-the-art RL algorithms (A3C, PPO, Rainbow, and SAC) are trained on the MarsExplorer environment, and their results are evaluated against average human-level performance. In the follow-up experimental analysis, the effect of the multi-dimensional difficulty setting on the learning capabilities of the best-performing algorithm (PPO) is analyzed. A milestone result is the emergence of an exploration policy that follows the Hilbert curve, without this information being provided to the environment and without directly or indirectly rewarding Hilbert-curve-like trajectories. The experimental analysis concludes by evaluating the PPO-learned policy side-by-side with frontier-based exploration strategies. A study of the performance curves reveals that the PPO-based policy performs adaptive-to-the-unknown-terrain sweeping without leaving expensive-to-revisit areas uncovered, underlining the capability of RL-based methodologies to tackle exploration tasks efficiently.
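Since MarsExplorer exposes the standard OpenAI-Gym interface, any Gym-compatible agent interacts with it through the usual reset/step loop, with each reset yielding a newly procedurally generated terrain. The sketch below illustrates that loop with a random policy; the package name `mars_explorer` and the environment id `"MarsExplorer-v0"` are assumptions made for illustration and may differ from the released code.

```python
# Minimal sketch of interacting with a Gym-compatible exploration
# environment via the classic (pre-0.26) gym API, which matches the
# paper's timeframe. Package name and environment id are assumptions.
import gym
import mars_explorer  # hypothetical package that registers the environment

env = gym.make("MarsExplorer-v0")  # hypothetical environment id

obs = env.reset()                  # each reset spawns a new procedurally generated terrain
done, episode_return = False, 0.0
while not done:
    action = env.action_space.sample()          # stand-in for a trained policy (e.g., PPO)
    obs, reward, done, info = env.step(action)  # classic 4-tuple step signature
    episode_return += reward

print("episode return:", episode_return)
env.close()
```

Because the environment follows this interface, off-the-shelf implementations of A3C, PPO, Rainbow, or SAC can, in principle, be trained on it without environment-specific glue code.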

Bibliographic Details
Main Authors: Dimitrios I. Koutras, Athanasios C. Kapoutsis, Angelos A. Amanatiadis, Elias B. Kosmatopoulos
Format: Article
Language: EN
Published: MDPI AG, 2021
Published in: Electronics, Vol 10, Iss 2751, p 2751 (2021); ISSN 2079-9292
DOI: 10.3390/electronics10222751
Subjects: Deep Reinforcement Learning; OpenAI gym; exploration; unknown terrains; Electronics (TK7800-8360)
Online Access: https://doaj.org/article/89b0f32561004c3ea99cdb0421579ba6
Full Text: https://www.mdpi.com/2079-9292/10/22/2751