EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications
Augmented Reality (AR) has increasingly benefited from the use of Simultaneous Localization and Mapping (SLAM) systems. This technology has enabled developers to create markerless AR applications, but these applications lack semantic understanding of their environment. The inclusion of this information would empower A...
Main authors: | Giulia Marchesi, Christian Eichhorn, David A. Plecher, Yuta Itoh, Gudrun Klinker |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
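Subjects: | SLAM; semantic segmentation; Semantic SLAM; GPS; Augmented Reality; machine learning; Geography (General); G1-922 |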
Online access: | https://doaj.org/article/5905eecbaac6472aa022b005895bb614 |
id |
oai:doaj.org-article:5905eecbaac6472aa022b005895bb614 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:5905eecbaac6472aa022b005895bb614; 2021-11-25T17:53:08Z; EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications; DOI 10.3390/ijgi10110772; ISSN 2220-9964; https://doaj.org/article/5905eecbaac6472aa022b005895bb614; 2021-11-01T00:00:00Z; https://www.mdpi.com/2220-9964/10/11/772; https://doaj.org/toc/2220-9964; Giulia Marchesi; Christian Eichhorn; David A. Plecher; Yuta Itoh; Gudrun Klinker; MDPI AG; article; SLAM; semantic segmentation; Semantic SLAM; GPS; Augmented Reality; machine learning; Geography (General); G1-922; EN; ISPRS International Journal of Geo-Information, Vol 10, Iss 11, p 772 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
SLAM; semantic segmentation; Semantic SLAM; GPS; Augmented Reality; machine learning; Geography (General); G1-922 |
spellingShingle |
SLAM; semantic segmentation; Semantic SLAM; GPS; Augmented Reality; machine learning; Geography (General); G1-922; Giulia Marchesi; Christian Eichhorn; David A. Plecher; Yuta Itoh; Gudrun Klinker; EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
description |
Augmented Reality (AR) has increasingly benefited from the use of Simultaneous Localization and Mapping (SLAM) systems. This technology has enabled developers to create markerless AR applications, but these applications lack semantic understanding of their environment. Including this information would allow AR applications to react to their surroundings more realistically. To gain semantic knowledge, focus has shifted in recent years toward fusing SLAM systems with neural networks, giving birth to the field of Semantic SLAM. Building on existing research, this paper aimed to create a SLAM system that generates a 3D map using ORB-SLAM2 and enriches it with semantic knowledge originating from the Fast-SCNN network. The key novelty of our approach is a new method for improving the predictions of neural networks, employed to offset the loss of accuracy introduced by efficient real-time models. Exploiting sensor information provided by a smartphone, the system uses GPS coordinates to query the OpenStreetMap database. The returned information is used to determine which classes are currently absent from the environment, so that they can be removed from the network’s prediction with the goal of improving its accuracy. We achieved 87.40% Pixel Accuracy with Fast-SCNN on our custom version of COCO-Stuff and showed an improvement from involving GPS data on our self-made smartphone dataset, resulting in 90.24% Pixel Accuracy. With smartphone use in mind, the implementation aimed to find a trade-off between accuracy and efficiency, making the system achieve an unprecedented speed. To this end, the system was carefully designed, with a strong focus on lightweight neural networks. This enabled the creation of an above real-time Semantic SLAM system that we called EnvSLAM (Environment SLAM). Our extensive evaluation reveals the efficiency of the system features and its operability above real time (48.1 frames per second with an input image resolution of 640 × 360 pixels). Moreover, the GPS integration indicates an effective improvement of the network’s prediction accuracy. |
format |
article |
author |
Giulia Marchesi; Christian Eichhorn; David A. Plecher; Yuta Itoh; Gudrun Klinker |
author_facet |
Giulia Marchesi; Christian Eichhorn; David A. Plecher; Yuta Itoh; Gudrun Klinker |
author_sort |
Giulia Marchesi |
title |
EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
title_short |
EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
title_full |
EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
title_fullStr |
EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
title_full_unstemmed |
EnvSLAM: Combining SLAM Systems and Neural Networks to Improve the Environment Fusion in AR Applications |
title_sort |
envslam: combining slam systems and neural networks to improve the environment fusion in ar applications |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/5905eecbaac6472aa022b005895bb614 |
work_keys_str_mv |
AT giuliamarchesi envslamcombiningslamsystemsandneuralnetworkstoimprovetheenvironmentfusioninarapplications AT christianeichhorn envslamcombiningslamsystemsandneuralnetworkstoimprovetheenvironmentfusioninarapplications AT davidaplecher envslamcombiningslamsystemsandneuralnetworkstoimprovetheenvironmentfusioninarapplications AT yutaitoh envslamcombiningslamsystemsandneuralnetworkstoimprovetheenvironmentfusioninarapplications AT gudrunklinker envslamcombiningslamsystemsandneuralnetworkstoimprovetheenvironmentfusioninarapplications |
_version_ |
1718411861651619840 |
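
The description field in this record says that classes found to be absent from the environment are removed from the network's prediction, and it reports results as Pixel Accuracy. The record gives no implementation details, so the following is only a minimal sketch of that idea under stated assumptions: the function names `suppress_absent_classes` and `pixel_accuracy`, the three-class label set, and the per-pixel score lists are all hypothetical, and the OpenStreetMap lookup is represented simply by a set of class indices already judged absent.

```python
import math
from typing import List, Sequence, Set


def suppress_absent_classes(
    scores: Sequence[Sequence[float]], absent: Set[int]
) -> List[int]:
    """Pick, for each pixel, the highest-scoring class that is not marked absent.

    `scores` holds one list of per-class scores per pixel (hypothetical layout).
    `absent` holds class indices assumed to be missing at the current location,
    e.g. as derived from a map query (the query itself is not modelled here).
    """
    labels: List[int] = []
    for pixel_scores in scores:
        if all(c in absent for c in range(len(pixel_scores))):
            raise ValueError("every class is marked absent; nothing left to predict")
        # Absent classes get -inf so they can never win the argmax.
        allowed = [
            -math.inf if c in absent else s for c, s in enumerate(pixel_scores)
        ]
        labels.append(max(range(len(allowed)), key=allowed.__getitem__))
    return labels


def pixel_accuracy(predicted: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


# Hypothetical classes: 0 = road, 1 = water, 2 = building.
example_scores = [[0.2, 0.5, 0.3], [0.6, 0.3, 0.1], [0.1, 0.5, 0.4]]
example_truth = [2, 0, 2]

raw = suppress_absent_classes(example_scores, absent=set())
filtered = suppress_absent_classes(example_scores, absent={1})  # no water nearby
print(pixel_accuracy(raw, example_truth), pixel_accuracy(filtered, example_truth))
```

In this toy input, two pixels are mislabelled as water until that class is suppressed, which is the kind of correction the description attributes to the GPS integration. How the authors actually map OpenStreetMap data to classes is not stated in this record.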