Generating Bird’s Eye View from Egocentric RGB Videos

In this paper, we present a method for generating bird’s eye video from egocentric RGB videos. Working with egocentric views is tricky since such a view is highly warped and prone to occlusions. On the other hand, a bird’s eye view has a consistent scaling in at least the two dimensions it shows....

Bibliographic Details
Main Authors:	Vanita Jain, Qiming Wu, Shivam Grover, Kshitij Sidana, Gopal Chaudhary, San Hlaing Myint, Qiaozhi Hua
Format:	article
Language:	EN
Published:	Hindawi-Wiley 2021
Subjects:	Technology T Telecommunication TK5101-6720
Online Access:	https://doaj.org/article/c3372781b8ce4496885386d9d29dcc3c

id	oai:doaj.org-article:c3372781b8ce4496885386d9d29dcc3c
record_format	dspace
spelling	oai:doaj.org-article:c3372781b8ce4496885386d9d29dcc3c; 2021-11-22T01:10:56Z; Generating Bird’s Eye View from Egocentric RGB Videos; 1530-8677; 10.1155/2021/7479473; https://doaj.org/article/c3372781b8ce4496885386d9d29dcc3c; 2021-01-01T00:00:00Z; http://dx.doi.org/10.1155/2021/7479473; https://doaj.org/toc/1530-8677; Vanita Jain; Qiming Wu; Shivam Grover; Kshitij Sidana; Gopal Chaudhary; San Hlaing Myint; Qiaozhi Hua; Hindawi-Wiley; article; Technology; T; Telecommunication; TK5101-6720; EN; Wireless Communications and Mobile Computing, Vol 2021 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Technology T Telecommunication TK5101-6720
description	In this paper, we present a method for generating bird’s eye video from egocentric RGB videos. Working with egocentric views is tricky since such a view is highly warped and prone to occlusions. On the other hand, a bird’s eye view has a consistent scaling in at least the two dimensions it shows. Moreover, most of the state-of-the-art systems for tasks such as path prediction are built for bird’s eye views of the subjects. We present a deep learning-based approach that transfers the egocentric RGB images captured from a dashcam of a car to bird’s eye view. This is a task of view translation, and we perform two experiments. The first one uses an image-to-image translation method, and the other uses a video-to-video translation. We compare the results of our work with homographic transformation, and our SSIM values are better by a margin of 77% and 14.4%, and the RMSE values are lower by 40% and 14.6% for image-to-image translation and video-to-video translation, respectively. We also visually show the efficacy and limitations of each method with helpful insights for future research. Compared to previous works that use homography and LIDAR for 3D point clouds, our work is more generalizable and does not require any expensive equipment.
format	article
author	Vanita Jain Qiming Wu Shivam Grover Kshitij Sidana Gopal Chaudhary San Hlaing Myint Qiaozhi Hua
author_sort	Vanita Jain
title	Generating Bird’s Eye View from Egocentric RGB Videos
publisher	Hindawi-Wiley
publishDate	2021
url	https://doaj.org/article/c3372781b8ce4496885386d9d29dcc3c

Similar Items