Self-Supervised Monocular Depth Estimation With Extensive Pretraining

Although depth estimation is a key technology for three-dimensional sensing applications involving motion, active sensors such as LiDAR and depth cameras tend to be expensive and bulky. Here, we explore the potential of monocular depth estimation (MDE) based on a self-supervised approach. MDE is a p...

Bibliographic Details
Main author: Hyukdoo Choi
Format: article
Language: EN
Published: IEEE 2021
Subjects:
Online access: https://doaj.org/article/05fc891f131343b9b113a4ba39a130ca
id oai:doaj.org-article:05fc891f131343b9b113a4ba39a130ca
record_format dspace
spelling oai:doaj.org-article:05fc891f131343b9b113a4ba39a130ca
2021-12-02T00:00:17Z
Self-Supervised Monocular Depth Estimation With Extensive Pretraining
2169-3536
10.1109/ACCESS.2021.3129628
https://doaj.org/article/05fc891f131343b9b113a4ba39a130ca
2021-01-01T00:00:00Z
https://ieeexplore.ieee.org/document/9622207/
https://doaj.org/toc/2169-3536
Hyukdoo Choi
IEEE
article
Monocular depth estimation
depth prediction
convolutional neural networks
self-supervised learning
unsupervised learning
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
EN
IEEE Access, Vol 9, Pp 157236-157246 (2021)
institution DOAJ
collection DOAJ
language EN
topic Monocular depth estimation
depth prediction
convolutional neural networks
self-supervised learning
unsupervised learning
Electrical engineering. Electronics. Nuclear engineering
TK1-9971
description Although depth estimation is a key technology for three-dimensional sensing applications involving motion, active sensors such as LiDAR and depth cameras tend to be expensive and bulky. Here, we explore the potential of monocular depth estimation (MDE) based on a self-supervised approach. MDE is a promising technology, but supervised learning suffers from a need for accurate ground-truth depth data. Recent studies have enabled self-supervised training on an MDE model with only monocular image sequences and image-reconstruction errors. We pretrained networks using multiple datasets, including monocular and stereo image sequences. The main challenges posed by the self-supervised MDE model were occlusions and dynamic objects. We proposed novel loss functions to handle these problems in the form of min-over-all and min-with-flow losses, both based on the per-pixel minimum reprojection error of Monodepth2 and extended to stereo images and optical flow. With extensive pretraining and novel losses, our model outperformed existing unsupervised approaches in quantitative depth estimation and the ability to distinguish small objects against a background, as evaluated by KITTI 2015.
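The per-pixel minimum reprojection error named in the description can be illustrated with a small sketch. This is not the paper's implementation: it assumes the source frames have already been warped into the target view, uses a plain L1 photometric error where Monodepth2 combines SSIM and L1, and the names `photometric_l1` and `min_reprojection_loss` are hypothetical. Passing temporal neighbors together with a stereo counterpart is only a guess at what "extended to stereo images" means for the min-over-all loss.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def photometric_l1(target: Image, warped: Image) -> Image:
    """Per-pixel absolute difference between the target and one warped source."""
    return [[abs(t - w) for t, w in zip(t_row, w_row)]
            for t_row, w_row in zip(target, warped)]


def min_reprojection_loss(target: Image, warped_sources: Sequence[Image]) -> float:
    """Mean over pixels of the smallest error across all warped sources.

    Taking the minimum (rather than the average) lets a pixel that is
    occluded in one source be explained by another source that sees it.
    """
    if not warped_sources:
        raise ValueError("need at least one warped source image")
    error_maps = [photometric_l1(target, w) for w in warped_sources]
    height, width = len(target), len(target[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            total += min(err[y][x] for err in error_maps)
    return total / (height * width)


# Toy 2x2 example (assumed data): each warped source is wrong at one pixel,
# mimicking an occlusion, but every pixel is matched by at least one source.
target = [[0.2, 0.4], [0.6, 0.8]]
prev_warp = [[0.2, 0.9], [0.6, 0.8]]    # previous frame, pixel (0, 1) mismatched
next_warp = [[0.7, 0.4], [0.6, 0.8]]    # next frame, pixel (0, 0) mismatched
stereo_warp = [[0.2, 0.4], [0.1, 0.8]]  # stereo counterpart, pixel (1, 0) mismatched

loss = min_reprojection_loss(target, [prev_warp, next_warp, stereo_warp])
```

In this toy case the minimum over sources is zero at every pixel, whereas averaging the three error maps would penalize each mismatched pixel even though another view explains it.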
format article
author Hyukdoo Choi
title Self-Supervised Monocular Depth Estimation With Extensive Pretraining
publisher IEEE
publishDate 2021
url https://doaj.org/article/05fc891f131343b9b113a4ba39a130ca