DSTnet: Deformable Spatio-Temporal Convolutional Residual Network for Video Super-Resolution

Video super-resolution (VSR) aims at generating high-resolution (HR) video frames with plausible and temporally consistent details from their low-resolution (LR) counterparts and neighboring frames. The key challenge for VSR lies in the effective exploitation of intra-frame spatial relations and temporal dependencies between consecutive frames. Many existing techniques utilize spatial and temporal information separately and compensate for motion via alignment. These methods cannot fully exploit the spatio-temporal information that significantly affects the quality of the resulting HR videos. In this work, a novel deformable spatio-temporal convolutional residual network (DSTnet) is proposed to overcome the issues of separate motion estimation and compensation methods for VSR. The proposed framework consists of 3D convolutional residual blocks decomposed into spatial and temporal (2+1)D streams. This decomposition can simultaneously utilize the input video's spatial and temporal features without a separate motion estimation and compensation module. Furthermore, deformable convolution layers are used in the proposed model to enhance its motion-awareness capability. Our contribution is twofold: first, the proposed approach can overcome the challenges in modeling complex motions by efficiently using spatio-temporal information; second, the proposed model has fewer parameters to learn than state-of-the-art methods, making it a computationally lean and efficient framework for VSR. Experiments are conducted on the benchmark Vid4 dataset to evaluate the efficacy of the proposed approach. The results demonstrate that the proposed approach achieves superior quantitative and qualitative performance compared to state-of-the-art methods.
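The (2+1)D decomposition mentioned in the abstract can be made concrete with a parameter-count sketch: a full 3D convolution with a t×d×d kernel is replaced by a 1×d×d spatial convolution into m intermediate channels, followed by a t×1×1 temporal convolution. The channel-matching rule below comes from the R(2+1)D literature (Tran et al., 2018); whether DSTnet picks m this way, or deliberately uses a smaller m to obtain its reduced parameter count, is an assumption, not something stated in the abstract.

```python
import math

def params_3d(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel."""
    return c_in * c_out * t * d * d

def matched_mid_channels(c_in, c_out, t, d):
    # Channel-matching rule from the R(2+1)D literature: pick the
    # intermediate width so the factorized block has roughly the same
    # number of weights as the full 3D convolution it replaces.
    return (t * d * d * c_in * c_out) // (d * d * c_in + t * c_out)

def params_2plus1d(c_in, c_out, t, d, m=None):
    """Weights in a (2+1)D block: a 1 x d x d spatial convolution into m
    channels followed by a t x 1 x 1 temporal convolution into c_out."""
    if m is None:
        m = matched_mid_channels(c_in, c_out, t, d)
    return c_in * m * d * d + m * c_out * t
```

For example, with c_in = c_out = 64 and t = d = 3, the matched width is m = 144 and both variants carry 110,592 weights; choosing any smaller m yields a leaner block, which is one plausible route to the parameter savings the abstract claims.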


Bibliographic Details
Main Authors: Anusha Khan, Allah Bux Sargano, Zulfiqar Habib
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects:
Online Access: https://doaj.org/article/b422ee2dc2364a2d8bcb4cb11696b449
id oai:doaj.org-article:b422ee2dc2364a2d8bcb4cb11696b449
record_format dspace
doi 10.3390/math9222873
issn 2227-7390
published_date 2021-11-01
fulltext_url https://www.mdpi.com/2227-7390/9/22/2873
journal_toc https://doaj.org/toc/2227-7390
citation Mathematics, Vol 9, Iss 22, p 2873 (2021)
institution DOAJ
collection DOAJ
language EN
topic video super-resolution
deformable convolution
3D convolution
spatio-temporal
residual neural network
deep learning
Mathematics
QA1-939
format article
author Anusha Khan
Allah Bux Sargano
Zulfiqar Habib
title DSTnet: Deformable Spatio-Temporal Convolutional Residual Network for Video Super-Resolution
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/b422ee2dc2364a2d8bcb4cb11696b449
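The abstract credits deformable convolution layers for the model's motion-awareness. The sketch below is a minimal, single-channel, pure-Python illustration of the underlying idea (Dai et al.'s deformable convolution): each kernel tap samples the input at its regular grid position plus a learned fractional offset, using bilinear interpolation. DSTnet's actual layers operate on multi-channel feature maps with offsets predicted by a separate convolution; the `deform_conv2d` function and the offset layout here are illustrative assumptions, not the paper's implementation.

```python
import math

def bilinear(img, y, x):
    """Bilinearly sample img at fractional location (y, x); zero outside."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            if 0 <= yy < h and 0 <= xx < w:
                val += img[yy][xx] * (1 - abs(y - yy)) * (1 - abs(x - xx))
    return val

def deform_conv2d(img, kernel, offsets):
    """Single-channel deformable convolution, no padding.

    offsets[i][j][k] is the (dy, dx) learned offset for kernel tap k at
    output position (i, j); with all-zero offsets this reduces to a
    standard (cross-correlation) convolution.
    """
    h, w = len(img), len(img[0])
    k = len(kernel)
    out = [[0.0] * (w - k + 1) for _ in range(h - k + 1)]
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            acc = 0.0
            for u in range(k):
                for v in range(k):
                    dy, dx = offsets[i][j][u * k + v]
                    acc += kernel[u][v] * bilinear(img, i + u + dy, j + v + dx)
            out[i][j] = acc
    return out
```

With all-zero offsets the operation matches a plain convolution, which is a handy sanity check when implementing the real, learned-offset version.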