STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation

Applied research on remote sensing images has been advanced by convolutional neural networks (CNNs). Because their receptive field has a fixed size, CNNs cannot model global semantic relevance. Transformer-based models with self-attention can model global semantic information...

Bibliographic Details
Main authors:	Liang Gao, Hui Liu, Minhang Yang, Long Chen, Yaling Wan, Zhengqing Xiao, Yurong Qian
Format:	article
Language:	EN
Published:	IEEE, 2021
Subjects:	Remote sensing; self-attention; semantic segmentation; Transformer; Ocean engineering (TC1501-1800); Geophysics. Cosmic physics (QC801-809)
Online access:	https://doaj.org/article/2efddfdbdb5d4362b8201399ac39c380

id	oai:doaj.org-article:2efddfdbdb5d4362b8201399ac39c380
record_format	dspace
spelling	oai:doaj.org-article:2efddfdbdb5d4362b8201399ac39c380 | 2021-11-18T00:00:21Z | STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation | 2151-1535 | 10.1109/JSTARS.2021.3119654 | https://doaj.org/article/2efddfdbdb5d4362b8201399ac39c380 | 2021-01-01T00:00:00Z | https://ieeexplore.ieee.org/document/9573374/ | https://doaj.org/toc/2151-1535 | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol 14, Pp 10990-11003 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Remote sensing; self-attention; semantic segmentation; Transformer; Ocean engineering (TC1501-1800); Geophysics. Cosmic physics (QC801-809)
description	Applied research on remote sensing images has been advanced by convolutional neural networks (CNNs). Because their receptive field has a fixed size, CNNs cannot model global semantic relevance. Transformer-based models with self-attention can model global semantic information. However, the patch-based computation that the Transformer uses for self-attention ignores the spatial information inside each patch. To address these issues, we propose STransFuse, a new semantic segmentation method for remote sensing images. The model combines the benefits of the Transformer and the CNN to improve segmentation quality on various remote sensing images. Unlike earlier techniques based on Transformer model fusion, we employ a staged model to extract coarse-grained and fine-grained feature representations at various semantic scales. To take full advantage of the features acquired at different stages, we design an adaptive fusion module, which uses a self-attention mechanism to adaptively fuse the semantic information between features at different scales. The overall accuracy (OA) of the proposed model is 1.36% higher than the baseline on the Vaihingen dataset and 1.27% higher than the baseline on the Potsdam dataset. The STransFuse model also performs well when compared with other advanced models.
format	article
author	Liang Gao, Hui Liu, Minhang Yang, Long Chen, Yaling Wan, Zhengqing Xiao, Yurong Qian
title	STransFuse: Fusing Swin Transformer and Convolutional Neural Network for Remote Sensing Image Semantic Segmentation
publisher	IEEE
publishDate	2021
url	https://doaj.org/article/2efddfdbdb5d4362b8201399ac39c380
