Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation
The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of the same-class instances. Such requirements make it necessary for the segmentation methods to extract discriminative local features between...
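Main authors: | Xin Zhao, Jiayi Guo, Yueting Zhang, Yirong Wu |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
Subjects: | semantic segmentation, remote sensing imagery, memory-augmented transformer, memory mechanism, self-attention, Science, Q |
Online access: | https://doaj.org/article/95a9750f55f84f13a6ae12e9424ee1a0 |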
id |
oai:doaj.org-article:95a9750f55f84f13a6ae12e9424ee1a0 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:95a9750f55f84f13a6ae12e9424ee1a0
2021-11-25T18:53:51Z
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation
10.3390/rs13224518
2072-4292
https://doaj.org/article/95a9750f55f84f13a6ae12e9424ee1a0
2021-11-01T00:00:00Z
https://www.mdpi.com/2072-4292/13/22/4518
https://doaj.org/toc/2072-4292
The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of same-class instances. Such requirements make it necessary for segmentation methods to extract discriminative local features between different classes and to explore representative features for all instances of a given class. While common deep convolutional neural networks (DCNNs) can effectively focus on local features, their limited receptive field makes it difficult to obtain consistent global information. In this paper, we propose a memory-augmented transformer (MAT) to effectively model both the local and global information. The feature extraction pipeline of the MAT is split into a memory-based global relationship guidance module and a local feature extraction module. The local feature extraction module mainly consists of a transformer, which is used to extract features from the input images. The global relationship guidance module maintains a memory bank for the consistent encoding of the global information. Global guidance is performed by memory interaction. Bidirectional information flow between the global and local branches is carried out by a memory-query module and a memory-update module. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets demonstrate that our method performs competitively with state-of-the-art methods.
Xin Zhao; Jiayi Guo; Yueting Zhang; Yirong Wu
MDPI AG
article
semantic segmentation; remote sensing imagery; memory-augmented transformer; memory mechanism; self-attention
Science; Q
EN
Remote Sensing, Vol 13, Iss 22, p 4518 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
semantic segmentation; remote sensing imagery; memory-augmented transformer; memory mechanism; self-attention; Science; Q |
spellingShingle |
semantic segmentation; remote sensing imagery; memory-augmented transformer; memory mechanism; self-attention; Science; Q; Xin Zhao; Jiayi Guo; Yueting Zhang; Yirong Wu; Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
description |
The semantic segmentation of remote sensing images requires distinguishing local regions of different classes and exploiting a uniform global representation of same-class instances. Such requirements make it necessary for segmentation methods to extract discriminative local features between different classes and to explore representative features for all instances of a given class. While common deep convolutional neural networks (DCNNs) can effectively focus on local features, their limited receptive field makes it difficult to obtain consistent global information. In this paper, we propose a memory-augmented transformer (MAT) to effectively model both the local and global information. The feature extraction pipeline of the MAT is split into a memory-based global relationship guidance module and a local feature extraction module. The local feature extraction module mainly consists of a transformer, which is used to extract features from the input images. The global relationship guidance module maintains a memory bank for the consistent encoding of the global information. Global guidance is performed by memory interaction. Bidirectional information flow between the global and local branches is carried out by a memory-query module and a memory-update module. Experimental results on the ISPRS Potsdam and ISPRS Vaihingen datasets demonstrate that our method performs competitively with state-of-the-art methods. |
format |
article |
author |
Xin Zhao; Jiayi Guo; Yueting Zhang; Yirong Wu |
author_facet |
Xin Zhao; Jiayi Guo; Yueting Zhang; Yirong Wu |
author_sort |
Xin Zhao |
title |
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
title_short |
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
title_full |
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
title_fullStr |
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
title_full_unstemmed |
Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation |
title_sort |
memory-augmented transformer for remote sensing image semantic segmentation |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/95a9750f55f84f13a6ae12e9424ee1a0 |
work_keys_str_mv |
AT xinzhao memoryaugmentedtransformerforremotesensingimagesemanticsegmentation AT jiayiguo memoryaugmentedtransformerforremotesensingimagesemanticsegmentation AT yuetingzhang memoryaugmentedtransformerforremotesensingimagesemanticsegmentation AT yirongwu memoryaugmentedtransformerforremotesensingimagesemanticsegmentation |
_version_ |
1718410602937843712 |
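
The description above mentions a memory bank with a memory-query module and a memory-update module but gives no formulas. The following is only an illustrative sketch of how such a two-way exchange could look, not the authors' implementation. It assumes scaled dot-product attention for both directions and a momentum average for the update; the function names, the `momentum` parameter, and the toy vectors are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(queries, keys):
    """Scaled dot-product attention in which the keys also serve as values."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        out.append([sum(w * k[j] for w, k in zip(weights, keys))
                    for j in range(len(q))])
    return out


def memory_query(features, memory):
    """Global -> local (assumed form): each local feature reads from the
    memory bank and adds the result as a residual."""
    read = attend(features, memory)
    return [[f + r for f, r in zip(fv, rv)] for fv, rv in zip(features, read)]


def memory_update(features, memory, momentum=0.9):
    """Local -> global (assumed form): each memory slot summarizes the local
    features and is blended in with a momentum average."""
    summary = attend(memory, features)
    return [[momentum * m + (1.0 - momentum) * s for m, s in zip(mv, sv)]
            for mv, sv in zip(memory, summary)]


# Toy data: three local feature vectors and a two-slot memory bank.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
memory = [[0.5, 0.5], [1.0, -1.0]]

guided = memory_query(features, memory)    # same shape as features
updated = memory_update(features, memory)  # same shape as memory
print(len(guided), len(guided[0]), len(updated), len(updated[0]))
```

In this sketch the query step leaves the memory untouched and the update step leaves the features untouched, so the two directions of information flow stay separate. Setting `momentum=1.0` freezes the memory bank entirely.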