PointMTL: Multi-Transform Learning for Effective 3D Point Cloud Representations


Bibliographic Details
Main Authors: Yifan Jian, Yuwei Yang, Zhi Chen, Xianguo Qing, Yang Zhao, Liang He, Xuekun Chen, Wei Luo
Format: article
Language: EN
Published: IEEE 2021
Online Access: https://doaj.org/article/9e3f67a428074777ae56849b7057d65f
Description
Summary: Effectively learning and extracting the feature representations of 3D point clouds is an important yet challenging task. Most existing works achieve reasonable performance in 3D vision tasks by appropriately modeling the relationships among points. However, these methods learn feature representations under only a single transform, so the representations easily overlap, which limits the representation ability of the model. To address these issues, we propose a novel Multi-Transform Learning framework for point clouds (PointMTL), which extracts diverse features from multiple mapping transforms to obtain richer representations. Specifically, we build a module named Multi-Transform Encoder (MTE), which encodes and aggregates local features from multiple non-linear transforms. To further explore global context representations, a module named Global Spatial Fusion (GSF) is proposed to capture global information and selectively fuse it with the local representations. Moreover, to guarantee the richness and diversity of the learned representations, we further propose a Spatial Independence Criterion (SIC) strategy that enlarges the differences between the transforms and reduces information redundancy. In contrast to previous works, our approach fully exploits representations from multiple transforms, and thus has strong expressiveness and good robustness for point cloud related tasks. Experiments on three typical tasks (i.e., semantic segmentation on S3DIS and ScanNet, part segmentation on ShapeNet, and shape classification on ModelNet40) demonstrate the effectiveness of our method.
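To make the abstract's core ideas concrete, the following is a minimal NumPy sketch of two of the described components: a multi-transform encoder that applies several parallel non-linear transforms and aggregates them, and a diversity penalty in the spirit of the Spatial Independence Criterion. This is not the paper's actual implementation; all sizes, the single-layer ReLU transforms, the max-pool aggregation, and the cross-correlation penalty are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_transform(x, w, b):
    """One non-linear mapping transform: a single-layer MLP with ReLU (assumed form)."""
    return np.maximum(x @ w + b, 0.0)

# N points with C-dim input features, mapped to D dims by K parallel transforms
# (all sizes are hypothetical, chosen only for the demo)
N, C, D, K = 128, 16, 32, 4
points = rng.standard_normal((N, C))
weights = [rng.standard_normal((C, D)) * 0.1 for _ in range(K)]
biases = [np.zeros(D) for _ in range(K)]

# Multi-Transform Encoder (sketch): encode with each transform, then aggregate
branch_feats = [mlp_transform(points, w, b) for w, b in zip(weights, biases)]
aggregated = np.max(np.stack(branch_feats), axis=0)  # max-pool across transforms

def sic_penalty(feats):
    """Diversity penalty (sketch): sum of absolute cross-correlations between
    branch outputs; minimizing it pushes the transforms apart, in the spirit
    of the Spatial Independence Criterion."""
    flat = [(f.reshape(-1) - f.mean()) / (f.std() + 1e-8) for f in feats]
    pen = 0.0
    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            pen += abs(np.dot(flat[i], flat[j]) / flat[i].size)
    return pen

print(aggregated.shape)            # (128, 32): one aggregated feature per point
print(sic_penalty(branch_feats))   # non-negative scalar to be minimized in training
```

In a full model this penalty would be added to the task loss as a regularizer, so that gradient descent simultaneously fits the task and decorrelates the branches.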