Weakly Supervised Learning for Object Localization Based on an Attention Mechanism

Recently, deep learning has been successfully applied to object detection and localization tasks in images. When setting up deep learning frameworks for supervised training with large datasets, strongly labeling the objects facilitates good performance; however, the complexity of the image scene and...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Nojin Park, Hanseok Ko
Formato: article
Lenguaje:EN
Publicado: MDPI AG 2021
Materias:
T
Acceso en línea:https://doaj.org/article/76e39bd9a5a04c889c7bb8e6da6d03ba
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:Recently, deep learning has been successfully applied to object detection and localization tasks in images. When setting up deep learning frameworks for supervised training with large datasets, strongly labeling the objects facilitates good performance; however, the complexity of the image scene and large size of the dataset make this a laborious task. Hence, it is of paramount importance that the expensive work associated with the tasks involving strong labeling, such as bounding box annotation, is reduced. In this paper, we propose a method to perform object localization tasks without bounding box annotation in the training process by means of employing a two-path activation-map-based classifier framework. In particular, we develop an activation-map-based framework to judicially control the attention map in the perception branch by adding a two-feature extractor so that better attention weights can be distributed to induce improved performance. The experimental results indicate that our method surpasses the performance of the existing deep learning models based on weakly supervised object localization. The experimental results show that the proposed method achieves the best performance, with 75.21% Top-1 classification accuracy and 55.15% Top-1 localization accuracy on the CUB-200-2011 dataset.