Multilabel Video Classification Model of Navigation Mark’s Lights Based on Deep Learning

At night, buoys and other navigation marks disappear to be replaced by fixed or flashing lights. Navigation marks are seen as a set of lights in various colors rather than their familiar outline. Deciphering that the meaning of the lights is a burden to navigators, it is also a new challenging resea...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Xu Han, Mingyang Pan, Haipeng Ge, Shaoxi Li, Jingfeng Hu, Lining Zhao, Yu Li
Formato: article
Lenguaje:EN
Publicado: Hindawi Limited 2021
Materias:
Acceso en línea:https://doaj.org/article/5bb8b6c740f64191b0573bac0156eceb
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:At night, buoys and other navigation marks disappear to be replaced by fixed or flashing lights. Navigation marks are seen as a set of lights in various colors rather than their familiar outline. Deciphering that the meaning of the lights is a burden to navigators, it is also a new challenging research direction of intelligent sensing of navigation environment. The study studied initiatively the intelligent recognition of lights on navigation marks at night based on multilabel video classification methods. To capture effectively the characteristics of navigation mark’s lights, including both color and flashing phase, three different multilabel classification models based on binary relevance, label power set, and adapted algorithm were investigated and compared. According to the experiment’s results performed on a data set with 8000 minutes video, the model based on binary relevance, named NMLNet, has highest accuracy about 99.23% to classify 9 types of navigation mark’s lights. It also has the fastest computation speed with least network parameters. In the NMLNet, there are two branches for the classifications of color and flashing, respectively, and for the flashing classification, an improved MobileNet-v2 was used to capture the brightness characteristic of lights in each video frame, and an LSTM is used to capture the temporal dynamics of lights. Aiming to run on mobile devices on vessel, the MobileNet-v2 was used as backbone, and with the improvement of spatial attention mechanism, it achieved the accuracy near Resnet-50 while keeping its high speed.