An Approach to Detect Anomaly in Video Using Deep Generative Network


Bibliographic Details
Main Authors: Savath Saypadith, Takao Onoye
Format: article
Language: EN
Published: IEEE 2021
Online Access: https://doaj.org/article/106c733f24d94516a46ccfc178e106ca
Description
Summary: Anomaly detection in video has recently gained attention due to its importance in intelligent surveillance systems. Even though state-of-the-art methods achieve competitive performance on benchmark datasets, the trade-off between computational resources and detection accuracy should be considered. In this paper, we present a framework to detect anomalies in video. We propose a "multi-scale U-Net" network architecture for unsupervised anomaly detection in video based on a generative adversarial network (GAN) structure. Shortcut Inception Modules (SIMs) and residual skip connections are employed in the generator network to improve its training and inference behavior. Asymmetric convolutions are applied in place of traditional convolution layers to decrease the number of training parameters without a penalty in detection accuracy. In the training phase, the generator network is trained to generate normal events, making the generated image as similar as possible to the ground truth, and the multi-scale U-Net preserves useful image features that would otherwise be lost through the convolution operations. The generator is trained by minimizing the reconstruction error on normal data, and this reconstruction error is then used as an indicator of anomalies in the testing phase. Our proposed framework has been evaluated on three benchmark datasets: UCSD Pedestrian, CUHK Avenue, and ShanghaiTech. The proposed framework surpasses state-of-the-art learning-based methods on all these datasets, achieving 95.7%, 86.9%, and 73.0% AUC, respectively. Moreover, the numbers of training and testing parameters in our framework are reduced compared with the baseline network architecture, while detection accuracy is still improved.
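
The abstract states that asymmetric convolutions replace traditional convolution layers to reduce the parameter count. A common realization of this idea factorizes a k x k kernel into a 1 x k convolution followed by a k x 1 convolution; the minimal PyTorch sketch below illustrates that factorization only. The module name AsymmetricConv2d, the channel sizes, and the exact layer arrangement are illustrative assumptions and are not taken from the paper.

# Minimal sketch: factorizing a k x k convolution into 1 x k then k x 1.
# Illustrative only; the paper's actual layer layout may differ.
import torch
import torch.nn as nn

class AsymmetricConv2d(nn.Module):
    """Replaces one k x k convolution with a 1 x k / k x 1 pair to save parameters."""

    def __init__(self, channels: int, k: int = 3):
        super().__init__()
        pad = k // 2
        # (1, k) kernel needs padding only along the width dimension.
        self.conv_1xk = nn.Conv2d(channels, channels, kernel_size=(1, k), padding=(0, pad))
        # (k, 1) kernel needs padding only along the height dimension.
        self.conv_kx1 = nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(pad, 0))
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.conv_kx1(self.act(self.conv_1xk(x))))

if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)                 # dummy feature map
    asym = AsymmetricConv2d(64, k=3)
    standard = nn.Conv2d(64, 64, 3, padding=1)     # conventional 3 x 3 layer
    print(asym(x).shape)                           # spatial size is preserved
    n_asym = sum(p.numel() for p in asym.parameters())
    n_std = sum(p.numel() for p in standard.parameters())
    print(n_asym, "<", n_std)                      # 24704 < 36928 parameters

The saving is largest when the input and output channel counts are equal (as above) or when the kernel is large; the factorization keeps the receptive field of the original k x k layer.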
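The train/test logic in the abstract (minimize reconstruction error on normal frames, then score test frames by that error) can be summarized with the sketch below. The generator argument stands in for the paper's multi-scale U-Net, the adversarial (discriminator) part of the GAN training is omitted, and the per-clip min-max normalization of scores is an assumption borrowed from common practice in this line of work, not a detail given in the abstract.

# Minimal sketch of reconstruction-error training and anomaly scoring.
import torch
import torch.nn as nn

def train_step(generator: nn.Module, optimizer: torch.optim.Optimizer,
               frames: torch.Tensor) -> float:
    """One optimization step on a batch of *normal* frames (N, C, H, W)."""
    generator.train()
    optimizer.zero_grad()
    recon = generator(frames)
    loss = nn.functional.mse_loss(recon, frames)   # reconstruction error only
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def anomaly_scores(generator: nn.Module, frames: torch.Tensor) -> torch.Tensor:
    """Per-frame score in [0, 1]: higher means the frame reconstructs poorly."""
    generator.eval()
    recon = generator(frames)
    err = ((recon - frames) ** 2).mean(dim=(1, 2, 3))          # per-frame MSE
    return (err - err.min()) / (err.max() - err.min() + 1e-8)  # min-max normalize

if __name__ == "__main__":
    # Toy stand-in for the multi-scale U-Net, just to exercise the functions.
    toy = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 3, 3, padding=1))
    opt = torch.optim.Adam(toy.parameters(), lr=1e-3)
    print(train_step(toy, opt, torch.rand(4, 3, 64, 64)))
    print(anomaly_scores(toy, torch.rand(8, 3, 64, 64)))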