Crowd activity recognition in live video streaming via 3D‐ResNet and region graph convolution network

Bibliographic Details
Main authors: Junpeng Kang, Jing Zhang, Wensheng Li, Li Zhuo
Format: article
Language: EN
Published: Wiley 2021
Online access: https://doaj.org/article/2c1e125c055446439dcfecf525920db4
Description
Summary: Since the rise of we‐media, the live video industry has grown explosively. For large‐scale live video streaming, especially streams containing crowd events that may have a great social impact, identifying and supervising crowd activity effectively is of great value to the healthy development of the industry. Existing crowd activity recognition methods mainly use visual information and rarely exploit the correlations between crowd content or external knowledge. Therefore, a crowd activity recognition method for live video streaming is proposed, based on 3D‐ResNet and a regional graph convolution network (ReGCN). (1) After deep spatiotemporal features are extracted from the live video stream with 3D‐ResNet, region proposals are generated by a region proposal network. (2) A weakly supervised ReGCN is constructed by taking the region proposals as graph nodes and their correlations as edges. (3) Crowd activity in the live video stream is recognised by combining the output of the ReGCN, the deep spatiotemporal features, and the crowd motion intensity as external knowledge. Four experiments are conducted on the public Collective Activity Extended dataset and on a real‐world dataset, BJUT‐CAD. The competitive results demonstrate that the method can effectively recognise crowd activity in live video streaming.
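
Steps (2) and (3) of the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cosine-similarity edges, the single graph convolution layer, the mean pooling, the plain concatenation used for fusion, and all function and parameter names (`build_adjacency`, `gcn_layer`, `classify_clip`, `w_gcn`, `w_cls`) are assumptions made for illustration only. The record does not say how the correlations or the crowd motion intensity are actually computed.

```python
import numpy as np


def build_adjacency(features: np.ndarray) -> np.ndarray:
    """Build a symmetrically normalized adjacency matrix over region proposals.

    Assumption: edge weights are non-negative cosine similarities between
    region features; the source only says edges encode "correlations".
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = np.clip(unit @ unit.T, 0.0, None)  # drop negative similarities
    np.fill_diagonal(sim, 1.0)               # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(sim.sum(axis=1))
    return sim * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(adj_norm: np.ndarray, features: np.ndarray,
              weight: np.ndarray) -> np.ndarray:
    """One graph convolution: ReLU(A_norm @ X @ W)."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


def classify_clip(region_feats: np.ndarray, global_feat: np.ndarray,
                  motion_intensity: float, w_gcn: np.ndarray,
                  w_cls: np.ndarray) -> np.ndarray:
    """Fuse graph output, clip-level features, and a motion-intensity scalar.

    Assumption: fusion is a simple concatenation followed by a linear
    classifier and softmax.
    """
    adj = build_adjacency(region_feats)
    hidden = gcn_layer(adj, region_feats, w_gcn)
    graph_feat = hidden.mean(axis=0)  # pool node features into one vector
    fused = np.concatenate([graph_feat, global_feat, [motion_intensity]])
    logits = fused @ w_cls
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()


# Usage with random stand-ins for real 3D-ResNet and region-proposal features.
rng = np.random.default_rng(0)
regions = rng.normal(size=(5, 16))        # 5 region proposals, 16-dim each
clip_feat = rng.normal(size=32)           # clip-level spatiotemporal feature
w_gcn = rng.normal(size=(16, 8)) * 0.1
w_cls = rng.normal(size=(8 + 32 + 1, 4)) * 0.1  # 4 hypothetical classes
probs = classify_clip(regions, clip_feat, 0.7, w_gcn, w_cls)
```

In this sketch `probs` is a probability vector over four hypothetical activity classes. In a trained system the weights would be learned, and the region features would come from the 3D‐ResNet backbone and the region proposal network rather than from a random generator.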