Enhancing semantics with multi‐objective reinforcement learning for video description
Abstract: Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...
Main Authors: | Qinyu Li, Longyu Yang, Pengjie Tang, Hanli Wang |
---|---|
Format: | article |
Language: | EN |
Published: | Wiley, 2021 |
Online Access: | https://doaj.org/article/1f50686212ac4f60b204af786657d346 |
Similar Items
- A fast learning approach for autonomous navigation using a deep reinforcement learning method
  by: Muhammad Mudassir Ejaz, et al.
  Published: (2021)
- Multi‐view facial action unit detection via deep feature enhancement
  by: Chuangao Tang, et al.
  Published: (2021)
- Meta-Seg: A Generalized Meta-Learning Framework for Multi-Class Few-Shot Semantic Segmentation
  by: Zhiying Cao, et al.
  Published: (2019)
- Fusion of semantic and appearance features for loop‐closure detection in a dynamic environment
  by: Yan Xu, et al.
  Published: (2021)
- A Decision Control Method for Autonomous Driving Based on Multi-Task Reinforcement Learning
  by: Yingfeng Cai, et al.
  Published: (2021)