Enhancing semantics with multi‐objective reinforcement learning for video description

Abstract Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Qinyu Li, Longyu Yang, Pengjie Tang, Hanli Wang
Formato: article
Lenguaje:EN
Publicado: Wiley 2021
Materias:
Acceso en línea:https://doaj.org/article/1f50686212ac4f60b204af786657d346
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!