Enhancing semantics with multi‐objective reinforcement learning for video description
Abstract Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...
Guardado en:
Autores principales: | , , , |
---|---|
Formato: | article |
Lenguaje: | EN |
Publicado: |
Wiley
2021
|
Materias: | |
Acceso en línea: | https://doaj.org/article/1f50686212ac4f60b204af786657d346 |
Etiquetas: |
Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
|