Enhancing semantics with multi‐objective reinforcement learning for video description
Abstract Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...
Main Authors: | , , , |
---|---|
Format: | article |
Language: | EN |
Published: | Wiley, 2021 |
Subjects: | |
Online Access: | https://doaj.org/article/1f50686212ac4f60b204af786657d346 |