Enhancing semantics with multi‐objective reinforcement learning for video description
Abstract: Video description is challenging due to the high complexity of translating visual content into language. In most popular attention‐based pipelines for this task, visual features and previously generated words are usually concatenated as a vector to predict the current word. However, the err...
Main Authors: | , , , |
---|---|
Format: | article |
Language: | EN |
Published: | Wiley, 2021 |
Online Access: | https://doaj.org/article/1f50686212ac4f60b204af786657d346 |