What Does a Language-And-Vision Transformer See: The Impact of Semantic Information on Visual Representations

Neural networks have proven to be very successful in automatically capturing the composition of language and different structures across a range of multi-modal tasks. Thus, an important question to investigate is how neural networks learn and organise such structures. Numerous studies have examined...

Bibliographic Details
Main authors:	Nikolai Ilinykh, Simon Dobnik
Format:	Article
Language:	EN
Published:	Frontiers Media S.A. 2021
Subjects:	language-and-vision; multi-modality; transformer; representation learning; effect of language on vision; self-attention; Electronic computers. Computer science; QA75.5-76.95
Online access:	https://doaj.org/article/e246edf91b36449eace5eac40210015e
