Reinforcement learning on slow features of high-dimensional input streams.

Humans and animals are able to learn complex behaviors based on a massive stream of sensory information from different modalities. Early animal studies have identified learning mechanisms that are based on reward and punishment such that animals tend to avoid actions that lead to punishment whereas...

Bibliographic Details
Main authors: Robert Legenstein, Niko Wilbert, Laurenz Wiskott
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2010
Subjects: Biology (General)
Online access: https://doaj.org/article/a018376e604147de8c414935b23d132b
id oai:doaj.org-article:a018376e604147de8c414935b23d132b
record_format dspace
spelling oai:doaj.org-article:a018376e604147de8c414935b23d132b
2021-11-18T05:49:22Z
Reinforcement learning on slow features of high-dimensional input streams.
1553-734X
1553-7358
10.1371/journal.pcbi.1000894
https://doaj.org/article/a018376e604147de8c414935b23d132b
2010-08-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/20808883/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Robert Legenstein
Niko Wilbert
Laurenz Wiskott
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 6, Iss 8 (2010)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description Humans and animals are able to learn complex behaviors based on a massive stream of sensory information from different modalities. Early animal studies have identified learning mechanisms that are based on reward and punishment such that animals tend to avoid actions that lead to punishment whereas rewarded actions are reinforced. However, most algorithms for reward-based learning are only applicable if the dimensionality of the state-space is sufficiently small or its structure is sufficiently simple. Therefore, the question arises how the problem of learning on high-dimensional data is solved in the brain. In this article, we propose a biologically plausible generic two-stage learning system that can directly be applied to raw high-dimensional input streams. The system is composed of a hierarchical slow feature analysis (SFA) network for preprocessing and a simple neural network on top that is trained based on rewards. We demonstrate by computer simulations that this generic architecture is able to learn quite demanding reinforcement learning tasks on high-dimensional visual input streams in a time that is comparable to the time needed when an explicit highly informative low-dimensional state-space representation is given instead of the high-dimensional visual input. The learning speed of the proposed architecture in a task similar to the Morris water maze task is comparable to that found in experimental studies with rats. This study thus supports the hypothesis that slowness learning is one important unsupervised learning principle utilized in the brain to form efficient state representations for behavioral learning.
format article
author Robert Legenstein
Niko Wilbert
Laurenz Wiskott
title Reinforcement learning on slow features of high-dimensional input streams.
publisher Public Library of Science (PLoS)
publishDate 2010
url https://doaj.org/article/a018376e604147de8c414935b23d132b