Coexistence of reward and unsupervised learning during the operant conditioning of neural firing rates.
A fundamental goal of neuroscience is to understand how cognitive processes, such as operant conditioning, are performed by the brain. Typical and well-studied examples of operant conditioning, in which the firing rates of individual cortical neurons in monkeys are increased using rewards, provide a...
Main authors: | Robert R Kerr, David B Grayden, Doreen A Thomas, Matthieu Gilson, Anthony N Burkitt |
---|---|
Format: | article |
Language: | EN |
Published: | Public Library of Science (PLoS), 2014 |
Subjects: | Medicine (R); Science (Q) |
Online access: | https://doaj.org/article/75a7409e8d204eafa2be0aaf25ff03aa |
id |
oai:doaj.org-article:75a7409e8d204eafa2be0aaf25ff03aa |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:75a7409e8d204eafa2be0aaf25ff03aa; 2021-11-18T08:35:35Z; ISSN 1932-6203; DOI 10.1371/journal.pone.0087123; https://doaj.org/article/75a7409e8d204eafa2be0aaf25ff03aa; 2014-01-01T00:00:00Z; https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/24475240/?tool=EBI; https://doaj.org/toc/1932-6203; PLoS ONE, Vol 9, Iss 1, p e87123 (2014) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Medicine (R); Science (Q) |
description |
A fundamental goal of neuroscience is to understand how cognitive processes, such as operant conditioning, are performed by the brain. Typical and well-studied examples of operant conditioning, in which the firing rates of individual cortical neurons in monkeys are increased using rewards, provide an opportunity for insight into this. Studies of reward-modulated spike-timing-dependent plasticity (RSTDP), and of other models such as R-max, have reproduced this learning behavior, but they have assumed that no unsupervised learning is present (i.e., no learning occurs without, or independently of, rewards). We show that these models cannot elicit firing rate reinforcement while exhibiting both reward learning and ongoing, stable unsupervised learning. To resolve this issue, we propose a new RSTDP model of synaptic plasticity based upon the observed effects that dopamine has on long-term potentiation and depression (LTP and LTD). We show, both analytically and through simulations, that our new model can exhibit unsupervised learning and lead to firing rate reinforcement. This requires that the strengthening of LTP by the reward signal is greater than the strengthening of LTD and that the reinforced neuron exhibits irregular firing. We show the robustness of our findings to spike-timing correlations, to the synaptic weight dependence that is assumed, and to changes in the mean reward. We also consider our model in the differential reinforcement of two nearby neurons. Our model aligns more strongly with experimental studies than previous models and makes testable predictions for future experiments. |
format |
article |
author |
Robert R Kerr; David B Grayden; Doreen A Thomas; Matthieu Gilson; Anthony N Burkitt |
title |
Coexistence of reward and unsupervised learning during the operant conditioning of neural firing rates. |
publisher |
Public Library of Science (PLoS) |
publishDate |
2014 |
url |
https://doaj.org/article/75a7409e8d204eafa2be0aaf25ff03aa |