Speed/accuracy trade-off between the habitual and the goal-directed processes.

Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparat...
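The full abstract for this record describes a normative arbitration between a fast but inflexible habitual system and a slow but flexible goal-directed system, balancing search time against accuracy. A minimal sketch of one way to read that trade-off follows. The decision rule, the function name `choose_system`, and all three parameters are illustrative assumptions, not the authors' published model: deliberation is chosen only when its expected gain in accuracy exceeds the reward forgone while deliberating.

```python
def choose_system(deliberation_gain: float,
                  avg_reward_rate: float,
                  deliberation_time: float) -> str:
    """Pick which system controls the response (hypothetical rule).

    deliberation_gain: assumed expected improvement in outcome value
        from consulting the slow, flexible goal-directed system.
    avg_reward_rate: assumed average reward per unit time.
    deliberation_time: assumed extra time the goal-directed search takes.
    """
    # Opportunity cost of pausing to deliberate: reward forgone while thinking.
    opportunity_cost = avg_reward_rate * deliberation_time
    if deliberation_gain > opportunity_cost:
        return "goal-directed"
    return "habitual"


if __name__ == "__main__":
    # Early in learning: cached values are unreliable, so deliberation pays off.
    print(choose_system(0.5, 1.0, 0.2))   # goal-directed
    # After extensive training: little left to gain, so respond habitually.
    print(choose_system(0.05, 1.0, 0.2))  # habitual
    # A higher average reward rate makes deliberation time more costly.
    print(choose_system(0.5, 4.0, 0.2))   # habitual
```

Under this reading, a higher average reward rate raises the cost of every moment spent deliberating, which pushes control toward the faster habitual system and shortens reaction time. That is in line with the role the abstract attributes to the average reward signal.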

Bibliographic Details
Main Authors:	Mehdi Keramati, Amir Dezfouli, Payam Piray
Format:	article
Language:	EN
Published:	Public Library of Science (PLoS) 2011
Subjects:	Biology (General) QH301-705.5
Online Access:	https://doaj.org/article/1fa231b1583344fca7d94b8c5fea176d

id	oai:doaj.org-article:1fa231b1583344fca7d94b8c5fea176d
record_format	dspace
spelling	oai:doaj.org-article:1fa231b1583344fca7d94b8c5fea176d | 2021-11-18T05:50:31Z | Speed/accuracy trade-off between the habitual and the goal-directed processes. | 1553-734X | 1553-7358 | 10.1371/journal.pcbi.1002055 | https://doaj.org/article/1fa231b1583344fca7d94b8c5fea176d | 2011-05-01T00:00:00Z | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21637741/pdf/?tool=EBI | https://doaj.org/toc/1553-734X | https://doaj.org/toc/1553-7358 | Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparative advantages that they have at different stages of learning. In this paper, we assume that the goal-directed system is behaviourally flexible, but slow in choice selection. The habitual system, in contrast, is fast in responding, but inflexible in adapting its behavioural strategy to new conditions. Based on these assumptions and using the computational theory of reinforcement learning, we propose a normative model for arbitration between the two processes that makes an approximately optimal balance between search-time and accuracy in decision making. Behaviourally, the model can explain experimental evidence on behavioural sensitivity to outcome at the early stages of learning, but insensitivity at the later stages. It also explains that when two choices with equal incentive values are available concurrently, the behaviour remains outcome-sensitive, even after extensive training. Moreover, the model can explain choice reaction time variations during the course of learning, as well as the experimental observation that as the number of choices increases, the reaction time also increases. Neurobiologically, by assuming that phasic and tonic activities of midbrain dopamine neurons carry the reward prediction error and the average reward signals used by the model, respectively, the model predicts that whereas phasic dopamine indirectly affects behaviour through reinforcing stimulus-response associations, tonic dopamine can directly affect behaviour through manipulating the competition between the habitual and the goal-directed systems and thus, affect reaction time. | Mehdi Keramati | Amir Dezfouli | Payam Piray | Public Library of Science (PLoS) | article | Biology (General) | QH301-705.5 | EN | PLoS Computational Biology, Vol 7, Iss 5, p e1002055 (2011)
institution	DOAJ
collection	DOAJ
language	EN
topic	Biology (General) QH301-705.5
spellingShingle	Biology (General) QH301-705.5 Mehdi Keramati Amir Dezfouli Payam Piray Speed/accuracy trade-off between the habitual and the goal-directed processes.
description	Instrumental responses are hypothesized to be of two kinds: habitual and goal-directed, mediated by the sensorimotor and the associative cortico-basal ganglia circuits, respectively. The existence of the two heterogeneous associative learning mechanisms can be hypothesized to arise from the comparative advantages that they have at different stages of learning. In this paper, we assume that the goal-directed system is behaviourally flexible, but slow in choice selection. The habitual system, in contrast, is fast in responding, but inflexible in adapting its behavioural strategy to new conditions. Based on these assumptions and using the computational theory of reinforcement learning, we propose a normative model for arbitration between the two processes that makes an approximately optimal balance between search-time and accuracy in decision making. Behaviourally, the model can explain experimental evidence on behavioural sensitivity to outcome at the early stages of learning, but insensitivity at the later stages. It also explains that when two choices with equal incentive values are available concurrently, the behaviour remains outcome-sensitive, even after extensive training. Moreover, the model can explain choice reaction time variations during the course of learning, as well as the experimental observation that as the number of choices increases, the reaction time also increases. Neurobiologically, by assuming that phasic and tonic activities of midbrain dopamine neurons carry the reward prediction error and the average reward signals used by the model, respectively, the model predicts that whereas phasic dopamine indirectly affects behaviour through reinforcing stimulus-response associations, tonic dopamine can directly affect behaviour through manipulating the competition between the habitual and the goal-directed systems and thus, affect reaction time.
format	article
author	Mehdi Keramati Amir Dezfouli Payam Piray
author_facet	Mehdi Keramati Amir Dezfouli Payam Piray
author_sort	Mehdi Keramati
title	Speed/accuracy trade-off between the habitual and the goal-directed processes.
title_short	Speed/accuracy trade-off between the habitual and the goal-directed processes.
title_full	Speed/accuracy trade-off between the habitual and the goal-directed processes.
title_fullStr	Speed/accuracy trade-off between the habitual and the goal-directed processes.
title_full_unstemmed	Speed/accuracy trade-off between the habitual and the goal-directed processes.
title_sort	speed/accuracy trade-off between the habitual and the goal-directed processes.
publisher	Public Library of Science (PLoS)
publishDate	2011
url	https://doaj.org/article/1fa231b1583344fca7d94b8c5fea176d
work_keys_str_mv	AT mehdikeramati speedaccuracytradeoffbetweenthehabitualandthegoaldirectedprocesses AT amirdezfouli speedaccuracytradeoffbetweenthehabitualandthegoaldirectedprocesses AT payampiray speedaccuracytradeoffbetweenthehabitualandthegoaldirectedprocesses
_version_	1718424773995790336
