The basal ganglia (BG) are a set of subcortical nuclei
(Figure 1). They receive cortical and thalamic input mainly through the
striatum (i.e., the putamen and the caudate nucleus), which forms the input
layer of the BG. From there, information flows through the globus pallidus to the
major output layer, the substantia nigra pars reticulata (SNr). There also
exists an internal loop between the globus pallidus and the subthalamic
nucleus. The substantia nigra pars compacta (SNc) sends dopaminergic
projections back to the striatum. The input layer of the BG is rich in spiny
neurons (SPs), which receive massive cortical (C) input. The SPs also receive
afferents from dopaminergic (DA) neurons in the SNc, which synapse onto the SPs
in the striatum. Inhibitory dynamics are also present in the BG: the activity of
the SPs inhibits the DA neurons in the SNc. The subthalamic side loop, by
contrast, disinhibits the DA neurons, which can result in excitation of the DA
neurons proportional to the input from that loop and a primary reinforcement
signal.

Figure 1. The basal ganglia–thalamocortical
connections. The striatum is the main input structure of the basal ganglia. It
is divided into dorsal striatum (most of the caudate and putamen) and ventral
striatum (nucleus accumbens and the ventromedial parts of the caudate and
putamen). The striatum is innervated by the entire cerebral cortex, and
projects to the output nuclei of the basal ganglia: the internal segment of the globus pallidus (GPi),
the substantia nigra pars reticulata (SNr) and the ventral pallidum (VP). These
nuclei project in turn to the ventral anterior (VA) and mediodorsal (MD)
thalamic nuclei, which are reciprocally connected with the frontal cortex.
Information from the striatum can also reach the output nuclei via the
‘indirect pathway’, namely, via striatal projections to the external segment of
the globus pallidus (GPe), GPe projections to the subthalamic nucleus (STN),
and the latter's projections to GPi/SNr/VP. The striatum also projects to
dopaminergic neurons in the substantia nigra pars compacta (SNc), retrorubral
area (RRA) and ventral tegmental area (VTA). Note that this scheme does
not show two important principles of organization of the depicted
projections. One is the compartmental organization of the dorsal striatum into
striosomes (patches, in rats) and matrix. The other is the topographical
organization of the projections between the different levels into several
‘streams’ which form several basal ganglia–thalamocortical circuits.

The role of the BG in learning has been the focus of many
studies. The firing patterns of the DA neurons appear to reflect information
about the timing of delayed rewards (relative to the reward-predicting stimulus), as seen
in the precisely timed depression of DA firing when an expected reward is
omitted [1,2]. It is also known that DA modulation of
spike-timing-dependent synaptic plasticity can reinforce firing patterns
occurring on a millisecond timescale even if the reward arrives
seconds later [3]. This pattern of DA activity is very similar to
the signal generated by computational algorithms of reinforcement learning (RL), in particular temporal difference (TD) models [4,5]. In the context of basal ganglia modeling, TD
learning is mainly used in the framework of Actor–Critic models [6-9]. In such
mappings of the Actor–Critic implementation of TD learning onto the BG, the Actor is
related to the action-selection function of the BG, and the Critic is related to the
RL circuit (Figure 2). The dopamine signal is thus treated as the
teaching signal that alters the Actor's responses to maximize future reward.
The Actor learns to perform actions that maximize the
expected reward, whose value is estimated at each step by the Critic [6]. The
Critic learns adaptively to estimate future rewards from the sensory stimuli and the actions of
the Actor. The adaptive Critic
applies the TD learning rule [4], in which the error between two adjacent
predictions (the TD error) is used to update the Critic's weights. The analogy
between the basal ganglia and Actor–Critic models rests on the strong
similarity between DA neuron activity and the TD prediction error signal, and
between DA-dependent long-term synaptic plasticity in the striatum [10, 11] and learning guided by a prediction error signal in
the Actor [12].
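
To make the Actor–Critic idea concrete, here is a minimal sketch in Python. It is a toy illustration under stated assumptions, not the model of any of the cited papers: a cue is followed by a choice between two actions, and only action 1 leads, after a short delay, to a reward of 1. The Critic keeps one value per time step of the delay (a simple tabular stand-in for the Critic's weights), the TD error is delta = r + gamma * V(next) - V(current), and the same delta trains both the Critic and the Actor's action preferences. All names (`train`, `probe_td_errors`, `v_delay`, the learning rates) are hypothetical and chosen for this example only.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into choice probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(n_trials=3000, delay=4, alpha_critic=0.1, alpha_actor=0.1,
          gamma=1.0, seed=0):
    """Train a toy Actor-Critic on a cue -> choice -> delayed-reward task.

    Assumption for this sketch: only action 1 is rewarded (reward 1.0 on
    the last step of the delay); action 0 is never rewarded.
    """
    rng = random.Random(seed)
    prefs = [0.0, 0.0]                            # Actor: preferences at the cue
    v_cue = 0.0                                   # Critic: value of the cue state
    v_delay = [[0.0] * delay for _ in range(2)]   # Critic: value per branch and step

    for _ in range(n_trials):
        probs = softmax(prefs)
        action = 0 if rng.random() < probs[0] else 1

        # Transition from the cue to the first delay step (no reward yet).
        delta = gamma * v_delay[action][0] - v_cue
        v_cue += alpha_critic * delta             # Critic update
        prefs[action] += alpha_actor * delta      # Actor update uses the same error

        # Walk through the delay; the reward arrives on the last step.
        for t in range(delay):
            last = (t == delay - 1)
            reward = 1.0 if (last and action == 1) else 0.0
            v_next = 0.0 if last else v_delay[action][t + 1]
            delta = reward + gamma * v_next - v_delay[action][t]
            v_delay[action][t] += alpha_critic * delta

    return prefs, v_cue, v_delay


def probe_td_errors(v_delay, action=1, gamma=1.0, omit_reward=False):
    """Return the TD error at each delay step on a probe trial (no learning)."""
    delay = len(v_delay[action])
    errors = []
    for t in range(delay):
        last = (t == delay - 1)
        reward = 1.0 if (last and action == 1 and not omit_reward) else 0.0
        v_next = 0.0 if last else v_delay[action][t + 1]
        errors.append(reward + gamma * v_next - v_delay[action][t])
    return errors


if __name__ == "__main__":
    prefs, v_cue, v_delay = train()
    print("choice probabilities:", softmax(prefs))
    print("TD errors, reward delivered:", probe_td_errors(v_delay))
    print("TD errors, reward omitted:  ",
          probe_td_errors(v_delay, omit_reward=True))
```

After training, the Actor prefers the rewarded action, the TD error on a fully predicted reward is close to zero, and omitting the reward produces a negative TD error at exactly the time step where the reward was expected. That negative error is the counterpart, in this toy setting, of the timed depression of DA firing described above. The sketch leaves out the positive response to the cue itself, which would need an additional pre-cue state.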

1. Schultz, W., Dickinson, A. (2000). Neuronal coding of prediction errors. Annu. Rev. Neurosci., 23, 473–500.
2. Schultz, W., Tremblay, L., Hollerman, J. R. (2000). Reward processing in primate orbitofrontal cortex and basal ganglia. Cereb. Cortex, 10, 272–283.
3. Izhikevich, E. M. (2007). Solving the distal reward problem through linkage of STDP and dopamine signalling. Cereb. Cortex, 17, 2443–2452.
4. Sutton, R. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9–44.
5. Suri, R. E. (2002). TD models of reward predictive responses in dopamine neurons. Neural Networks, 15, PII: S0893-6080(02)00046-1.
6. Barto, A. G. (1995). Adaptive critics and the basal ganglia. In Houk, J. C., Davis, J. L., Beiser, D. G. (Eds.), Models of Information Processing in the Basal Ganglia. Cambridge, MA: MIT Press.
7. Houk, J. C., Adams, J. L., Barto, A. G. (1995). A model of how the basal ganglia generates and uses neural signals that predict reinforcement. In Houk, J. C., Davis, J. L., Beiser, D. G. (Eds.), Models of Information Processing in the Basal Ganglia. Cambridge, MA: MIT Press, 249–274.
8. Montague, P. R., Dayan, P., Sejnowski, T. J. (1996). A framework for mesencephalic dopamine systems based on predictive Hebbian learning. J. Neurosci., 16(5), 1936–1947.
9. Schultz, W., Dayan, P., Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275, 1593–1599.
10. Calabresi, P., Gubellini, P., Centonze, D., Picconi, B., Bernardi, G., Chergui, K., Svenningsson, P., Fienberg, A. A., Greengard, P. (2000). Dopamine and cAMP-regulated phosphoprotein 32 kDa controls both striatal long-term depression and long-term potentiation, opposing forms of synaptic plasticity. J. Neurosci., 20, 8443–8451.
11. Wickens, J. R., Begg, A. J., Arbuthnott, G. W. (1996). Dopamine reverses the depression of rat corticostriatal synapses which normally follows high-frequency stimulation of cortex in vitro. Neuroscience, 70, 1–5.
12. Joel, D., Niv, Y., Ruppin, E. (2002). Actor-critic models of the basal ganglia: new anatomical and computational perspectives. Neural Networks, 15(4-6), 535–547.