Motor Systems
Motor cortex, basal ganglia, cerebellum, central pattern generators, and the computational principles of movement control
The Neuroscience of Movement
Motor control is the ultimate output of neural computation. The motor system must solve the problem of transforming high-level goals into precise patterns of muscle activation, coordinating dozens of muscles across multiple joints in real time. This requires hierarchical planning (motor cortex), action selection (basal ganglia), online error correction (cerebellum), and rhythmic pattern generation (spinal CPGs).
This chapter explores the computational architecture of the motor system, from cortical population coding of movement direction to the reinforcement learning circuits of the basal ganglia, the adaptive filtering of the cerebellum, and the oscillatory dynamics of central pattern generators.
1. Motor Cortex and Population Coding
Primary motor cortex (M1) contains a topographic map of the body — the motor homunculus discovered by Penfield and Boldrey (1937). However, individual M1 neurons do not simply control individual muscles. Georgopoulos et al. (1982) showed that each neuron is broadly tuned to movement direction, with the population collectively encoding the intended movement vector.
Derivation 1: Population Vector Algorithm
Each M1 neuron $i$ has a preferred direction $\mathbf{d}_i$ and fires according to a cosine tuning model. For a movement in direction $\mathbf{m}$:
$$r_i = b_i + k_i \cos(\theta_m - \theta_i) = b_i + k_i \, \hat{\mathbf{d}}_i \cdot \hat{\mathbf{m}}$$
The population vector is the weighted sum of preferred directions:
$$\mathbf{P} = \sum_{i=1}^{N} (r_i - b_i) \, \hat{\mathbf{d}}_i$$
Substituting the tuning model and using the identity for uniformly distributed preferred directions:
$$\mathbf{P} = \sum_{i=1}^{N} k_i (\hat{\mathbf{d}}_i \cdot \hat{\mathbf{m}}) \hat{\mathbf{d}}_i = \frac{N \bar{k}}{2} \hat{\mathbf{m}}$$
For uniformly distributed preferred directions and equal gains $k_i = k$, the population vector points in the true movement direction with magnitude proportional to $N$. This result relies on $\sum_i \hat{d}_{ix}\hat{d}_{iy} = 0$ and $\sum_i \hat{d}_{ix}^2 = N/2$ for 2D uniform distributions.
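The population vector algorithm can be sketched numerically with simulated cosine-tuned units. All parameter values below (cell count, baseline, gain, noise level) are illustrative, not taken from the original recordings:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                                         # number of simulated M1 neurons
theta_pref = rng.uniform(0, 2 * np.pi, N)       # uniform preferred directions
d_hat = np.stack([np.cos(theta_pref), np.sin(theta_pref)], axis=1)

b, k = 10.0, 5.0                                # baseline rate and tuning gain
theta_m = np.deg2rad(35)                        # true movement direction
m_hat = np.array([np.cos(theta_m), np.sin(theta_m)])

# Cosine tuning: r_i = b + k * (d_i . m), plus trial-to-trial noise
rates = b + k * (d_hat @ m_hat) + rng.normal(0, 1.0, N)

# Population vector: baseline-subtracted rates weight the preferred directions
P = ((rates - b)[:, None] * d_hat).sum(axis=0)
theta_decoded = np.arctan2(P[1], P[0])

print(f"true direction    : {np.rad2deg(theta_m):.1f} deg")
print(f"decoded direction : {np.rad2deg(theta_decoded):.1f} deg")
```

With a few hundred uniformly tuned cells the decoded angle lands within a degree or two of the true direction, and the residual error shrinks as more neurons are added.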
2. Basal Ganglia and Action Selection
The basal ganglia are a group of subcortical nuclei critically involved in action selection, habit formation, and reward-based learning. The direct pathway (cortex → striatum → GPi/SNr → thalamus) facilitates movement by disinhibiting thalamocortical circuits, while the indirect pathway (cortex → striatum → GPe → STN → GPi/SNr) suppresses competing actions.
Derivation 2: Reinforcement Learning Model of Striatal Plasticity
Dopamine signals from the substantia nigra pars compacta (SNc) encode reward prediction errors (RPE), as described by the temporal difference (TD) learning rule. The RPE at time $t$ is:
$$\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$$
where $r_t$ is reward, $\gamma$ is the discount factor, and $V(s)$ is the value function. Striatal synaptic weights update according to:
$$\Delta w_{ij} = \alpha \, \delta_t \, e_{ij}(t)$$
where $e_{ij}(t)$ is an eligibility trace capturing the history of pre-post coincidence:
$$e_{ij}(t+1) = \lambda \gamma \, e_{ij}(t) + x_i(t) \cdot y_j(t)$$
Direct-pathway D1 neurons undergo LTP when $\delta > 0$ (dopamine burst), strengthening rewarded actions; indirect-pathway D2 neurons undergo LTP when $\delta < 0$ (dopamine dip), strengthening avoidance of punished actions. This actor-critic architecture implements a biologically plausible reinforcement learning algorithm.
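The TD($\lambda$) update can be sketched on a toy chain task, with a tabular value array standing in for corticostriatal weights and $\delta$ playing the role of the dopamine signal. The task and all learning parameters are illustrative:

```python
import numpy as np

# Deterministic chain: states 0..4, stepping right, reward 1 on leaving state 4.
n_states, alpha, gamma, lam = 5, 0.1, 0.9, 0.8
w = np.zeros(n_states)            # value estimates V(s) ("critic" weights)

for episode in range(200):
    e = np.zeros(n_states)        # eligibility traces e(t)
    for s in range(n_states):
        s_next = s + 1
        r = 1.0 if s_next == n_states else 0.0
        v_next = w[s_next] if s_next < n_states else 0.0
        delta = r + gamma * v_next - w[s]   # reward prediction error
        e *= lam * gamma                     # decay old traces
        e[s] += 1.0                          # mark the just-visited state
        w += alpha * delta * e               # dopamine-gated weight update

print(np.round(w, 3))   # approaches gamma**(4-s): [0.656, 0.729, 0.81, 0.9, 1.0]
```

Because the environment is deterministic, the learned values converge to the exact discounted returns $\gamma^{4-s}$; the eligibility trace lets a single terminal reward propagate backward through the whole chain within each episode.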
3. Cerebellum and Adaptive Motor Control
The cerebellum contains more than half the brain's neurons and plays a critical role in motor coordination, timing, and motor learning. Its remarkably regular circuitry — featuring parallel fibers, Purkinje cells, and climbing fibers — implements an adaptive filter that learns internal models of the body and environment.
Derivation 3: The Marr-Albus-Ito Model of Cerebellar Learning
The cerebellum implements a supervised learning algorithm where climbing fiber inputs from the inferior olive provide error signals. The Purkinje cell output is a weighted sum of granule cell (parallel fiber) inputs:
$$y(t) = \sum_{i=1}^{N_{GC}} w_i \, x_i(t) = \mathbf{w}^T \mathbf{x}(t)$$
The climbing fiber carries the error signal $e(t) = y(t) - y^*(t)$, where $y^*$ is the desired output. The parallel fiber-Purkinje cell synaptic weights update via long-term depression (LTD) when parallel fiber and climbing fiber activity coincide:
$$\Delta w_i = -\eta \, e(t) \, x_i(t)$$
This is equivalent to the Widrow-Hoff (LMS) learning rule. The granule cell layer performs a high-dimensional expansion: $N_{GC} \gg N_{\text{input}}$ (humans have ~50 billion granule cells). By the Cover theorem, this expansion makes the representation linearly separable with high probability. The convergence rate depends on the eigenvalues of the input correlation matrix:
$$\tau_{\text{learn}} \approx \frac{1}{\eta \lambda_{\min}(\mathbf{X}^T\mathbf{X})}$$
This model explains cerebellar involvement in vestibulo-ocular reflex adaptation, saccade calibration, and reaching error correction.
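A minimal sketch of the Marr-Albus-Ito scheme as an online LMS filter. A fixed random projection plus tanh nonlinearity stands in for the granule-cell expansion, and a hidden linear target stands in for the mapping the Purkinje cell must learn; all sizes and rates are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
N_in, N_gc, T = 4, 100, 3000     # mossy fiber inputs, granule cells, steps
eta = 0.01                       # learning rate (stable: eta * E||x||^2 < 2)

# Fixed random expansion: granule layer as a high-dimensional nonlinear basis
W_exp = rng.normal(0.0, 1.0, (N_gc, N_in))

def granule(u):
    """Parallel fiber activity for mossy fiber input u."""
    return np.tanh(W_exp @ u)

w_true = rng.normal(0.0, 1.0, N_gc)  # unknown target sensorimotor mapping
w = np.zeros(N_gc)                   # PF-Purkinje weights, learned online
mse = []
for _ in range(T):
    u = rng.normal(0.0, 1.0, N_in)   # mossy fiber input
    x = granule(u)                   # parallel fiber activity
    e = w @ x - w_true @ x           # e = y - y*, carried by the climbing fiber
    w -= eta * e * x                 # LTD for coincident PF + CF activity
    mse.append(e * e)

print(f"MSE, first 100 steps: {np.mean(mse[:100]):.3f}; "
      f"last 100 steps: {np.mean(mse[-100:]):.4f}")
```

The squared error falls by orders of magnitude over training, with the fast modes (large eigenvalues of the input correlation matrix) learned first, consistent with the convergence-time expression above.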
4. Central Pattern Generators
Central pattern generators (CPGs) are neural circuits in the spinal cord and brainstem that produce rhythmic motor patterns (walking, breathing, swimming) without rhythmic sensory or cortical input. The half-center model, proposed by Brown (1911), consists of two mutually inhibitory neuron populations that alternate activity.
Derivation 4: Half-Center Oscillator Model
The half-center CPG can be modeled as two mutually inhibitory units with adaptation. Let $u_1, u_2$ represent the activities of flexor and extensor populations:
$$\tau \frac{du_1}{dt} = -u_1 + S\left(w_E u_1 - w_I u_2 - b \cdot a_1 + I_{\text{ext}}\right)$$
$$\tau \frac{du_2}{dt} = -u_2 + S\left(w_E u_2 - w_I u_1 - b \cdot a_2 + I_{\text{ext}}\right)$$
where $S(x) = 1/(1 + e^{-x})$ is the sigmoid activation, $w_I$ is the mutual inhibition strength, $w_E$ is self-excitation, and $a_i$ are adaptation variables with slow dynamics:
$$\tau_a \frac{da_i}{dt} = -a_i + u_i, \quad \tau_a \gg \tau$$
The oscillation period is approximately $T \approx 2\tau_a \ln\left(\frac{w_I + b}{w_I - b}\right)$ when $w_I > b$. The duty cycle (fraction of time each half is active) can be modulated by asymmetric drive $I_{\text{ext}}$, explaining how descending signals from motor cortex control locomotion speed and gait.
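A forward-Euler sketch of the half-center equations. The parameter values are illustrative, chosen for robust oscillation rather than fit to any preparation: self-excitation $w_E$ is strong enough to make the fast dynamics bistable, so the slow adaptation variables switch the two half-centers in relaxation style:

```python
import numpy as np

def S(x):
    """Sigmoid activation from the rate equations."""
    return 1.0 / (1.0 + np.exp(-x))

tau, tau_a = 0.05, 0.5             # fast activity vs slow adaptation (tau_a >> tau)
w_E, w_I, b, I_ext = 5.0, 5.0, 8.0, 1.0
dt, T = 0.001, 6.0

u = np.array([0.8, 0.05])          # flexor / extensor activities (asymmetric start)
a = np.zeros(2)                    # adaptation variables
u_trace = []

for _ in range(int(T / dt)):
    du = (-u + S(w_E * u - w_I * u[::-1] - b * a + I_ext)) / tau
    da = (-a + u) / tau_a          # slow adaptation tracks activity
    u = u + dt * du
    a = a + dt * da
    u_trace.append(u.copy())

u_trace = np.array(u_trace)
# Estimate the period from rising threshold crossings of the flexor unit
onsets = np.where(np.diff((u_trace[:, 0] > 0.5).astype(int)) == 1)[0]
period = np.mean(np.diff(onsets)) * dt
print(f"burst onsets: {len(onsets)}, estimated period ~ {period:.2f} s")
```

Increasing $I_{\text{ext}}$ to one unit only (asymmetric drive) lengthens that unit's active phase, which is one way to probe the duty-cycle modulation described above.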
Derivation 5: Optimal Feedback Control of Reaching
Todorov and Jordan (2002) proposed that the motor system implements optimal feedback control, minimizing a cost function that trades off accuracy and effort. For a reaching movement with state $\mathbf{x}$ (position, velocity) and control $\mathbf{u}$ (muscle forces):
$$J = \mathbf{x}(T)^T \mathbf{Q}_f \mathbf{x}(T) + \int_0^T \left[\mathbf{x}^T \mathbf{Q} \mathbf{x} + \mathbf{u}^T \mathbf{R} \mathbf{u}\right] dt$$
subject to linear dynamics $\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u} + \text{noise}$. The optimal controller is:
$$\mathbf{u}^*(t) = -\mathbf{R}^{-1}\mathbf{B}^T \mathbf{P}(t) \hat{\mathbf{x}}(t)$$
where $\mathbf{P}(t)$ satisfies the Riccati equation and $\hat{\mathbf{x}}$ is the Kalman-filtered state estimate. A key prediction is the "minimum intervention principle": the controller only corrects deviations that affect task performance, allowing variability in task-irrelevant dimensions. This explains the observed structure of movement variability in reaching and grasping.
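A discrete-time sketch of the finite-horizon LQR component of this framework, for a noisy point mass reaching the origin. For brevity it uses full-state feedback (omitting the Kalman filter half of the LQG controller); the dynamics and cost weights are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Discrete point mass: state x = [position, velocity], control u = force
dt, T_steps = 0.01, 100
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])

# Cost: heavy terminal penalty on position/velocity error, small effort cost
Qf = np.diag([1e4, 1e2])
Q = np.zeros((2, 2))
R = np.array([[1e-2]])

# Backward Riccati recursion for the time-varying feedback gains K_t
P = Qf
K = [None] * T_steps
for t in reversed(range(T_steps)):
    K[t] = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ A - A.T @ P @ B @ K[t]

# Simulate a 10 cm reach to the origin under additive motor noise
x = np.array([-0.1, 0.0])
speeds = []
for t in range(T_steps):
    u = -K[t] @ x                                    # optimal feedback law
    x = A @ x + B @ u + rng.normal(0, 1e-4, 2)       # noisy dynamics
    speeds.append(abs(x[1]))

print(f"final position error: {abs(x[0]) * 1000:.2f} mm")
print(f"peak speed at step {int(np.argmax(speeds))} of {T_steps}")
```

With effort penalized throughout and error penalized only at the endpoint, the optimal trajectory shows the bell-shaped speed profile characteristic of human reaching, peaking near mid-movement.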
5. Historical Development
- 1870: Fritsch and Hitzig demonstrate electrical stimulation of motor cortex produces contralateral movements.
- 1911: Graham Brown proposes the half-center model for spinal locomotor CPGs.
- 1937: Penfield and Boldrey map the motor homunculus using intraoperative cortical stimulation.
- 1969: Evarts records single neurons in motor cortex of behaving monkeys, linking neural activity to movement parameters.
- 1982: Georgopoulos introduces the population vector hypothesis for motor cortex directional coding.
- 1990s: Houk and Wise propose the cerebellar adaptive filter model; Schultz discovers dopamine RPE signals in basal ganglia.
- 2002: Todorov and Jordan formulate optimal feedback control theory, explaining movement variability structure.
- 2012: Churchland et al. reveal rotational dynamics in motor cortex population activity using jPCA.
6. Applications
Brain-Machine Interfaces
Population vector decoding from M1 enables paralyzed patients to control robotic arms and computer cursors. Kalman filter decoders incorporate the optimal control framework to predict intended movements from neural activity.
Deep Brain Stimulation
DBS of the subthalamic nucleus alleviates Parkinson's disease motor symptoms by modulating basal ganglia circuit dynamics. Understanding the direct/indirect pathway balance guides electrode placement and stimulation parameters.
Robotics and Control
Cerebellar-inspired adaptive controllers enable robots to learn complex motor skills. CPG-based locomotion controllers produce stable, adaptable gaits for legged robots without explicit trajectory planning.
Rehabilitation Engineering
Understanding motor learning principles guides the design of rehabilitation protocols for stroke recovery. Error augmentation and reinforcement-based training leverage cerebellar and basal ganglia learning mechanisms.
7. Computational Exploration
Motor Systems: Population Coding, Reinforcement Learning, Cerebellum, and CPGs
Chapter Summary
- Motor cortex encodes movement direction via population vectors, with decoding error shrinking as $1/\sqrt{N}$.
- Basal ganglia implement an actor-critic reinforcement learning architecture, with dopamine encoding reward prediction errors $\delta = r + \gamma V(s') - V(s)$.
- Cerebellum learns internal models via supervised learning (LMS rule) with climbing fiber error signals and granule cell basis expansion.
- Central pattern generators produce rhythmic outputs through mutual inhibition and slow adaptation, with period controlled by the adaptation timescale.
- Optimal feedback control predicts bell-shaped velocity profiles and the minimum intervention principle.