Part I: Neural Basics | Chapter 4

Sensory Processing

Receptive fields, feature detection, population coding, and the columnar organization of primary visual cortex (V1)

From Photons to Percepts

Sensory systems transform physical stimuli into neural representations that the brain can interpret and act upon. The visual system, the best-studied sensory modality, illustrates how hierarchical processing extracts increasingly complex features from raw sensory input. From the retina's center-surround receptive fields to the orientation columns of V1, a beautiful computational architecture emerges.

This chapter explores the mathematical foundations of receptive fields, Gabor-like filtering in V1, population coding of stimulus features, and the computational principles underlying feature detection. We derive key results for linear filtering, orientation tuning, and optimal population decoding in sensory cortex.

1. Receptive Fields

The receptive field of a sensory neuron is the region of stimulus space that influences its firing rate. In vision, Hartline (1938) first described receptive fields of retinal ganglion cells. Kuffler (1953) discovered the center-surround organization: ON-center cells are excited by light in the center and inhibited by light in the surround, while OFF-center cells show the reverse pattern.

Derivation 1: Center-Surround Receptive Fields as Difference of Gaussians

The center-surround receptive field can be modeled as a Difference of Gaussians (DoG). For an ON-center cell at position $(x_0, y_0)$:

$$RF(x, y) = \frac{A_c}{2\pi\sigma_c^2} \exp\left(-\frac{(x-x_0)^2 + (y-y_0)^2}{2\sigma_c^2}\right) - \frac{A_s}{2\pi\sigma_s^2} \exp\left(-\frac{(x-x_0)^2 + (y-y_0)^2}{2\sigma_s^2}\right)$$

where $A_c, \sigma_c$ are the center amplitude and width, and $A_s, \sigma_s$ are the surround parameters. Typically $\sigma_s \approx 3\sigma_c$ and the total integral is approximately zero, ensuring the cell responds to contrast rather than absolute luminance.

The response to a stimulus $I(x,y)$ is computed by convolution:

$$r = \int\int RF(x, y) \cdot I(x, y) \, dx \, dy$$

In the Fourier domain, the DoG acts as a bandpass filter: $\widetilde{RF}(f) = A_c e^{-2\pi^2 \sigma_c^2 f^2} - A_s e^{-2\pi^2 \sigma_s^2 f^2}$, which peaks at an optimal spatial frequency $f^* = \frac{1}{2\pi}\sqrt{\frac{2\ln(A_s\sigma_s^2 / A_c\sigma_c^2)}{\sigma_s^2 - \sigma_c^2}}$.
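The peak-frequency result can be checked numerically. The sketch below uses illustrative parameter values (a balanced DoG with $A_c = A_s$ and $\sigma_s = 3\sigma_c$, for which the log argument reduces to $\sigma_s^2/\sigma_c^2$) and compares the closed-form $f^*$ against a brute-force maximum of the transfer function:

```python
import numpy as np

# Illustrative DoG parameters: balanced (A_c = A_s) with sigma_s = 3 sigma_c
A_c, A_s, sigma_c, sigma_s = 1.0, 1.0, 0.5, 1.5

def dog_fourier(f):
    """Fourier amplitude of the DoG: a bandpass filter."""
    return (A_c * np.exp(-2 * np.pi**2 * sigma_c**2 * f**2)
            - A_s * np.exp(-2 * np.pi**2 * sigma_s**2 * f**2))

# Closed-form peak spatial frequency
f_star = (1 / (2 * np.pi)) * np.sqrt(
    2 * np.log((A_s * sigma_s**2) / (A_c * sigma_c**2))
    / (sigma_s**2 - sigma_c**2))

# Brute-force maximum of the transfer function on a fine grid
f_grid = np.linspace(1e-4, 1.0, 20001)
f_num = f_grid[np.argmax(dog_fourier(f_grid))]
print(f"analytic peak {f_star:.4f}, numerical peak {f_num:.4f}")
```

Because the filter is balanced, its DC response `dog_fourier(0)` is exactly zero: the cell signals contrast, not mean luminance.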

The center-surround structure implements lateral inhibition, which enhances edges and normalizes the response to overall luminance. This computational motif appears across sensory systems: somatosensory receptive fields show similar center-surround organization, and auditory neurons exhibit analogous spectral contrast enhancement.

2. Feature Detection in Primary Visual Cortex

Hubel and Wiesel's Nobel Prize-winning work (1959, 1962) revealed that neurons in primary visual cortex (V1) are selective for oriented edges and bars. Simple cells respond to edges at a specific orientation and position, while complex cells respond to oriented edges regardless of exact position within their receptive field.

Derivation 2: Gabor Model of V1 Simple Cell Receptive Fields

A V1 simple cell's receptive field is well described by a Gabor function — a sinusoid modulated by a Gaussian envelope. In 2D:

$$G(x, y) = A \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos(2\pi f x' + \phi)$$

where the rotated coordinates are:

$$x' = x\cos\theta + y\sin\theta, \quad y' = -x\sin\theta + y\cos\theta$$

Here $\theta$ is the preferred orientation, $f$ is the spatial frequency, $\sigma$ is the Gaussian width, $\gamma$ is the aspect ratio (typically $\gamma \approx 0.5$), and $\phi$ is the spatial phase. The orientation bandwidth (half-width at half-maximum) is:

$$\Delta\theta_{1/2} = \arctan\left(\frac{1}{\pi f \sigma}\sqrt{\frac{\ln 2}{2}}\right)$$

Typical V1 neurons have orientation bandwidths of 20–40 degrees and spatial frequency bandwidths of 1–1.5 octaves. Daugman (1985) showed that Gabor functions achieve the theoretical minimum joint uncertainty in space and spatial frequency, making them optimal local feature detectors.
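To make the Gabor model concrete, the following sketch (all parameter values illustrative) builds the 2D receptive field from the equations above and probes it with gratings. The linear response, an inner product between filter and stimulus, peaks when the grating's wave vector aligns with the carrier axis $\theta$:

```python
import numpy as np

def gabor(x, y, theta=np.pi / 4, f=0.5, sigma=1.0, gamma=0.5, phi=0.0):
    """2D Gabor: Gaussian envelope times a cosine carrier along x'."""
    xp = x * np.cos(theta) + y * np.sin(theta)
    yp = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xp**2 + gamma**2 * yp**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * f * xp + phi)

x, y = np.meshgrid(np.linspace(-4, 4, 161), np.linspace(-4, 4, 161))
rf = gabor(x, y)  # carrier axis at 45 degrees

def grating_response(ori, f=0.5):
    """Inner product with a grating whose wave vector points along `ori`."""
    grating = np.cos(2 * np.pi * f * (x * np.cos(ori) + y * np.sin(ori)))
    return np.sum(rf * grating)

oris = np.linspace(0, np.pi, 181)  # 1-degree steps
responses = np.array([grating_response(o) for o in oris])
best = np.degrees(oris[np.argmax(responses)])
print(f"best grating direction: {best:.1f} deg")  # near 45
```

Sweeping `f` instead of `ori` would trace out the spatial-frequency tuning curve in the same way.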

2.1 Simple vs Complex Cells

Simple cells have spatially separated ON and OFF subregions and respond to the precise position of an edge. Their response is well predicted by the linear receptive field model: $r(t) = [RF * I]^+$, where $[\cdot]^+$ denotes rectification. Complex cells, modeled by the energy model, compute the sum of squared responses from a quadrature pair of simple cells:

$$r_{\text{complex}} = \left(\int RF_{\text{even}} \cdot I \, dA\right)^2 + \left(\int RF_{\text{odd}} \cdot I \, dA\right)^2$$

This energy model produces phase-invariant orientation selectivity, consistent with the response properties of complex cells recorded in vivo.
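A minimal 1D sketch of the energy model (illustrative parameters): an even/odd quadrature pair of Gabor profiles is probed with gratings of varying spatial phase. The individual simple-cell responses swing strongly with phase, while the summed energy stays nearly constant:

```python
import numpy as np

# Even/odd quadrature pair along the axis perpendicular to the preferred bar
x = np.linspace(-4, 4, 801)
dx = x[1] - x[0]
sigma, f = 1.0, 0.5
envelope = np.exp(-x**2 / (2 * sigma**2))
rf_even = envelope * np.cos(2 * np.pi * f * x)   # phase 0
rf_odd = envelope * np.sin(2 * np.pi * f * x)    # phase pi/2

def energy(stim):
    """Complex-cell energy: sum of squared quadrature-pair responses."""
    return (np.sum(rf_even * stim) * dx) ** 2 + (np.sum(rf_odd * stim) * dx) ** 2

phases = np.linspace(0, 2 * np.pi, 9)
gratings = [np.cos(2 * np.pi * f * x + p) for p in phases]
evens = [np.sum(rf_even * g) * dx for g in gratings]   # phase-dependent
energies = [energy(g) for g in gratings]               # nearly constant
print("simple-cell spread:", round(np.ptp(evens), 4))
print("energy spread:", round(np.ptp(energies), 6))
```

The near-zero energy spread is the phase invariance described above: the quadrature pair measures the local amplitude of the grating regardless of its alignment with the receptive field.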

3. Orientation Tuning and Columnar Organization

V1 neurons are organized into orientation columns — vertical slabs of cortex where all neurons prefer the same orientation. Adjacent columns prefer slightly different orientations, and a full 180-degree rotation occurs over approximately 1 mm of cortical surface (a hypercolumn). This regular arrangement, visualized using optical imaging by Blasdel and Salama (1986), contains pinwheel centers where all orientations converge.

Derivation 3: Orientation Tuning from Feedforward Connectivity

The feedforward model (Hubel and Wiesel, 1962) proposes that a V1 simple cell's orientation selectivity arises from the alignment of LGN inputs. Consider a simple cell receiving input from $N$ LGN cells with center-surround receptive fields arranged along a line at angle $\theta_0$:

$$RF_{\text{simple}}(x, y) = \sum_{i=1}^{N} w_i \cdot DoG(x - x_i, y - y_i)$$

When centers $(x_i, y_i)$ lie along the preferred axis and weights $w_i$ follow a Gaussian envelope, the resulting receptive field approximates a Gabor function. The orientation tuning curve is approximately Gaussian:

$$r(\theta) = r_{\max} \exp\left(-\frac{(\theta - \theta_0)^2}{2\sigma_\theta^2}\right) + r_{\text{base}}$$

However, the feedforward model alone produces broader tuning than observed. Recurrent cortical interactions sharpen tuning through a combination of recurrent excitation among similarly tuned neurons and cross-orientation inhibition, producing the narrow tuning widths ($\sigma_\theta \approx 15\text{--}25^\circ$) observed experimentally.
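The feedforward construction can be simulated directly. The sketch below (illustrative parameters: a balanced DoG and nine Gaussian-weighted LGN centers) sums center-surround fields along a line at $\theta_0 = 60^\circ$ and measures the rectified response to gratings of varying bar orientation:

```python
import numpy as np

def dog(x, y, sigma_c=0.3, sigma_s=0.9):
    """Balanced center-surround field (A_c = A_s = 1)."""
    r2 = x**2 + y**2
    return (np.exp(-r2 / (2 * sigma_c**2)) / (2 * np.pi * sigma_c**2)
            - np.exp(-r2 / (2 * sigma_s**2)) / (2 * np.pi * sigma_s**2))

x, y = np.meshgrid(np.linspace(-5, 5, 201), np.linspace(-5, 5, 201))

# LGN centers on a line at the preferred orientation, Gaussian-weighted
theta0 = np.radians(60)
offsets = np.linspace(-2, 2, 9)
weights = np.exp(-offsets**2 / 2)
rf = sum(w * dog(x - d * np.cos(theta0), y - d * np.sin(theta0))
         for w, d in zip(weights, offsets))

def response(bar_ori, f=0.4):
    """Rectified response to a grating whose bars are oriented at bar_ori."""
    k = bar_ori + np.pi / 2  # wave vector is perpendicular to the bars
    grating = np.cos(2 * np.pi * f * (x * np.cos(k) + y * np.sin(k)))
    return max(np.sum(rf * grating), 0.0)

oris = np.linspace(0, np.pi, 181)
tuning = np.array([response(o) for o in oris])
peak = np.degrees(oris[np.argmax(tuning)])
print(f"tuning peak: {peak:.1f} deg")  # near the 60-deg preferred axis
```

The resulting tuning curve peaks at the axis along which the LGN centers are aligned, though, as noted in the text, this purely feedforward tuning is broader than what recurrent sharpening produces in vivo.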

4. Population Coding of Visual Features

Individual neurons have noisy, broadly tuned responses, yet populations of neurons can represent stimuli with remarkable precision. Population coding theory addresses how stimulus information is distributed across many neurons and how downstream areas can decode this information.

Derivation 4: Population Vector Decoding of Orientation

For a population of $N$ neurons with Gaussian orientation tuning curves and independent Poisson noise, the maximum likelihood estimate of the stimulus orientation $\hat{\theta}$ can be obtained analytically. The log-likelihood is:

$$\ln P(\mathbf{r} | \theta) = \sum_{i=1}^{N} \left[ r_i \ln f_i(\theta) - f_i(\theta) - \ln(r_i!) \right]$$

Setting the derivative to zero:

$$\frac{\partial}{\partial\theta} \ln P = \sum_{i=1}^{N} \left(\frac{r_i}{f_i(\theta)} - 1\right) f_i'(\theta) = 0$$

For uniformly spaced preferred orientations and narrow tuning, this simplifies to a population vector:

$$\hat{\theta} = \frac{1}{2}\arg\left(\sum_{i=1}^{N} r_i \, e^{2i\theta_i^{\text{pref}}}\right)$$

The factor of 2 in the exponent accounts for the 180-degree periodicity of orientation. The variance of this estimate satisfies the Cramér-Rao bound: $\text{Var}(\hat{\theta}) \geq 1/J(\theta)$, where the Fisher information $J(\theta) = \sum_i [f_i'(\theta)]^2 / f_i(\theta)$ scales linearly with $N$.
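A quick simulation of population vector decoding under independent Poisson noise (tuning parameters here are illustrative, not fit to data). Doubling the angles handles the 180-degree wrap, and the decoding error over repeated trials stays within a few degrees:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
prefs = np.linspace(0, np.pi, N, endpoint=False)     # preferred orientations

def tuning(theta, r_max=30.0, sigma=np.radians(20), base=2.0):
    """Gaussian tuning on the half-circle (wrapped orientation difference)."""
    d = np.angle(np.exp(2j * (theta - prefs))) / 2   # in (-pi/2, pi/2]
    return r_max * np.exp(-d**2 / (2 * sigma**2)) + base

def decode(r):
    """Population vector: doubled angles handle the 180-deg periodicity."""
    return (np.angle(np.sum(r * np.exp(2j * prefs))) / 2) % np.pi

theta_true = np.radians(72)
errs = []
for _ in range(2000):
    counts = rng.poisson(tuning(theta_true))         # one trial of spike counts
    err = np.angle(np.exp(2j * (decode(counts) - theta_true))) / 2
    errs.append(np.degrees(err))
print(f"bias {np.mean(errs):+.2f} deg, std {np.std(errs):.2f} deg")
```

Doubling `N` (with the same tuning) should shrink the error roughly by $\sqrt{2}$, consistent with the Fisher-information scaling above.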

Derivation 5: Noise Correlations and Population Information

In reality, neurons share noise correlations $\rho_{ij}$, which fundamentally alter the information content of the population. The Fisher information for correlated neurons is:

$$J(\theta) = \mathbf{f}'(\theta)^T \, \mathbf{Q}^{-1} \, \mathbf{f}'(\theta) + \frac{1}{2}\text{tr}\left[\mathbf{Q}^{-1}\frac{\partial\mathbf{Q}}{\partial\theta}\mathbf{Q}^{-1}\frac{\partial\mathbf{Q}}{\partial\theta}\right]$$

where $\mathbf{Q}$ is the noise covariance matrix and $\mathbf{f}'(\theta)$ is the vector of tuning curve derivatives. For a population with uniform pairwise correlation $\rho$ between all neuron pairs:

$$J_N(\theta) \approx \frac{J_1(\theta)}{1 + (N-1)\rho} \cdot N$$

When $\rho > 0$ (positive correlations, as commonly observed), information saturates at $J_\infty = J_1 / \rho$ as $N \to \infty$. This means that adding more neurons cannot improve precision beyond a limit set by correlation structure — a result with profound implications for neural coding efficiency (Zohary et al., 1994; Averbeck et al., 2006).
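The saturation effect is easy to verify numerically. The sketch below (unit slopes and variances, purely illustrative) evaluates only the first, mean-response term $\mathbf{f}'^T\mathbf{Q}^{-1}\mathbf{f}'$ for a uniformly correlated covariance, which is the term the closed-form approximation above describes:

```python
import numpy as np

def fisher_mean_term(fprime, Q):
    """Mean-response term of the Fisher information, f'^T Q^{-1} f'."""
    return float(fprime @ np.linalg.solve(Q, fprime))

g, var, rho = 1.0, 1.0, 0.1   # tuning slope, variance, uniform correlation
for N in (10, 100, 1000):
    fprime = np.full(N, g)                                  # identical slopes
    Q = var * ((1 - rho) * np.eye(N) + rho * np.ones((N, N)))
    J = fisher_mean_term(fprime, Q)
    J_closed = N * (g**2 / var) / (1 + (N - 1) * rho)       # closed form
    print(f"N={N:5d}  J={J:7.3f}  closed form={J_closed:7.3f}")
print("ceiling J_inf =", g**2 / (var * rho))                # saturation limit
```

Even at $N = 1000$, the information remains below the ceiling $J_1/\rho$, illustrating why adding neurons with shared noise yields diminishing returns.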

5. Historical Development

  • 1938: Hartline defines the concept of a receptive field from recordings of frog retinal ganglion cells (Nobel Prize, 1967).
  • 1953: Kuffler discovers center-surround receptive fields in cat retinal ganglion cells, demonstrating lateral inhibition.
  • 1959: Hubel and Wiesel discover orientation selectivity in cat V1, identifying simple and complex cells (Nobel Prize, 1981).
  • 1962: Hubel and Wiesel propose the hierarchical model: LGN center-surround inputs converge to create V1 orientation selectivity.
  • 1985: Daugman demonstrates that V1 simple cells are optimally described by 2D Gabor functions, achieving minimum uncertainty.
  • 1986: Blasdel and Salama use optical imaging of intrinsic signals to visualize orientation maps and pinwheel centers in V1.
  • 1994: Zohary, Shadlen, and Newsome demonstrate that noise correlations limit population coding capacity in MT.
  • 1996: Olshausen and Field show that sparse coding applied to natural images produces Gabor-like receptive fields, linking V1 to efficient coding theory.

6. Applications

Computer Vision

Gabor filter banks inspired by V1 are widely used for texture classification, edge detection, and feature extraction. Convolutional neural networks learn filters remarkably similar to V1 receptive fields in their first layer.

Retinal Prosthetics

Understanding receptive field organization guides the design of electrode arrays that mimic retinal processing. Stimulation patterns based on center-surround models improve the quality of artificial vision.

Image Compression

The sparse coding hypothesis suggests that V1 represents images using a small number of active Gabor-like filters. This principle underlies compressed sensing and efficient image representations in engineering.

Clinical Diagnostics

Receptive field mapping using reverse correlation reveals functional deficits in amblyopia, glaucoma, and cortical lesions. Population coding models predict perceptual performance from neural recordings.

7. Computational Exploration

Sensory Processing: Receptive Fields, Gabor Filters, and Population Coding

Companion code: `script.py` (Python 3, 276 lines).

Chapter Summary

  • Center-surround receptive fields implement bandpass filtering via Difference of Gaussians, enhancing edges and normalizing luminance.
  • V1 simple cells are optimally modeled by Gabor functions, achieving minimum joint uncertainty in space and spatial frequency.
  • Orientation selectivity arises from aligned feedforward inputs, sharpened by recurrent cortical interactions.
  • Population vector decoding achieves an estimation error that scales as $1/\sqrt{N}$ with independent noise.
  • Noise correlations impose a fundamental ceiling on population information: $J_\infty = J_1 / \rho$.