Module 0

Bayes & Beliefs

The Bayesian-brain hypothesis starts from a simple premise: to interact with the world, a brain must infer hidden causes from noisy observations. This module introduces Bayes’ theorem, the frequentist-Bayesian divide, generative vs. discriminative models, and the idea of a belief as a probability distribution over states of the world — the conceptual scaffolding for modules 1–8.

1. Bayes’ Theorem

For hidden cause H and observation D:

\[ P(H\mid D) \;=\; \frac{P(D\mid H)\,P(H)}{P(D)},\qquad P(D) = \int P(D\mid H')P(H')\,dH' \]

Four standard terms: the prior P(H) is what the agent believed before the observation; the likelihood P(D|H) is how probable the observation is under each hypothesis; the evidence P(D) is a normaliser that makes the posterior sum (or integrate) to one; the posterior P(H|D) is the updated belief. Sequential observations compound by iterative updating: yesterday’s posterior is today’s prior.
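
As a minimal sketch of the update rule, the snippet below applies Bayes’ theorem to a discrete set of hypotheses, where the integral for P(D) becomes a sum. The coin scenario, the hypothesis names and the probability values are illustrative assumptions, not taken from the module.

```python
def bayes_update(prior, likelihood):
    """Return the posterior P(H|D) over a discrete set of hypotheses.

    prior:      dict mapping hypothesis -> P(H)
    likelihood: dict mapping hypothesis -> P(D|H) for the observed D
    """
    # Unnormalised posterior: P(D|H) * P(H) for each hypothesis.
    unnorm = {h: likelihood[h] * prior[h] for h in prior}
    # Evidence P(D): a sum over hypotheses (discrete analogue of the integral).
    evidence = sum(unnorm.values())
    return {h: p / evidence for h, p in unnorm.items()}


# Hypothetical example: is a coin fair, or biased towards heads?
prior = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}  # assumed values of P(heads | H)

belief = prior
for _ in range(3):  # observe three heads in a row
    # Yesterday's posterior is today's prior.
    belief = bayes_update(belief, p_heads)

print(belief)
```

Each pass through the loop reuses the previous posterior as the new prior, so after three heads most of the probability has shifted to the "biased" hypothesis.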

2. Beliefs as Distributions

Unlike the binary “I believe X” of everyday talk, a Bayesian belief is a probability distribution: P(θ) assigns probability (mass for discrete θ, density for continuous θ) to each possible value of the world parameter θ. The mean gives the best guess; the variance gives the uncertainty, so a small variance means high confidence. Updating a belief means updating a distribution, not flipping a boolean. This is the essential conceptual move of the Bayesian brain.
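
To make the idea concrete, here is a small sketch that represents a belief as a Gaussian with a mean and a variance, and updates it with noisy observations using the standard Gaussian-Gaussian result (precisions add, and the new mean is a precision-weighted average). The class name `GaussianBelief` and all numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GaussianBelief:
    mean: float  # best guess about theta
    var: float   # uncertainty about theta (smaller = more confident)

    def update(self, obs, obs_var):
        """Combine this belief with one noisy observation of theta."""
        # Precisions (inverse variances) add under a Gaussian likelihood.
        precision = 1.0 / self.var + 1.0 / obs_var
        new_var = 1.0 / precision
        # The new mean is a precision-weighted average of prior mean and observation.
        new_mean = new_var * (self.mean / self.var + obs / obs_var)
        return GaussianBelief(new_mean, new_var)


belief = GaussianBelief(mean=0.0, var=4.0)  # vague initial belief (assumed)
for obs in [2.0, 2.0, 2.0]:                 # three noisy readings (assumed)
    belief = belief.update(obs, obs_var=1.0)
    print(belief)
```

With each observation the mean moves towards the data and the variance shrinks: the whole distribution changes, not a single true/false flag.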

3. Generative vs. Discriminative

A generative model represents the joint P(D, H) = P(D|H) P(H): how observations arise from causes. A discriminative model represents only the conditional P(H|D), as a classifier does. Generative models can infer causes from observations, produce synthetic data, reason about counterfactuals, and support active exploration (M5). Deep learning largely defaulted to discriminative models until the diffusion/generative-AI turn of the 2020s; the Bayesian brain has been generative throughout.
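
The difference can be sketched with a toy joint distribution. Given the joint, one can both condition on an observation to get P(H|D) and sample new (cause, observation) pairs; a purely discriminative model would store only the conditional. The rain/wet scenario and its probabilities are made-up assumptions for illustration.

```python
import random

# Hypothetical joint distribution; keys are (cause H, observation D).
joint = {
    ("rain", "wet"): 0.27,
    ("rain", "dry"): 0.03,
    ("no_rain", "wet"): 0.14,
    ("no_rain", "dry"): 0.56,
}


def posterior(joint, d):
    """P(H | D=d), obtained by conditioning the joint on the observation."""
    p_d = sum(p for (h, obs), p in joint.items() if obs == d)
    return {h: p / p_d for (h, obs), p in joint.items() if obs == d}


def sample(joint, rng):
    """Draw one (cause, observation) pair from the joint."""
    pairs = list(joint)
    weights = [joint[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


post = posterior(joint, "wet")  # inference: from observation back to cause
print(post)

rng = random.Random(0)
samples = [sample(joint, rng) for _ in range(3)]  # generation: produce new data
print(samples)
```

Only the `posterior` function is available to a discriminative model; `sample` needs the full joint.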

Simulation: Belief Update & Conjugate Priors
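
The module’s own script is not reproduced in this text. As a stand-in, the following is a minimal sketch of a conjugate belief update under assumed settings: a Beta prior over a coin’s heads probability, updated by simulated Bernoulli outcomes. The true probability, the random seed and the number of trials are arbitrary choices.

```python
import random


def beta_update(alpha, beta, outcome):
    """Conjugate update: a Beta prior with a Bernoulli outcome gives a Beta posterior."""
    return (alpha + 1, beta) if outcome else (alpha, beta + 1)


def beta_mean(alpha, beta):
    return alpha / (alpha + beta)


def beta_var(alpha, beta):
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))


rng = random.Random(42)  # fixed seed so the run is repeatable
true_p = 0.7             # assumed hidden heads probability
alpha, beta = 1.0, 1.0   # Beta(1, 1): a uniform prior

for t in range(1, 51):
    outcome = rng.random() < true_p  # simulate one coin flip
    alpha, beta = beta_update(alpha, beta, outcome)
    if t in (1, 5, 20, 50):
        print(f"after {t:2d} flips: mean={beta_mean(alpha, beta):.3f}, "
              f"var={beta_var(alpha, beta):.4f}")
```

Because the Beta family is conjugate to the Bernoulli likelihood, each update only increments a count, and the posterior variance shrinks as evidence accumulates.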

4. Frequentist vs. Bayesian

Frequentist statistics treats parameters as unknown but fixed; probability is the long-run frequency of an event. Bayesian statistics treats parameters as random variables with subjective probability distributions. For the Bayesian-brain hypothesis, the frequentist framework is ill-suited: a single-trial cognitive judgement cannot be interpreted as the outcome of “repeated sampling.” On this view, Bayesian inference is the natural computational substrate of belief.
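
A single-trial example illustrates the contrast. After one coin flip that lands heads, the maximum-likelihood estimate is a point value of 1, whereas a Bayesian observer with a uniform Beta(1, 1) prior holds a Beta(2, 1) posterior with mean 2/3. The sketch below assumes that uniform prior; the function names are hypothetical.

```python
def mle(successes, trials):
    """Frequentist point estimate: the observed relative frequency."""
    return successes / trials


def posterior_mean(successes, trials, alpha=1.0, beta=1.0):
    """Mean of the Beta posterior under an assumed Beta(alpha, beta) prior."""
    return (alpha + successes) / (alpha + beta + trials)


# One trial, one success.
freq_estimate = mle(1, 1)
bayes_estimate = posterior_mean(1, 1)

print("MLE:", freq_estimate)
print("posterior mean:", bayes_estimate)
```

The Bayesian estimate stays short of certainty after a single observation because the prior still carries weight.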

Key References

• Knill, D. C. & Pouget, A. (2004). “The Bayesian brain: the role of uncertainty in neural coding and computation.” Trends Neurosci., 27, 712–719.

• MacKay, D. J. C. (2003). Information Theory, Inference, and Learning Algorithms. Cambridge UP.

• Gelman, A. et al. (2013). Bayesian Data Analysis, 3rd ed. CRC Press.

• Doya, K., Ishii, S., Pouget, A. & Rao, R. P. N. (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press.
