Machine Learning

Part V: Probabilistic ML

Deterministic models give a single answer; probabilistic models give a distribution over answers, quantifying what they know and what they don't. This part builds three pillars of probabilistic machine learning: Bayesian inference for updating beliefs, Gaussian processes for non-parametric function learning, and variational inference for scalable approximation. Together they form the foundation of modern probabilistic deep learning.
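To make the "updating beliefs" pillar concrete before diving in, here is a minimal sketch of conjugate Bayesian updating for a coin's heads probability. The `beta_update` helper is a hypothetical name for illustration; the only assumption is the standard Beta-Bernoulli conjugacy result derived in full later in this part.

```python
# Conjugate Bayesian updating: a Beta(a, b) prior over a coin's heads
# probability, updated with Bernoulli observations (1 = heads, 0 = tails).
# Conjugacy means the posterior is again a Beta, with updated counts:
#   posterior = Beta(a + #heads, b + #tails)
def beta_update(a, b, observations):
    heads = sum(observations)
    tails = len(observations) - heads
    return a + heads, b + tails

# Start from a uniform prior Beta(1, 1) and observe 3 heads, 1 tail.
a, b = beta_update(1.0, 1.0, [1, 1, 0, 1])
posterior_mean = a / (a + b)  # mean of Beta(4, 2) = 4/6
```

The point of conjugacy is exactly this: the update is a closed-form bookkeeping step on the prior's parameters, with no integration required.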

What you will learn

Derive the posterior distribution for Gaussian and Bernoulli likelihoods
Understand why conjugate priors keep Bayesian updating in closed form
Implement Metropolis-Hastings MCMC for arbitrary posteriors
Define a Gaussian process through its mean and covariance function
Derive the GP posterior predictive mean and variance from scratch
Choose and tune kernel hyperparameters by maximising the marginal likelihood
Derive the ELBO and show it lower-bounds the log-evidence
Connect variational inference to the VAE objective of Chapter 12
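As a preview of the sampling objective above, the following is a minimal random-walk Metropolis-Hastings sketch, not the chapter's full implementation. It assumes only a log-density known up to a constant; the symmetric Gaussian proposal means the acceptance ratio reduces to a difference of log-densities.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=0.5, seed=0):
    """Random-walk Metropolis: propose x' = x + N(0, step^2) and accept
    with probability min(1, exp(log p(x') - log p(x)))."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step)
        # Compare log U against the log acceptance ratio (guard log(0)).
        if math.log(rng.random() + 1e-300) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal log-density, up to its normalising constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=3.0, n_steps=20000)
kept = samples[5000:]                      # discard burn-in
mean = sum(kept) / len(kept)               # ~ 0
var = sum((s - mean) ** 2 for s in kept) / len(kept)  # ~ 1
```

Note that the target is never normalised: the constant cancels in the acceptance ratio, which is precisely why MCMC applies to posteriors whose evidence term is intractable.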

Prerequisites

Part I: Probability
Bayes' theorem, Gaussian distribution, MLE and MAP estimation
Part I: Linear Algebra
Matrix inversion, positive definite matrices, multivariate Gaussians
Part IV: VAEs (Ch 12)
The reparameterisation trick and the VAE ELBO (revisited in Ch 15)