Machine Learning
Part I: Mathematical Foundations
Every machine learning algorithm rests on three mathematical pillars: linear algebra provides the geometry of data and transformations; probability theory formalises uncertainty and inference; optimisation theory explains how models improve. This part builds each pillar from first principles, with full derivations and Python simulations.
Chapter 1: Linear Algebra for ML
Vectors, matrices, eigendecomposition, SVD — the geometric and algebraic backbone of every machine learning algorithm.
- Matrix multiplication & transpose
- Eigendecomposition A = QΛQ⁻¹
- SVD: A = UΣVᵀ (full derivation)
- Rank, null space, positive definiteness
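A minimal NumPy sketch of the two factorisations listed above; the matrix values are toy numbers for illustration, not examples from the chapter. It checks that A = UΣVᵀ reconstructs A, and that the eigendecomposition of the symmetric matrix AᵀA recovers the squared singular values:

```python
import numpy as np

# Toy 3x2 data matrix (hypothetical values, for illustration only).
A = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])

# SVD: A = U Σ Vᵀ. The chapter derives this; NumPy computes it directly.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, A)

# Eigendecomposition A = QΛQ⁻¹, applied to the symmetric matrix AᵀA:
# for symmetric matrices Q is orthogonal, so Q⁻¹ = Qᵀ, and the
# eigenvalues of AᵀA are the squared singular values of A.
lam, Q = np.linalg.eigh(A.T @ A)
assert np.allclose(Q @ np.diag(lam) @ Q.T, A.T @ A)
assert np.allclose(np.sort(lam)[::-1], s**2)
```

Since `eigh` returns eigenvalues in ascending order while `svd` returns singular values in descending order, the last check sorts before comparing.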
Chapter 2: Probability & Statistics for ML
Probability axioms, Bayes' theorem, distributions, MLE and MAP estimation — the language of uncertainty in learning.
- Bayes' theorem from joint probability
- Gaussian, Bernoulli, Categorical, Poisson
- MLE: derive normal equations
- MAP estimation & conjugate priors
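A short simulation in the spirit of the chapter, sketching two of the topics above. All numbers (the true parameters and the hypothetical test's error rates) are invented for illustration. It checks the closed-form Gaussian MLE against simulated data, then applies Bayes' theorem built directly from the joint probability:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated Gaussian sample with known parameters (toy values).
data = rng.normal(loc=3.0, scale=1.5, size=10_000)

# Gaussian MLE: setting the log-likelihood gradient to zero yields
# the sample mean and the mean squared deviation in closed form.
mu_hat = data.mean()
sigma2_hat = ((data - mu_hat) ** 2).mean()
assert abs(mu_hat - 3.0) < 0.1
assert abs(sigma2_hat - 1.5**2) < 0.1

# Bayes' theorem from the joint: P(H|D) = P(D|H)·P(H) / P(D).
# Hypothetical two-hypothesis example: a 1%-prevalence condition
# and a test with made-up error rates.
prior = np.array([0.99, 0.01])        # P(H): healthy, sick
likelihood = np.array([0.05, 0.95])   # P(positive | H)
joint = likelihood * prior            # P(positive, H)
posterior = joint / joint.sum()       # divide by P(positive)
assert abs(posterior.sum() - 1.0) < 1e-12
```

Even with a 95%-sensitive test, the posterior probability of being sick stays well below one half here, because the low prior dominates the update.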
Chapter 3: Optimization Theory
Gradient descent, convexity, the Adam optimiser, Lagrange multipliers and KKT conditions — how ML models actually learn.
- Gradient, Hessian, Jacobian, Taylor expansion
- Convexity: definition & second-order condition
- GD, Momentum, Adam derivations
- Lagrange multipliers & full KKT conditions
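A minimal sketch of plain gradient descent on a convex objective, the simplest case the chapter's convergence analysis covers. The quadratic and its coefficients are toy values chosen for illustration; with a positive-definite A, the step size 1/L (L being the largest eigenvalue of A) is the standard choice that guarantees convergence to the unique minimiser:

```python
import numpy as np

# Convex quadratic f(x) = ½ xᵀAx − bᵀx with A positive definite,
# so the unique minimiser solves Ax* = b. (Toy values.)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)

def grad(x):
    return A @ x - b  # ∇f(x) = Ax − b

# Step size 1/L, where L = λ_max(A) is the gradient's Lipschitz
# constant; for strongly convex quadratics this contracts the error
# geometrically at every iteration.
L = np.linalg.eigvalsh(A).max()
x = np.zeros(2)
for _ in range(500):
    x = x - (1.0 / L) * grad(x)

assert np.allclose(x, x_star, atol=1e-6)
```

Momentum and Adam modify only the update line (adding velocity and per-coordinate adaptive scaling, respectively); the derivations in the chapter build on this same loop.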
What you will learn
✓ Represent data as vectors and matrices and reason geometrically
✓ Decompose matrices with eigendecomposition and SVD for compression and analysis
✓ Model uncertainty with probability distributions and derive MLE/MAP estimators
✓ Apply Bayes' theorem to update beliefs as data arrives
✓ Prove gradient descent converges on convex objectives
✓ Derive the Adam optimiser from first principles
✓ Formulate constrained optimisation with Lagrange multipliers and KKT conditions
✓ Understand every Part II–VII algorithm through these three lenses