Part I — Foundations

Chapter 1: Signals & Systems

We begin our journey into signal theory by establishing the mathematical language used to describe signals and the systems that process them. This chapter covers the classification of signals and systems, the convolution integral with worked examples, deconvolution and regularisation, and the fundamental signals that recur throughout all subsequent chapters.

1.1 What Is a Signal?

A signal is a function that carries information. Formally it is a mapping from an independent variable — most often time — to a value that encodes some physical quantity: voltage, pressure, pixel intensity, stock price, etc.

Continuous-Time vs Discrete-Time

Continuous-Time (CT) Signal

A CT signal $x(t)$ is defined for every real value of $t \in \mathbb{R}$. Examples include analogue audio, sensor voltages, and electromagnetic waveforms.

Discrete-Time (DT) Signal

A DT signal $x[n]$ is defined only at integer indices $n \in \mathbb{Z}$. It typically arises by sampling a CT signal at uniform intervals: $x[n] = x(nT_s)$, where $T_s$ is the sampling period.

Energy and Power Signals

The total energy of a CT signal is

$$E_x = \int_{-\infty}^{\infty} |x(t)|^2 \, dt$$

and its average power is

$$P_x = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |x(t)|^2 \, dt$$

Energy Signal

A signal is an energy signal if $0 < E_x < \infty$ (and consequently $P_x = 0$). Example: a rectangular pulse of finite duration.

Power Signal

A signal is a power signal if $0 < P_x < \infty$ (and consequently $E_x = \infty$). Example: a sinusoid $x(t) = A\cos(\omega_0 t)$, which has $P_x = A^2/2$.

Some signals are neither energy nor power signals. For instance, $x(t) = t$ has $E_x = \infty$ and $P_x = \infty$.

For the discrete-time case the definitions are analogous:

$$E_x = \sum_{n=-\infty}^{\infty} |x[n]|^2, \qquad P_x = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{n=-N}^{N} |x[n]|^2$$
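These discrete-time definitions are easy to check numerically. In the sketch below (the pulse width and sinusoid are illustrative choices), a finite pulse has finite energy, while a sampled sinusoid with amplitude $A = 2$ has average power approaching $A^2/2 = 2$:

```python
import numpy as np

n = np.arange(-500, 501)

# Energy signal: a finite-duration rectangular pulse (21 samples of height 1).
pulse = np.where(np.abs(n) <= 10, 1.0, 0.0)
E_pulse = np.sum(np.abs(pulse) ** 2)            # exactly 21

# Power signal: a sampled sinusoid with amplitude A = 2.
sinus = 2.0 * np.cos(0.1 * np.pi * n)
P_sinus = np.sum(np.abs(sinus) ** 2) / len(n)   # approaches A^2/2 = 2

print(E_pulse, P_sinus)
```

Widening the window makes the power estimate converge to $A^2/2$ while the pulse energy stays fixed — the numerical signature of the energy/power distinction.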

1.2 Classification of Systems

A system $\mathcal{T}$ is any operation that transforms an input signal $x(t)$ into an output signal $y(t)$:

$$y(t) = \mathcal{T}\{x(t)\}$$

Linearity

Linear System

A system is linear if it obeys superposition. For any signals $x_1, x_2$ and scalars $a_1, a_2$:

$$\mathcal{T}\{a_1 x_1 + a_2 x_2\} = a_1 \mathcal{T}\{x_1\} + a_2 \mathcal{T}\{x_2\}$$

This combines both the additivity and homogeneity (scaling) properties.
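Superposition can be tested directly on sampled signals. In this illustrative sketch, a pure gain passes the test while a squarer fails it:

```python
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(8), rng.standard_normal(8)
a1, a2 = 2.0, -3.0

gain = lambda x: 3 * x          # linear: obeys superposition
squarer = lambda x: x ** 2      # nonlinear: violates it

for T in (gain, squarer):
    lhs = T(a1 * x1 + a2 * x2)       # T{a1 x1 + a2 x2}
    rhs = a1 * T(x1) + a2 * T(x2)    # a1 T{x1} + a2 T{x2}
    print(np.allclose(lhs, rhs))
# True for the gain, False for the squarer
```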

Time-Invariance

Time-Invariant System

A system is time-invariant (TI) if a time shift of the input produces an identical time shift of the output:

$$\text{If } y(t) = \mathcal{T}\{x(t)\}, \quad \text{then } \mathcal{T}\{x(t - t_0)\} = y(t - t_0) \quad \forall \, t_0.$$

LTI System Characterisation

A system that is both linear and time-invariant (LTI) is completely characterised by its impulse response $h(t) = \mathcal{T}\{\delta(t)\}$. The output for any input is given by the convolution integral:

$$y(t) = x(t) * h(t) = \int_{-\infty}^{\infty} x(\tau) \, h(t - \tau) \, d\tau$$

This is the single most important result of this chapter and the foundation upon which all linear systems analysis rests.

Additional System Properties

Causal

The output at time $t$ depends only on current and past input values. Equivalently, $h(t) = 0$ for $t < 0$.

Stable (BIBO)

Bounded input implies bounded output. For LTI systems this requires $\int_{-\infty}^{\infty} |h(t)| \, dt < \infty$.

Memoryless

The output at $t$ depends only on the input at time $t$. For LTI systems: $h(t) = k\delta(t)$.

Why LTI matters: In practice, many physical systems are approximately linear and time-invariant over their operating range. The LTI framework gives us convolution for time-domain analysis and transfer functions for frequency-domain analysis — an extraordinarily powerful toolkit.

1.3 The Convolution Integral

Derivation from Impulse Decomposition

The key insight is that any signal can be expressed as a superposition of shifted, scaled impulses. Consider a CT signal $x(t)$. We can write:

$$x(t) = \int_{-\infty}^{\infty} x(\tau) \, \delta(t - \tau) \, d\tau$$

This is the sifting property of the Dirac delta. Now apply the LTI system $\mathcal{T}$ to both sides:

$$y(t) = \mathcal{T}\{x(t)\} = \mathcal{T}\left\{\int_{-\infty}^{\infty} x(\tau)\,\delta(t-\tau)\,d\tau\right\}$$

By linearity, the operator passes through the integral and the scalar $x(\tau)$:

$$y(t) = \int_{-\infty}^{\infty} x(\tau) \, \mathcal{T}\{\delta(t-\tau)\} \, d\tau$$

By time-invariance, $\mathcal{T}\{\delta(t-\tau)\} = h(t-\tau)$, so:

Convolution Integral

$$y(t) = (x * h)(t) = \int_{-\infty}^{\infty} x(\tau) \, h(t - \tau) \, d\tau$$

Properties of Convolution

Property       | Statement
Commutativity  | $x * h = h * x$
Associativity  | $(x * h_1) * h_2 = x * (h_1 * h_2)$
Distributivity | $x * (h_1 + h_2) = x * h_1 + x * h_2$
Identity       | $x * \delta = x$
Time shift     | $x(t-t_0) * h(t) = (x*h)(t-t_0)$

The Convolution Theorem

Convolution Theorem

Convolution in the time domain corresponds to multiplication in the frequency domain:

$$y(t) = x(t) * h(t) \quad \Longleftrightarrow \quad Y(\omega) = X(\omega) \cdot H(\omega)$$

where $X(\omega) = \mathcal{F}\{x(t)\}$ and $H(\omega) = \mathcal{F}\{h(t)\}$ denote the Fourier transforms. This is computationally crucial: convolution costs $O(N^2)$ in the time domain but only $O(N \log N)$ via FFT-based multiplication.

Example: RC Low-Pass Filter

Consider a simple series RC circuit with input voltage $x(t)$ and output taken across the capacitor. The impulse response is:

$$h(t) = \frac{1}{\tau} e^{-t/\tau} u(t), \qquad \tau = RC$$

For a step input $x(t) = u(t)$, the output is:

$$y(t) = \int_0^t \frac{1}{\tau} e^{-(t-s)/\tau} ds = \left(1 - e^{-t/\tau}\right) u(t)$$

The output rises exponentially toward 1, reaching 63.2% at $t = \tau$. We verify this numerically below.

RC Circuit Step Response (Convolution in Action)

Numerical convolution of a step input with the RC impulse response, compared against the exact analytical solution.

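A minimal offline version of this comparison might look as follows (the grid spacing and $\tau$ are illustrative): the convolution is approximated by a Riemann sum and checked against the closed-form step response.

```python
import numpy as np

tau = 1.0
dt = 1e-3
t = np.arange(0, 5, dt)

x = np.ones_like(t)                        # step input u(t) on t >= 0
h = (1 / tau) * np.exp(-t / tau)           # RC impulse response

y_num = np.convolve(x, h)[: len(t)] * dt   # Riemann-sum approximation of x * h
y_exact = 1 - np.exp(-t / tau)             # analytical step response

print(np.max(np.abs(y_num - y_exact)))     # small discretisation error
print(y_num[np.searchsorted(t, tau)])      # ~0.632 at t = tau
```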

1.4 Convolution Worked Examples

Example 1 — Two Rectangular Pulses (Triangle)

Let $x(t) = \operatorname{rect}(t)$ and $h(t) = \operatorname{rect}(t)$. Both have unit width and height. Their convolution yields the triangle function:

$$(x * h)(t) = \operatorname{tri}(t) = \begin{cases} 1 - |t|, & |t| \le 1 \\ 0, & \text{otherwise} \end{cases}$$

This result is fundamental: it shows that convolving a pulse with itself smooths it, converting a discontinuous function into a continuous one.

Convolution of Rectangular Pulses

Convolving two rectangular pulses to produce a triangular pulse — the foundational convolution example.

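A discrete approximation of this example (grid spacing illustrative) confirms that the numerical convolution of two unit rects matches $\operatorname{tri}(t)$:

```python
import numpy as np

dt = 1e-3
t = np.arange(-1.5, 1.5, dt)
rect = np.where(np.abs(t) <= 0.5, 1.0, 0.0)

# Approximate the CT convolution with a scaled discrete convolution.
y = np.convolve(rect, rect, mode="same") * dt
tri = np.maximum(1 - np.abs(t), 0.0)

print(np.max(np.abs(y - tri)))   # small discretisation error
```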

Example 2 — Exponential with Step (Capacitor Charging)

Let $x(t) = u(t)$ and $h(t) = e^{-\alpha t} u(t)$ with $\alpha > 0$. Then:

$$(x * h)(t) = \int_0^t e^{-\alpha \tau} d\tau = \frac{1}{\alpha}\left(1 - e^{-\alpha t}\right) u(t)$$

This is exactly the capacitor charging curve. The output approaches $1/\alpha$ as $t \to \infty$, reaching 63.2% of its final value at $t = 1/\alpha$.

Example 3 — Gaussian * Gaussian (Central Limit Theorem)

Let $g_{\sigma}(t) = \frac{1}{\sigma\sqrt{2\pi}} e^{-t^2/(2\sigma^2)}$. The convolution of two Gaussians is another Gaussian:

$$g_{\sigma_1} * g_{\sigma_2} = g_{\sqrt{\sigma_1^2 + \sigma_2^2}}$$

The variances add. This is the continuous analogue of the Central Limit Theorem: repeatedly convolving any distribution with itself produces a Gaussian in the limit. It also explains why cascading Gaussian blurs in image processing yields a single wider Gaussian blur.
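The variance-addition rule can be verified numerically; the $\sigma$ values below are illustrative:

```python
import numpy as np

def gauss(t, sigma):
    return np.exp(-t**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

dt = 1e-3
t = np.arange(-10, 10, dt)
s1, s2 = 0.6, 0.8

# Discrete approximation of g_{s1} * g_{s2}.
y = np.convolve(gauss(t, s1), gauss(t, s2), mode="same") * dt
expected = gauss(t, np.hypot(s1, s2))   # sigma = sqrt(0.36 + 0.64) = 1

print(np.max(np.abs(y - expected)))     # small discretisation error
```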

Example 4 — Moving Average Filter (Discrete)

The discrete-time $M$-point moving average filter has impulse response:

$$h[n] = \frac{1}{M} \sum_{k=0}^{M-1} \delta[n-k] = \begin{cases} 1/M, & 0 \le n \le M-1 \\ 0, & \text{otherwise} \end{cases}$$

The output is the running average:

$$y[n] = (x * h)[n] = \frac{1}{M} \sum_{k=0}^{M-1} x[n-k]$$

This is a low-pass filter. It is widely used to smooth noisy data (e.g., stock prices, sensor readings). Its frequency response is a sinc-like function with nulls at multiples of $2\pi/M$.
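A short numerical sketch (filter length and input are illustrative) confirms both the running-average identity and the white-noise variance reduction by roughly a factor of $M$:

```python
import numpy as np

M = 5
h = np.ones(M) / M                       # h[n] = 1/M for 0 <= n <= M-1

rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)          # white-noise input

y = np.convolve(x, h)[: len(x)]          # y[n] = average of x[n-M+1..n]

# Spot-check the running-average identity at one index.
n0 = 100
print(np.isclose(y[n0], x[n0 - M + 1 : n0 + 1].mean()))   # True

# White-noise variance drops by roughly a factor of M.
print(np.var(x) / np.var(y[M:]))         # close to M = 5
```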

Example 5 — Cross-Correlation as Convolution (Radar / Sonar)

The cross-correlation of $x$ and $h$ is defined as:

$$R_{xh}(t) = \int_{-\infty}^{\infty} x(\tau) \, h(\tau + t) \, d\tau = x(-t) * h(t)$$

That is, correlation is convolution with the time-reversed version of one signal. In radar and sonar, a known pulse $x(t)$ is transmitted and the received signal $h(t)$ is cross-correlated with $x(t)$ to detect echoes. The peak of $R_{xh}(t)$ reveals the round-trip delay, hence the target range.

Matched filtering — the optimal detector in additive white Gaussian noise — is precisely this correlation operation.
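A sketch of this delay estimation (pulse shape, delay, and noise level are all illustrative): the received signal is correlated with the transmitted pulse by convolving with its time-reversed copy, and the correlation peak locates the echo.

```python
import numpy as np

rng = np.random.default_rng(2)
fs = 1000                                  # illustrative sample rate (Hz)
t = np.arange(0, 0.1, 1 / fs)
pulse = np.sin(2 * np.pi * (50 + 2000 * t) * t)   # 100-sample chirp

true_delay = 300                           # echo delay in samples
received = np.zeros(1000)
received[true_delay : true_delay + len(pulse)] += 0.5 * pulse
received += 0.2 * rng.standard_normal(len(received))

# Cross-correlation via convolution with the time-reversed pulse.
corr = np.convolve(received, pulse[::-1])
est_delay = np.argmax(corr) - (len(pulse) - 1)
print(est_delay)                           # peaks at the true delay (300)
```

The chirp is chosen because its autocorrelation has a sharp mainlobe, which is exactly why radar systems favour such pulses.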

1.5 Deconvolution

Convolution blurs and mixes. Deconvolution is the inverse problem: given the observed output $y(t) = x(t) * h(t) + n(t)$ and (possibly) the kernel $h(t)$, recover the original signal $x(t)$.

Inverse Filtering

In the frequency domain the forward model is $Y(\omega) = X(\omega)H(\omega) + N(\omega)$. The naive inverse filter is:

$$\hat{X}(\omega) = \frac{Y(\omega)}{H(\omega)}$$

This fails catastrophically whenever $H(\omega) \approx 0$, because the noise term $N(\omega)/H(\omega)$ is massively amplified. This is the fundamental ill-posedness of deconvolution.

Wiener Deconvolution

Wiener Filter

The Wiener filter minimises the mean-square error $\mathbb{E}\left[|x(t) - \hat{x}(t)|^2\right]$. Its transfer function is:

$$G(\omega) = \frac{H^*(\omega)}{|H(\omega)|^2 + \frac{S_{nn}(\omega)}{S_{xx}(\omega)}}$$

where $S_{nn}$ and $S_{xx}$ are the power spectral densities of noise and signal respectively. When the SNR is high ($|H|^2 \gg S_{nn}/S_{xx}$), the filter approximates $1/H$. When the SNR is low, the filter attenuates those frequency components instead of amplifying noise.

In practice, the ratio $S_{nn}/S_{xx}$ is often unknown and is replaced by a single regularisation parameter $\lambda$:

$$G(\omega) = \frac{H^*(\omega)}{|H(\omega)|^2 + \lambda}$$

Wiener Deconvolution Demo

Recovering a spike train from a blurred and noisy observation using the Wiener filter.


Applications of Deconvolution

Seismology

A seismic trace is the convolution of the source wavelet with the Earth's reflectivity series. Deconvolution recovers the reflectivity — a map of subsurface layer boundaries — essential for oil and gas exploration.

Astronomy (Richardson-Lucy)

Telescope images are blurred by atmospheric turbulence and the point spread function (PSF) of the optics. The Richardson-Lucy algorithm — an iterative maximum-likelihood deconvolution — sharpens images while preserving positivity (photon counts cannot be negative).

Spectroscopy

Spectral line shapes are broadened by the instrument line function (ILF). Deconvolution with the known ILF recovers the true spectral lines, improving resolution and enabling accurate identification of chemical species.

Medical Imaging

In MRI and CT, the measured signal is a blurred version of the tissue density map. Deconvolution (often combined with regularisation) is used in perfusion imaging, diffusion tensor imaging, and super-resolution reconstruction.

1.6 Regularisation Methods

Deconvolution is an ill-posed inverse problem: small perturbations (noise) in the data can cause large changes in the solution. Regularisation stabilises the inversion by incorporating prior information about the expected solution.

Tikhonov Regularisation

Tikhonov Regularisation

Instead of minimising $\|y - Hx\|^2$ alone, we minimise:

$$\hat{x} = \arg\min_x \left\{ \|y - Hx\|^2 + \lambda \|\Gamma x\|^2 \right\}$$

where $\lambda > 0$ is the regularisation parameter and $\Gamma$ is the Tikhonov matrix (often $\Gamma = I$ for zeroth-order, or a derivative operator for higher-order regularisation). The closed-form solution is:

$$\hat{x} = \left(H^H H + \lambda \Gamma^H \Gamma\right)^{-1} H^H y$$

In the frequency domain with $\Gamma = I$ this reduces to the parametric Wiener filter: $\hat{X}(\omega) = \frac{H^*(\omega)}{|H(\omega)|^2 + \lambda} Y(\omega)$.

Choosing $\lambda$: Too small and the solution is dominated by noise; too large and the solution is over-smoothed. Common selection methods include the L-curve criterion, generalised cross-validation (GCV), and the discrepancy principle (choosing $\lambda$ so that the residual norm matches the expected noise level).
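Once the blur is written as a matrix, the closed-form solution is a single linear solve. This sketch (kernel, noise level, and $\lambda$ are illustrative) builds a Gaussian blurring matrix and compares the zeroth-order Tikhonov estimate with the unregularised solve:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 64

# Blurring matrix H: each row applies a shifted, truncated Gaussian kernel.
offsets = np.arange(-6, 7)
kernel = np.exp(-0.5 * (offsets / 1.5) ** 2)
kernel /= kernel.sum()
H = np.zeros((N, N))
for i in range(N):
    for j, w in zip(i + offsets, kernel):
        if 0 <= j < N:
            H[i, j] = w

x = np.zeros(N)
x[[15, 40]] = 1.0                                 # two spikes
y = H @ x + 0.02 * rng.standard_normal(N)

lam = 1e-3
x_tik = np.linalg.solve(H.T @ H + lam * np.eye(N), H.T @ y)   # Gamma = I
x_naive = np.linalg.solve(H, y)                               # lam = 0

print(np.linalg.norm(x_tik - x))    # stabilised estimate
print(np.linalg.norm(x_naive - x))  # dominated by amplified noise
```

Sweeping `lam` over several decades and plotting both error norms reproduces the trade-off described above: too small is noisy, too large is over-smoothed.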

Truncated SVD

Truncated Singular Value Decomposition

Write the system matrix as $H = U \Sigma V^H$ via the SVD. The pseudo-inverse is $H^+ = V \Sigma^{-1} U^H$, but small singular values $\sigma_i$ cause instability. In truncated SVD (TSVD) we keep only the $k$ largest singular values:

$$\hat{x}_k = \sum_{i=1}^{k} \frac{u_i^H y}{\sigma_i} v_i$$

The truncation index $k$ plays the role of the regularisation parameter. Components corresponding to small singular values (high-frequency noise amplification) are discarded entirely.

TSVD is especially popular in numerical linear algebra and geophysics. Compared to Tikhonov, it provides a sharper cutoff — components are either kept fully or discarded — rather than a smooth damping.

Connection between methods: Both Tikhonov and TSVD can be understood through the lens of spectral filtering. Let $\phi_i$ be the filter factor for the $i$-th singular component. For Tikhonov, $\phi_i = \sigma_i^2 / (\sigma_i^2 + \lambda)$ (a smooth sigmoid); for TSVD, $\phi_i = 1$ if $i \le k$ and 0 otherwise (a hard cutoff).
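The filter-factor view can be exercised on a small example (the Gaussian blur matrix and truncation indices are illustrative): a moderate truncation index balances bias against noise, while keeping every component reduces TSVD to the unstable full inverse.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 64

# A Gaussian blurring matrix as an illustrative ill-conditioned system.
offsets = np.arange(-6, 7)
kernel = np.exp(-0.5 * (offsets / 1.5) ** 2)
kernel /= kernel.sum()
H = np.zeros((N, N))
for i in range(N):
    for j, w in zip(i + offsets, kernel):
        if 0 <= j < N:
            H[i, j] = w

x = np.zeros(N)
x[[15, 40]] = 1.0
y = H @ x + 0.02 * rng.standard_normal(N)

U, s, Vt = np.linalg.svd(H)
coeffs = U.T @ y                                  # u_i^H y

for k in (10, 30, N):
    x_k = Vt[:k].T @ (coeffs[:k] / s[:k])         # TSVD with k components
    print(k, np.linalg.norm(x_k - x))
# Keeping all N components admits the small-sigma noise amplification.
```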

1.7 Fundamental Signals

The following signals form the building blocks of signal theory. They recur in every subsequent chapter.

Signal              | Notation                 | Definition                          | Key Property
Dirac impulse       | $\delta(t)$              | $\int f(t)\delta(t-a)\,dt = f(a)$   | Identity for convolution
Unit step           | $u(t)$                   | $1$ for $t > 0$, $0$ for $t < 0$    | $u'(t) = \delta(t)$
Complex exponential | $e^{j\omega_0 t}$        | $\cos\omega_0 t + j\sin\omega_0 t$  | Eigenfunction of LTI systems
Sinc                | $\operatorname{sinc}(t)$ | $\frac{\sin(\pi t)}{\pi t}$         | $\mathcal{F}\{\operatorname{rect}\} = \operatorname{sinc}$
Rectangle           | $\operatorname{rect}(t)$ | $1$ for $|t| \le \tfrac{1}{2}$, $0$ else | Ideal frequency window
Triangle            | $\operatorname{tri}(t)$  | $\max(1-|t|,\;0)$                   | $\operatorname{rect} * \operatorname{rect} = \operatorname{tri}$

Plot Fundamental Signals

Visualise the unit step, sinc, rect, and decaying exponential — the building blocks of signal theory.


The complex exponential as eigenfunction: If $x(t) = e^{j\omega_0 t}$ is passed through an LTI system with impulse response $h(t)$, the output is $y(t) = H(\omega_0) e^{j\omega_0 t}$, where $H(\omega_0) = \int h(\tau) e^{-j\omega_0 \tau} d\tau$ is the transfer function evaluated at $\omega_0$. The complex exponential passes through unchanged in shape — only scaled and phase-shifted. This eigenfunction property is the reason the Fourier transform is so central to LTI system analysis.
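This property is easy to check numerically. In the sketch below ($\alpha$, $\omega_0$, and the grid are illustrative), the steady-state ratio $y(t)/x(t)$ settles to the analytical $H(\omega_0)$:

```python
import numpy as np

dt = 1e-3
t = np.arange(0, 20, dt)
alpha, w0 = 2.0, 5.0

h = alpha * np.exp(-alpha * t)                # decaying-exponential impulse response
x = np.exp(1j * w0 * t)                       # complex exponential input

y = np.convolve(x, h)[: len(t)] * dt          # LTI output (transient decays quickly)
H_w0 = alpha / (alpha + 1j * w0)              # analytical H(omega_0) for this h

steady = y[len(t) // 2 :] / x[len(t) // 2 :]  # ratio settles to the constant H(w0)
print(np.allclose(steady, H_w0, atol=5e-3))   # True
```

Only the complex scale factor $H(\omega_0)$ distinguishes output from input — exactly the eigenfunction property stated above.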

Chapter Summary

  • Signals are functions carrying information; they can be continuous-time or discrete-time, energy or power.
  • A system is LTI if it is both linear and time-invariant. Such a system is fully characterised by its impulse response $h(t)$.
  • The output of an LTI system is the convolution $y = x * h$, which can be computed efficiently in the frequency domain as $Y = X \cdot H$.
  • Deconvolution inverts the blurring process but is ill-posed in the presence of noise.
  • The Wiener filter gives the optimal (MSE-sense) linear deconvolution by balancing inversion accuracy against noise amplification.
  • Tikhonov regularisation and truncated SVD are general-purpose stabilisation techniques for ill-posed problems.
  • The fundamental signals — $\delta$, $u$, $e^{j\omega t}$, sinc, rect, tri — are the vocabulary of signal processing.