Module 5

Climate Models & GCMs

From primitive equations to petascale simulations of Earth's future

5.1 The Primitive Equations

General circulation models (GCMs) solve the primitive equations, which are the full Navier-Stokes equations on a rotating sphere simplified by the hydrostatic approximation (valid for large-scale flow where horizontal scales \(\gg\) vertical scales). The system consists of five coupled equations:

Horizontal Momentum (Zonal & Meridional)

\( \frac{Du}{Dt} - fv + \frac{1}{\rho}\frac{\partial p}{\partial x} = \mathcal{F}_x \)

\( \frac{Dv}{Dt} + fu + \frac{1}{\rho}\frac{\partial p}{\partial y} = \mathcal{F}_y \)

where \(D/Dt = \partial/\partial t + u\partial/\partial x + v\partial/\partial y + \omega\partial/\partial p\)is the material derivative, \(f = 2\Omega\sin\phi\) is the Coriolis parameter, and\(\mathcal{F}\) represents subgrid-scale friction and turbulent mixing.

Hydrostatic Equation

\( \frac{\partial p}{\partial z} = -\rho g \quad \Longleftrightarrow \quad \frac{\partial \Phi}{\partial p} = -\frac{RT}{p} \)

The hydrostatic balance replaces the vertical momentum equation, filtering out sound waves and allowing much larger time steps. This is valid for horizontal scales \(\gtrsim\) 10 km.

Continuity Equation

\( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial \omega}{\partial p} = 0 \)

In pressure coordinates, the continuity equation takes this simple divergence-free form (mass conservation). Here \(\omega = Dp/Dt\) is the pressure vertical velocity.

Thermodynamic Energy Equation

\( c_p \frac{DT}{Dt} - \frac{RT\omega}{p} = Q_{\text{rad}} + Q_{\text{latent}} + Q_{\text{diff}} \)

The first law of thermodynamics, where the left-hand side gives the adiabatic temperature change (including work done by expansion), and the right-hand side contains radiative heating, latent heat release from condensation/precipitation, and diffusive mixing.

Moisture Conservation

\( \frac{Dq}{Dt} = E - C + \mathcal{D}_q \)

where \(q\) is specific humidity, \(E\) is evaporation source,\(C\) is condensation sink, and \(\mathcal{D}_q\) is turbulent moisture diffusion. This equation is critical because water vapour is both the dominant greenhouse gas and the source of latent heating.

5.2 Discretisation Methods

The continuous equations must be discretised for numerical solution. Four major approaches are used in modern climate models:

Finite Difference (GFDL FMS, CESM)

Approximate derivatives on a lat-lon grid: \(\partial u/\partial x \approx (u_{i+1} - u_{i-1})/(2\Delta x)\). Simple to implement but suffers from the pole problem (converging meridians create very small grid cells near the poles).

Spectral (ECMWF IFS, older NCAR models)

Expand fields in spherical harmonics: \(\psi(\lambda,\phi) = \sum_{m,n} \psi_n^m Y_n^m(\lambda,\phi)\). Differentiation becomes multiplication in spectral space. Truncation at wavenumber \(N\) gives resolution \(\sim 180°/N\). T255 \(\approx\) 80 km.

Finite Volume (GFDL FV3, MPAS)

Conserves mass, momentum, and energy by integrating fluxes through cell boundaries. MPAS uses a Voronoi mesh on an icosahedral grid, avoiding the pole problem. FV3 uses a cubed-sphere grid.

Finite Element (FESOM, ICON)

Unstructured triangular meshes that can refine locally (e.g., higher resolution near coastlines). Combines flexibility of finite volumes with higher-order accuracy.

CFL Stability Condition

All explicit time-stepping schemes must satisfy the Courant-Friedrichs-Lewy (CFL)condition for numerical stability. For a wave with phase speed \(c\):

\( \text{CFL} = \frac{c \cdot \Delta t}{\Delta x} \leq 1 \quad \Longrightarrow \quad \Delta t \leq \frac{\Delta x}{c} \)

For climate models, the fastest wave is typically the external gravity wave with\(c \approx \sqrt{gH} \approx 300\) m/s. At \(\Delta x = 100\) km, this gives \(\Delta t \leq 333\) s (\(\sim\)5 min). At convection-permitting resolution (\(\Delta x = 3\) km), \(\Delta t \leq 10\) s, making these simulations enormously expensive.

5.3 Subgrid Parameterisation

Processes occurring below the grid scale (\(\lesssim \Delta x\)) cannot be resolved explicitly and must be parameterised: expressed in terms of resolved-scale variables. This is the largest source of uncertainty in climate projections.

Convection

Individual cumulus clouds (\(\sim\)1\(\text{--}\)10 km) are unresolved at typical GCM resolution. Mass-flux schemes (e.g., Zhang-McFarlane, Tiedtke) represent the collective effect of a population of clouds in each grid cell.

Cloud Microphysics

Processes like droplet formation, ice nucleation, coalescence, and precipitation occur at \(\mu\)m\(\text{--}\)mm scales. Bulk schemes track mass mixing ratios of cloud water, ice, rain, snow, and graupel.

Radiation

Radiative transfer through the atmosphere is computed column-by-column using correlated-k methods (e.g., RRTMGP). Cloud-radiation interaction (overlap assumptions) remains a major source of uncertainty.

Boundary Layer

Turbulent mixing in the lowest 1\(\text{--}\)2 km is parameterised using eddy diffusivity or mass-flux approaches. Surface exchange coefficients (\(C_k\), \(C_d\) as in Module 3) are critical for air-sea interaction.

Gravity Wave Drag

Mountain waves and convectively generated gravity waves transport momentum vertically. Unresolved orographic drag decelerates the flow, affecting jet stream position and blocking frequency.

Convective Adjustment Scheme

The simplest convection scheme: if the lapse rate exceeds the moist adiabatic rate, the column is unstable and is adjusted to neutral buoyancy. The buoyancy of a parcel is:

\( B = g\frac{\theta_v' - \bar{\theta}_v}{\bar{\theta}_v} \)

where \(\theta_v\) is the virtual potential temperature. When \(B > 0\)(unstable), the scheme redistributes heat and moisture vertically until \(B = 0\)(neutral), mimicking the effect of convective overturning. The heating rate from convective adjustment is:

\( \frac{\partial T}{\partial t}\bigg|_{\text{conv}} = \frac{T - T_{\text{adj}}}{\tau_{\text{conv}}} \)

where \(\tau_{\text{conv}} \sim 1\text{--}4\) hours is the convective relaxation timescale.

5.4 CMIP6 & Model Intercomparison

The Coupled Model Intercomparison Project Phase 6 (CMIP6)coordinates climate modelling centres worldwide to run standardised experiments with shared forcing scenarios (SSPs). CMIP6 includes \(\sim\)50 models from \(\sim\)30 centres.

Shared Socioeconomic Pathways (SSPs)

Scenario
Forcing (2100)
Warming (2100)
SSP1-1.9
1.9 W/m\(^2\)
1.0\(\text{--}\)1.8°C
SSP1-2.6
2.6 W/m\(^2\)
1.3\(\text{--}\)2.4°C
SSP2-4.5
4.5 W/m\(^2\)
2.1\(\text{--}\)3.5°C
SSP3-7.0
7.0 W/m\(^2\)
2.8\(\text{--}\)4.6°C
SSP5-8.5
8.5 W/m\(^2\)
3.3\(\text{--}\)5.7°C

Multi-Model Ensembles

Model agreement is a key metric for confidence. When >80% of models agree on the sign of a change (e.g., precipitation increase), confidence is high. The multi-model mean often outperforms individual models because random errors partially cancel. However, structural uncertainties (shared biases) do not cancel. As discussed in ourClimate & Biodiversity course, model spread is particularly large for regional precipitation projections that drive ecosystem impacts.

5.5 Why Fortran Dominates Climate Modelling

Nearly all production climate models (CESM, GFDL, ECMWF IFS, UKESM, MPAS) are written in Fortran. This dominance persists because:

  • Array operations: Fortran's native multi-dimensional array support maps directly to stencil computations on grids
  • Compiler optimisation: Fortran's strict aliasing rules allow aggressive optimisation (vectorisation, loop unrolling)
  • Parallelism: MPI + OpenMP + coarray Fortran enable efficient scaling to \(\sim 10^5\) cores
  • Legacy code: decades of validated physics packages that would be costly and risky to rewrite
  • Numerical precision: built-in support for various floating-point precisions critical for conservation properties

5.6 Model Evaluation & Taylor Diagrams

Climate models are evaluated against observations using several metrics:

Root Mean Square Error (RMSE)

\( \text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^N (m_i - o_i)^2} \) where \(m_i\) and \(o_i\) are model and observation at point \(i\)

Bias

\( \text{Bias} = \bar{m} - \bar{o} \) — systematic offset between model mean and observed mean

Correlation

\( R = \frac{\sum(m_i - \bar{m})(o_i - \bar{o})}{\sqrt{\sum(m_i - \bar{m})^2 \sum(o_i - \bar{o})^2}} \) — pattern correlation measures spatial structure

Taylor Diagram Geometry

The Taylor diagram (Taylor, 2001) compactly displays three statistics on a single plot using the mathematical identity:

\( \text{RMSE}'^2 = \sigma_m^2 + \sigma_o^2 - 2\sigma_m \sigma_o R \)

where RMSE\('\) is the centred (bias-removed) RMSE, \(\sigma_m\) and\(\sigma_o\) are the standard deviations of model and observations, and \(R\)is the correlation coefficient. This has the form of the law of cosines:

\( c^2 = a^2 + b^2 - 2ab\cos\theta \)

with \(a = \sigma_m\), \(b = \sigma_o\), \(\theta = \arccos(R)\), and \(c = \text{RMSE}'\). This means we can place \(\sigma_m\) as the radial distance, \(R\) as the azimuthal angle, and read RMSE\('\) as the distance from the observation point. A perfect model sits exactly on the observation point.

5.7 Resolution Hierarchy

Climate modelling spans a vast range of resolutions, each suited to different questions:

CMIP6 GCMs: ~100 km

Multi-century projections, parameterised convection. Grid cells cover ~\(10^4\) km\(^2\). Cannot resolve individual storms but captures large-scale circulation.

Regional models / high-res GCMs: ~25 km

Better representation of orography, coastlines, tropical cyclones (see Module 3). Still requires convective parameterisation.

Convection-permitting: ~1\(\text{--}\)4 km

Explicitly resolves deep convective cells. Critical for extreme precipitation (see Module 4) and urban flooding. Typical runs cover small domains for ~10 year periods.

Large Eddy Simulation (LES): ~10\(\text{--}\)100 m

Resolves turbulent eddies in the boundary layer and individual cloud dynamics. Used for process studies and developing parameterisations for coarser models.

Computational cost scales as \(\propto (\Delta x)^{-3}\) in 3D (doubling resolution requires \(2^3 = 8\times\) more grid points) and \(\propto (\Delta x)^{-4}\)when the CFL condition reduces \(\Delta t\) proportionally. Running a 100-year CMIP6 simulation at convection-permitting resolution would require\(\sim (100/3)^4 \approx 10^6\) times more compute than current practice.

5.8 Ocean Component Models

The ocean component solves similar primitive equations but with additional complexities: the equation of state is nonlinear (UNESCO or TEOS-10), requiring both temperature and salinity as prognostic variables. The Boussinesq approximation is typically applied:

\( \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u}\cdot\nabla\mathbf{u} + f\hat{k}\times\mathbf{u} = -\frac{1}{\rho_0}\nabla p + \frac{\rho'}{\rho_0}g\hat{k} + \mathcal{F} \)

\( \frac{\partial T}{\partial t} + \mathbf{u}\cdot\nabla T = \nabla\cdot(K\nabla T) + Q_T \)

\( \frac{\partial S}{\partial t} + \mathbf{u}\cdot\nabla S = \nabla\cdot(K\nabla S) + Q_S \)

Major ocean models include MOM6 (GFDL), NEMO (European), POP (CESM), and MPAS-Ocean. A critical parameterisation is the Gent-McWilliams scheme for mesoscale eddies (\(\sim\)10\(\text{--}\)100 km), which are unresolved at typical 1° ocean resolution. These eddies transport heat poleward and mix tracers, playing a key role in Southern Ocean dynamics and the overturning circulation (seeModule 2).

5.9 Data Assimilation & Reanalysis

Data assimilation optimally combines model predictions with observations to produce the best estimate of the atmospheric state. The most widely used approach is 4D-Var (four-dimensional variational assimilation), which minimises a cost function:

\( J(\mathbf{x}_0) = \frac{1}{2}(\mathbf{x}_0 - \mathbf{x}_b)^T \mathbf{B}^{-1}(\mathbf{x}_0 - \mathbf{x}_b) + \frac{1}{2}\sum_i (\mathbf{y}_i - H_i[\mathbf{x}_i])^T \mathbf{R}_i^{-1}(\mathbf{y}_i - H_i[\mathbf{x}_i]) \)

where \(\mathbf{x}_0\) is the initial state, \(\mathbf{x}_b\) is the background (model forecast), \(\mathbf{B}\) is the background error covariance, \(\mathbf{y}_i\) are observations at time \(t_i\),\(H_i\) is the observation operator, and \(\mathbf{R}_i\) is the observation error covariance.

Reanalysis products (ERA5, JRA-55, MERRA-2) apply data assimilation retrospectively to produce gridded, physically consistent records of the atmosphere spanning decades. ERA5 provides hourly fields at 31 km resolution from 1940 to present, ingesting millions of observations from radiosondes, satellites, aircraft, and surface stations. These products are essential for evaluating climate models and studying extreme events (see Module 4).

Climate Model Architecture

A coupled Earth System Model consists of multiple component models exchanging fluxes through a coupler:

Coupled Earth System Model ArchitectureCOUPLER(OASIS, MCT, ESMF)ATMOSPHEREDynamics + PhysicsRadiation | Convection | BL | CloudsGravity wave drag | ChemistryT, q, u, vSST, iceOCEAN3D circulation (MOM6, NEMO)Thermohaline | Mesoscale eddiesBiogeochemistry (optional)SST, currentsHeat, freshwater fluxSEA ICECICE, SI3Thermodynamics | RheologyAlbedo feedbackIce frac, albedoSW, wind stressLAND SURFACECLM, JSBACH, JULESSoil moisture | VegetationCarbon cycle | HydrologyAlbedo, ET, CO2Precip, radiationICE SHEETSBISICLES, Elmer/IceGlen's flow law | CalvingGrounding line dynamicsFreshwater, topographySMB, ocean heatTypical resolution: Atm ~100 km | Ocean ~50 kmTime step: ~30 min (atm) | ~1 hr (ocean) | coupled every ~1 hr

Simulation: 1D Radiative-Convective Equilibrium

A minimal climate model: iterate radiative heating and convective adjustment to find the equilibrium temperature profile. This captures the essential physics of Earth's mean temperature structure:

1D Radiative-Convective Equilibrium Model

Python

Iterating to steady state: grey-atmosphere radiation + convective adjustment

script.py185 lines

Click Run to execute the Python code

Code will be executed with Python 3 on the server

Simulation: CFL Condition & Numerical Stability

CFL Condition Demonstration

Python

Showing numerical instability when CFL > 1 and stability when CFL < 1

script.py148 lines

Click Run to execute the Python code

Code will be executed with Python 3 on the server

Simulation: Spectral Methods in Climate Models

Spectral methods expand fields in basis functions (Fourier modes or spherical harmonics). Differentiation in spectral space is exact (just multiplication by wavenumber), eliminating truncation error from finite differences:

\( f(x) = \sum_{k=-N/2}^{N/2} \hat{f}_k e^{ikx} \quad \Longrightarrow \quad \frac{df}{dx} = \sum_k ik\hat{f}_k e^{ikx} \)

Spectral vs Finite Difference Accuracy

Python

Comparing derivative accuracy: spectral (FFT) vs finite difference on a periodic domain

script.py118 lines

Click Run to execute the Python code

Code will be executed with Python 3 on the server

Simulation: Taylor Diagram for Model Evaluation

Taylor Diagram: Comparing Climate Models to Observations

Python

Visualising correlation, standard deviation ratio, and centred RMSE for multiple models

script.py124 lines

Click Run to execute the Python code

Code will be executed with Python 3 on the server

5.8 Emergent Constraints & Climate Sensitivity

Emergent constraints use observable relationships in the present climate to narrow projections of future change. The idea is to find a quantity\(X\) that can be observed and a future projection \(Y\) such that across the model ensemble, \(X\) and \(Y\) are strongly correlated:

\( Y = \alpha + \beta X + \epsilon \)

The observed value \(X_{\text{obs}}\) then constrains \(Y\)through the regression. Key examples include:

  • Equilibrium Climate Sensitivity (ECS): constrained by subtropical low-cloud feedback, seasonal cycle amplitude, and ENSO variability. IPCC AR6 assessed ECS = 2.5\(\text{--}\)4.0°C (likely range)
  • Arctic amplification: constrained by observed seasonal sea ice retreat
  • Land carbon sink: constrained by present-day CO\(_2\) seasonal cycle amplitude
  • Tropical precipitation: constrained by present-day interannual variability

The formal definition of ECS itself comes from the energy balance:

\( \text{ECS} = \frac{\Delta F_{2\times\text{CO}_2}}{\lambda} = \frac{3.7\;\text{W/m}^2}{\lambda} \)

where \(\lambda\) is the total climate feedback parameter (W/m\(^2\)/K) and \(\Delta F_{2\times\text{CO}_2} \approx 3.7\) W/m\(^2\) is the radiative forcing from CO\(_2\) doubling. The feedback parameter decomposes as:

\( \lambda = \lambda_{\text{Planck}} + \lambda_{\text{WV}} + \lambda_{\text{LR}} + \lambda_{\text{albedo}} + \lambda_{\text{cloud}} \)

The Planck response (\(\lambda_{\text{Planck}} \approx -3.2\) W/m\(^2\)/K) is the stabilising blackbody feedback. Water vapour (\(\sim +1.8\)), lapse rate (\(\sim -0.6\)), and albedo (\(\sim +0.4\)) are well constrained. The cloud feedback (\(\sim +0.5 \pm 0.4\)) remains the dominant source of uncertainty in ECS, hence the importance of cloud parameterisation discussed in Section 5.3.

5.9 Machine Learning in Climate Modelling

Machine learning is increasingly used to augment traditional climate modelling:

  • ML parameterisation: neural networks trained on high-resolution simulations to replace or augment physics-based parameterisations of convection, radiation, and turbulence. Achieved 10\(\text{--}\)100\(\times\) speedup over traditional schemes
  • Weather foundation models: Pangu-Weather, GraphCast, FourCastNet produce 10-day forecasts in seconds rather than hours, competitive with ECMWF IFS
  • Downscaling: super-resolution networks map coarse GCM output to high-resolution regional climate, replacing expensive dynamical downscaling
  • Emulators: surrogate models that approximate the full GCM response surface, enabling rapid exploration of parameter uncertainty
  • Bias correction: ML learns systematic errors in model output and corrects them using observational constraints

Challenges remain: ML models lack physical conservation guarantees (mass, energy, momentum), may fail under out-of-distribution scenarios (novel climate states), and their black-box nature makes physical interpretation difficult. Hybrid approaches embedding physical constraints into neural network architectures (physics-informed neural networks, equivariant architectures) are an active research frontier.

A particularly promising direction is using ML to accelerate the most expensive components of climate models. Radiation calculations, which can consume 30\(\text{--}\)50% of total compute, have been successfully emulated with neural networks achieving\(R^2 > 0.999\) and \(100\times\) speedup. Similarly, ML-based ocean mixing parameterisations trained on high-resolution simulations show improved fidelity compared to traditional schemes, with implications for long-term climate drift in coupled models.

5.10 Computational Requirements & Exascale

Climate modelling is among the most computationally demanding scientific applications. Representative costs for a 100-year simulation:

Resolution
Grid Points
Core-Hours (100 yr)
100 km
\(\sim 10^5\)
\(\sim 10^5\)
25 km
\(\sim 10^7\)
\(\sim 10^7\)
3 km
\(\sim 10^9\)
\(\sim 10^{11}\)
100 m (LES)
\(\sim 10^{12}\)
\(\sim 10^{16}\)

The exascale computing era (\(10^{18}\) FLOPS) enables the first global convection-permitting simulations covering decades. Projects like DYAMOND and nextGEMS have demonstrated 2.5\(\text{--}\)5 km global simulations for multi-month periods, revealing storm-resolving features impossible at CMIP6 resolution. GPU-accelerated climate codes are achieving order-of-magnitude speedups, potentially making km-scale century-length projections feasible within the next decade.

References

Washington, W. M. & Parkinson, C. L. (2005). An Introduction to Three-Dimensional Climate Modeling. University Science Books, 2nd edition.

Eyring, V., et al. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6). Geoscientific Model Development, 9(5), 1937-1958.

Taylor, K. E. (2001). Summarizing multiple aspects of model performance in a single diagram. Journal of Geophysical Research, 106(D7), 7183-7192.

Courant, R., Friedrichs, K., & Lewy, H. (1928). Über die partiellen Differenzengleichungen der mathematischen Physik. Mathematische Annalen, 100(1), 32-74.

Randall, D. A. (2015). An Introduction to the Global Circulation of the Atmosphere. Princeton University Press.

Satoh, M. (2014). Atmospheric Circulation Dynamics and General Circulation Models. Springer, 2nd edition.

O'Neill, B. C., et al. (2016). The Scenario Model Intercomparison Project (ScenarioMIP) for CMIP6. Geoscientific Model Development, 9(9), 3461-3482.

Palmer, T. N. & Stevens, B. (2019). The scientific challenge of understanding and estimating climate change. Proceedings of the National Academy of Sciences, 116(49), 24390-24395.

Sherwood, S. C., et al. (2020). An assessment of Earth's climate sensitivity using multiple lines of evidence. Reviews of Geophysics, 58(4), e2019RG000678.

Lam, R., et al. (2023). Learning skillful medium-range global weather forecasting. Science, 382(6677), 1416-1421.

Stevens, B., et al. (2019). DYAMOND: the DYnamics of the Atmospheric general circulation Modeled On Non-hydrostatic Domains. Progress in Earth and Planetary Science, 6, 61.

Hersbach, H., et al. (2020). The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146(730), 1999-2049.