Statistical Mechanics of Biomolecules
The statistical mechanical framework for understanding biomolecular behavior: partition functions, two-state systems, cooperative binding through the Hill equation, helix-coil transitions via the Zimm-Bragg model, and free energy perturbation methods for computing binding free energies.
Table of Contents
1. Partition Function for Two-State Systems
The partition function is the central object of statistical mechanics. It encodes all thermodynamic information about a system in equilibrium. For biomolecules, the simplest and most useful application is the two-state model, which describes protein folding, ligand binding, and conformational transitions.
The Canonical Partition Function
For a system in thermal equilibrium at temperature $T$, the canonical partition function is:
where $g_i$ is the degeneracy (number of microstates) of energy level$E_i$. All thermodynamic quantities follow from $Z$:
- • Helmholtz free energy: $F = -k_BT \ln Z$
- • Average energy: $\langle E \rangle = -\frac{\partial \ln Z}{\partial \beta}$
- • Entropy: $S = k_B\left(\ln Z + \beta\langle E \rangle\right)$
- • Heat capacity: $C_V = k_B\beta^2\left(\langle E^2 \rangle - \langle E \rangle^2\right)$
Two-State Model for Protein Folding
For a protein with only two macrostates — folded (F) and unfolded (U) — the partition function is:
where $F_i = E_i - TS_i$ includes the entropic contributions through the degeneracies ($S_i = k_B \ln g_i$). Setting $F_F = 0$ as reference:
The fraction of folded molecules is:
This is a sigmoidal curve. The midpoint (melting temperature) occurs when $\Delta G = 0$:
The sharpness of the transition depends on $\Delta H$: larger enthalpy changes produce sharper transitions. This is why small proteins show clear two-state behavior with well-defined melting temperatures.
Heat Capacity and the Van't Hoff Criterion
The heat capacity of a two-state system shows a peak at $T_m$:
The area under the $C_p$ curve equals the calorimetric enthalpy $\Delta H_{\text{cal}}$. The van't Hoff enthalpy is extracted from the shape of the melting curve:
If $\Delta H_{\text{vH}} = \Delta H_{\text{cal}}$, the transition is truly two-state. A ratio less than one indicates intermediate states.
Fluctuations, Response, and Susceptibility
The Fluctuation-Response Relation
A powerful consequence of statistical mechanics is the connection between equilibrium fluctuations and the response to perturbation. For any observable$A$ conjugate to a field $h$:
For the two-state folding model, this gives the heat capacity as a fluctuation:
The peak in $C_V$ at the folding transition temperature reflects maximum energy fluctuations as the system switches between folded and unfolded states. Near $T_m$, each state is equally populated, maximizing the variance.
2. Cooperative Binding & the Hill Equation
Non-Cooperative (Langmuir) Binding
For a macromolecule with $n$ independent, identical binding sites with dissociation constant $K_d$, the binding polynomial is:
The average fractional occupancy (binding isotherm) is:
This is the standard Langmuir isotherm (or Michaelis-Menten equation in enzyme kinetics), independent of $n$.
Cooperative Binding: Statistical Mechanical Derivation
When binding sites interact, we introduce interaction energies. Consider a macromolecule with $n$ sites where each bound ligand interacts with other bound ligands. In the extreme cooperative limit (all-or-none binding), only the empty ($k=0$) and fully occupied ($k=n$) states are significantly populated:
The fractional occupancy becomes the Hill equation:
The Hill coefficient $n_H$ measures cooperativity:
- • $n_H = 1$: no cooperativity (Langmuir/hyperbolic)
- • $n_H > 1$: positive cooperativity (sigmoidal)
- • $n_H < 1$: negative cooperativity
The Hill coefficient is extracted from the Hill plot:
For hemoglobin (4 binding sites), $n_H \approx 2.8$, indicating strong positive cooperativity. The Hill coefficient cannot exceed the number of binding sites, and for real systems, $1 \leq n_H \leq n$.
The MWC (Monod-Wyman-Changeux) Model
A more physical model assumes the macromolecule exists in two conformations: tense (T, low affinity) and relaxed (R, high affinity), with equilibrium constant$L_0 = [T]/[R]$ and ratio of affinities $c = K_R/K_T$. The binding isotherm is:
where $\alpha = [L]/K_R$. This produces sigmoidal binding curves without requiring all-or-none binding.
3. Helix-Coil Transitions
The Zimm-Bragg Model
The helix-coil transition in polypeptides is a fundamental example of a cooperative phase transition in one dimension. Each residue is either in a helical (h) or coil (c) state. The Zimm-Bragg model uses two parameters:
- • $s$ (propagation parameter): The equilibrium constant for adding one helical residue to an existing helical segment. $s = \exp(-\Delta G_h/k_BT)$.
- • $\sigma$ (nucleation parameter): The penalty for initiating a new helical segment. Typically $\sigma \sim 10^{-3}$ to $10^{-4}$ for polypeptides.
The statistical weight of a configuration is the product over all residues of:
- • $1$ for a coil residue following a coil
- • $\sigma s$ for a helix residue starting a new helix segment
- • $s$ for a helix residue continuing an existing helix
- • $1$ for a coil residue following a helix
Transfer Matrix Solution
The partition function can be computed exactly using the transfer matrix method. Define the $2 \times 2$ transfer matrix:
where rows correspond to the state of residue $i$ (c, h) and columns to residue $i+1$. For a chain of $N$ residues:
The eigenvalues of $\mathbf{T}$ are:
For large $N$, $Z \approx \lambda_+^N$ and the helix fraction is:
The transition width is controlled by $\sigma$: smaller $\sigma$produces sharper transitions (more cooperative). At $s = 1$,$f_h = 1/2$ regardless of $\sigma$. The transition sharpens with increasing chain length $N$ for finite chains.
Physical Interpretation
The average length of a helical segment at the midpoint ($s = 1$) is:
For a typical polypeptide with $\sigma \approx 2 \times 10^{-4}$,$\langle n_h \rangle \approx 70$ residues. This explains why alpha-helices in real proteins are typically 10-20 residues long — finite-size effects prevent full cooperative behavior.
The nucleation barrier $\sigma$ reflects the entropic cost of constraining the first few residues in helical dihedral angles, while $s$encodes the enthalpic gain from each hydrogen bond minus the entropic loss of fixing backbone angles.
4. Free Energy Perturbation
The Zwanzig Formula
Free energy perturbation (FEP) computes the free energy difference between two states A and B by sampling from one state. The exact identity (Zwanzig, 1954) is:
Derivation: The ratio of partition functions is:
Since $\Delta G = -k_BT\ln(Z_B/Z_A)$, the result follows immediately.
Lambda Staging
Direct FEP fails when states A and B are too different (poor overlap of phase space). The solution is to introduce intermediate states parameterized by $\lambda$:
The total free energy difference is decomposed into small, overlapping steps:
An alternative, more efficient approach uses thermodynamic integration:
For the linear mixing rule, $\partial U/\partial \lambda = U_B - U_A$, and the integral can be computed by numerical quadrature from simulations at each$\lambda$ value.
Applications in Drug Design
FEP is widely used to compute relative binding free energiesfor drug design. To compute the difference in binding affinity between two ligands L1 and L2 to a protein P, we use a thermodynamic cycle:
Modern FEP calculations achieve accuracies of 1-2 kJ/mol for relative binding free energies, making them a powerful tool for lead optimization in pharmaceutical research.
Advanced Topics in Free Energy Methods
Bennett Acceptance Ratio (BAR)
The Bennett acceptance ratio method provides the minimum-variance estimate of$\Delta G$ from forward and reverse work measurements. The optimal estimate satisfies:
This is solved self-consistently for $\Delta G$. BAR is asymptotically optimal among all methods that use both forward and reverse work measurements, achieving the minimum possible variance given the data.
Multistate Bennett Acceptance Ratio (MBAR)
MBAR generalizes BAR to optimally combine data from multiple thermodynamic states simultaneously. For $K$ states with $N_k$ samples each:
where $\hat{f}_i = -\beta G_i$ are the reduced free energies. MBAR is now the gold standard for alchemical free energy calculations in drug discovery, providing both free energy estimates and their statistical uncertainties.
Fluctuation-Dissipation and Linear Response
An alternative to FEP for small perturbations is linear response theory. The free energy difference to first order in the perturbation$\delta U = U_B - U_A$ is:
The first term is the average perturbation energy, and the second is the fluctuation correction (always negative, lowering $\Delta G$). This can also be written symmetrically using data from both end states:
This is the thermodynamic integration trapezoidal rule, exact when $\langle \delta U \rangle_\lambda$ is linear in $\lambda$.
Related Video Lectures
Deriving the Boltzmann Distribution
Deriving the Boltzmann Distribution (General Case)
The Partition Function
Shannon Entropy
5. Interactive Simulations
Partition Function & Two-State Folding
PythonCompute partition functions, folding equilibria, and vibrational mode populations for biomolecular systems.
Click Run to execute the Python code
Code will be executed with Python 3 on the server
Hill Equation, Helix-Coil Transition & FEP
PythonCooperative binding curves, Zimm-Bragg helix-coil transition, and free energy perturbation calculation.
Click Run to execute the Python code
Code will be executed with Python 3 on the server