ML for Molecular Dynamics
Neural network potentials, equivariant architectures, message passing, and ML-accelerated simulations
The Speed-Accuracy Trade-off in MD
Molecular dynamics (MD) simulations propagate Newton's equations of motion for a system of atoms. The computational bottleneck is evaluating the potential energy surface (PES): ab initio methods (DFT, coupled cluster) are accurate but scale as $\mathcal{O}(N^3)$ to $\mathcal{O}(N^7)$, limiting simulations to hundreds of atoms for nanoseconds. Classical force fields are fast ($\mathcal{O}(N)$) but lack the accuracy to capture bond breaking, charge transfer, or reactive chemistry.
Machine-learned interatomic potentials (MLIPs) bridge this gap: they are trained on ab initio data but evaluate at near-force-field speed, enabling million-atom simulations with quantum-mechanical accuracy.
1. Molecular Dynamics Fundamentals
In MD, we integrate Newton's equations for $N$ atoms with positions $\{r_1, \ldots, r_N\}$ and a potential energy function $E(\{r_i\})$:
Equations of Motion
$$m_i \ddot{r}_i = F_i = -\nabla_{r_i} E(\{r_j\})$$
The force $F_i$ on atom $i$ is the negative gradient of the total energy with respect to its position. The Velocity Verlet integrator updates positions and velocities with time step $\Delta t$:
$$r_i(t + \Delta t) = r_i(t) + v_i(t)\Delta t + \frac{F_i(t)}{2m_i}\Delta t^2$$
$$v_i(t + \Delta t) = v_i(t) + \frac{F_i(t) + F_i(t+\Delta t)}{2m_i}\Delta t$$
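The two update equations above translate directly into code. Below is a minimal NumPy sketch of a Velocity Verlet loop, applied to a 1D harmonic oscillator ($E = \frac{1}{2}kr^2$, a stand-in for a real PES) so that energy conservation can be checked; the function and parameter names are illustrative, not from any particular MD package.

```python
import numpy as np

def velocity_verlet(r, v, force, m, dt, n_steps):
    """Velocity Verlet: r += v*dt + F/(2m)*dt^2, then v += (F_old+F_new)/(2m)*dt."""
    F = force(r)
    traj = [r.copy()]
    for _ in range(n_steps):
        r = r + v * dt + F / (2 * m) * dt**2      # position update
        F_new = force(r)                           # forces at the new positions
        v = v + (F + F_new) / (2 * m) * dt         # velocity update (averaged forces)
        F = F_new
        traj.append(r.copy())
    return np.array(traj), v

# Toy system: 1D harmonic oscillator, E(r) = 0.5 k r^2, so F = -k r
k, m, dt = 1.0, 1.0, 0.01
force = lambda r: -k * r
traj, v_final = velocity_verlet(np.array([1.0]), np.array([0.0]), force, m, dt, 1000)

# Verlet is symplectic: total energy stays bounded near its initial value
E0 = 0.5 * k * 1.0**2
E_final = 0.5 * m * v_final[0]**2 + 0.5 * k * traj[-1, 0]**2
```

The same loop works unchanged when `force` is replaced by the negative gradient of a learned potential $\hat{E}_\theta$.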
The ML challenge is: given a dataset of atomic configurations and their energies/forces from quantum calculations, learn a function $\hat{E}_\theta(\{r_i, Z_i\})$ that predicts the energy (and forces via automatic differentiation) with near-DFT accuracy.
2. Symmetry Requirements for MLIPs
Physical potentials must respect fundamental symmetries. Any valid MLIP must be:
1. Invariant under translations:
$$E(\{r_i + t\}) = E(\{r_i\}) \quad \forall\, t \in \mathbb{R}^3$$
2. Invariant under rotations and reflections (O(3)):
$$E(\{R r_i\}) = E(\{r_i\}) \quad \forall\, R \in O(3)$$
3. Invariant under permutations of like atoms:
$$E(\{r_{\pi(i)}\}) = E(\{r_i\}) \quad \forall\, \pi \in S_N$$
4. Extensive (linear in system size):
$$E = \sum_{i=1}^{N} \varepsilon_i \quad \text{(decomposable into per-atom contributions)}$$
Forces must transform as vectors (equivariant under O(3)): $F_i(\{Rr_j\}) = R\, F_i(\{r_j\})$. This is automatic when forces are computed as $F_i = -\nabla_{r_i} E$, since the gradient of an invariant scalar is equivariant.
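This "invariant energy implies equivariant forces" property can be verified numerically. The sketch below uses a Lennard-Jones pair energy (an invariant function of distances only, standing in for any MLIP) and checks that rotating all atoms by an orthogonal $Q$ (O(3) includes reflections) leaves $E$ unchanged while the finite-difference forces rotate as $F(Qr) = Q\,F(r)$:

```python
import numpy as np

def lj_energy(r):
    """Total Lennard-Jones energy (eps = sigma = 1) for positions r of shape (N, 3)."""
    E = 0.0
    for i in range(len(r)):
        for j in range(i + 1, len(r)):
            d = np.linalg.norm(r[i] - r[j])
            E += 4 * (d**-12 - d**-6)
    return E

def forces(r, h=1e-5):
    """Forces as the negative central-difference gradient of the invariant energy."""
    F = np.zeros_like(r)
    for i in range(len(r)):
        for a in range(3):
            rp, rm = r.copy(), r.copy()
            rp[i, a] += h
            rm[i, a] -= h
            F[i, a] = -(lj_energy(rp) - lj_energy(rm)) / (2 * h)
    return F

r = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.2, 0.0], [0.0, 0.0, 1.3]])
Q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))  # random orthogonal Q

E, E_rot = lj_energy(r), lj_energy(r @ Q.T)          # energy is invariant
F, F_rot = forces(r), forces(r @ Q.T)                 # forces are equivariant
```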
3. Neural Network Potentials (Behler-Parrinello)
The seminal approach of Behler & Parrinello (2007) decomposes the total energy into per-atom contributions, each computed by a feed-forward neural network acting on handcrafted symmetry functions (descriptors):
Atom-Centred Symmetry Functions
Radial symmetry functions (two-body):
$$G_i^{\text{rad}} = \sum_{j \neq i} e^{-\eta(r_{ij} - R_s)^2} f_c(r_{ij})$$
Angular symmetry functions (three-body):
$$G_i^{\text{ang}} = 2^{1-\zeta} \sum_{j,k \neq i} (1 + \lambda \cos\theta_{ijk})^\zeta \, e^{-\eta(r_{ij}^2 + r_{ik}^2 + r_{jk}^2)} f_c(r_{ij}) f_c(r_{ik}) f_c(r_{jk})$$
The cutoff function ensures smooth decay:
$$f_c(r) = \begin{cases} \frac{1}{2}\left[\cos\left(\frac{\pi r}{r_c}\right) + 1\right] & r \leq r_c \\ 0 & r > r_c \end{cases}$$
The descriptor vector $\mathbf{G}_i = (G_i^{(1)}, G_i^{(2)}, \ldots)$ is by construction invariant under rotation, translation, and permutation. The total energy is:
$$E = \sum_{i=1}^{N} \text{NN}_{Z_i}(\mathbf{G}_i)$$
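A minimal NumPy sketch of the radial symmetry function $G_i^{\text{rad}}$ with the cosine cutoff defined above (parameter values $\eta$, $R_s$, $r_c$ are arbitrary choices for illustration), including a numerical check that the descriptor is rotation-invariant:

```python
import numpy as np

def f_cut(r, r_c):
    """Smooth cosine cutoff: 0.5*(cos(pi r / r_c) + 1) for r <= r_c, else 0."""
    return np.where(r <= r_c, 0.5 * (np.cos(np.pi * r / r_c) + 1), 0.0)

def radial_sf(positions, i, eta, R_s, r_c):
    """G_i^rad = sum_{j != i} exp(-eta (r_ij - R_s)^2) f_c(r_ij)."""
    r_ij = np.linalg.norm(positions - positions[i], axis=1)
    r_ij = np.delete(r_ij, i)                         # exclude the atom itself
    return np.sum(np.exp(-eta * (r_ij - R_s)**2) * f_cut(r_ij, r_c))

pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.5, 0.0]])
G = radial_sf(pos, i=0, eta=4.0, R_s=1.0, r_c=6.0)

# Invariance check: rotating all positions leaves the descriptor unchanged,
# since it depends on interatomic distances only
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])
G_rot = radial_sf(pos @ R.T, i=0, eta=4.0, R_s=1.0, r_c=6.0)
```

In a full Behler-Parrinello network, many such functions with different $(\eta, R_s)$ pairs, plus angular terms, form the vector $\mathbf{G}_i$ fed to the per-element network $\text{NN}_{Z_i}$.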
4. Gaussian Approximation Potentials (GAP) & SOAP
GAP (Bartók et al., 2010) uses Gaussian process regression with the SOAP (Smooth Overlap of Atomic Positions) kernel to learn the PES.
SOAP Descriptor
The atomic neighbour density around atom $i$ is expanded in a basis of radial functions and spherical harmonics:
$$\rho_i(\mathbf{r}) = \sum_{j \in \text{neigh}(i)} g(|\mathbf{r} - \mathbf{r}_{ij}|) = \sum_{nlm} c_{nlm}^{(i)} R_n(r) Y_l^m(\hat{r})$$
The SOAP power spectrum (rotationally invariant) is:
$$p_{nn'l}^{(i)} = \pi\sqrt{\frac{8}{2l+1}} \sum_{m=-l}^{l} (c_{nlm}^{(i)})^* c_{n'lm}^{(i)}$$
The SOAP kernel between two environments is $k(i, j) = \left( \dfrac{\mathbf{p}^{(i)} \cdot \mathbf{p}^{(j)}}{|\mathbf{p}^{(i)}|\, |\mathbf{p}^{(j)}|} \right)^{\zeta}$, where the exponent $\zeta$ controls sensitivity.
GAP Prediction
Given training data $\{(\mathbf{p}^{(i)}, \varepsilon_i)\}$, the GAP prediction is:
$$\hat{\varepsilon}(\mathbf{p}^*) = \sum_{i=1}^{M} \alpha_i k(\mathbf{p}^*, \mathbf{p}^{(i)})$$
where $\alpha = (K + \sigma_n^2 I)^{-1} \varepsilon$ and $K_{ij} = k(\mathbf{p}^{(i)}, \mathbf{p}^{(j)})$. The GP also provides uncertainty estimates $\sigma^2(\mathbf{p}^*) = k(\mathbf{p}^*, \mathbf{p}^*) - \mathbf{k}^\top (K + \sigma_n^2 I)^{-1} \mathbf{k}$, where $\mathbf{k}$ is the vector of kernel values between the test environment and the training environments.
5. Message Passing Neural Networks
Instead of handcrafted descriptors, message-passing neural networks (MPNNs) learn the atomic environment representation end-to-end through iterative message passing on the molecular graph.
MPNN Framework (Gilmer et al., 2017)
Each atom $i$ has a feature vector $h_i^{(0)}$ (initialised from element type). At each message-passing layer $l$:
$$m_i^{(l)} = \sum_{j \in \mathcal{N}(i)} M^{(l)}(h_i^{(l)}, h_j^{(l)}, e_{ij}) \quad \text{(aggregate messages)}$$
$$h_i^{(l+1)} = U^{(l)}(h_i^{(l)}, m_i^{(l)}) \quad \text{(update features)}$$
After $L$ layers, the per-atom energy is $\varepsilon_i = R(h_i^{(L)})$, where $R$ is a readout network. The total energy $E = \sum_i \varepsilon_i$ is permutation-invariant by construction (sum over atoms).
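The framework can be sketched in plain NumPy. The message and update functions below are toy single-layer networks with random weights (not a trained model), the graph is fully connected, and the readout is a simple sum; the point is to verify that sum-aggregation plus a sum readout makes the total output permutation-invariant:

```python
import numpy as np

rng = np.random.default_rng(0)
F, N = 8, 5                                  # feature dimension, number of atoms
Z = np.array([1, 1, 6, 8, 1])                # element types (H, H, C, O, H)

# Initial features from a per-element embedding table (toy weights)
embed = {z: rng.normal(size=F) for z in np.unique(Z)}
h = np.stack([embed[z] for z in Z])

W_msg = rng.normal(size=(2 * F, F)) / np.sqrt(2 * F)   # toy message network
W_upd = rng.normal(size=(2 * F, F)) / np.sqrt(2 * F)   # toy update network

def mp_layer(h):
    """One layer: sum messages M(h_i, h_j) over neighbours, then update h_i."""
    msgs = np.zeros_like(h)
    for i in range(len(h)):
        for j in range(len(h)):
            if i != j:                        # fully connected molecular graph
                msgs[i] += np.tanh(np.concatenate([h[i], h[j]]) @ W_msg)
    return np.tanh(np.concatenate([h, msgs], axis=1) @ W_upd)

E = mp_layer(mp_layer(h)).sum()               # sum readout -> total "energy"

# Permutation invariance: relabelling the atoms leaves E unchanged
perm = np.array([2, 0, 3, 1, 4])
E_perm = mp_layer(mp_layer(h[perm])).sum()
```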
6. SchNet: Continuous-Filter Convolutions
SchNet Architecture (Schütt et al., 2017)
SchNet replaces discrete graph convolutions with continuous-filter convolutions that operate directly on interatomic distances. The key innovation is the filter-generating network:
$$W(r_{ij}) = \text{MLP}(\text{RBF}(r_{ij})) \in \mathbb{R}^{F}$$
where the RBF expansion uses Gaussian basis functions centred at different distances:
$$e_k(r_{ij}) = \exp\left(-\gamma_k (r_{ij} - \mu_k)^2\right), \quad k = 1, \ldots, K$$
The interaction update for atom $i$ in layer $l$ is:
$$h_i^{(l+1)} = h_i^{(l)} + \sum_{j \in \mathcal{N}(i)} h_j^{(l)} \odot W^{(l)}(r_{ij})$$
By depending only on scalar distances $r_{ij}$, SchNet is automatically invariant under translations and rotations. However, because its features are pure scalars, it cannot represent directional (tensorial) properties such as dipole vectors or polarisability tensors.
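A toy continuous-filter convolution, with a random two-layer MLP as the filter-generating network (illustrative weights, not SchNet's published architecture), confirms the invariance claim numerically: since the filters see only distances, rotating the geometry leaves the updated features unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
F, K = 16, 20                                 # feature dim, number of RBF centres

mu = np.linspace(0.0, 5.0, K)                 # RBF centres mu_k
gamma = 10.0                                  # shared width gamma_k

def rbf(r):
    """Gaussian RBF expansion of a scalar distance: e_k(r) = exp(-gamma (r - mu_k)^2)."""
    return np.exp(-gamma * (r - mu) ** 2)

W1 = rng.normal(size=(K, F)) / np.sqrt(K)     # toy filter-generating MLP
W2 = rng.normal(size=(F, F)) / np.sqrt(F)

def filter_net(r):
    return np.tanh(rbf(r) @ W1) @ W2          # W(r_ij) in R^F

def cfconv(h, positions):
    """Continuous-filter convolution: h_i <- h_i + sum_j h_j * W(r_ij)."""
    out = h.copy()
    for i in range(len(positions)):
        for j in range(len(positions)):
            if i != j:
                r_ij = np.linalg.norm(positions[i] - positions[j])
                out[i] = out[i] + h[j] * filter_net(r_ij)
    return out

pos = rng.normal(size=(4, 3))
h0 = rng.normal(size=(4, F))
h1 = cfconv(h0, pos)

# Invariance: an orthogonal transform of the geometry changes nothing
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
h1_rot = cfconv(h0, pos @ Q.T)
```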
7. Equivariant Neural Networks
While SchNet achieves invariance by working only with scalar distances, more expressive architectures encode equivariance: features transform predictably under rotations, enabling the network to represent vectorial and tensorial quantities.
Equivariance Under O(3)
A function $f$ mapping between feature spaces is equivariant under a group $G$ if:
$$f(D_{\text{in}}(g) \cdot x) = D_{\text{out}}(g) \cdot f(x) \quad \forall\, g \in G$$
where $D_{\text{in}}, D_{\text{out}}$ are representations of $G$. For O(3), features are decomposed into irreducible representations (irreps) labelled by angular momentum $l$:
- $l = 0$: scalars (1D, invariant) — energies, charges
- $l = 1$: vectors (3D) — forces, dipole moments
- $l = 2$: rank-2 traceless symmetric tensors (5D) — polarisability, stress
Under rotation $R$, an $l$-type feature transforms as $h^{(l)} \to D^{(l)}(R)\, h^{(l)}$, where $D^{(l)}$ is the $(2l+1) \times (2l+1)$ Wigner D-matrix.
Tensor Product in Equivariant Layers
The key operation is the tensor product of irreps, governed by Clebsch-Gordan coefficients:
$$(h^{(l_1)} \otimes h^{(l_2)})_m^{(l_3)} = \sum_{m_1, m_2} C_{l_1 m_1, l_2 m_2}^{l_3 m} h_{m_1}^{(l_1)} h_{m_2}^{(l_2)}$$
where $|l_1 - l_2| \leq l_3 \leq l_1 + l_2$ (triangle inequality). This allows combining features of different angular momenta while preserving equivariance. For example, multiplying two $l=1$ vectors produces $l=0$ (scalar: dot product), $l=1$ (vector: cross product), and $l=2$ (traceless symmetric tensor).
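The $l=1 \otimes l=1$ example can be checked directly in Cartesian form (a simpler basis than the spherical-harmonic one used by real equivariant libraries, but the decomposition is the same): the dot product is invariant, the cross product rotates as a vector under proper rotations, and the traceless symmetric part conjugates as a rank-2 tensor.

```python
import numpy as np

def tp_l1_l1(u, v):
    """Decompose the product of two l=1 vectors into l=0, l=1, l=2 parts."""
    s = np.dot(u, v)                                        # l=0: scalar
    w = np.cross(u, v)                                      # l=1: antisymmetric part
    T = np.outer(u, v)
    T2 = 0.5 * (T + T.T) - (np.trace(T) / 3) * np.eye(3)    # l=2: traceless symmetric
    return s, w, T2

rng = np.random.default_rng(0)
u, v = rng.normal(size=3), rng.normal(size=3)

# Proper rotation (det = +1), since the cross product picks up det(Q) under reflections
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1

s, w, T2 = tp_l1_l1(u, v)
s_r, w_r, T2_r = tp_l1_l1(Q @ u, Q @ v)
# Expected: s invariant, w -> Q w, T2 -> Q T2 Q^T
```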
8. MACE: Multi-ACE Architecture
Atomic Cluster Expansion (ACE)
MACE (Batatia et al., 2022) builds on the Atomic Cluster Expansion framework, which systematically constructs many-body invariant descriptors. The key idea is the body-ordered expansion:
$$\varepsilon_i = \varepsilon_i^{(1)} + \varepsilon_i^{(2)} + \varepsilon_i^{(3)} + \cdots$$
where $\varepsilon_i^{(\nu)}$ is the $\nu$-body contribution. MACE achieves high body order efficiently by using symmetric contractions of equivariant message-passing features:
$$A_{i,zlm}^{(t)} = \sum_{j \in \mathcal{N}(i)} R_{nl}^{(t)}(r_{ij})\, Y_l^m(\hat{r}_{ij})\, h_{j,z}^{(t)}$$
The key advantage of MACE over earlier MPNNs is that each message is itself a many-body feature: with correlation order $\nu$, a single layer's symmetric contraction builds $(\nu+1)$-body messages (e.g. $\nu = 3$ gives 4-body), rather than raising the body order by only one per layer. As a result, just 2 layers capture the high body-order interactions sufficient for most chemical systems.
9. ML-Based Coarse-Graining
Coarse-graining (CG) reduces the number of degrees of freedom by grouping atoms into "beads." ML can learn the effective CG potential from atomistic simulations.
Force-Matching (Multiscale Coarse-Graining)
Given a mapping operator $M$ from atomistic to CG coordinates$R_I = M_I(\{r_i\})$, the CG force on bead $I$ is defined as the average atomistic force projected onto the CG space:
$$F_I^{\text{CG}} = \sum_{i \in \text{group}(I)} F_i^{\text{atom}}$$
The ML CG potential is trained by minimising:
$$\mathcal{L} = \sum_{I=1}^{N_{\text{CG}}} \left\|F_I^{\text{CG}} - \left(-\nabla_{R_I} U_\theta^{\text{CG}}\right)\right\|^2$$
This force-matching approach is variational: its optimal solution is the many-body potential of mean force of the CG coordinates. A complementary route, relative-entropy minimisation, instead fits the CG potential by minimising $S_{\text{rel}} = \int p_{\text{CG}}^{\text{atom}} \ln(p_{\text{CG}}^{\text{atom}} / p_\theta^{\text{CG}})\, dR$ between the mapped atomistic distribution and the model CG distribution.
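A minimal force-matching sketch on a single synthetic snapshot: atoms are grouped into beads, CG forces are the summed atomistic forces, and the CG potential is a deliberately simple one-parameter pair potential $U = \sum_{I<J} \theta / |R_I - R_J|$ (an illustrative stand-in for $U_\theta^{\text{CG}}$, chosen so the loss is quadratic in $\theta$ and solvable in closed form).

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_beads = 12, 3
groups = np.repeat(np.arange(n_beads), n_atoms // n_beads)   # 4 atoms per bead

# Synthetic atomistic snapshot: positions and forces (stand-ins for MD data)
r_atom = rng.normal(size=(n_atoms, 3))
F_atom = rng.normal(size=(n_atoms, 3))

# CG mapping: centroid positions (equal masses) and summed forces per bead
R_cg = np.stack([r_atom[groups == I].mean(axis=0) for I in range(n_beads)])
F_cg = np.stack([F_atom[groups == I].sum(axis=0) for I in range(n_beads)])

def cg_forces(R, theta):
    """Analytic forces of the toy CG potential U = sum_{I<J} theta / |R_I - R_J|."""
    F = np.zeros_like(R)
    for I in range(len(R)):
        for J in range(len(R)):
            if I != J:
                d = R[I] - R[J]
                F[I] += theta * d / np.linalg.norm(d) ** 3   # -grad of theta/r
    return F

def loss(theta):
    """Force-matching loss: squared residual between mapped and model forces."""
    return np.sum((F_cg - cg_forces(R_cg, theta)) ** 2)

# cg_forces is linear in theta, so the least-squares optimum is closed-form
G = cg_forces(R_cg, 1.0)
theta_star = np.sum(F_cg * G) / np.sum(G * G)
```

In practice $U_\theta^{\text{CG}}$ is a neural network and the loss is minimised by gradient descent over many snapshots, but the objective is the same.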
10. ML-Enhanced Sampling
Collective Variables from Autoencoders
Enhanced sampling methods (metadynamics, umbrella sampling) require collective variables (CVs) that capture the slow degrees of freedom. Autoencoders can learn CVs directly from MD trajectories:
$$\xi(x) = f_\phi(x) \in \mathbb{R}^d, \quad d = 1, 2$$
The encoder latent space defines a low-dimensional reaction coordinate. This can be combined with metadynamics by adding a bias potential $V_{\text{bias}}(\xi)$ in the learned CV space to accelerate sampling of rare events.
Recent approaches use time-lagged autoencoders (TAE) that maximise the autocorrelation of the latent variables, preferentially selecting slow modes: $\mathcal{L}_{\text{TAE}} = \mathcal{L}_{\text{recon}} - \lambda \sum_k \text{Corr}(z_k(t), z_k(t+\tau))$.
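The intuition behind the time-lagged term can be demonstrated without training a network: slow coordinates retain correlation across a lag $\tau$, fast ones do not. The sketch below compares lagged autocorrelations of two synthetic Ornstein-Uhlenbeck processes (a toy stand-in for latent trajectories; the relaxation rates are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(0)
n, tau = 20000, 10

def ou_series(theta):
    """Discrete Ornstein-Uhlenbeck process; small theta -> slow relaxation."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = (1 - theta) * x[t - 1] + rng.normal(scale=0.1)
    return x

slow, fast = ou_series(0.01), ou_series(0.5)

def lagged_corr(z, tau):
    """Autocorrelation Corr(z(t), z(t + tau)) estimated from the trajectory."""
    return np.corrcoef(z[:-tau], z[tau:])[0, 1]

c_slow, c_fast = lagged_corr(slow, tau), lagged_corr(fast, tau)
# A TAE objective rewards latent coordinates like `slow`: c_slow >> c_fast
```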
Boltzmann Generators
Boltzmann generators are normalising flows trained to sample from the Boltzmann distribution $p(x) \propto e^{-\beta E(x)}$ by transforming a simple prior through invertible neural networks. Combined with importance-weight reweighting, this allows direct generation of equilibrium configurations without running long MD trajectories.
Active Learning for MLIPs
During MD with an MLIP, configurations where the model is uncertain (e.g., high GP variance or disagreement between ensemble members) are selected for expensive ab initio recalculation. The MLIP is then retrained, creating a self-improving simulation loop.
Chapter Summary
- MLIPs must respect translational, rotational, and permutational symmetries, plus extensivity.
- Behler-Parrinello NNPs use handcrafted symmetry functions as invariant descriptors for per-atom neural networks.
- GAP/SOAP uses Gaussian process regression with rotationally invariant power-spectrum descriptors.
- Message-passing NNs (SchNet, DimeNet, PaiNN) learn representations end-to-end through iterative neighbourhood aggregation.
- Equivariant networks use irreducible representations of O(3) and tensor products (Clebsch-Gordan coefficients) to build expressive architectures.
- MACE achieves high body order efficiently through symmetric contractions of equivariant features.
- Coarse-graining via force-matching learns effective potentials for reduced representations.
- Enhanced sampling benefits from ML-learned collective variables, Boltzmann generators, and active learning strategies.