Part IV: Metabolomics | Chapter 14

NMR & Mass Spectrometry Platforms

Analytical technologies for comprehensive metabolome characterization

1. Nuclear Magnetic Resonance (NMR) Spectroscopy

Nuclear Magnetic Resonance (NMR) spectroscopy exploits the magnetic properties of atomic nuclei to reveal the chemical structure and concentration of metabolites in complex biological mixtures. When nuclei with non-zero spin quantum numbers (such as $^1$H, $^{13}$C, $^{31}$P, and $^{15}$N) are placed in an external magnetic field $B_0$, they precess about the field axis at a characteristic frequency called the Larmor frequency. This precession frequency depends on the gyromagnetic ratio of the nucleus and the local magnetic field it experiences, which is modulated by the surrounding electronic environment.

For metabolomics, $^1$H-NMR is the workhorse technique due to the high natural abundance and sensitivity of protons. A single $^1$H-NMR spectrum of a biofluid such as urine or plasma can contain hundreds of resonances from dozens of metabolites simultaneously, acquired in a non-destructive measurement that takes only a few minutes. The technique is inherently quantitative because peak area is directly proportional to the number of protons giving rise to that signal. This allows absolute quantification without individual calibration curves, provided a reference compound of known concentration (e.g., TSP or DSS) is added.

The Larmor Frequency

The fundamental resonance frequency of a nuclear spin in an external magnetic field is given by:

$$\nu_0 = \frac{\gamma B_0}{2\pi}$$

where $\nu_0$ is the Larmor frequency (Hz), $\gamma$ is the gyromagnetic ratio of the nucleus (rad·s$^{-1}$·T$^{-1}$), and $B_0$ is the strength of the external magnetic field (tesla, T). For $^1$H at 14.1 T, $\nu_0 \approx 600$ MHz.
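As a quick numerical check of the relation above, the following minimal sketch evaluates $\nu_0$ for $^1$H at a few common field strengths. The gyromagnetic ratio is a rounded literature value, and the function and constant names are illustrative only.

```python
import math

# Gyromagnetic ratio of 1H in rad s^-1 T^-1 (rounded literature value).
GAMMA_1H = 267.522e6


def larmor_frequency_hz(gamma: float, b0_tesla: float) -> float:
    """Return nu_0 = gamma * B0 / (2 * pi) in Hz."""
    return gamma * b0_tesla / (2.0 * math.pi)


if __name__ == "__main__":
    # Field strengths of typical "400", "500" and "600 MHz" magnets.
    for b0 in (9.4, 11.7, 14.1):
        mhz = larmor_frequency_hz(GAMMA_1H, b0) / 1e6
        print(f"B0 = {b0:5.1f} T -> nu_0 = {mhz:6.1f} MHz")
```

The three fields should come out close to 400, 500 and 600 MHz, which is why spectrometers are usually named by their proton frequency rather than by their field strength.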

Chemical Shift

The chemical shift is the dimensionless quantity that describes the resonance frequency of a nucleus relative to a reference compound:

$$\delta = \frac{\nu - \nu_{\text{ref}}}{\nu_{\text{ref}}} \times 10^6 \quad (\text{ppm})$$

where $\nu$ is the resonance frequency of the nucleus of interest and $\nu_{\text{ref}}$ is the resonance frequency of the reference standard (typically TMS or DSS). Chemical shift is reported in parts per million (ppm) and is independent of field strength, making it a universal property useful for metabolite identification.
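The field independence of $\delta$ can be made concrete with a small sketch. It converts between a frequency offset in Hz and a chemical shift in ppm; the 1.33 ppm value is used only as an illustrative shift (roughly where the lactate methyl doublet appears), and the function names are hypothetical.

```python
def chemical_shift_ppm(nu_hz: float, nu_ref_hz: float) -> float:
    """delta = (nu - nu_ref) / nu_ref * 1e6, in ppm."""
    return (nu_hz - nu_ref_hz) / nu_ref_hz * 1e6


def offset_from_reference_hz(delta_ppm: float, nu_ref_hz: float) -> float:
    """Inverse relation: frequency offset from the reference, in Hz."""
    return delta_ppm * 1e-6 * nu_ref_hz


if __name__ == "__main__":
    delta = 1.33  # illustrative shift in ppm
    for nu_ref in (400e6, 600e6):
        offset = offset_from_reference_hz(delta, nu_ref)
        print(f"{nu_ref / 1e6:.0f} MHz: offset = {offset:.0f} Hz, "
              f"delta = {chemical_shift_ppm(nu_ref + offset, nu_ref):.2f} ppm")
```

The offset in Hz grows with the spectrometer frequency, while the value in ppm stays the same. That is what allows chemical shifts measured on different instruments to be compared against one reference database.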

Spin-Spin Coupling (J-Coupling)

Scalar coupling between neighboring nuclei splits NMR resonances into multiplets, providing information about molecular connectivity. The coupling constant J (measured in Hz) depends on the number and geometry of bonds between coupled nuclei. For example, in alanine the methyl protons couple to the single adjacent α-CH proton and appear as a characteristic doublet ($J \approx 7.2$ Hz), while the α-CH proton couples to the three equivalent methyl protons and appears as a quartet. The multiplicity follows the n+1 rule for first-order spectra: a proton coupled to n equivalent neighbors produces n+1 lines.
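The n+1 rule can be sketched in a few lines. The example below also reports relative line intensities using binomial coefficients (Pascal's triangle), which is the standard first-order result for equivalent spin-1/2 neighbors but is an addition to the text above. The function name is illustrative.

```python
from math import comb


def first_order_multiplet(n_neighbors: int, j_hz: float):
    """Line positions (Hz, relative to the multiplet center) and relative
    intensities for a proton coupled to n equivalent spin-1/2 neighbors."""
    n_lines = n_neighbors + 1  # the n+1 rule
    offsets = [(k - n_neighbors / 2) * j_hz for k in range(n_lines)]
    intensities = [comb(n_neighbors, k) for k in range(n_lines)]
    return offsets, intensities


if __name__ == "__main__":
    # Alanine: CH3 sees one neighbor (doublet), alpha-CH sees three (quartet).
    for label, n in (("CH3", 1), ("alpha-CH", 3)):
        offsets, intensities = first_order_multiplet(n, 7.2)
        print(label, [round(o, 1) for o in offsets], intensities)
```

Adjacent lines in each multiplet are separated by exactly J, so the same 7.2 Hz spacing appears in both the doublet and the quartet. This shared spacing is what identifies the two signals as coupling partners.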

2D-NMR Experiments for Metabolomics

| Experiment | Correlations | Information Provided | Typical Use |
| --- | --- | --- | --- |
| COSY | $^1$H–$^1$H | Through-bond connectivity (2–3 bonds) | Identify coupled proton networks |
| TOCSY | $^1$H–$^1$H | Total correlation within spin system | Map entire spin systems |
| HSQC | $^1$H–$^{13}$C | Direct one-bond H–C connectivity | Resolve overlapping $^1$H signals |
| HMBC | $^1$H–$^{13}$C | Long-range H–C connectivity (2–4 bonds) | Connect molecular fragments |
| J-RES | $^1$H J-coupling | Separates chemical shift & coupling | Simplify crowded spectra |

2. Liquid Chromatography–Mass Spectrometry (LC-MS)

LC-MS has emerged as the most widely used platform for metabolomics due to its exceptional combination of sensitivity, selectivity, and metabolome coverage. The technique couples liquid chromatographic separation with mass spectrometric detection, enabling the analysis of a vast range of polar, semi-polar, and non-polar metabolites without the need for derivatization. Modern ultra-high-performance liquid chromatography (UHPLC) systems employ sub-2 μm particle columns that achieve superior chromatographic resolution in run times of 10–30 minutes.

The choice of chromatographic mode determines which metabolite classes are best resolved. Reversed-phase (RP) chromatography using C18 columns is the default for semi-polar and non-polar metabolites, separating compounds based on hydrophobicity. Hydrophilic interaction liquid chromatography (HILIC) retains highly polar metabolites (amino acids, sugars, nucleotides) that would elute in the void volume of RP columns, using a polar stationary phase with an organic-rich mobile phase. Ion-pairing chromatography adds reagents such as tributylamine (TBA) to the mobile phase to retain charged metabolites (e.g., phosphorylated intermediates, organic acids) on RP columns.

Mass Accuracy

In high-resolution mass spectrometry, mass accuracy is expressed in parts per million (ppm):

$$\text{Mass Accuracy (ppm)} = \frac{m_{\text{measured}} - m_{\text{theoretical}}}{m_{\text{theoretical}}} \times 10^6$$

Modern Orbitrap and QTOF instruments routinely achieve sub-5 ppm mass accuracy with external calibration and sub-2 ppm with internal calibration (lock mass). At 1 ppm accuracy for a metabolite of 300 Da, the mass window is only $\pm 0.0003$ Da, dramatically reducing the number of candidate molecular formulas.
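The arithmetic behind these numbers is simple enough to sketch directly. The helper names below are illustrative and do not come from any particular software package.

```python
def mass_error_ppm(measured: float, theoretical: float) -> float:
    """Signed mass error in ppm, following the definition above."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical: float, tol_ppm: float):
    """Lower and upper m/z bounds for a symmetric ppm tolerance."""
    half_width = theoretical * tol_ppm * 1e-6
    return theoretical - half_width, theoretical + half_width


if __name__ == "__main__":
    low, high = mass_window(300.0, 1.0)
    print(f"1 ppm window at 300 Da: {low:.4f} to {high:.4f}")
    print(f"Error of a 300.0006 reading: {mass_error_ppm(300.0006, 300.0):.1f} ppm")
```

Because the tolerance is relative, the absolute window widens with mass. The same 1 ppm tolerance corresponds to about ±0.0009 Da at 900 Da, one reason formula assignment becomes harder for larger metabolites.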

Mass Resolution

Resolving power quantifies the ability to distinguish two closely spaced peaks:

$$R = \frac{m}{\Delta m}$$

where $m$ is the mass of the ion and $\Delta m$ is the full width at half maximum (FWHM) of the peak. An Orbitrap operating at $R = 140{,}000$ at $m/z = 200$ can resolve peaks separated by just 0.0014 Da, essential for distinguishing isobaric metabolites.
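A short sketch reproduces the peak-width figure quoted above. The second function goes beyond the text: it assumes the commonly cited behavior that Orbitrap resolving power falls off with the inverse square root of m/z relative to the value specified at m/z 200. Treat it as an approximation, not an instrument specification.

```python
import math


def peak_width_fwhm(mz: float, resolving_power: float) -> float:
    """Delta m (FWHM) implied by R = m / Delta m."""
    return mz / resolving_power


def orbitrap_resolving_power(mz: float, r_ref: float, mz_ref: float = 200.0) -> float:
    """Assumed scaling R(m/z) = R_ref * sqrt(mz_ref / mz); illustrative only."""
    return r_ref * math.sqrt(mz_ref / mz)


if __name__ == "__main__":
    print(f"FWHM at m/z 200, R = 140,000: {peak_width_fwhm(200.0, 140_000):.4f}")
    for mz in (200.0, 400.0, 800.0):
        r = orbitrap_resolving_power(mz, 140_000)
        print(f"m/z {mz:5.0f}: R ~ {r:8.0f}, FWHM ~ {peak_width_fwhm(mz, r):.4f}")
```

Under this assumption, the nominal resolving power setting overstates what is available for heavier ions, which is worth keeping in mind when choosing a setting for lipid-rich samples.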

Ionization Modes

Electrospray ionization (ESI) is the standard interface for LC-MS metabolomics. Analysis is typically performed in both polarities:

Positive Mode (ESI+)

Produces [M+H]$^+$, [M+Na]$^+$, [M+NH₄]$^+$ ions. Favors amines, amino acids, nucleosides, and many alkaloids and drugs. Typically detects more features than negative mode.

Negative Mode (ESI−)

Produces [M−H]$^-$, [M+Cl]$^-$, [M+HCOO]$^-$ ions. Favors organic acids, phosphorylated compounds, sugar phosphates, and fatty acids. Lower chemical noise.
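To show how the adducts listed for each polarity translate into expected m/z values, the sketch below adds a fixed mass shift to a neutral monoisotopic mass. The shifts are commonly tabulated values rounded to six decimals and apply to singly charged ions only. The dictionary and function names are illustrative.

```python
# Mass shifts (Da) for singly charged adducts, relative to the neutral
# monoisotopic mass M. Commonly tabulated values, rounded.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M-H]-": -1.007276,
    "[M+Cl]-": 34.969402,
    "[M+HCOO]-": 44.998201,
}


def adduct_mz(neutral_mass: float, adduct: str) -> float:
    """Expected m/z of a singly charged adduct of a neutral molecule."""
    return neutral_mass + ADDUCT_SHIFTS[adduct]


if __name__ == "__main__":
    glucose = 180.06339  # monoisotopic mass of C6H12O6
    for name in ADDUCT_SHIFTS:
        print(f"{name:10s} {adduct_mz(glucose, name):.4f}")
```

A table like this is the starting point for adduct annotation: a feature is a candidate for a given metabolite only if its m/z matches one of these values within the instrument's mass tolerance.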

LC Column Comparison

| Column Type | Stationary Phase | Target Metabolites | Typical Mobile Phases |
| --- | --- | --- | --- |
| Reversed-Phase (C18) | Octadecylsilane | Lipids, steroids, non-polar | H₂O/ACN + 0.1% FA |
| HILIC | Amide, ZIC, BEH | Amino acids, sugars, nucleotides | ACN/H₂O + ammonium salts |
| Ion-Pairing RP | C18 + TBA | Phosphorylated metabolites, organic acids | H₂O/MeOH + TBA |
| Mixed-Mode | RP + ion exchange | Broad polarity range | Variable pH gradients |

3. Gas Chromatography–Mass Spectrometry (GC-MS)

GC-MS remains a gold standard for metabolomics of small polar metabolites, particularly primary metabolites of central carbon metabolism. The technique requires that analytes be volatile and thermally stable, which necessitates chemical derivatization of most biological metabolites. The standard two-step derivatization protocol involves (1) methoximation with methoxyamine hydrochloride in pyridine, which protects carbonyl groups and prevents cyclization of reducing sugars, followed by (2) trimethylsilylation with reagents such as MSTFA (N-methyl-N-(trimethylsilyl)trifluoroacetamide), which replaces active hydrogen atoms with trimethylsilyl (TMS) groups, increasing volatility and thermal stability.

GC-MS metabolomics typically employs electron ionization (EI) at 70 eV, which produces highly reproducible fragmentation patterns that serve as molecular fingerprints. This reproducibility enables the construction of universal spectral libraries (such as the NIST Mass Spectral Library and the Fiehn GC-MS Metabolomics Library) that can be used across different instruments and laboratories. Compounds are identified by matching their mass spectra and retention indices against these libraries. GC-MS offers excellent chromatographic resolution (typically 200,000–500,000 theoretical plates) and high sensitivity (low picomole to femtomole detection limits), but its coverage is limited to volatile or derivatizable metabolites, excluding large polar molecules and intact lipids.

GC-MS Derivatization Workflow

Step 1: Dried extract dissolved in methoxyamine-HCl/pyridine (20 mg/mL), 37 °C, 90 min
Step 2: Addition of MSTFA (+ 1% TMCS), 37 °C, 30 min for silylation
Step 3: Addition of retention index markers (FAMEs or alkane series)
Step 4: Injection (1 μL, splitless mode) into GC-MS system

4. CE-MS, Direct Infusion MS & Ion Mobility

Capillary electrophoresis–mass spectrometry (CE-MS) separates metabolites based on their charge-to-size ratio in an electric field applied across a narrow-bore fused silica capillary (typically 50–75 μm inner diameter). CE-MS excels at separating highly polar and charged metabolites such as amino acids, organic acids, nucleotides, and sugar phosphates with minimal sample volume requirements (nanoliter injection volumes). The technique offers high separation efficiency (100,000–1,000,000 theoretical plates) but can suffer from limited sensitivity due to the small injection volumes and challenges with reproducibility of migration times.

Direct infusion mass spectrometry (DIMS), together with the closely related flow injection analysis (FIA-MS), bypasses chromatographic separation entirely by introducing the sample directly into the mass spectrometer. This approach offers extremely high throughput (analysis times of 1–5 minutes per sample) and captures the broadest mass range simultaneously. However, without chromatographic separation, isomeric metabolites cannot be distinguished, and ion suppression is at its most severe because all analytes enter the ion source together. DIMS is best suited for rapid screening and fingerprinting applications where high throughput is prioritized over deep annotation.

Ion mobility spectrometry (IMS) adds an additional dimension of separation based on the size, shape, and charge of gas-phase ions. Coupled with LC-MS or DIMS, ion mobility provides collisional cross-section (CCS) values that serve as an additional physicochemical property for metabolite identification. Major IMS technologies include drift tube IMS (DTIMS), traveling wave IMS (TWIMS), trapped IMS (TIMS), and field asymmetric IMS (FAIMS). CCS values are highly reproducible and can differentiate structural isomers that are indistinguishable by mass and chromatographic retention alone.

Collisional Cross Section (CCS)

In drift tube ion mobility, the CCS is derived from the Mason-Schamp equation:

$$\Omega = \frac{3ze}{16N_0}\sqrt{\frac{2\pi}{\mu k_B T}}\frac{1}{K_0}$$

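where $\Omega$ is the CCS, $z$ is the charge state, $e$ is the elementary charge, $N_0$ is the buffer gas number density at standard temperature and pressure (the conditions to which $K_0$ is normalized), $\mu$ is the reduced mass of the ion and buffer gas, $k_B$ is the Boltzmann constant, $T$ is the drift gas temperature, and $K_0$ is the reduced mobility.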
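The main practical difficulty with this equation is keeping the units consistent. The sketch below evaluates it in SI units and converts the result to Å². It assumes nitrogen as the buffer gas, takes $N_0$ as the number density of an ideal gas at 273.15 K and 101.325 kPa, and uses an invented ion (300 Da, $K_0$ = 1.2 cm²·V$^{-1}$·s$^{-1}$) purely for illustration.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
K_B = 1.380649e-23          # Boltzmann constant, J/K
AMU = 1.66053907e-27        # atomic mass unit, kg
N0 = 2.686780e25            # gas number density at 273.15 K, 101.325 kPa, m^-3
N2_MASS_DA = 28.006148      # monoisotopic mass of N2 (assumed buffer gas)


def ccs_angstrom2(ion_mass_da: float, k0_cm2_per_vs: float,
                  temperature_k: float, charge: int = 1,
                  gas_mass_da: float = N2_MASS_DA) -> float:
    """Mason-Schamp CCS in square angstroms from a reduced mobility K0."""
    # Reduced mass of the ion/gas pair, converted to kg.
    mu = ion_mass_da * gas_mass_da / (ion_mass_da + gas_mass_da) * AMU
    k0_si = k0_cm2_per_vs * 1e-4  # cm^2/(V s) -> m^2/(V s)
    omega_m2 = (3.0 * charge * E_CHARGE / (16.0 * N0)
                * math.sqrt(2.0 * math.pi / (mu * K_B * temperature_k))
                / k0_si)
    return omega_m2 * 1e20  # m^2 -> A^2


if __name__ == "__main__":
    print(f"CCS ~ {ccs_angstrom2(300.0, 1.2, 300.0):.1f} A^2")
```

For the illustrative ion the result falls in the range of a few hundred Å² or less that is typical for small molecules, which is a useful sanity check on the unit conversions.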

5. Platform Comparison & Selection

No single analytical platform can capture the entire metabolome. Each technology has distinct strengths in terms of sensitivity, metabolome coverage, throughput, reproducibility, and cost. A comprehensive metabolomics study often employs multiple complementary platforms to maximize metabolome coverage. The table below summarizes the key performance characteristics of the major metabolomics platforms.

| Platform | Sensitivity | Coverage | Throughput (time per sample) | Quantification | Sample Volume |
| --- | --- | --- | --- | --- | --- |
| $^1$H-NMR | μM range | 40–200 metabolites | 5–15 min | Absolute (inherent) | 300–600 μL |
| LC-MS (HRMS) | nM–pM range | 2000–10,000+ features | 10–30 min | Semi-quantitative | 5–50 μL |
| GC-MS | nM range | 200–800 metabolites | 20–60 min | Quantitative (with IS) | 1–50 μL |
| CE-MS | nM range | 500–1500 metabolites | 20–40 min | Semi-quantitative | nL–μL |
| DIMS | nM–μM range | 1000–5000 features | 1–5 min | Semi-quantitative | 5–20 μL |

6. Data Processing Pipeline

Raw data from metabolomics experiments must undergo extensive computational processing before biological interpretation. The standard data processing pipeline includes several sequential steps, each critical for generating a clean, reliable feature table suitable for statistical analysis. Open-source tools such as XCMS, MZmine, MS-DIAL, and MetaboAnalyst have become the workhorses of metabolomics data processing.

Peak Detection

Algorithms such as centWave (in XCMS) or matched filter detect chromatographic peaks in the raw data by fitting models to ion chromatograms. Each detected feature is characterized by its m/z value, retention time, peak area, and peak height. Parameters such as minimum peak width, signal-to-noise threshold, and m/z tolerance must be optimized for each dataset.

Retention Time Alignment

Small drifts in retention time between samples must be corrected to ensure that corresponding peaks are properly matched across the dataset. Algorithms such as obiwarp (warping-based) or loess correction align retention times using pooled QC samples or landmark peaks as reference points.

Feature Grouping & Adduct Annotation

A single metabolite can generate multiple ions in the mass spectrum (isotopes, adducts, in-source fragments, multimers). Tools such as CAMERA (in XCMS) or MS-DIAL group correlated features originating from the same compound, reducing feature redundancy and facilitating metabolite-level analysis.
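The core idea of adduct grouping can be illustrated with a deliberately simplified sketch: two features are flagged as a candidate [M+H]$^+$/[M+Na]$^+$ pair when they co-elute and their m/z difference matches the expected spacing. This is not how CAMERA or MS-DIAL are implemented (those tools also use peak-shape correlation, isotope patterns and many more adduct rules). The tolerances and names below are assumptions.

```python
# Expected m/z spacing between [M+Na]+ and [M+H]+ of the same compound.
NA_MINUS_H = 22.989218 - 1.007276


def find_sodium_pairs(features, rt_tol_s: float = 2.0, mz_tol: float = 0.002):
    """features: list of (mz, rt_seconds) tuples.

    Returns (i, j) index pairs where feature j is a candidate [M+Na]+
    partner of a candidate [M+H]+ feature i.
    """
    pairs = []
    for i, (mz_i, rt_i) in enumerate(features):
        for j, (mz_j, rt_j) in enumerate(features):
            if i == j:
                continue
            co_eluting = abs(rt_i - rt_j) <= rt_tol_s
            spacing_ok = abs((mz_j - mz_i) - NA_MINUS_H) <= mz_tol
            if co_eluting and spacing_ok:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    features = [
        (181.0707, 120.0),  # candidate [M+H]+
        (203.0526, 120.5),  # co-eluting, +21.98 Da: candidate [M+Na]+
        (203.0526, 300.0),  # same m/z but elutes much later: not grouped
    ]
    print(find_sodium_pairs(features))
```

Requiring co-elution is what keeps unrelated compounds with a coincidental mass difference from being merged. Extending the sketch to other adducts only requires more entries in the list of expected spacings.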

Normalization

Normalization corrects for systematic variation unrelated to biology. Common approaches include total ion current (TIC) normalization, probabilistic quotient normalization (PQN), median-fold change normalization, internal standard normalization, and QC-RLSC (quality control-based robust LOESS signal correction, which fits a locally estimated scatterplot smoothing curve to pooled QC samples across the run order to correct signal drift).
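Of the approaches listed, PQN is easy to sketch without external libraries. The version below uses the feature-wise median across all samples as the reference profile and divides each sample by the median of its feature-to-reference quotients. It omits the preliminary total-intensity scaling that is often applied first, and the choice of reference (all samples rather than pooled QCs) is an assumption.

```python
from statistics import median


def pqn_normalize(table):
    """table: list of samples, each a list of intensities in the same
    feature order. Returns a new, PQN-normalized table."""
    n_features = len(table[0])
    # Reference profile: median intensity of each feature across samples.
    reference = [median(sample[j] for sample in table)
                 for j in range(n_features)]
    normalized = []
    for sample in table:
        # Quotients against the reference, skipping zeros.
        quotients = [x / r for x, r in zip(sample, reference)
                     if x > 0 and r > 0]
        dilution = median(quotients)  # most probable dilution factor
        normalized.append([x / dilution for x in sample])
    return normalized


if __name__ == "__main__":
    data = [
        [100.0, 200.0, 300.0, 50.0],
        [50.0, 100.0, 150.0, 25.0],   # same profile, two-fold diluted
        [100.0, 200.0, 300.0, 50.0],
    ]
    for row in pqn_normalize(data):
        print(row)
```

Using the median quotient rather than the total signal makes the estimated dilution factor robust to a handful of features that change strongly for biological reasons. This robustness is the main argument for PQN over TIC normalization in urine studies.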

Gap Filling (Missing Value Imputation)

Missing values are common in untargeted metabolomics (features detected in some samples but not others). Gap filling returns to the raw data to search for low-intensity signals at the expected m/z and retention time. Remaining missing values may be imputed using methods such as k-nearest neighbors (kNN), random forest imputation, or replacement with a fraction of the minimum observed value (e.g., 1/5 of the minimum).
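The simplest of these imputation strategies, replacement with a fraction of the minimum observed value, can be sketched as follows. Missing values are represented as `None`, and the data layout (a dictionary of feature name to intensities) is an assumption made for the example.

```python
def impute_fraction_of_min(table, divisor: float = 5.0):
    """Replace missing values (None) in each feature with min(observed) / divisor.

    table: dict mapping feature name -> list of intensities across samples.
    Features with no observed values are returned unchanged.
    """
    imputed = {}
    for feature, values in table.items():
        observed = [v for v in values if v is not None]
        if not observed:
            imputed[feature] = list(values)
            continue
        fill = min(observed) / divisor
        imputed[feature] = [fill if v is None else v for v in values]
    return imputed


if __name__ == "__main__":
    features = {
        "M181T120": [10.0, None, 30.0],
        "M203T120": [None, None, None],
    }
    print(impute_fraction_of_min(features))
```

This approach treats every missing value as a signal below the detection limit. That is reasonable after gap filling has already recovered low-intensity peaks, but it can bias variance estimates when values are missing for other reasons, which is where kNN or random forest imputation may be preferable.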

NMR Data Processing

NMR spectra require distinct processing steps: Fourier transformation of the free induction decay (FID), phase correction, baseline correction, chemical shift referencing (to TSP or DSS), and either binning (dividing the spectrum into fixed-width segments, typically 0.01–0.04 ppm, and integrating each bin) or targeted profiling (fitting known metabolite spectral signatures to the observed spectrum using tools like Chenomx NMR Suite, BATMAN, or rNMR). Binning is simpler but loses resolution; targeted profiling provides metabolite-level quantification but requires extensive reference libraries.
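Fixed-width binning is simple enough to show in full. The sketch below sums the intensities of all data points falling into each bin, which approximates integration when the points are evenly spaced. The spectral window of 0 to 10 ppm and the 0.04 ppm width are example settings, and real workflows typically also exclude the residual water region.

```python
def bin_spectrum(ppm, intensity, width: float = 0.04,
                 ppm_min: float = 0.0, ppm_max: float = 10.0):
    """Sum intensities into fixed-width chemical shift bins.

    ppm, intensity: equal-length sequences describing a processed spectrum.
    Returns a list of bin sums ordered from ppm_min upward.
    """
    n_bins = int(round((ppm_max - ppm_min) / width))
    bins = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if ppm_min <= x < ppm_max:
            # Clamp the index to guard against floating-point edge cases.
            index = min(int((x - ppm_min) / width), n_bins - 1)
            bins[index] += y
    return bins


if __name__ == "__main__":
    shifts = [0.01, 0.02, 0.05, 9.99]
    signal = [1.0, 2.0, 3.0, 4.0]
    binned = bin_spectrum(shifts, signal)
    print(len(binned), binned[:2], binned[-1])
```

The loss of resolution mentioned above is visible here: the first two points end up in the same bin and can no longer be told apart, while small shift changes near a bin edge can move signal between neighboring bins.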