Protein Structure & Folding

How do linear chains of amino acids spontaneously fold into precise three-dimensional structures that perform all the functions of life? Explore the physics, chemistry, and quantum mechanics of protein folding.

🧬 Primary to Quaternary Structure · 📐 Anfinsen's Principle · ⏱️ Levinthal's Paradox · 🌄 Energy Landscapes

The Protein Folding Problem

Proteins are the workhorses of biology: enzymes, structural elements, transporters, signaling molecules, and more. Each protein is a polymer built from a set of 20 standard amino acids; its sequence determines its structure, which in turn determines its function. But how does a protein "know" how to fold?

Anfinsen's Dogma (1973)

Christian Anfinsen demonstrated that the native structure of a protein is determined by its amino acid sequence and represents the thermodynamic minimum of free energy under physiological conditions.

His experiments with ribonuclease A showed that a denatured protein can spontaneously refold to its native state, demonstrating that, at least for small proteins of this kind, all the information needed for folding is encoded in the sequence.

Levinthal's Paradox (1969)

If a protein were to sample all possible conformations randomly, even a small protein with 100 residues would require > 10²⁷ years to find the correct fold. Yet proteins fold in microseconds to seconds!
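The estimate above can be reproduced with a few lines of arithmetic. The sketch below assumes 3 accessible conformations per residue and a sampling rate of 10¹³ conformations per second; both numbers are conventional illustrative choices, not values given in the text, and the function name is hypothetical.

```python
def levinthal_years(n_residues=100, states_per_residue=3, samples_per_second=1e13):
    """Time in years to visit every conformation once by exhaustive search."""
    # Assumption: each residue independently adopts one of a few states.
    n_conformations = states_per_residue ** n_residues
    seconds = n_conformations / samples_per_second
    seconds_per_year = 365.25 * 24 * 3600
    return seconds / seconds_per_year

years = levinthal_years()
print(f"Exhaustive search would take about {years:.1e} years")
```

Under these assumptions the search time exceeds 10²⁷ years. The result is dominated by the exponential term, so changing the assumed sampling rate by a few orders of magnitude does not alter the conclusion.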

Resolution: Proteins do not search randomly—they follow folding pathways guided by energy landscapes.

📺 Video Lectures

Comprehensive lecture series from MIT 5.08J Biological Chemistry II with Prof. Elizabeth Nolan, covering protein folding mechanisms, kinetics, energy landscapes, and experimental techniques.

Lecture 8: Protein Folding 1

Introduction to protein folding: primary through quaternary structure, forces stabilizing folded proteins, and the thermodynamics of folding. Prof. Nolan explores Anfinsen's principle and Levinthal's paradox.

Recitation 3: Pre-Steady State and Steady-State Kinetic Methods Applied to Translation

Application of kinetic methods to study biological processes, focusing on translation machinery. Essential background for understanding protein folding kinetics and experimental approaches.

Lecture 9: Protein Folding 2

Energy landscapes, folding funnels, and kinetic pathways. Discussion of molten globule states, folding intermediates, and the role of conformational entropy in guiding folding.

Lecture 10: Protein Folding 3

Experimental methods for studying protein folding: circular dichroism, fluorescence spectroscopy, hydrogen-deuterium exchange, and NMR. Introduction to chaperones and protein quality control.

Lecture 11: Protein Folding 4

Molecular chaperones (GroEL/GroES, Hsp70, Hsp90), protein misfolding diseases (Alzheimer's, Parkinson's, prion diseases), and the cellular protein quality control system. Proteostasis and aggregation.

Connecting Molecular Details to Macroscopic Behaviors with Thermodynamics and Information Theory

An advanced lecture exploring how thermodynamic principles and information theory bridge molecular-scale details (amino acid sequences, atomic interactions, conformational states) to macroscopic observables (folding rates, stability, function). This connects statistical mechanics, entropy, free energy landscapes, and information content in sequences to the emergent behavior of protein systems.

Key Concepts:

  • Statistical mechanics of protein ensembles
  • Thermodynamic principles governing folding transitions
  • Information theory and sequence entropy
  • Bridging microscopic and macroscopic descriptions
  • Free energy landscapes from molecular interactions
  • Evolutionary information in protein families

Course Information

MIT 5.08J Biological Chemistry II
Instructor: Prof. Elizabeth Nolan
These lectures provide comprehensive coverage of protein folding from thermodynamic, kinetic, and structural perspectives, essential for understanding how sequence determines structure and function.

Four Levels of Protein Structure

1. Primary Structure

The linear sequence of amino acids connected by peptide bonds. This is the genetic information translated from mRNA.

NH₂ - Gly - Ala - Val - Leu - ... - Tyr - Trp - COOH

For proteins that fold spontaneously, the sequence encodes all higher levels of structure (Anfinsen's principle).

2. Secondary Structure

Local folding patterns stabilized by hydrogen bonds between backbone atoms:

  • α-helix: Right-handed helix with 3.6 residues per turn (φ ≈ -60°, ψ ≈ -45°)
  • β-sheet: Extended strands connected by hydrogen bonds (parallel or antiparallel)
  • β-turn: Reversal of chain direction, often connecting β-strands
  • Random coil: Unstructured regions with no regular pattern

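The backbone dihedral angles φ and ψ determine the local conformation, and a Ramachandran plot shows which (φ, ψ) combinations are sterically allowed. Each angle is defined by four consecutive backbone atoms:

$\phi:\ \text{C}_{i-1}\text{-}\text{N}_{i}\text{-}\text{C}_{\alpha,i}\text{-}\text{C}_{i}$
$\psi:\ \text{N}_{i}\text{-}\text{C}_{\alpha,i}\text{-}\text{C}_{i}\text{-}\text{N}_{i+1}$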

Dihedral angles defining backbone conformation
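As a minimal sketch of how such an angle is computed from atomic coordinates, the function below takes four points and returns the signed dihedral about the bond between the middle two. The helper names and the synthetic test geometry are illustrative assumptions; for φ one would pass the coordinates of C(i-1), N(i), Cα(i), C(i), and for ψ those of N(i), Cα(i), C(i), N(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal to the plane of p0, p1, p2
    n2 = _cross(b1, b2)   # normal to the plane of p1, p2, p3
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Synthetic geometry: the fourth point is rotated 60 degrees about the z axis.
theta = math.radians(60.0)
angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1),
                     (math.cos(theta), math.sin(theta), 1))
print(f"{angle:.1f}")
```

The `atan2` form keeps the sign of the angle, which matters because φ and ψ span the full range from -180° to +180° on a Ramachandran plot.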

3. Tertiary Structure

The overall 3D shape of a single polypeptide chain, determined by:

  • Disulfide bonds (S-S): Covalent bonds between cysteine residues
  • Hydrophobic effect: Nonpolar residues buried in core, polar residues on surface
  • Hydrogen bonds: Between side chains and backbone
  • Electrostatic interactions: Salt bridges between charged residues
  • Van der Waals forces: Weak attractions between atoms in close proximity

Free energy of folding:

$\Delta G_{\text{fold}} = \Delta H - T\Delta S$

Typically $\Delta G_{\text{fold}} \approx -5$ to $-15$ kcal/mol (marginally stable!)

The balance between enthalpic stabilization (favorable interactions) and entropic cost (loss of conformational freedom) is delicate—proteins are only marginally stable.
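To see what these numbers mean in terms of populations, the folding free energy can be converted into a folded fraction. The sketch below assumes a simple two-state equilibrium between unfolded and native protein, with K = [N]/[U] = exp(-ΔG_fold/RT); the function name and the temperature of 298.15 K are illustrative choices.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fraction_folded(delta_g_fold, temperature=298.15):
    """Native-state population for a two-state equilibrium.

    delta_g_fold is G(native) - G(unfolded) in kcal/mol, so negative
    values favor the folded state.
    """
    return 1.0 / (1.0 + math.exp(delta_g_fold / (R_KCAL * temperature)))

for dg in (-1.0, -5.0, -10.0, -15.0):
    unfolded = 1.0 - fraction_folded(dg)
    print(f"dG_fold = {dg:6.1f} kcal/mol -> unfolded fraction = {unfolded:.2e}")
```

Even at -5 kcal/mol, a small but nonzero fraction of molecules is unfolded at any instant, and a destabilizing change of only a few kcal/mol shifts the population substantially.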

4. Quaternary Structure

The assembly of multiple polypeptide chains (subunits) into a functional complex.

Examples:

  • Hemoglobin: α₂β₂ tetramer (4 subunits) — oxygen transport
  • DNA polymerase III: 10+ subunits — DNA replication
  • Proteasome: 28 subunits — protein degradation
  • Ribosome: 50+ proteins + RNA — protein synthesis

Subunit interactions provide allosteric regulation, increased stability, and functional diversity.

Energy Landscapes and Folding Funnels

Modern understanding of protein folding views the process as navigation through a funnel-shaped energy landscape. This explains how proteins fold quickly despite the astronomical number of possible conformations.

The Folding Funnel Model

Instead of a single folding pathway, proteins fold via many parallel routes through conformational space, all leading downhill toward the native state:

  • Unfolded ensemble: High entropy, high energy, many conformations
  • Molten globule: Partially collapsed with some secondary structure
  • Transition states: Rate-limiting barriers along folding pathways
  • Native state: Low entropy, low energy, unique structure

Conformational entropy vs. energy:

$S_{\text{conf}} = k_B \ln \Omega(\mathbf{r})$

Ω(r) = number of accessible conformations at position r on the landscape (for example, at a given degree of nativeness)

$F(\mathbf{r}) = E(\mathbf{r}) - TS_{\text{conf}}(\mathbf{r})$

Free energy surface: folding proceeds downhill in F, not just E
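The competition between energy and conformational entropy can be illustrated with a deliberately crude toy model. It is a sketch under stated assumptions, not a realistic landscape: a single coordinate q (fraction of residues in their native state), a hypothetical stabilization energy `eps` per native residue, and `omega` conformations available to each non-native residue. Molar units are used, so R replaces k_B.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def free_energy(q, temperature, n_residues=100, eps=0.7, omega=3.0):
    """F(q) = E(q) - T * S_conf(q) for a toy funnel, in kcal/mol.

    Assumptions: E(q) = -eps * N * q and Omega(q) = omega ** (N * (1 - q)).
    """
    energy = -eps * n_residues * q
    entropy = R_KCAL * n_residues * (1.0 - q) * math.log(omega)
    return energy - temperature * entropy

def folding_temperature(eps=0.7, omega=3.0):
    """Temperature at which F(q=0) equals F(q=1) in this toy model."""
    return eps / (R_KCAL * math.log(omega))

t_f = folding_temperature()
for temperature in (280.0, t_f, 360.0):
    delta_f = free_energy(1.0, temperature) - free_energy(0.0, temperature)
    print(f"T = {temperature:6.1f} K   F(native) - F(unfolded) = {delta_f:7.2f} kcal/mol")
```

Below the folding temperature the energy term wins and the native state has the lower free energy; above it the entropy of the unfolded ensemble wins. Because both terms are linear in q here, the model has no barrier, so it says nothing about transition states or rates.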

Folding Kinetics

Two-state folding model (applicable to small, single-domain proteins):

$$\text{U} \underset{k_u}{\overset{k_f}{\rightleftharpoons}} \text{N}$$

U = unfolded, N = native, $k_f$ = folding rate, $k_u$ = unfolding rate

$$k_f = k_0 \exp\left(-\frac{\Delta G^\ddagger}{k_B T}\right)$$

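Arrhenius/Eyring-type rate law: the folding rate falls exponentially with the transition-state barrier height ΔG‡, and k₀ is the rate in the absence of a barrier. With ΔG‡ in molar units (kcal/mol), replace k_B T with RT.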

Typical folding times range from microseconds (ultra-fast folders) to seconds (complex multi-domain proteins).
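The two-state scheme above has a simple closed-form solution, sketched below. Starting from fully unfolded protein, the native fraction relaxes exponentially with observed rate k_f + k_u toward the equilibrium value k_f / (k_f + k_u). The second function inverts the rate law to estimate a barrier height; its prefactor `k0 = 1e6` per second is an assumed order-of-magnitude value, not one given in the text, and RT is used in place of k_B T because the barrier is reported per mole.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def native_fraction(k_f, k_u, t):
    """Native fraction at time t (s), starting from 100% unfolded."""
    k_obs = k_f + k_u                 # observed relaxation rate
    f_eq = k_f / k_obs                # equilibrium native fraction
    return f_eq * (1.0 - math.exp(-k_obs * t))

def barrier_kcal(k_f, k0=1e6, temperature=298.15):
    """Barrier height implied by k_f = k0 * exp(-dG / RT), in kcal/mol."""
    return -R_KCAL * temperature * math.log(k_f / k0)

k_f, k_u = 1000.0, 1.0  # hypothetical rates in 1/s
for t in (0.0, 1e-3, 1e-2):
    print(f"t = {t:.0e} s   native fraction = {native_fraction(k_f, k_u, t):.3f}")
print(f"implied barrier = {barrier_kcal(k_f):.2f} kcal/mol")
```

With these hypothetical rates the protein is mostly folded within a few milliseconds, and the implied barrier is only a few kcal/mol, which illustrates how sensitive folding times are to modest changes in barrier height.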

Quantum Mechanics in Protein Folding

While protein folding is often treated classically, quantum effects contribute at multiple levels:

1. Hydrogen Bonding

H-bonds stabilizing secondary structures involve quantum mechanical effects:

  • Proton delocalization and tunneling between donor/acceptor
  • Zero-point vibrational energy affects bond strengths
  • Cooperative effects in α-helices and β-sheets require quantum treatment
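
$E_{\text{H-bond}} \approx -5 \text{ kcal/mol} \approx -0.2 \text{ eV}$

This is about 8 $k_B T$ at room temperature ($k_B T \approx 0.6$ kcal/mol, or 0.026 eV), so a single hydrogen bond sits well above the thermal energy scale. Quantum effects matter because the X-H stretching quantum (roughly 0.4 eV for N-H and O-H) is far larger than $k_B T$, which makes zero-point motion of the proton significant.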
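To place the hydrogen-bond energy on the thermal scale, the short sketch below converts 5 kcal/mol to eV and divides by k_B T at 298.15 K. The constants are standard values; the variable and function names are illustrative.

```python
K_B_EV = 8.617333e-5            # Boltzmann constant in eV/K
KCAL_PER_MOL_IN_EV = 0.0433641  # 1 kcal/mol expressed in eV per particle

def thermal_energy_ev(temperature=298.15):
    """k_B * T in eV."""
    return K_B_EV * temperature

e_hbond_ev = 5.0 * KCAL_PER_MOL_IN_EV    # magnitude of the H-bond energy
ratio = e_hbond_ev / thermal_energy_ev()

print(f"H-bond energy: {e_hbond_ev:.3f} eV")
print(f"k_B T at 298.15 K: {thermal_energy_ev():.4f} eV")
print(f"ratio: {ratio:.1f}")
```

The ratio comes out at several k_B T, which is the relevant comparison when judging how readily thermal fluctuations can break a single hydrogen bond.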

2. Electronic Structure of Side Chains

Aromatic residues (Phe, Tyr, Trp), disulfide bonds, and metal coordination sites require quantum chemical treatment:

  • π-π stacking interactions between aromatic rings
  • Cation-π interactions (Arg/Lys with aromatics)
  • Charge transfer and dispersion forces

3. Hydrophobic Effect

The primary driving force for folding—burial of nonpolar residues—has quantum origins:

Water structure around hydrophobic groups involves hydrogen bonding networks that depend on quantum nuclear effects (particularly important for accurate computational modeling).

4. Conformational Dynamics

Proteins are not static—they undergo thermal fluctuations and conformational changes:

  • Quantum tunneling through torsional barriers in backbone and side chains
  • Zero-point motion affects conformational sampling
  • Quantum coherence in protein breathing motions and allosteric transitions has been proposed but remains speculative

Protein Misfolding and Disease

When proteins fail to fold correctly, the consequences can be catastrophic. Misfolded proteins are implicated in numerous diseases.

Protein Aggregation Diseases

  • Alzheimer's Disease: Amyloid-β plaques and tau tangles
  • Parkinson's Disease: α-synuclein aggregation in Lewy bodies
  • Huntington's Disease: Huntingtin protein with expanded polyglutamine repeats
  • Prion Diseases: Infectious misfolded PrP proteins (Creutzfeldt-Jakob disease, mad cow disease)
  • Type 2 Diabetes: Islet amyloid polypeptide aggregation

Molecular Chaperones

Cells employ specialized proteins called chaperones to assist folding and prevent aggregation:

  • Hsp70 family: Bind hydrophobic patches on nascent chains
  • GroEL/GroES (Hsp60): Barrel-shaped chamber providing isolated folding environment
  • Hsp90: Stabilizes metastable conformations of signaling proteins
  • Small HSPs: Prevent aggregation under stress conditions

Computational Protein Folding

Predicting protein structure from sequence is one of the grand challenges in molecular biology.

Molecular Dynamics Simulations

Classical MD simulations propagate Newton's equations of motion for all atoms:

$m_i \frac{d^2\mathbf{r}_i}{dt^2} = -\nabla_i V(\mathbf{r}_1, \dots, \mathbf{r}_N)$

Force on atom i from potential energy function V

$V = V_{\text{bonds}} + V_{\text{angles}} + V_{\text{torsions}} + V_{\text{nonbonded}}$

Limitations: force fields are approximate, long timescales are difficult to reach, and quantum effects are neglected.
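The equation of motion above is integrated numerically in small time steps. The sketch below uses the velocity Verlet scheme, a common choice in MD codes, on a single particle in a one-dimensional harmonic potential standing in for a bond term. The reduced units, parameter values, and function names are illustrative assumptions and do not correspond to any particular force field or MD package.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x); returns a list of (x, v) pairs."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory

# Toy system in reduced units: V(x) = 0.5 * k * x**2, so force = -dV/dx = -k * x.
k, mass = 1.0, 1.0

def force(x):
    return -k * x

def total_energy(x, v):
    return 0.5 * mass * v * v + 0.5 * k * x * x

trajectory = velocity_verlet(x=1.0, v=0.0, force=force, mass=mass, dt=0.01, n_steps=1000)
e_start = total_energy(*trajectory[0])
e_end = total_energy(*trajectory[-1])
print(f"relative energy drift: {abs(e_end - e_start) / e_start:.1e}")
```

Monitoring energy conservation in this way is a standard sanity check on the time step. In a real simulation the force would come from the gradient of the full potential V over all atoms, and the time step is limited by the fastest bond vibrations, which is one reason long timescales are hard to reach.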

AlphaFold Revolution

DeepMind's AlphaFold2 (2020) achieved near-experimental accuracy in structure prediction using deep learning:

  • Transformer-based neural network trained on PDB structures
  • Multiple sequence alignments capture evolutionary constraints
  • Attention mechanisms model residue-residue contacts
  • Predicts 3D atomic coordinates end to end, with inter-residue distances as an auxiliary output

Impact:

The AlphaFold Protein Structure Database provides more than 200 million predicted structures, covering nearly every catalogued protein sequence, and has reshaped structural biology and drug discovery.

📚 Key References

Anfinsen, C. B. (1973)

"Principles that Govern the Folding of Protein Chains"

Science 181(4096): 223-230. Nobel Prize lecture.

Dill, K. A. & MacCallum, J. L. (2012)

"The Protein-Folding Problem, 50 Years On"

Science 338(6110): 1042-1046.

Jumper, J. et al. (2021)

"Highly accurate protein structure prediction with AlphaFold"

Nature 596: 583-589.

Dobson, C. M. (2003)

"Protein folding and misfolding"

Nature 426: 884-890.