3. Protein Folding & Higher-Order Structure

From local backbone geometry to the three-dimensional architecture that dictates biological function.

Secondary Structure

Secondary structure refers to local, regular folding patterns stabilized by backbone hydrogen bonds between the amide N-H and carbonyl C=O groups of the peptide backbone. The two principal elements are the $\alpha$-helix and the $\beta$-sheet.

The $\alpha$-Helix

The $\alpha$-helix is a right-handed coil in which every backbone C=O of residue $i$ forms a hydrogen bond with the backbone N-H of residue $i+4$. This produces a tightly wound helix with well-defined geometric parameters:

Residues per turn: 3.6

Rise per residue (d): 1.5 Å

Pitch (p): 5.4 Å (one complete turn)

Hydrogen bond length: ~2.8 Å (N...O distance)

Dihedral angles: $\phi \approx -57°$, $\psi \approx -47°$

The pitch of the helix relates to the rise per residue and the number of residues per turn:

$$p = n \times d = 3.6 \times 1.5\;\text{\AA} = 5.4\;\text{\AA}$$

The angular rotation per residue about the helix axis is:

$$\theta = \frac{360°}{3.6} = 100°\;\text{per residue}$$
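The two relations above can be checked numerically. The following is a minimal sketch that assumes an ideal, perfectly regular helix; the function name `helix_geometry` and the 18-residue example are illustrative choices, not part of the original text.

```python
# Canonical alpha-helix parameters quoted in the text.
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # angstroms


def helix_geometry(n_residues):
    """Return (length in angstroms, number of turns, total rotation in degrees)
    for an idealized alpha-helix of n_residues."""
    pitch = RESIDUES_PER_TURN * RISE_PER_RESIDUE      # p = n * d
    rotation_per_residue = 360.0 / RESIDUES_PER_TURN  # degrees about the axis
    length = n_residues * RISE_PER_RESIDUE
    turns = length / pitch
    return length, turns, n_residues * rotation_per_residue


# Hypothetical 18-residue helix, chosen because it spans a whole number of turns.
length, turns, rotation = helix_geometry(18)
print(f"{length:.1f} angstroms, {turns:.1f} turns, {rotation:.0f} degrees")
```

Real helices deviate from these ideal values, so the result is best read as an estimate of helix length rather than a measurement.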

The $\beta$-Sheet

$\beta$-Sheets are formed by laterally packed $\beta$-strands connected by hydrogen bonds between adjacent strands. Two arrangements exist:

Parallel $\beta$-Sheet

Adjacent strands run in the same N-to-C direction. Hydrogen bonds are evenly spaced but slightly angled. Dihedral angles: $\phi \approx -119°$, $\psi \approx +113°$.

Antiparallel $\beta$-Sheet

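Adjacent strands run in opposite directions. Hydrogen bonds are close to linear and run roughly perpendicular to the strands, in alternating narrowly and widely spaced pairs; this geometry is generally considered to make antiparallel sheets somewhat more stable. Dihedral angles: $\phi \approx -139°$, $\psi \approx +135°$.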
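As a rough illustration of how these idealized dihedral angles separate the main secondary structure types, the sketch below assigns a measured $(\phi, \psi)$ pair to the nearest canonical value quoted in this section. The nearest-neighbor rule and the function names are assumptions made for illustration; this is not how established assignment programs work, since those also use hydrogen-bond patterns.

```python
import math

# Idealized backbone dihedrals (phi, psi) in degrees, taken from the text.
CANONICAL = {
    "alpha-helix": (-57.0, -47.0),
    "parallel beta": (-119.0, 113.0),
    "antiparallel beta": (-139.0, 135.0),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def nearest_conformation(phi, psi):
    """Return the canonical label whose (phi, psi) lies closest to the input.
    Illustrative nearest-neighbor rule only."""
    def distance(label):
        ref_phi, ref_psi = CANONICAL[label]
        return math.hypot(angle_diff(phi, ref_phi), angle_diff(psi, ref_psi))
    return min(CANONICAL, key=distance)


print(nearest_conformation(-60.0, -45.0))    # alpha-helix
print(nearest_conformation(-135.0, 140.0))   # antiparallel beta
```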

Turns and Loops

$\beta$-Turns are short segments (typically 4 residues) that reverse the direction of the polypeptide chain. A hydrogen bond between the C=O of residue $i$ and the N-H of residue $i+3$ stabilizes the turn.

Type I Turn

Residue $i+1$: $\phi = -60°$, $\psi = -30°$. Residue $i+2$: $\phi = -90°$, $\psi = 0°$. Pro often found at position $i+1$.

Type II Turn

Residue $i+1$: $\phi = -60°$, $\psi = +120°$. Residue $i+2$: $\phi = +80°$, $\psi = 0°$. Gly is preferred at position $i+2$.
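The ideal angles for the two turn types can be turned into a simple check. This sketch compares the dihedrals of residues $i+1$ and $i+2$ with the values listed above; the 30° tolerance and the name `classify_turn` are hypothetical choices for illustration, not a standard.

```python
# Ideal (phi, psi) for residues i+1 and i+2, in degrees, from the text.
IDEAL_TURNS = {
    "type I": ((-60.0, -30.0), (-90.0, 0.0)),
    "type II": ((-60.0, 120.0), (80.0, 0.0)),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify_turn(res1, res2, tolerance=30.0):
    """Classify a beta-turn from the (phi, psi) of residues i+1 and i+2.
    Returns the turn type, or None if no ideal geometry matches.
    The tolerance is an assumed cutoff, not a value from the text."""
    observed = (*res1, *res2)
    for name, (ideal1, ideal2) in IDEAL_TURNS.items():
        expected = (*ideal1, *ideal2)
        if all(angle_diff(o, e) <= tolerance for o, e in zip(observed, expected)):
            return name
    return None


print(classify_turn((-65.0, -25.0), (-95.0, 10.0)))  # type I
print(classify_turn((-55.0, 130.0), (85.0, -5.0)))   # type II
```

The positive $\phi$ at position $i+2$ of a type II turn is what distinguishes it most clearly, which is consistent with the preference for Gly at that position.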

Supersecondary Structure (Motifs)

Supersecondary structures are recurring combinations of secondary structure elements found across many unrelated proteins. They represent common solutions to structural and functional problems in protein architecture.

$\beta$-$\alpha$-$\beta$ Motif

Two parallel $\beta$-strands connected by an $\alpha$-helix. Common in $\alpha/\beta$ barrel enzymes (e.g., TIM barrel). The helix shields hydrophobic residues from solvent.

Helix-Turn-Helix

Two $\alpha$-helices separated by a short turn. Found in DNA-binding proteins where the recognition helix inserts into the major groove of DNA.

Zinc Finger

A compact domain stabilized by a $\text{Zn}^{2+}$ ion coordinated by Cys and His residues. Cys$_2$His$_2$ is the classic type. Found in transcription factors (e.g., TFIIIA).

Leucine Zipper

Two $\alpha$-helices with leucine at every 7th position (heptad repeat) forming a coiled-coil. Mediates dimerization in transcription factors (e.g., GCN4, c-Fos/c-Jun).

The Coiled-Coil

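In a coiled-coil, two $\alpha$-helices wind around each other with a left-handed superhelical twist. The heptad repeat (positions labeled $a$ through $g$) places hydrophobic residues at positions $a$ and $d$, forming a hydrophobic interface between the helices. The two helices cross at an angle of approximately $20°$. The supercoiling originates in a mismatch between the helix periodicity and the heptad: seven residues of a standard $\alpha$-helix rotate through $700°$, which falls $20°$ short of two full turns:

$$\Delta\theta = 2 \times 360° - 7 \times 100° = 20°$$

Equivalently, a 7-residue repeat spanning two turns requires 3.5 residues per turn, whereas a standard $\alpha$-helix has 3.6. On a straight helix the hydrophobic stripe would therefore drift around the surface; the helices tilt and wind around each other to bring the stripe into register every two turns.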
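The heptad register can be made concrete with a short sketch. It labels each residue $a$ through $g$, extracts the $a$ and $d$ positions that form the interface, and computes how far seven residues at $100°$ each fall short of two full turns. The peptide sequence and the assumption that the first residue occupies position $a$ are hypothetical, chosen only for illustration.

```python
HEPTAD = "abcdefg"
ROTATION_PER_RESIDUE = 360.0 / 3.6  # degrees, standard alpha-helix


def heptad_positions(sequence, start="a"):
    """Pair each residue with its heptad label, assuming the first
    residue sits at position `start`."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_residues(sequence, start="a"):
    """Residues at the a and d positions (the hydrophobic interface)."""
    return [res for res, pos in heptad_positions(sequence, start) if pos in "ad"]


# Angular shortfall of one heptad relative to two full turns.
drift_per_heptad = 2 * 360.0 - 7 * ROTATION_PER_RESIDUE

# Hypothetical two-heptad peptide with Leu at every a and d position.
print(core_residues("LKELEDKLEELLSK"))
print(round(drift_per_heptad, 6))
```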

Tertiary Structure

Tertiary structure is the overall three-dimensional arrangement of all atoms in a single polypeptide chain. It arises from long-range interactions between amino acid residues that may be far apart in the primary sequence but close in three-dimensional space.

Forces Driving Protein Folding

Hydrophobic Effect (Dominant)

Nonpolar side chains are buried in the protein interior, releasing the ordered water molecules that surrounded them in the unfolded chain. The resulting increase in solvent entropy is the single largest driving force for folding.

Hydrogen Bonds

Backbone and side-chain H-bonds contribute ~2-7 kJ/mol each. Although individually weak, hundreds of H-bonds collectively stabilize the fold.

Van der Waals Forces

Close packing of atoms in the protein interior generates many weak (~0.4-4 kJ/mol) but additive London dispersion interactions.

Ionic Interactions & Disulfide Bonds

Salt bridges between charged residues (Lys/Arg with Asp/Glu). Disulfide bonds (Cys-Cys) provide covalent stabilization, especially in extracellular proteins.

Thermodynamics of Folding

The Gibbs free energy of folding determines whether a protein will adopt its native conformation:

$$\Delta G_{\text{folding}} = \Delta H_{\text{folding}} - T\Delta S_{\text{folding}}$$

Marginal Stability

The net stability of a folded protein is surprisingly small: $\Delta G_{\text{folding}} \approx -20$ to $-60$ kJ/mol. This is the small difference between much larger opposing contributions: the conformational entropy lost upon folding ($T\Delta S \sim$ hundreds of kJ/mol) is nearly balanced by the enthalpy gained from favorable interactions ($\Delta H \sim$ hundreds of kJ/mol). This marginal stability allows proteins to remain flexible and open to regulation.

Example:

For a typical globular protein with $\Delta H_{\text{folding}} = -300$ kJ/mol and $T\Delta S_{\text{folding}} = -260$ kJ/mol at 25 °C:

$$\Delta G_{\text{folding}} = -300 - (-260) = -40\;\text{kJ/mol}$$

This net stability is equivalent to only a handful of hydrogen bonds (at ~2-7 kJ/mol each, roughly 6 to 20 of the hundreds present), illustrating marginal stability.
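The worked example can be reproduced in a few lines. This sketch evaluates $\Delta G = \Delta H - T\Delta S$ for the numbers given above and also recovers the entropy change implied by $T\Delta S = -260$ kJ/mol at 298.15 K. The function name is an illustrative choice.

```python
def delta_g_folding(delta_h, t_delta_s):
    """Gibbs free energy of folding in kJ/mol, given delta H and
    T * delta S (both in kJ/mol)."""
    return delta_h - t_delta_s


dg = delta_g_folding(-300.0, -260.0)
print(dg)  # -40.0

# Entropy change implied by T * delta S = -260 kJ/mol at 25 degrees C.
T = 298.15                       # kelvin
delta_s = -260.0 / T * 1000.0    # J/(mol K)
print(round(delta_s, 2))
```

Both input terms are roughly seven times larger in magnitude than their difference, which is the sense in which the fold is only marginally stable.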

Quaternary Structure

Quaternary structure describes the arrangement of multiple polypeptide subunits in a multisubunit protein complex. The same non-covalent forces that stabilize tertiary structure (hydrophobic interactions, hydrogen bonds, ionic interactions) hold subunits together at protein-protein interfaces.

Homomultimers

Composed of identical subunits. Examples: glutathione S-transferase (homodimer), p53 (homotetramer), GroEL (homo-14-mer arranged as two stacked heptameric rings).

Heteromultimers

Composed of non-identical subunits. Examples: hemoglobin ($\alpha_2\beta_2$), DNA polymerase III holoenzyme (10+ different subunits), ATP synthase ($F_1$: $\alpha_3\beta_3\gamma\delta\varepsilon$).

Symmetry in Oligomers

Most homo-oligomers display rotational symmetry:

Cyclic Symmetry ($C_n$)

One axis of rotational symmetry. A $C_3$ trimer has a 3-fold rotation axis ($120°$ rotation). Example: bacteriorhodopsin trimer.

Dihedral Symmetry ($D_n$)

One $n$-fold axis plus $n$ perpendicular 2-fold axes. Hemoglobin ($\alpha_2\beta_2$) exhibits pseudo-$D_2$ symmetry. Aspartate transcarbamoylase is a $D_3$ complex.
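A small sketch can illustrate what these point groups imply geometrically. It places the subunits of a $C_n$ ring by repeated rotation about one axis and counts symmetry-equivalent positions ($n$ for $C_n$, $2n$ for $D_n$). The unit radius and the function names are assumptions for illustration only.

```python
import math


def cyclic_positions(n, radius=1.0):
    """(x, y) coordinates of n subunits related by an n-fold axis along z.
    Each subunit is rotated by 360/n degrees from its neighbor."""
    step = 2.0 * math.pi / n
    return [(radius * math.cos(k * step), radius * math.sin(k * step))
            for k in range(n)]


def equivalent_positions(group, n):
    """Number of symmetry-equivalent positions: n for C_n, 2n for D_n."""
    return {"C": n, "D": 2 * n}[group]


print(360.0 / 3)                       # rotation angle of a C3 axis: 120.0
print(equivalent_positions("D", 3))    # 6
for x, y in cyclic_positions(3):
    print(round(x, 3), round(y, 3))
```

In a hetero-oligomer such as aspartate transcarbamoylase, each equivalent position may hold more than one polypeptide chain, so the chain count can exceed $2n$.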

Protein Folding

Levinthal's Paradox

If a protein of $n$ residues sampled all possible conformations randomly (each residue having ~$3$ rotameric states for each of 2 backbone torsions), the total number of conformations would be:

$$N_{\text{conformations}} = 3^{2n}$$

For a 100-residue protein, $N \approx 3^{200} \approx 10^{95}$. Even sampling each in $10^{-13}$ s (a bond vibration), the search would take $\sim 10^{82}$ s, far exceeding the age of the universe ($\sim 4 \times 10^{17}$ s). Yet proteins fold in milliseconds to seconds. This is Levinthal's paradox: folding cannot proceed by random search.
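The estimate is easy to reproduce. The sketch below works in base-10 logarithms to avoid very large numbers, using the same assumptions as the text (3 states per torsion, 2 torsions per residue, $10^{-13}$ s per conformation). The function and parameter names are illustrative.

```python
import math


def levinthal_search_time(n_residues, states_per_torsion=3,
                          torsions_per_residue=2,
                          seconds_per_conformation=1e-13):
    """Return (log10 of the number of conformations,
    log10 of the exhaustive search time in seconds)."""
    log_n = torsions_per_residue * n_residues * math.log10(states_per_torsion)
    log_t = log_n + math.log10(seconds_per_conformation)
    return log_n, log_t


log_n, log_t = levinthal_search_time(100)
print(f"conformations ~ 10^{log_n:.1f}, search time ~ 10^{log_t:.1f} s")

# Compare with the age of the universe (~4e17 s, as quoted in the text).
print(f"orders of magnitude longer: {log_t - math.log10(4e17):.1f}")
```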

The Folding Funnel

The modern resolution of Levinthal's paradox is the energy landscape or folding funnel model. The conformational energy landscape is funnel-shaped: many high-energy unfolded states converge toward a single low-energy native state:

$$E(\text{native}) \ll E(\text{unfolded})$$

The funnel has a broad rim (many unfolded conformations with high entropy) and a narrow bottom (the unique native state). Local energy minima along the funnel surface can trap the protein in intermediate or misfolded states.

The Two-State Model

Many small globular proteins fold in a cooperative, two-state process with no detectable intermediates:

$$\text{N} \rightleftharpoons \text{U}, \quad K_{\text{eq}} = \frac{[\text{U}]}{[\text{N}]} = e^{\Delta G_{\text{folding}} / RT}$$

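where N is the native state and U is the unfolded state. Because $\Delta G_{\text{folding}}$ is negative for a stable protein, $K_{\text{eq}} < 1$ and the native state dominates at equilibrium.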
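To see what the two-state expression means in practice, this sketch converts $\Delta G_{\text{folding}}$ into the fraction of molecules in the native state, using $K_{\text{eq}} = [\text{U}]/[\text{N}]$ as defined above. The temperature of 298.15 K and the function name are assumptions for illustration.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)


def fraction_native(delta_g_folding, temperature=298.15):
    """Fraction of molecules in state N for a two-state N <-> U equilibrium.
    delta_g_folding is in kJ/mol and is negative for a stable protein."""
    k_eq = math.exp(delta_g_folding / (R * temperature))  # [U]/[N]
    return 1.0 / (1.0 + k_eq)


for dg in (-40.0, -20.0, 0.0):
    print(dg, fraction_native(dg))
```

At $\Delta G_{\text{folding}} = 0$ the two states are equally populated, which corresponds to the midpoint of an unfolding transition.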

Anfinsen's Thermodynamic Hypothesis

Christian Anfinsen (Nobel Prize in Chemistry, 1972) demonstrated that the native structure of a protein is determined by its amino acid sequence. Denatured and reduced ribonuclease A spontaneously refolded to its active conformation once the denaturant and reducing agent were removed and the disulfide bonds were allowed to reform. This result supports the view that the native state corresponds to the global free energy minimum of the polypeptide chain under physiological conditions.

Chaperones and Assisted Folding

Although the native fold is encoded in the amino acid sequence, many proteins require molecular chaperones to fold efficiently in the crowded cellular environment (protein concentration ~300 g/L). Chaperones do not alter the final structure; they prevent misfolding and aggregation.

Hsp70 (DnaK)

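Binds exposed hydrophobic segments of nascent or unfolded polypeptides. The ATP-bound state has low substrate affinity, so ATP binding triggers substrate release; ATP hydrolysis, stimulated by DnaJ (Hsp40), converts Hsp70 to the ADP-bound state that binds substrate tightly. GrpE acts as the nucleotide exchange factor.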

Hsp60 (GroEL/GroES)

A large barrel-shaped complex (chaperonin). The unfolded protein enters the GroEL cavity, GroES caps it, and ATP hydrolysis drives conformational changes that promote folding in an isolated environment.

Hsp90

Stabilizes metastable client proteins (kinases, steroid receptors, transcription factors). Forms a dimeric clamp. ATP-dependent cycle with co-chaperones (Hop, p23, Aha1).

Protein Misfolding Diseases

When the folding machinery fails, proteins can misfold into toxic aggregates, particularly amyloid fibrils characterized by cross-$\beta$ structure:

Prion Diseases

PrP$^{\text{C}}$ (normal, $\alpha$-helical) converts to PrP$^{\text{Sc}}$ ($\beta$-sheet-rich, infectious). Creutzfeldt-Jakob disease, BSE, kuru.

Alzheimer's Disease

Amyloid-$\beta$ peptide (A$\beta$42) aggregates into extracellular plaques. Tau protein forms intracellular neurofibrillary tangles.

Parkinson's Disease

$\alpha$-Synuclein aggregates into Lewy bodies in dopaminergic neurons. Fibrils adopt cross-$\beta$ amyloid structure.

Experimental Methods for Protein Structure

Determining protein structure at atomic resolution requires sophisticated physical techniques. Each method has distinct strengths and limitations.

X-ray Crystallography

Requires protein crystals. X-rays diffract from electron clouds; Bragg's law ($n\lambda = 2d\sin\theta$) relates diffraction angle to interplanar spacing. Resolution typically 1.5-3.0 Å. Provides static, time-averaged structures. Accounts for over 85% of PDB structures.
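Bragg's law can be rearranged to give the interplanar spacing sampled at a given diffraction angle. The sketch below assumes $\theta$ is the Bragg angle (half the scattering angle $2\theta$) and uses the Cu K$\alpha$ wavelength of 1.5418 Å as an illustrative laboratory source; the function name is a hypothetical choice.

```python
import math


def bragg_spacing(wavelength, theta_deg, order=1):
    """Interplanar spacing d from n * lambda = 2 * d * sin(theta).
    wavelength and the returned d share the same length unit;
    theta_deg is the Bragg angle in degrees."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))


CU_K_ALPHA = 1.5418  # angstroms, illustrative choice of X-ray source

for theta in (10.0, 20.0, 30.0):
    print(theta, round(bragg_spacing(CU_K_ALPHA, theta), 3))
```

Larger diffraction angles correspond to smaller spacings, which is why high-resolution data come from reflections far from the beam center.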

NMR Spectroscopy

Works in solution (near-physiological conditions). Uses nuclear spin interactions ($^1$H, $^{13}$C, $^{15}$N). NOESY cross-peaks reveal interatomic distances. Limited to proteins under ~40 kDa. Provides information on dynamics and conformational exchange.

Cryo-Electron Microscopy (Cryo-EM)

Samples are flash-frozen in vitreous ice. Single-particle reconstruction combines thousands of 2D projections. Since the "resolution revolution," the method routinely achieves sub-3 Å resolution. No crystallization is needed. Ideal for large complexes (ribosomes, viruses, membrane proteins).

Circular Dichroism (CD)

Measures differential absorption of left- and right-circularly polarized UV light. Far-UV CD (190-250 nm) reports on secondary structure content: $\alpha$-helix shows minima at 208 and 222 nm; $\beta$-sheet shows a minimum near 218 nm. Useful for monitoring folding/unfolding.

FRET (Förster Resonance Energy Transfer)

Non-radiative energy transfer between donor and acceptor fluorophores. Efficiency depends on distance ($r$) as $E = \frac{1}{1+(r/R_0)^6}$, where $R_0$ is the Förster radius (~20-90 Å). Single-molecule FRET can monitor real-time conformational changes during folding.
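The steep distance dependence is easiest to appreciate numerically. This sketch evaluates the efficiency formula above and its inverse; the value $R_0 = 50$ Å is a hypothetical dye pair within the quoted range, and the function names are illustrative.

```python
def fret_efficiency(r, r0):
    """Transfer efficiency E = 1 / (1 + (r / r0)**6).
    r and r0 must be in the same length unit."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def distance_from_efficiency(e, r0):
    """Invert the efficiency formula to recover the distance r
    (valid for 0 < e < 1)."""
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


R0 = 50.0  # angstroms, hypothetical Forster radius for illustration

for r in (25.0, 50.0, 100.0):
    print(r, round(fret_efficiency(r, R0), 4))
```

Efficiency falls from about 0.98 to about 0.02 as the distance goes from half to twice $R_0$, so measurements are most sensitive near $r \approx R_0$.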

Key Concepts

  • The $\alpha$-helix has 3.6 residues per turn, a pitch of 5.4 Å, and is stabilized by $i \to i+4$ backbone hydrogen bonds.
  • $\beta$-Sheets can be parallel or antiparallel; antiparallel sheets have more linear, stronger hydrogen bonds.
  • Coiled-coils use a heptad repeat to create a hydrophobic interface between two $\alpha$-helices with a ~20° crossing angle.
  • The hydrophobic effect is the dominant force driving protein folding, supplemented by H-bonds, van der Waals, and ionic interactions.
  • Folded proteins have marginal stability ($\Delta G \approx -20$ to $-60$ kJ/mol), the small difference of large opposing terms.
  • Levinthal's paradox shows that folding cannot occur by random conformational search; the folding funnel model resolves this.
  • Anfinsen's experiment demonstrated that amino acid sequence alone encodes the native fold (thermodynamic hypothesis).
  • Molecular chaperones (Hsp70, GroEL/GroES, Hsp90) prevent misfolding and aggregation in an ATP-dependent manner.
  • Protein misfolding leads to amyloid diseases: prion diseases, Alzheimer's (A$\beta$), and Parkinson's ($\alpha$-synuclein).
  • Major structural methods: X-ray crystallography, NMR, cryo-EM, CD spectroscopy, and single-molecule FRET.