Module 4: Whale Song & Communication

In 1970 Roger Payne released Songs of the Humpback Whale, an album that transformed public perception of cetaceans overnight and helped catalyze the international ban on commercial whaling. The structured, repetitive, hour-long vocalizations of male humpback whales revealed a musical complexity previously associated only with humans and songbirds. In the decades since, cetacean bioacoustics has uncovered cultural transmission of song across ocean basins, clan-specific dialects in sperm whales, and signature whistles in dolphins that function as proper names. This module derives the physical acoustics of long-range underwater propagation (the SOFAR channel), examines the structure of humpback song, and surveys the growing threat of anthropogenic noise pollution.

1. Humpback Whale Song: Structure and Cultural Transmission

Male humpback whales (Megaptera novaeangliae) on their tropical breeding grounds produce hierarchically structured songs. Payne & McVay (1971) identified four levels of organization:

Unit: a single sound (moan, grunt, cry, chirp), lasting ~1–4 s
Phrase: a short sequence of units (~5–15 s)
Theme: repeated phrases (2–4 minutes)
Song: a sequence of themes in a fixed order (20 minutes to a few hours)
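The four-level hierarchy can be sketched as a nested data structure. This is only an illustration: the class names, unit labels, durations, and repeat counts below are hypothetical values chosen to fall inside the ranges listed above, and silent gaps between units are ignored.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    label: str         # e.g. "moan", "cry" (hypothetical labels)
    duration_s: float  # a single sound, roughly 1-4 s


@dataclass
class Phrase:
    units: List[Unit]

    @property
    def duration_s(self) -> float:
        return sum(u.duration_s for u in self.units)


@dataclass
class Theme:
    phrase: Phrase
    repeats: int       # a theme is one phrase repeated many times

    @property
    def duration_s(self) -> float:
        return self.phrase.duration_s * self.repeats


@dataclass
class Song:
    themes: List[Theme]  # themes are sung in a fixed order

    @property
    def duration_s(self) -> float:
        return sum(th.duration_s for th in self.themes)


# Hypothetical example: one 8 s phrase, repeated 20 times per theme, 8 themes
phrase = Phrase([Unit("moan", 3.0), Unit("cry", 2.0),
                 Unit("grunt", 1.0), Unit("chirp", 2.0)])
theme = Theme(phrase, repeats=20)
song = Song([theme] * 8)

print(f"phrase: {phrase.duration_s:.0f} s")
print(f"theme:  {theme.duration_s / 60:.1f} min")
print(f"song:   {song.duration_s / 60:.1f} min")
```

With these made-up numbers the phrase, theme, and song durations land at 8 s, under 3 minutes, and about 21 minutes, which is how short units can build up to the song lengths quoted above.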

A singer repeats its song with the themes in the same order, for hours on end. All males in a given population sing the same song at a given time. But the song changes progressively: themes are added, dropped, and modified over the course of a breeding season. Most remarkably, these changes sweep across populations. Noad et al. (2000) documented how a novel song from the Indian Ocean (west Australia) replaced the ancestral east Australian song over two years as it traveled eastward — a clear case of cultural transmission and population-scale fashion cycles in non-human animals.

1.1 Functional Hypotheses

Despite fifty years of study, the function of humpback song remains debated. Leading hypotheses:

Sexual selection (male display): analogous to birdsong; a complex learned performance signaling fitness or condition to females.
Male-male signaling: coordination or competition between males; song stops or changes when two singers meet.
Sonar/biosonar: acoustic probing of the environment to locate other whales or features (Mercado 2018).

Most likely the song serves multiple functions simultaneously. The cultural revolution phenomenon (sudden adoption of a new song across a population) suggests an element of social learning beyond pure genetic or developmental programming.

2. The SOFAR Channel and Long-Range Propagation

The SOFAR channel (Sound Fixing and Ranging), also called the deep sound channel or the Munk channel, is a horizontal waveguide at around 1000 m depth in the temperate ocean. It arises because the sound speed in seawater reaches a minimum at depth: rays traveling nearly horizontally are refracted back toward this minimum and remain trapped around it.

2.1 Sound Speed in Seawater

Sound speed in seawater depends on temperature, salinity, and pressure:

\[ c(T,S,z) \approx 1449.2 + 4.6\,T - 0.055\,T^2 + 1.34(S-35) + 0.016\,z \]

(Mackenzie simplified formula; T in °C, S in ppt, z in meters)

Near the surface in temperate oceans, T is high (~20°C), which pushes c up even though the pressure term is still small. Through the thermocline, T falls rapidly with depth and the temperature effect outweighs the growing pressure term, so c decreases to a minimum. Below ~1 km the water is nearly isothermal, pressure dominates, and c rises again. The sound speed minimum (“SOFAR axis”) sits at ~1000 m in mid-latitudes, shallower in the tropics, and essentially at the surface in polar seas where there is no thermocline.
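As a rough numerical check of this argument, the simplified formula above can be evaluated at three depths. The temperatures assigned to each depth (20°C at the surface, 4°C at 1000 m, 2°C at 4000 m) are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def sound_speed(T_c, S_ppt, z_m):
    """Simplified sound-speed formula from Section 2.1.

    T_c: temperature (deg C), S_ppt: salinity (ppt), z_m: depth (m).
    Returns sound speed in m/s.
    """
    return (1449.2 + 4.6 * T_c - 0.055 * T_c**2
            + 1.34 * (S_ppt - 35.0) + 0.016 * z_m)


# Assumed (illustrative) temperatures at three depths, salinity fixed at 35 ppt
surface = sound_speed(20.0, 35.0, 0.0)
axis = sound_speed(4.0, 35.0, 1000.0)
deep = sound_speed(2.0, 35.0, 4000.0)

for name, c in [("surface", surface), ("1000 m", axis), ("4000 m", deep)]:
    print(f"{name:>8}: c = {c:.1f} m/s")
```

Under these assumptions the mid-depth value is lower than both the surface and the deep value, which is the sound-speed minimum that defines the channel axis.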

2.2 Acoustic Waveguide

A ray launched near the SOFAR axis and with small grazing angle never reaches the surface or bottom — it is continuously refracted back toward the axis (Snell's law applied in a continuously varying medium). The mode is analogous to light in a graded-index optical fiber. Geometric spreading for a trapped cylindrical mode is only \(r^{-1/2}\) (vs \(r^{-1}\) for spherical spreading), so transmission loss scales as:

\[ \text{TL}_{SOFAR} = 10 \log_{10}(r) + \alpha(f)\cdot r \]

vs \(20\log_{10}(r)\) for spherical spreading — 10 dB less loss per decade of range.
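The trapping condition described in this section can be made concrete with Snell's law for a stratified medium, \(\cos\theta(z)/c(z) = \text{const}\): a ray launched from the axis at grazing angle \(\theta_0\) turns back at the depth where \(c = c_{axis}/\cos\theta_0\). The sketch below uses assumed round values for the axis and surface sound speeds (1480 and 1520 m/s); the function names are hypothetical.

```python
import math


def turning_speed(c_axis, launch_angle_deg):
    """Sound speed at which a ray launched from the axis becomes horizontal."""
    return c_axis / math.cos(math.radians(launch_angle_deg))


def max_trapped_angle_deg(c_axis, c_boundary):
    """Largest launch grazing angle that turns before reaching the boundary."""
    return math.degrees(math.acos(c_axis / c_boundary))


c_axis = 1480.0     # m/s, assumed sound speed on the SOFAR axis
c_surface = 1520.0  # m/s, assumed sound speed at the sea surface

theta_max = max_trapped_angle_deg(c_axis, c_surface)
print(f"Rays within about +/-{theta_max:.1f} deg of horizontal stay trapped")

# A 5 degree ray turns where c reaches this value, well below c_surface
print(f"5 deg ray turns at c = {turning_speed(c_axis, 5.0):.1f} m/s")
```

With these numbers only rays within roughly 13° of horizontal are trapped; steeper rays reach the surface or the bottom and lose energy at each reflection.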

At 20 Hz (blue whale call frequency), absorption is tiny: \(\alpha \approx 0.001\,\text{dB/km}\). Combined with cylindrical spreading, a blue whale vocalization at source level 188 dB re 1 μPa can reach thousands of kilometers before attenuating below background ocean noise. Before the advent of extensive ship traffic, blue whales may have been able to communicate across entire ocean basins.
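To put numbers on this, the sketch below evaluates both spreading laws at 1000 km using the source level and absorption quoted above. It assumes range is measured in meters relative to a 1 m reference distance, which the formulas above do not state explicitly; the function names are hypothetical.

```python
import math


def tl_spherical(r_m, alpha_db_per_km=0.0):
    """Spherical spreading plus absorption, r in meters (1 m reference)."""
    return 20 * math.log10(r_m) + alpha_db_per_km * r_m / 1000.0


def tl_cylindrical(r_m, alpha_db_per_km=0.0):
    """Cylindrical (channel) spreading plus absorption, r in meters."""
    return 10 * math.log10(r_m) + alpha_db_per_km * r_m / 1000.0


SL = 188.0     # dB re 1 uPa, source level quoted in the text
alpha = 0.001  # dB/km at 20 Hz, value quoted in the text
r = 1.0e6      # 1000 km expressed in meters

rl_cyl = SL - tl_cylindrical(r, alpha)
rl_sph = SL - tl_spherical(r, alpha)
print(f"Absorption over 1000 km: {alpha * r / 1000:.1f} dB")
print(f"Received level, cylindrical: {rl_cyl:.0f} dB")
print(f"Received level, spherical:   {rl_sph:.0f} dB")
```

Absorption contributes only about 1 dB over 1000 km at this frequency, so the choice of spreading law, not absorption, dominates the budget: the two laws differ by 60 dB at this range.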

2.3 Blue Whale Calls

Blue whales (Balaenoptera musculus) produce some of the most intense biological sounds on Earth: ~188 dB re 1 μPa at 10–40 Hz. Nine regional populations have distinct call types (the “songs” of various oceans), and puzzlingly the peak frequencies of these calls have been shifting downward over decades — roughly 30% drop from 1960 to 2010. Proposed explanations include population recovery (if only the largest males sing, their fundamental frequencies are lower), sexual selection for lower pitch, or cultural drift.
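For a sense of scale, the quoted ~30% drop between 1960 and 2010 can be converted into an average yearly rate and a musical interval. This is plain arithmetic on the figure in the text, assuming a constant proportional decline, which the data need not follow.

```python
import math

ratio = 0.70         # final / initial frequency for a 30% drop
years = 2010 - 1960  # span quoted in the text

# Constant proportional change per year that compounds to the total drop
annual_ratio = ratio ** (1.0 / years)
annual_pct = (annual_ratio - 1.0) * 100.0

# The same drop expressed in semitones (12 semitones per octave)
semitones = 12.0 * math.log2(ratio)

print(f"Average change: {annual_pct:.2f} % per year")
print(f"Total shift:    {semitones:.1f} semitones")
```

The result is a decline of well under 1% per year, adding up to roughly half an octave over the five decades.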

3. Sperm Whale Codas and Dolphin Signature Whistles

Beyond sonar clicks for prey detection, sperm whales produce social codas: short patterned click sequences (typically 3–20 clicks with distinctive inter-click intervals). Codas are used for social communication at the surface, not for foraging. Rendell & Whitehead (2003) demonstrated that sperm whales in the Pacific partition into acoustic clans: each clan uses a distinct repertoire of codas, and clan boundaries transcend geographic ocean regions. A sperm whale raised in a particular clan learns its coda repertoire and retains it for life. These are among the clearest documented examples of animal culture outside humans.

3.1 Coda Syntax

A coda is notated by its rhythm, i.e. the pattern of its inter-click intervals. The common Caribbean “1+1+3” coda, for example, has two slowly spaced clicks, each followed by a pause, and then three quick clicks. Different clans use distinct “identity codas” that are particularly diagnostic of clan membership (Gero et al. 2016). Acoustic analysis suggests that coda vocal learning in sperm whales is comparable to dialect learning in humans or song-type culture in birds.
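The notation can be illustrated with a toy calculation: given click arrival times, compute the inter-click intervals and split the clicks into groups wherever a long gap occurs. The click times and the `gap_ratio` threshold below are hypothetical, and this simple heuristic is not the classification method used in the cited studies.

```python
import numpy as np


def coda_label(click_times_s, gap_ratio=0.6):
    """Group clicks separated by short gaps; long gaps start a new group.

    A gap counts as "long" if it exceeds gap_ratio times the longest gap.
    Returns a rhythm label such as "1+1+3".
    """
    icis = np.diff(click_times_s)  # inter-click intervals
    long_gap = icis > gap_ratio * icis.max()
    groups = [1]                   # the first click opens the first group
    for is_long in long_gap:
        if is_long:
            groups.append(1)       # pause: next click starts a new group
        else:
            groups[-1] += 1        # quick click: joins the current group
    return "+".join(str(g) for g in groups)


# Hypothetical click times (s): click, pause, click, pause, three quick clicks
clicks = np.array([0.00, 0.40, 0.80, 0.90, 1.00])
icis = np.diff(clicks)
label = coda_label(clicks)

print("ICIs (s):", np.round(icis, 2))
print("Normalized rhythm:", np.round(icis / icis.sum(), 2))
print("Label:", label)
```

Normalizing the intervals by total coda duration separates rhythm from tempo, so the same pattern produced faster or slower yields the same rhythm vector. An evenly spaced coda would come out of this heuristic as all singletons, so a practical classifier needs a more careful treatment of regular codas.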

3.2 Dolphin Signature Whistles Revisited

Introduced in Module 3, the dolphin signature whistle functions as an individual acoustic label. Janik (2013) compiled the experimental evidence:

Each bottlenose dolphin develops a unique signature whistle within the first year of life, often within the first few months
Signature whistles are used upon separation (mother-calf, pod members)
Dolphins copy each other's signature whistles to address specific individuals (King & Janik 2013)
Signature whistles remain stable for decades (demonstrated in wild populations for >10 years)

This is arguably the closest analog to human names in any non-human animal.

4. SOFAR Channel and Global Propagation Diagram

5. Anthropogenic Noise and Masking

Industrial shipping has transformed the acoustic environment of the oceans. Low-frequency (10–500 Hz) ambient noise has risen by ~3 dB per decade since 1950 (Andrew et al. 2002); in shipping lanes the increase exceeds 15 dB over the same period. This is precisely the frequency band used by blue, fin, and bowhead whales.

5.1 Masking

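The effective communication range for a whale call declines with ambient noise. A call with source level SL is detectable as long as its received level, \(\text{SL} - \text{TL}(r)\), exceeds the noise level NL by at least a detection threshold DT. The maximum range \(r_{max}\) therefore satisfies: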

\[ \text{SL} - \text{TL}(r_{max}) = \text{NL} + \text{DT} \]

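Under spherical spreading (\(\text{TL} = 20\log_{10} r\)), a 15 dB increase in ambient noise reduces communication range by a factor of \(10^{15/20} \approx 5.6\); under purely cylindrical spreading the same increase would cost a factor of \(10^{15/10} \approx 32\). Taking the more conservative spherical figure, a blue whale with a pre-industrial range of ~5000 km would be left with a range of only ~900 km: still impressive, but a dramatic reduction in the acoustic “world” available to a social animal.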
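The range contraction follows directly from the detection equation when absorption is neglected: if TL grows as \(k\log_{10} r\), raising NL by \(\Delta\) dB shrinks \(r_{max}\) by a factor \(10^{\Delta/k}\). A minimal sketch, with a hypothetical function name and the spreading coefficient \(k\) exposed as a parameter:

```python
def range_reduction_factor(delta_nl_db, spreading_coeff=20.0):
    """Factor by which r_max shrinks when noise rises by delta_nl_db.

    spreading_coeff is k in TL = k*log10(r): 20 for spherical spreading,
    10 for cylindrical. Absorption is neglected.
    """
    return 10 ** (delta_nl_db / spreading_coeff)


spherical = range_reduction_factor(15.0, 20.0)
cylindrical = range_reduction_factor(15.0, 10.0)
new_range_km = 5000.0 / spherical  # pre-industrial range quoted in the text

print(f"Spherical:   range shrinks by {spherical:.1f}x")
print(f"Cylindrical: range shrinks by {cylindrical:.1f}x")
print(f"5000 km becomes about {new_range_km:.0f} km (spherical)")
```

The slower the geometric spreading, the more range a given noise increase costs, because each decibel of transmission loss corresponds to a larger stretch of distance.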

5.2 The Lombard Effect in Whales

North Atlantic right whales and other species exhibit a Lombard effect — raising their call amplitude in proportion to ambient noise, just as humans speak louder in a loud room. Parks et al. (2011) measured right whale up-call source levels increasing by ~0.5 dB for every 1 dB rise in background noise. This physiological compensation is incomplete (Lombard ratio ~0.5), so effective communication range still contracts with noise. It also has metabolic costs: louder calls require more energy.
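The effect of incomplete compensation can be estimated with a short calculation. It assumes the ~0.5 dB per dB slope quoted above holds across the whole noise increase and that spreading is spherical with no absorption; both are simplifications, and the function names are hypothetical.

```python
def net_snr_loss_db(noise_increase_db, lombard_slope=0.5):
    """Signal-to-noise loss left over after the caller raises its level."""
    return noise_increase_db * (1.0 - lombard_slope)


def range_factor(snr_loss_db, spreading_coeff=20.0):
    """Range contraction for a given SNR loss under TL = k*log10(r)."""
    return 10 ** (snr_loss_db / spreading_coeff)


noise_increase = 15.0                   # dB, same example as Section 5.1
loss = net_snr_loss_db(noise_increase)  # half of the increase is left over
with_lombard = range_factor(loss)
without_lombard = range_factor(noise_increase)

print(f"Residual SNR loss: {loss:.1f} dB")
print(f"Range contraction with compensation:    {with_lombard:.2f}x")
print(f"Range contraction without compensation: {without_lombard:.2f}x")
```

Under these assumptions, calling louder cuts the range contraction from about 5.6× to about 2.4×, which helps but does not restore the original range.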

5.3 The 2020 Pandemic as Natural Experiment

The COVID-19 pandemic reduced marine traffic dramatically in spring 2020. Measurements in the Strait of Georgia (Thomson & Barclay 2020) showed a 1.5 dB drop in underwater noise. Concurrent studies of whale vocalization behavior suggested that whales responded quickly, trading off call rate against amplitude (calling less often at higher amplitude, or more often at lower amplitude). The pandemic quieting is perhaps the clearest evidence to date that chronic noise pollution had been constraining cetacean acoustic behavior.

5.4 Seismic Airgun Impacts

Oil and gas exploration uses arrays of compressed-air guns that fire every 10–20 seconds, producing broadband impulsive sounds at source levels of ~260 dB re 1 μPa. Airgun surveys continue for months and propagate detectably across ocean basins. Nowacek et al. (2015) documented cessation of sperm whale foraging, fin whale abandonment of breeding grounds, and blue whale avoidance behavior in response to airgun surveys. Cumulative exposure during seismic surveys is among the highest anthropogenic acoustic impacts a cetacean can experience short of direct military sonar.

6. Vocal Learning and Development

Vocal learning — modifying vocalizations based on auditory experience — is rare in mammals. Among mammals it has been documented in humans, cetaceans, pinnipeds, elephants, and bats. Cetaceans are the most accomplished mammalian vocal learners outside our own species: both odontocetes (dolphins, orcas, belugas) and mysticetes (humpbacks) exhibit production learning, in which individuals acquire new sound categories by listening.

6.1 Signature Whistle Development

Bottlenose calves develop a signature whistle over the first 6–12 months of life. They seem to listen to a range of whistles in their social environment and then select a pattern that is distinct from both their mother's whistle and those of other close associates (Fripp et al. 2005). This differentiation helps ensure that the signature is uniquely identifying. Some calves copy elements of their mother's signature; others show more independent innovation.

6.2 Orca Dialect Learning

Orca pods have pod-specific dialects consisting of several stereotyped call types. Juvenile orcas learn these calls from their mothers and close relatives during the first year of life (Ford 1991). Transient and resident orca ecotypes have completely different repertoires despite overlap in range — a pure cultural distinction. Captive orcas have been observed to imitate the calls of companions from different wild populations, confirming vocal production learning directly.

6.3 Interspecific Imitation

Captive belugas (Ridgway et al. 2012) and killer whales (Abramson et al. 2018) have been documented imitating human speech sounds, placing them among the few non-human mammals reliably documented doing so. The beluga NOC at the National Marine Mammal Foundation produced recognizable “out!” calls; an orca named Wikie learned to mimic short English words. These abilities suggest a vocal learning capacity comparable to some bird species and well beyond anything documented in non-human primates.

6.4 Neural Substrates

Vocal learning requires a neural circuit linking forebrain motor areas to the brainstem motor neurons that drive the sound-production organs. In songbirds this circuit is well characterized (HVC → RA). In cetaceans the analogous pathway is hypothesized to include strong direct projections from motor cortex to nucleus ambiguus (which innervates the laryngeal muscles), a feature otherwise established only in humans among mammals. If confirmed, the neuroanatomical substrate for cetacean vocal learning would be functionally analogous to the substrate of human speech, most plausibly through convergent evolution rather than shared ancestry.

6.5 FOXP2 and the Genetics of Vocal Learning

The transcription factor FOXP2, known to be involved in human speech development, shows accelerated evolution in mammalian vocal learners including cetaceans. Zhang et al. (2013) found that cetacean FOXP2 carries the same two amino-acid substitutions (T303N, N325S) that distinguish the human allele from that of other primates. This is convergent molecular evolution at a specific gene linked to vocal learning, paralleling the Prestin convergence between bats and whales for echolocation.

6.6 Syntax in Cetacean Song?

The hierarchical structure of humpback song (units → phrases → themes) superficially resembles linguistic syntax. Suzuki et al. (2006) applied information-theoretic analyses to humpback song and found long-range statistical dependencies that resemble “hierarchical structure” in human language. Whether these patterns constitute syntax in a meaningful sense (carrying compositional semantics) or merely a hierarchical pattern without referential content is contested. Current consensus is that song is more analogous to instrumental music than to linguistic utterances — structured but not propositional.
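The flavor of an information-theoretic analysis can be shown with a toy example that compares the entropy of individual units with the entropy of a unit given its predecessor. This is far simpler than the analysis in Suzuki et al. (2006) and does not reproduce their method; the sequences are invented, with letters standing for unit types.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conditional_entropy(seq):
    """H(next | current) = H(current, next) - H(current), from bigram counts."""
    pairs = list(zip(seq[:-1], seq[1:]))
    firsts = [a for a, _ in pairs]
    return shannon_entropy(pairs) - shannon_entropy(firsts)


# Invented sequences with identical unit counts (4 each of A, B, C)
structured = list("ABCABCABCABC")  # strictly repeating phrase
scrambled = list("ACBBACCABACB")   # same units, order scrambled by hand

for name, seq in [("structured", structured), ("scrambled", scrambled)]:
    print(f"{name}: H(unit) = {shannon_entropy(seq):.2f} bits, "
          f"H(next | current) = {conditional_entropy(seq):.2f} bits")
```

Both sequences have the same per-unit entropy, but in the repeating sequence the next unit is fully determined by the current one, so the conditional entropy drops to zero. Structure shows up as this gap between the two quantities, and it says nothing about whether the sequence carries meaning.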

7. Simulation: Song Spectrograms and SOFAR Propagation

The simulation (i) synthesizes a simplified humpback whale song and computes its spectrogram, showing the hierarchical unit-phrase-theme structure; (ii) computes the SOFAR channel effective range as a function of frequency, demonstrating why blue whale calls can reach thousands of kilometers while sound at dolphin echolocation frequencies is limited to about a kilometer; (iii) plots the sound-speed profile showing the SOFAR axis; and (iv) presents a simple representation of four dolphins' signature whistles, illustrating the individual stability of these “acoustic names.”

Python

import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(16, 10))
fig.patch.set_facecolor('#020617')
gs = fig.add_gridspec(2, 2, hspace=0.32, wspace=0.28)

# -----------------------------------------------------------------
# Panel 1 : Synthesized humpback whale song spectrogram
# -----------------------------------------------------------------
ax1 = fig.add_subplot(gs[0,0])
ax1.set_facecolor('#042f2e')
fs = 4000.0
t_total = 30.0
t = np.arange(0, t_total, 1/fs)
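# Song is structured as units, phrases, themes.
# Each unit is modeled as a linear frequency sweep (chirp).
def chirp(t, f0, f1, t0, tf, amp=1.0):
    """Linear chirp from f0 to f1 (Hz) between t0 and tf (s); zero elsewhere."""
    mask = (t >= t0) & (t < tf)
    tt = np.where(mask, t - t0, 0)  # time since unit onset
    dur = tf - t0
    # Phase is the integral of the instantaneous frequency f0 + (f1 - f0)*tt/dur
    phi = 2*np.pi * (f0 * tt + 0.5*(f1 - f0)/dur * tt**2)
    return np.where(mask, amp*np.sin(phi), 0.0)
sig = np.zeros_like(t)
# Phrase A: descending moan 0-2.5 s, rising cry 3-4.5 s, descending note 5-6.8 s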
sig += chirp(t, 250, 80, 0.0, 2.5)
sig += chirp(t, 400, 1200, 3.0, 4.5)
sig += chirp(t, 600, 400, 5.0, 6.8)
# Phrase B (repeat with variation) 8-15 s
sig += chirp(t, 260, 90, 8.0, 10.5)
sig += chirp(t, 420, 1100, 11.0, 12.5)
sig += chirp(t, 650, 420, 13.0, 14.8)
# Theme transition: low moan
sig += chirp(t, 100, 50, 16.0, 20.0, amp=1.5)
# Phrase C: high cries 22-28
sig += chirp(t, 500, 1500, 22.0, 23.5)
sig += chirp(t, 520, 1400, 25.0, 26.5)

# Spectrogram
from numpy.fft import rfft, rfftfreq
N = int(fs * 0.5)    # window 0.5 s
hop = int(N*0.25)
frames = []
centers = []
for i in range(0, len(t)-N, hop):
    seg = sig[i:i+N] * np.hanning(N)
    S = np.abs(rfft(seg))
    frames.append(S)
    centers.append(t[i + N//2])
S = np.array(frames).T
freqs = rfftfreq(N, 1/fs)
mask = freqs < 2000
im = ax1.imshow(20*np.log10(S[mask]+1e-10), aspect='auto', origin='lower',
                extent=[centers[0], centers[-1], freqs[mask][0], freqs[mask][-1]],
                cmap='magma', vmin=-10, vmax=60)
ax1.set_xlabel('Time (s)', color='white')
ax1.set_ylabel('Frequency (Hz)', color='white')
ax1.set_title('Humpback Whale Song (synthesised)\nstructured units → phrases → themes',
              color='#5eead4', fontsize=11)
ax1.tick_params(colors='white')
for sp in ax1.spines.values(): sp.set_color('#14b8a6')

# -----------------------------------------------------------------
# Panel 2 : SOFAR channel attenuation vs frequency
# -----------------------------------------------------------------
ax2 = fig.add_subplot(gs[0,1])
ax2.set_facecolor('#042f2e')
f = np.logspace(1, np.log10(1.5e5), 400)  # Hz, 10 Hz to 150 kHz (covers the dolphin band)
# Francois-Garrison absorption (simplified)
def alpha(f_kHz):
    # Boric acid
    f1 = 1.32
    A1 = 0.106*f1*f_kHz**2/(f1**2 + f_kHz**2)
    # MgSO4
    f2 = 63.0
    A2 = 0.52*f2*f_kHz**2/(f2**2 + f_kHz**2)
    # Viscous
    A3 = 0.00049 * f_kHz**2
    return A1 + A2 + A3  # dB/km

f_kHz = f/1000.0
a = alpha(f_kHz)
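# Transmission-loss model (an assumption of this sketch): spherical spreading
# out to a transition range r0 = 1 km, then cylindrical spreading inside the
# channel, plus absorption:
#   TL(r) = 20 log10(r0) + 10 log10(r/r0) + a * r/1000   (r in m, a in dB/km)
r0 = 1000.0
r = np.logspace(3, 8, 2000)  # candidate ranges (m), from r0 outward
# Range where TL hits 100 dB: useful SOFAR range
range_sofar = np.zeros_like(f)
for i in range(len(f)):
    TL = 20*np.log10(r0) + 10*np.log10(r/r0) + a[i]*r/1000
    idx = np.argmin(np.abs(TL - 100))
    range_sofar[i] = r[idx]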
ax2.loglog(f, range_sofar/1000, color='#22d3ee', lw=2.5)
ax2.axvspan(15, 50, color='#fbbf2430', label='Blue whale calls 15–50 Hz')
ax2.axvspan(150, 3000, color='#f472b630', label='Humpback song')
ax2.axvspan(100000, 150000, color='#67e8f930', label='Dolphin echolocation')
ax2.set_xlabel('Frequency (Hz)', color='white')
ax2.set_ylabel('Range to 100 dB TL (km)', color='white')
ax2.set_title('SOFAR Channel Propagation\nLow f → thousands of km; high f → ~km',
              color='#5eead4', fontsize=11)
ax2.tick_params(colors='white'); ax2.grid(True, alpha=0.15, color='#14b8a6', which='both')
for sp in ax2.spines.values(): sp.set_color('#14b8a6')
ax2.legend(fontsize=8, facecolor='#042f2e', edgecolor='#14b8a6', labelcolor='white', loc='upper right')

# -----------------------------------------------------------------
# Panel 3 : SOFAR channel sound speed profile
# -----------------------------------------------------------------
ax3 = fig.add_subplot(gs[1,0])
ax3.set_facecolor('#042f2e')
depth = np.linspace(0, 4000, 400)
# Illustrative mid-latitude temperature profile (an assumption, not measured
# data): warm surface water decaying exponentially toward ~2 °C at depth
T = 2.0 + 18.0*np.exp(-depth/400.0)  # °C
sal = 35.0                           # salinity (ppt)
# Simplified sound-speed formula from Section 2.1 (depth in m)
c = 1449.2 + 4.6*T - 0.055*T**2 + 1.34*(sal - 35.0) + 0.016*depth  # m/s
ax3.plot(c, depth, color='#22d3ee', lw=2.5)
ax3.invert_yaxis()
# SOFAR axis at sound-speed minimum
sofar_idx = np.argmin(c)
ax3.axhline(depth[sofar_idx], color='#fbbf24', ls='--', label=f'SOFAR axis ~{depth[sofar_idx]:.0f} m')
ax3.fill_betweenx(depth, c.min()-5, c, color='#22d3ee22')
ax3.set_xlabel('Sound speed c (m/s)', color='white')
ax3.set_ylabel('Depth (m)', color='white')
ax3.set_title('Mid-Latitude Sound Speed Profile\nSOFAR channel traps low-f sound',
              color='#5eead4', fontsize=11)
ax3.tick_params(colors='white'); ax3.grid(True, alpha=0.15, color='#14b8a6')
for sp in ax3.spines.values(): sp.set_color('#14b8a6')
ax3.legend(fontsize=8, facecolor='#042f2e', edgecolor='#14b8a6', labelcolor='white', loc='lower right')

# -----------------------------------------------------------------
# Panel 4 : Signature whistle classification (simple clustering demo)
# -----------------------------------------------------------------
ax4 = fig.add_subplot(gs[1,1])
ax4.set_facecolor('#042f2e')
# Synthesize signature whistles: time-frequency contours for 4 individuals
t_w = np.linspace(0, 1.5, 200)
whistles = {
    'Dolphin A': 8 + 4*np.sin(2*np.pi*0.8*t_w),
    'Dolphin B': 12 - 3*t_w + np.sin(2*np.pi*1.5*t_w),
    'Dolphin C': 5 + 6*t_w - 2*t_w**2,
    'Dolphin D': 10 + 2*np.sin(2*np.pi*2.5*t_w)*np.exp(-t_w),
}
colors = ['#22d3ee','#fbbf24','#f472b6','#86efac']
rng = np.random.default_rng(0)  # fixed seed so the figure is reproducible
for i, (name, w) in enumerate(whistles.items()):
    for k in range(3):  # 3 repetitions with small variation
        noise = 0.3*rng.standard_normal(len(t_w))
        ax4.plot(t_w, w + noise, color=colors[i], lw=1.5, alpha=0.6,
                 label=name if k==0 else None)
ax4.set_xlabel('Time (s)', color='white')
ax4.set_ylabel('Whistle frequency (kHz)', color='white')
ax4.set_title('Signature Whistles: Individual “Names”\neach dolphin repeats its own unique contour',
              color='#5eead4', fontsize=11)
ax4.tick_params(colors='white'); ax4.grid(True, alpha=0.15, color='#14b8a6')
for sp in ax4.spines.values(): sp.set_color('#14b8a6')
ax4.legend(fontsize=8, facecolor='#042f2e', edgecolor='#14b8a6', labelcolor='white')

fig.suptitle("Cetacean Acoustic Communication: Song, SOFAR, Profile, Signature Whistles",
             color='#5eead4', fontsize=14, fontweight='bold', y=0.995)
plt.savefig('output.png', dpi=150, bbox_inches='tight', facecolor='#020617')
print(f"Absorption at 20 Hz: {alpha(0.02):.5f} dB/km")
print(f"Absorption at 20 kHz: {alpha(20):.3f} dB/km")
print(f"Absorption at 120 kHz: {alpha(120):.2f} dB/km")
print(f"SOFAR axis depth: {depth[sofar_idx]:.0f} m, c_min = {c[sofar_idx]:.1f} m/s")

Key Observations

Panel 1: Clear hierarchical structure of repeated phrases at different frequencies — units stacked into phrases into themes.
Panel 2: At 20 Hz the effective range is thousands of km; at 120 kHz it is only about a kilometer. Low-frequency vocalizations win on range by 3–4 orders of magnitude.
Panel 3: The SOFAR axis appears as a clear sound-speed minimum at ~1 km depth — a physical waveguide in the ocean.
Panel 4: Each simulated dolphin's signature whistle has a stable frequency contour repeated consistently.

Module Summary

Humpback Song Structure

Units → phrases → themes → song; hours-long performances; all males sing same song

Cultural Transmission

Songs change progressively within season and across populations (Noad 2000)

SOFAR Channel

Sound speed minimum at ~1 km depth forms acoustic waveguide; 10 log(r) spreading

Low-f Propagation

20 Hz: α ~ 0.001 dB/km — blue whale calls can cross ocean basins

Blue Whale Calls

10–40 Hz, 188 dB, peak frequencies declining over decades

Sperm Whale Codas

Clan-specific click patterns; social, not foraging

Signature Whistles

Individual dolphin “names”; vocally learned, stable for decades

Noise Masking

+3 dB/decade ambient increase; communication range contracts 5× since 1950

References

Payne, R.S. & McVay, S. (1971). Songs of humpback whales. Science, 173, 585–597.
Noad, M.J., Cato, D.H., Bryden, M.M., Jenner, M.-N. & Jenner, K.C.S. (2000). Cultural revolution in whale songs. Nature, 408, 537.
Garland, E.C. et al. (2011). Dynamic horizontal cultural transmission of humpback whale song at the ocean basin scale. Current Biology, 21(8), 687–691.
Rendell, L. & Whitehead, H. (2003). Vocal clans in sperm whales. Proceedings of the Royal Society B, 270, 225–231.
Gero, S., Bøttcher, A., Whitehead, H. & Madsen, P.T. (2016). Socially segregated, sympatric sperm whale clans in the Atlantic Ocean. Royal Society Open Science, 3, 160061.
Janik, V.M. (2013). Cognitive skills in bottlenose dolphin communication. Trends in Cognitive Sciences, 17(4), 157–159.
Munk, W.H. (1974). Sound channel in an exponentially stratified ocean, with application to SOFAR. JASA, 55, 220–226.
Andrew, R.K., Howe, B.M. & Mercer, J.A. (2002). Ocean ambient sound: Comparing the 1960s with the 1990s. Acoustic Research Letters Online, 3, 65.
Parks, S.E., Clark, C.W. & Tyack, P.L. (2007). Short- and long-term changes in right whale calling behavior. JASA, 122, 3725–3731.
Francois, R.E. & Garrison, G.R. (1982). Sound absorption based on ocean measurements. JASA, 72, 1879–1890.
Thomson, D.J.M. & Barclay, D.R. (2020). Real-time observations of the impact of COVID-19 on underwater noise. JASA, 147, 3390–3396.
