Module 0: Triticum Evolution & Domestication

Modern bread wheat (Triticum aestivum) is a hexaploid (AABBDD, 2n = 6x = 42) whose 17 Gbp genome records two polyploidization events and ~10 000 years of human selection. This module traces the diploid→tetraploid→hexaploid lineage from T. urartu through T. dicoccum to T. aestivum, quantifies the selection coefficient for the non-brittle rachis phenotype that defines domesticated wheat, and reviews the IWGSC 2018 reference assembly and its agronomic implications.

1. Two Polyploidization Events

Cultivated wheats fall into three ploidy tiers: diploid einkorn (T. monococcum, AA, 2n = 14), tetraploid emmer and durum (T. dicoccum, T. durum, AABB, 2n = 28), and hexaploid bread wheat and spelt (T. aestivum, T. spelta, AABBDD, 2n = 42). Each letter denotes a distinct 7-chromosome subgenome.

Event 1: Tetraploid Emmer (~0.5 Mya)

The first polyploidization united a wild diploid einkorn (T. urartu, AA genome) with a wild goatgrass (Aegilops speltoides-like, BB genome). The resulting sterile AB hybrid underwent spontaneous chromosome doubling to produce fertile tetraploid wild emmer, T. dicoccoides. Domestication of this tetraploid in the Fertile Crescent (~10 500 BP) yielded cultivated emmer, T. dicoccum, and later durum, T. durum, used today for pasta.

Event 2: Hexaploid Bread Wheat (~8000 BP)

The second polyploidization occurred far more recently, within the last ~8000 years, when cultivated tetraploid emmer (AABB) hybridized with wild Aegilops tauschii (DD, Tausch’s goatgrass) along the southern Caspian Sea. The sterile ABD hybrid underwent chromosome doubling to give fertile hexaploid T. aestivum, bread wheat. Because this event postdates domestication, the D subgenome passed through a severe bottleneck and is genetically uniform, and bread wheat has no wild hexaploid relatives.

\[\underbrace{AA}_{T.\,urartu} \;+\; \underbrace{BB}_{Ae.\,speltoides} \longrightarrow AABB \longrightarrow T.\,dicoccum\]

\[\underbrace{AABB}_{T.\,dicoccum} \;+\; \underbrace{DD}_{Ae.\,tauschii} \longrightarrow AABBDD \longrightarrow T.\,aestivum\]
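The two events above can be sketched as a small genome-formula calculation. This is only an illustrative toy, assuming that each parent contributes one copy of every subgenome letter and that every subgenome carries 7 chromosomes; the function names (`gamete`, `hybridize`, `double`, `chromosome_number`) are hypothetical and not part of the course simulations.

```python
BASE = 7  # chromosomes per subgenome (x = 7)


def gamete(genome: str) -> str:
    """Reduced gamete of a balanced parent: one copy of each subgenome letter."""
    letters = sorted(set(genome))
    if any(genome.count(ch) != 2 for ch in letters):
        raise ValueError(f"{genome!r} is not a balanced (doubled) genome formula")
    return "".join(letters)


def hybridize(mother: str, father: str) -> str:
    """Sterile F1 hybrid: the union of the two parental gametes."""
    return "".join(sorted(gamete(mother) + gamete(father)))


def double(hybrid: str) -> str:
    """Spontaneous chromosome doubling restores pairing partners and fertility."""
    return "".join(ch * 2 for ch in hybrid)


def chromosome_number(genome: str) -> int:
    """Somatic chromosome count for a genome formula."""
    return BASE * len(genome)


# Event 1: AA x BB -> AB -> AABB (emmer)
emmer = double(hybridize("AA", "BB"))
# Event 2: AABB x DD -> ABD -> AABBDD (bread wheat)
f1 = hybridize(emmer, "DD")
bread_wheat = double(f1)

for name, g in [("emmer", emmer), ("F1 hybrid", f1), ("bread wheat", bread_wheat)]:
    print(f"{name}: {g}, {chromosome_number(g)} chromosomes")
```

The intermediate hybrid comes out as ABD with 21 chromosomes, which is why it is sterile until doubling gives each chromosome a homologous partner.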

The IWGSC 2018 Reference Genome

The International Wheat Genome Sequencing Consortium (Appels et al. 2018) assembled the Chinese Spring reference to 14.5 Gbp (of the ~17 Gbp total), resolving 107 891 high-confidence genes across the three subgenomes. The A, B, and D subgenomes show ~97% collinearity to their extant relatives and retain the ancestral Pooideae 7-chromosome blueprint. Hexaploid buffering provides functional redundancy: a deleterious mutation in one homoeologue can be masked by the other two, which is why trait engineering by CRISPR often targets all three homoeologues simultaneously (Wang 2014).

Assembly of the hexaploid genome (AABBDD)

[Figure: stepwise assembly of Triticum aestivum (AABBDD, 2n = 6x = 42). T. urartu (wild einkorn, AA, 2n = 14) + Aegilops speltoides (wild goatgrass, BB, 2n = 14) → T. dicoccoides / T. dicoccum (wild / domesticated emmer, AABB, 2n = 4x = 28); emmer (AABB) + Aegilops tauschii (Tausch's goatgrass, DD, 2n = 14), ≈8000 BP → T. aestivum, bread wheat (AABBDD, 21 chromosome pairs, 17 Gbp genome; IWGSC 2018).]

2. Fertile Crescent Domestication

Zohary & Hopf (2012) identify the Fertile Crescent—the arc from the Levant through south-east Anatolia into the Zagros foothills—as the primary cradle of wheat domestication. Einkorn domestication is best attested at Karacadağ (south-east Turkey) and emmer at Tell Aswad (Syria) and Çayönü (Anatolia), all dated 10 500–10 000 BP.

The Brittle → Non-Brittle Transition

The single most important domestication trait in cereals is the loss of seed shattering. Wild wheats disperse seeds by rachis disarticulation: at grain maturity the spike rachis breaks apart, flinging individual spikelets to the ground. Domesticated wheats retain the spike intact; the farmer cuts, threshes, and winnows a coherent ear. The genetic basis is recessive mutations at the Br loci (Br2 on 3A and homoeologous copies) and the pleiotropic Q locus on chromosome 5A, which also enables free-threshing.

\[p_{t+1} = \frac{p_t^2(1+s) + p_t(1-p_t)(1+hs)}{\bar{w}_t}\]

Classical single-locus selection equation for the frequency \(p_t\) of the non-brittle allele; \(s\) is the selection coefficient, \(h\) the dominance coefficient, \(q_t = 1 - p_t\), and the mean fitness is \(\bar{w}_t = p_t^2(1+s)+2p_tq_t(1+hs)+q_t^2\).

Archaeobotanical rachis-morphology time series (charred grain assemblages) show the non-brittle allele rising from ~2% to ~95% frequency over ~3500 years (Tanno & Willcox 2006; Fuller 2007). Forward-simulating the recursion with generation time of one year yields a best-fit selection coefficient \(s \approx 0.04\)—a remarkably modest fitness advantage, consistent with a gradual, protracted domestication rather than a single-step event.
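The forward simulation mentioned here can be sketched in a few lines. This is a minimal illustration of iterating the recursion above, not the course's fitting script: it does not fit any data, and the helper names `next_freq` and `generations_to_reach` as well as the choice of \(h\) values are assumptions made for the example.

```python
def next_freq(p: float, s: float, h: float) -> float:
    """One generation of the single-locus recursion for allele frequency p."""
    q = 1.0 - p
    w_bar = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q  # mean fitness
    return (p * p * (1 + s) + p * q * (1 + h * s)) / w_bar


def generations_to_reach(p0: float, target: float, s: float, h: float,
                         max_gen: int = 100_000) -> int:
    """Count generations until the allele frequency first reaches `target`."""
    p, t = p0, 0
    while p < target and t < max_gen:
        p = next_freq(p, s, h)
        t += 1
    return t


# Frequencies quoted in the text: ~2% rising to ~95%, with s = 0.04.
for h in (0.0, 0.5, 1.0):  # recessive, additive, dominant advantage
    t = generations_to_reach(0.02, 0.95, s=0.04, h=h)
    print(f"h = {h:.1f}: {t} generations from 2% to 95%")
```

The elapsed time depends strongly on \(h\): a fully recessive advantageous allele spends far longer at low frequency than an additive one, so any estimate of \(s\) from a time series is conditional on the dominance assumed.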

Evidence of Protracted Domestication

Contrary to the traditional “revolutionary” model of cereal domestication, careful re-analysis of the Tell Aswad, Aswad II, Netiv-Hagdud, Mureybet, and Çayönü assemblages (Fuller 2007; Allaby 2008) shows an unbroken pre-domestication cultivation phase lasting 2–3 millennia. During this phase, wild-morphology rachis fragments remain dominant while non-shattering genotypes gradually accumulate. Population-genetic modelling (Allaby, Fuller & Brown 2008) shows that with \(s \approx 0.04\) and initial frequency \(p_0 \sim 0.02\), a 3500-year period matches the archaeological record without invoking implausibly strong per-generation selection.

\[t_{\mathrm{rise}} \approx \frac{1}{s}\ln\!\left(\frac{1}{p_0}\right) = \frac{\ln(50)}{0.04} \approx 98 \text{ generations}\]

At a generation time of ~1 year (one winter or spring crop cycle), the non-brittle allele needs roughly \(\ln(p_0^{-1})/s \approx 100\) generations to reach appreciable frequency and several hundred more to approach fixation, consistent with the 8000–4500 BP completion observed in the archaeological record.
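As a quick check on the arithmetic, the time scale \(\ln(1/p_0)/s\) can be evaluated for a few selection coefficients. This is a small sketch; the function name `rise_time` and the alternative \(s\) values are illustrative assumptions, with only \(p_0 = 0.02\) and \(s = 0.04\) taken from the text.

```python
import math


def rise_time(p0: float, s: float) -> float:
    """Characteristic time ln(1/p0)/s, in generations."""
    return math.log(1.0 / p0) / s


for s in (0.01, 0.04, 0.10):
    print(f"s = {s:.2f}: about {rise_time(0.02, s):.0f} generations")
```

Because the time scale is inversely proportional to \(s\), a fourfold weaker selection coefficient stretches the same rise over roughly four times as many crop cycles.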

The Q Gene and Free-Threshing

Salamini et al. (2002) review the Q locus, characterised molecularly by Simons et al. (2006) as a single AP2-family transcription factor gene on chromosome 5A. Wild q alleles produce tough glumes that hold the grain tightly; the derived Q allele (gain-of-function point mutation and expression change) loosens glumes, shortens the spike, and coordinates with Br mutations to give the classic free-threshing, non-shattering phenotype of modern bread wheat. Q is thus the master “domestication gene” of bread wheat.

2b. Cytogenetics of the Hexaploid

With three homoeologous subgenomes, meiosis in T. aestivum faces a combinatorial hazard: if homoeologous chromosomes (e.g. 1A and 1B) pair indiscriminately, chromosome segregation becomes chaotic and fertility collapses. The hexaploid genome solves this problem with a single genetic switch: the Ph1 (Pairing homoeologous 1) locus on the long arm of chromosome 5B, first identified by Riley & Chapman (1958).

The Ph1 System

Ph1 enforces strict bivalent pairing between true homologues (1A with 1A, 1B with 1B, 1D with 1D) and suppresses homoeologous crossovers. Plants carrying the ph1b deletion form multivalents at metaphase I; breeders exploit this relaxed pairing to introgress genes from wild relatives. The Ph1 locus contains a cluster of cyclin-dependent kinase-like (CDK-like) genes (Griffiths 2006; Rey 2017) that regulate meiotic progression.

\[\text{21 bivalents at metaphase I: } \{1A{-}1A, 1B{-}1B, 1D{-}1D,\; 2A{-}2A, 2B{-}2B, 2D{-}2D,\; \dots\; 7D{-}7D\}\]

Exactly 21 bivalents; no multivalents under wild-type Ph1.
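The bookkeeping behind the figure of 21 can be spelled out by enumeration. This sketch only counts labels; it assumes 7 homoeologous groups and the three subgenomes A, B, and D as stated above, and the function names are hypothetical.

```python
from itertools import combinations

GROUPS = range(1, 8)   # homoeologous groups 1..7
SUBGENOMES = "ABD"


def homologous_bivalents() -> list[str]:
    """Pairings permitted under wild-type Ph1: each chromosome with its true homologue."""
    return [f"{g}{sg}-{g}{sg}" for g in GROUPS for sg in SUBGENOMES]


def homoeologous_pairings() -> list[str]:
    """Pairings suppressed by Ph1: same group, different subgenome."""
    return [f"{g}{a}-{g}{b}" for g in GROUPS for a, b in combinations(SUBGENOMES, 2)]


bivalents = homologous_bivalents()
print(len(bivalents), bivalents[:3], "...", bivalents[-1])
print(len(homoeologous_pairings()), "possible homoeologous pairings to suppress")
```

Each of the 7 groups contributes three homologous bivalents and three possible homoeologous pairings, so Ph1 has as many illegitimate pairings to suppress as there are legitimate ones.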

Homoeoallelic Redundancy

Hexaploid buffering means most genes exist in three homoeologous copies (A, B, and D versions). Loss of function in a single copy is often phenotypically silent because the remaining two copies compensate. Two breeding strategies work with this redundancy: (i) TILLING (Targeting Induced Local Lesions IN Genomes) screens of EMS-mutagenised populations identify knock-outs in single homoeologues, which can then be combined by crossing; (ii) CRISPR–Cas9 editing of all three homoeologues simultaneously yields a full knock-out in one step (Wang 2014: powdery-mildew resistance via TaMLO).
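A toy model makes the consequence of redundancy concrete. It assumes complete redundancy (any one functional homoeologue is enough for a wild-type phenotype) and homozygous lines; real genes often deviate from this, and the names below are hypothetical.

```python
from itertools import product

HOMOEOLOGUES = ("A", "B", "D")


def shows_phenotype(knocked_out: dict[str, bool]) -> bool:
    """Under full redundancy the knock-out phenotype needs all three copies lost."""
    return all(knocked_out[h] for h in HOMOEOLOGUES)


def enumerate_genotypes() -> list[tuple[str, bool]]:
    """All 2**3 homozygous knock-out combinations with their predicted phenotype."""
    rows = []
    for states in product((False, True), repeat=len(HOMOEOLOGUES)):
        ko = dict(zip(HOMOEOLOGUES, states))
        # Lower-case letter marks a knocked-out copy, e.g. "aBD".
        label = "".join(h.lower() if ko[h] else h for h in HOMOEOLOGUES)
        rows.append((label, shows_phenotype(ko)))
    return rows


for label, visible in enumerate_genotypes():
    print(label, "mutant phenotype" if visible else "wild-type (buffered)")
```

Under these assumptions only the triple knock-out is visible, which is why single TILLING mutants usually have to be stacked before a phenotype appears.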

3. Taxonomy, Spelt, and Rust-Resistance Introgressions

Triticum aestivum is a biological species by interfertility criteria, but it is divided into several agronomic subspecies. Bread wheat proper is T. aestivum subsp. aestivum; spelt is T. aestivum subsp. spelta. Spelt retains the hulled phenotype (grain clings to the glumes), while bread wheat is naked and free-threshing. Molecular evidence indicates spelt and bread wheat diverged within the last ~5000 years via independent introgressions of emmer into hexaploid stocks.

Rust-Resistance Genes

Wheat stem rust (Puccinia graminis f. sp. tritici) is among the most destructive pathogens of cultivated wheat. Resistance has repeatedly been introgressed from diploid relatives:

  • Sr33 — NLR gene from Aegilops tauschii (D genome donor); effective vs. race Ug99 (Periyannan 2013).
  • Sr35 — from T. monococcum (A genome diploid); also Ug99-effective (Saintenac 2013).
  • Sr22, Sr45, Sr46 — additional NLR genes now stacked in elite cultivars.

These genes encode NB-LRR intracellular immune receptors that recognize pathogen effectors. Hexaploid wheat’s subgenomic redundancy tolerates the introgression of large chromosome segments from wild diploids without compromising agronomic performance.

3b. Green Revolution Dwarfing Genes

The post-1960 Green Revolution transformed wheat yields via introgression of two dwarfing alleles, Rht-B1b and Rht-D1b, from the Japanese cultivar Norin 10. Borlaug’s CIMMYT breeding programme (Nobel Peace Prize 1970) crossed Norin-10 derivatives with Mexican high-input germplasm to produce semi-dwarf hexaploid wheats that responded spectacularly to nitrogen fertiliser without lodging.

Molecular Basis

The Rht-B1b and Rht-D1b alleles encode gain-of-function mutant DELLA proteins (orthologous to Arabidopsis GAI/RGA) that are refractory to gibberellin-mediated degradation (Peng 1999). The resulting constitutive growth restriction reduces stem elongation by ~30%, shortening the straw and redirecting biomass into the grain. Harvest index (grain/total biomass) rose from ~0.3 in tall cultivars to ~0.5 in modern semi-dwarfs.

\[\text{Yield} \approx \text{Biomass} \times \text{HI},\quad \text{HI}_\text{tall} \approx 0.30,\; \text{HI}_\text{dwarf} \approx 0.50\]
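A short calculation shows how much of the yield gain the harvest-index shift alone can explain. The biomass value is a hypothetical placeholder held constant for both plant types; only the two HI values come from the text.

```python
def grain_yield(biomass_t_ha: float, harvest_index: float) -> float:
    """Grain yield = total biomass x harvest index."""
    return biomass_t_ha * harvest_index


HI_TALL, HI_DWARF = 0.30, 0.50
BIOMASS = 12.0  # t/ha, hypothetical and assumed equal for both plant types

y_tall = grain_yield(BIOMASS, HI_TALL)
y_dwarf = grain_yield(BIOMASS, HI_DWARF)
gain = y_dwarf / y_tall

print(f"tall: {y_tall:.1f} t/ha, semi-dwarf: {y_dwarf:.1f} t/ha, ratio: {gain:.2f}")
```

At constant biomass the ratio is simply 0.50 / 0.30, about 1.67, whatever biomass value is chosen, so the HI shift on its own falls short of a doubling.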

Combined with nitrogen fertiliser, irrigation, and chemical crop protection, this architectural change doubled global wheat yields between 1960 and 1990. Yield gains have since slowed (“yield plateau”), motivating current efforts in photosynthetic engineering (Module 3) and genomic selection (Module 7).

4. Global Spread from the Fertile Crescent

Archaeological cereal remains, radiocarbon dating, and ancient-DNA transects trace the demic diffusion of wheat from the Fertile Crescent across three continents:

  • South-east Europe (~8500 BP): Starčevo, Karanovo and Sesklo Neolithic.
  • Central Europe (~7500 BP): Linear Pottery Culture; dominant crop across loess belts.
  • Nile Valley (~7000 BP): Fayum A; emmer staple of Pharaonic agriculture.
  • Indus Valley (~7000 BP): Mehrgarh emmer; later replaced by hexaploid.
  • China (~4500 BP): Gansu Neolithic; hexaploid bread wheat arrives via Central Asia.
  • Sub-Saharan Ethiopia (~3000 BP): durum wheat endemism with unique landraces.
  • Americas (1493 CE): Columbus; Spanish colonists and missions later carry wheat to Mexico, Chile, Argentina, and California.
  • Australia (1788 CE): First Fleet; modern Australia is a major hard-wheat exporter.

The expansion follows a classic wave-of-advance pattern: initial slow diffusion (~1 km/yr at agricultural frontiers, Ammerman & Cavalli-Sforza 1984), accelerated by sailing ships and modern trade. Today T. aestivum is grown on more than 220 million hectares—the single largest crop acreage worldwide—and supplies roughly 20% of global food calories and protein.

Landraces and Genetic Resources

Before the Green Revolution (before ~1965), local landraces adapted to specific agro-ecologies dominated global wheat production. Vavilov’s Leningrad collection (now the N. I. Vavilov Institute, St. Petersburg), together with the later CIMMYT (Mexico) and ICARDA (Syria/Morocco) gene banks, holds >800,000 accessions of wheat and wild relatives. Roughly 10% of this diversity has been genotyped via exome capture or genotyping-by-sequencing, creating a large resource for allele mining in the age of CRISPR.

4b. Quick Reference: Triticum Nomenclature

Wheat nomenclature is notoriously tangled because of competing taxonomic schools, parallel Latin and vernacular names, and the many ploidy levels. The principal taxa referenced in this course are:

  • Einkorn — T. monococcum, diploid AA, 2n=14. Oldest cultivated wheat. Low gluten; hulled.
  • Wild einkorn — T. urartu (A-genome donor) and T. boeoticum.
  • Wild emmer — T. dicoccoides, tetraploid AABB, 2n=28.
  • Domesticated emmer — T. dicoccum (T. turgidum ssp. dicoccum), AABB.
  • Durum wheat — T. durum (T. turgidum ssp. durum), AABB. Free-threshing; pasta wheat.
  • Polish wheat — T. polonicum, AABB. Long glumes; relict cultivation.
  • Khorasan wheat — T. turanicum (“Kamut®”), AABB. Ancient oriental cultivar group.
  • Bread wheat — T. aestivum subsp. aestivum, AABBDD, 2n=42. Free-threshing.
  • Spelt — T. aestivum subsp. spelta, AABBDD. Hulled; historically important in central Europe.
  • Club wheat — T. aestivum subsp. compactum, AABBDD. Short, dense spikes.
  • Indian dwarf wheat — T. aestivum subsp. sphaerococcum, AABBDD.
  • Aegilops speltoides — presumed B-genome donor.
  • Aegilops tauschii — D-genome donor of hexaploid wheat.

These taxa form a continuous interfertile network (the primary gene pool). Chromosome counts are multiples of 7, consistent with a single base genome (x = 7). Common usage in modern literature collapses the T. turgidum complex (AABB) and simply writes T. turgidum ssp. durum, and similarly collapses the hexaploid complex under T. aestivum.

5. Synthesis: Why Wheat Biophysics?

Wheat is simultaneously the most important cultivated grass, the most genetically complex major crop, and one of the most environmentally stressed. The biophysical themes that follow through the rest of the course are:

  1. Grain structure & biochemistry (Module 1): endosperm, aleurone, pericarp; starch granules A and B; protein bodies.
  2. Gluten polymer physics (Module 2): gliadin–glutenin disulfide network; viscoelastic rheology of dough; ultrasonic and AFM characterisation.
  3. Photosynthesis (Module 3): C₃ carbon fixation, RuBisCO efficiency, canopy architecture, and the C₄ engineering project.
  4. Water and drought (Module 4): stomatal conductance, xylem cavitation, osmotic adjustment, root architecture.
  5. Nitrogen uptake (Module 5): plasma-membrane NRT transporters, amino-acid loading into the grain, fertilisation energetics.
  6. Pathogens (Module 6): stem rust Ug99 epidemiology, NLR immune receptors, durable vs. race-specific resistance.
  7. Breeding (Module 7): marker-assisted selection, genomic prediction, CRISPR/Cas9 homoeoallele editing, de novo domestication.
  8. Climate & food security (Module 8): CO₂ fertilisation, heatwave-at-anthesis yield penalty, planetary boundaries, and the Borlaug legacy.

Every one of these topics sits on top of the evolutionary history sketched in this Module 0. Hexaploidy constrains gene dosage, buffers mutations, complicates editing, and endows T. aestivum with an enormous reservoir of cryptic allelic diversity now being excavated by post-genomic breeding.

Simulation 1: Polyploidization Timeline & Genome Assembly

Timeline of the two polyploidization events leading to hexaploid bread wheat, with chromosome-count and genome-size progression through T. urartu, T. dicoccoides, Ae. tauschii, and T. aestivum.

5b. Wheat in the Global Food System

Approximately 770 million tonnes of wheat are harvested worldwide each year (FAO 2022). The top producers are China (~140 Mt), the European Union (~130 Mt), India (~110 Mt), Russia (~85 Mt), and the United States (~45 Mt), followed by Canada, Australia, Pakistan, and Ukraine. Wheat supplies roughly 20% of all dietary calories and protein globally, and it is the primary staple for roughly 2.5 billion people.

Market Classes

Wheat is traded according to protein and gluten quality: hard red winter (bread), hard red spring (artisan and high-gluten), hard white (noodles, flatbreads), soft red winter (pastry, cake, crackers), soft white (pastries, cakes), and durum (pasta, couscous, bulgur, semolina). Protein content is the dominant quality metric, ranging from ~9% in soft-white classes to >14% in hard-red-spring. Gluten strength, measured by Chopin alveograph W-value and Brabender farinograph stability, determines bread-making performance (Module 2).

Food Security

Yield stability is increasingly threatened by climate change. Asseng et al. (2015) project a yield reduction of about 6% per 1 °C of warming in the absence of adaptation, largely because heat shortens the grain-filling period. Wheat agriculture also contributes to breaching the planetary boundary for nitrogen (Rockström 2009): reactive-nitrogen runoff from wheat fields drives aquatic eutrophication, while N₂O emissions from fertilised soils add to the greenhouse-gas budget.
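The 6% per degree figure can be turned into a rough scenario calculator. This is a back-of-envelope sketch, not a crop model: whether the loss accumulates linearly or compounds per degree is an assumption made here, the function name `relative_yield` is hypothetical, and adaptation, CO₂ fertilisation, and regional differences are all ignored.

```python
def relative_yield(delta_t_c: float, loss_per_degree: float = 0.06,
                   compound: bool = False) -> float:
    """Fraction of baseline yield remaining after delta_t_c degrees of warming."""
    if compound:
        # Assumption: each degree removes 6% of what remains.
        return (1.0 - loss_per_degree) ** delta_t_c
    # Assumption: each degree removes 6% of the baseline.
    return max(0.0, 1.0 - loss_per_degree * delta_t_c)


BASELINE_MT = 770.0  # global harvest quoted earlier in this section

for dt in (1.0, 2.0, 3.0):
    lin = relative_yield(dt)
    cmp_ = relative_yield(dt, compound=True)
    print(f"+{dt:.0f} °C: linear {lin:.3f} ({BASELINE_MT * lin:.0f} Mt), "
          f"compound {cmp_:.3f} ({BASELINE_MT * cmp_:.0f} Mt)")
```

The two conventions differ by less than one percentage point over this range, so the choice matters much less than the no-adaptation assumption itself.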

6. Wild Relatives and Secondary Gene Pools

Beyond the three direct progenitors of the A, B, and D subgenomes, T. aestivum sits within an interfertile network of Triticum and Aegilops species. Harlan & de Wet (1971) formalised the distinction between primary, secondary, and tertiary gene pools:

  • Primary gene pool (GP-1): species that cross freely with T. aestivum to give fertile F1 progeny. Includes T. urartu, Ae. tauschii, T. dicoccoides, T. dicoccum, T. durum, T. monococcum.
  • Secondary gene pool (GP-2): related tetraploids and diploids of the Aegilops genus (Ae. speltoides, Ae. longissima, Ae. sharonensis, Ae. bicornis, Ae. kotschyi, etc.); crosses require embryo rescue.
  • Tertiary gene pool (GP-3): other Triticeae (barley Hordeum, rye Secale, Agropyron, Thinopyrum); genes transferred via radiation-induced translocations or bridge crosses. Example: the 1BL.1RS rye translocation carrying disease-resistance genes was deployed on >50% of European wheat acreage.

Synthetic Hexaploids

A powerful tool for tapping wild D-genome diversity is the synthetic hexaploid approach: artificially cross durum (T. durum, AABB) with diverse Ae. tauschii accessions and chromosome-double the F1 to resurrect new AABBDD hexaploids. These synthetics carry wild D-genome alleles absent from the historical bread-wheat bottleneck, introducing novel resistance, drought tolerance, and grain-quality traits into the modern breeding pool (Ogbonnaya 2013; Mujeeb-Kazi 2008).

Simulation 2: Selection Coefficient for the Non-Brittle Allele

Fit the classical single-locus selection equation to archaeobotanical time-series data on rachis morphology, estimate the selection coefficient \(s\) that drove the brittle→non-brittle transition during wheat domestication, and explore the sensitivity of the trajectory to the dominance coefficient.

Key References

• IWGSC (Appels, R. et al.) (2018). “Shifting the limits in wheat research and breeding using a fully annotated reference genome.” Science, 361, eaar7191.

• Zohary, D., Hopf, M. & Weiss, E. (2012). Domestication of Plants in the Old World, 4th ed. Oxford University Press.

• Salamini, F. et al. (2002). “Genetics and geography of wild cereal domestication in the Near East.” Nature Reviews Genetics, 3, 429–441.

• Dubcovsky, J. & Dvorak, J. (2007). “Genome plasticity a key factor in the success of polyploid wheat under domestication.” Science, 316, 1862–1866.

• Tanno, K. & Willcox, G. (2006). “How fast was wild wheat domesticated?” Science, 311, 1886.

• Fuller, D. Q. (2007). “Contrasting patterns in crop domestication and domestication rates.” Annals of Botany, 100, 903–924.

• Simons, K. J. et al. (2006). “Molecular characterization of the major wheat domestication gene Q.” Genetics, 172, 547–555.

• Periyannan, S. et al. (2013). “The gene Sr33, an ortholog of barley Mla genes, encodes resistance to wheat stem rust race Ug99.” Science, 341, 786–788.

• Saintenac, C. et al. (2013). “Identification of wheat gene Sr35 that confers resistance to Ug99 stem rust race group.” Science, 341, 783–786.

• Wang, Y. et al. (2014). “Simultaneous editing of three homoeoalleles in hexaploid bread wheat confers heritable resistance to powdery mildew.” Nature Biotechnology, 32, 947–951.