Module 4

Transcriptomics & Gene Expression

RNA-seq largely replaced microarrays in the early 2010s and now dominates gene-expression profiling. This module covers the pipeline: alignment (STAR, HISAT2), alignment-free quantification (Salmon, kallisto), differential expression (DESeq2, edgeR), and single-cell extensions (scRNA-seq, pseudotime).

1. RNA-seq Pipeline

Standard workflow: (1) QC reads with FastQC; (2) trim adaptors (fastp, Trimmomatic); (3) align to the genome (STAR, HISAT2) or pseudoalign to the transcriptome (Salmon, kallisto); (4) quantify abundance (raw counts as input for differential-expression testing, TPM for reporting expression levels); (5) test differential expression (DESeq2, edgeR) with appropriate gene-wise dispersion models; (6) pathway enrichment analysis (GSEA, Enrichr).
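The TPM unit in step (4) can be illustrated with a short calculation. This is a minimal sketch, not how Salmon or kallisto compute abundances internally (they use effective lengths and probabilistic read assignment); the function name `tpm` and the toy counts and lengths are made up for illustration.

```python
def tpm(counts, lengths_bp):
    """Convert read counts to transcripts per million (TPM).

    counts     -- reads assigned to each transcript
    lengths_bp -- transcript lengths in base pairs (same order as counts)
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Reads per kilobase: remove the bias toward longer transcripts.
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rpk)
    if total == 0:
        return [0.0] * len(counts)
    # Rescale so that the values in one sample sum to one million.
    return [r / total * 1e6 for r in rpk]


# Hypothetical toy sample with three transcripts.
counts = [100, 200, 300]
lengths = [1000, 2000, 1000]
values = tpm(counts, lengths)
print(values)
```

The first two transcripts receive the same TPM even though the second has twice as many reads, because it is also twice as long. TPM values sum to one million within each sample, which makes them convenient for reporting, but DESeq2 and edgeR expect raw (un-normalised) counts as input.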

2. DESeq2 & Negative Binomial

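Counts of RNA-seq reads per gene are approximately negative-binomially distributed (a Poisson with extra variance, or overdispersion, between biological replicates). DESeq2 (Love et al. 2014) estimates a dispersion for each gene, fits a trend of dispersion against mean expression across all genes, and shrinks each gene-wise estimate toward that trend by empirical Bayes. It then tests each log2 fold change with a Wald test that uses the shrunken dispersion. The shrinkage stabilises estimates for low-count genes and is critical for small-sample studies.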
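The overdispersion can be made concrete with a small simulation. In the parametrisation used by DESeq2, a gene with mean count μ and dispersion α has variance μ + αμ², so α = 0 recovers the Poisson. The sketch below draws negative-binomial counts as a gamma-Poisson mixture using only the standard library; the helper names, the mean of 50, and the dispersion of 0.2 are illustrative assumptions, and no dispersion shrinkage is performed here.

```python
import math
import random
import statistics


def nb_variance(mu, alpha):
    """Negative-binomial variance for mean mu and dispersion alpha."""
    return mu + alpha * mu ** 2


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) value (Knuth's method; suitable for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_nb(mu, alpha, rng):
    """Draw one negative-binomial count as a gamma-Poisson mixture."""
    # The Poisson rate itself varies between replicates:
    # Gamma(shape=1/alpha, scale=mu*alpha) has mean mu and variance alpha*mu^2.
    rate = rng.gammavariate(1.0 / alpha, mu * alpha)
    return sample_poisson(rate, rng)


rng = random.Random(0)
mu, alpha = 50.0, 0.2  # illustrative values, not estimates from real data
draws = [sample_nb(mu, alpha, rng) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
print(f"mean      {sample_mean:8.1f}  (expected {mu:.1f})")
print(f"variance  {sample_var:8.1f}  (NB expects {nb_variance(mu, alpha):.1f}, "
      f"Poisson would give {mu:.1f})")
```

The simulated variance should sit far above the mean, which is what a plain Poisson model would fail to capture and why a per-gene dispersion has to be estimated.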

Simulation: Volcano Plot
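
A minimal sketch of such a simulation, not the module's own script: it draws an estimated log2 fold change for each gene, computes a two-sided Wald p-value under a normal approximation, and returns the volcano coordinates (log2 fold change against -log10 p). The fraction of differentially expressed genes, the fixed standard error, the cut-offs, and all function names are assumptions chosen for illustration; real analyses estimate a separate standard error per gene and adjust p-values for multiple testing.

```python
import math
import random


def wald_p_value(estimate, se):
    """Two-sided p-value for a Wald z statistic (normal approximation)."""
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_volcano(n_genes=2000, frac_de=0.1, se=0.4, seed=1):
    """Return a list of (log2 fold change, -log10 p, truly_de) per gene."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_genes):
        truly_de = rng.random() < frac_de
        # Truly DE genes get a sizeable effect in either direction.
        true_lfc = rng.choice([-1, 1]) * rng.uniform(1.5, 3.0) if truly_de else 0.0
        est_lfc = rng.gauss(true_lfc, se)  # noisy estimate of the effect
        p = max(wald_p_value(est_lfc, se), 1e-300)  # guard against log10(0)
        points.append((est_lfc, -math.log10(p), truly_de))
    return points


def is_called(point, lfc_cut=1.0, p_cut=0.01):
    """Apply the usual two volcano thresholds (values here are arbitrary)."""
    lfc, neg_log10_p, _ = point
    return abs(lfc) > lfc_cut and neg_log10_p > -math.log10(p_cut)


def plot_volcano(points, path="volcano.png"):
    """Optional: draw the plot. Requires matplotlib, imported only here."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    colors = ["crimson" if is_called(pt) else "grey" for pt in points]
    ax.scatter([pt[0] for pt in points], [pt[1] for pt in points], c=colors, s=8)
    ax.axhline(-math.log10(0.01), linestyle="--", linewidth=0.8)
    ax.axvline(-1.0, linestyle="--", linewidth=0.8)
    ax.axvline(1.0, linestyle="--", linewidth=0.8)
    ax.set_xlabel("log2 fold change")
    ax.set_ylabel("-log10 p-value")
    fig.savefig(path, dpi=150)


points = simulate_volcano()
called = [pt for pt in points if is_called(pt)]
true_pos = sum(1 for pt in called if pt[2])
print(f"{len(called)} genes pass both cut-offs; {true_pos} of them are truly DE")
```

Calling `plot_volcano(points)` writes the figure to disk if matplotlib is installed. Genes with no true effect cluster near the origin, while truly changed genes spread into the upper-left and upper-right corners, which gives the plot its volcano shape.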


3. Single-Cell RNA-seq

scRNA-seq (10x Chromium, Smart-seq3, Drop-seq, SPLiT-seq) profiles thousands of individual cells. Seurat (R) and Scanpy (Python) are the standard toolkits for QC, normalisation, dimensionality reduction (PCA → UMAP), clustering (Louvain/Leiden), and cell-type annotation. Trajectory inference (Monocle3, PAGA) orders cells along developmental pseudotime axes, and RNA velocity (scVelo) adds directionality to those trajectories.
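
The normalisation step can be sketched without any single-cell library. A common default is to scale each cell to a fixed total (often 10,000 counts) and then apply log(1 + x), similar in spirit to Scanpy's `normalize_total` followed by `log1p`. The function name `normalize_log1p` and the toy cells below are hypothetical, and real pipelines operate on sparse matrices rather than Python lists.

```python
import math


def normalize_log1p(cell_counts, target_sum=1e4):
    """Library-size normalise one cell's gene counts, then log-transform."""
    total = sum(cell_counts)
    if total == 0:
        return [0.0 for _ in cell_counts]
    # Scale so every cell has the same total, then compress with log(1 + x).
    return [math.log1p(c / total * target_sum) for c in cell_counts]


# Two hypothetical cells with the same expression profile,
# but cell_B was sequenced ten times more deeply.
cells = {
    "cell_A": [10, 0, 90],
    "cell_B": [100, 0, 900],
}
normalized = {name: normalize_log1p(c) for name, c in cells.items()}
for name, values in normalized.items():
    print(name, [round(v, 3) for v in values])
```

After normalisation the two cells are indistinguishable, so downstream PCA and clustering respond to differences in expression profile rather than sequencing depth.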

Key References

• Love, M. I., Huber, W. & Anders, S. (2014). “Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.” Genome Biol., 15, 550.

• Bray, N. L. et al. (2016). “Near-optimal probabilistic RNA-seq quantification.” Nat. Biotechnol., 34, 525–527.

• Patro, R. et al. (2017). “Salmon provides fast and bias-aware quantification of transcript expression.” Nat. Methods, 14, 417–419.

• Stuart, T. et al. (2019). “Comprehensive integration of single-cell data.” Cell, 177, 1888–1902.
