diff --git a/03_loader_and_exec_format/README.md b/03_loader_and_exec_format/README.md
index 7c9951a..481f959 100644
--- a/03_loader_and_exec_format/README.md
+++ b/03_loader_and_exec_format/README.md
@@ -22,6 +22,7 @@ This section covers the complete lifecycle of an AmigaOS executable:
| [exe_load_pipeline.md](exe_load_pipeline.md) | LoadSeg → AllocMem → relocation → segment chain → CreateProc → entry point |
| [object_file_format.md](object_file_format.md) | Compiler object files (HUNK_UNIT), multi-section layout, HUNK_LIB archives, linker operation |
| [overlay_system.md](overlay_system.md) | HUNK_OVERLAY: tree architecture, runtime overlay manager, worked binary example, modern alternatives |
+| [**exe_crunchers.md**](exe_crunchers.md) | **Executable packers: PowerPacker/Imploder/Shrinkler, decrunch stubs, compression algorithms, detection** |
## Why HUNK?
diff --git a/03_loader_and_exec_format/exe_crunchers.md b/03_loader_and_exec_format/exe_crunchers.md
new file mode 100644
index 0000000..d7d7376
--- /dev/null
+++ b/03_loader_and_exec_format/exe_crunchers.md
@@ -0,0 +1,489 @@
+[← Home](../README.md) · [Loader & HUNK Format](README.md)
+
+# Executable Crunchers — Compression, Decrunch Stubs, and Internals
+
+## Overview
+
+**Executable crunchers** (packers) compress AmigaOS executables while keeping them directly runnable. The crunched file is a valid HUNK executable — when launched, a tiny **decrunch stub** runs first, decompresses the original program in memory, then jumps to its real entry point. The user sees a brief colour-cycling delay (the "decrunch colours"), then the program runs normally.
+
+This was essential in the floppy era: a 200 KB program crunched to 120 KB loads significantly faster from a slow 880 KB floppy *and* frees disk space on capacity-constrained media.
+
+---
+
+## Architecture
+
+```mermaid
+graph LR
+ subgraph "Original Executable"
+ OH["HUNK_HEADER"] --> OC["HUNK_CODE
Original code"]
+ OC --> OD["HUNK_DATA
Original data"]
+ OD --> OB["HUNK_BSS"]
+ end
+
+ subgraph "Crunched Executable"
+ CH["HUNK_HEADER"] --> CS["HUNK_CODE
Decrunch Stub
(~200–800 bytes)"]
+ CS --> CD["HUNK_DATA
Compressed payload
(original hunks)"]
+ CD --> CB["HUNK_BSS
Decompression workspace"]
+ end
+
+ OH -.->|"Cruncher tool"| CH
+
+ style CS fill:#fff9c4,stroke:#f9a825,color:#333
+ style CD fill:#e8f4fd,stroke:#2196f3,color:#333
+```
+
+### Key Insight
+
+A crunched executable is **itself a valid HUNK file**. The OS loader handles it normally — `LoadSeg()` allocates memory, loads hunks, applies relocations. The "magic" is that hunk 0 contains a decrunch stub instead of the original code, and the data hunk contains the compressed original program.
+
+---
+
+## Major Amiga Crunchers
+
+| Cruncher | Era | Algorithm | Stub Size | Typical Ratio | Notes |
+|---|---|---|---|---|---|
+| **PowerPacker** (PP20) | 1989–1994 | LZ77 + configurable efficiency | ~280 bytes | 50–60% | Most popular; `powerpacker.library` for data files |
+| **Imploder** (IMP!) | 1990–1993 | LZSS variant | ~400 bytes | 45–55% | Multiple modes: Normal, Pure, Library, Overlayed |
+| **Turbo Imploder** | 1991–1993 | Enhanced LZSS | ~420 bytes | 42–52% | Faster crunch, same decrunch |
+| **ByteKiller** | 1988–1991 | LZ77 (simple) | ~160 bytes | 55–65% | Early; position-independent stub; used for raw data too |
+| **Titanics Cruncher** (ATN!) | 1991–1993 | LZ77 | ~350 bytes | 55–65% | Fast decrunch |
+| **CrunchMania** (CrM!) | 1992–1995 | LZ + range coding | ~500 bytes | 40–50% | Many registered/customised versions — format variants |
+| **Shrinkler** | 2014+ | Context-model + range coder | ~250 bytes | 30–40% | Modern; best ratio; used in 4K/64K demo intros |
+| **PackFire** | 2016+ | Shrinkler derivative | ~200 bytes | 30–40% | Optimised for size-limited compos |
+| **XPK** | 1992+ | Framework (multiple sub-packers) | varies | varies | Library-based; supports NUKE, SMPL, SQSH, etc. |
+
+---
+
+## Binary Structure of a Crunched Executable
+
+### What the Cruncher Produces
+
+The cruncher tool reads the original executable, compresses its contents, and wraps them in a new HUNK executable:
+
+```
+HUNK_HEADER ($3F3)
+ num_hunks = 2 or 3
+ hunk_sizes:
+ [0] = stub code size + compressed data (or split across hunks)
+ [1] = workspace BSS (decompression buffer)
+
+HUNK_CODE ($3E9)
+ ; 200–800 bytes of 68000 code
+ ; the original executable, compressed
+ ; original hunk count, sizes, memory types
+
+HUNK_RELOC32 ($3EC) ; relocations for the stub itself (minimal)
+
+HUNK_BSS ($3EB) ; workspace for decompression
+ ; typically = original uncompressed size
+
+HUNK_END ($3F2)
+```
+
+### Alternate Layout (Multi-Hunk)
+
+Some crunchers split the stub and payload into separate hunks:
+
+```
+Hunk 0: HUNK_CODE — decrunch stub only (~300 bytes)
+Hunk 1: HUNK_DATA — compressed payload + metadata
+Hunk 2: HUNK_BSS — decompression workspace
+```
+
+---
+
+## PowerPacker PP20 — Format Deep Dive
+
+PowerPacker (by Nico François) is the most widely used Amiga cruncher. It exists in two forms: a **data file format** (for `powerpacker.library`) and an **executable wrapper** (for crunched .exe files).
+
+### PP20 Data Format
+
+```
+Offset Size Field
+────── ──── ─────────────────────────────
+$00 4 Signature: "PP20" ($50503230)
+$04 4 Efficiency table: 4 bytes controlling LZ bit-depth
+ e.g. $09090909 = "Fast", $0A0B0C0D = "Best"
+$08 N Compressed bitstream data
+$08+N 4 Decrunch info: 24-bit original size (big-endian) + checksum byte
+ Byte layout: [size_hi] [size_mid] [size_lo] [checksum]
+```
+
+### Efficiency Table
+
+The 4-byte efficiency table controls how many bits are used for offset/length encoding in different compression modes:
+
+| Mode | Efficiency Bytes | Description |
+|---|---|---|
+| Fast | `$09 09 09 09` | Smaller window, faster crunch |
+| Mediocre | `$09 0A 0A 0A` | Balance |
+| Good | `$09 0A 0B 0B` | Better ratio |
+| Very Good | `$09 0A 0B 0C` | Near-best |
+| Best | `$09 0A 0C 0D` | Maximum compression, slowest |
+
+The decompressor reads these 4 bytes to initialize its internal offset/length bit-allocation tables before starting the main decompression loop.
+
+### Decrunch Colours
+
+The PowerPacker decrunch stub famously modifies custom chip colour registers during decompression to provide visual feedback — the background colour cycles through shades of grey or colour gradients, signalling that decrunching is in progress. This is the characteristic "decrunch effect" visible on real hardware:
+
+```asm
+; Visual feedback during decrunch:
+ MOVE.W D0, $DFF180 ; COLOR00 — background colour
+ ; D0 increments with each decompressed block
+```
+
+---
+
+## Shrinkler — Modern State-of-the-Art
+
+Shrinkler (by Blueberry/Loonies) is the current gold standard for Amiga executable compression, achieving 30–40% ratios. It's open-source and widely used in the demo scene.
+
+### Algorithm: Context-Modelling + Range Coder
+
+Unlike older LZ77-based crunchers, Shrinkler uses:
+
+1. **Adaptive context model** — maintains 1536 probability contexts (`NUM_CONTEXTS = 1536`). Each context tracks the probability of the next bit being 0 or 1, updated after every decoded bit
+2. **Range coder** — an arithmetic coding variant that encodes bits using interval subdivision based on the context probabilities
+3. **LZ matching** — literal bytes and back-references are intermixed, with the context model predicting which type comes next
+
+### Shrinkler Data Header
+
+```
+Offset Size Field
+────── ──── ─────────────────────────────
+$00 4 Signature: "Shri" ($53687269)
+$04 1 Major version
+$05 1 Minor version
+$06 2 Header size (remaining bytes)
+$08 4 Compressed data size
+$0C 4 Uncompressed data size
+$10 4 Safety margin (for in-place decompression)
+$14 4 Flags: bit 0 = FLAG_PARITY_CONTEXT
+```
+
+The **parity context** flag (`FLAG_PARITY_CONTEXT`) enables a special mode that maintains separate probability models based on the byte position parity, exploiting statistical properties of 68000 machine code (even/odd byte patterns in opcode words).
+
+### 68000 Decompressor Core (from Shrinkler source)
+
+The actual decompressor fits in approximately 100 instructions:
+
+```asm
+; Register usage:
+; D2 = Range value
+; D3 = Interval size
+; D4 = Input bit buffer (reads bytes from compressed stream)
+; D6 = Context index
+; D7 = Parity context flag (0 or 1)
+; A4 = Compressed data source pointer
+; A5 = Decompressed data destination pointer
+
+INIT_ONE_PROB = $8000 ; Initial probability: 50/50
+ADJUST_SHIFT = 4 ; Probability adaptation rate
+NUM_CONTEXTS = 1536 ; Context table size
+
+ShrinklerDecompress:
+ movem.l d2-d7/a4-a6,-(a7)
+ ; Init range decoder state
+ moveq.l #0,d2 ; Range value = 0
+ moveq.l #1,d3 ; Interval size = 1
+ moveq.l #-$80,d4 ; Input buffer (triggers first byte read)
+
+ ; Init all 1536 probabilities to 50% ($8000)
+ move.l #NUM_CONTEXTS,d6
+.init:
+ move.w #INIT_ONE_PROB,-(a7) ; Push WORD onto stack
+ subq.w #1,d6
+ bne.b .init
+ ; Context table is now on the stack (3072 bytes)
+
+ ; Main decompression loop
+.lit:
+ ; Decode literal byte bit-by-bit using context model
+ addq.b #1,d6
+.getlit:
+ bsr.b GetBit ; Get one bit from range coder
+ addx.b d6,d6 ; Shift bit into D6
+ bcc.b .getlit ; Loop until byte complete
+ move.b d6,(a5)+ ; Write decompressed byte
+
+.switch:
+ bsr.b GetKind ; Is next item literal or reference?
+ bcc.b .lit ; Literal → decode another byte
+
+ ; Reference: decode offset and length
+ ; ... (LZ match copy loop)
+```
+
+### Stack-Based Context Table
+
+A distinctive Shrinkler technique: the 1536-entry probability table (3072 bytes) is allocated **on the stack** — each entry is a WORD pushed during initialization. This avoids needing a separate AllocMem call and keeps the decompressor self-contained.
+
+---
+
+## Compression Algorithms
+
+### LZ77 / LZSS (PowerPacker, Titanics, ByteKiller, Imploder)
+
+The dominant algorithm family. The compressed stream is a sequence of control bits followed by either literal bytes or back-references:
+
+```
+[flag bit]
+ 0 → literal byte follows (copy 1 byte verbatim)
+ 1 → match reference: (offset, length)
+ offset = how far back in already-decompressed data to copy from
+ length = how many bytes to copy from that position
+
+Decompression pseudo-code:
+ while (output_pos < original_size):
+ bit = read_bit()
+ if bit == 0:
+ output[output_pos++] = read_byte() # literal
+ else:
+ offset = read_bits(offset_bits) # back-reference
+ length = read_bits(length_bits) + min_len
+ copy(output, output_pos - offset, length) # copy from history
+ output_pos += length
+```
+
+The **efficiency setting** (PowerPacker) or **mode** (Imploder) controls how many bits are allocated to offset and length fields — more offset bits = larger search window = better compression but slower.
+
+### Context Modelling + Range Coding (Shrinkler, PackFire)
+
+Modern crunchers replace fixed-bit-width encoding with probability-based arithmetic coding:
+
+1. For each bit position, the **context model** estimates: "probability that this bit is 1"
+2. The **range coder** encodes the bit using that probability — high-probability bits use fewer output bits
+3. After encoding/decoding, the context probability is **updated** based on the actual bit value
+
+This achieves near-optimal compression but decompression is slower (~2–5 seconds on a 7 MHz 68000 for a typical executable).
+
+---
+
+## Relocation Handling
+
+The original executable had HUNK_RELOC32 entries that patch absolute addresses. After decompression, these must be reapplied. Crunchers use three strategies:
+
+### Method 1: Compress Everything Including Relocs
+
+The entire original file (all hunks + relocation tables) is compressed as a blob. The decrunch stub acts as a mini-`LoadSeg`:
+1. Decompress to a temp buffer
+2. Parse the HUNK stream
+3. Allocate individual hunks with correct memory types
+4. Copy data and apply relocations
+5. Free the temp buffer
+
+### Method 2: Pre-Relocated + Delta Table
+
+1. Cruncher pre-applies relocations assuming base address 0
+2. Stores a compact **delta table** — sorted list of byte-offset deltas between relocation sites
+3. After decompression, the stub walks the delta table and adds actual base addresses
+
+```c
+/* Delta table: each entry is the offset-delta to the next reloc site */
+UWORD reloc_deltas[] = {
+ 0x0006, /* first reloc at offset 6 */
+ 0x0014, /* +0x14 → next at offset 0x1A */
+ 0x0008, /* +0x08 → next at offset 0x22 */
+ 0x0000 /* terminator */
+};
+/* More compact than storing absolute offsets */
+```
+
+### Method 3: Merge and Self-Relocate
+
+All hunks merged into a single code hunk. Inter-hunk references resolved at crunch time. The result needs minimal or no relocation.
+
+**Drawback**: Loses CHIP/FAST memory separation — all data ends up in the same memory type. Problematic for programs that need Chip RAM for bitmaps or audio.
+
+---
+
+## Memory Layout During Decompression
+
+```
+BEFORE (crunched exe loaded by OS):
+
+ ┌──────────────────────┐ Hunk 0 (CODE)
+ │ Decrunch stub (300B) │
+ │ Compressed data (80K)│
+ │ Metadata │
+ └──────────────────────┘
+ ┌──────────────────────┐ Hunk 1 (BSS)
+ │ Workspace (200K) │ ← decompression buffer
+ └──────────────────────┘
+
+DURING (stub is executing):
+
+ ┌──────────────────────┐ Hunk 0 — still alive
+ │ Stub + compressed ───│──→ reading from here
+ └──────────────────────┘
+ ┌──────────────────────┐ AllocMem'd by stub
+ │ Original Hunk 0 CODE │──→ writing decompressed data
+ └──────────────────────┘
+ ┌──────────────────────┐ AllocMem'd by stub
+ │ Original Hunk 1 DATA │
+ └──────────────────────┘
+
+AFTER (stub jumps to original entry):
+
+ ┌──────────────────────┐ (freed or abandoned)
+ │ [freed stub memory] │
+ └──────────────────────┘
+ ┌──────────────────────┐ Original program running
+ │ Original Hunk 0 CODE │ ← PC here
+ └──────────────────────┘
+ ┌──────────────────────┐
+ │ Original Hunk 1 DATA │
+ └──────────────────────┘
+```
+
+> **In-place decompression**: Some crunchers (including Shrinkler) support decompressing over the compressed data — the `safety_margin` field in the Shrinkler header reserves extra space so the decompressor's write pointer never overtakes the read pointer. Data is decompressed from end to start.
+
+---
+
+## Detection and Identification
+
+### Magic Signatures
+
+| Cruncher | Signature | Hex | Location |
+|---|---|---|---|
+| PowerPacker | `PP20` | $50503230 | Start of compressed data |
+| Imploder | `IMP!` | $494D5021 | Start of compressed data |
+| Turbo Imploder | `IMP!` | $494D5021 | Same — version in stub differs |
+| Titanics | `ATN!` | $41544E21 | Start of compressed data |
+| CrunchMania | `CrM!` / `CrM2` | $43724D21 / $43724D32 | Start of compressed data |
+| Shrinkler | `Shri` | $53687269 | Data file header (exe uses stub pattern) |
+| ByteKiller | (no magic) | — | Detected by stub pattern only |
+| XPK Framework | `XPKF` | $58504B46 | File header |
+
+> [!WARNING]
+> **Fake headers** are extremely common in the Amiga cracking scene. A file claiming to be `IMP!` may have a spoofed header to frustrate analysis. If standard tools reject it, the header is likely fake — use a debugger to capture the decrunched memory image instead.
+
+### Detecting Crunched Executables in RE
+
+1. **Tiny code hunk + large data hunk** — unusual ratio signals packing
+2. **AllocMem + decompression loop** at entry point — not the normal `c.o` startup pattern
+3. **No `MOVE.L 4.W,A6` / `OpenLibrary` sequence** — stub goes straight to decompression
+4. **Custom chip register writes** (`$DFF180` colour changes) — decrunch colour feedback
+5. **Magic bytes** in the data hunk — scan for known signatures
+6. **Self-modifying code** — stub may overwrite its own memory during in-place decompression
+
+```python
+# Quick detection script:
+import struct
+
+MAGICS = {
+ b'PP20': 'PowerPacker',
+ b'IMP!': 'Imploder',
+ b'ATN!': 'Titanics Cruncher',
+ b'CrM!': 'CrunchMania',
+ b'CrM2': 'CrunchMania 2',
+ b'Shri': 'Shrinkler (data)',
+ b'XPKF': 'XPK Framework',
+}
+
+def detect_cruncher(filename):
+ with open(filename, 'rb') as f:
+ data = f.read()
+ for magic, name in MAGICS.items():
+ if magic in data:
+ off = data.index(magic)
+ print(f" {name} detected at offset ${off:04X}")
+ return name
+ # Check for valid HUNK with suspicious layout
+ if data[:4] == b'\x00\x00\x03\xf3': # HUNK_HEADER
+ print(" Valid HUNK — check for stub pattern at entry point")
+ return None
+```
+
+---
+
+## Decrunching Tools
+
+### AmigaOS Native
+
+| Tool | Description |
+|---|---|
+| `xfdmaster.library` | Universal decruncher — modular architecture with "slave" plugins in `LIBS:xfd/` |
+| `xfdDecrunch` | CLI front-end: `xfdDecrunch packed.exe unpacked.exe` |
+| `xfdScan` / `xfdList` | Identify cruncher type; list installed slave modules |
+| `powerpacker.library` | PP20 data file decompression: `ppLoadData()` |
+
+### Cross-Platform
+
+| Tool | Description |
+|---|---|
+| **Ancient** (C++) | Modern portable library — supports ByteKiller, Imploder, CrunchMania, PP20, and many more. GitHub: `temisu/ancient` |
+| `ppunpack` | PP20 only: `ppunpack packed.exe unpacked.exe` |
+| Shrinkler `-d` | Shrinkler data files: `shrinkler -d packed unpacked` |
+
+### xfdmaster — Modular Architecture
+
+xfdmaster does not have a hardcoded format list. It loads **slave modules** from `LIBS:xfd/` at runtime, each handling one or more cruncher formats:
+
+```
+LIBS:xfd/
+ PowerPacker ; handles PP20
+ Imploder ; handles IMP!
+ CrunchMania ; handles CrM!, CrM2
+ ByteKiller ; stub-pattern detection
+ Titanics ; handles ATN!
+ ... ; 100+ supported formats
+```
+
+```c
+/* Using xfdmaster.library to decrunch any format: */
+struct xfdBufferInfo *xbi = xfdAllocObject(XFDOBJ_BUFFERINFO);
+xbi->xfdbi_SourceBufLen = filesize;
+xbi->xfdbi_SourceBuffer = filebuf;
+
+if (xfdRecogBuffer(xbi))
+{
+ printf("Detected: %s\n", xbi->xfdbi_PackerName);
+ if (xfdDecrunchBuffer(xbi))
+ {
+ /* xbi->xfdbi_TargetBuffer = decrunched data */
+ /* xbi->xfdbi_TargetBufSaveLen = decrunched size */
+ }
+}
+xfdFreeObject(xbi);
+```
+
+### Debugger-Based Extraction (Last Resort)
+
+For unknown or custom crunchers, the most reliable method is to load the executable in a hardware-level debugger (HRTMon, ASM-One, or an emulator's monitor), set a breakpoint at the end of the decrunch stub (typically the final `JMP` instruction), and capture the memory image once decompression is complete:
+
+```
+; In HRTMon:
+> d $entry_point ; disassemble entry
+; Find the final JMP at the end of the stub
+> bpx $stub_end_jmp ; set breakpoint
+> g ; run
+; When breakpoint hits, the decrunched program is in memory
+> sm $dest $dest+size "decrunched.bin" ; save memory
+```
+
+---
+
+## Impact on FPGA / Emulation
+
+| Concern | Detail |
+|---|---|
+| **Timing-sensitive stubs** | Imploder has tight loops that may fail on accelerated CPUs; some stubs poll `$DFF006` (VHPOSR) for timing |
+| **Memory allocation** | Stub requires working `exec.library AllocMem` — must have a functional memory list |
+| **Chip RAM specificity** | If original hunks need CHIP RAM, stub must request `MEMF_CHIP` — DMA-accessible memory required for graphics/audio |
+| **Self-modifying code** | In-place decompression writes over instruction bytes — 68020+ instruction cache must be invalidated (`CacheClearU`) |
+| **Custom chip access** | Decrunch colour writes to `$DFF180` require a working Denise/colour register |
+| **Boot-block crunchers** | Trackloaders (game boot blocks) use custom crunchers without HUNK format — completely different mechanism, no OS involvement |
+
+---
+
+## References
+
+- PowerPacker documentation (Nico François, 1989)
+- Shrinkler source: https://github.com/askeksa/Shrinkler — `decrunchers/ShrinklerDecompress.S`
+- Ancient decompression library: https://github.com/temisu/ancient — portable C++ decompressors
+- xfdmaster.library — Aminet `util/pack/xfdmaster.lha` (Dirk Stöcker)
+- See also: [HUNK Format](hunk_format.md) — the container format crunchers wrap
+- See also: [Exe Load Pipeline](exe_load_pipeline.md) — how LoadSeg handles the crunched HUNK
+- See also: [Overlay System](overlay_system.md) — another approach to large-program memory management
diff --git a/README.md b/README.md
index dfbd5c4..a2dc811 100644
--- a/README.md
+++ b/README.md
@@ -98,6 +98,7 @@ The Amiga's documentation was scattered across out-of-print manuals, Usenet post
| [exe_load_pipeline.md](03_loader_and_exec_format/exe_load_pipeline.md) | LoadSeg → AllocMem → relocation → segment chain → CreateProc → entry point |
| [object_file_format.md](03_loader_and_exec_format/object_file_format.md) | HUNK_UNIT object files, multi-section layout, HUNK_LIB archives, linker operation |
| [overlay_system.md](03_loader_and_exec_format/overlay_system.md) | HUNK_OVERLAY: tree architecture, runtime overlay manager, modern alternatives |
+| [**exe_crunchers.md**](03_loader_and_exec_format/exe_crunchers.md) | **Executable packers: PP20/Imploder/Shrinkler, decrunch stubs, algorithms, detection** |
### 04 — Linking & Library Integration
| File | Topic |