# Executable Crunchers: Compression, Decrunch Stubs, and Internals

## Overview

**Executable crunchers** (packers) compress AmigaOS executables while keeping them directly runnable. The crunched file is a valid HUNK executable: when launched, a tiny **decrunch stub** runs first, decompresses the original program in memory, then jumps to its real entry point. The user sees a brief colour-cycling delay (the "decrunch colours"), then the program runs normally.

This was essential in the floppy era: a 200 KB program crunched to 120 KB loads significantly faster from a slow 880 KB floppy *and* frees disk space on capacity-constrained media.

---

## Architecture

```mermaid
graph LR
    subgraph "Original Executable"
        OH["HUNK_HEADER"] --> OC["HUNK_CODE<br/>Original code"]
        OC --> OD["HUNK_DATA<br/>Original data"]
        OD --> OB["HUNK_BSS"]
    end
    subgraph "Crunched Executable"
        CH["HUNK_HEADER"] --> CS["HUNK_CODE<br/>Decrunch Stub<br/>(~200–800 bytes)"]
        CS --> CD["HUNK_DATA<br/>Compressed payload<br/>(original hunks)"]
        CD --> CB["HUNK_BSS<br/>Decompression workspace"]
    end
    OH -.->|"Cruncher tool"| CH
    style CS fill:#fff9c4,stroke:#f9a825,color:#333
    style CD fill:#e8f4fd,stroke:#2196f3,color:#333
```

### Key Insight

A crunched executable is **itself a valid HUNK file**. The OS loader handles it normally: `LoadSeg()` allocates memory, loads hunks, and applies relocations. The "magic" is that hunk 0 contains a decrunch stub instead of the original code, and the data hunk contains the compressed original program.

---

## Major Amiga Crunchers

| Cruncher | Era | Algorithm | Stub Size | Typical Ratio (compressed/original) | Notes |
|---|---|---|---|---|---|
| **PowerPacker** (PP20) | 1989–1994 | LZ77 + configurable efficiency | ~280 bytes | 50–60% | Most popular; `powerpacker.library` for data files |
| **Imploder** (IMP!) | 1990–1993 | LZSS variant | ~400 bytes | 45–55% | Multiple modes: Normal, Pure, Library, Overlayed |
| **Turbo Imploder** | 1991–1993 | Enhanced LZSS | ~420 bytes | 42–52% | Faster crunch, same decrunch |
| **ByteKiller** | 1988–1991 | LZ77 (simple) | ~160 bytes | 55–65% | Early; position-independent stub; used for raw data too |
| **Titanics Cruncher** (ATN!) | 1991–1993 | LZ77 | ~350 bytes | 55–65% | Fast decrunch |
| **CrunchMania** (CrM!) | 1992–1995 | LZ + range coding | ~500 bytes | 40–50% | Many registered/customised versions, hence format variants |
| **Shrinkler** | 2014+ | Context-model + range coder | ~250 bytes | 30–40% | Modern; best ratio; used in 4K/64K demo intros |
| **PackFire** | 2016+ | Shrinkler derivative | ~200 bytes | 30–40% | Optimised for size-limited compos |
| **XPK** | 1992+ | Framework (multiple sub-packers) | varies | varies | Library-based; supports NUKE, SMPL, SQSH, etc. |

---

## Binary Structure of a Crunched Executable

### What the Cruncher Produces

The cruncher tool reads the original executable, compresses its contents, and wraps them in a new HUNK executable:

```
HUNK_HEADER ($3F3)
  num_hunks = 2 or 3
  hunk_sizes:
    [0] = stub code size + compressed data (or split across hunks)
    [1] = workspace BSS (decompression buffer)

HUNK_CODE ($3E9)
  decrunch stub          ; 200–800 bytes of 68000 code
  compressed data        ; the original executable, compressed
  metadata               ; original hunk count, sizes, memory types

HUNK_RELOC32 ($3EC)      ; relocations for the stub itself (minimal)

HUNK_BSS ($3EB)          ; workspace for decompression
                         ; typically = original uncompressed size
HUNK_END ($3F2)
```

### Alternate Layout (Multi-Hunk)

Some crunchers split the stub and payload into separate hunks:

```
Hunk 0: HUNK_CODE — decrunch stub only (~300 bytes)
Hunk 1: HUNK_DATA — compressed payload + metadata
Hunk 2: HUNK_BSS  — decompression workspace
```

---

## PowerPacker PP20: Format Deep Dive

PowerPacker (by Nico François) is the most widely used Amiga cruncher. It exists in two forms: a **data file format** (for `powerpacker.library`) and an **executable wrapper** (for crunched executables).

### PP20 Data Format

```
Offset  Size  Field
──────  ────  ─────────────────────────────
$00     4     Signature: "PP20" ($50503230)
$04     4     Efficiency table: 4 bytes controlling LZ bit-depth
              e.g. $09090909 = "Fast", $090A0C0D = "Best"
$08     N     Compressed bitstream data
$08+N   4     Decrunch info: 24-bit original size (big-endian) + skip-bits byte
              Byte layout: [size_hi] [size_mid] [size_lo] [skip_bits]
```

The final byte tells the decoder how many padding bits to discard before it reads the first code.

### Efficiency Table

The 4-byte efficiency table sets how many offset bits are used for each of the four match-length classes. Larger values mean a larger search window:

| Mode | Efficiency Bytes | Description |
|---|---|---|
| Fast | `$09 09 09 09` | Smaller window, faster crunch |
| Mediocre | `$09 0A 0A 0A` | Balance |
| Good | `$09 0A 0B 0B` | Better ratio |
| Very Good | `$09 0A 0C 0C` | Near-best |
| Best | `$09 0A 0C 0D` | Maximum compression, slowest |

The decompressor reads these 4 bytes to initialize its internal offset/length bit-allocation tables before starting the main decompression loop.

### Decrunch Colours

The PowerPacker decrunch stub famously modifies custom chip colour registers during decompression to provide visual feedback: the background colour cycles through shades of grey or colour gradients, signalling that decrunching is in progress. This is the characteristic "decrunch effect" visible on real hardware:

```asm
; Visual feedback during decrunch:
    MOVE.W  D0,$DFF180      ; COLOR00: background colour
                            ; D0 increments with each decompressed block
```

---

## Shrinkler: Modern State-of-the-Art

Shrinkler (by Blueberry/Loonies) is the current gold standard for Amiga executable compression, typically shrinking executables to 30–40% of their original size. It is open-source and widely used in the demo scene.

### Algorithm: Context-Modelling + Range Coder

Unlike older LZ77-based crunchers, Shrinkler uses:

1. **Adaptive context model**: maintains 1536 probability contexts (`NUM_CONTEXTS = 1536`). Each context tracks the probability of the next bit being 0 or 1, updated after every decoded bit
2. **Range coder**: an arithmetic coding variant that encodes bits using interval subdivision based on the context probabilities
3. **LZ matching**: literal bytes and back-references are intermixed, with the context model predicting which type comes next

### Shrinkler Data Header

```
Offset  Size  Field
──────  ────  ─────────────────────────────
$00     4     Signature: "Shri" ($53687269)
$04     1     Major version
$05     1     Minor version
$06     2     Header size (remaining bytes)
$08     4     Compressed data size
$0C     4     Uncompressed data size
$10     4     Safety margin (for in-place decompression)
$14     4     Flags: bit 0 = FLAG_PARITY_CONTEXT
```

The **parity context** flag (`FLAG_PARITY_CONTEXT`) enables a special mode that maintains separate probability models based on the byte position parity, exploiting statistical properties of 68000 machine code (even/odd byte patterns in opcode words).

### 68000 Decompressor Core (from Shrinkler source)

The actual decompressor fits in approximately 100 instructions:

```asm
; Register usage:
;   D2 = Range value
;   D3 = Interval size
;   D4 = Input bit buffer (reads bytes from compressed stream)
;   D6 = Context index
;   D7 = Parity context flag (0 or 1)
;   A4 = Compressed data source pointer
;   A5 = Decompressed data destination pointer

INIT_ONE_PROB   = $8000     ; Initial probability: 50/50
ADJUST_SHIFT    = 4         ; Probability adaptation rate
NUM_CONTEXTS    = 1536      ; Context table size

ShrinklerDecompress:
    movem.l d2-d7/a4-a6,-(a7)

    ; Init range decoder state
    moveq.l #0,d2           ; Range value = 0
    moveq.l #1,d3           ; Interval size = 1
    moveq.l #-$80,d4        ; Input buffer (triggers first byte read)

    ; Init all 1536 probabilities to 50% ($8000)
    move.l  #NUM_CONTEXTS,d6
.init:
    move.w  #INIT_ONE_PROB,-(a7)    ; Push WORD onto stack
    subq.w  #1,d6
    bne.b   .init
    ; Context table is now on the stack (3072 bytes)

    ; Main decompression loop
.lit:
    ; Decode literal byte bit-by-bit using context model
    addq.b  #1,d6           ; D6 = 1: sentinel bit that marks byte completion
.getlit:
    bsr.b   GetBit          ; Get one bit from range coder
    addx.b  d6,d6           ; Shift bit into D6
    bcc.b   .getlit         ; Loop until the sentinel shifts out (byte complete)
    move.b  d6,(a5)+        ; Write decompressed byte
.switch:
    bsr.b   GetKind         ; Is next item literal or reference?
    bcc.b   .lit            ; Literal → decode another byte
    ; Reference: decode offset and length
    ; ... (LZ match copy loop)
```

### Stack-Based Context Table

A distinctive Shrinkler technique: the 1536-entry probability table (3072 bytes) is allocated **on the stack**, with each entry pushed as a WORD during initialization. This avoids a separate `AllocMem` call and keeps the decompressor self-contained.

---

## Compression Algorithms

### LZ77 / LZSS (PowerPacker, Titanics, ByteKiller, Imploder)

The dominant algorithm family. The compressed stream is a sequence of items, each introduced by a control bit and followed by either a literal byte or a back-reference:

```
[flag bit]
  0 → literal byte follows (copy 1 byte verbatim)
  1 → match reference: (offset, length)
      offset = how far back in already-decompressed data to copy from
      length = how many bytes to copy from that position

Decompression pseudo-code:
  while (output_pos < original_size):
      bit = read_bit()
      if bit == 0:
          output[output_pos++] = read_byte()          # literal
      else:
          offset = read_bits(offset_bits)             # back-reference
          length = read_bits(length_bits) + min_len
          copy(output, output_pos - offset, length)   # copy from history
          output_pos += length
```

The **efficiency setting** (PowerPacker) or **mode** (Imploder) controls how many bits are allocated to the offset and length fields. More offset bits give a larger search window and better compression, at the cost of a slower crunch.

### Context Modelling + Range Coding (Shrinkler, PackFire)

Modern crunchers replace fixed-bit-width encoding with probability-based arithmetic coding:

1. For each bit position, the **context model** estimates the probability that this bit is 1
2. The **range coder** encodes the bit using that probability, so high-probability bits use fewer output bits
3. After encoding/decoding, the context probability is **updated** based on the actual bit value

This achieves near-optimal compression but decompression is slower (~2–5 seconds on a 7 MHz 68000 for a typical executable).
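
The three steps of context modelling can be illustrated with a toy adaptive bit model. This is a minimal Python sketch, not Shrinkler's actual update rule: it borrows the constants `INIT_ONE_PROB = $8000` and `ADJUST_SHIFT = 4` from the decompressor listing, while the class name `AdaptiveBit`, the exact update formula, and the `total_cost` helper are assumptions made for illustration.

```python
import math

INIT_ONE_PROB = 0x8000   # 16-bit fixed point: 0x8000 / 0x10000 = 50%
ADJUST_SHIFT = 4         # larger shift = slower adaptation


class AdaptiveBit:
    """One context: a 16-bit estimate of P(next bit == 1). Hypothetical helper."""

    def __init__(self):
        self.p_one = INIT_ONE_PROB

    def cost(self, bit):
        """Ideal cost, in output bits, of coding `bit` under the current estimate."""
        p = self.p_one / 0x10000
        return -math.log2(p if bit else 1.0 - p)

    def update(self, bit):
        """Move the estimate a fraction 1/2**ADJUST_SHIFT toward the observed bit."""
        if bit:
            self.p_one += (0xFFFF - self.p_one) >> ADJUST_SHIFT
        else:
            self.p_one -= self.p_one >> ADJUST_SHIFT


def total_cost(bits):
    """Sum the ideal coding cost of a bit sequence using one adapting context."""
    ctx = AdaptiveBit()
    cost = 0.0
    for b in bits:
        cost += ctx.cost(b)   # step 2: code the bit with the current probability
        ctx.update(b)         # step 3: adapt to what was actually seen
    return cost


skewed = [1] * 60 + [0] * 4      # mostly ones: the model learns this quickly
print(f"{total_cost(skewed):.1f} bits for {len(skewed)} input bits")
```

On the skewed sequence the total stays below one output bit per input bit, which is the gain a range coder can realise. A strictly alternating sequence costs slightly more than one bit per input bit, because this single context always leans toward the bit it saw last; real coders select among many contexts to avoid that.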

---

## Relocation Handling

The original executable has HUNK_RELOC32 entries that patch absolute addresses. After decompression, these must be reapplied. Crunchers use three strategies:

### Method 1: Compress Everything Including Relocs

The entire original file (all hunks + relocation tables) is compressed as a blob. The decrunch stub acts as a mini-`LoadSeg`:

1. Decompress to a temp buffer
2. Parse the HUNK stream
3. Allocate individual hunks with correct memory types
4. Copy data and apply relocations
5. Free the temp buffer

### Method 2: Pre-Relocated + Delta Table

1. Cruncher pre-applies relocations assuming base address 0
2. Stores a compact **delta table**: a sorted list of byte-offset deltas between relocation sites
3. After decompression, the stub walks the delta table and adds actual base addresses

```c
#include <exec/types.h>   /* UWORD */

/* Delta table: each entry is the offset-delta to the next reloc site */
UWORD reloc_deltas[] = {
    0x0006,   /* first reloc at offset 6 */
    0x0014,   /* +0x14 → next at offset 0x1A */
    0x0008,   /* +0x08 → next at offset 0x22 */
    0x0000    /* terminator */
};
/* More compact than storing absolute offsets */
```

### Method 3: Merge and Self-Relocate

All hunks are merged into a single code hunk. Inter-hunk references are resolved at crunch time. The result needs minimal or no relocation.

**Drawback**: Loses CHIP/FAST memory separation, so all data ends up in the same memory type. This is problematic for programs that need Chip RAM for bitmaps or audio.
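
The delta-table walk of Method 2 can be sketched on the host side in Python. This is a minimal illustration under stated assumptions, not any particular cruncher's stub: the function name `apply_delta_relocs`, the single-hunk image, and the load address are hypothetical, the 16-bit deltas and zero terminator mirror the `reloc_deltas` example, and each relocation site is assumed to hold a 32-bit big-endian address pre-relocated to base 0.

```python
import struct


def apply_delta_relocs(image, deltas, base):
    """Add `base` to every 32-bit big-endian longword named by the delta table.

    Each entry in `deltas` is the distance from the previous relocation site
    (the first entry is measured from offset 0); a 0 entry ends the table.
    """
    offset = 0
    for delta in deltas:
        if delta == 0:                      # terminator
            break
        offset += delta
        (value,) = struct.unpack_from(">I", image, offset)
        struct.pack_into(">I", image, offset, (value + base) & 0xFFFFFFFF)
    return image


# Hypothetical 0x28-byte hunk whose three reloc sites point at offsets 0x10, 0x20, 0x24
image = bytearray(0x28)
for site, target in ((0x06, 0x10), (0x1A, 0x20), (0x22, 0x24)):
    struct.pack_into(">I", image, site, target)

# Assumed load address of the hunk after AllocMem
apply_delta_relocs(image, [0x0006, 0x0014, 0x0008, 0x0000], base=0x00C40000)
print(hex(struct.unpack_from(">I", image, 0x1A)[0]))
```

A table of 16-bit deltas cannot express a gap larger than $FFFF between two sites, so a real format would need some escape mechanism for that case; this sketch leaves it out.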

---

## Memory Layout During Decompression

```
BEFORE (crunched exe loaded by OS):
┌──────────────────────┐  Hunk 0 (CODE)
│ Decrunch stub (300B) │
│ Compressed data (80K)│
│ Metadata             │
└──────────────────────┘
┌──────────────────────┐  Hunk 1 (BSS)
│ Workspace (200K)     │  ← decompression buffer
└──────────────────────┘

DURING (stub is executing):
┌──────────────────────┐  Hunk 0 (still alive)
│ Stub + compressed ───│──→ reading from here
└──────────────────────┘
┌──────────────────────┐  AllocMem'd by stub
│ Original Hunk 0 CODE │──→ writing decompressed data
└──────────────────────┘
┌──────────────────────┐  AllocMem'd by stub
│ Original Hunk 1 DATA │
└──────────────────────┘

AFTER (stub jumps to original entry):
┌──────────────────────┐  (freed or abandoned)
│ [freed stub memory]  │
└──────────────────────┘
┌──────────────────────┐  Original program running
│ Original Hunk 0 CODE │  ← PC here
└──────────────────────┘
┌──────────────────────┐
│ Original Hunk 1 DATA │
└──────────────────────┘
```

> **In-place decompression**: Some crunchers (including Shrinkler) support decompressing over the compressed data. The `safety_margin` field in the Shrinkler header reserves extra space so the decompressor's write pointer never overtakes the read pointer. The direction depends on the format: the Shrinkler core shown earlier writes forward through `(a5)+`, while backward decoders such as PowerPacker work from end to start.
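
The safety-margin idea can be made concrete with a small model. The sketch below is a generic illustration, not Shrinkler's own calculation: it assumes a decoder that writes forward from the start of a buffer while the compressed data sits at the end of that same buffer (the mirror-image argument applies to a decoder running from end to start). The function name `min_safety_margin` and the progress trace are hypothetical.

```python
def min_safety_margin(trace, compressed_size, uncompressed_size):
    """Smallest margin that keeps the write pointer at or behind the read pointer.

    `trace` is a list of cumulative (bytes_consumed, bytes_produced) pairs, one
    per decoding step. The buffer is uncompressed_size + margin bytes long and
    the compressed data occupies its last compressed_size bytes.
    """
    read_start = uncompressed_size - compressed_size   # read pointer with margin 0
    margin = 0
    for consumed, produced in trace:
        write_ptr = produced
        read_ptr = read_start + consumed
        # Any overshoot of the write pointer must be covered by extra margin
        margin = max(margin, write_ptr - read_ptr)
    return margin


# Hypothetical stream: 10 compressed bytes expand to 16.
# A long match early on makes the output run ahead of the input.
trace = [(1, 1), (3, 12), (9, 15), (10, 16)]
print(min_safety_margin(trace, compressed_size=10, uncompressed_size=16))
```

In this trace the second step has produced 12 bytes after consuming only 3, so the write pointer would pass the read pointer by 3 bytes unless the buffer is extended by that amount. A stream whose output never runs ahead of its input needs no margin at all.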

---

## Detection and Identification

### Magic Signatures

| Cruncher | Signature | Hex | Location |
|---|---|---|---|
| PowerPacker | `PP20` | $50503230 | Start of compressed data |
| Imploder | `IMP!` | $494D5021 | Start of compressed data |
| Turbo Imploder | `IMP!` | $494D5021 | Same signature; version in stub differs |
| Titanics | `ATN!` | $41544E21 | Start of compressed data |
| CrunchMania | `CrM!` / `CrM2` | $43724D21 / $43724D32 | Start of compressed data |
| Shrinkler | `Shri` | $53687269 | Data file header (exe uses stub pattern) |
| ByteKiller | (no magic) | n/a | Detected by stub pattern only |
| XPK Framework | `XPKF` | $58504B46 | File header |

> [!WARNING]
> **Fake headers** are extremely common in the Amiga cracking scene. A file claiming to be `IMP!` may have a spoofed header to frustrate analysis. If standard tools reject it, the header is likely fake; use a debugger to capture the decrunched memory image instead.

### Detecting Crunched Executables in RE

1. **Tiny code hunk + large data hunk**: an unusual ratio signals packing
2. **AllocMem + decompression loop** at entry point, rather than the normal `c.o` startup pattern
3. **No `MOVE.L 4.W,A6` / `OpenLibrary` sequence**: the stub goes straight to decompression
4. **Custom chip register writes** (`$DFF180` colour changes): decrunch colour feedback
5. **Magic bytes** in the data hunk: scan for known signatures
6. **Self-modifying code**: the stub may overwrite its own memory during in-place decompression

```python
# Quick detection script: scan a file for known cruncher signatures.
MAGICS = {
    b'PP20': 'PowerPacker',
    b'IMP!': 'Imploder',
    b'ATN!': 'Titanics Cruncher',
    b'CrM!': 'CrunchMania',
    b'CrM2': 'CrunchMania 2',
    b'Shri': 'Shrinkler (data)',
    b'XPKF': 'XPK Framework',
}
HUNK_HEADER = b'\x00\x00\x03\xf3'


def detect_cruncher(filename):
    """Return the name of the first known signature found, or None."""
    with open(filename, 'rb') as f:
        data = f.read()

    for magic, name in MAGICS.items():
        off = data.find(magic)          # -1 when the signature is absent
        if off >= 0:
            print(f"  {name} detected at offset ${off:04X}")
            return name

    # No signature: a valid HUNK file may still carry a magic-less stub
    if data[:4] == HUNK_HEADER:
        print("  Valid HUNK: check for stub pattern at entry point")
    return None
```

---

## Decrunching Tools

### AmigaOS Native

| Tool | Description |
|---|---|
| `xfdmaster.library` | Universal decruncher with a modular architecture: "slave" plugins in `LIBS:xfd/` |
| `xfdDecrunch` | CLI front-end: `xfdDecrunch packed.exe unpacked.exe` |
| `xfdScan` / `xfdList` | Identify cruncher type; list installed slave modules |
| `powerpacker.library` | PP20 data file decompression: `ppLoadData()` |

### Cross-Platform

| Tool | Description |
|---|---|
| **Ancient** (C++) | Modern portable library that supports ByteKiller, Imploder, CrunchMania, PP20, and many more. GitHub: `temisu/ancient` |
| `ppunpack` | PP20 only: `ppunpack packed.exe unpacked.exe` |
| Shrinkler `-d` | Shrinkler data files: `shrinkler -d packed unpacked` |

### xfdmaster: Modular Architecture

xfdmaster does not have a hardcoded format list. It loads **slave modules** from `LIBS:xfd/` at runtime, each handling one or more cruncher formats:

```
LIBS:xfd/
  PowerPacker      ; handles PP20
  Imploder         ; handles IMP!
  CrunchMania      ; handles CrM!, CrM2
  ByteKiller       ; stub-pattern detection
  Titanics         ; handles ATN!
  ...              ; 100+ supported formats
```

```c
/* Using xfdmaster.library to decrunch any format: */
struct xfdBufferInfo *xbi = xfdAllocObject(XFDOBJ_BUFFERINFO);
if (xbi) {                                   /* allocation can fail */
    xbi->xfdbi_SourceBufLen = filesize;
    xbi->xfdbi_SourceBuffer = filebuf;
    if (xfdRecogBuffer(xbi)) {               /* identify the cruncher */
        printf("Detected: %s\n", xbi->xfdbi_PackerName);
        if (xfdDecrunchBuffer(xbi)) {
            /* xbi->xfdbi_TargetBuffer = decrunched data */
            /* xbi->xfdbi_TargetBufSaveLen = decrunched size */
        }
    }
    xfdFreeObject(xbi);
}
```

### Debugger-Based Extraction (Last Resort)

For unknown or custom crunchers, the most reliable method is to load the executable in a hardware-level debugger (HRTMon, ASM-One, or an emulator's monitor), set a breakpoint at the end of the decrunch stub (typically the final `JMP` instruction), and capture the memory image once decompression is complete:

```
; In HRTMon:
> d $entry_point                          ; disassemble entry
; Find the final JMP at the end of the stub
> bpx $stub_end_jmp                       ; set breakpoint
> g                                       ; run
; When breakpoint hits, the decrunched program is in memory
> sm $dest $dest+size "decrunched.bin"    ; save memory
```

---

## Impact on FPGA / Emulation

| Concern | Detail |
|---|---|
| **Timing-sensitive stubs** | Imploder has tight loops that may fail on accelerated CPUs; some stubs poll `$DFF006` (VHPOSR) for timing |
| **Memory allocation** | Stub requires a working `exec.library` `AllocMem`, so the system must have a functional memory list |
| **Chip RAM specificity** | If original hunks need CHIP RAM, the stub must request `MEMF_CHIP`; DMA-accessible memory is required for graphics/audio |
| **Self-modifying code** | In-place decompression writes over instruction bytes, so the 68020+ instruction cache must be invalidated (`CacheClearU`) |
| **Custom chip access** | Decrunch colour writes to `$DFF180` require a working Denise/colour register |
| **Boot-block crunchers** | Trackloaders (game boot blocks) use custom crunchers without HUNK format: a completely different mechanism with no OS involvement |

---

## References

- PowerPacker documentation (Nico François, 1989)
- Shrinkler source: https://github.com/askeksa/Shrinkler (`decrunchers/ShrinklerDecompress.S`)
- Ancient decompression library: https://github.com/temisu/ancient (portable C++ decompressors)
- xfdmaster.library: Aminet `util/pack/xfdmaster.lha` (Dirk Stöcker)
- See also: [HUNK Format](hunk_format.md), the container format crunchers wrap
- See also: [Exe Load Pipeline](exe_load_pipeline.md), how LoadSeg handles the crunched HUNK
- See also: [Overlay System](overlay_system.md), another approach to large-program memory management