The **HUNK** format is the binary container format used throughout AmigaOS. It is **not** a single file type: it covers three very different kinds of file that share the same record structure:
| File kind | Extension | First longword | Can be executed? |
|---|---|---|---|
| **Executable** (linker output, loadable program) | *(none)* | `$000003F3` (`HUNK_HEADER`) | ✅ Yes, loaded with `LoadSeg()` |
| **Object file** — compiler/assembler output, needs linking | `.o` | `$000003E7` (`HUNK_UNIT`) | ❌ No — must be linked first to produce an executable |
| **Static library archive** — collection of object files | `.lib` | `$000003FA` (`HUNK_LIB`) | ❌ No — linker input only |
An object file (`.o`) is **intermediate output** from a compiler. It contains relocatable code and unresolved external references. A linker (`slink`, `vlink`) combines one or more `.o` files with library archives into a final executable.
The format is a linear stream of **hunk records**, each identified by a 32-bit type word followed by type-specific data.
---
## Magic Number — All Valid First Longword Values
Tools and the OS identify a HUNK file by reading its **first 32-bit longword**. There are exactly three valid opening values:
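| First longword | Hex | Dec | Constant | File type | Who reads it |
|---|---|---|---|---|---|
| `00 00 03 F3` | `$000003F3` | 1011 | `HUNK_HEADER` | Loadable executable | `LoadSeg()` / `InternalLoadSeg` |
| `00 00 03 E7` | `$000003E7` | 999 | `HUNK_UNIT` | Relocatable object file (`.o`) | Linker |
| `00 00 03 FA` | `$000003FA` | 1018 | `HUNK_LIB` | Static library archive (`.lib`) | Linker |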
Any other first longword means the file is **not a valid HUNK file**. `InternalLoadSeg` will return an error.
> [!NOTE]
> Only `HUNK_HEADER` files can be passed to `LoadSeg()`. Passing a `.o` object file or a `.lib` archive to `LoadSeg()` will fail — those are consumed exclusively by the linker at build time, never at runtime.
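
The first-longword check described above can be sketched in portable C. This is a minimal illustration, not the OS implementation: the `hunk_file_kind` enum and the `classify_hunk_file` and `read_be32` helpers are hypothetical names introduced here, and the input is assumed to be the start of the file already read into memory.

```c
#include <stddef.h>
#include <stdint.h>

#define HUNK_UNIT   0x3E7u
#define HUNK_HEADER 0x3F3u
#define HUNK_LIB    0x3FAu

/* Hypothetical result codes for this sketch. */
enum hunk_file_kind {
    HUNK_FILE_INVALID,
    HUNK_FILE_EXECUTABLE,
    HUNK_FILE_OBJECT,
    HUNK_FILE_LIBRARY
};

/* Assemble a big-endian longword (68k byte order) from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Classify a file by its first longword only. */
static enum hunk_file_kind classify_hunk_file(const unsigned char *buf, size_t len)
{
    if (len < 4)
        return HUNK_FILE_INVALID;
    switch (read_be32(buf)) {
    case HUNK_HEADER: return HUNK_FILE_EXECUTABLE;
    case HUNK_UNIT:   return HUNK_FILE_OBJECT;
    case HUNK_LIB:    return HUNK_FILE_LIBRARY;
    default:          return HUNK_FILE_INVALID;
    }
}
```

Reading the bytes explicitly rather than casting the buffer to a `uint32_t *` keeps the sketch correct on little-endian hosts, which matters for cross-development tools.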
### What `$000003F3` means exactly
The value `$000003F3` = decimal 1011 = the constant `HUNK_HEADER`. Nothing about this value is arbitrary — it is the hunk type code for the header record, used as the magic number because the header is always the first hunk in an executable.
### What `$000003E7` means exactly
The value `$000003E7` = decimal 999 = `HUNK_UNIT`. This marks the start of one relocatable compilation unit. A `.o` file may contain multiple `HUNK_UNIT` records, one per independently-compiled module (though most compilers emit exactly one per file).
## Hunk Type Reference
> Source header: **`dos/doshunks.h`** (NDK 3.9). Every hunk record starts with one of the 32-bit tag values listed below. The file is a linear stream — the loader reads tag → payload → next tag, until the file ends.
---
### Terminology
| Term | Meaning |
|---|---|
| **longword** | 32-bit (4-byte) value, the width of the 68000's data and address registers (a 68000 "word" is 16 bits) |
| **size in longs** | Content length as a count of 4-byte longwords. Bytes = longs × 4 |
| **Exec** | Appears in loadable executables only (starts with `HUNK_HEADER`) |
| **Obj** | Appears in relocatable object files only (starts with `HUNK_UNIT`) |
| **Both** | Valid in either context |
---
### Group 1 — Object File Framing
> These two tags appear **only in `.o` files**. Never in a final linked executable.
| Hex | Dec | Constant | Wire format | Description |
|---|---|---|---|---|
| `$3E7` | 999 | `HUNK_UNIT` | `[tag] [name_len_longs] [name_bytes…]` | **Start of a relocatable object unit.** Always the very first record in a `.o` file — the object-file equivalent of `HUNK_HEADER`. The name field names the compilation unit (e.g. `"main.o"`). A single `.o` file may contain multiple `HUNK_UNIT` records. |
| `$3E8` | 1000 | `HUNK_NAME` | `[tag] [name_len_longs] [name_bytes…]` | **Section name label.** Optional; assigns a human-readable name to the following section. The linker uses it for map files and diagnostics. |
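
A name record of the form `[tag] [name_len_longs] [name_bytes…]` can be read roughly as follows. This is a sketch under assumptions: `read_hunk_name` and `read_be32` are hypothetical helpers, the buffer is assumed to start at the length longword (just after the tag), and names shorter than a whole number of longwords are assumed to be padded with zero bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble a big-endian longword (68k byte order) from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Copy the name that follows a HUNK_UNIT or HUNK_NAME tag into `out`.
 * Returns the number of bytes consumed from `buf` (length longword plus
 * name bytes), or 0 if the record is truncated or does not fit in `out`.
 */
static size_t read_hunk_name(const unsigned char *buf, size_t len,
                             char *out, size_t out_size)
{
    uint32_t longs;
    size_t nbytes;

    if (len < 4 || out_size == 0)
        return 0;
    longs = read_be32(buf);
    if (longs > (len - 4) / 4)          /* name would run past the buffer */
        return 0;
    nbytes = (size_t)longs * 4;         /* bytes = longs x 4 */
    if (nbytes >= out_size)
        return 0;
    memcpy(out, buf + 4, nbytes);
    out[nbytes] = '\0';                 /* padding zeros end the string earlier */
    return 4 + nbytes;
}
```

The return value tells the caller how far to advance before reading the next tag.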
---
### Group 2 — Content Sections
> Carry actual program data. Valid in **both** executables and object files. The type longword may have `HUNKF_CHIP` / `HUNKF_FAST` ORed into its upper bits — see [Memory Placement Flags](#memory-placement-flags).
| Hex | Dec | Constant | Payload | Description |
|---|---|---|---|---|
| `$3E9` | 1001 | `HUNK_CODE` | `[tag] [size_longs] [code_bytes × size×4]` | **Machine-code section.** The loader allocates RAM, copies the bytes, then applies any `HUNK_RELOC32` that follows. Holds 68k instructions — never data. |
| `$3EA` | 1002 | `HUNK_DATA` | `[tag] [size_longs] [data_bytes × size×4]` | **Initialized read/write data.** Global variables with non-zero values, string literals, jump tables, etc. Any embedded pointers to other hunks require `HUNK_RELOC32` fixups. |
| `$3EB` | 1003 | `HUNK_BSS` | `[tag] [size_longs]` *(no data bytes)* | **Uninitialized data (zero-fill).** Only the size is stored — no bytes in the file. The loader calls `AllocMem(..., MEMF_CLEAR)`. A 64 KB zero array costs only the tag and size longwords (8 bytes) on disk. **No relocation follows BSS hunks** — there are no initialized values to fix up. |
> [!NOTE]
> **HUNK_DATA trailing space:** Data hunks have been observed with trailing `ds.width` variables that do not contribute to the local hunk length declared in the `HUNK_DATA` header, but are accounted for in the `HUNK_HEADER` size table. The OS loader allocates based on the header size table, so the extra space is available at runtime even though the hunk's own `num_longs` field doesn't include it.
---
### Group 3 — Relocation
> Tell the loader which longwords inside the current hunk need to be patched with the actual load address of another hunk. Without relocation, all cross-hunk pointers would point to wrong addresses after the OS places code at a non-zero address.
| Hex | Dec | Constant | Alias | Field width | Description |
|---|---|---|---|---|---|
| `$3EC` | 1004 | `HUNK_RELOC32` | `HUNK_ABSRELOC32` | LONG (32-bit) | **Absolute 32-bit fixup — the most common type.** Wire format: `[tag] { [count] [hunk_idx] [offset_0] … [offset_n] } … [0]`. Each offset points to a longword in the current hunk; `*(ULONG*)(base+offset) += target_hunk_base`. Terminated by `count=0`. |
| `$3ED` | 1005 | `HUNK_RELOC16` | `HUNK_RELRELOC16` | LONG (32-bit) | **16-bit PC-relative fixup**, as the `HUNK_RELRELOC16` alias indicates. Same table format as above but patches a 16-bit displacement. Rare: branches within a single hunk are already PC-relative and need no reloc. |
| `$3EE` | 1006 | `HUNK_RELOC8` | `HUNK_RELRELOC8` | LONG (32-bit) | **8-bit PC-relative fixup**, as the `HUNK_RELRELOC8` alias indicates. Patches a byte-sized displacement. Essentially unused in practice. |
| `$3F7` | 1015 | `HUNK_DREL32` | — | WORD (16-bit) | **Compact 32-bit reloc.** Same semantics as `HUNK_RELOC32` but count, hunk index, and offsets are stored as 16-bit WORDs, halving the table size. Valid only when all hunk offsets fit in 16 bits (hunk < 64 KB). Generated by BLink. |
| `$3F8` | 1016 | `HUNK_DREL16` | — | WORD (16-bit) | Compact 16-bit reloc with WORD-sized fields. Very rare. |
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | — | WORD (16-bit) | **Compact absolute 32-bit reloc with WORD offsets.** Semantically identical to `HUNK_RELOC32` with WORD fields. Default output of vasm/vlink when all offsets fit in 16 bits. Preferred over `HUNK_DREL32` in OS 3.x-era tools. **After the table, if the total WORD count is odd, a padding WORD (`$0000`) restores longword alignment** before the next hunk record. |
| `$3FD` | 1021 | `HUNK_RELRELOC32` | — | LONG (32-bit) | **PC-relative 32-bit reloc.** Patch: `*(LONG*)(base+off) += target_base − (base+off+4)`. Used by GCC `-fPIC` and PIC shared libraries. |
| `$3FE` | 1022 | `HUNK_ABSRELOC16` | — | LONG (32-bit) | **Absolute 16-bit fixup.** Patches a UWORD with the low 16 bits of the target's absolute address. Required for `MOVE.W #abs_addr,Dn` patterns. Rare. |
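
To make the `HUNK_RELOC32` wire format concrete, the sketch below walks one table and applies `*(ULONG*)(base+offset) += target_hunk_base` to a hunk held in a byte buffer. It is an illustration only, not the `dos.library` code: `apply_reloc32`, `read_be32`, `write_be32`, and the `bases` array of per-hunk load addresses are all hypothetical names, and the table pointer is assumed to start just after the tag.

```c
#include <stddef.h>
#include <stdint.h>

/* Assemble a big-endian longword (68k byte order) from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Store a longword in big-endian byte order. */
static void write_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/*
 * Apply one HUNK_RELOC32 table to `hunk`.
 * Returns the number of table bytes consumed (including the terminating
 * zero count), or 0 on malformed input. On error, fixups processed before
 * the bad entry have already been applied.
 */
static size_t apply_reloc32(const unsigned char *table, size_t table_len,
                            unsigned char *hunk, size_t hunk_len,
                            const uint32_t *bases, size_t num_hunks)
{
    size_t pos = 0;

    for (;;) {
        uint32_t count, idx, i;

        if (table_len - pos < 4)
            return 0;
        count = read_be32(table + pos);
        pos += 4;
        if (count == 0)                      /* count = 0 ends the table */
            return pos;
        if (table_len - pos < 4)
            return 0;
        idx = read_be32(table + pos);        /* hunk whose base is added */
        pos += 4;
        if (idx >= num_hunks || count > (table_len - pos) / 4)
            return 0;
        for (i = 0; i < count; i++) {
            uint32_t off = read_be32(table + pos);
            pos += 4;
            if (hunk_len < 4 || off > hunk_len - 4)
                return 0;                    /* fixup outside the hunk */
            write_be32(hunk + off, read_be32(hunk + off) + bases[idx]);
        }
    }
}
```

A real loader patches the hunk in place in allocated memory; working on a byte buffer with explicit big-endian access lets the same logic run in a cross-platform inspection tool.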
---
### Group 4 — External Symbol Table
> Object files only — **never present in a linked executable**.
| Hex | Dec | Constant | Description |
|---|---|---|---|
| `$3EF` | 1007 | `HUNK_EXT` | **Import + export symbol table for a compilation unit.** A single stream encodes both sides: *exports* declare symbols defined in this hunk (type `EXT_DEF`, `EXT_ABS`, `EXT_RES`); *imports* list unresolved references the linker must satisfy from other objects (type `EXT_REF32`, `EXT_REF16`, `EXT_REF8`, `EXT_COMMON`). The linker resolves all imports and emits `HUNK_RELOC32` records in the output executable. Wire format: `[tag] { [type_and_namelen] [name_bytes…] [value_or_refcount] [ref_offsets…] } … [0]`. See [`hunk_ext_deep_dive.md`](hunk_ext_deep_dive.md) for sub-type encoding. |
---
### Group 5 — Debug and Metadata
> **Completely ignored by the OS loader.** Strip with `slink NODBG` or `m68k-amigaos-strip --strip-debug` to reduce file size.
| Hex | Dec | Constant | Payload | Description |
|---|---|---|---|---|
| `$3F0` | 1008 | `HUNK_SYMBOL` | `[tag] { [namelen_longs] [name_bytes…] [value] } … [0]` | **Local symbol table.** Maps label names → offsets within this hunk. Consumed by MonAm, wack, IDA Pro. Terminated by `namelen=0`. |
| `$3F1` | 1009 | `HUNK_DEBUG` | `[tag] [size_longs] [format_tag] [data_bytes…]` | **Opaque debug block.** The leading `format_tag` longword identifies the debug data encoding — see [Debug Format Tags](#debug-format-tags) below for the full table. See [`hunk_debug_info.md`](hunk_debug_info.md) for stabs record layout. |
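
The `HUNK_SYMBOL` layout above (`{ [namelen_longs] [name_bytes…] [value] } … [0]`) lends itself to a simple linear scan. The following sketch looks up one label; `find_hunk_symbol` and `read_be32` are hypothetical helpers, the payload pointer is assumed to start just after the tag, and names are assumed to be zero-padded to a longword boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble a big-endian longword (68k byte order) from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Search a HUNK_SYMBOL payload for `wanted`.
 * Returns 1 and stores the symbol's offset in *value when found,
 * 0 when the terminator is reached without a match, -1 on malformed input.
 */
static int find_hunk_symbol(const unsigned char *payload, size_t len,
                            const char *wanted, uint32_t *value)
{
    size_t pos = 0;
    size_t wanted_len = strlen(wanted);

    for (;;) {
        uint32_t longs, v;
        size_t nbytes, n;
        const unsigned char *name;

        if (len - pos < 4)
            return -1;
        longs = read_be32(payload + pos);
        pos += 4;
        if (longs == 0)                      /* namelen = 0 ends the table */
            return 0;
        if (longs > (len - pos) / 4)
            return -1;
        nbytes = (size_t)longs * 4;
        if (len - pos - nbytes < 4)          /* value longword must follow */
            return -1;
        name = payload + pos;
        n = nbytes;
        while (n > 0 && name[n - 1] == 0)    /* drop zero padding */
            n--;
        pos += nbytes;
        v = read_be32(payload + pos);
        pos += 4;
        if (n == wanted_len && memcmp(name, wanted, n) == 0) {
            *value = v;
            return 1;
        }
    }
}
```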
#### Debug Format Tags
The first longword after the size field in a `HUNK_DEBUG` block is a 4-character ASCII **format tag** identifying the debug data encoding:
| Format tag (hex) | ASCII | Compiler / Assembler | Description |
|---|---|---|---|
| `$5A4D4147` | `ZMAG` | GNU tools (ld) | GNU ZMAGIC debug hunk (full 6-byte tag `ZMAGIC`) |
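
Because the format tag is just four ASCII bytes packed into a big-endian longword, a diagnostic tool can print it directly. A minimal sketch, with `debug_tag_to_ascii` being a hypothetical helper name:

```c
#include <stdint.h>

/*
 * Expand a 32-bit HUNK_DEBUG format tag into a NUL-terminated string.
 * The most significant byte comes first, matching the on-disk order.
 * Non-printable bytes are replaced with '?'.
 */
static void debug_tag_to_ascii(uint32_t tag, char out[5])
{
    int i;

    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(tag >> (24 - 8 * i));
        out[i] = (c >= 0x20 && c < 0x7F) ? (char)c : '?';
    }
    out[4] = '\0';
}
```

For the tag in the table, `$5A4D4147` decodes byte by byte as `Z`, `M`, `A`, `G`.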
> [!NOTE]
> `dos.library` v31+ treats **any** hunk ID whose lower 29 bits exceed `HUNK_ABSRELOC16` (`$3FE` / 1022) as a `HUNK_DEBUG` block and silently skips it — unless bit 29 is set, which causes `ERROR_BAD_HUNK`. This allows compilers to emit custom debug hunk types that newer loaders ignore transparently.
---
### Group 6 — File Structure and Overlays
| Hex | Dec | Constant | Payload | Description |
|---|---|---|---|---|
| `$3F2` | 1010 | `HUNK_END` | `[tag]` only — **no payload** | **Required end-of-hunk marker.** Every code/data/BSS hunk (and all reloc/symbol records that follow it) must close with `HUNK_END`. The loader advances to the next segment slot on reading it. |
| `$3F3` | 1011 | `HUNK_HEADER` | `[tag] [0] [num_hunks] [first_hunk] [last_hunk] [size_longs × n]` | **Executable magic number and segment size table.** Must be the very first longword in a loadable executable. The zero longword is the resident-library list (always 0 in practice). `num_hunks` = total hunks; `first_hunk`/`last_hunk` = inclusive range; followed by one size-in-longs per hunk. |
| `$3F5` | 1013 | `HUNK_OVERLAY` | `[tag] [size_longs] [overlay_table_data…]` | **Overlay descriptor table.** Follows the resident hunks; describes groups of code swapped in from disk on demand. Allows programs larger than available RAM. Obsolete — prefer `OpenLibrary()`. |
| `$3F6` | 1014 | `HUNK_BREAK` | `[tag]` only — **no payload** | **End of overlay tree sentinel.** `InternalLoadSeg` needs this to know where the overlay descriptor ends and the per-overlay hunk data begins. |
> [!NOTE]
> Value `$3F4` (decimal 1012) is **unused** — the numbering skips it intentionally.
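
The `HUNK_HEADER` layout can be decoded with a few bounds-checked reads. The sketch below is illustrative rather than a copy of the OS loader: the `hunk_header_info` struct and the `parse_hunk_header` and `hunk_size_bytes` helpers are hypothetical, the resident-library list is assumed to be empty (a single zero longword), and masking each size entry with `0x3FFFFFFF` to drop possible memory flag bits is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define HUNK_HEADER 0x3F3u

/* Assemble a big-endian longword (68k byte order) from four bytes. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical summary of the header fields. */
struct hunk_header_info {
    uint32_t num_hunks;            /* total hunks */
    uint32_t first_hunk;           /* inclusive range start */
    uint32_t last_hunk;            /* inclusive range end */
    uint32_t num_sizes;            /* last_hunk - first_hunk + 1 */
    const unsigned char *sizes;    /* start of the size table */
};

/* Returns 0 on success, -1 if the buffer is not a plausible HUNK_HEADER. */
static int parse_hunk_header(const unsigned char *buf, size_t len,
                             struct hunk_header_info *out)
{
    uint32_t span;

    if (len < 20 || read_be32(buf) != HUNK_HEADER)
        return -1;
    if (read_be32(buf + 4) != 0)   /* resident-library list: expected empty */
        return -1;
    out->num_hunks  = read_be32(buf + 8);
    out->first_hunk = read_be32(buf + 12);
    out->last_hunk  = read_be32(buf + 16);
    if (out->last_hunk < out->first_hunk)
        return -1;
    span = out->last_hunk - out->first_hunk;
    if (span >= (len - 20) / 4)    /* size table must fit in the buffer */
        return -1;
    out->num_sizes = span + 1;
    out->sizes = buf + 20;
    return 0;
}

/* Size of the i-th listed hunk in bytes (bytes = longs x 4). */
static uint64_t hunk_size_bytes(const struct hunk_header_info *h, uint32_t i)
{
    uint32_t longs = read_be32(h->sizes + 4 * (size_t)i) & 0x3FFFFFFFu;
    return (uint64_t)longs * 4;
}
```

Returning the size as a 64-bit value avoids overflow when a large longword count is multiplied by four.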
---
### Group 7 — Static Library Archive
> Linker input only. Never loaded by `LoadSeg()` at runtime.
| Hex | Dec | Constant | Description |
|---|---|---|---|
| `$3FA` | 1018 | `HUNK_LIB` | **Static library archive container.** A sequence of embedded `HUNK_UNIT` object files, each preceded by its size in longwords. Produced by `ar68k` or the AmigaOS `join` command. The linker extracts only the units needed to resolve outstanding `HUNK_EXT` imports. |
| `$3FB` | 1019 | `HUNK_INDEX` | **Symbol index for `HUNK_LIB`.** A packed string table plus a per-unit map of exported symbol names → unit byte offsets. Lets the linker locate a function without scanning every object in the archive. Always immediately follows the `HUNK_LIB` it describes. |
---
## Loader Handling of Unknown Hunk Types
After the initial `HUNK_HEADER`, the OS loader (`dos.library`) only examines the **lower 29 bits** of each hunk type longword. The upper bits encode memory placement flags (see [Memory Placement Flags](#memory-placement-flags) below). This has two important consequences:
1. **Unknown hunk types become debug.** `dos.library` v31+ treats any hunk ID whose lower 29 bits exceed `HUNK_ABSRELOC16` (`$3FE` / 1022) as a `HUNK_DEBUG` block and silently skips it. This allows compilers to emit custom debug hunk types that newer loaders ignore without error.
2. **Bit 29 set → load failure.** If a hunk ID has bit 29 set but is not a recognized code/data/BSS type, the loader **fails** with `ERROR_BAD_HUNK` rather than treating it as debug.
```c
/* Typical loader logic (dos.library v31+), simplified pseudocode */
hunk_id = read_uint32(f);
if (hunk_id == HUNK_HEADER) { ... }    /* first hunk only: full 32 bits */

/* After HUNK_HEADER: mask memory flags, check range */
hunk_id &= 0x3FFFFFFF;                 /* keep lower 30 bits (clears bits 31 and 30) */
if (hunk_id > HUNK_ABSRELOC16) {       /* unknown type */
    if (hunk_id & (1UL << 29))         /* bit 29 set? */
        return ERROR_BAD_HUNK;         /* hard error */
    /* else: treat as HUNK_DEBUG and skip silently */
}
```
> [!NOTE]
> The masking (typically `& 0x3FFFFFFF`) keeps 30 bits, not 29 as the simplified description suggests. The practical rule: after `HUNK_HEADER`, memory flag bits are stripped before the type code comparison.
---
## Memory Placement Flags
The type longword of a `HUNK_CODE`, `HUNK_DATA`, or `HUNK_BSS` hunk can encode a **memory placement request** in its upper bits. The loader passes the corresponding `MEMF_*` flags to `AllocMem`.
```
Bit layout of the type longword:
  31     30     29    28 ............. 0
┌────┐ ┌────┐ ┌───┐ ┌─────────────────┐
│FAST│ │CHIP│ │ 0 │ │ Hunk type code  │
└────┘ └────┘ └───┘ └─────────────────┘
```
| Bit | Constant | Value | Meaning |
|---|---|---|---|
| 30 | `HUNKF_CHIP` | `1L<<30` | Hunk **must** be in Chip RAM — required for anything the custom chips DMA from (bitmaps, audio, copper lists, sprites) |
| 31 | `HUNKF_FAST` | `1L<<31` | Hunk **prefers** Fast RAM — use for pure CPU data where DMA is not needed; avoids Chip RAM bus contention |
| 31+30 both set | *(extended)* | `0xC0000000` | Next longword in the file contains full `MEMF_*` flags for `AllocMem` — allows any combination |
| neither | *(default)* | `0` | `MEMF_PUBLIC` — any available memory |
Additional helper constants:
| Constant | Value | Meaning |
|---|---|---|
| `HUNKB_CHIP` | `30` | Bit **number** (use with `bset`/`btst`) |
| `HUNKB_FAST` | `31` | Bit **number** |
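
Splitting a type longword into its type code and placement request can be sketched as below. This sketch takes the flag positions from the NDK header `dos/doshunks.h`, where `HUNKF_CHIP` is bit 30 and `HUNKF_FAST` is bit 31, and the `MEMF_*` values from `exec/memory.h`. The `hunk_placement` struct and `decode_hunk_type` function are hypothetical names, and always including `MEMF_PUBLIC` in the result is an assumption based on the default row of the table.

```c
#include <stdint.h>

/* Values as defined in dos/doshunks.h and exec/memory.h. */
#define HUNKF_CHIP  ((uint32_t)1 << 30)
#define HUNKF_FAST  ((uint32_t)1 << 31)
#define MEMF_PUBLIC ((uint32_t)1 << 0)
#define MEMF_CHIP   ((uint32_t)1 << 1)
#define MEMF_FAST   ((uint32_t)1 << 2)

/* Hypothetical decoded form of a hunk type longword. */
struct hunk_placement {
    uint32_t type_code;  /* lower 29 bits, e.g. HUNK_CODE */
    uint32_t memf;       /* MEMF_* request, meaningful when extended == 0 */
    int extended;        /* 1: real MEMF_* flags are in the next longword */
};

static struct hunk_placement decode_hunk_type(uint32_t type_longword)
{
    struct hunk_placement p;
    uint32_t flags = type_longword & (HUNKF_CHIP | HUNKF_FAST);

    p.type_code = type_longword & 0x1FFFFFFFu;
    p.extended = (flags == (HUNKF_CHIP | HUNKF_FAST));
    p.memf = MEMF_PUBLIC;                 /* assumed default request */
    if (!p.extended) {
        if (flags & HUNKF_CHIP)
            p.memf |= MEMF_CHIP;
        if (flags & HUNKF_FAST)
            p.memf |= MEMF_FAST;
    }
    return p;
}
```

When `extended` is set, the caller is expected to read one more longword from the file and use it as the allocation flags.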
**`MEMF_*` flags** used in extended mode (from `exec/memory.h`):
| Constant | Value | Meaning |
|---|---|---|
| `MEMF_ANY` | `0` | No preference — any accessible memory |
| `MEMF_PUBLIC` | `1<<0` | Must be accessible by all tasks and hardware |
---
## References
- http://amiga-dev.wikidot.com/file-format:hunk — HUNK format reference with Python parsing code, debug format tags, and dos.library v31+ compatibility notes