mirror of
https://github.com/alfishe/amiga-bootcamp.git
synced 2026-06-13 00:26:28 +00:00
docs(amiga): complete AmigaOS 3.1/3.2 developer reference — 172 files across 17 sections
Comprehensive technical documentation covering:
- Hardware: OCS/ECS/AGA custom chip registers, Copper & Blitter deep dives
- Boot sequence: cold boot through startup-sequence
- Binary format: HUNK executable spec, relocation, debug info
- Linking & ABI: .fd files, LVO tables, register calling conventions
- Exec kernel: tasks, interrupts, memory, signals, semaphores
- AmigaDOS: file I/O, FFS/OFS layout, CLI/Shell scripting
- Graphics: planar bitmaps, Copper programming, HAM/EHB modes
- Intuition: screens, windows, IDCMP, BOOPSI
- Devices: trackdisk, SCSI, serial, timer, audio, keyboard
- Libraries: utility, expansion, IFFParse, locale, ARexx
- Networking: bsdsocket API, SANA-II, TCP/IP stack comparison
- Toolchain: GCC, vasm/vlink, SAS/C, NDK, debugging
- Reverse engineering: IDA/Ghidra setup, compiler fingerprints, case studies
- CPU & MMU: 68040/060 emulation libs, PMMU, cache management
- Driver development: SANA-II, Picasso96/RTG, AHI audio

All files include breadcrumb navigation. No local paths or proprietary content.
This commit is contained in:
parent
f07a368bf1
commit
21751c0025
172 changed files with 19701 additions and 0 deletions
51
03_loader_and_exec_format/README.md
Normal file
|
|
@ -0,0 +1,51 @@
|
|||
[← Home](../README.md)
|
||||
|
||||
# Executable Loader & HUNK Format
|
||||
|
||||
## Overview
|
||||
|
||||
This section covers the complete lifecycle of an AmigaOS executable:
|
||||
|
||||
1. **HUNK file format** — the binary container for all AmigaOS executables, libraries, and object files
|
||||
2. **Loader pipeline** — how `dos.library` loads and relocates an executable into memory
|
||||
3. **Object files** — how compilers produce relocatable object files for the linker
|
||||
4. **Overlays** — how programs larger than available memory use the overlay system
|
||||
|
||||
## Contents
|
||||
|
||||
| File | Topic |
|
||||
|---|---|
|
||||
| [hunk_format.md](hunk_format.md) | Complete HUNK binary specification |
|
||||
| [hunk_ext_deep_dive.md](hunk_ext_deep_dive.md) | HUNK_EXT: exports, imports, commons |
|
||||
| [hunk_relocation.md](hunk_relocation.md) | HUNK_RELOC32/16/8 mechanics |
|
||||
| [hunk_debug_info.md](hunk_debug_info.md) | HUNK_SYMBOL, HUNK_DEBUG (stabs) |
|
||||
| [exe_load_pipeline.md](exe_load_pipeline.md) | LoadSeg → Process creation |
|
||||
| [object_file_format.md](object_file_format.md) | Compiler object files (HUNK_UNIT) |
|
||||
| [overlay_system.md](overlay_system.md) | HUNK_OVERLAY memory segmentation |
|
||||
|
||||
## Why HUNK?
|
||||
|
||||
HUNK is the native AmigaOS executable format, used from AmigaOS 1.0 through 3.x. It predates ELF/COFF and has these key properties:
|
||||
|
||||
- **Segmented**: separate code, data, and BSS hunks with independent memory allocation
|
||||
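- **Relocatable**: all absolute references are patched at load time, because each hunk lands wherever the memory allocator finds space (the base address can differ on every run; this is a side effect of allocation, not ASLR)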
|
||||
- **Typed memory**: each hunk can request `CHIP` or `FAST` memory independently
|
||||
- **Symbol-complete**: optional HUNK_SYMBOL and HUNK_DEBUG hunks carry debugging information
|
||||
|
||||
## Key Concepts
|
||||
|
||||
| Term | Meaning |
|---|---|
| **Hunk** | One contiguous block in the binary (code, data, BSS, etc.) |
| **Segment** | A hunk after loading: one memory block, linked to the next by a `BPTR` |
| **Segment list** | Chain of loaded hunks returned by `LoadSeg()` |
| **BPTR** | BCPL pointer: a longword-aligned address stored divided by 4 (`ptr >> 2`) |
| **Relocation** | Patching absolute addresses based on actual load address |
| **LVO** | Library Vector Offset: negative offset from library base |
|
||||
|
||||
## References
|
||||
|
||||
- ADCD 2.1: `Includes_and_Autodocs_3._guide/` — dos.library LoadSeg autodoc
|
||||
- NDK39: `dos/dos.h` — BPTR, segment handling macros
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS chapter
|
||||
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node0150.html
|
||||
276
03_loader_and_exec_format/exe_load_pipeline.md
Normal file
|
|
@ -0,0 +1,276 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# Executable Load Pipeline
|
||||
|
||||
## Overview
|
||||
|
||||
This document traces the complete path from user request to a running process: `LoadSeg()` → memory allocation → relocation → `CreateProc()` → execution.
|
||||
|
||||
---
|
||||
|
||||
## Entry Points
|
||||
|
||||
| Function | Library | Description |
|
||||
|---|---|---|
|
||||
| `LoadSeg(name)` | dos.library | Load named file, return segment list |
|
||||
| `InternalLoadSeg(fh, table, funcarray, stack)` | dos.library | Low-level load from open file handle |
|
||||
| `NewLoadSeg(name, tags)` | dos.library (V36+) | Tagged version of LoadSeg |
|
||||
| `UnLoadSeg(seglist)` | dos.library | Free segment list |
|
||||
| `CreateNewProc(tags)` | dos.library | Create process from segment list |
|
||||
| `RunCommand(seg, stack, args, len)` | dos.library | Run segment in current process context |
|
||||
|
||||
---
|
||||
|
||||
## Phase 1: Parsing HUNK_HEADER
|
||||
|
||||
`InternalLoadSeg` receives an already-open file handle (`LoadSeg()` opens the file by name) and reads the header:

```
1. Read magic word: must be $000003F3 (HUNK_HEADER)
2. Read resident library list (a single 0 longword for standard executables)
3. Read num_hunks, first_hunk, last_hunk
4. Read size table: (last_hunk - first_hunk + 1) longwords
   Each longword: bits[31:30] = memory type, bits[29:0] = size in longs
```
|
||||
|
||||
The loader allocates one memory block per hunk using `AllocMem()`:
|
||||
- Memory type from the top two bits of the size longword → `MEMF_CHIP`, `MEMF_FAST`, or no special requirement (the Phase 2 code requests `MEMF_PUBLIC` in that case)
|
||||
- Size = longword_count × 4 bytes
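
The header walk described in this phase can be sketched in portable C that runs on any host. This is a minimal illustration, not the dos.library implementation: `struct hunk_header_info`, `parse_hunk_header()`, `hunk_size_bytes()` and `hunk_mem_type()` are hypothetical names, the file is assumed to be fully in memory, and a non-empty resident library list is simply rejected.

```c
#include <stddef.h>
#include <stdint.h>

#define HUNK_HEADER 0x000003F3u

/* Hypothetical result type for this sketch; not an AmigaOS structure. */
struct hunk_header_info {
    uint32_t num_hunks;     /* table size field */
    uint32_t first_hunk;
    uint32_t last_hunk;
    size_t   sizes_offset;  /* byte offset of the size table within the buffer */
};

/* Read one big-endian longword (HUNK files are big-endian). */
static uint32_t hdr_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if buf does not start with a header this sketch understands. */
int parse_hunk_header(const uint8_t *buf, size_t len, struct hunk_header_info *out)
{
    if (len < 20 || hdr_be32(buf) != HUNK_HEADER)
        return -1;
    if (hdr_be32(buf + 4) != 0)          /* non-empty resident library list: not handled */
        return -1;
    out->num_hunks  = hdr_be32(buf + 8);
    out->first_hunk = hdr_be32(buf + 12);
    out->last_hunk  = hdr_be32(buf + 16);
    if (out->last_hunk < out->first_hunk)
        return -1;
    /* One size longword per hunk in first_hunk..last_hunk must fit in the buffer. */
    if ((len - 20) / 4 <= (size_t)(out->last_hunk - out->first_hunk))
        return -1;
    out->sizes_offset = 20;
    return 0;
}

/* Size in bytes of the n-th size-table entry, with the memory-type bits masked off. */
uint32_t hunk_size_bytes(const uint8_t *buf, const struct hunk_header_info *h, uint32_t n)
{
    return (hdr_be32(buf + h->sizes_offset + 4 * (size_t)n) & 0x3FFFFFFFu) * 4u;
}

/* Memory-type field (bits 31-30) of the n-th entry: 0 = any, 1 = chip, 2 = fast, 3 = extended. */
unsigned hunk_mem_type(const uint8_t *buf, const struct hunk_header_info *h, uint32_t n)
{
    return (unsigned)(hdr_be32(buf + h->sizes_offset + 4 * (size_t)n) >> 30);
}
```

A caller would check the return value first and only then index the size table with values from 0 to `last_hunk - first_hunk`.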
|
||||
|
||||
---
|
||||
|
||||
## Phase 2: Memory Allocation
|
||||
|
||||
For each hunk (from first_hunk to last_hunk):
|
||||
|
||||
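```c
ULONG size_longs = size_table[i] & ~0xC0000000;   /* low 30 bits: size in longwords */
ULONG mem_type   = (size_table[i] >> 30) & 3;     /* bits 31-30: memory type */
ULONG memf;

switch (mem_type) {
    case 0:  memf = MEMF_PUBLIC; break;
    case 1:  memf = MEMF_CHIP;   break;
    case 2:  memf = MEMF_FAST;   break;
    default: /* 3 = extended: the next longword in the file holds the MEMF_ flags */
             memf = ext_memf;    /* ext_memf: hypothetical variable holding that longword */
             break;
}

/* 8 extra bytes: one longword for the block size, one for the BPTR link */
ULONG alloc_size = size_longs * 4 + 2 * sizeof(ULONG);
ULONG *block = AllocMem(alloc_size, memf | MEMF_CLEAR);   /* NULL check omitted */
block[0] = alloc_size;       /* size, read back when the segment is freed */
APTR seg_mem = &block[1];    /* segment pointer: address of the link field */
```

Each allocation is **8 bytes larger** than the hunk data: one longword records the block size (used later by `UnLoadSeg()`), and one holds the BPTR link to the next segment.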
|
||||
|
||||
---
|
||||
|
||||
## Phase 3: Loading Hunk Data
|
||||
|
||||
The loader then reads the hunk records sequentially from the file:

```
for each record until end of file:
    switch (hunk_type):
        HUNK_CODE, HUNK_DATA:
            read num_longs × 4 bytes into segment memory
        HUNK_BSS:
            read the size longword only; the block is already zero-filled by AllocMem(MEMF_CLEAR)
        HUNK_RELOC32:
            store for Phase 4 (apply after all hunks loaded)
        HUNK_SYMBOL, HUNK_DEBUG:
            read and discard (or pass to debugger hook)
        HUNK_END:
            advance to next hunk
```
|
||||
|
||||
---
|
||||
|
||||
## Phase 4: Relocation Pass
|
||||
|
||||
After all hunks are loaded and their base addresses are known:
|
||||
|
||||
```
|
||||
for each HUNK_RELOC32 in hunk H:
    for each (target_hunk, offsets[]):
        base = segment_base[target_hunk]
        for each offset:
            patch = (ULONG *)(segment_base[H] + offset)
            *patch += base    /* add actual load address */
|
||||
```
|
||||
|
||||
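Relocation is shown here as a separate pass for clarity. A `HUNK_RELOC32` entry may reference any hunk, including one whose contents have not been read yet; this is safe because every segment was allocated in Phase 2, so all base addresses are already known by the time the first reloc block is seen.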
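
The patch step can be tried out on a host machine with a small sketch like the one below. It assumes the hunk contents sit in a plain byte buffer and that the longwords are big-endian, as on the 68k; `apply_reloc32()` is a hypothetical helper name, not a dos.library function.

```c
#include <stddef.h>
#include <stdint.h>

/* Add target_base to the big-endian longword at each offset inside hunk.
 * hunk/hunk_len describe the already-loaded contents of the hunk being patched.
 * Returns the number of patches applied; stops at the first out-of-range offset. */
size_t apply_reloc32(uint8_t *hunk, size_t hunk_len,
                     const uint32_t *offsets, size_t count,
                     uint32_t target_base)
{
    size_t done = 0;
    for (size_t i = 0; i < count; i++) {
        uint32_t off = offsets[i];
        if (hunk_len < 4 || off > hunk_len - 4)
            break;                                  /* patch site outside the hunk */
        uint8_t *p = hunk + off;
        uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        v += target_base;                           /* wraps modulo 2^32 */
        p[0] = (uint8_t)(v >> 24);
        p[1] = (uint8_t)(v >> 16);
        p[2] = (uint8_t)(v >> 8);
        p[3] = (uint8_t)v;
        done++;
    }
    return done;
}
```

The value stored at the patch site before relocation is the offset of the referenced item within the target hunk, so adding the target's base address yields the final absolute address.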
|
||||
|
||||
---
|
||||
|
||||
## Phase 5: Segment List Construction
|
||||
|
||||
The segments are chained as a **BPTR list**:
|
||||
|
||||
```
|
||||
Segment 0 memory:
    [BPTR → seg1]    (4 bytes)
    [code data...]

Segment 1 memory:
    [BPTR → 0]       NULL = end of list
    [data...]
|
||||
```
|
||||
|
||||
`LoadSeg()` returns `MKBADDR(seg0_mem)` — a BPTR to the first segment (the memory address right-shifted by 2, as required by the BCPL pointer convention).
|
||||
|
||||
Converting BPTR to real address:
|
||||
```c
|
||||
APTR addr = BADDR(seglist); /* = seglist << 2 */
|
||||
```
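
For experimenting off-target, the two conversions can be mirrored in portable C. This is only a sketch: `bptr_t` and the three helper names are stand-ins invented here, while real Amiga code should use the `BADDR()` and `MKBADDR()` macros from the NDK.

```c
#include <stdint.h>

typedef uint32_t bptr_t;   /* stand-in for the NDK's BPTR on a non-Amiga host */

/* BPTR -> machine address: multiply by 4. */
static inline uint32_t bptr_to_addr(bptr_t b)
{
    return b << 2;
}

/* Machine address -> BPTR: divide by 4 (only exact for longword-aligned addresses). */
static inline bptr_t addr_to_bptr(uint32_t addr)
{
    return addr >> 2;
}

/* An address can be represented as a BPTR only if its low two bits are zero. */
static inline int addr_is_bptr_safe(uint32_t addr)
{
    return (addr & 3u) == 0;
}
```

The alignment check matters because the shift silently discards the low two bits: an unaligned address does not survive a round trip.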
|
||||
|
||||
---
|
||||
|
||||
## Phase 6: Process Creation
|
||||
|
||||
`CreateNewProc()` (or the old `CreateProc()`) takes the segment list and creates a new AmigaOS process:
|
||||
|
||||
```c
|
||||
struct Process *proc = CreateNewProcTags(
|
||||
NP_Seglist, seglist,
|
||||
NP_Name, "MyProgram",
|
||||
NP_StackSize, 8192,
|
||||
NP_Priority, 0,
|
||||
NP_CommandName, cmd_string,
|
||||
TAG_DONE);
|
||||
```
|
||||
|
||||
Internally this builds a `struct Process`, hands it to exec.library `AddTask()`, and initialises:
- `Process->pr_SegList`: the segment list BPTR
- Stack: allocated via `AllocMem(NP_StackSize, MEMF_PUBLIC)`, recorded in `pr_StackBase` and `pr_StackSize`
- `tc_SPLower` / `tc_SPUpper`: stack bounds
- `tc_SPReg`: initial stack pointer (top of stack)
- `pr_GlobVec`: global vector (BCPL compat, not used by C programs)
- `pr_CLI`: CLI structure if launched from Shell
|
||||
|
||||
---
|
||||
|
||||
## Phase 7: Entry Point
|
||||
|
||||
The process starts executing at the **first word of hunk 0** (the first loaded segment). This is not `main()` — it is the startup code (`_start` / `c.o`):
|
||||
|
||||
```asm
|
||||
; c.o (SAS/C startup):
_start:
        MOVE.L  4.W,A6              ; SysBase
        MOVE.L  A0,_CommandStr      ; raw command line from dos.library
        BSR     __main              ; C runtime init
        ...
        MOVE.L  _ExitCode,D0        ; return value
        RTS
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## CLI vs Workbench Launch
|
||||
|
||||
| Parameter | CLI Launch | Workbench Launch |
|---|---|---|
| `pr_CLI` | Non-NULL, points to CLI struct | NULL |
| Startup message | None | `WBStartup` message queued at `pr_MsgPort` |
| A0 at entry | Command string pointer | Not meaningful |
| D0 at entry | Command string length | Not meaningful |
| Return | `dos.library` handles exit | Must `Forbid(); ReplyMsg(wb_msg)` |
|
||||
|
||||
Startup code detects the launch type:
|
||||
```c
struct Process *pr = (struct Process *)FindTask(NULL);   /* current process */
struct WBStartup *WBMsg = NULL;

if (pr->pr_CLI) {
    /* CLI launch: use command string */
} else {
    /* WB launch: wait for the WBStartup message; reply to it just before exit */
    WaitPort(&pr->pr_MsgPort);
    WBMsg = (struct WBStartup *)GetMsg(&pr->pr_MsgPort);
}
```
|
||||
|
||||
---
|
||||
|
||||
## State Machine Diagram
|
||||
|
||||
```mermaid
|
||||
stateDiagram-v2
|
||||
[*] --> ParseHeader : LoadSeg(name)
|
||||
ParseHeader --> AllocHunks : HUNK_HEADER valid
|
||||
AllocHunks --> LoadData : AllocMem() per hunk
|
||||
LoadData --> Relocate : all hunks read
|
||||
Relocate --> BuildSegList : all patches applied
|
||||
BuildSegList --> CreateProc : BPTR chain complete
|
||||
CreateProc --> Running : AddTask() + Dispatch
|
||||
Running --> [*] : process exits, UnLoadSeg()
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## UnLoadSeg — Freeing Memory
|
||||
|
||||
```c
|
||||
UnLoadSeg(seglist);
|
||||
/* Walks the BPTR chain, FreeMem() each segment block */
|
||||
```
|
||||
|
||||
The longword immediately before the BPTR link in each segment block records the allocation size (stored by the loader), which tells `UnLoadSeg()` how much to free:
|
||||
```
|
||||
seg_mem - 4 : size of this allocation in bytes
|
||||
seg_mem + 0 : BPTR to next segment
|
||||
seg_mem + 4 : hunk data
|
||||
```
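
The layout above is enough to walk a segment list by hand. The sketch below simulates this on a host: "memory" is a byte array, addresses are offsets into it, and longwords are big-endian. `walk_seglist()` is a hypothetical helper written for this illustration, and it only reads the list rather than freeing anything.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t seg_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* mem is a host-side image in which addresses are offsets into mem.
 * seglist is a BPTR (address >> 2) to the link field of the first segment.
 * Returns the number of segments visited and sums the size longwords
 * stored at link - 4 into *total_bytes. */
size_t walk_seglist(const uint8_t *mem, size_t mem_len,
                    uint32_t seglist, uint32_t *total_bytes)
{
    size_t count = 0;
    *total_bytes = 0;
    /* Each segment needs at least 8 bytes, which bounds the loop on corrupt input. */
    while (seglist != 0 && count < mem_len / 8) {
        uint32_t link = seglist << 2;               /* BPTR -> address */
        if (link < 4 || link > mem_len - 4)
            break;                                  /* link field outside the image */
        *total_bytes += seg_be32(mem + link - 4);   /* allocation size */
        seglist = seg_be32(mem + link);             /* BPTR to next segment, 0 = end */
        count++;
    }
    return count;
}
```

A real unloader would free each block at `link - 4` using the stored size before following the link; the read-only version here is just meant to make the layout concrete.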
|
||||
|
||||
---
|
||||
|
||||
## Hunk Types Reference
|
||||
|
||||
Complete enumeration of all hunk type codes as defined in `NDK39: dos/doshunks.h`. Types are listed in numeric order. The **Context** column indicates where each hunk may legally appear: **Exec** = loadable executable, **Obj** = relocatable object file (HUNK_UNIT stream), **Both** = either.
|
||||
|
||||
| Hex | Dec | Name | Context | Purpose | Typical Content | Notes / Limits |
|
||||
|---|---|---|---|---|---|---|
|
||||
| `$3E7` | 999 | `HUNK_UNIT` | Obj | Marks the start of a relocatable object file unit | 1 longword = name length in longs, then the unit name string (padded to longword boundary) | Must be the first record in every `.o` file. Not present in final executables. |
|
||||
| `$3E8` | 1000 | `HUNK_NAME` | Obj | Names a section within a HUNK_UNIT | 1 longword = name length, then the section name string | Optional; precedes the code/data/BSS hunk it names. Linker uses for diagnostics. |
|
||||
| `$3E9` | 1001 | `HUNK_CODE` | Both | Machine-code section | 1 longword = size in 32-bit longs; then *size×4* bytes of 68k opcodes | Loaded into RAM. Bits 31–30 of the type word select memory type (see HUNKF_* flags). Max size is bounded by the 30-bit size field in the header (2^30 longs). |
|
||||
| `$3EA` | 1002 | `HUNK_DATA` | Both | Initialized read/write data section | 1 longword = size in longs; then *size×4* bytes of raw data | Same memory-type flags as HUNK_CODE. Pointer tables here require HUNK_RELOC32 fixups. |
|
||||
| `$3EB` | 1003 | `HUNK_BSS` | Both | Uninitialized (zero-filled) data section | 1 longword = size in longs | No data follows — the loader's `AllocMem(MEMF_CLEAR)` zeroes the block. Bitmap/audio buffers use this. |
|
||||
| `$3EC` | 1004 | `HUNK_RELOC32` / `HUNK_ABSRELOC32` | Both | Absolute 32-bit address fixup table | Pairs of `(num_offsets, target_hunk)` followed by *num_offsets* longword byte-offsets; terminated by `num_offsets=0` | Each patch site: `*(ULONG*)(hunk_base+offset) += target_base`. Offsets must be longword-aligned. Unlimited entries. Most common reloc type. |
|
||||
| `$3ED` | 1005 | `HUNK_RELOC16` / `HUNK_RELRELOC16` | Obj | PC-relative or absolute 16-bit fixup | Same structure as HUNK_RELOC32 but patches a UWORD | Rarely generated. The 68k only supports 16-bit displacements in `Bcc`/`BSR`; linkers prefer PC-relative code instead. |
|
||||
| `$3EE` | 1006 | `HUNK_RELOC8` / `HUNK_RELRELOC8` | Obj | 8-bit fixup | Same structure, patches a UBYTE | Extremely rare. Only useful for short-branch offsets inside a single hunk. |
|
||||
| `$3EF` | 1007 | `HUNK_EXT` | Obj | External symbol table (imports + exports) | Sequence of `(type_namelen, name, value/refs)` entries; terminated by longword `$00000000` | **Not present in executables** — linker resolves all externals into HUNK_RELOC32 at link time. See EXT_DEF, EXT_REF32, EXT_COMMON sub-types in `hunk_ext_deep_dive.md`. |
|
||||
| `$3F0` | 1008 | `HUNK_SYMBOL` | Both | Local (non-exported) symbol table for debugging | Pairs of `(name_len, name, value)`; terminated by `name_len=0` | Ignored by the OS loader; used only by debuggers (MonAm, wack, IDA). Strip with `slink NODBG` or `m68k-amigaos-strip --strip-debug`. No limit on entry count. |
|
||||
| `$3F1` | 1009 | `HUNK_DEBUG` | Both | Arbitrary debugger data block | 1 longword = size in longs; then *size×4* bytes of opaque data | First longword is often a format tag: `$3D415053` = SAS/C stabs, `$3D474343` = GCC stabs. Ignored by loader. Can hold DWARF, stabs, or proprietary data. |
|
||||
| `$3F2` | 1010 | `HUNK_END` | Both | Marks the end of one logical hunk | No data — bare type longword only | **Required** after every code/data/BSS + its reloc/symbol records. Loader advances to the next segment slot when this is seen. |
|
||||
| `$3F3` | 1011 | `HUNK_HEADER` | Exec | Executable file header — magic number + segment size table | Resident-lib list (always 0), num_hunks, first_hunk, last_hunk, then one size longword per hunk | **Must be the very first record** in a loadable executable. Object files use HUNK_UNIT instead. Each size longword: bits 31–30 = memory type, bits 29–0 = size in longs. |
|
||||
| `$3F5` | 1013 | `HUNK_OVERLAY` | Exec | Overlay table — describes on-demand swap-in groups | 1 longword = overlay data size; then the overlay tree structure (node count, per-node hunk counts and sizes) | Follows the resident hunks in the file. Obsolete in modern Amiga software; replaced by `OpenLibrary()`. AmigaOS `InternalLoadSeg` supports it but it is rarely seen after 1990. |
|
||||
| `$3F6` | 1014 | `HUNK_BREAK` | Exec | End-of-overlay-tree sentinel | No data | Immediately follows the overlay tree data. Required for `InternalLoadSeg` to know where the overlay definition ends. |
|
||||
| `$3F7` | 1015 | `HUNK_DREL32` | Both | Compact 32-bit base-relative relocation (word-sized fields) | Pairs of `(num_offsets:WORD, target_hunk:WORD)` followed by *num_offsets* WORDs; terminated by WORD `0` | Used by BLink and some third-party linkers. More compact than HUNK_RELOC32 for small programs (all offsets fit in 16 bits, hunks < 64 KB). Supported by `InternalLoadSeg`. |
|
||||
| `$3F8` | 1016 | `HUNK_DREL16` | Obj | Compact 16-bit base-relative relocation | Same word-field structure as HUNK_DREL32, patches UWORD | Very rare; primarily in object files from BLink-family toolchains. |
|
||||
| `$3F9` | 1017 | `HUNK_DREL8` | Obj | Compact 8-bit base-relative relocation | Same word-field structure, patches UBYTE | Essentially unused in practice. |
|
||||
| `$3FA` | 1018 | `HUNK_LIB` | Obj | Static library archive container | Sequence of embedded HUNK_UNIT blocks (each a full `.o`) preceded by their individual sizes | Output of `ar`-equivalent tools (`ar68k`, AmigaOS `join`). The linker extracts individual units from this container as needed. Not executed directly. |
|
||||
| `$3FB` | 1019 | `HUNK_INDEX` | Obj | Symbol index for a HUNK_LIB archive | String table + per-unit symbol-name / hunk-number mappings | Allows the linker to locate a specific symbol without scanning all units. Immediately follows HUNK_LIB. |
|
||||
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | Both | Compact 32-bit absolute relocation (word-sized offsets) | Same compact word-field structure as HUNK_DREL32 but semantically identical to HUNK_RELOC32 | Added in AmigaOS 3.x era. Reduces reloc-table file size when all patch offsets fit in 16 bits. Some linkers (e.g., vasm/vlink) emit this by default. |
|
||||
| `$3FD` | 1021 | `HUNK_RELRELOC32` | Both | PC-relative 32-bit relocation | Same longword-field structure as HUNK_RELOC32; patches a 32-bit displacement rather than an absolute address | Used by GCC `-fPIC` output and shared-library position-independent code. Patch: `*(LONG*)(base+offset) += target_base − (base+offset+4)`. |
|
||||
| `$3FE` | 1022 | `HUNK_ABSRELOC16` | Both | Absolute 16-bit relocation | Same longword-field structure as HUNK_RELOC32, patches a UWORD with the lower 16 bits of the target address | Required when code uses `MOVE.W #abs_addr, Dn` with a truncated 16-bit address constant. Rare in well-structured programs. |
|
||||
|
||||
### Memory-Type Flags (Bits 31–30 of Hunk Type Word)

These flags may be ORed into the type longword of `HUNK_CODE`, `HUNK_DATA`, and `HUNK_BSS` to control memory placement:

| Bit 31 (`HUNKF_FAST`) | Bit 30 (`HUNKF_CHIP`) | Meaning | AllocMem flag |
|---|---|---|---|
| 0 | 0 | Any memory (default) | `MEMF_PUBLIC` |
| 0 | 1 | Chip RAM required | `MEMF_CHIP` |
| 1 | 0 | Fast RAM preferred | `MEMF_FAST` |
| 1 | 1 | Extended: next longword specifies full `MEMF_*` flags | Taken from that longword |

Example: `HUNK_CODE` forced into Chip RAM = `$400003E9` (`HUNK_CODE | HUNKF_CHIP`).
|
||||
|
||||
### HUNKB_ADVISORY (Bit 29 of the *type word itself*)
|
||||
|
||||
When **bit 29** is set in an otherwise-unknown hunk type, AmigaOS `InternalLoadSeg` treats it like `HUNK_DEBUG` (reads and discards the block) instead of failing with an error. This allows future hunk types to be added without breaking older loaders.
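
A type longword can be taken apart with a few masks. The sketch below uses the bit numbers from `dos/doshunks.h` (`HUNKB_ADVISORY` = 29, `HUNKB_CHIP` = 30, `HUNKB_FAST` = 31) but redefines them locally so it compiles on any host; the `hunk_*` helper functions are hypothetical names for this illustration.

```c
#include <stdint.h>

#define HUNKB_ADVISORY 29
#define HUNKB_CHIP     30
#define HUNKB_FAST     31
#define HUNKF_ADVISORY ((uint32_t)1 << HUNKB_ADVISORY)
#define HUNKF_CHIP     ((uint32_t)1 << HUNKB_CHIP)
#define HUNKF_FAST     ((uint32_t)1 << HUNKB_FAST)

/* Hunk type code with all three flag bits removed. */
uint32_t hunk_base_type(uint32_t w)
{
    return w & ~(HUNKF_ADVISORY | HUNKF_CHIP | HUNKF_FAST);
}

/* Chip RAM requested: bit 30 set, bit 31 clear. */
int hunk_wants_chip(uint32_t w)
{
    return (w & HUNKF_CHIP) != 0 && (w & HUNKF_FAST) == 0;
}

/* Fast RAM requested: bit 31 set, bit 30 clear. */
int hunk_wants_fast(uint32_t w)
{
    return (w & HUNKF_FAST) != 0 && (w & HUNKF_CHIP) == 0;
}

/* Both bits set: a following longword carries the full MEMF_ flags. */
int hunk_has_extended_flags(uint32_t w)
{
    return (w & (HUNKF_CHIP | HUNKF_FAST)) == (HUNKF_CHIP | HUNKF_FAST);
}

/* Advisory hunk: a loader that does not know the type may skip it. */
int hunk_is_advisory(uint32_t w)
{
    return (w & HUNKF_ADVISORY) != 0;
}
```

For example, `$400003E9` decodes to base type `$3E9` (`HUNK_CODE`) with the chip flag set.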
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- NDK39: `dos/dos.h`, `dos/dosextens.h` — Process, CLI structures
|
||||
- ADCD 2.1 Autodocs: `LoadSeg`, `InternalLoadSeg`, `CreateNewProc`
|
||||
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node0150.html
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS chapter
|
||||
163
03_loader_and_exec_format/hunk_debug_info.md
Normal file
|
|
@ -0,0 +1,163 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# HUNK Debug Information
|
||||
|
||||
## Overview
|
||||
|
||||
Two optional hunk types carry debug information in AmigaOS executables:
|
||||
|
||||
- **HUNK_SYMBOL** ($3F0) — a simple name→offset symbol table
|
||||
- **HUNK_DEBUG** ($3F1) — arbitrary debug data (most commonly stabs records)
|
||||
|
||||
Both are ignored by the loader and only used by debuggers.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_SYMBOL
|
||||
|
||||
The simplest debug hunk. Contains a list of (name, offset) pairs for the current hunk:
|
||||
|
||||
```
|
||||
HUNK_SYMBOL ($000003F0)
|
||||
|
||||
[Repeat:]
|
||||
<name_len> Length of name in longwords (0 terminates)
|
||||
<name...> Symbol name padded to longword boundary
|
||||
<value> Symbol value (offset within current hunk)
|
||||
|
||||
<0> Terminator: name_len = 0
|
||||
HUNK_END
|
||||
```
|
||||
|
||||
### Example
|
||||
|
||||
Two symbols in a code hunk:
|
||||
```
|
||||
$000003F0 HUNK_SYMBOL
|
||||
$00000001 name = 1 long (4 chars)
|
||||
"_foo" symbol name
|
||||
$00000000 at offset 0
|
||||
$00000002 name = 2 longs (8 chars)
|
||||
"_bar\0\0\0\0"
|
||||
$00000040 at offset $40
|
||||
$00000000 terminator
|
||||
```
|
||||
|
||||
### Use in Debuggers
|
||||
|
||||
MonAm, wack, and IDA Pro all parse `HUNK_SYMBOL` to provide named labels in the disassembly. IDA's Amiga loader maps these directly to function/data names.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_DEBUG
|
||||
|
||||
`HUNK_DEBUG` carries arbitrary debug data. The most common format used by AmigaOS compilers is **stabs** records (as produced by SAS/C 6.x and GCC).
|
||||
|
||||
```
|
||||
HUNK_DEBUG ($000003F1)
|
||||
|
||||
<size_in_longs> Total size of the debug data in longwords
|
||||
<data...> Compiler-specific debug data
|
||||
HUNK_END
|
||||
```
|
||||
|
||||
### SAS/C Stabs Format
|
||||
|
||||
SAS/C 6.x emits stabs-format debug info. The first longword in the debug data is a tag identifying the format:
|
||||
|
||||
```
|
||||
$3D415053 Tag = "=APS" — SAS/C stabs
|
||||
```
|
||||
|
||||
Following the tag: standard BSD/UNIX stabs records:
|
||||
```c
|
||||
struct stab_entry {
|
||||
ULONG n_strx; /* offset into string table */
|
||||
UBYTE n_type; /* stab type code */
|
||||
UBYTE n_other;
|
||||
UWORD n_desc; /* line number or misc */
|
||||
ULONG n_value; /* symbol value */
|
||||
};
|
||||
```
|
||||
|
||||
**Common stab type codes:**
|
||||
| Code | Name | Meaning |
|---|---|---|
| $20 | `N_GSYM` | Global symbol |
| $24 | `N_FUN` | Function start |
| $40 | `N_RSYM` | Register variable |
| $44 | `N_SLINE` | Source line number |
| $64 | `N_SO` | Source file name |
| $80 | `N_LSYM` | Local symbol / type |
|
||||
|
||||
### GCC Stabs Format
|
||||
|
||||
GCC (`m68k-amigaos-gcc`) emits similar stabs, usually with tag `$3D474343` ("=GCC") or no tag at all.
|
||||
|
||||
### Line Number Information
|
||||
|
||||
Stabs records with `N_SLINE` provide source-to-address mapping, enabling source-level debugging in tools like wack:
|
||||
|
||||
```
|
||||
N_SLINE: n_desc = source_line_number
|
||||
n_value = offset in current code hunk
|
||||
```
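
A host-side decoder for one 12-byte stab record might look like the sketch below. It assumes the fields are stored big-endian, as a 68k compiler would write them; `struct stab_record`, `decode_stab()` and `stab_line_info()` are hypothetical names, and the type code used for `N_SLINE` is the standard stabs value `0x44`.

```c
#include <stdint.h>

#define N_SLINE 0x44u   /* standard stabs code for a source-line record */

/* Host-side mirror of the 12-byte record; fields are decoded explicitly,
 * so the in-memory layout of this struct does not matter. */
struct stab_record {
    uint32_t n_strx;
    uint8_t  n_type;
    uint8_t  n_other;
    uint16_t n_desc;
    uint32_t n_value;
};

/* Decode one record from 12 raw bytes (assumed big-endian). */
struct stab_record decode_stab(const uint8_t *p)
{
    struct stab_record s;
    s.n_strx  = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    s.n_type  = p[4];
    s.n_other = p[5];
    s.n_desc  = (uint16_t)(((unsigned)p[6] << 8) | p[7]);
    s.n_value = ((uint32_t)p[8] << 24) | ((uint32_t)p[9] << 16) |
                ((uint32_t)p[10] << 8) |  (uint32_t)p[11];
    return s;
}

/* Returns 1 and fills line/offset if the record is an N_SLINE entry, else 0. */
int stab_line_info(const struct stab_record *s, unsigned *line, uint32_t *offset)
{
    if (s->n_type != N_SLINE)
        return 0;
    *line   = s->n_desc;    /* source line number */
    *offset = s->n_value;   /* offset in the current code hunk */
    return 1;
}
```

Collecting the `(offset, line)` pairs from every `N_SLINE` record and sorting them by offset gives the lookup table a debugger needs to map an address back to a source line.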
|
||||
|
||||
---
|
||||
|
||||
## Reading Debug Info in IDA Pro
|
||||
|
||||
IDA Pro's Amiga HUNK loader (standard IDA plugin) parses:
|
||||
- `HUNK_SYMBOL` → applies as function/data names automatically
|
||||
- `HUNK_DEBUG` → partially parsed; stab `N_FUN` entries become function names
|
||||
|
||||
To see IDA's parsed symbols after loading:
|
||||
- `View → Open Subviews → Names` — all named locations including HUNK_SYMBOL entries
|
||||
- `View → Open Subviews → Segments` — hunk-to-segment mapping
|
||||
|
||||
---
|
||||
|
||||
## Stripping Debug Info
|
||||
|
||||
To produce a smaller executable without debug info:
|
||||
|
||||
**SAS/C:**
|
||||
```
|
||||
slink lib/c.o + myobj.o TO myexe NODBG
|
||||
```
|
||||
|
||||
**GCC:**
|
||||
```
|
||||
m68k-amigaos-strip --strip-debug myexe
|
||||
```
|
||||
|
||||
This removes HUNK_SYMBOL and HUNK_DEBUG records, reducing file size.
|
||||
|
||||
---
|
||||
|
||||
## Worked Hex Example (HUNK_SYMBOL)
|
||||
|
||||
Binary fragment from a real executable:
|
||||
```
|
||||
Offset Hex Bytes Decoded
|
||||
$1A00: 00 00 03 F0 HUNK_SYMBOL
|
||||
$1A04: 00 00 00 02 name_len = 2 (8 bytes, NUL-padded)
|
||||
$1A08: 5F 6D 61 69 "_mai"
|
||||
$1A0C: 6E 00 00 00 "n\0\0\0"
|
||||
$1A10: 00 00 00 00 value = 0 (entry at start of code hunk)
|
||||
$1A14: 00 00 00 02 name_len = 2 (8 chars)
|
||||
$1A18: 5F 70 72 6F "_pro"
|
||||
$1A1C: 63 65 73 73 "cess"
|
||||
$1A20: 00 00 00 78 value = $78
|
||||
$1A24: 00 00 00 00 terminator
|
||||
$1A28: 00 00 03 F2 HUNK_END
|
||||
```
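
A record of this shape can be scanned with a short routine. The sketch below is an assumption-laden illustration rather than any tool's actual parser: it takes the raw bytes of one `HUNK_SYMBOL` record in memory, treats names as NUL-padded to a longword boundary, and uses the hypothetical function name `hunk_symbol_lookup()`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HUNK_SYMBOL 0x000003F0u

static uint32_t sym_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Looks up name in a HUNK_SYMBOL record that starts at buf.
 * Returns 1 and stores the symbol value if found, 0 otherwise
 * (including truncated or malformed input). */
int hunk_symbol_lookup(const uint8_t *buf, size_t len,
                       const char *name, uint32_t *value)
{
    size_t want = strlen(name);
    size_t pos = 4;

    if (len < 8 || sym_be32(buf) != HUNK_SYMBOL)
        return 0;
    for (;;) {
        if (len - pos < 4)
            return 0;                       /* ran off the end without a terminator */
        uint32_t name_longs = sym_be32(buf + pos);
        pos += 4;
        if (name_longs == 0)
            return 0;                       /* terminator: symbol not present */
        if (name_longs >= (len - pos) / 4)
            return 0;                       /* name plus value would not fit */
        const uint8_t *stored = buf + pos;
        size_t name_bytes = 4 * (size_t)name_longs;
        size_t stored_len = 0;
        while (stored_len < name_bytes && stored[stored_len] != 0)
            stored_len++;                   /* length without the NUL padding */
        pos += name_bytes;
        uint32_t v = sym_be32(buf + pos);
        pos += 4;
        if (stored_len == want && memcmp(stored, name, want) == 0) {
            *value = v;
            return 1;
        }
    }
}
```

Note that a name which fills its longwords exactly, such as `_process`, has no terminating NUL in the file, which is why the stored length is measured instead of calling `strlen()` on the record.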
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- NDK39: `dos/doshunks.h` — HUNK_SYMBOL, HUNK_DEBUG constants
|
||||
- SAS/C 6.x Programmer's Guide — debug output format
|
||||
- GCC internals — stabs format documentation
|
||||
- IDA Pro Amiga loader source (community) — stabs parsing
|
||||
164
03_loader_and_exec_format/hunk_ext_deep_dive.md
Normal file
|
|
@ -0,0 +1,164 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# HUNK_EXT — Exports and Imports
|
||||
|
||||
## Overview
|
||||
|
||||
`HUNK_EXT` ($000003EF) is the external symbol record used in **object files** (HUNK_UNIT format). It carries both **exported symbols** (definitions visible to the linker) and **imported symbols** (references to symbols defined in other object files or libraries).
|
||||
|
||||
`HUNK_EXT` does **not** appear in final executables — the linker resolves all externals during the link step and emits `HUNK_RELOC32` instead.
|
||||
|
||||
---
|
||||
|
||||
## Record Format
|
||||
|
||||
```
|
||||
HUNK_EXT ($000003EF)
|
||||
|
||||
[Repeat until terminator:]
|
||||
<type_and_namelen> Longword: bits[31:24] = EXT type, bits[23:0] = name length in longs
|
||||
<name...> Symbol name, padded to longword boundary
|
||||
[type-specific data]
|
||||
|
||||
<0x00000000> Terminator (name length = 0)
|
||||
HUNK_END
|
||||
```
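
Splitting the leading longword of each entry takes two masks. The helpers below are a minimal sketch with hypothetical names; the "reference" test relies on the observation that every import type in `dos/doshunks.h` is 129 or higher, that is, has the top bit of the type byte set.

```c
#include <stdint.h>

/* EXT type: top 8 bits of the entry's first longword. */
static inline unsigned ext_type(uint32_t w)
{
    return (unsigned)(w >> 24);
}

/* Name length in longwords: low 24 bits. */
static inline uint32_t ext_name_longs(uint32_t w)
{
    return w & 0x00FFFFFFu;
}

/* Types >= 128 are references (imports); lower values are definitions. */
static inline int ext_is_reference(uint32_t w)
{
    return (w & 0x80000000u) != 0;
}
```

A whole-longword value of zero is the terminator, so a parser should test for that before splitting the fields.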
|
||||
|
||||
---
|
||||
|
||||
## EXT Type Codes
|
||||
|
||||
| Value | Name | Direction | Description |
|
||||
|---|---|---|---|
|
||||
| 0 | `EXT_SYMB` | export | Symbol table entry (debug symbol, as in `HUNK_SYMBOL`) |
| 1 | `EXT_DEF` | export | Relocatable definition: symbol at an offset in the current hunk |
| 2 | `EXT_ABS` | export | Absolute definition (value is not relative to any hunk) |
| 3 | `EXT_RES` | export | Resident library definition (obsolete) |
|
||||
| 129 | `EXT_REF32` | import | 32-bit reference (absolute) |
|
||||
| 130 | `EXT_COMMON` | import/def | Common BSS block |
|
||||
| 131 | `EXT_REF16` | import | 16-bit reference |
|
||||
| 132 | `EXT_REF8` | import | 8-bit reference |
|
||||
| 133 | `EXT_DEXT32` | import | 32-bit data-relative reference |
|
||||
| 134 | `EXT_DEXT16` | import | 16-bit data-relative reference |
|
||||
| 135 | `EXT_DEXT8` | import | 8-bit data-relative reference |
|
||||
|
||||
---
|
||||
|
||||
## Export Records
|
||||
|
||||
### EXT_DEF — Defined Symbol
|
||||
|
||||
The most common export — a function or data label at a fixed offset within the current hunk:
|
||||
|
||||
```
|
||||
bits[31:24] = 0x01 EXT_DEF
|
||||
bits[23:0] = name_longs name length in longwords
|
||||
<name...> symbol name (padded)
|
||||
<offset> offset within current hunk (longword)
|
||||
```
|
||||
|
||||
Example: exporting `_init` at offset 0x10 in the code hunk:
|
||||
```
|
||||
$01000002       ; EXT_DEF, name = 2 longs (8 bytes)
"_ini"
"t\0\0\0"       ; NUL-padded to 2 longs
$00000010       ; offset $10
|
||||
```
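
The name-length field counts longwords, not characters, so a five-character name such as `_init` occupies two longwords. A small sketch of the rounding and padding, using the hypothetical helper names `hunk_name_longs()` and `hunk_pad_name()` and assuming NUL padding:

```c
#include <stddef.h>
#include <string.h>

/* Number of longwords needed to store name, rounded up. */
size_t hunk_name_longs(const char *name)
{
    return (strlen(name) + 3) / 4;
}

/* Writes name into out, NUL-padded to a whole number of longwords.
 * Returns the number of bytes written, or 0 if out is too small
 * (an empty name also yields 0, since it needs no bytes). */
size_t hunk_pad_name(const char *name, unsigned char *out, size_t out_len)
{
    size_t len = strlen(name);
    size_t padded = hunk_name_longs(name) * 4;
    if (padded > out_len)
        return 0;
    memset(out, 0, padded);     /* padding bytes are zero */
    memcpy(out, name, len);
    return padded;
}
```

With this rounding, `_foo` needs one longword, while `_init` and `_DOSBase` each need two.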
|
||||
|
||||
### EXT_ABS — Absolute Symbol
|
||||
|
||||
Symbol has an absolute value (not relative to any hunk):
|
||||
```
|
||||
bits[31:24] = 0x02
|
||||
<name...>
|
||||
<absolute_value>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Import Records
|
||||
|
||||
### EXT_REF32 — 32-bit Reference
|
||||
|
||||
Used when the current object references an external symbol. The linker patches these after symbol resolution.
|
||||
|
||||
```
|
||||
bits[31:24] = 0x81 EXT_REF32
|
||||
bits[23:0] = name_longs
|
||||
<name...> symbol name being referenced
|
||||
<num_refs> number of reference sites in current hunk
|
||||
<ref_offset_0> byte offset within current hunk
|
||||
<ref_offset_1>
|
||||
...
|
||||
```
|
||||
|
||||
Example: a hunk that references the external symbol `_DOSBase` at three sites:
|
||||
```
|
||||
$81000002 ; EXT_REF32, name = 2 longs (8 bytes)
|
||||
"_DOS"
|
||||
"Base"
|
||||
$00000003 ; 3 reference sites
|
||||
$0000001C ; offset $1C
|
||||
$00000034 ; offset $34
|
||||
$00000048 ; offset $48
|
||||
```
|
||||
|
||||
### EXT_COMMON — Common Block (BSS)
|
||||
|
||||
Uninitialized data shared across multiple object files (like an uninitialized C global `int x;` that appears in several files). The linker allocates one block, and all references point to it:
|
||||
|
||||
```
|
||||
bits[31:24] = 0x82 EXT_COMMON
|
||||
<name...>
|
||||
<size> size in bytes (the common block size)
|
||||
<num_refs> reference sites
|
||||
<offsets...>
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Worked Binary Example
|
||||
|
||||
Object file exporting `_foo` (at offset 0) and importing `_puts`:
|
||||
|
||||
```
|
||||
Offset Hex Meaning
|
||||
$00: 00 00 03 E7 HUNK_UNIT
|
||||
$04: 00 00 00 01 name length = 1 long
|
||||
$08: "foo\0" unit name = "foo"
|
||||
$0C: 00 00 03 E9 HUNK_CODE
|
||||
$10: 00 00 00 08 8 longwords = 32 bytes of code
|
||||
$14: [32 bytes of code...]
|
||||
$34: 00 00 03 EF HUNK_EXT
|
||||
$38: 01 00 00 01 EXT_DEF, name = 1 long
|
||||
$3C: "_foo" symbol name "_foo"
|
||||
$40: 00 00 00 00 at offset 0 in code hunk
|
||||
$44: 81 00 00 02     EXT_REF32, name = 2 longs
$48: "_put"          symbol name "_puts", first longword
$4C: "s\0\0\0"       remainder, NUL-padded
|
||||
$50: 00 00 00 01 1 reference
|
||||
$54: 00 00 00 08 at offset $08 in code
|
||||
$58: 00 00 00 00 HUNK_EXT terminator
|
||||
$5C: 00 00 03 F2 HUNK_END
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Linker Resolution
|
||||
|
||||
When the linker processes multiple object files:
|
||||
|
||||
1. Build a **global symbol table** from all `EXT_DEF` records
|
||||
2. For each `EXT_REF32`, find the defining object and record the target hunk + offset
|
||||
3. Emit `HUNK_RELOC32` in the output executable to patch the reference sites at load time
|
||||
4. `EXT_COMMON` blocks are allocated once; all references redirected to that allocation
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- NDK39: `dos/doshunks.h` — EXT_ type constants
|
||||
- vlink documentation: http://sun.hasenbraten.de/vlink/release/vlink.pdf
|
||||
- ADE (Amiga Developer Environment) linker source code
|
||||
- SAS/C 6.x reference manual — object file format appendix
|
||||
536
03_loader_and_exec_format/hunk_format.md
Normal file
|
|
@ -0,0 +1,536 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# HUNK Binary Format — Complete Specification
|
||||
|
||||
## Overview
|
||||
|
||||
The **HUNK** format is the binary container format used throughout AmigaOS. It is **not** a single file type. Three very different kinds of file share the same record structure:
|
||||
|
||||
| File kind | Extension | First longword | Can be executed? |
|
||||
|---|---|---|---|
|
||||
| **Executable** — program, shared library, device driver | *(none)*, `.library`, `.device` | `$000003F3` (`HUNK_HEADER`) | ✅ Yes — loaded directly by `dos.library LoadSeg()` |
|
||||
| **Object file** — compiler/assembler output, needs linking | `.o` | `$000003E7` (`HUNK_UNIT`) | ❌ No — must be linked first to produce an executable |
|
||||
| **Static library archive** — collection of object files | `.lib` | `$000003FA` (`HUNK_LIB`) | ❌ No — linker input only |
|
||||
|
||||
An object file (`.o`) is **intermediate output** from a compiler. It contains relocatable code and unresolved external references. A linker (`slink`, `vlink`) combines one or more `.o` files with library archives into a final executable.
|
||||
|
||||
The format is a linear stream of **hunk records**, each identified by a 32-bit type word followed by type-specific data.
|
||||
|
||||
---
|
||||
|
||||
## Magic Number — All Valid First Longword Values
|
||||
|
||||
Tools and the OS identify a HUNK file by reading its **first 32-bit longword**. There are exactly three valid opening values:
|
||||
|
||||
| First longword | Hex | Dec | Constant | File type | Who reads it |
|
||||
|---|---|---|---|---|---|
|
||||
| `$000003F3` | `0x3F3` | 1011 | `HUNK_HEADER` | **Loadable executable** — program, `.library`, `.device` | `dos.library InternalLoadSeg()` |
|
||||
| `$000003E7` | `0x3E7` | 999 | `HUNK_UNIT` | **Relocatable object file** (`.o`) — compiler/assembler output | Linker (`slink`, `vlink`) |
|
||||
| `$000003FA` | `0x3FA` | 1018 | `HUNK_LIB` | **Static library archive** (`.lib`) — collection of `.o` files | Linker only |
|
||||
|
||||
Any other first longword means the file is **not a valid HUNK file**. `InternalLoadSeg` will return an error.
|
||||
|
||||
> [!NOTE]
|
||||
> Only `HUNK_HEADER` files can be passed to `LoadSeg()`. Passing a `.o` object file or a `.lib` archive to `LoadSeg()` will fail — those are consumed exclusively by the linker at build time, never at runtime.
|
||||
|
||||
### What `$000003F3` means exactly
|
||||
|
||||
The value `$000003F3` = decimal 1011 = the constant `HUNK_HEADER`. Nothing about this value is arbitrary — it is the hunk type code for the header record, used as the magic number because the header is always the first hunk in an executable.
|
||||
|
||||
### What `$000003E7` means exactly
|
||||
|
||||
The value `$000003E7` = decimal 999 = `HUNK_UNIT`. This marks the start of one relocatable compilation unit. A `.o` file may contain multiple `HUNK_UNIT` records, one per independently-compiled module (though most compilers emit exactly one per file).
|
||||
|
||||
### Checking the magic yourself
|
||||
|
||||
```bash
|
||||
# Check file type from the command line:
|
||||
python3 -c "
|
||||
import struct, sys
|
||||
data = open(sys.argv[1], 'rb').read(4)
|
||||
tag = struct.unpack('>I', data)[0]
|
||||
names = {0x3F3:'HUNK_HEADER (executable)', 0x3E7:'HUNK_UNIT (object file)', 0x3FA:'HUNK_LIB (library archive)'}
|
||||
print(f'{sys.argv[1]}: {names.get(tag, f\"UNKNOWN ({tag:#010x})\")}')
|
||||
" mybinary
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Hunk Type Codes
|
||||
|
||||
> Source header: **`dos/doshunks.h`** (NDK 3.9). Every hunk record starts with one of these 32-bit tag values. The file is a linear stream — the loader reads tag → payload → next tag, until the file ends.
|
||||
|
||||
---
|
||||
|
||||
### Terminology
|
||||
|
||||
| Term | Meaning |
|
||||
|---|---|
|
||||
| **longword** | 32-bit (4-byte) value — the native word size of the 68000 |
|
||||
| **BPTR** | BCPL pointer — byte address **right-shifted by 2** (always longword-aligned). Dereference: `real_addr = bptr_value << 2` |
|
||||
| **ULONG** | Unsigned 32-bit integer |
|
||||
| **UBYTE** | Unsigned 8-bit byte |
|
||||
| **size in longs** | Content length as a count of 4-byte longwords. Bytes = longs × 4 |
|
||||
| **Exec** | Appears in loadable executables only (starts with `HUNK_HEADER`) |
|
||||
| **Obj** | Appears in relocatable object files only (starts with `HUNK_UNIT`) |
|
||||
| **Both** | Valid in either context |
|
||||
|
||||
---
|
||||
|
||||
### Group 1 — Object File Framing
|
||||
|
||||
> These two tags appear **only in `.o` files**. Never in a final linked executable.
|
||||
|
||||
| Hex | Dec | Constant | Wire format | Description |
|
||||
|---|---|---|---|---|
|
||||
| `$3E7` | 999 | `HUNK_UNIT` | `[tag] [name_len_longs] [name_bytes…]` | **Start of a relocatable object unit.** Always the very first record in a `.o` file — the object-file equivalent of `HUNK_HEADER`. The name field names the compilation unit (e.g. `"main.o"`). A single `.o` file may contain multiple `HUNK_UNIT` records. |
|
||||
| `$3E8` | 1000 | `HUNK_NAME` | `[tag] [name_len_longs] [name_bytes…]` | **Section name label.** Optional; assigns a human-readable name to the following section. The linker uses it for map files and diagnostics. |
|
||||
|
||||
---
|
||||
|
||||
### Group 2 — Content Sections
|
||||
|
||||
> Carry actual program data. Valid in **both** executables and object files. The type longword may have `HUNKF_CHIP` / `HUNKF_FAST` ORed into its upper bits — see [Memory Placement Flags](#memory-placement-flags).
|
||||
|
||||
| Hex | Dec | Constant | Payload | Description |
|
||||
|---|---|---|---|---|
|
||||
| `$3E9` | 1001 | `HUNK_CODE` | `[tag] [size_longs] [code_bytes × size×4]` | **Machine-code section.** The loader allocates RAM, copies the bytes, then applies any `HUNK_RELOC32` that follows. Holds 68k instructions — never data. |
|
||||
| `$3EA` | 1002 | `HUNK_DATA` | `[tag] [size_longs] [data_bytes × size×4]` | **Initialized read/write data.** Global variables with non-zero values, string literals, jump tables, etc. Any embedded pointers to other hunks require `HUNK_RELOC32` fixups. |
|
||||
| `$3EB` | 1003 | `HUNK_BSS` | `[tag] [size_longs]` *(no data bytes)* | **Uninitialized data (zero-fill).** Only the size is stored, with no content bytes in the file. The loader calls `AllocMem(..., MEMF_CLEAR)`. A 64 KB zero array costs only the tag and size longwords (8 bytes) on disk. |
|
||||
|
||||
---
|
||||
|
||||
### Group 3 — Relocation Records
|
||||
|
||||
> Tell the loader which longwords inside the current hunk need to be patched with the actual load address of another hunk. Without relocation, all cross-hunk pointers would point to wrong addresses after the OS places code at a non-zero address.
|
||||
|
||||
| Hex | Dec | Constant | Alias | Field width | Description |
|---|---|---|---|---|---|
| `$3EC` | 1004 | `HUNK_RELOC32` | `HUNK_ABSRELOC32` | LONG (32-bit) | **Absolute 32-bit fixup, the most common type.** Wire format: `[tag] { [count] [hunk_idx] [offset_0] … [offset_n] } … [0]`. Each offset points to a longword in the current hunk; `*(ULONG*)(base+offset) += target_hunk_base`. Terminated by `count=0`. |
| `$3ED` | 1005 | `HUNK_RELOC16` | `HUNK_RELRELOC16` | LONG (32-bit) | **16-bit PC-relative fixup**, as the `RELRELOC` alias indicates. Same table layout as `HUNK_RELOC32`, but each offset names a 16-bit displacement word. Normally resolved by the linker, so rare in practice. |
| `$3EE` | 1006 | `HUNK_RELOC8` | `HUNK_RELRELOC8` | LONG (32-bit) | **8-bit PC-relative fixup.** Same layout; patches a single byte (a short-branch displacement). Essentially unused: no 68k instruction has an 8-bit absolute address field. |
| `$3F7` | 1015 | `HUNK_DREL32` | - | LONG in object files; WORD in load files | **Data-relative (base-relative) 32-bit reloc** in object files. In load files the same type code is read as a compact reloc table whose count, hunk index, and offsets are 16-bit WORDs: `dos/doshunks.h` notes that the V37 loader used 1015 where `HUNK_RELOC32SHORT` was intended, and later loaders accept both. The compact form is valid only when all hunk offsets fit in 16 bits (hunk < 64 KB). Generated by BLink. |
| `$3F8` | 1016 | `HUNK_DREL16` | - | LONG (32-bit) | Data-relative 16-bit reloc (offsets from a base register in small-data code). Object files only. Very rare. |
| `$3F9` | 1017 | `HUNK_DREL8` | - | LONG (32-bit) | Data-relative 8-bit reloc. Object files only. Essentially unused. |
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | - | WORD (16-bit) | **Compact absolute 32-bit reloc with WORD offsets.** Semantically identical to `HUNK_RELOC32` with WORD fields; the word stream is padded to a longword boundary after the terminator. Default output of vasm/vlink when all offsets fit in 16 bits. Preferred over `HUNK_DREL32` in OS 3.x-era tools. |
| `$3FD` | 1021 | `HUNK_RELRELOC32` | - | LONG (32-bit) | **PC-relative 32-bit reloc.** Patch: `*(LONG*)(base+off) += target_base − (base+off+4)`. Used by GCC `-fPIC` and PIC shared libraries. |
| `$3FE` | 1022 | `HUNK_ABSRELOC16` | - | LONG (32-bit) | **Absolute 16-bit fixup.** Patches a UWORD with the low 16 bits of the target's absolute address. Required for `MOVE.W #abs_addr,Dn` patterns. Rare. |
|
||||
|
||||
---
|
||||
|
||||
### Group 4 — External Symbol Table
|
||||
|
||||
> Object files only — **never present in a linked executable**.
|
||||
|
||||
| Hex | Dec | Constant | Description |
|
||||
|---|---|---|---|
|
||||
| `$3EF` | 1007 | `HUNK_EXT` | **Import + export symbol table for a compilation unit.** A single stream encodes both sides: *exports* declare symbols defined in this hunk (type `EXT_DEF`, `EXT_ABS`, `EXT_RES`); *imports* list unresolved references the linker must satisfy from other objects (type `EXT_REF32`, `EXT_REF16`, `EXT_REF8`, `EXT_COMMON`). The linker resolves all imports and emits `HUNK_RELOC32` records in the output executable. Wire format: `[tag] { [type_and_namelen] [name_bytes…] [value_or_refcount] [ref_offsets…] } … [0]`. See [`hunk_ext_deep_dive.md`](hunk_ext_deep_dive.md) for sub-type encoding. |
|
||||
|
||||
---
|
||||
|
||||
### Group 5 — Debug and Metadata
|
||||
|
||||
> **Completely ignored by the OS loader.** Strip with `slink NODBG` or `m68k-amigaos-strip --strip-debug` to reduce file size.
|
||||
|
||||
| Hex | Dec | Constant | Payload | Description |
|
||||
|---|---|---|---|---|
|
||||
| `$3F0` | 1008 | `HUNK_SYMBOL` | `[tag] { [namelen_longs] [name_bytes…] [value] } … [0]` | **Local symbol table.** Maps label names → offsets within this hunk. Consumed by MonAm, wack, IDA Pro. Terminated by `namelen=0`. |
|
||||
| `$3F1` | 1009 | `HUNK_DEBUG` | `[tag] [size_longs] [format_tag] [data_bytes…]` | **Opaque debug block.** The leading `format_tag` longword identifies the format: `$3D415053` = SAS/C stabs; `$3D474343` = GCC stabs; `$3D574152` = Warp/Storm C. See [`hunk_debug_info.md`](hunk_debug_info.md). |
|
||||
|
||||
---
|
||||
|
||||
### Group 6 — Structural Records
|
||||
|
||||
| Hex | Dec | Constant | Payload | Description |
|
||||
|---|---|---|---|---|
|
||||
| `$3F2` | 1010 | `HUNK_END` | `[tag]` only — **no payload** | **Required end-of-hunk marker.** Every code/data/BSS hunk (and all reloc/symbol records that follow it) must close with `HUNK_END`. The loader advances to the next segment slot on reading it. |
|
||||
| `$3F3` | 1011 | `HUNK_HEADER` | `[tag] [0] [num_hunks] [first_hunk] [last_hunk] [size_longs × n]` | **Executable magic number and segment size table.** Must be the very first longword in a loadable executable. The zero longword is the resident-library list (always 0 in practice). `num_hunks` = total hunks; `first_hunk`/`last_hunk` = inclusive range; followed by one size-in-longs per hunk. |
|
||||
| `$3F5` | 1013 | `HUNK_OVERLAY` | `[tag] [size_longs] [overlay_table_data…]` | **Overlay descriptor table.** Follows the resident hunks; describes groups of code swapped in from disk on demand. Allows programs larger than available RAM. Obsolete — prefer `OpenLibrary()`. |
|
||||
| `$3F6` | 1014 | `HUNK_BREAK` | `[tag]` only — **no payload** | **End of overlay tree sentinel.** `InternalLoadSeg` needs this to know where the overlay descriptor ends and the per-overlay hunk data begins. |
|
||||
|
||||
> [!NOTE]
|
||||
> Value `$3F4` (decimal 1012) is **unused** — the numbering skips it intentionally.
|
||||
|
||||
---
|
||||
|
||||
### Group 7 — Static Library Archive
|
||||
|
||||
> Linker input only. Never loaded by `LoadSeg()` at runtime.
|
||||
|
||||
| Hex | Dec | Constant | Description |
|
||||
|---|---|---|---|
|
||||
| `$3FA` | 1018 | `HUNK_LIB` | **Static library archive container.** A sequence of embedded object units, preceded by its size in longwords. Produced by a librarian such as SAS/C `oml`; simply concatenating `.o` files (for example with the AmigaOS `join` command) yields an unindexed library that starts with `HUNK_UNIT` instead. The linker extracts only the units needed to resolve outstanding `HUNK_EXT` imports. |
|
||||
| `$3FB` | 1019 | `HUNK_INDEX` | **Symbol index for `HUNK_LIB`.** A packed string table plus a per-unit map of exported symbol names → unit byte offsets. Lets the linker locate a function without scanning every object in the archive. Always immediately follows the `HUNK_LIB` it describes. |
|
||||
|
||||
|
||||
### Memory Placement Flags
|
||||
|
||||
|
||||
|
||||
The type longword of `HUNK_CODE`, `HUNK_DATA`, and `HUNK_BSS` can encode a **memory placement request** in its upper bits. The loader passes the corresponding `MEMF_*` flags to `AllocMem`.

```
Bit layout of the type longword:

  31     30     29    28 ............. 0
┌────┐ ┌────┐ ┌───┐ ┌─────────────────┐
│FAST│ │CHIP│ │ADV│ │ Hunk type code  │
└────┘ └────┘ └───┘ └─────────────────┘
```

| Bit | Constant | Value | Meaning |
|---|---|---|---|
| 31 | `HUNKF_FAST` | `1L<<31` | Hunk **prefers** Fast RAM. Use for pure CPU data where DMA is not needed; avoids Chip RAM bus contention |
| 30 | `HUNKF_CHIP` | `1L<<30` | Hunk **must** be in Chip RAM. Required for anything the custom chips DMA from (bitmaps, audio, copper lists, sprites) |
| 31+30 both set | *(extended)* | `0xC0000000` | Next longword in the file contains full `MEMF_*` flags for `AllocMem`, allowing any combination |
| 29 | `HUNKF_ADVISORY` | `1L<<29` | Not a memory flag: marks a hunk that a V39+ loader skips if it does not understand the type |
| neither 31 nor 30 | *(default)* | `0` | `MEMF_PUBLIC`: any available memory |

Additional helper constants:

| Constant | Value | Meaning |
|---|---|---|
| `HUNKB_CHIP` | `30` | Bit **number** (use with `bset`/`btst`) |
| `HUNKB_FAST` | `31` | Bit **number** |
|
||||
|
||||
**`MEMF_*` flags** used in extended mode (from `exec/memory.h`):
|
||||
|
||||
| Constant | Value | Meaning |
|
||||
|---|---|---|
|
||||
| `MEMF_ANY` | `0` | No preference — any accessible memory |
|
||||
| `MEMF_PUBLIC` | `1<<0` | Must be accessible by all tasks and hardware |
|
||||
| `MEMF_CHIP` | `1<<1` | Chip RAM — reachable by DMA controllers |
|
||||
| `MEMF_FAST` | `1<<2` | Fast RAM — CPU-only, no chip DMA contention |
|
||||
| `MEMF_CLEAR` | `1<<16` | Zero-fill on allocation |
|
||||
| `MEMF_LARGEST` | `1<<17` | Return the single largest contiguous free block |
|
||||
| `MEMF_REVERSE` | `1<<18` | Allocate from the top of the region (high addresses first) |
|
||||
| `MEMF_TOTAL` | `1<<19` | `AvailMem`: report total installed rather than current free |
|
||||
|
||||
**Example:** force a code hunk into Chip RAM:

```
type longword = HUNK_CODE | HUNKF_CHIP
              = 0x000003E9 | 0x40000000
              = 0x400003E9
```

**Why would code go in Chip RAM?** Rare. On a machine with only Chip RAM (such as an unexpanded A500 with 512 KB) no flag is needed, because every allocation lands in Chip RAM anyway. The flag matters when a hunk mixes code with data the custom chips must fetch by DMA, such as copper lists or sprite images embedded in a code section.
|
||||
|
||||
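
As a worked illustration of the flag bits, the sketch below splits a type longword into the hunk type and the `AllocMem` flags a loader might derive from it. It assumes the bit assignments from `dos/doshunks.h` (`HUNKB_CHIP` = 30, `HUNKB_FAST` = 31) and the default of `MEMF_PUBLIC` given in the table above; `split_type_longword` is a hypothetical helper, not an OS function.

```python
HUNKF_CHIP = 1 << 30       # HUNKB_CHIP = 30
HUNKF_FAST = 1 << 31       # HUNKB_FAST = 31
TYPE_MASK = 0x3FFFFFFF     # clears bits 31 and 30

MEMF_PUBLIC, MEMF_CHIP, MEMF_FAST = 1 << 0, 1 << 1, 1 << 2


def split_type_longword(value, next_longword=None):
    """Return (hunk_type, memf_flags) for a HUNK_CODE/DATA/BSS type longword."""
    hunk_type = value & TYPE_MASK
    chip = bool(value & HUNKF_CHIP)
    fast = bool(value & HUNKF_FAST)
    if chip and fast:
        # Extended form: the full MEMF_* flags live in the following longword.
        if next_longword is None:
            raise ValueError("extended form needs the following longword")
        return hunk_type, next_longword
    if chip:
        return hunk_type, MEMF_PUBLIC | MEMF_CHIP
    if fast:
        return hunk_type, MEMF_PUBLIC | MEMF_FAST
    return hunk_type, MEMF_PUBLIC
```

For example, `split_type_longword(0x400003E9)` yields `HUNK_CODE` (`0x3E9`) with a Chip RAM request.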
---
|
||||
|
||||
### Quick Reference Table
|
||||
|
||||
| Hex | Dec | Constant | Alias | Context | Purpose |
|
||||
|---|---|---|---|---|---|
|
||||
| `$3E7` | 999 | `HUNK_UNIT` | — | Obj | Start of relocatable object unit |
|
||||
| `$3E8` | 1000 | `HUNK_NAME` | — | Obj | Name label for the following section |
|
||||
| `$3E9` | 1001 | `HUNK_CODE` | — | Both | Machine-code section |
|
||||
| `$3EA` | 1002 | `HUNK_DATA` | — | Both | Initialized read/write data |
|
||||
| `$3EB` | 1003 | `HUNK_BSS` | — | Both | Uninitialized data (size only, no bytes) |
|
||||
| `$3EC` | 1004 | `HUNK_RELOC32` | `HUNK_ABSRELOC32` | Both | Absolute 32-bit address fixup list |
|
||||
| `$3ED` | 1005 | `HUNK_RELOC16` | `HUNK_RELRELOC16` | Obj | 16-bit PC-relative fixup list |
| `$3EE` | 1006 | `HUNK_RELOC8` | `HUNK_RELRELOC8` | Obj | 8-bit PC-relative fixup list |
|
||||
| `$3EF` | 1007 | `HUNK_EXT` | — | Obj | Import + export symbol table |
|
||||
| `$3F0` | 1008 | `HUNK_SYMBOL` | — | Both | Local debug symbol table |
|
||||
| `$3F1` | 1009 | `HUNK_DEBUG` | — | Both | Opaque debug data (stabs / DWARF) |
|
||||
| `$3F2` | 1010 | `HUNK_END` | — | Both | End-of-hunk marker — **required** |
|
||||
| `$3F3` | 1011 | `HUNK_HEADER` | — | Exec | Executable magic number + size table |
|
||||
| *(none)* | *1012* | *(unused)* | — | — | Gap in the numbering |
|
||||
| `$3F5` | 1013 | `HUNK_OVERLAY` | — | Exec | Overlay group descriptor |
|
||||
| `$3F6` | 1014 | `HUNK_BREAK` | — | Exec | End of overlay tree |
|
||||
| `$3F7` | 1015 | `HUNK_DREL32` | - | Both | Data-relative 32-bit reloc in object files; read as a compact WORD-field reloc table in load files |
| `$3F8` | 1016 | `HUNK_DREL16` | - | Obj | Data-relative 16-bit reloc |
| `$3F9` | 1017 | `HUNK_DREL8` | - | Obj | Data-relative 8-bit reloc |
|
||||
| `$3FA` | 1018 | `HUNK_LIB` | — | Obj | Static library archive |
|
||||
| `$3FB` | 1019 | `HUNK_INDEX` | — | Obj | Symbol index for HUNK_LIB |
|
||||
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | — | Both | Compact abs 32-bit reloc (WORD offsets) |
|
||||
| `$3FD` | 1021 | `HUNK_RELRELOC32` | — | Both | PC-relative 32-bit reloc |
|
||||
| `$3FE` | 1022 | `HUNK_ABSRELOC16` | — | Both | Absolute 16-bit address patch |
|
||||
|
||||
**Context key:** `Exec` = loadable executable · `Obj` = object file (HUNK_UNIT stream) · `Both` = either
|
||||
|
||||
```c
|
||||
/* dos/doshunks.h — NDK 3.9 */
|
||||
|
||||
#define HUNK_UNIT 999 /* 0x3E7 — start of relocatable object (.o) */
|
||||
#define HUNK_NAME 1000 /* 0x3E8 — name of this object unit/section */
|
||||
#define HUNK_CODE 1001 /* 0x3E9 — machine code section */
|
||||
#define HUNK_DATA 1002 /* 0x3EA — initialized data section */
|
||||
#define HUNK_BSS 1003 /* 0x3EB — uninitialized data (zero-fill) */
|
||||
#define HUNK_RELOC32 1004 /* 0x3EC — 32-bit absolute relocation table */
|
||||
#define HUNK_ABSRELOC32 HUNK_RELOC32 /* alias */
|
||||
#define HUNK_RELOC16 1005 /* 0x3ED — 16-bit relocation table */
|
||||
#define HUNK_RELRELOC16 HUNK_RELOC16 /* alias */
|
||||
#define HUNK_RELOC8 1006 /* 0x3EE — 8-bit relocation table */
|
||||
#define HUNK_RELRELOC8 HUNK_RELOC8 /* alias */
|
||||
#define HUNK_EXT 1007 /* 0x3EF — external symbol table (obj only) */
|
||||
#define HUNK_SYMBOL 1008 /* 0x3F0 — local debug symbol table */
|
||||
#define HUNK_DEBUG 1009 /* 0x3F1 — arbitrary debug data (stabs etc.) */
|
||||
#define HUNK_END 1010 /* 0x3F2 — end-of-hunk marker */
|
||||
#define HUNK_HEADER 1011 /* 0x3F3 — executable file header (magic) */
|
||||
/* 0x3F4 — unused */
|
||||
#define HUNK_OVERLAY 1013 /* 0x3F5 — overlay tree descriptor */
|
||||
#define HUNK_BREAK 1014 /* 0x3F6 — end of overlay tree */
|
||||
#define HUNK_DREL32 1015 /* 0x3F7 - data-relative 32-bit reloc (obj); compact WORD reloc in load files */
#define HUNK_DREL16 1016 /* 0x3F8 - data-relative 16-bit reloc */
#define HUNK_DREL8 1017 /* 0x3F9 - data-relative 8-bit reloc */
|
||||
#define HUNK_LIB 1018 /* 0x3FA — static library archive */
|
||||
#define HUNK_INDEX 1019 /* 0x3FB — symbol index for HUNK_LIB */
|
||||
#define HUNK_RELOC32SHORT 1020 /* 0x3FC — compact 32-bit absolute reloc */
|
||||
#define HUNK_RELRELOC32 1021 /* 0x3FD — PC-relative 32-bit reloc */
|
||||
#define HUNK_ABSRELOC16 1022 /* 0x3FE — absolute 16-bit reloc */
|
||||
|
||||
/* Memory placement flags, OR'd into HUNK_CODE/DATA/BSS type longword */
#define HUNKB_ADVISORY 29 /* bit number: hunk may be skipped if not understood (V39+) */
#define HUNKB_CHIP 30 /* bit number for Chip RAM flag */
#define HUNKB_FAST 31 /* bit number for Fast RAM flag */
#define HUNKF_ADVISORY (1L<<29)
#define HUNKF_CHIP (1L<<30) /* request Chip RAM for this hunk */
#define HUNKF_FAST (1L<<31) /* request Fast RAM for this hunk */
|
||||
```
|
||||
|
||||
### Memory Type Flags on HUNK_CODE / HUNK_DATA / HUNK_BSS

The type longword for code, data, and BSS hunks can carry memory placement hints in its upper bits:

```
Bits 31..0 of the type longword:
  bit 31:     HUNKF_FAST: loader prefers AllocMem(..., MEMF_FAST)
  bit 30:     HUNKF_CHIP: loader must use AllocMem(..., MEMF_CHIP)
  bit 29:     HUNKF_ADVISORY: not a memory flag
  bits 28..0: the hunk type constant (e.g. 0x3E9 for HUNK_CODE)
```

| Bit 31 (`HUNKF_FAST`) | Bit 30 (`HUNKF_CHIP`) | Meaning | `AllocMem` flags used |
|---|---|---|---|
| 0 | 0 | Any memory (default) | `MEMF_PUBLIC` |
| 0 | 1 | Chip RAM required | `MEMF_CHIP` |
| 1 | 0 | Fast RAM preferred | `MEMF_FAST` |
| 1 | 1 | **Extended**: next longword holds full `MEMF_*` flags | loader reads the extra longword |

**`MEMF_*` flag constants** (from `exec/memory.h`, used in extended mode):

```c
/* exec/memory.h (NDK39) */
#define MEMF_PUBLIC  (1L<<0)  /* memory that stays accessible to all tasks */
#define MEMF_CHIP    (1L<<1)  /* Chip RAM (DMA-reachable by custom chips) */
#define MEMF_FAST    (1L<<2)  /* Fast RAM (CPU-only; faster than Chip) */
#define MEMF_VIRTUAL (1L<<3)  /* not used on classic AmigaOS */
#define MEMF_CLEAR   (1L<<16) /* zero-fill the allocation */
#define MEMF_LARGEST (1L<<17) /* return the single largest free block */
#define MEMF_REVERSE (1L<<18) /* allocate from top of list (high address) */
#define MEMF_TOTAL   (1L<<19) /* AvailMem: return total installed, not current free */
#define MEMF_ANY     0L       /* no preference */
```

Example: `HUNK_CODE` forced into Chip RAM:
```
type longword = HUNK_CODE | HUNKF_CHIP
              = 0x000003E9 | 0x40000000
              = 0x400003E9
```
|
||||
|
||||
---
|
||||
|
||||
## HUNK_HEADER — Executable Header
|
||||
|
||||
Appears at the very start of an executable file:
|
||||
|
||||
```
|
||||
$000003F3 HUNK_HEADER magic
|
||||
$00000000 Resident library list (always 0 for loadable executables)
|
||||
<num_hunks> Number of hunks in the executable (longword)
|
||||
<first_hunk> Index of first loadable hunk (usually 0)
|
||||
<last_hunk> Index of last loadable hunk (= num_hunks - 1)
|
||||
<size_0> Size of hunk 0 in longwords
|
||||
<size_1> Size of hunk 1 in longwords
|
||||
...
|
||||
<size_n> Size of last hunk in longwords
|
||||
```
|
||||
|
||||
**Size longword bit encoding:**
|
||||
```
bits 31-30: Memory type (00=ANY, 01=CHIP, 10=FAST, 11=extended); bit 31 = HUNKF_FAST, bit 30 = HUNKF_CHIP
bits 29-0:  Size in 32-bit longwords
```
|
||||
|
||||
**Example header (2 hunks: code + data):**
|
||||
```
|
||||
Offset Bytes Meaning
|
||||
$00: 00 00 03 F3 HUNK_HEADER
|
||||
$04: 00 00 00 00 no resident library list
|
||||
$08: 00 00 00 02 2 hunks
|
||||
$0C: 00 00 00 00 first hunk index = 0
|
||||
$10: 00 00 00 01 last hunk index = 1
|
||||
$14: 00 00 00 50 hunk 0: 0x50 longwords = 0x140 bytes (code)
|
||||
$18: 00 00 00 10 hunk 1: 0x10 longwords = 0x40 bytes (data)
|
||||
```
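
To make the layout concrete, here is a small Python sketch that decodes the example header above. It only handles the common case of an empty resident-library list, and `parse_hunk_header` is a hypothetical helper name rather than anything from the NDK.

```python
import struct

HUNK_HEADER = 0x3F3


def parse_hunk_header(data):
    """Parse a HUNK_HEADER whose resident-library list is empty."""
    magic, lib_list, num_hunks, first, last = struct.unpack_from(">5I", data, 0)
    if magic != HUNK_HEADER:
        raise ValueError(f"not an executable: first longword is {magic:#x}")
    if lib_list != 0:
        raise ValueError("non-empty resident library list is not handled here")
    count = last - first + 1
    raw_sizes = struct.unpack_from(f">{count}I", data, 20)
    # The top two bits carry the memory type; the rest is the size in longwords.
    sizes_bytes = [(raw & 0x3FFFFFFF) * 4 for raw in raw_sizes]
    return {"num_hunks": num_hunks, "first": first, "last": last,
            "sizes_bytes": sizes_bytes}
```

Fed the 28 bytes of the example, the function reports two hunks of `0x140` and `0x40` bytes.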
|
||||
|
||||
---
|
||||
|
||||
## HUNK_CODE / HUNK_DATA
|
||||
|
||||
```
|
||||
<type> HUNK_CODE ($3E9) or HUNK_DATA ($3EA)
|
||||
<num_longs> Size in longwords
|
||||
<data...> Raw code or data bytes (num_longs × 4 bytes)
|
||||
HUNK_END ($3F2) End of this hunk
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## HUNK_BSS
|
||||
|
||||
```
|
||||
<type> HUNK_BSS ($3EB)
|
||||
<num_longs> Size in longwords (allocate num_longs × 4 bytes, zero-filled)
|
||||
HUNK_END ($3F2) End of this hunk
|
||||
```
|
||||
|
||||
No data follows — BSS is zero-initialised by the loader.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC32
|
||||
|
||||
32-bit absolute relocation records. These patch addresses in the code/data that reference other hunks.
|
||||
|
||||
```
|
||||
<HUNK_RELOC32> $000003EC
|
||||
<num_offsets> Number of offsets to patch for the next target hunk
|
||||
<target_hunk> Index of the hunk whose base address is added
|
||||
<offset_0> Byte offset within current hunk to patch (longword)
|
||||
<offset_1>
|
||||
...
|
||||
<num_offsets=0> Terminator — end of relocation table
|
||||
HUNK_END ($3F2)
|
||||
```
|
||||
|
||||
**Example:** Code hunk references data hunk. Two addresses need patching:
|
||||
```
|
||||
$000003EC HUNK_RELOC32
|
||||
$00000002 2 offsets to patch
|
||||
$00000001 target = hunk 1 (data hunk)
|
||||
$00000010 patch at offset $10 in code hunk
|
||||
$00000024 patch at offset $24 in code hunk
|
||||
$00000000 end of reloc list
|
||||
$000003F2 HUNK_END
|
||||
```
|
||||
|
||||
At load time: `*(ULONG *)(code_base + 0x10) += data_base`
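
The table layout can be read back with a short loop. The following Python sketch parses the payload that follows the `HUNK_RELOC32` tag; `parse_reloc32` is a hypothetical helper, and the input is assumed to be well-formed (no bounds checking beyond what `struct` does).

```python
import struct


def parse_reloc32(data, pos=0):
    """Parse a HUNK_RELOC32 payload starting right after the tag.

    Returns ({target_hunk: [offsets]}, position after the terminator).
    """
    groups = {}
    while True:
        (count,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if count == 0:              # terminator: end of relocation table
            return groups, pos
        (target,) = struct.unpack_from(">I", data, pos)
        pos += 4
        offsets = struct.unpack_from(f">{count}I", data, pos)
        pos += 4 * count
        groups.setdefault(target, []).extend(offsets)
```

Applied to the example payload, it returns one group for hunk 1 with the offsets `0x10` and `0x24`.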
|
||||
|
||||
---
|
||||
|
||||
## HUNK_SYMBOL
|
||||
|
||||
Optional local symbol table for debugging:
|
||||
|
||||
```
|
||||
<HUNK_SYMBOL> $000003F0
|
||||
<name_len> String length in longwords (1–N)
|
||||
<name...> Symbol name (padded to longword boundary)
|
||||
<value> Symbol value (offset within hunk)
|
||||
...
|
||||
<0> Zero name_len terminates
|
||||
HUNK_END ($3F2)
|
||||
```
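
A reader for this record is only a few lines. The sketch below is an illustration under stated assumptions: `parse_symbols` is a hypothetical name, names are decoded as Latin-1, and the length longword is masked to its low 24 bits on the assumption that the top byte is a type field (zero for plain symbols).

```python
import struct


def parse_symbols(data, pos=0):
    """Parse a HUNK_SYMBOL payload starting right after the tag.

    Returns ({name: value}, position after the terminator).
    """
    symbols = {}
    while True:
        (length_word,) = struct.unpack_from(">I", data, pos)
        pos += 4
        name_longs = length_word & 0x00FFFFFF
        if name_longs == 0:          # zero name length terminates the table
            return symbols, pos
        raw_name = data[pos:pos + 4 * name_longs]
        pos += 4 * name_longs
        (value,) = struct.unpack_from(">I", data, pos)
        pos += 4
        # Names are NUL-padded up to the next longword boundary.
        symbols[raw_name.rstrip(b"\0").decode("latin-1")] = value
```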
|
||||
|
||||
---
|
||||
|
||||
## Complete Executable Structure (Annotated)
|
||||
|
||||
```
|
||||
[File start]
|
||||
HUNK_HEADER
|
||||
num_hunks = 3 ; code, data, BSS
|
||||
sizes: [0x200, 0x80, 0x100] ; in longwords
|
||||
|
||||
--- Hunk 0: Code ---
|
||||
HUNK_CODE
|
||||
0x200 longwords = 2048 bytes of machine code
|
||||
HUNK_RELOC32
|
||||
Patches in code referencing hunk 1 (data)
|
||||
Patches in code referencing hunk 2 (BSS)
|
||||
HUNK_SYMBOL (optional)
|
||||
_main at offset 0
|
||||
_foo at offset 0x40
|
||||
HUNK_END
|
||||
|
||||
--- Hunk 1: Data ---
|
||||
HUNK_DATA
|
||||
0x80 longwords = 512 bytes of initialized data
|
||||
HUNK_RELOC32
|
||||
Patches in data (e.g., pointer tables) referencing code hunk
|
||||
HUNK_END
|
||||
|
||||
--- Hunk 2: BSS ---
|
||||
HUNK_BSS
|
||||
0x100 longwords = 1024 bytes (zero-filled at load time)
|
||||
HUNK_END
|
||||
|
||||
[File end]
|
||||
```
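
The record stream of such a file can be walked with a simple loop. The Python sketch below lists the record names of a plain executable in file order; it is a simplified illustration (`list_records` is a hypothetical name) that understands only the record types shown in this section and rejects anything else, including overlays and short relocation tables.

```python
import struct

NAMES = {0x3E9: "HUNK_CODE", 0x3EA: "HUNK_DATA", 0x3EB: "HUNK_BSS",
         0x3EC: "HUNK_RELOC32", 0x3F0: "HUNK_SYMBOL", 0x3F1: "HUNK_DEBUG",
         0x3F2: "HUNK_END"}


def list_records(data):
    """Yield record names in file order for a simple executable."""
    def u32(pos):
        return struct.unpack_from(">I", data, pos)[0]

    if u32(0) != 0x3F3 or u32(4) != 0:
        raise ValueError("expected HUNK_HEADER with an empty library list")
    first, last = u32(12), u32(16)
    pos = 20 + 4 * (last - first + 1)      # skip the size table
    yield "HUNK_HEADER"
    while pos < len(data):
        tag = u32(pos) & 0x3FFFFFFF        # drop memory flag bits
        pos += 4
        if tag not in NAMES:
            raise ValueError(f"unhandled hunk type {tag:#x}")
        yield NAMES[tag]
        if tag in (0x3E9, 0x3EA, 0x3F1):   # size longword + payload
            pos += 4 + 4 * (u32(pos) & 0x3FFFFFFF)
        elif tag == 0x3EB:                 # size longword only
            pos += 4
        elif tag == 0x3EC:                 # groups until count == 0
            while (count := u32(pos)) != 0:
                pos += 4 * (count + 2)
            pos += 4
        elif tag == 0x3F0:                 # name/value pairs until length == 0
            while (n := u32(pos) & 0x00FFFFFF) != 0:
                pos += 4 * (n + 2)
            pos += 4
```

Calling `list(list_records(open("mybinary", "rb").read()))` on a file shaped like the one above would list the header followed by each hunk's records up to its `HUNK_END`.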
|
||||
|
||||
---
|
||||
|
||||
## File Format Diagram
|
||||
|
||||
```mermaid
|
||||
block-beta
|
||||
columns 1
|
||||
header["HUNK_HEADER\n(sizes + memory types)"]
|
||||
code["HUNK_CODE\n(machine code bytes)"]
|
||||
reloc0["HUNK_RELOC32\n(patch list for code)"]
|
||||
sym["HUNK_SYMBOL\n(optional debug names)"]
|
||||
end0["HUNK_END"]
|
||||
data["HUNK_DATA\n(initialized data)"]
|
||||
reloc1["HUNK_RELOC32\n(patch list for data)"]
|
||||
end1["HUNK_END"]
|
||||
bss["HUNK_BSS\n(zero-fill size)"]
|
||||
end2["HUNK_END"]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS executable format chapter
|
||||
- ADCD 2.1: `Libraries_Manual_guide/` — LoadSeg, InternalLoadSeg
|
||||
- NDK39: `dos/doshunks.h` — hunk type constants
|
||||
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node01E0.html
|
||||
- Community reference: http://sun.hasenbraten.de/vlink/release/vlink.pdf (HUNK format appendix)
|
||||
160
03_loader_and_exec_format/hunk_relocation.md
Normal file
|
|
@ -0,0 +1,160 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# HUNK Relocation Mechanics
|
||||
|
||||
## Overview
|
||||
|
||||
Relocation is the process of **patching absolute addresses** in a loaded executable to reflect its actual memory location. Since AmigaOS allocates memory dynamically, a program cannot know its load address at compile time — all inter-hunk references must be fixed up at runtime.
|
||||
|
||||
---
|
||||
|
||||
## Why Relocation Is Necessary
|
||||
|
||||
An Amiga executable contains references like:
|
||||
```asm
|
||||
LEA DataTable(PC), A0 ; PC-relative — no relocation needed
|
||||
MOVE.L #DataTable, A0 ; Absolute — MUST be relocated
|
||||
```
|
||||
|
||||
The linker places `DataTable` at some hunk-relative offset (e.g., offset 0 in the data hunk). The absolute address is only known at load time. The relocation table tells the loader which longwords in the code contain these absolute values.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC32 Format
|
||||
|
||||
```
|
||||
HUNK_RELOC32 ($000003EC)
|
||||
|
||||
[Repeat until terminator:]
|
||||
<num_offsets> Number of longword addresses to patch for this target hunk
|
||||
<target_hunk> Index of the hunk whose base address is added
|
||||
<offset_0> Byte offset within the current hunk to patch
|
||||
<offset_1>
|
||||
...
|
||||
|
||||
<0> num_offsets = 0 terminates the reloc list
|
||||
HUNK_END ($000003F2)
|
||||
```
|
||||
|
||||
### Patching Algorithm
|
||||
|
||||
For each entry in HUNK_RELOC32 of hunk `H`:
|
||||
```
|
||||
foreach (target_hunk, offsets[]):
|
||||
base = segment_base_address[target_hunk]
|
||||
foreach offset in offsets:
|
||||
*(ULONG *)(H_base + offset) += base
|
||||
```
|
||||
|
||||
The value at `H_base + offset` already contains the **hunk-relative** address written by the linker. Adding the actual base produces the final absolute address.
|
||||
|
||||
### Example
|
||||
|
||||
Code hunk references data hunk at two sites:
|
||||
```
|
||||
Before load (raw file values):
|
||||
code[0x18] = $00000000 ; linker placed "data offset 0" here
|
||||
code[0x2C] = $00000010 ; linker placed "data offset 0x10" here
|
||||
|
||||
HUNK_RELOC32:
|
||||
num_offsets = 2
|
||||
target_hunk = 1 ; data hunk
|
||||
offsets = [0x18, 0x2C]
|
||||
|
||||
After load (data hunk loaded at $20000):
|
||||
code[0x18] = $00000000 + $20000 = $00020000
|
||||
code[0x2C] = $00000010 + $20000 = $00020010
|
||||
```
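
The same example can be replayed in Python. This is a sketch of the patching loop only, with hunks modelled as `bytearray` objects and load addresses supplied by the caller; `apply_reloc32` is a hypothetical name, not the loader's actual routine.

```python
import struct


def apply_reloc32(hunk, groups, bases):
    """Patch `hunk` (a bytearray) in place.

    groups: {target_hunk_index: [byte offsets inside `hunk`]}
    bases:  {hunk_index: load address of that hunk}
    """
    for target, offsets in groups.items():
        base = bases[target]
        for offset in offsets:
            # The file already holds the hunk-relative value; add the base.
            (value,) = struct.unpack_from(">I", hunk, offset)
            struct.pack_into(">I", hunk, offset, (value + base) & 0xFFFFFFFF)
```

With `code[0x2C]` preset to `0x10` and the data hunk at `$20000`, the two patched longwords become `$00020000` and `$00020010`, matching the listing above.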
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC16 and HUNK_RELOC8
|
||||
|
||||
Same table layout as HUNK_RELOC32, but the patched field is **16-bit** or **8-bit** and, as the `HUNK_RELRELOC16` / `HUNK_RELRELOC8` aliases indicate, holds a PC-relative displacement:
- `HUNK_RELOC16` ($3ED): patches UWORD at offset
- `HUNK_RELOC8` ($3EE): patches UBYTE at offset

These are rare in practice: a 16-bit field reaches only ±32 KB and an 8-bit field only ±128 bytes, so such references are normally resolved at link time rather than left for the loader.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_DREL32 — Short Relocation (32-bit)
|
||||
|
||||
`HUNK_DREL32` ($3F7) is an alternative relocation format used by some linkers (e.g., BLink) for smaller reloc tables:
|
||||
|
||||
```
|
||||
HUNK_DREL32
|
||||
|
||||
[Repeat:]
|
||||
<num_offsets> (WORD, not LONGWORD)
|
||||
<target_hunk> (WORD)
|
||||
<offset_0> (WORD)
|
||||
...
|
||||
|
||||
<0> terminator
|
||||
```
|
||||
|
||||
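Because the count, hunk index, and offsets are 16-bit words, the table is roughly half the size of the `HUNK_RELOC32` equivalent, but it can only describe patch sites at offsets below 64 KB. The patched values themselves are still 32-bit. After the terminating zero word the block is padded to a longword boundary. AmigaOS `InternalLoadSeg` supports both formats.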
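A reader for the 16-bit table might look like the following sketch. The function name `apply_reloc32_short` is hypothetical, and the code assumes the block is padded to a longword boundary after the zero terminator; verify that against `dos/doshunks.h` and real binaries before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Apply a short (16-bit) relocation table to `hunk`.
 * `tab` points just past the block id. Returns the number of bytes
 * consumed, rounded up to a longword boundary (assumed padding),
 * or 0 if the table is malformed. (Hypothetical helper.) */
size_t apply_reloc32_short(uint8_t *hunk, size_t hunk_size,
                           const uint32_t *seg_base, size_t num_hunks,
                           const uint8_t *tab, size_t tab_bytes)
{
    size_t pos = 0;
    for (;;) {
        if (tab_bytes - pos < 2) return 0;
        uint16_t count = rd16(tab + pos);
        pos += 2;
        if (count == 0) break;                    /* terminator */

        /* need the target word plus `count` offset words */
        if (tab_bytes - pos < 2 + 2 * (size_t)count) return 0;
        uint16_t target = rd16(tab + pos);
        pos += 2;
        if (target >= num_hunks) return 0;

        for (uint16_t i = 0; i < count; i++) {
            size_t off = rd16(tab + pos);
            pos += 2;
            if (hunk_size < 4 || off > hunk_size - 4) return 0;
            /* the patched value is still a full 32-bit address */
            wr32(hunk + off, rd32(hunk + off) + seg_base[target]);
        }
    }
    pos = (pos + 3) & ~(size_t)3;                 /* skip pad word, if any */
    return pos <= tab_bytes ? pos : 0;
}
```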
|
||||
|
||||
---
|
||||
|
||||
## PC-Relative References (No Relocation Needed)
|
||||
|
||||
Every 68k CPU supports **PC-relative addressing** with 16-bit displacements (±32 KB); the 68020+ adds 32-bit displacements:
|
||||
```asm
|
||||
LEA symbol(PC), A0 ; PC-relative load effective address
|
||||
MOVE.L data(PC), D0 ; PC-relative data read
|
||||
```
|
||||
|
||||
PC-relative references do not require relocation: the displacement is relative to the instruction, so it stays valid wherever the code is loaded, provided the target sits in the same hunk. **GCC for 68k** can generate PC-relative code when asked to (for example with `-fpic`), which significantly reduces the size of relocation tables.
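Whether a reference can use the compact `d16(PC)` form depends on the distance between the instruction and its target. The sketch below shows the range check an assembler or linker would make; `fits_pc16` is a hypothetical helper, and it assumes the 68k convention that the PC base is the address of the extension word (instruction address + 2).

```c
#include <stdint.h>

/* Decide whether `target_offset` is reachable from an extension word at
 * `ext_word_offset` (both hunk-relative byte offsets) with a signed
 * 16-bit PC-relative displacement. On success the displacement is
 * stored in *disp_out (if non-NULL) and 1 is returned; otherwise 0. */
int fits_pc16(uint32_t ext_word_offset, uint32_t target_offset,
              int16_t *disp_out)
{
    int64_t d = (int64_t)target_offset - (int64_t)ext_word_offset;
    if (d < INT16_MIN || d > INT16_MAX)
        return 0;                 /* out of range: needs an absolute reference */
    if (disp_out)
        *disp_out = (int16_t)d;
    return 1;
}
```

References that fail this check need either a 32-bit displacement (68020+) or an absolute address with a `HUNK_RELOC32` entry.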
|
||||
|
||||
SAS/C generates absolute references by default and relies heavily on `HUNK_RELOC32`.
|
||||
|
||||
---
|
||||
|
||||
## Relocation at Runtime — Segment Chain
|
||||
|
||||
The loader tracks loaded segments as a **BPTR chain** (singly-linked list). The segment list head is returned by `LoadSeg()`:
|
||||
|
||||
```
|
||||
Segment 0 (code):
|
||||
BPTR → Segment 1
|
||||
[code data]
|
||||
|
||||
Segment 1 (data):
|
||||
BPTR → 0 (NULL)
|
||||
[data]
|
||||
```
|
||||
|
||||
Each segment begins with a 4-byte BPTR to the next segment. Hunk index `n` corresponds to segment `n` in this chain.
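Walking the chain can be sketched as follows. A BPTR is a longword address (byte address divided by 4), so each link must be shifted left by two before it is dereferenced. To keep the example runnable off-target it works on a flat memory image rather than real Amiga pointers; `count_segments` and the image layout are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t get_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count the segments in a seglist stored in the memory image `mem`.
 * `seglist` is a BPTR to the first segment's link longword;
 * a BPTR of 0 ends the chain. (Hypothetical helper.) */
size_t count_segments(const uint8_t *mem, size_t mem_size, uint32_t seglist)
{
    size_t n = 0;
    while (seglist != 0) {
        /* reject links that point outside the image */
        if (mem_size < 4 || seglist > (mem_size - 4) / 4)
            break;
        size_t addr = (size_t)seglist << 2;   /* BPTR -> byte address */
        n++;
        if (n > mem_size / 4)                 /* guard against cyclic chains */
            break;
        seglist = get_be32(mem + addr);       /* next link sits at offset 0 */
    }
    return n;
}
```

On a real system the same loop would start from the value returned by `LoadSeg()`, and the code or data of each segment begins 4 bytes after its link field.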
|
||||
|
||||
---
|
||||
|
||||
## Viewing Relocations with Tools
|
||||
|
||||
### IDA Pro
|
||||
After loading a HUNK file with the Amiga plugin, IDA resolves relocations automatically. The fixup table is visible in `View → Open Subviews → Fixups`.
|
||||
|
||||
### hexdump + manual
|
||||
|
||||
Locate HUNK_RELOC32 ($3EC) in raw hex:
|
||||
```bash
|
||||
xxd mybinary | grep -i "0000 03ec"
|
||||
```
|
||||
|
||||
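Then read the `num_offsets` and `target_hunk` longwords that follow. A match can also be ordinary code or data that happens to contain the same bytes, so confirm it against the surrounding hunk structure.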
|
||||
|
||||
### hunkinfo (custom tool)
|
||||
|
||||
```bash
|
||||
hunkinfo mybinary # shows all hunks, sizes, reloc counts
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- NDK39: `dos/doshunks.h`
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS chapter, `InternalLoadSeg`
|
||||
- vlink linker documentation — relocation section
|
||||
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node01E0.html
|
||||
167
03_loader_and_exec_format/object_file_format.md
Normal file
|
|
@ -0,0 +1,167 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# Object File Format (HUNK_UNIT)
|
||||
|
||||
## Overview
|
||||
|
||||
Compiler output (`.o` files) uses `HUNK_UNIT` format — a variation of the HUNK executable format for **relocatable, unlinked code**. Multiple object files are combined by the linker to produce a final executable.
|
||||
|
||||
---
|
||||
|
||||
## Object File vs Executable
|
||||
|
||||
| Feature | Object file | Executable |
|---|---|---|
| Magic word | `HUNK_UNIT` ($3E7) | `HUNK_HEADER` ($3F3) |
| Unit name | Yes (HUNK_NAME) | No |
| External refs | `HUNK_EXT` (imports+exports) | None (resolved by linker) |
| Relocation | `HUNK_RELOC32` (partial) | `HUNK_RELOC32` (complete) |
| BSS | `HUNK_BSS` | `HUNK_BSS` |
| Loaded directly | No | Yes |
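Since the first longword distinguishes the file types, a tool can classify a file before parsing it. A minimal sketch, assuming the file's first bytes are already in a buffer; the enum and function names are made up for the example, while the IDs are the ones quoted in this document.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    KIND_UNKNOWN,
    KIND_OBJECT,      /* starts with HUNK_UNIT   ($3E7) */
    KIND_EXECUTABLE,  /* starts with HUNK_HEADER ($3F3) */
    KIND_LIBRARY      /* starts with HUNK_LIB    ($3FA) */
} HunkFileKind;

/* Classify a HUNK file by its first big-endian longword. */
HunkFileKind hunk_file_kind(const uint8_t *buf, size_t len)
{
    if (len < 4)
        return KIND_UNKNOWN;

    uint32_t id = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                  ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];

    switch (id) {
    case 0x000003E7u: return KIND_OBJECT;
    case 0x000003F3u: return KIND_EXECUTABLE;
    case 0x000003FAu: return KIND_LIBRARY;
    default:          return KIND_UNKNOWN;
    }
}
```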
|
||||
|
||||
---
|
||||
|
||||
## Object File Structure
|
||||
|
||||
```
|
||||
HUNK_UNIT ($000003E7) — identifies this as an object file
|
||||
HUNK_NAME ($000003E8) — optional: source/module name
|
||||
<num_longs>
|
||||
<name string>
|
||||
|
||||
--- For each code/data/bss section: ---
|
||||
|
||||
HUNK_CODE / HUNK_DATA / HUNK_BSS
|
||||
<data>
|
||||
|
||||
HUNK_RELOC32 — intra-object relocations (if any)
|
||||
|
||||
HUNK_EXT — external symbol definitions + references
|
||||
EXT_DEF _myfunc = offset X (export)
|
||||
EXT_REF32 _printf [offsets] (import)
|
||||
|
||||
HUNK_SYMBOL — optional local symbols
|
||||
|
||||
HUNK_END
|
||||
|
||||
--- Repeat for additional sections ---
|
||||
```
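Each `HUNK_EXT` entry starts with a longword that packs the symbol type into the top byte and the name length (in longwords) into the low 24 bits. The sketch below decodes that header. The type constants are quoted from memory of `dos/doshunks.h` and should be checked against the NDK; `ExtHeader` and `decode_ext_header` are hypothetical names.

```c
#include <stdint.h>

/* Symbol types as listed in dos/doshunks.h (verify against the NDK). */
enum {
    EXT_SYMB   = 0,    /* symbol table entry               */
    EXT_DEF    = 1,    /* relocatable definition (export)  */
    EXT_ABS    = 2,    /* absolute definition              */
    EXT_REF32  = 129,  /* 32-bit absolute reference        */
    EXT_COMMON = 130,  /* common block reference           */
    EXT_REF16  = 131,  /* 16-bit PC-relative reference     */
    EXT_REF8   = 132   /* 8-bit PC-relative reference      */
};

typedef struct {
    uint8_t  type;          /* one of the EXT_* values           */
    uint32_t name_longs;    /* length of the name in longwords   */
    int      is_reference;  /* non-zero for imports (type >= 128) */
} ExtHeader;

/* Split the first longword of a HUNK_EXT entry into its fields.
 * A longword of 0 terminates the HUNK_EXT block. */
ExtHeader decode_ext_header(uint32_t longword)
{
    ExtHeader h;
    h.type         = (uint8_t)(longword >> 24);
    h.name_longs   = longword & 0x00FFFFFFu;
    h.is_reference = (h.type & 0x80) != 0;
    return h;
}
```

After the header comes the name, padded with zero bytes to a whole number of longwords. A definition is then followed by its value, a reference by a count and that many offsets.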
|
||||
|
||||
---
|
||||
|
||||
## Multi-Section Object Files
|
||||
|
||||
A single `.o` file can contain multiple code/data sections. Each section is a separate hunk terminated by `HUNK_END`:
|
||||
|
||||
```
|
||||
HUNK_UNIT
|
||||
HUNK_NAME "mymodule.c"
|
||||
HUNK_CODE ; section 0: main code
|
||||
[code...]
|
||||
HUNK_EXT ; exports/imports for section 0
|
||||
HUNK_END
|
||||
HUNK_CODE ; section 1: __initdata (constructor table)
|
||||
[init code...]
|
||||
HUNK_END
|
||||
HUNK_DATA ; section 2: initialized data
|
||||
[data...]
|
||||
HUNK_RELOC32 ; internal relocs for section 2
|
||||
HUNK_END
|
||||
HUNK_BSS ; section 3: uninitialized data
|
||||
HUNK_END
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Compiler Section Naming Convention
|
||||
|
||||
Different compilers use different conventions for code/data section organization:
|
||||
|
||||
### SAS/C 6.x
|
||||
|
||||
| Section | Type | Contents |
|---|---|---|
| `.text` | HUNK_CODE | Compiled functions |
| `.data` | HUNK_DATA | Initialized globals |
| `.bss` | HUNK_BSS | Uninitialized globals |
| `__CSEG` | HUNK_CODE | (alternate) code |
| `__DSEG` | HUNK_DATA | (alternate) data |

SAS/C 6.x embeds each section's name in a `HUNK_NAME` block placed ahead of that section.
|
||||
|
||||
### GCC (m68k-amigaos)
|
||||
|
||||
GCC emits more sections:
|
||||
```
|
||||
.text — code
|
||||
.data — initialized data
|
||||
.bss — BSS
|
||||
.rodata — read-only data (string literals, const)
|
||||
.ctors — C++ constructor table
|
||||
.dtors — C++ destructor table
|
||||
```
|
||||
|
||||
VBCC follows a similar scheme to GCC.
|
||||
|
||||
---
|
||||
|
||||
## Library Archives (.lib)
|
||||
|
||||
A `.lib` file is an **archive of object files**, using `HUNK_LIB` ($3FA) and `HUNK_INDEX` ($3FB):
|
||||
|
||||
```
|
||||
HUNK_LIB ($000003FA)
|
||||
<total_size> size of all following library data in longs
|
||||
[HUNK_UNIT blocks for each member ...]
|
||||
|
||||
HUNK_INDEX ($000003FB)
|
||||
<size_in_longs> size of index data
|
||||
[index entries...]
|
||||
```
|
||||
|
||||
The index maps symbol names to their containing `HUNK_UNIT` within the library, allowing the linker to extract only the needed object files.
|
||||
|
||||
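Link libraries shipped with SAS/C use this format. Other toolchains may use different archive formats; GCC, for example, conventionally uses `ar` archives with a `.a` suffix.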
|
||||
|
||||
---
|
||||
|
||||
## Linker Operation (Overview)
|
||||
|
||||
The linker (`slink`, `blink`, `ld`) processes object files:
|
||||
|
||||
1. **Collect all HUNK_EXT exports** from every `.o` into a global symbol table
|
||||
2. **Resolve HUNK_EXT imports** — for each `EXT_REF32`, find the defining object
|
||||
3. **Pull in library members** — if an imported symbol lives in a `.lib`, add that object file
|
||||
4. **Merge sections** — combine all `.text` hunks into one code hunk, all `.data` into one data hunk, etc.
|
||||
5. **Emit HUNK_RELOC32** — for each resolved external reference, emit a relocation entry
|
||||
6. **Write HUNK_HEADER** — calculate final hunk sizes and write executable header
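Steps 1, 2 and 5 can be illustrated with a toy symbol table. This is a simplified sketch, not how `slink`, `blink` or `ld` are implemented: `SymDef`, `find_symbol` and `resolve_ref32` are hypothetical names, and section merging is assumed to have already produced final hunk numbers and offsets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One exported symbol after section merging: where it ended up. */
typedef struct {
    const char *name;
    uint32_t    hunk;    /* output hunk index             */
    uint32_t    offset;  /* byte offset within that hunk  */
} SymDef;

/* Steps 1-2: look an imported name up among the collected exports. */
const SymDef *find_symbol(const SymDef *defs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(defs[i].name, name) == 0)
            return &defs[i];
    return NULL;
}

/* Step 5 in miniature: patch one 32-bit reference site so it holds the
 * target's hunk-relative address (existing addend + symbol offset) and
 * report which hunk the HUNK_RELOC32 entry must name. The loader adds
 * that hunk's base address later. Returns 0 on success, -1 if the
 * symbol is unresolved. */
int resolve_ref32(const SymDef *defs, size_t n, const char *name,
                  uint8_t site[4], uint32_t *target_hunk)
{
    const SymDef *d = find_symbol(defs, n, name);
    if (d == NULL)
        return -1;

    uint32_t addend = ((uint32_t)site[0] << 24) | ((uint32_t)site[1] << 16) |
                      ((uint32_t)site[2] << 8)  |  (uint32_t)site[3];
    uint32_t value = addend + d->offset;

    site[0] = (uint8_t)(value >> 24);
    site[1] = (uint8_t)(value >> 16);
    site[2] = (uint8_t)(value >> 8);
    site[3] = (uint8_t)value;

    *target_hunk = d->hunk;
    return 0;
}
```

An unresolved name (return value -1) is what triggers step 3: the linker searches its libraries for a member that defines the symbol.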
|
||||
|
||||
---
|
||||
|
||||
## Inspecting Object Files
|
||||
|
||||
### hexdump
|
||||
|
||||
```bash
|
||||
xxd myfile.o | head -40 # look for $000003E7 (HUNK_UNIT) at start
|
||||
```
|
||||
|
||||
### hunkinfo (community tool)
|
||||
|
||||
```bash
|
||||
hunkinfo myfile.o # lists all hunks, sizes, symbols, externals
|
||||
```
|
||||
|
||||
### IDA Pro
|
||||
|
||||
IDA can load `.o` files directly using the Amiga HUNK loader plugin — useful for reversing library object files without full executable context.
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- NDK39: `dos/doshunks.h`
|
||||
- SAS/C 6.x Programmer's Guide — object file chapter
|
||||
- VBCC documentation — object file format
|
||||
- vlink linker manual (covers HUNK_LIB/HUNK_INDEX): http://sun.hasenbraten.de/vlink/
|
||||
- GCC m68k-amigaos port documentation
|
||||
83
03_loader_and_exec_format/overlay_system.md
Normal file
|
|
@ -0,0 +1,83 @@
|
|||
[← Home](../README.md) · [Loader & HUNK Format](README.md)
|
||||
|
||||
# HUNK_OVERLAY — Overlay System
|
||||
|
||||
## Overview
|
||||
|
||||
The **overlay system** allows programs larger than available RAM to run by dividing code into **segments loaded on demand**. Only part of the overlaid code is in memory at any time; other overlay nodes are loaded from disk when needed, replacing nodes that are no longer required.

This predates virtual memory and was commonly used in A500-era applications with limited RAM.
|
||||
|
||||
---
|
||||
|
||||
## When Overlays Are Used
|
||||
|
||||
- Application code exceeds available RAM
|
||||
- Rarely-used code paths (setup, error handling) should not occupy memory permanently
|
||||
- The game/app has a fixed resident core and multiple interchangeable level/module overlays
|
||||
|
||||
---
|
||||
|
||||
## HUNK_OVERLAY Structure
|
||||
|
||||
```
|
||||
HUNK_HEADER
|
||||
(normal header for resident hunks)
|
||||
|
||||
[Normal hunks: code, data, BSS]
|
||||
|
||||
HUNK_OVERLAY ($000003F5)
|
||||
<size_in_longs> total size of overlay table data
|
||||
<overlay_tree...> tree of overlay nodes
|
||||
|
||||
HUNK_BREAK ($000003F6) marks end of overlay tree
|
||||
```
|
||||
|
||||
### Overlay Tree Format
|
||||
|
||||
The overlay tree describes groups of overlays and their dependencies:
|
||||
|
||||
```
|
||||
<num_overlay_nodes>
|
||||
For each node:
|
||||
<num_hunks> number of hunks in this overlay
|
||||
<hunk_sizes...> size of each hunk in longwords
|
||||
<hunk_memory_types...> memory requirements
|
||||
```
|
||||
|
||||
The resident (non-overlay) hunks are hunk 0 through N. The overlay hunks are numbered starting at N+1.
|
||||
|
||||
---
|
||||
|
||||
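## Runtime Overlay Support

AmigaOS supports overlays through `dos.library`: `LoadSeg` loads only the resident hunks of an overlaid file, and `InternalLoadSeg` takes an overlay table argument for the same purpose. Swapping is not done through a separate public library call. Instead, the linker adds an overlay manager to the resident part of the program and routes calls to overlaid functions through it; the manager loads the required node from the program file.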
|
||||
|
||||
In practice, the overlay system is complex and rarely documented precisely. Most real Amiga applications avoid overlays in favour of:
|
||||
- Shared libraries loaded on demand (`OpenLibrary`)
- Splitting functionality into separate executables run via `Execute`
|
||||
|
||||
---
|
||||
|
||||
## Practical Alternative: Dynamic Linking
|
||||
|
||||
Modern Amiga development (and OS 3.1+ best practices) uses `OpenLibrary()` instead of overlays:
|
||||
|
||||
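```c
/* "mycode.library" and MyFunction() are placeholders for your own library.
   Library entry points are reached through the jump table at negative
   offsets from the base (LVOs), normally via compiler-specific stubs or
   pragmas, not through function pointers stored in the base structure. */
struct Library *MyCodeBase = OpenLibrary("mycode.library", 0);
if (MyCodeBase) {
    MyFunction(arg1, arg2);   /* stub picks up MyCodeBase implicitly */
    CloseLibrary(MyCodeBase);
}
```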
|
||||
|
||||
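This serves a similar purpose to overlay loading, but the library is reference-counted by Exec, can be shared by several callers at once, and can be expunged when memory runs low. AmigaOS does not track resources per task, so every successful `OpenLibrary()` still needs a matching `CloseLibrary()`.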
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS overlay section
|
||||
- NDK39: `dos/doshunks.h` — HUNK_OVERLAY, HUNK_BREAK
|
||||
- Paul Tuma's Amiga HUNK format notes (community)
|
||||