More information. Especially about RE and AREXX automation

Ilia Sharin 2026-04-29 21:42:28 -04:00
parent 05452c6c12
commit 0aafceb31e
30 changed files with 8826 additions and 66 deletions

@@ -96,7 +96,10 @@ print(f'{sys.argv[1]}: {names.get(tag, f\"UNKNOWN ({tag:#010x})\")}')
|---|---|---|---|---|
| `$3E9` | 1001 | `HUNK_CODE` | `[tag] [size_longs] [code_bytes × size×4]` | **Machine-code section.** The loader allocates RAM, copies the bytes, then applies any `HUNK_RELOC32` that follows. Holds 68k instructions — never data. |
| `$3EA` | 1002 | `HUNK_DATA` | `[tag] [size_longs] [data_bytes × size×4]` | **Initialized read/write data.** Global variables with non-zero values, string literals, jump tables, etc. Any embedded pointers to other hunks require `HUNK_RELOC32` fixups. |
| `$3EB` | 1003 | `HUNK_BSS` | `[tag] [size_longs]` *(no data bytes)* | **Uninitialized data (zero-fill).** Only the size is stored — no bytes in the file. The loader calls `AllocMem(..., MEMF_CLEAR)`. A 64 KB zero array costs 4 bytes on disk. |
| `$3EB` | 1003 | `HUNK_BSS` | `[tag] [size_longs]` *(no data bytes)* | **Uninitialized data (zero-fill).** Only the size is stored — no bytes in the file. The loader calls `AllocMem(..., MEMF_CLEAR)`. A 64 KB zero array costs 4 bytes on disk. **No relocation follows BSS hunks** — there are no initialized values to fix up. |
> [!NOTE]
> **HUNK_DATA trailing space:** Data hunks have been observed with trailing `ds.width` variables that do not contribute to the local hunk length declared in the `HUNK_DATA` header, but are accounted for in the `HUNK_HEADER` size table. The OS loader allocates based on the header size table, so the extra space is available at runtime even though the hunk's own `num_longs` field doesn't include it.
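
The size mismatch described in the note can be checked with a small calculation. The following is a minimal sketch, not loader code: the function name and both parameters are hypothetical, and it assumes that both size longwords count in units of 4 bytes and carry memory flags in their top two bits.

```python
SIZE_MASK = 0x3FFFFFFF  # assumption: top two bits of a size longword are memory flags


def trailing_bytes(header_size_longs: int, hunk_size_longs: int) -> int:
    """Return how many bytes the loader allocates beyond the hunk's stored data.

    header_size_longs: entry for this hunk in the HUNK_HEADER size table.
    hunk_size_longs:   size field stored in the HUNK_DATA record itself.
    """
    allocated = (header_size_longs & SIZE_MASK) * 4
    stored = (hunk_size_longs & SIZE_MASK) * 4
    if stored > allocated:
        raise ValueError("hunk stores more data than the header table allocates")
    return allocated - stored


if __name__ == "__main__":
    # 16 longwords allocated, 12 stored: 16 bytes of trailing space.
    print(trailing_bytes(0x40000010, 12))
```

A non-zero result indicates trailing space that exists only at runtime, which is worth keeping in mind when mapping file offsets to loaded addresses.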
---
@@ -112,7 +115,7 @@ print(f'{sys.argv[1]}: {names.get(tag, f\"UNKNOWN ({tag:#010x})\")}')
| `$3F7` | 1015 | `HUNK_DREL32` | — | WORD (16-bit) | **Compact 32-bit reloc.** Same semantics as `HUNK_RELOC32` but count, hunk index, and offsets are stored as 16-bit WORDs, halving the table size. Valid only when all hunk offsets fit in 16 bits (hunk < 64 KB). Generated by BLink. |
| `$3F8` | 1016 | `HUNK_DREL16` | — | WORD (16-bit) | Compact 16-bit reloc with WORD-sized fields. Very rare. |
| `$3F9` | 1017 | `HUNK_DREL8` | — | WORD (16-bit) | Compact 8-bit reloc with WORD-sized fields. Essentially unused. |
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | — | WORD (16-bit) | **Compact absolute 32-bit reloc with WORD offsets.** Semantically identical to `HUNK_RELOC32` with WORD fields. Default output of vasm/vlink when all offsets fit in 16 bits. Preferred over `HUNK_DREL32` in OS 3.x-era tools. |
| `$3FC` | 1020 | `HUNK_RELOC32SHORT` | — | WORD (16-bit) | **Compact absolute 32-bit reloc with WORD offsets.** Semantically identical to `HUNK_RELOC32` with WORD fields. Default output of vasm/vlink when all offsets fit in 16 bits. Preferred over `HUNK_DREL32` in OS 3.x-era tools. **After the table, if the total WORD count is odd, a padding WORD (`$0000`) restores longword alignment** before the next hunk record. |
| `$3FD` | 1021 | `HUNK_RELRELOC32` | — | LONG (32-bit) | **PC-relative 32-bit reloc.** Patch: `*(LONG*)(base+off) += target_base - (base+off+4)`. Used by GCC `-fPIC` and PIC shared libraries. |
| `$3FE` | 1022 | `HUNK_ABSRELOC16` | — | LONG (32-bit) | **Absolute 16-bit fixup.** Patches a UWORD with the low 16 bits of the target's absolute address. Required for `MOVE.W #abs_addr,Dn` patterns. Rare. |
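
The padding rule for `HUNK_RELOC32SHORT` is easy to get wrong in a parser, so here is a minimal sketch of reading one table. It assumes the usual group layout (`count`, `hunk index`, then `count` offsets, all big-endian WORDs, terminated by `count = 0`) and that `pos` points just past the longword-aligned hunk tag. The function name and return shape are illustrative, not taken from any existing tool.

```python
import struct


def parse_reloc32short(buf: bytes, pos: int = 0):
    """Parse WORD-sized relocation groups starting at pos.

    Returns (groups, next_pos), where groups is a list of
    (target_hunk, [offsets]) and next_pos is longword-aligned.
    """
    start = pos
    groups = []
    while True:
        (count,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        if count == 0:  # terminator WORD
            break
        (target,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        offsets = list(struct.unpack_from(f">{count}H", buf, pos))
        pos += 2 * count
        groups.append((target, offsets))
    # An odd number of WORDs leaves pos misaligned: skip the padding WORD.
    if (pos - start) % 4:
        pos += 2
    return groups, pos


if __name__ == "__main__":
    table = struct.pack(">6H", 2, 1, 0x10, 0x20, 0, 0)  # 5 WORDs + 1 pad
    print(parse_reloc32short(table))
```

Returning the aligned position lets the caller read the next hunk tag directly, without tracking alignment separately.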
@@ -135,7 +138,26 @@ print(f'{sys.argv[1]}: {names.get(tag, f\"UNKNOWN ({tag:#010x})\")}')
| Hex | Dec | Constant | Payload | Description |
|---|---|---|---|---|
| `$3F0` | 1008 | `HUNK_SYMBOL` | `[tag] { [namelen_longs] [name_bytes…] [value] } … [0]` | **Local symbol table.** Maps label names → offsets within this hunk. Consumed by MonAm, wack, IDA Pro. Terminated by `namelen=0`. |
| `$3F1` | 1009 | `HUNK_DEBUG` | `[tag] [size_longs] [format_tag] [data_bytes…]` | **Opaque debug block.** The leading `format_tag` longword identifies the format: `$3D415053` = SAS/C stabs; `$3D474343` = GCC stabs; `$3D574152` = Warp/Storm C. See [`hunk_debug_info.md`](hunk_debug_info.md). |
| `$3F1` | 1009 | `HUNK_DEBUG` | `[tag] [size_longs] [format_tag] [data_bytes…]` | **Opaque debug block.** The leading `format_tag` longword identifies the debug data encoding — see [Debug Format Tags](#debug-format-tags) below for the full table. See [`hunk_debug_info.md`](hunk_debug_info.md) for stabs record layout. |
#### Debug Format Tags
The first longword after the size field in a `HUNK_DEBUG` block is a 4-character ASCII **format tag** identifying the debug data encoding:
| Format tag (hex) | ASCII | Compiler / Assembler | Description |
|---|---|---|---|
| `$3D415053` | `=APS` | SAS/C 6.x | SAS/C stabs debug symbols |
| `$3D474343` | `=GCC` | GCC (m68k-amigaos) | GCC stabs debug symbols |
| `$3D574152` | `=WAR` | Storm C / Warp C | Storm C / Warp C debug symbols |
| `$48434C4E` | `HCLN` | Devpac | Devpac assembler — source file name record |
| `$48454144` | `HEAD` | Devpac | Devpac assembler — start of source file marker |
| `$4C494E45` | `LINE` | Generic / multiple | Line-number debug info (used by several assemblers) |
| `$4F444546` | `ODEF` | BAsm | BAsm assembler debug symbols |
| `$4F505453` | `OPTS` | SAS/C | SAS/C compiler options record |
| `$5A4D4147` | `ZMAG` | GNU tools (ld) | GNU ZMAGIC debug hunk (full 6-byte tag `ZMAGIC`) |
> [!NOTE]
> `dos.library` v31+ treats **any** hunk ID whose lower 29 bits exceed `HUNK_ABSRELOC16` (`$3FE` / 1022) as a `HUNK_DEBUG` block and silently skips it — unless bit 29 is set, which causes `ERROR_BAD_HUNK`. This allows compilers to emit custom debug hunk types that newer loaders ignore transparently.
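
As an illustration of how the format tag table might be used, the sketch below looks up the first longword of a `HUNK_DEBUG` payload. The tag values are copied from the table above; the dictionary and function names are hypothetical, and `payload` is assumed to be the bytes that follow the `size_longs` field.

```python
import struct

# Values taken from the Debug Format Tags table; names are illustrative.
DEBUG_FORMAT_TAGS = {
    0x3D415053: "SAS/C stabs debug symbols",
    0x3D474343: "GCC stabs debug symbols",
    0x3D574152: "Storm C / Warp C debug symbols",
    0x48434C4E: "Devpac source file name record",
    0x48454144: "Devpac start of source file marker",
    0x4C494E45: "Line-number debug info",
    0x4F444546: "BAsm assembler debug symbols",
    0x4F505453: "SAS/C compiler options record",
    0x5A4D4147: "GNU ZMAGIC debug hunk",
}


def identify_debug_format(payload: bytes) -> tuple[str, str]:
    """Return (ascii_tag, description) for a HUNK_DEBUG payload."""
    if len(payload) < 4:
        raise ValueError("payload too short to contain a format tag")
    (tag,) = struct.unpack_from(">I", payload, 0)  # big-endian longword
    ascii_tag = payload[:4].decode("ascii", errors="replace")
    return ascii_tag, DEBUG_FORMAT_TAGS.get(tag, "unknown format")


if __name__ == "__main__":
    print(identify_debug_format(bytes.fromhex("3D474343") + b"\x00" * 4))
```

Unknown tags are reported rather than rejected, since a debug block is opaque to the loader and can safely be skipped.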
---
@@ -163,6 +185,30 @@ print(f'{sys.argv[1]}: {names.get(tag, f\"UNKNOWN ({tag:#010x})\")}')
| `$3FB` | 1019 | `HUNK_INDEX` | **Symbol index for `HUNK_LIB`.** A packed string table plus a per-unit map of exported symbol names → unit byte offsets. Lets the linker locate a function without scanning every object in the archive. Always immediately follows the `HUNK_LIB` it describes. |
### Hunk ID Bit Masking
After the initial `HUNK_HEADER`, the OS loader (`dos.library`) only examines the **lower 29 bits** of each hunk type longword. The upper bits encode memory placement flags (see [Memory Placement Flags](#memory-placement-flags) below). This has two important consequences:
1. **Unknown hunk types become debug.** `dos.library` v31+ treats any hunk ID whose lower 29 bits exceed `HUNK_ABSRELOC16` (`$3FE` / 1022) as a `HUNK_DEBUG` block and silently skips it. This allows compilers to emit custom debug hunk types that newer loaders ignore without error.
2. **Bit 29 set → load failure.** If a hunk ID has bit 29 set but is not a recognized code/data/BSS type, the loader **fails** with `ERROR_BAD_HUNK` rather than treating it as debug.
```c
/* Typical loader logic (dos.library v31+) */
hunk_id = read_uint32(f);
if (hunk_id == HUNK_HEADER) { ... }   /* first hunk only — full 32 bits */

/* After HUNK_HEADER: mask memory flags, check range */
hunk_id &= 0x3FFFFFFF;                /* keep lower 30 bits */
if (hunk_id > HUNK_ABSRELOC16) {      /* unknown type */
    /* Bit 29 is HUNKF_ADVISORY in dos/doshunks.h; HUNKF_FAST is
       bit 31 and has already been cleared by the mask above. */
    if (hunk_id & HUNKF_ADVISORY)     /* bit 29 set? */
        return ERROR_BAD_HUNK;        /* hard error */
    /* else: treat as HUNK_DEBUG — skip silently */
}
```
> [!NOTE]
> The masking (typically `& 0x3FFFFFFF`) keeps 30 bits, not 29 as the simplified description suggests. The practical rule: after `HUNK_HEADER`, memory flag bits are stripped before the type code comparison.
### Memory Placement Flags
@@ -534,3 +580,4 @@ block-beta
- NDK39: `dos/doshunks.h` — hunk type constants
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node01E0.html
- Community reference: http://sun.hasenbraten.de/vlink/release/vlink.pdf (HUNK format appendix)
- http://amiga-dev.wikidot.com/file-format:hunk — HUNK format reference with Python parsing code, debug format tags, and dos.library v31+ compatibility notes