mirror of
https://github.com/alfishe/amiga-bootcamp.git
synced 2026-06-13 00:26:28 +00:00
03/04: deep enrichment of loader/exec format and linking/libraries
Sections 03 and 04 augmented to bootcamp quality with targeted enrichment based on content analysis (not just file size).

03_loader_and_exec_format:
- overlay_system.md: full rewrite (tree architecture diagram, HUNK_OVERLAY binary format, overlay manager runtime internals, worked binary example, linker support, modern alternatives)
- hunk_relocation.md: full rewrite (visual before/after diagram, patching algorithm with code, RELOC32SHORT and DREL32 formats, PC-relative impact comparison table, self-referencing relocs, error scenarios, Python reloc scanner tool)

04_linking_and_libraries:
- library_structure.md: full rewrite (ASCII memory layout diagram, JMP table encoding (why 6 bytes), MakeLibrary internals with both function array formats, complete library creation example with .fd file, checksum verification, lifecycle state diagram)
- shared_libraries_runtime.md: full rewrite (OpenLibrary 4-step resolution path, ramlib disk loader internals, disk search path, version negotiation table (v33-v47), CloseLibrary/Expunge deep dive, memory-low sweep, common pitfalls table)
- register_conventions.md: full rewrite (FPU register map, inter-library A6 save/restore pattern, small-data model with __saveds keyword, varargs/TagItem pattern deep dive, stack-based wrapper explanation, disassembly identification)

Updated indexes:
- 03_loader_and_exec_format/README.md
- 04_linking_and_libraries/README.md
- Root README.md (sections 03 and 04)
parent 99a6d53f57
commit 7df1f11f15
8 changed files with 1556 additions and 360 deletions
|
|
@ -4,23 +4,65 @@
|
|||
|
||||
## Overview
|
||||
|
||||
Relocation is the process of **patching absolute addresses** in a loaded executable to reflect its actual memory location. Since AmigaOS allocates memory dynamically, a program cannot know its load address at compile time — all inter-hunk references must be fixed up at runtime.
|
||||
Relocation is the process of **patching absolute addresses** in a loaded executable to reflect its actual memory location. Since AmigaOS allocates memory dynamically via `AllocMem()`, a program cannot know its load address at compile time — all inter-hunk references must be fixed up at runtime by the loader.
|
||||
|
||||
---
|
||||
|
||||
## Why Relocation Is Necessary
|
||||
|
||||
An Amiga executable contains references like:
|
||||
```asm
|
||||
LEA DataTable(PC), A0 ; PC-relative — no relocation needed
|
||||
MOVE.L #DataTable, A0 ; Absolute — MUST be relocated
|
||||
Consider a program with a code hunk and a data hunk:
|
||||
|
||||
```c
|
||||
/* Source code: */
|
||||
const char message[] = "Hello"; /* in data hunk */
|
||||
void foo(void) {
|
||||
puts(message); /* code references data hunk — absolute address needed */
|
||||
}
|
||||
```
|
||||
|
||||
The linker places `DataTable` at some hunk-relative offset (e.g., offset 0 in the data hunk). The absolute address is only known at load time. The relocation table tells the loader which longwords in the code contain these absolute values.
|
||||
The linker places `message` at offset 0 in the data hunk. But the absolute address of `message` depends on where `AllocMem` places the data hunk at runtime — this is unknown at link time.
|
||||
|
||||
```asm
|
||||
; What the linker writes (before loading):
|
||||
foo:
|
||||
PEA $00000000 ; ← linker writes offset 0 (within data hunk)
|
||||
JSR _puts
|
||||
ADDQ.L #4, SP
|
||||
|
||||
; After loading (data hunk loaded at $00040000):
|
||||
foo:
|
||||
PEA $00040000 ; ← loader patches: 0 + $40000 = $40000
|
||||
JSR _puts
|
||||
ADDQ.L #4, SP
|
||||
```
|
||||
|
||||
The relocation table tells the loader **which bytes** to patch and **what base address** to add.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC32 Format
|
||||
## Visual: Before and After Relocation
|
||||
|
||||
```
|
||||
BEFORE (raw file): AFTER (loaded at runtime):
|
||||
Code hunk loaded at $00020000
|
||||
Data hunk loaded at $00040000
|
||||
|
||||
Code Hunk: Code Hunk:
|
||||
offset $00: MOVEQ #0, D0 offset $00: MOVEQ #0, D0
|
||||
offset $04: LEA $00000000, A0 ←─┐ offset $04: LEA $00040000, A0 ✓ patched
|
||||
offset $0A: BSR $00000020 │ offset $0A: BSR $00000020
|
||||
offset $0E: MOVE.L $00000010, D1 ←┤ offset $0E: MOVE.L $00040010, D1 ✓ patched
|
||||
│
|
||||
HUNK_RELOC32: │
|
||||
target_hunk = 1 (data) │
|
||||
offsets = [$06, $10] ─────────┘ Relocation adds $00040000 at these offsets
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC32 Format ($3EC)
|
||||
|
||||
The most common relocation type:
|
||||
|
||||
```
|
||||
HUNK_RELOC32 ($000003EC)
|
||||
|
|
@ -28,105 +70,204 @@ HUNK_RELOC32 ($000003EC)
|
|||
[Repeat until terminator:]
|
||||
<num_offsets> Number of longword addresses to patch for this target hunk
|
||||
<target_hunk> Index of the hunk whose base address is added
|
||||
<offset_0> Byte offset within the current hunk to patch
|
||||
<offset_0> Byte offset within the CURRENT hunk to patch
|
||||
<offset_1>
|
||||
...
|
||||
|
||||
<0> num_offsets = 0 terminates the reloc list
|
||||
HUNK_END ($000003F2)
|
||||
```
|
||||
|
||||
### Patching Algorithm
|
||||
|
||||
For each entry in HUNK_RELOC32 of hunk `H`:
|
||||
```
|
||||
foreach (target_hunk, offsets[]):
|
||||
base = segment_base_address[target_hunk]
|
||||
foreach offset in offsets:
|
||||
*(ULONG *)(H_base + offset) += base
|
||||
```c
|
||||
/* For each entry group in HUNK_RELOC32 of hunk H: */
|
||||
for (;;)                                   /* groups repeat until a zero count */
|
||||
{
|
||||
ULONG count = Read32(); /* how many patch sites */
|
||||
if (count == 0) break; /* terminator */
|
||||
ULONG target_hunk = Read32(); /* which hunk's base to add */
|
||||
ULONG target_base = segment_base_address[target_hunk];
|
||||
|
||||
for (ULONG i = 0; i < count; i++)
|
||||
{
|
||||
ULONG offset = Read32(); /* byte offset within current hunk */
|
||||
ULONG *patch = (ULONG *)(hunk_H_base + offset);
|
||||
*patch += target_base; /* add actual load address */
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The value at `H_base + offset` already contains the **hunk-relative** address written by the linker. Adding the actual base produces the final absolute address.
|
||||
The value at the patch site already contains the **hunk-relative offset** written by the linker. Adding the target hunk's actual base address produces the final absolute address.
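The same loop can be sketched in Python over in-memory buffers. This is a minimal illustration, not the loader's actual code: `apply_reloc32`, `hunks`, and `bases` are hypothetical names, and the table is assumed to begin immediately after the `$3EC` tag.

```python
import struct

def apply_reloc32(table, hunks, bases, current):
    """Apply a HUNK_RELOC32 body (the bytes after the $3EC tag) to hunks[current].

    table   -- big-endian longwords: count, target_hunk, offsets..., 0
    hunks   -- list of bytearray, one per hunk (hypothetical in-memory images)
    bases   -- list of load addresses, one per hunk
    Returns the number of bytes of `table` consumed.
    """
    pos = 0
    while True:
        (count,) = struct.unpack_from(">I", table, pos)
        pos += 4
        if count == 0:                      # a zero count terminates the list
            return pos
        (target,) = struct.unpack_from(">I", table, pos)
        pos += 4
        for _ in range(count):
            (offset,) = struct.unpack_from(">I", table, pos)
            pos += 4
            (value,) = struct.unpack_from(">I", hunks[current], offset)
            # Add the target hunk's base, wrapping to 32 bits
            patched = (value + bases[target]) & 0xFFFFFFFF
            struct.pack_into(">I", hunks[current], offset, patched)


# Two-hunk layout: code at $20000, data at $30000, two patch sites in code
code = bytearray(128)
struct.pack_into(">I", code, 0x2C, 0x10)    # linker wrote "data offset $10"
table = struct.pack(">5I", 2, 1, 0x18, 0x2C, 0)
consumed = apply_reloc32(table, [code, bytearray(64)], [0x20000, 0x30000], 0)
print(hex(struct.unpack_from(">I", code, 0x18)[0]))
print(hex(struct.unpack_from(">I", code, 0x2C)[0]))
```

Returning the number of bytes consumed lets a caller continue parsing at the next hunk tag without re-scanning the table.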
|
||||
|
||||
### Example
|
||||
### Worked Example — Two Hunks
|
||||
|
||||
Code hunk references data hunk at two sites:
|
||||
```
|
||||
Before load (raw file values):
|
||||
code[0x18] = $00000000 ; linker placed "data offset 0" here
|
||||
code[0x2C] = $00000010 ; linker placed "data offset 0x10" here
|
||||
File layout:
|
||||
Hunk 0 (CODE): 128 bytes, references data at offsets $18 and $2C
|
||||
Hunk 1 (DATA): 64 bytes
|
||||
|
||||
HUNK_RELOC32:
|
||||
num_offsets = 2
|
||||
target_hunk = 1 ; data hunk
|
||||
offsets = [0x18, 0x2C]
|
||||
Loaded at runtime:
|
||||
Hunk 0 → $00020000 (code)
|
||||
Hunk 1 → $00030000 (data)
|
||||
|
||||
After load (data hunk loaded at $20000):
|
||||
code[0x18] = $00000000 + $20000 = $00020000
|
||||
code[0x2C] = $00000010 + $20000 = $00020010
|
||||
HUNK_RELOC32 for Hunk 0:
|
||||
$00000002 ; 2 offsets to patch
|
||||
$00000001 ; target = hunk 1 (data)
|
||||
$00000018 ; patch at code+$18
|
||||
$0000002C ; patch at code+$2C
|
||||
$00000000 ; terminator
|
||||
|
||||
Before patch:
|
||||
code[$18] = $00000000 (data offset 0)
|
||||
code[$2C] = $00000010 (data offset $10)
|
||||
|
||||
After patch:
|
||||
code[$18] = $00000000 + $00030000 = $00030000 ✓
|
||||
code[$2C] = $00000010 + $00030000 = $00030010 ✓
|
||||
```
|
||||
|
||||
### Self-Referencing (Intra-Hunk) Relocation
|
||||
|
||||
Code can also reference its own hunk:
|
||||
|
||||
```
|
||||
HUNK_RELOC32 for Hunk 0:
|
||||
$00000001 ; 1 offset
|
||||
$00000000 ; target = hunk 0 (self!)
|
||||
$00000044 ; patch at code+$44
|
||||
$00000000 ; terminator
|
||||
|
||||
This happens when code contains an absolute reference to a label
|
||||
within the same hunk (e.g., a jump table with absolute addresses).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC16 and HUNK_RELOC8
|
||||
## HUNK_RELOC32SHORT ($3FC) — Compact Variant
|
||||
|
||||
Same format as HUNK_RELOC32 but patch **16-bit** or **8-bit** values:
|
||||
- `HUNK_RELOC16` ($3ED): patches UWORD at offset
|
||||
- `HUNK_RELOC8` ($3EE): patches UBYTE at offset
|
||||
|
||||
These are rare in practice — the 68000 requires even-aligned word accesses and only supports 16-bit displacement in most addressing modes.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_DREL32 — Short Relocation (32-bit)
|
||||
|
||||
`HUNK_DREL32` ($3F7) is an alternative relocation format used by some linkers (e.g., BLink) for smaller reloc tables:
|
||||
Uses 16-bit values instead of 32-bit for offsets:
|
||||
|
||||
```
|
||||
HUNK_DREL32
|
||||
HUNK_RELOC32SHORT ($000003FC)
|
||||
|
||||
[Repeat:]
|
||||
<num_offsets> (WORD, not LONGWORD)
|
||||
<target_hunk> (WORD)
|
||||
<offset_0> (WORD)
|
||||
<num_offsets> (UWORD — 16-bit!)
|
||||
<target_hunk> (UWORD)
|
||||
<offset_0> (UWORD)
|
||||
...
|
||||
|
||||
<0> terminator
|
||||
<0> UWORD terminator
|
||||
[padding to longword boundary if needed]
|
||||
```
|
||||
|
||||
By using 16-bit values, this format is more compact for programs with many relocations and small hunk sizes (<64 KB). AmigaOS `InternalLoadSeg` supports both formats.
|
||||
Saves space when all patch offsets fit in 16 bits (hunk size < 64 KB). The **semantics are identical** to HUNK_RELOC32 — only the field sizes differ.
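A parser for the 16-bit layout might look like the sketch below. It assumes the body starts on a longword boundary (right after the `$3FC` tag) and that padding follows the terminator, as in the layout shown; `parse_reloc32short` is a hypothetical helper name.

```python
import struct

def parse_reloc32short(body):
    """Parse a HUNK_RELOC32SHORT body (the bytes after the $3FC tag).

    Returns (groups, consumed): groups is a list of (target_hunk, offsets),
    consumed is the body length including padding to a longword boundary.
    """
    pos = 0
    groups = []
    while True:
        (count,) = struct.unpack_from(">H", body, pos)
        pos += 2
        if count == 0:                      # UWORD terminator
            break
        (target,) = struct.unpack_from(">H", body, pos)
        pos += 2
        offsets = list(struct.unpack_from(f">{count}H", body, pos))
        pos += 2 * count
        groups.append((target, offsets))
    pos = (pos + 3) & ~3                    # skip padding so the next tag is aligned
    return groups, pos


# Same two patch sites as a 32-bit table would hold: 10 bytes plus 2 bytes of padding
body = struct.pack(">5H", 2, 1, 0x18, 0x2C, 0) + b"\x00\x00"
groups, consumed = parse_reloc32short(body)
print(groups, consumed)
```

The equivalent HUNK_RELOC32 body for these two offsets takes 20 bytes, so the short form saves 8 bytes here even with padding.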
|
||||
|
||||
Modern toolchains (the vasm assembler with the vlink linker) prefer this format for small programs.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELOC16 ($3ED) and HUNK_RELOC8 ($3EE)
|
||||
|
||||
Same format as HUNK_RELOC32 but patch **16-bit** or **8-bit** values respectively:
|
||||
|
||||
| Type | Patches | Use Case |
|
||||
|---|---|---|
|
||||
| HUNK_RELOC32 | ULONG (4 bytes) | Standard — absolute 32-bit addresses |
|
||||
| HUNK_RELOC16 | UWORD (2 bytes) | 16-bit displacement mode (rare) |
|
||||
| HUNK_RELOC8 | UBYTE (1 byte) | 8-bit short-branch offset (extremely rare) |
|
||||
|
||||
HUNK_RELOC16 and HUNK_RELOC8 are almost never seen in practice — the 68000 doesn't commonly use 16-bit absolute addresses, and linkers generate PC-relative code for short displacements instead.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_DREL32 ($3F7) — Base-Relative Compact Relocation
|
||||
|
||||
An alternative relocation format used by some linkers (BLink):
|
||||
|
||||
```
|
||||
HUNK_DREL32 ($000003F7)
|
||||
|
||||
[Repeat:]
|
||||
<num_offsets> (UWORD)
|
||||
<target_hunk> (UWORD)
|
||||
<offset_0> (UWORD)
|
||||
...
|
||||
|
||||
<0> UWORD terminator
|
||||
```
|
||||
|
||||
Semantically identical to HUNK_RELOC32 but uses 16-bit fields. More compact for programs with many relocations and small hunk sizes (< 64 KB). Supported by `InternalLoadSeg`.
|
||||
|
||||
---
|
||||
|
||||
## HUNK_RELRELOC32 ($3FD) — PC-Relative Relocation
|
||||
|
||||
Patches a **32-bit displacement** rather than an absolute address:
|
||||
|
||||
```c
|
||||
/* Patch algorithm for PC-relative: */
|
||||
*patch = target_base - (current_hunk_base + offset);
|
||||
/* The patched value is a signed offset from the patch site to the target */
|
||||
```
|
||||
|
||||
Used by GCC with `-fPIC` for position-independent code. Rare in standard AmigaOS programs.
|
||||
|
||||
---
|
||||
|
||||
## PC-Relative References (No Relocation Needed)
|
||||
|
||||
The 68020+ supports **PC-relative addressing** with 32-bit displacements:
|
||||
The 68020+ supports **32-bit PC-relative addressing**, and even the 68000 supports 16-bit PC-relative:
|
||||
|
||||
```asm
|
||||
LEA symbol(PC), A0 ; PC-relative load effective address
|
||||
MOVE.L data(PC), D0 ; PC-relative data read
|
||||
; 68000 — 16-bit PC-relative (within ±32 KB):
|
||||
LEA myData(PC), A0 ; PC-relative — no reloc needed
|
||||
BSR myFunction ; PC-relative branch — no reloc
|
||||
|
||||
; 68020+ — 32-bit PC-relative:
|
||||
MOVE.L myData(PC), D0 ; PC-relative with 32-bit displacement
|
||||
```
|
||||
|
||||
PC-relative references do not require relocation — the offset is relative to the instruction, so it is valid regardless of where the code is loaded. **GCC for 68k** generates PC-relative code by default (`-fpic`), significantly reducing the size of relocation tables.
|
||||
PC-relative references are **relocation-free** — the offset is relative to the instruction pointer, so it remains valid regardless of where the code loads.
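A short calculation shows why no patching is needed. The sketch assumes a `(d16,PC)` reference within a single hunk; the function name and the example offsets are hypothetical.

```python
def pc_relative_displacement(code_base, insn_offset, target_offset):
    """Displacement encoded for a (d16,PC) reference inside one hunk.

    On the 68k the PC value used is the address of the extension word,
    which is the instruction address + 2.
    """
    pc = code_base + insn_offset + 2
    return (code_base + target_offset) - pc


# The same instruction and target, with the hunk loaded at two different addresses
low = pc_relative_displacement(0x00020000, 0x04, 0x40)
high = pc_relative_displacement(0x007F0000, 0x04, 0x40)
print(hex(low), hex(high))
```

The load address cancels out of the subtraction, so the linker can write the displacement once and the loader never has to touch it.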
|
||||
|
||||
SAS/C generates absolute references by default and relies heavily on `HUNK_RELOC32`.
|
||||
| Compiler | Default Mode | Relocation Impact |
|
||||
|---|---|---|
|
||||
| SAS/C | Absolute addressing | Heavy relocation (many HUNK_RELOC32 entries) |
|
||||
| GCC | PC-relative (`-fpic`) | Minimal relocation — smaller executables |
|
||||
| VBCC | PC-relative (with small code model) | Similar to GCC |
|
||||
|
||||
> **Practical impact**: A program with 500 internal function calls generates 500 HUNK_RELOC32 entries with absolute addressing (SAS/C), but nearly zero with PC-relative code (GCC). This affects both file size and load time.
|
||||
|
||||
---
|
||||
|
||||
## Relocation at Runtime — Segment Chain
|
||||
## Relocation and the Segment Chain
|
||||
|
||||
The loader tracks loaded segments as a **BPTR chain** (singly-linked list). The segment list head is returned by `LoadSeg()`:
|
||||
|
||||
```
|
||||
Segment 0 (code):
|
||||
BPTR → Segment 1
|
||||
[code data]
|
||||
byte -4: allocation size (for FreeMem)
|
||||
byte 0: BPTR → Segment 1
|
||||
byte 4: [code data starts here]
|
||||
|
||||
Segment 1 (data):
|
||||
BPTR → 0 (NULL)
|
||||
[data]
|
||||
byte -4: allocation size
|
||||
byte 0: BPTR → 0 (NULL = end of chain)
|
||||
byte 4: [data starts here]
|
||||
```
|
||||
|
||||
Each segment begins with a 4-byte BPTR to the next segment. Hunk index `n` corresponds to segment `n` in this chain.
|
||||
Each segment begins with a 4-byte BPTR to the next segment. Hunk index `n` in the relocation table corresponds to segment `n` in this chain. The base address used for relocation is `segment_address + 4` (skip the BPTR link).
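Walking the chain can be sketched as follows. A BPTR is a byte address shifted right by two, so each link must be shifted left before use. The `memory` buffer is a hypothetical stand-in for Amiga RAM, used only for illustration.

```python
import struct

def walk_seglist(memory, seglist_bptr):
    """Yield (hunk_index, data_address) for each segment in a BPTR chain.

    memory       -- bytes-like object indexed by absolute address
    seglist_bptr -- BPTR as returned by LoadSeg(): byte address >> 2
    """
    index = 0
    bptr = seglist_bptr
    while bptr != 0:
        link_addr = bptr << 2               # BPTR -> byte address of the link field
        yield index, link_addr + 4          # hunk contents follow the link
        (bptr,) = struct.unpack_from(">I", memory, link_addr)
        index += 1


# Two fake segments at $100 and $200 inside a small simulated RAM
ram = bytearray(0x300)
struct.pack_into(">I", ram, 0x100, 0x200 >> 2)   # segment 0 links to segment 1
struct.pack_into(">I", ram, 0x200, 0)            # segment 1 ends the chain
bases = [addr for _, addr in walk_seglist(ram, 0x100 >> 2)]
print([hex(a) for a in bases])
```

The resulting list is what a relocation pass would use as its per-hunk base table, indexed by hunk number.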
|
||||
|
||||
---
|
||||
|
||||
## Relocation Error Scenarios
|
||||
|
||||
| Error | Cause | Symptom |
|
||||
|---|---|---|
|
||||
| Offset beyond hunk size | Corrupt HUNK_RELOC32 | Random memory corruption; Guru |
|
||||
| Invalid target hunk index | Corrupt reloc table | Crash during load |
|
||||
| Relocation to freed memory | Hunk couldn't be allocated | Dangling pointer — crash at use time |
|
||||
| Missing relocation entry | Linker bug | Pointer has wrong value; subtle crash |
|
||||
| Odd patch offset | Corrupt reloc table or linker bug (offset not word-aligned) | Address error on 68000/68010 (Guru $80000003) |
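Several of these failures can be caught before any memory is patched. The checker below is a hedged sketch of such validation, not something the AmigaOS loader is documented to do; `check_reloc` and its messages are hypothetical.

```python
def check_reloc(offset, target, hunk_sizes, current):
    """Return a list of problems for one HUNK_RELOC32 patch site (empty if OK).

    hunk_sizes -- size in bytes of every hunk, indexed by hunk number
    current    -- index of the hunk being patched
    """
    problems = []
    if target >= len(hunk_sizes):
        problems.append("invalid target hunk index")
    if offset + 4 > hunk_sizes[current]:    # the patched longword must fit in the hunk
        problems.append("offset beyond hunk size")
    if offset % 2:                          # odd addresses fault on a 68000
        problems.append("odd offset")
    return problems


sizes = [128, 64]                           # hunk 0 (code), hunk 1 (data)
print(check_reloc(0x18, 1, sizes, 0))
print(check_reloc(126, 5, sizes, 0))
print(check_reloc(0x19, 1, sizes, 0))
```

A tool that dumps relocation tables could run this check on each entry and flag corrupt files instead of printing misleading offsets.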
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -137,17 +278,41 @@ After loading a HUNK file with the Amiga plugin, IDA resolves relocations automa
|
|||
|
||||
### hexdump + manual
|
||||
|
||||
Locate HUNK_RELOC32 ($3EC) in raw hex:
|
||||
```bash
|
||||
# Find HUNK_RELOC32 in raw hex:
|
||||
xxd mybinary | grep "0000 03ec"   # xxd groups bytes in pairs, so $000003EC prints as "0000 03ec"
|
||||
# Then read num_offsets, target_hunk, and offset longwords
|
||||
```
|
||||
|
||||
Then read num_offsets and target_hunk longwords that follow.
|
||||
### Python scanner
|
||||
|
||||
### hunkinfo (custom tool)
|
||||
```python
|
||||
import struct
|
||||
|
||||
```bash
|
||||
hunkinfo mybinary # shows all hunks, sizes, reloc counts
|
||||
def dump_relocs(filename):
|
||||
with open(filename, 'rb') as f:
|
||||
data = f.read()
|
||||
off = 0
|
||||
while off < len(data) - 4:
|
||||
tag = struct.unpack('>I', data[off:off+4])[0]
|
||||
if tag == 0x3EC: # HUNK_RELOC32
|
||||
off += 4
|
||||
while True:
|
||||
count = struct.unpack('>I', data[off:off+4])[0]
|
||||
off += 4
|
||||
if count == 0: break
|
||||
target = struct.unpack('>I', data[off:off+4])[0]
|
||||
off += 4
|
||||
offsets = []
|
||||
for i in range(count):
|
||||
o = struct.unpack('>I', data[off:off+4])[0]
|
||||
offsets.append(f'${o:04X}')
|
||||
off += 4
|
||||
print(f' target=hunk{target} offsets={offsets}')
|
||||
else:
|
||||
off += 4
|
||||
|
||||
dump_relocs('mybinary')
|
||||
```
|
||||
|
||||
---
|
||||
|
|
@ -157,4 +322,5 @@ hunkinfo mybinary # shows all hunks, sizes, reloc counts
|
|||
- NDK39: `dos/doshunks.h`
|
||||
- *Amiga ROM Kernel Reference Manual: Libraries* — AmigaDOS chapter, `InternalLoadSeg`
|
||||
- vlink linker documentation — relocation section
|
||||
- http://amigadev.elowar.com/read/ADCD_2.1/Libraries_Manual_guide/node01E0.html
|
||||
- See also: [HUNK Format](hunk_format.md) — complete hunk type reference
|
||||
- See also: [Exe Load Pipeline](exe_load_pipeline.md) — how LoadSeg uses relocations
|
||||