mirror of
https://github.com/alfishe/amiga-bootcamp.git
synced 2026-06-12 16:16:28 +00:00
03/exe_crunchers: expand architecture with full decrunch lifecycle
Replaced the brief 'key insight' paragraph with a comprehensive
two-phase execution model covering:
- Phase 1 (OS): what LoadSeg sees, how wrapper relocations are applied
- Phase 2 (stub): all 6 reconstruction steps with code examples:
1. Metadata block preserving original hunk count/sizes/memflags
2. Chip vs Fast RAM allocation (with IMPORTANT warning about
DMA-invisible Fast RAM for bitmap/audio data)
3. Decompression into separately allocated hunks
4. Relocation regeneration from embedded tables
5. BPTR segment chain construction (with C code)
6. Wrapper cleanup and JMP to original entry
- The segment list handoff problem (double-free hazard) and the
three solutions: pr_SegList patching, chain linking, in-place
This commit is contained in:
parent
4593ff135a
commit
aeaea88d75
1 changed files with 145 additions and 2 deletions
|
|
@@ -12,6 +12,13 @@ This was essential in the floppy era: a 200 KB program crunched to 120 KB loads
|
|||
|
||||
## Architecture
|
||||
|
||||
### Two-Phase Execution Model
|
||||
|
||||
A crunched executable goes through **two loading phases**:
|
||||
|
||||
1. **Phase 1 — OS loads the wrapper** (`LoadSeg` handles the crunched HUNK file normally)
|
||||
2. **Phase 2 — Stub rebuilds the original program** (acting as a mini-`LoadSeg` inside the running process)
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
subgraph "Original Executable"
|
||||
|
|
@@ -32,9 +39,145 @@ graph LR
|
|||
style CD fill:#e8f4fd,stroke:#2196f3,color:#333
|
||||
```
|
||||
|
||||
### Key Insight
|
||||
### Phase 1: What the OS Loader Sees
|
||||
|
||||
A crunched executable is **itself a valid HUNK file**. The OS loader handles it normally — `LoadSeg()` allocates memory, loads hunks, applies relocations. The "magic" is that hunk 0 contains a decrunch stub instead of the original code, and the data hunk contains the compressed original program.
|
||||
The crunched file is a perfectly valid HUNK executable. `LoadSeg()` processes it like any other program:
|
||||
- Reads `HUNK_HEADER`, allocates 2–3 segments (stub code, compressed data, BSS workspace)
|
||||
- Applies the wrapper's own `HUNK_RELOC32` entries (minimal — just the stub's internal references)
|
||||
- Links the segments into a BPTR chain and returns the segment list
|
||||
- The caller (the shell, or `CreateProc()` when a new process is started) then jumps to the first instruction of hunk 0: the decrunch stub
|
||||
|
||||
At this point the OS is done. It thinks it loaded a normal program. The original executable's structure, memory types, relocations — all of that is **inside the compressed payload** and invisible to the OS.
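To make the loader's first step concrete, here is a minimal sketch of reading a `HUNK_HEADER` and extracting each hunk's size and memory-type bits. It assumes the common case of an empty resident-library name list and ignores the extended form where both flag bits are set; the function name `parse_hunk_header` and the `HUNKF_*` macro names are illustrative, not taken from any particular loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HUNK_HEADER 0x000003F3u
#define HUNKF_CHIP  (1u << 30)   /* size longword bit 30: needs Chip RAM */
#define HUNKF_FAST  (1u << 31)   /* size longword bit 31: needs Fast RAM */

/* Read one big-endian longword, as stored in a HUNK file. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse a HUNK_HEADER with no resident library names (assumption).
 * Fills bytes[] with hunk sizes in bytes and flags[] with the raw
 * CHIP/FAST bits. Returns the hunk count, or -1 on malformed input. */
static int parse_hunk_header(const uint8_t *buf, size_t len,
                             uint32_t *bytes, uint32_t *flags,
                             size_t max_hunks)
{
    assert(buf != NULL && bytes != NULL && flags != NULL);
    if (len < 20 || read_be32(buf) != HUNK_HEADER) return -1;
    if (read_be32(buf + 4) != 0) return -1;   /* name list not handled here */

    /* buf + 8 holds the table size; first/last hunk numbers follow. */
    uint32_t first = read_be32(buf + 12);
    uint32_t last  = read_be32(buf + 16);
    if (last < first) return -1;
    uint32_t n = last - first + 1;
    if (n == 0 || n > max_hunks || len < 20 + (size_t)n * 4) return -1;

    for (uint32_t i = 0; i < n; i++) {
        uint32_t raw = read_be32(buf + 20 + i * 4);
        flags[i] = raw & (HUNKF_CHIP | HUNKF_FAST);
        bytes[i] = (raw & 0x3FFFFFFFu) * 4;   /* sizes are stored in longwords */
    }
    return (int)n;
}
```

For a crunched file this only ever reports the wrapper's two or three hunks; the original program's sizes and memory flags never appear at this level.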
|
||||
|
||||
### Phase 2: What the Stub Must Reconstruct
|
||||
|
||||
The decrunch stub must rebuild everything `LoadSeg` would have done for the original executable:
|
||||
|
||||
```mermaid
|
||||
flowchart TD
|
||||
A["Entry: stub begins executing"] --> B["1. Read metadata block"]
|
||||
B --> C["2. AllocMem for each original hunk<br/>(CHIP/FAST as specified)"]
|
||||
C --> D["3. Decompress payload into<br/>allocated hunk buffers"]
|
||||
D --> E["4. Apply relocations:<br/>patch absolute addresses"]
|
||||
E --> F["5. Build BPTR segment chain"]
|
||||
F --> G["6. Free stub + compressed data"]
|
||||
G --> H["7. JMP to original entry point"]
|
||||
|
||||
style B fill:#e8f4fd,stroke:#2196f3,color:#333
|
||||
style C fill:#fff3e0,stroke:#ff9800,color:#333
|
||||
style E fill:#fce4ec,stroke:#e91e63,color:#333
|
||||
style F fill:#e8f5e9,stroke:#4caf50,color:#333
|
||||
```
|
||||
|
||||
#### Step 1: Metadata — Preserving the Original Structure
|
||||
|
||||
The compressed payload includes a **metadata block** that captures the original executable's structure. This is stored either at a fixed offset in the compressed data or appended after it:
|
||||
|
||||
```c
/* What the cruncher preserves in the metadata (illustrative layout): */
struct CrunchHunkInfo {
    ULONG size;       /* size of the original hunk (bytes) */
    ULONG memflags;   /* MEMF_CHIP, MEMF_FAST, or MEMF_ANY */
    ULONG type;       /* HUNK_CODE, HUNK_DATA, or HUNK_BSS */
};

struct CrunchMetadata {
    ULONG num_hunks;                 /* original hunk count */
    struct CrunchHunkInfo hunks[];   /* one entry per original hunk */
    /* Relocation data follows (format varies by cruncher) */
};
```
|
||||
|
||||
Without this metadata, the stub cannot allocate memory correctly — a bitmap hunk that needs Chip RAM would end up in Fast RAM and be invisible to the custom chip DMA.
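As a rough illustration of how a stub might read such a block, the sketch below parses a hypothetical serialized form: a big-endian hunk count followed by three longwords (size, memory flags, type) per hunk. The layout, the `MAX_HUNKS` limit, and the names `HunkInfo` and `parse_metadata` are assumptions made for this example; real crunchers each use their own format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HUNKS 16   /* illustrative limit, not from any real cruncher */

typedef struct {
    uint32_t size;      /* original hunk size in bytes */
    uint32_t memflags;  /* e.g. MEMF_CHIP (0x2) or MEMF_FAST (0x4) */
    uint32_t type;      /* e.g. HUNK_CODE (0x3E9), HUNK_DATA (0x3EA) */
} HunkInfo;

/* Read one big-endian longword, as the 68000 stores it. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the hypothetical metadata block into out[].
 * Returns the hunk count, or -1 if the block is truncated or oversized. */
static int parse_metadata(const uint8_t *buf, size_t len,
                          HunkInfo out[MAX_HUNKS])
{
    assert(buf != NULL && out != NULL);
    if (len < 4) return -1;

    uint32_t n = read_be32(buf);
    if (n == 0 || n > MAX_HUNKS) return -1;
    if (len < 4 + (size_t)n * 12) return -1;   /* 12 bytes per entry */

    for (uint32_t i = 0; i < n; i++) {
        const uint8_t *entry = buf + 4 + i * 12;
        out[i].size     = read_be32(entry);
        out[i].memflags = read_be32(entry + 4);
        out[i].type     = read_be32(entry + 8);
    }
    return (int)n;
}
```

Validating the count and length before touching any entry matters here: a corrupted payload would otherwise drive the allocation loop with garbage sizes.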
|
||||
|
||||
#### Step 2: Memory Allocation — Chip vs Fast Separation
|
||||
|
||||
This is the critical step most people miss. The original executable might have had:
|
||||
|
||||
```
|
||||
Hunk 0: HUNK_CODE → MEMF_ANY (68000 code, runs from any memory)
|
||||
Hunk 1: HUNK_DATA → MEMF_CHIP (bitmaps, audio samples — MUST be DMA-reachable)
|
||||
Hunk 2: HUNK_BSS → MEMF_ANY (zero-filled workspace)
|
||||
```
|
||||
|
||||
The stub must call `AllocMem()` **individually** for each original hunk with the correct memory type flags:
|
||||
|
||||
```asm
; Stub allocates each original hunk separately.
; AllocMem returns 0 on failure; alloc_failed is a hypothetical
; error path that releases earlier allocations and exits.
    MOVEA.L 4.W, A6                     ; SysBase

; Hunk 0: code, any memory is fine
    MOVE.L  code_size, D0
    MOVE.L  #MEMF_PUBLIC, D1
    JSR     -198(A6)                    ; AllocMem
    MOVE.L  D0, hunk_bases+0            ; save base address (sets Z if 0)
    BEQ     alloc_failed

; Hunk 1: data, MUST be Chip RAM for DMA
    MOVE.L  data_size, D0
    MOVE.L  #MEMF_CHIP|MEMF_PUBLIC, D1
    JSR     -198(A6)                    ; AllocMem
    MOVE.L  D0, hunk_bases+4            ; save base address
    BEQ     alloc_failed

; Hunk 2: BSS, MEMF_CLEAR zero-fills the block
    MOVE.L  bss_size, D0
    MOVE.L  #MEMF_PUBLIC|MEMF_CLEAR, D1
    JSR     -198(A6)                    ; AllocMem
    MOVE.L  D0, hunk_bases+8
    BEQ     alloc_failed
```
|
||||
|
||||
> [!IMPORTANT]
|
||||
> If a cruncher loses the CHIP/FAST distinction (for example by merging everything into one hunk allocated without `MEMF_CHIP`), programs with bitmap/audio data in data hunks will **fail without any error from the OS**: the custom chips' DMA can only reach Chip RAM. Typical symptoms are garbled graphics, missing or noisy audio, and in some cases a Guru Meditation.
|
||||
|
||||
#### Step 3: Decompress — Fill the Allocated Hunks
|
||||
|
||||
The decompressor reads from the compressed payload (in the wrapper's data hunk) and writes to the freshly allocated original hunks. For programs with multiple hunks, the decompressor either:
|
||||
- Decompresses into a flat temp buffer, then copies to individual hunks (Method 1)
|
||||
- Decompresses directly to each hunk in sequence, using stored boundaries (Method 2)
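The second approach can be sketched as an output routine that knows the stored hunk boundaries and switches destination buffers when one fills up. The decompressor here is a toy run-length scheme used purely as a stand-in (real crunchers use LZ-style or other algorithms), and `HunkWriter`, `hw_put`, and `rle_decrunch` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t       **bases;  /* destination buffer of each original hunk */
    const uint32_t *sizes;  /* stored hunk boundaries, in bytes */
    size_t          count;  /* number of hunks */
    size_t          cur;    /* hunk currently being filled */
    uint32_t        pos;    /* write offset inside that hunk */
} HunkWriter;

/* Emit one decompressed byte, moving on to the next hunk when the
 * current one is full. Returns 0 if the output overruns the last hunk. */
static int hw_put(HunkWriter *w, uint8_t b)
{
    while (w->cur < w->count && w->pos == w->sizes[w->cur]) {
        w->cur++;
        w->pos = 0;
    }
    if (w->cur == w->count) return 0;
    w->bases[w->cur][w->pos++] = b;
    return 1;
}

/* Toy decompressor: the input is a list of (run length, value) pairs. */
static int rle_decrunch(const uint8_t *src, size_t len, HunkWriter *w)
{
    assert(src != NULL && w != NULL);
    if (len % 2 != 0) return 0;
    for (size_t i = 0; i < len; i += 2)
        for (uint8_t n = src[i]; n > 0; n--)
            if (!hw_put(w, src[i + 1])) return 0;
    return 1;
}
```

Routing output through one small function keeps the decompressor itself unaware of hunks, at the cost of a boundary check per byte; the flat-buffer approach trades that check for a temporary allocation and a copy.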
|
||||
|
||||
#### Step 4: Apply Relocations
|
||||
|
||||
The original HUNK_RELOC32 tables are **embedded in the compressed data** — they were part of the original file. After decompression, the stub must patch all absolute addresses to reflect the actual allocation addresses. See the [Relocation Handling](#relocation-handling) section below for the three strategies.
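The patching itself is simple arithmetic: each relocated longword holds an offset into some target hunk, and the stub adds that hunk's actual load address. The sketch below assumes the relocation table has already been unpacked into a flat array of `Reloc32` records (a hypothetical in-memory form, not the on-disk `HUNK_RELOC32` layout) and models Amiga addresses as plain 32-bit values so it can run on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t src_hunk;  /* hunk containing the longword to patch */
    uint32_t offset;    /* byte offset of that longword */
    uint32_t dst_hunk;  /* hunk the longword refers to */
} Reloc32;

static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void write_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Add the target hunk's load address to the hunk-relative value stored
 * at each patch site. Returns 0 if an entry is out of range. */
static int apply_relocs(uint8_t **data, const uint32_t *sizes,
                        const uint32_t *load_addr, size_t num_hunks,
                        const Reloc32 *relocs, size_t num_relocs)
{
    assert(data != NULL && sizes != NULL && load_addr != NULL);
    for (size_t i = 0; i < num_relocs; i++) {
        const Reloc32 *r = &relocs[i];
        if (r->src_hunk >= num_hunks || r->dst_hunk >= num_hunks) return 0;
        if (sizes[r->src_hunk] < 4 || r->offset > sizes[r->src_hunk] - 4)
            return 0;
        uint8_t *site = data[r->src_hunk] + r->offset;
        write_be32(site, read_be32(site) + load_addr[r->dst_hunk]);
    }
    return 1;
}
```

For example, a `JSR` in hunk 0 that targets offset `$10` of hunk 1 ends up pointing at `$00040010` once hunk 1 has been allocated at `$00040000`.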
|
||||
|
||||
#### Step 5: Build the Segment Chain
|
||||
|
||||
AmigaDOS tracks loaded programs as a BPTR-linked segment list. The stub must construct this chain so `UnLoadSeg()` can free the memory later:
|
||||
|
||||
```c
|
||||
/* Each segment starts with a 4-byte BPTR link at offset 0, */
/* preceded by the allocation size at offset -4:            */
/* [alloc_size][-4] [BPTR→next][0] [hunk data...][4+] */
|
||||
|
||||
/* Stub builds the chain: */
|
||||
for (int i = 0; i < num_hunks - 1; i++)
|
||||
{
|
||||
BPTR *link = (BPTR *)(hunk_bases[i]); /* offset 0 = BPTR to next */
|
||||
*link = MKBADDR(hunk_bases[i + 1]); /* point to next segment */
|
||||
}
|
||||
/* Last segment's link = 0 (NULL) — end of chain */
|
||||
*(BPTR *)(hunk_bases[num_hunks - 1]) = 0;
|
||||
```
|
||||
|
||||
> **Why this matters**: If the stub doesn't build a valid segment chain, `UnLoadSeg()` (called when the program exits) will crash or leak memory. Some simple crunchers skip this step entirely — the program runs fine but its memory is never freed.
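To see what a valid chain has to look like from the consumer's side, the following sketch walks a segment list the way a freeing routine would: read the size stored one longword before the link, then follow the link. Memory is modelled as a small array of longwords so that a BPTR (a byte address divided by 4) is simply an array index; `arena`, `BPTR32`, and `walk_seglist` are names invented for this illustration, and this is not the actual `UnLoadSeg()` implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_LONGS 64
static uint32_t arena[ARENA_LONGS];   /* simulated RAM, one entry per longword */

typedef uint32_t BPTR32;              /* longword index into arena; 0 = end */

/* Walk the chain starting at seg, summing the allocation sizes.
 * Returns the number of segments, or -1 if the chain is malformed
 * (link out of range, or more links than the arena could hold). */
static int walk_seglist(BPTR32 seg, uint32_t *total_bytes)
{
    assert(total_bytes != NULL);
    int count = 0;
    *total_bytes = 0;
    while (seg != 0) {
        if (seg >= ARENA_LONGS || count >= ARENA_LONGS) return -1;
        *total_bytes += arena[seg - 1];   /* size lives at offset -4 */
        seg = arena[seg];                 /* link lives at offset 0 */
        count++;
    }
    return count;
}
```

A stub that leaves a stale or uninitialised link word produces exactly the malformed cases this walker rejects; the real system performs no such checks and would free whatever the link points at.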
|
||||
|
||||
#### Step 6: Free the Wrapper, Jump to Original
|
||||
|
||||
The stub frees the wrapper's own memory (compressed data, BSS workspace) and jumps to the original entry point: the first instruction of hunk 0, which sits at offset 4 from the segment base, immediately after the BPTR link word:
|
||||
|
||||
```asm
|
||||
; Free wrapper hunks
|
||||
; (some stubs skip this and accept the memory leak)
|
||||
|
||||
; Restore registers to match what the OS originally passed
; (D0/A0 carry the command-line length and pointer)
    MOVEM.L (SP)+, D0-D7/A0-A6

; Jump to original program entry without clobbering any register
    MOVE.L  hunk_bases+0, -(SP)         ; push segment base
    ADDQ.L  #4, (SP)                    ; skip BPTR link at offset 0
    RTS                                 ; pops the address and jumps to it
|
||||
```
|
||||
|
||||
### The Segment List Handoff Problem
|
||||
|
||||
There's a subtle issue: the OS gave the process a segment list pointing to the **wrapper** hunks. But the actual program now lives in **newly allocated** hunks. When the program exits, `UnLoadSeg()` will try to free the wrapper's segment list, which the stub may have already freed — double-free crash.
|
||||
|
||||
Sophisticated crunchers solve this by:
|
||||
1. **Patching the process's `pr_SegList`** to point to the new segment chain
|
||||
2. **Or** keeping the wrapper allocated and linking it into the chain
|
||||
3. **Or** replacing the wrapper's content in-place (overwriting stub+compressed with decompressed code)
|
||||
|
||||
---
|
||||
|
||||
|
|
|
|||