More interesting hot stuff!

This commit is contained in:
Ilia Sharin 2026-04-29 23:18:55 -04:00
parent 0aafceb31e
commit b01763982e
22 changed files with 2542 additions and 7 deletions


@ -24,9 +24,18 @@ This section provides a systematic methodology for reverse engineering AmigaOS e
| [static/code_vs_data_disambiguation.md](static/code_vs_data_disambiguation.md) | Distinguishing code bytes from data — IDA/Ghidra workflows |
| [patching_techniques.md](patching_techniques.md) | Surgical binary patching methods |
| [unpacking_and_decrunching.md](unpacking_and_decrunching.md) | Executable unpacking, decruncher architecture, and manual extraction |
| [custom_loaders_and_drm.md](custom_loaders_and_drm.md) | Bypassing DOS, Trackloaders, and physical DRM tricks |
| [anti_debugging.md](anti_debugging.md) | The Cracker vs. Developer arms race: Trace vector abuse, NMI defeat, CIA timers |
| [whdload_architecture.md](whdload_architecture.md) | WHDLoad internals, slaves, resload_DiskLoad, and runtime memory patching |
| [case_studies/](case_studies/) | Real-world RE walkthroughs |
| [case_studies/ramdrive_device.md](case_studies/ramdrive_device.md) | ramdrive.device RE walkthrough |
### Game Reverse Engineering
| File | Topic |
|---|---|
| [games/game_reversing.md](games/game_reversing.md) | Game RE: disassembly, modification, asset extraction, save game analysis |
### Per-Compiler Reverse Engineering — Binary Field Manuals
| File | Topic |


@ -0,0 +1,145 @@
[← Home](../README.md) · [Reverse Engineering](README.md)
# Amiga Anti-Debugging & The Arms Race
The late 1980s and early 1990s saw an intense "arms race" between Amiga software developers (implementing DRM) and crackers (removing DRM). Because the Amiga allowed user software to take complete, bare-metal control of the hardware, developers created highly sophisticated anti-debugging techniques to crash or hang the system if a cracker tried to analyze the game in a debugger.
This article codifies the most prominent anti-debugging tricks used on the Motorola 68000 and Amiga custom chips.
---
## 1. Trace Vector Abuse (Trace Vector Decoding - TVD)
The most famous anti-debugging trick is the trace-vector decoder used by the **Rob Northen Copylock**. It abuses the 68000's Trace Exception to prevent single-stepping.
### The Mechanism
The 68000 CPU has a **Trace bit** (bit 15) in the Status Register (SR). When it is set, the CPU executes exactly one instruction and then automatically raises a Trace Exception (vector 9, stored at memory address `$00000024`). Debuggers use this to implement "single-step" functionality.
Developers used **Trace Vector Decoding (TVD)** to maliciously overwrite the debugger's trace vector with their own decryption routine.
```assembly
; Copylock TVD Pattern
move.l $24.w, old_trace ; Save the debugger's vector (or just trash it)
move.l #my_decryption, $24.w ; Install our own trace handler
ori.w #$8000, sr ; Set the CPU Trace bit!
nop ; The trace exception fires as soon as this instruction completes
; ... CPU jumps to my_decryption ...
my_decryption:
; Decrypt the next instruction of the game
; Execute it
; Re-encrypt it (optional)
rte ; Return from exception (which immediately fires AGAIN)
```
### The Cracker's Solution
If a cracker is single-stepping through this code, the moment `ori.w #$8000, sr` executes, the game steals the trace vector. The debugger loses control, and the game decrypts itself one instruction at a time inside the trace handler. Crackers defeated this by using **Trace Emulators** or custom scripts that ran the decryption loop virtually, extracting the final decrypted payload from memory without relying on the hardware Trace bit.
---
## 2. Action Replay NMI Defeat
The **Action Replay** was a hardware cartridge that plugged into the Amiga's expansion port. Pressing its physical button generated a **Level 7 Non-Maskable Interrupt (NMI)**. This instantly froze the Amiga, dumping the user into a powerful machine-code monitor regardless of what the OS or game was doing.
Because it was non-maskable, developers could not simply disable it via the `INTENA` register.
### The Mechanism
While Level 7 interrupts cannot be masked, the CPU still uses an exception vector (`$0000007C`) to handle them. Developers simply pointed the NMI vector to a dummy `RTE` (Return from Exception) instruction.
```assembly
; Defeating the Action Replay
move.l #ignore_nmi, $7c.w ; Overwrite Level 7 Auto-Vector
; ... game code ...
ignore_nmi:
rte ; Silently return if the freeze button is pressed
```
### The Cracker's Solution
Action Replay Mk II and III introduced features to intercept writes to `$7C.w` or used the MMU (on 68030 processors) to write-protect the vector table, ensuring the freezer always retained control.
---
## 3. CIA Timer Checks
If a developer suspects their code is being single-stepped, they can measure the exact time it takes to execute a block of instructions.
### The Mechanism
The Amiga's 8520 CIA (Complex Interface Adapter) chips contain highly accurate hardware timers (Timer A and Timer B).
```assembly
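; CIA Timer Anti-Debugging (CIA timers count DOWN from the loaded value)
move.b #$ff, $bfd400 ; CIA-B TALO: Timer A latch, low byte
move.b #$ff, $bfd500 ; CIA-B TAHI: Timer A latch, high byte
move.b #$19, $bfde00 ; CIA-B CRA: LOAD | ONE-SHOT | START
; ... decryption loop ...
btst #0, $bfdd00 ; CIA-B ICR: Timer A underflowed? Far too slow
bne.s .debugger_detected
move.b $bfd500, d0 ; Read Timer A high byte
cmp.b #$A0, d0 ; Is the counter still above the expected threshold?
bhs.s .timing_ok ; Yes: the loop ran at native speed
.debugger_detected:
; Intentionally corrupt the decryption key
eori.l #$FFFFFFFF, d7
.timing_ok: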
```
If a human is pressing "Step" in a debugger, the elapsed time will be orders of magnitude longer than native execution. The game detects this and silently corrupts the decryption key, so it crashes much later, far from the check that caused it, which confuses the cracker.
---
## 4. Software Breakpoint Checksumming
A common way to set a breakpoint in a 68k debugger is to overwrite an instruction with `$4AFC` (`ILLEGAL`). When the CPU hits it, it triggers an Illegal Instruction exception (vector 4, stored at address `$10`), handing control back to the debugger.
### The Mechanism
To detect `$4AFC` breakpoints, games constantly checksum their own code in memory.
```assembly
; Checksumming a block of code
lea code_start(pc), a0
move.w #100-1, d7 ; 100 words (dbf runs count+1 iterations)
moveq #0, d0
.csum:
add.w (a0)+, d0
dbf d7, .csum
cmp.w #$1234, d0 ; Does the checksum match?
bne.s .debugger_found ; If not, someone modified the code!
```
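To see why a planted breakpoint trips this check, the loop can be modelled on a host machine. The following is a minimal Python sketch, not code from any particular game; `word_checksum`, `original`, and `patched` are illustrative names, and the byte sequence is a made-up stand-in for protected code.

```python
import struct

def word_checksum(code: bytes) -> int:
    """Sum big-endian 16-bit words, wrapping at 16 bits like ADD.W does."""
    if len(code) % 2:
        raise ValueError("code length must be a whole number of words")
    words = struct.unpack(f">{len(code) // 2}H", code)
    return sum(words) & 0xFFFF

ILLEGAL = b"\x4a\xfc"  # opcode a debugger plants as a software breakpoint
NOP = b"\x4e\x71"
RTS = b"\x4e\x75"

# Hypothetical protected routine: four NOPs and an RTS.
original = NOP * 4 + RTS
# A debugger replaces the second NOP with ILLEGAL.
patched = original[:2] + ILLEGAL + original[4:]

print(hex(word_checksum(original)), hex(word_checksum(patched)))
```

Any single-word change alters the sum, so the comparison against the stored constant fails. A simple additive checksum like this can still be fooled by two compensating changes, which is one reason protections often layered several checks.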
### Self-Modifying Code (SMC)
To make this even harder, developers used SMC. The code would dynamically alter its own instructions just before executing them. If a debugger placed a `$4AFC` breakpoint in the path of the SMC, the SMC would corrupt the breakpoint, or the breakpoint would corrupt the SMC calculation, leading to a crash.
---
## 5. VBR (Vector Base Register) Relocation
On the 68000, exception vectors are hardcoded at memory address `$00000000`. Standard software debuggers place their hooks here.
### The Mechanism
Starting with the 68010 (and present on the 68020/030/040/060), Motorola introduced the **Vector Base Register (VBR)**. This register allows the OS to move the entire exception vector table to any location in memory.
```assembly
; Moving the vector table (Requires Supervisor State)
lea new_vector_table, a0
movec a0, vbr
```
Games running on AGA Amigas (A1200/A4000) would allocate a new block of memory, copy the vector table there, alter the `vbr`, and then install their trace/NMI traps in the *new* table. Older debuggers hardcoded to read `$00000000` would be completely blind to these changes.
---
## 6. Hardware Register Traps
Debuggers often alter hardware state slightly (e.g., stopping DMA, changing screen colors, reading registers to update the UI).
### The Mechanism
* **Write-Only Registers**: Many custom chip registers (like `BLTCON0` or `COLOR00`) are write-only. If a debugger tries to read them to save the system state, it actually reads open bus (garbage). If the debugger blindly writes that garbage back when resuming execution, the game crashes.
* **Undocumented CIA Bits**: Some games would write specific patterns to undocumented bits in the CIA registers. Hardware freezers that didn't perfectly emulate or save/restore these undocumented bits would cause the game to fail its own integrity checks upon resuming.
---
## Summary
The Amiga anti-debugging scene was characterized by its proximity to the bare metal. Because there was no memory protection on the standard 68000, developers had free rein to abuse CPU exceptions, hijack hardware interrupts, and weaponize the system's own debugging features against reverse engineers.
To defeat these protections today, modern reverse engineers rely on instruction-level emulators (like WinUAE's built-in debugger), which allow for invisible tracing, hardware watchpoints, and memory dumps that the running game cannot detect.


@ -0,0 +1,7 @@
[← Home](../../README.md) · [Reverse Engineering](../README.md)
# Reverse Engineering Case Studies
## Contents
- [ramdrive_device.md](ramdrive_device.md) — ramdrive.device walkthrough


@ -0,0 +1,152 @@
[← Home](../README.md) · [Reverse Engineering](README.md)
# Bypassing Custom Loaders and DRM Analysis
In the classic Amiga era, the vast majority of commercial games **did not use AmigaOS or `dos.library`**. Instead, they booted directly from the floppy disk's bootblock, took full control of the hardware, and used custom MFM (Modified Frequency Modulation) routines to load data directly from the floppy drive (`trackdisk.device` or direct hardware banging).
This technique, known as a **Trackloader**, allowed for faster loading, non-standard disk formats (holding >880KB), and robust copy protection (DRM). For a reverse engineer, analyzing an Amiga game almost always starts with defeating the trackloader and its associated DRM.
---
## 1. The Boot Sequence & Taking Control
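When an Amiga boots from a floppy, the Kickstart ROM reads the first two sectors (1024 bytes) into memory. If the block starts with the `DOS` signature and its checksum is valid, the ROM calls the code at offset 12. A standard bootblock simply hands control to `dos.library`; a game bootblock keeps the same signature and checksum but replaces that code with its own loader. A disk without a valid bootblock is not bootable at all.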
```mermaid
graph TD
A[Kickstart Boot] -->|Reads Block 0 & 1| B{DOS signature and valid checksum?}
B -->|No| X[Disk is not bootable]
B -->|Yes| D[Execute Bootblock Code]
D -->|Standard bootblock| C[Start AmigaDOS via dos.library]
D -->|Custom bootblock| E[Disable OS Multitasking]
E --> F[Take Over Interrupts & DMA]
F --> G[Execute Custom Trackloader]
style D fill:#ffcccc,stroke:#ff0000
style G fill:#ffcccc,stroke:#ff0000
```
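When patching a bootblock it helps to be able to verify or recompute its checksum on the host. The sketch below assumes the commonly documented layout (signature at offset 0, checksum longword at offset 4, code from offset 12) and the usual algorithm: add all 256 big-endian longwords with end-around carry, then invert. The function names are illustrative.

```python
import struct

BOOTBLOCK_SIZE = 1024

def bootblock_checksum(block: bytes) -> int:
    """Compute the checksum with the stored checksum field treated as zero."""
    if len(block) != BOOTBLOCK_SIZE:
        raise ValueError("a bootblock is exactly 1024 bytes")
    longs = list(struct.unpack(">256L", block))
    longs[1] = 0  # offset 4: the checksum field itself
    total = 0
    for value in longs:
        total += value
        if total > 0xFFFFFFFF:  # end-around carry
            total = (total & 0xFFFFFFFF) + 1
    return ~total & 0xFFFFFFFF

def is_bootable(block: bytes) -> bool:
    """True if the block has a DOS signature and a matching checksum."""
    (stored,) = struct.unpack_from(">L", block, 4)
    return block[:3] == b"DOS" and stored == bootblock_checksum(block)
```

After editing bootblock code, writing `bootblock_checksum(block)` back to offset 4 should make the disk bootable again; the result is worth confirming in an emulator.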
### 1.1 The "Hardware Takeover" Pattern
Commercial games typically execute a standard sequence to disable AmigaOS and ensure uninterrupted hardware access:
```assembly
; Standard Hardware Takeover (Often found in Bootblocks)
move.w #$7FFF, $dff09a ; INTENA - Disable all interrupts
move.w #$7FFF, $dff096 ; DMACON - Disable all DMA (sprites, copper, bitplane)
move.l $4.w, a6 ; Get ExecBase
jsr -132(a6) ; Forbid() - Disable multitasking
jsr -120(a6) ; Disable() - Disable OS interrupts
; Setup custom VBlank and hardware
move.l #my_copperlist, $dff080
move.w #$83a0, $dff096 ; Enable master, bitplane, copper, and sprite DMA
move.w #$c020, $dff09a ; Enable VBlank Interrupt
```
> [!CAUTION]
> Once the bootblock calls `Forbid()` and writes to `INTENA`, standard Amiga debuggers (like MonAM or HRTmon running under the OS) will lose control unless they are Action Replay hardware cartridges or running in emulation (WinUAE/FS-UAE).
---
## 2. Trackloader Architecture
A trackloader replaces `dos.library` with custom code that reads raw MFM data from the floppy controller (Paula).
### 2.1 MFM and Sync Words
Floppy disks store data as magnetic flux transitions. To prevent long runs without a transition (which would make the controller lose bit synchronization), data is encoded using MFM, which inserts clock bits between the data bits.
To find the start of a sector, the hardware looks for a specific 16-bit **Sync Word**. The standard Amiga sync word is `$4489`.
### 2.2 Finding the Trackloader in IDA/Ghidra
When reverse engineering a game, you must locate the trackloader to extract the game's actual files. Look for these specific hardware register accesses:
| Address | Register | What it means in a Trackloader |
|---|---|---|
| `$DFF07E` | `DSKSYNC` | Setting the Sync Word (usually `$4489`) |
| `$DFF020` | `DSKPTH` | Disk DMA Pointer (where to write the read MFM data) |
| `$DFF024` | `DSKLEN` | Disk DMA Length (how many words to read, and start reading) |
| `$BFD100` | `CIAB-PRB` | CIA-B Port B: Used for floppy motor control, head stepping, and side selection. |
### 2.3 Advanced Trackloader Disk Formats
To prevent copying by standard tools such as X-Copy, developers altered the physical geometry and structure of the tracks on the floppy disk:
1. **Long Tracks**: A standard AmigaDOS track holds 11 sectors. Trackloaders would format tracks with 12 sectors by slightly reducing the physical gaps between sectors, fitting more data (and breaking standard sector-by-sector copying algorithms).
2. **Sync-less Formats**: Bypassing the `$4489` sync word entirely. The trackloader uses raw timing loops (via CIA timers) or completely custom bit-patterns to find the start of the data stream, preventing hardware from locking onto the track.
3. **Fuzzy/Weak Bits**: Mastering original disks with weak magnetic flux. The hardware reads the bit randomly as a 0 or a 1. When a standard drive copies the disk, it writes a "strong" 0 or 1. The DRM reads the sector multiple times; if the bit doesn't flip, it knows it's a pirated copy.
### 2.4 The Decoding Loop
After reading raw MFM data into memory, the trackloader must decode it back into binary data. This almost always involves a loop that masks with `and` and shifts bits to separate clock bits from data bits, often alongside an `eor` (exclusive OR) that accumulates the sector checksum.
```assembly
; Typical MFM decoding loop footprint
move.l #$55555555, d2 ; Mask that keeps data bits and drops clock bits
.decode_loop:
move.l (a0)+, d0 ; Read MFM odd bits
move.l (a1)+, d1 ; Read MFM even bits
and.l d2, d0 ; Strip clock bits from the odd half
and.l d2, d1 ; Strip clock bits from the even half
lsl.l #1, d0 ; Shift odd bits into the odd bit positions
or.l d1, d0 ; Combine into one decoded longword
move.l d0, (a2)+ ; Write decoded data
dbf d7, .decode_loop
```
> [!TIP]
> If you find a loop utilizing the constant `$55555555` extensively near disk hardware access, you have found the MFM decoder.
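The same odd/even recombination is easy to reproduce on the host when decoding raw track dumps. The following is a minimal Python sketch of the masking step only; it assumes the odd and even halves have already been located in the dump, and `mfm_split_longs` is a hypothetical inverse helper used purely for round-trip testing.

```python
import struct

MASK = 0x55555555  # keeps data bits, drops MFM clock bits

def mfm_decode_longs(odd: bytes, even: bytes) -> bytes:
    """Recombine odd-bit and even-bit MFM halves into decoded longwords."""
    if len(odd) != len(even) or len(odd) % 4:
        raise ValueError("halves must be equal length and longword aligned")
    n = len(odd) // 4
    odds = struct.unpack(f">{n}L", odd)
    evens = struct.unpack(f">{n}L", even)
    out = [((o & MASK) << 1) | (e & MASK) for o, e in zip(odds, evens)]
    return struct.pack(f">{n}L", *out)

def mfm_split_longs(data: bytes) -> tuple[bytes, bytes]:
    """Split data into odd/even bit halves, leaving clock bits as zero."""
    n = len(data) // 4
    values = struct.unpack(f">{n}L", data)
    odd = struct.pack(f">{n}L", *[(v >> 1) & MASK for v in values])
    even = struct.pack(f">{n}L", *[v & MASK for v in values])
    return odd, even
```

Because the mask discards every clock bit, the decoder gives the same result whatever clock pattern the raw dump contains.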
---
## 3. DRM: The Rob Northen Copylock
The most famous Amiga copy protection is the **Rob Northen Copylock** (RN Copylock). It was designed to prevent cracking by encrypting the game executable and tying the decryption key to a physical flaw deliberately mastered onto track 0 of the original floppy disk.
### 3.1 Copylock Architecture
```mermaid
graph LR
A[Game Executable] --> B[Encrypted Payload]
A --> C[Copylock Wrapper]
C -->|Read Track 0| D[Custom MFM Read]
D -->|Extract Timing Data| E[Generate Decryption Key]
E -->|Decrypt & Jump| B
```
### 3.2 Trace Vector Abuse
Rob Northen's genius was preventing crackers from stepping through the decryption code by abusing the Motorola 68000 **Trace Exception** (vector 9, at address `$24`). By pointing the Trace vector to its own decryption routine and setting the CPU Trace bit, the code decrypts itself inside the exception handler, one instruction at a time.
> **Deep Dive**: For a complete analysis of Copylock's trace exception abuse, CIA timer checks, and other anti-cracking techniques, see the dedicated [Anti-Debugging & Arms Race](anti_debugging.md) article.
---
## 4. Modern Analogies
| Amiga Paradigm | Modern Equivalent | Explanation |
|---|---|---|
| **Trackloader** | Custom Bootloader / Hypervisor | Bypassing the standard OS kernel to control hardware I/O directly for performance or DRM purposes. |
| **Rob Northen Trace Abuse** | Denuvo Anti-Tamper / VMProtect | Using CPU exceptions, hardware-specific timing, and virtualization to break standard debuggers (x64 SEH/Vectored Exception Handling abuse). |
| **MFM `$55555555` Decoding** | Base64 / AES Decryption loops | Transforming an obfuscated or wire-encoded stream back into executable binary code in memory before jumping to it. |
---
## 5. Reverse Engineering Best Practices
1. **Memory Dumps over Static Analysis**: Because most commercial games are packed (Imploder, PowerPacker) and encrypted (Copylock), static analysis of the binary on disk is often useless. Use WinUAE/FS-UAE's built-in debugger to let the game boot, decrypt itself into RAM, and then take a memory dump to analyze in IDA Pro.
2. **Identify `$DFF024` (DSKLEN)**: Set a write watchpoint on `$DFF024` in your emulator. A write to this register is the hardware trigger that starts a disk DMA transfer. When the watchpoint hits, inspect the PC and the stack to find the trackloader code.
3. **Beware of `$4.w` (ExecBase)**: If a game reads `$4.w` and immediately calls `Forbid()` (offset `-132`), it is preparing to kill the OS. Put your breakpoints *before* this happens if you are relying on an OS-level debugger.
---
## 6. FAQ
**Q: Why did games use Trackloaders instead of standard AmigaDOS files?**
A: AmigaDOS (OFS) has significant overhead. It requires memory for file buffers, wastes bytes on directory structures, and the floppy motor turns off between file reads. A custom trackloader keeps the motor spinning and reads entire raw cylinders into RAM sequentially, reducing loading times from minutes to seconds.
**Q: How do WHDLoad patches work with Trackloaders?**
A: WHDLoad is an OS-replacement system that patches games to run from hard drives. A WHDLoad "Slave" (the patch file) replaces the game's custom trackloader (which expects floppy hardware) with calls to WHDLoad's `resload_DiskLoad` API, emulating the floppy load via standard hard drive I/O.
**Q: If Copylock relies on a physical disk flaw, how do cracked ADFs work?**
A: Cracked disk images (ADFs) contain the already-decrypted game executable, with the Copylock routine completely bypassed or stubbed out (`NOP` instructions). The physical flaw cannot be represented in a standard ADF, which is why original, uncracked games must be preserved in IPF (Interchangeable Preservation Format) instead of ADF.


@ -0,0 +1,10 @@
[← Home](../../README.md) · [Reverse Engineering](../README.md)
# Dynamic Analysis & Debugging
## Contents
- [serial_debug.md](serial_debug.md) — Monitoring system output via the serial port
- [live_memory_probing.md](live_memory_probing.md) — Using wack, MonAm, and other on-device tools
- [enforcer_mungwall.md](enforcer_mungwall.md) — Detecting illegal memory access (MMU protection)
- [setfunction_patching.md](setfunction_patching.md) — Tracing and intercepting library vectors


@ -0,0 +1,8 @@
[← Home](../../README.md) · [Reverse Engineering](../README.md)
# Game Reverse Engineering
## Contents
- [game_reversing.md](game_reversing.md) — Asset extraction, save games, and engine disassembly
- [whdload_architecture.md](whdload_architecture.md) — WHDLoad internals, slaves, and memory patching


@ -0,0 +1,712 @@
[← Home](../../README.md) · [Reverse Engineering](../README.md)
# Game Reverse Engineering — Disassembly, Modification, and Asset Extraction
## Overview
Commercial Amiga games were fortresses built from hand-written 68000 assembly, custom trackloaders, and executable packers. The OS was irrelevant — most titles booted straight from disk, seized the hardware, and never called `OpenLibrary` in their lives. Reversing them is not like reversing an AmigaOS `.library` where you anchor on `JSR -N(A6)` and follow the LVO table. You are facing raw metal: arbitrary register usage, self-modifying code, data embedded mid-instruction, and protection schemes designed by people who understood the 68000 trace exception better than Motorola's own engineers.
This article covers the complete workflow for reverse engineering Amiga games — from the first triage of an NDOS disk image to identifying the memory location that holds your lives counter, extracting sprite data, and building a patch that compiles. The techniques here apply equally to demos, bootblock intros, and any binary that bypasses AmigaOS.
> [!NOTE]
> This article assumes you are already comfortable with 68000 assembly and Amiga hardware registers. If not, start with [Hand-Written Assembly RE](../static/asm68k_binaries.md) and the [OCS Custom Registers](../../01_hardware/ocs_a500/custom_registers.md).
---
## The Game Binary Landscape
Before touching a disassembler, determine what kind of binary you are facing. The approach differs radically.
```mermaid
graph TD
START["Load disk image"] --> SIGNATURE{"Valid AmigaDOS filesystem?"}
SIGNATURE -->|"Yes"| DOS["AmigaDOS disk<br/>files accessible"]
SIGNATURE -->|"No (NDOS)"| NDOS["Non-DOS disk<br/>custom trackloader"]
DOS --> FILETYPE{"Examine main executable"}
NDOS --> PROTECTED{"Trackloader type?"}
PROTECTED -->|"Standard MFM"| MFM["Raw track read<br/>sync $4489"]
PROTECTED -->|"Custom format"| CUSTOM["Non-standard sync<br/>long tracks, weak bits"]
FILETYPE -->|"HUNK executable"| HUNK["Standard RE workflow"]
FILETYPE -->|"Packed / Crunched"| PACKED["Decrunch first<br/>→ exe_crunchers.md"]
FILETYPE -->|"Raw binary blob"| RAW["Absolute load address<br/>no relocations"]
```
### NDOS vs DOS-Based Games
| Type | Boot Sequence | Disk Format | RE Strategy |
|---|---|---|---|
| **NDOS** | Bootblock disables OS, installs custom trackloader | Non-standard or raw MFM | Dump from emulator after boot; analyze raw memory image |
| **DOS-based** | Standard AmigaDOS boot; executable launched from disk | Standard OFS/FFS | Analyze HUNK executable directly; may still use hardware banging |
| **Hybrid** | AmigaDOS boot, but executable takes over hardware | Standard filesystem | Analyze HUNK, but expect `Forbid()` + direct register access |
Most pre-1992 games are NDOS. Most post-1992 titles (especially AGA games and CD32 ports) use AmigaDOS but still bypass the OS for graphics and audio.
### Packed vs Unpacked
Games were packed to fit on floppies. Common packers:
| Packer | Era | Signature | Decrunch Speed |
|---|---|---|---|
| **PowerPacker** | 1989–1994 | `$42` + `LEA`/`MOVE.L` pattern | Fast |
| **Imploder** | 1988–1992 | `$49` (often); ATN!Imploder header | Medium |
| **ByteKiller** | 1988–1991 | Short `BRA.S` over header, `MOVEM.L` | Very fast |
| **Shrinkler** | 1999+ | Context-mixing setup; no fixed magic | Slow (minutes on 7 MHz) |
> [!IMPORTANT]
> Always attempt automated decrunching with `xfdmaster.library` before manual analysis. See [Executable Unpacking](../unpacking_and_decrunching.md) for the full decruncher archaeology workflow.
### Copy Protection Landscape
| Scheme | Mechanism | How to Defeat |
|---|---|---|
| **Rob Northen Copylock** | Trace exception decryption tied to disk timing | Let it run in emulator; dump decrypted payload from RAM |
| **Custom trackloader** | Non-standard MFM, long tracks | Use RawDIC + Imager Slave; see [Custom Loaders](../custom_loaders_and_drm.md) |
| **Weak / fuzzy bits** | Mastered flux that reads randomly | Preserve as IPF; ADF loses the weakness |
| **Checksum loops** | Self-checksums with delayed failure | NOP out checksum routine; trace to find patch point |
---
## Tooling
### IRA — Interactive Reassembler
IRA is the native Amiga disassembler of choice for generating **re-assemblable source code**. Unlike IDA or Ghidra, which produce annotated databases, IRA outputs 68000 assembly source (`.asm`) plus a configuration file (`.cnf`) that you can refine iteratively.
**Basic workflow:**
```bash
# First pass: auto-detect code vs data
ira -A -KEEPZH -NEWSTYLE -COMPAT=bi -PREPROC MatchPatch
# Refine the .cnf file manually, then re-run with config:
ira -A -KEEPZH -NEWSTYLE -COMPAT=bi -CONFIG MatchPatch
```
| Flag | Purpose |
|---|---|
| `-A` | Show hex bytes alongside disassembly (invaluable for hex editing later) |
| `-KEEPZH` | Preserve zero hunks (empty hunks that may hold metadata) |
| `-NEWSTYLE` | Modern label naming convention |
| `-COMPAT=bi` | Assembler-compatibility flags `b` and `i` (see the IRA documentation for what each flag changes) |
| `-PREPROC` | Attempt automatic code/data separation |
| `-CONFIG` | Re-run using manual `.cnf` corrections |
**Why IRA over IDA/Ghidra for games?**
- Outputs compilable assembly source, not just annotations
- The `.cnf` file lets you add `SYMBOL` (rename labels), `LABEL` (add new labels), `COMMENT`, and `BANNER` directives, then regenerate
- Native awareness of Amiga executable quirks
- Cross-platform: compiles for Windows, macOS, and Linux (and runs much faster than on real hardware)
**Signs IRA misidentified code as data:**
- `DC.L` lines containing values like `$4E75` (`RTS`) or `$4E71` (`NOP`)
- Data areas with labels that look like subroutine names
**Signs IRA misidentified data as code:**
- Strings of `EXT_` declarations at the start of a program
- Code sections full of `ORI #0` (`$0000`)
- Sequences of hex values from `$41` to `$7A` — these are likely unidentified ASCII text
### Ghidra + ghidra-amiga
Ghidra with the `ghidra-amiga` extension provides a full HUNK loader, M68k decompiler, custom chip register mapping, and automatic LVO resolution. See [Ghidra Setup](../ghidra_setup.md) for installation and configuration.
**When Ghidra shines for games:** C-coded late-era titles, cross-reference graphs, global renaming.
**When Ghidra struggles:** Hand-written assembly with no prologues, self-modifying code, `JMP (PC, D0.W)` jump tables, and mixed code/data sections.
### IDA Pro
IDA Pro with the Amiga HUNK plugin is the traditional static analysis choice for Amiga binaries. It excels at interactive annotation, FLIRT signatures, and scripted automation. See [IDA Setup](../ida_setup.md) for configuration details.
**When IDA shines for games:** Interactive tracing, custom IDA Python scripts for jump table resolution, and hardware register enum creation.
**When IDA struggles:** No native M68k decompiler (unlike Ghidra). Heavily optimized hand-written assembly requires manual function boundary definition.
### Where to Get the Tools
| Tool | Where to Obtain | Notes |
|---|---|---|
| **IRA** | Aminet: `dev/misc/ira.lha` | Also compiles from source for Windows/macOS/Linux |
| **Ghidra** | https://ghidra-sre.org/ | Free, from NSA; v10.x+ recommended |
| **ghidra-amiga** | https://github.com/BartmanAbyss/ghidra-amiga | Load as Ghidra extension; do not unzip |
| **IDA Pro** | https://hex-rays.com/ida-pro/ | Commercial; requires separate Amiga HUNK plugin |
| **WinUAE** | https://www.winuae.net/ | Windows Amiga emulator with built-in debugger |
| **FS-UAE** | https://fs-uae.net/ | Cross-platform (macOS/Linux/Windows); debugger via `Shift+F12` |
| **xfdmaster.library** | Aminet: `util/pack/xfdmaster.lha` | Native Amiga decruncher; use via `xfdDecrunch` CLI |
| **hunkinfo** | Aminet: `dev/misc/hunkinfo.lha` | Quick hunk structure dump |
| **SPS / IPF tools** | https://softpres.org/ | For preserving copy-protected disks as IPF |
| **RawDIC** | Bundled with WHDLoad distribution | Used with custom Imager Slaves for protected disks |
> [!NOTE]
> Many of these tools are also available pre-installed in curated Amiga emulation distributions like **Amiga Forever** or pre-configured WinUAE environments from EAB.
### Emulator Debugging
Static analysis alone is often insufficient for games. You need dynamic verification.
**WinUAE / FS-UAE debugger:**
| Command | Action |
|---|---|
| `Shift+F12` | Enter debugger |
| `g [<address>]` | Resume execution (optionally from `<address>`) |
| `t [<count>]` | Step one or more instructions (step into) |
| `z` | Step through one instruction (step over `JSR`, `DBRA`, etc.) |
| `f <address>` | Add or remove a breakpoint |
| `w <num> <address> <length> W` | Write watchpoint |
| `w <num> <address> <length> R` | Read watchpoint |
| `m <address>` | Dump memory |
| `d <address>` | Disassemble from address |
| `S <file> <address> <length>` | Save memory range to file |
**Critical technique — memory dump after decrunch:**
1. Boot the game in emulator
2. Enter debugger (`Shift+F12`)
3. Find the decruncher's final `JMP` to the original entry point
4. Set a breakpoint on that `JMP` (`f <address>`)
5. Let the game run while it decrunches in memory
6. When the breakpoint hits, dump the entire decrunched region with `S`
---
## Pre-Flight: Research Before Disassembly
The most common mistake in game RE is failing to check if the work is already done.
1. **Search for existing analysis** — EAB (English Amiga Board) threads, GitHub repos, speedrun communities, and TCRF (The Cutting Room Floor) often document exactly what you are looking for.
2. **Check for source code releases** — Original authors sometimes release source decades later (e.g., *Frontier: Elite II* sources, various demo sources).
3. **Contact the author** — Many Amiga developers are reachable and willing to share insights or even original source.
4. **Preserve the original media** — If dealing with copy-protected disks, create IPF images (not ADF) using SPS/IPF tools. ADF loses weak-bit protection and custom track formats.
---
## Phase 1: Triage and Loading
### Step 1: Identify the Binary Type
```bash
# Check first bytes of disk image or executable
xxd game.adf | head -1
```
| First Bytes | Meaning |
|---|---|
| `44 4F 53 0x` (`DOS` + flags byte) | Bootable disk: either a standard AmigaDOS bootblock or a custom one that keeps the signature |
| Anything else | Non-bootable data disk or fully custom format |
| `00 00 03 F3` | HUNK executable (file, not disk) |
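The table can be turned into a small triage helper for batch-sorting images. This is a minimal Python sketch; `classify_image` and its return strings are illustrative, and it checks only the magic bytes listed above.

```python
def classify_image(data: bytes) -> str:
    """Classify a file by its first four bytes (see the table above)."""
    head = data[:4]
    if head == b"\x00\x00\x03\xf3":
        return "HUNK executable"
    if head[:3] == b"DOS":
        # The fourth byte holds filesystem flags; the bootblock code may
        # still be a custom loader rather than the standard one.
        return "DOS-signature bootblock"
    return "unknown / raw data"
```

A typical call would be `classify_image(open("game.adf", "rb").read(4))`. A `DOS` signature only says the disk can boot, so the bootblock code still needs to be disassembled to see what it loads.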
### Step 2: For NDOS Disks — Extract the Bootblock
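The bootblock is the first 1024 bytes (2 sectors) of the disk. Kickstart loads it into a buffer allocated from Chip RAM, so the load address is not fixed and bootblock code must be position-independent. Execution starts at offset `$0C`, after the signature, checksum, and root block fields.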
```bash
# Extract bootblock from ADF
dd if=game.adf of=bootblock.bin bs=1024 count=1
```
Load `bootblock.bin` at any convenient base address (the code is position-independent), begin disassembly at offset `$0C`, and look for:
- `BRA` or `JMP` to the main loader
- `DSKSYNC` writes (`$DFF07E`) — indicates trackloader
- `INTENA` / `DMACON` writes — indicates OS takeover
### Step 3: For HUNK Executables — Dump Structure
```bash
hunkinfo game.exe
```
Note hunk types, sizes, and whether symbols are present. Some late-era games shipped with debug symbols accidentally left in — these are gold.
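If `hunkinfo` is not at hand, the hunk size table can be read straight from the file header. The sketch below is a minimal Python parser for the `HUNK_HEADER` block only, assuming the commonly documented layout (magic `$3F3`, resident library name list, table size, first and last hunk numbers, then one size longword per hunk with memory flags in the top two bits). `parse_hunk_header` is an illustrative name, not part of any existing tool.

```python
import struct

HUNK_HEADER = 0x3F3
MEM_NAMES = {0: "ANY", 1: "CHIP", 2: "FAST", 3: "EXTENDED"}

def parse_hunk_header(data: bytes) -> list[tuple[int, str]]:
    """Return (size_in_bytes, memory_type) for each hunk in the header."""
    pos = 0

    def read_long() -> int:
        nonlocal pos
        (value,) = struct.unpack_from(">L", data, pos)
        pos += 4
        return value

    if read_long() != HUNK_HEADER:
        raise ValueError("not a HUNK executable")
    # Resident library names: normally just a terminating zero longword.
    while (name_longs := read_long()) != 0:
        pos += 4 * name_longs
    read_long()  # table size (unused here)
    first, last = read_long(), read_long()
    hunks = []
    for _ in range(last - first + 1):
        raw = read_long()
        flags = raw >> 30  # bit 30 = CHIP, bit 31 = FAST
        if flags == 3:
            read_long()  # extended memory attributes follow
        hunks.append(((raw & 0x3FFFFFFF) * 4, MEM_NAMES[flags]))
    return hunks
```

Hunks flagged `CHIP` are a useful hint in game work: they usually hold graphics, copper lists, or sample data rather than code.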
### Step 4: Detect Packing
Scan the first CODE hunk for packer signatures. See [Executable Unpacking](../unpacking_and_decrunching.md) for the full signature table.
---
## Phase 2: Finding Anchors
A 500 KB game binary is overwhelming. You need **anchors** — known values or patterns that let you orient yourself.
### Anchor 1: Text Strings
Games contain strings for menus, cheat codes, status messages, and file names.
```bash
# Extract strings from binary
strings game.exe > strings.txt
# Or with IRA:
ira -TEXT=1 -A -PREPROC game.exe
```
**String types and what they reveal:**
| String Pattern | Likely Meaning |
|---|---|
| `"graphics.library"` | Game uses OS graphics (unusual) |
| `"dos.library"` | Game uses OS file I/O |
| `"FORM"`, `"ILBM"`, `"8SVX"` | Embedded IFF assets |
| `"MATCHPATCH"`, `"ZOOL"` | Game title or internal project name |
| File paths like `"worlds/nif2txt.dat"` | Level data loading routines nearby |
| `"CHEAT ENABLED"` | Cheat code handler |
> [!WARNING]
> `strings` often returns garbage from misidentified data sections. Cross-reference with the disassembly to confirm the string is actually referenced by code.
### Anchor 2: Known Numeric Values
Games contain specific numbers: starting lives, maximum health, item prices, level counts.
| Decimal | Hex (16-bit) | Hex (32-bit) | Likely Meaning |
|---|---|---|---|
| 3 | `$0003` | `$00000003` | Starting lives |
| 10 | `$000A` | `$0000000A` | Common statistic |
| 100 | `$0064` | `$00000064` | Percentage scale, health |
| 1000 | `$03E8` | `$000003E8` | Score multiplier, currency |
| 320 | `$0140` | `$00000140` | Screen width (lowres) |
| 200 | `$00C8` | `$000000C8` | Screen height (NTSC lowres; PAL lowres is 256 = `$0100`) |
Search for these values in hex. If you find a `MOVE.W #$0003, D0` near initialization code, you have likely found the lives setup.
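Searching for the bare value `$0003` produces many false hits, so it often helps to search for the whole instruction encoding instead. The Python sketch below looks for `MOVE.W #imm,Dn`, whose opcode word is `$303C` with the data register number in bits 9 to 11. `find_move_w_imm` is an illustrative helper, and the scan assumes the blob starts on a word boundary.

```python
import struct

def find_move_w_imm(code: bytes, value: int) -> list[tuple[int, int]]:
    """Return (offset, data_register) for each MOVE.W #value,Dn encoding."""
    imm = struct.pack(">H", value & 0xFFFF)
    hits = []
    for offset in range(0, len(code) - 3, 2):  # opcodes are word aligned
        opcode = (code[offset] << 8) | code[offset + 1]
        if opcode & 0xF1FF == 0x303C and code[offset + 2:offset + 4] == imm:
            hits.append((offset, (opcode >> 9) & 7))
    return hits
```

Small constants are also frequently loaded with `MOVEQ` or written directly to memory, so an empty result here does not rule the value out.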
### Anchor 3: Hardware Register Accesses
Even games that take over the OS usually hit hardware registers. These are unambiguous anchors.
| Register | Address | What It Reveals |
|---|---|---|
| `JOY0DAT` | `$DFF00A` | Joystick/mouse port 0 reads — player input handling |
| `JOY1DAT` | `$DFF00C` | Joystick/mouse port 1 reads |
| `AUD0LCH`–`AUD3LCH` | `$DFF0A0`–`$DFF0D0` | Audio channel setup: sound effects, music |
| `VHPOSR` | `$DFF006` | Vertical/horizontal position — RNG seeding, VBlank waits |
| `VPOSR` | `$DFF004` | Vertical position (high bits) — frame timing |
| `COP1LC` | `$DFF080` | Copper list pointer — display setup |
| `BLTCON0` | `$DFF040` | Blitter control — graphics rendering |
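A quick way to locate these anchors in a memory dump is to scan for the absolute addresses themselves. The following Python sketch covers only 32-bit absolute references to a few registers from the table; the register dictionary and function name are illustrative. Code that sets a base register first (for example `LEA $DFF000,A5`) and then uses small offsets will not be found this way.

```python
CUSTOM_REGS = {
    0xDFF004: "VPOSR",
    0xDFF006: "VHPOSR",
    0xDFF00A: "JOY0DAT",
    0xDFF00C: "JOY1DAT",
    0xDFF040: "BLTCON0",
    0xDFF080: "COP1LC",
    0xDFF0A0: "AUD0LCH",
}

def find_register_refs(code: bytes) -> list[tuple[int, str]]:
    """Return (offset, register_name) for each absolute-long reference."""
    hits = []
    for offset in range(0, len(code) - 3, 2):  # operands are word aligned
        addr = int.from_bytes(code[offset:offset + 4], "big")
        name = CUSTOM_REGS.get(addr)
        if name is not None:
            hits.append((offset, name))
    return hits
```

Each hit is the offset of the address operand, so the instruction that uses it begins one or more words earlier.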
**Input handling identification:**
```asm
; Classic joystick read pattern:
MOVE.W $DFF00C, D0 ; Read JOY1DAT (second game port, the usual joystick port)
AND.W #$0303, D0 ; Keep direction bits 0, 1, 8, and 9
; ... decode into game state ...
```
**Random number generator identification:**
```asm
; Common RNG seed pattern — uses beam position for entropy:
MOVE.W $DFF006, D0 ; VHPOSR = current raster position
; ... shuffle with ROXR ...
```
The `ROXR` (rotate right with extend) instruction is a strong hint of an RNG routine, although it also appears in multi-word shifts such as soft-scroll code. Once you find the RNG, every caller is potentially a game mechanic.
### Anchor 4: AmigaOS Library Calls
Some games, especially later titles and CD32 ports, use AmigaOS for initialization or file I/O.
```asm
; OpenLibrary call pattern:
MOVEA.L 4.W, A6 ; SysBase
LEA graphics_name(PC), A1
MOVEQ #33, D0 ; minimum version
JSR -552(A6) ; OpenLibrary
```
| LVO | Library | Function | Game RE Relevance |
|---|---|---|---|
| `-552` | exec | `OpenLibrary` | Loading libraries |
| `-30` | dos | `Open` | File loading — level data, save games |
| `-42` | dos | `Read` | Reading data into memory — reveals file→memory mapping |
| `-48` | dos | `Write` | **Save games and high scores** — critical for state analysis |
| `-36` | dos | `Close` | File cleanup |
**Save game exploitation:** `Write` calls in games are almost always save games or high scores. The data written holds persistent game state (inventory, stats, level progress). Finding the `Write` call reveals the in-memory structure of the save game, enabling editor construction.
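
To list candidate library calls in bulk, you can scan for the `JSR d16(A6)` opcode (`$4EAE`) and decode the signed displacement. The sketch below is an assumption-laden helper, not a disassembler: it cannot know which library base `A6` holds at each call, so the name it prints is only a guess drawn from the table above.

```python
import struct

LVO_NAMES = {
    -552: "OpenLibrary (exec)",
    -30: "Open (dos)",
    -36: "Close (dos)",
    -42: "Read (dos)",
    -48: "Write (dos)",
}

def find_lvo_calls(data: bytes):
    """Return (offset, displacement, guess) for every word-aligned JSR d16(A6)."""
    calls = []
    for pos in range(0, len(data) - 3, 2):
        if data[pos:pos + 2] == b"\x4e\xae":
            (disp,) = struct.unpack_from(">h", data, pos + 2)
            # Library vectors are negative multiples of 6 (one jump-table slot each).
            if disp < 0 and disp % 6 == 0:
                calls.append((pos, disp, LVO_NAMES.get(disp, "unknown")))
    return calls
```

Confirm each guess by tracing back to the instruction that loaded `A6`.
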
### Anchor 5: File Read Patterns
```asm
; File read pattern — reveals memory destinations:
JSR -30(A6) ; Open(file_name, MODE_OLDFILE)
MOVE.L D0, D1 ; FileHandle (MOVEA cannot target a data register)
MOVE.L #buffer, D2 ; Destination address
MOVE.L #size, D3 ; Bytes to read
JSR -42(A6) ; Read(FileHandle, buffer, size)
```
The `buffer` address is the in-memory location of that file's data. Cross-reference this with string anchors (e.g., `"worlds/nif2txt.dat"`) to map file contents to memory layout.
---
## Phase 3: Mapping Game Mechanics
Once anchored, trace outward to reconstruct the game's logic.
### Finding the Main Game Loop
Games need to synchronize to the display refresh (50 Hz PAL / 60 Hz NTSC). Look for:
```asm
; VBlank wait pattern — the heartbeat of the game:
wait_vblank:
MOVE.W $DFF006, D0 ; VHPOSR
AND.W #$FF00, D0 ; Keep V7-V0 (low 8 bits of the line number)
CMP.W #$0000, D0 ; Line 0; also matches line 256, because V8 lives in VPOSR
BNE.S wait_vblank
```
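Or the more common check, which reads `VPOSR` and `VHPOSR` together as one longword:
```asm
wait_frame:
MOVE.L $DFF004, D0 ; VPOSR (high word) + VHPOSR (low word)
AND.L #$0001FF00, D0 ; Keep V8 (bit 16) and V7-V0 (bits 15-8)
CMP.L #303<<8, D0 ; Wait for line 303, near the bottom of the PAL frame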
BNE.S wait_frame
```
The code immediately following the VBlank wait is usually the start of the **main game loop** body, that is, the per-frame update.
### Identifying Score and Lives
1. **Hex search**: Search for the starting value (e.g., 3 lives = `$0003`).
2. **Cross-reference**: Find all instructions that write this value. One is initialization; others are decrement (lose life) or increment (gain life).
3. **Verify with emulator**: Patch the value at the memory location and run the game. If you start with 99 lives, you found it.
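
When the static search is ambiguous, comparing two memory dumps narrows it down quickly. The sketch below assumes you saved one dump before losing a life and one after, and that the counter is stored as a plain big-endian integer; the function name and parameters are illustrative.

```python
def find_changed_values(before: bytes, after: bytes, old: int, new: int, width: int = 1):
    """Return offsets where a big-endian value of `width` bytes changed from old to new."""
    old_bytes = old.to_bytes(width, "big")
    new_bytes = new.to_bytes(width, "big")
    limit = min(len(before), len(after)) - width + 1
    return [
        offset
        for offset in range(0, limit, width)
        if before[offset:offset + width] == old_bytes
        and after[offset:offset + width] == new_bytes
    ]
```

Repeating the comparison after a second lost life (2 to 1) and intersecting the results usually leaves only a handful of candidates to patch and test.
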
### Identifying the RNG
```asm
; Typical Amiga game RNG (seeded from beam position):
rng_seed:
MOVE.W $DFF006, D0 ; VHPOSR
EOR.W D0, rng_state ; Mix with current state
ROXR.W rng_state ; Shuffle (the memory form always rotates by one bit)
MOVE.W rng_state, D0 ; Return random value
RTS
```
**Key signatures:**
- Read from `$DFF006` or `$DFF004`
- `ROXR` or `ROR` instruction
- Called from combat, item drops, enemy spawn, or any probabilistic mechanic
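
Once the routine is identified, modelling it on the host makes its behaviour easier to reason about. This is a minimal sketch of the example routine above, not of any particular game. It assumes the caller supplies the X flag on entry, because `EOR` leaves X unchanged and its value depends on whatever ran before.

```python
def roxr16(value: int, x_flag: int):
    """Rotate a 16-bit value right by one bit through the X flag, as ROXR.W does."""
    new_x = value & 1
    result = ((x_flag & 1) << 15) | ((value & 0xFFFF) >> 1)
    return result, new_x

def rng_step(state: int, x_flag: int, vhposr: int):
    """One call of the example routine: mix in the beam position, then rotate."""
    mixed = (state ^ vhposr) & 0xFFFF
    return roxr16(mixed, x_flag)
```

`rng_step` returns the new state together with the new X flag, so successive calls can be chained with recorded `VHPOSR` values from an emulator trace.
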
### Audio as a Navigation Aid
Audio register writes reveal what the code is doing:
```asm
; Sound effect trigger:
MOVE.L #bullet_sample, $DFF0A0 ; AUD0LCH/LCL = sample pointer
MOVE.W #period, $DFF0A6 ; AUD0PER = playback period
MOVE.W #volume, $DFF0A8 ; AUD0VOL = volume
```
If you extract the sample referenced by `bullet_sample` and hear a gunshot, you have found the shooting code. From there, trace back to find collision detection, enemy damage, and scoring.
---
## Phase 4: Modification and Patching
### Hex Editing Known Values
Once you have identified a value's location in the binary:
1. Note the file offset and original bytes from the IRA `-A` output
2. Open the binary in a hex editor
3. Search for the unique byte sequence surrounding the value
4. Patch and test
| File Offset | Original | Patched | Effect |
|---|---|---|---|
| `$0001A4` | `66 0A` | `4E 71` | Replace the 2-byte `BNE.S` with one `NOP` (branch never taken) |
| `$003210` | `03` | `63` | Change starting lives from 3 to 99 |
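
A patch table like this is easy to apply by script, which also lets you enforce the two rules that matter: the original bytes must match, and the replacement must be the same length. The helper below is a hypothetical sketch, not part of any existing tool.

```python
def apply_patch(image: bytearray, offset: int, original: bytes, patched: bytes) -> None:
    """Overwrite `original` with `patched` in place, refusing any unsafe change."""
    if len(original) != len(patched):
        raise ValueError("patch would change instruction length")
    if image[offset:offset + len(original)] != original:
        raise ValueError("original bytes do not match; wrong offset or wrong version")
    image[offset:offset + len(patched)] = patched
```

Load the file into a `bytearray`, call `apply_patch` once per table row, and write the result to a new file so the original stays untouched.
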
> [!WARNING]
> Patched bytes must preserve instruction alignment. A 16-bit `MOVE.W` patch that changes length will shift all subsequent code and break absolute addresses.
### Building a Re-Assemblable Patch
For complex modifications, IRA's output is ideal:
1. Disassemble with IRA to get `game.asm` and `game.cnf`
2. Edit `game.asm` directly (e.g., change `MOVEQ #3, D0` to `MOVEQ #99, D0`)
3. Assemble with vasm or AsmOne
4. Test on emulator
### Trainer / Cheat Menu Construction
A trainer is a small patch that installs a hotkey handler to modify game state at runtime:
```asm
; Minimal trainer hook — intercepts keyboard and grants lives
MOVE.B $BFEC01, D0 ; Read keyboard serial data (CIA-A SDR); byte access, the address is odd
NOT.B D0
ROR.B #1, D0 ; Decode raw keycode
CMP.B #$50, D0 ; F1 key? (raw keycode $50; $45 is Esc)
BNE.S .no_cheat
MOVE.B #99, lives_counter ; Grant 99 lives
.no_cheat:
```
Install this in the VBlank interrupt or keyboard handler. See [SetFunction Patching](../dynamic/setfunction_patching.md) for runtime hook techniques.
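
The `NOT.B` / `ROR.B #1` pair undoes the keyboard's wire format, which sends the keycode inverted and rotated left by one bit. The sketch below mirrors that decode on the host so you can check captured bytes against raw keycodes; the function name is an illustrative assumption.

```python
def decode_keyboard_byte(sdr: int):
    """Mirror NOT.B / ROR.B #1: return (raw_keycode, key_released) from a CIA-A SDR byte."""
    inverted = ~sdr & 0xFF
    rotated = ((inverted >> 1) | (inverted << 7)) & 0xFF
    return rotated & 0x7F, bool(rotated & 0x80)
```

After the rotate, bit 7 is the up/down flag, which is why a `CMP.B` against the bare keycode matches key presses only.
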
---
## Phase 5: Asset Extraction
### Text and Strings
Use the IRA `-TEXT=1` option or `strings` to find all text. For games with custom text encoding (e.g., compressed or shifted character sets), identify the font rendering routine and reverse the encoding table.
### Graphics — IFF Extraction
Amiga games often store graphics in IFF format (`FORM ILBM`) or raw planar bitmaps.
**IFF detection:** Search for `FORM` (`$464F524D`) and `ILBM` (`$494C424D`) signatures in the binary or memory dump. Inside the `FORM`, the `BMHD` chunk gives you width, height, and depth, and the `CMAP` chunk holds the palette.
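
The signature search can be sketched as follows. It assumes the `BMHD` chunk appears within the first few dozen bytes of the `FORM`, which is typical but not guaranteed, and it reads only the width, height, and plane count. The function name is hypothetical.

```python
import struct

def find_ilbm_headers(data: bytes):
    """Return (offset, width, height, planes) for each FORM....ILBM with an early BMHD."""
    found = []
    start = 0
    while (pos := data.find(b"FORM", start)) != -1:
        start = pos + 4
        if data[pos + 8:pos + 12] != b"ILBM":
            continue
        bmhd = data.find(b"BMHD", pos + 12, pos + 64)
        if bmhd == -1 or bmhd + 17 > len(data):
            continue
        # BMHD payload starts 8 bytes in: width, height, x, y, nPlanes, ...
        width, height = struct.unpack_from(">HH", data, bmhd + 8)
        planes = data[bmhd + 16]
        found.append((pos, width, height, planes))
    return found
```

The longword after `FORM` is the chunk length, so `offset + 8 + length` marks where to cut the file out of the dump.
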
**Raw planar extraction:**
1. Find `BPL1PT`–`BPL5PT` writes in the copper list or code
2. The pointers reveal bitmap base addresses in memory
3. Dump the memory range; decode as planar (interleaved or non-interleaved based on `BPL1MOD`/`BPL2MOD`)
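
Decoding the dumped planes into pixel values can be sketched like this. It assumes non-interleaved bitplanes, a width that is a multiple of 8, and zero modulo; interleaved layouts need a different `byte_index` calculation. The function name is illustrative.

```python
def planar_to_chunky(planes, width: int, height: int):
    """Convert non-interleaved bitplanes to rows of palette indices.

    `planes` is a list of bytes objects, plane 0 first; each row is width // 8 bytes.
    """
    row_bytes = width // 8
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            byte_index = y * row_bytes + x // 8
            bit = 7 - (x % 8)  # leftmost pixel is the most significant bit
            pixel = 0
            for plane_number, plane in enumerate(planes):
                pixel |= ((plane[byte_index] >> bit) & 1) << plane_number
            row.append(pixel)
        rows.append(row)
    return rows
```

Each value in the result indexes the palette taken from the copper list or from the `COLORxx` writes you traced.
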
### Audio — Sample and Module Extraction
| Format | Signature | Extraction |
|---|---|---|
| **8SVX** | `FORM` + `8SVX` | IFF audio chunk; playable directly |
| **Protracker MOD** | `M.K.`, `FLT4`, `4CHN` at file offset 1080 | Standard 31-sample + pattern data format |
| **Raw PCM** | None; identified via `AUDxLCH` writes | Mono 8-bit signed; import to Audacity as raw 8-bit signed, ~8000–28000 Hz |
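
For raw PCM, the playback rate follows from the period value written to `AUDxPER`: on a PAL machine it is the Paula clock constant 3546895 divided by the period. The sketch below converts a dumped sample to a WAV file using that rate. WAV stores 8-bit audio as unsigned, so each byte is offset by 128. The function names are assumptions for illustration.

```python
import io
import wave

PAL_CLOCK_HZ = 3546895  # Paula clock constant on PAL machines

def period_to_rate(period: int) -> int:
    """Convert an AUDxPER period value to a sample rate in Hz."""
    return round(PAL_CLOCK_HZ / period)

def signed8_to_wav(samples: bytes, period: int) -> bytes:
    """Wrap signed 8-bit mono samples in a WAV container and return its bytes."""
    unsigned = bytes((b + 128) & 0xFF for b in samples)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(1)
        wav.setframerate(period_to_rate(period))
        wav.writeframes(unsigned)
    return buffer.getvalue()
```

Write the returned bytes to a `.wav` file and play it; a wrong period only changes the pitch, so the sample is still recognisable.
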
---
## Decision Guide: Choosing Your Toolchain
| Scenario | Recommended Tool | Why |
|---|---|---|
| Need re-assemblable source | **IRA** | Outputs `.asm` + `.cnf`; iterative refinement |
| Need C pseudocode / cross-references | **Ghidra + ghidra-amiga** | Decompiler, global renaming, xref graph |
| Heavy OS library usage (late games) | **Ghidra** | Automatic LVO resolution |
| Pure assembly, no OS calls | **IRA + emulator** | Ghidra decompiler gives up; IRA + dynamic trace works better |
| Packed / protected game | **Emulator debugger first** | Let protection run, dump decrypted memory, then load into static tool |
| Quick value patch (lives, score) | **Hex editor** | Fastest for one-byte changes |
| Bootblock analysis (1024 bytes) | **IRA or raw disassembly** | Small enough to read linearly |
---
## Historical Context
### Why Games Bypassed the OS
| Factor | Impact |
|---|---|
| **7 MHz CPU** | Every CPU cycle mattered. `graphics.library` added overhead (layer locking, clipping checks). |
| **512 KB Chip RAM** | OS structures consumed precious DMA-accessible memory. Games needed every byte for sprites and sound. |
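| **Disk speed** | 880 KB floppy. The drive delivers at most ~28 KB/s (11 sectors × 512 bytes per revolution at 5 revolutions per second), and AmigaDOS file loading is usually well below that. Custom trackloaders achieved 2–3× the AmigaDOS speed by reading raw tracks sequentially. |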
| **Copy protection** | AmigaDOS disks were trivially copyable with X-Copy. NDOS + custom formats + weak bits made mass duplication harder. |
| **Demoscene culture** | Assembly was the standard. Using a C compiler for a game engine was seen as lazy until the mid-1990s. |
### The NDOS-to-DOS Transition
- **1985–1990**: Almost all commercial games are NDOS, hand-written assembly, custom trackloaders.
- **1990–1993**: Hybrid era. Games boot from AmigaDOS but take over hardware after loading. Some use `dos.library` for file I/O.
- **1993–1996**: AGA and CD32 era. Larger budgets, more C code, AmigaDOS-based loading. WHDLoad emerges to patch games for hard drive installation.
---
## Modern Analogies
| Amiga Game RE Concept | Modern Equivalent | Where It Holds / Breaks |
|---|---|---|
| NDOS bootblock takeover | UEFI bootkit / custom bootloader | Holds: bypasses OS entirely. Breaks: bootblock is 1024 bytes, UEFI is MBs. |
| Custom trackloader | Direct NAND flash controller access | Holds: raw media access for speed. Breaks: no MFM encoding on flash. |
| Executable packer | UPX, VMProtect packing | Holds: runtime decompression + jump to OEP. Breaks: modern packers use virtualization. |
| Rob Northen Copylock | Denuvo anti-tamper | Holds: trace/exception abuse, timing checks. Breaks: Copylock is 68000-specific; Denuvo uses x64 VM. |
| Hardware register banging | Embedded MCU programming (STM32, Arduino) | Holds: direct MMIO register access. Breaks: Amiga chips are video/audio-specific. |
| Memory patch (lives counter) | Cheat Engine / GameGuardian | Holds: scan for known value, patch at runtime. Breaks: modern games use encrypted/process-isolated memory. |
---
## Best Practices
1. **Always preserve original media as IPF before modifying** — ADF loses copy protection and custom formats.
2. **Try automated decrunching first**: `xfdmaster.library` can save hours.
3. **Document every patch with file offset, original bytes, and rationale** — you will forget why you changed something.
4. **Use the `-A` flag in IRA** — seeing raw hex bytes alongside disassembly is essential for building patch tables.
5. **Verify anchors dynamically** — a suspected lives counter may actually be a loop iterator. Patch and test in emulator.
6. **Build a register map as you trace** — hand-written assembly has no ABI. Document what each register means in each routine.
7. **Save memory dumps at key moments** — after decrunch, after level load, after title screen. Compare dumps to find dynamic data structures.
8. **Trace audio register writes to locate game events** — sound effects are the most reliable event markers in assembly.
9. **Cross-reference file reads with string anchors**: `"level1.dat"` + `dos.library Read` = level data structure.
10. **Work iteratively** — name one function, trace its callers, name them too. Do not attempt to understand the entire binary in one pass.
---
## Antipatterns
### 1. The Linear Reading Trap
**Wrong**: Opening the disassembly at offset 0 and reading top-to-bottom expecting to understand the game.
**Why it fails**: Hand-written assembly is non-linear. The entry point sets up interrupts and copper lists, then the real game logic lives in ISR chains and event handlers scattered across the binary.
**Right**: Start from anchors (strings, hardware registers, known values) and trace outward using cross-references.
### 2. The Compiler Assumption
**Wrong**: Expecting `A6` to be a library base, `D0`/`D1` to be scratch, and `LINK`/`UNLK` function boundaries.
**Why it fails**: Games are hand-written assembly. `A6` might hold the hardware base pointer. `D6` might be the frame counter. Functions may have no prologue.
**Right**: Treat every register as unknown until proven otherwise. Document the actual convention per routine.
### 3. The OS Dependency Delusion
**Wrong**: Searching extensively for `JSR -N(A6)` library calls to anchor analysis.
**Why it fails**: Most games make zero OS calls after initialization. The action is at `$DFF000` and `$BFE001`, not in `exec.library`.
**Right**: Scan for hardware register constants (`$DFF`, `$BFE`) first. If none appear, then check for OS calls.
### 4. The Phantom String
**Wrong**: Assuming every readable ASCII sequence in the binary is a meaningful string.
**Why it fails**: Random data bytes can decode as printable ASCII. A string with no code cross-reference is likely not a string.
**Right**: Always verify that code actually references the string address (via `LEA string(PC), A0` or similar).
### 5. The In-Place Patch Disaster
**Wrong**: Changing a `MOVEQ #3, D0` to `MOVE.W #99, D0` without checking instruction length.
**Why it fails**: `MOVEQ #3, D0` is 2 bytes. `MOVE.W #99, D0` is 4 bytes. The patch overflows into the next instruction, corrupting the code stream.
**Right**: Use equivalently-sized instructions. `MOVEQ #99, D0` is valid (the `MOVEQ` range is -128 to +127) and still 2 bytes. Larger values need `MOVE.W` or `MOVE.L`, which are longer, so find spare room (for example, a jump to a patch area) instead of overwriting in place.
---
## Pitfalls & Common Mistakes
### 1. Misidentifying Data as Code
Mixed code/data is the norm in game binaries. Copper lists, sprite data, and audio samples often reside in CODE hunks.
```asm
; This looks like instructions:
BCLR D0, D0 ; word $0180
ORI.B #$82, D0 ; words $0000 $0182
DC.W $0FFF ; no valid opcode, so the disassembler gives up
; But it is actually a copper list:
; DC.W $0180, $0000 = COLOR00 = $0000
; DC.W $0182, $0FFF = COLOR01 = $0FFF
```
**Fix**: Search for `COP1LC` writes to find copper list addresses. Force-define those ranges as data arrays, not code.
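
To confirm that a suspicious range really is a copper list, decode it as one and see whether the result makes sense. The sketch below classifies each two-word entry by its low bits and stops at the conventional `$FFFF, $FFFE` end marker. The tuple layout and function name are illustrative assumptions.

```python
import struct

def decode_copper_list(data: bytes, offset: int = 0, max_entries: int = 256):
    """Decode (kind, a, b) tuples until the conventional $FFFF,$FFFE end marker."""
    entries = []
    for index in range(max_entries):
        pos = offset + index * 4
        if pos + 4 > len(data):
            break
        first, second = struct.unpack_from(">HH", data, pos)
        if first == 0xFFFF and second == 0xFFFE:
            entries.append(("END", first, second))
            break
        if first & 1 == 0:
            # MOVE: first word is the register offset, second word is the value.
            entries.append(("MOVE", 0xDFF000 + (first & 0x01FE), second))
        elif second & 1 == 0:
            entries.append(("WAIT", first, second))
        else:
            entries.append(("SKIP", first, second))
    return entries
```

A run of `MOVE` entries that target colour or bitplane registers and ends in `END` is strong evidence that the range is data, not code.
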
### 2. Ignoring Relocation State
When you dump decrunched memory from an emulator, absolute addresses have already been patched by the decrunch stub. If you try to run that dump at a different load address, it crashes.
**Fix**: Note the load address used by the emulator. If re-assembling, either use the same base address or reconstruct PC-relative addressing.
### 3. Debugging After OS Death
Games call `Forbid()` and disable interrupts early. If your debugger relies on AmigaOS (like MonAm), it stops working the moment the game takes over.
**Fix**: Use emulator built-in debuggers (WinUAE/FS-UAE `Shift+F12`), a freezer-style monitor such as HRTmon, or hardware cartridges (Action Replay) that trap via NMI, not OS services.
### 4. Overlooking Self-Modifying Code
Copy protection and optimization both use SMC. The disassembly shows one instruction; the runtime executes another.
```asm
; Static disassembly shows:
patched:
MOVE.W D0, D0 ; This gets patched at runtime
; Init routine overwrites it:
LEA patched(PC), A0 ; PC-relative modes cannot be a destination, so load the address first
MOVE.W #$4E71, (A0) ; Patch in a NOP
```
**Fix**: Set write breakpoints on CODE hunk addresses in the emulator. If anything writes there during init, you have SMC.
### 5. Confusing VBlank Wait with Game Logic
The VBlank wait loop is easy to find but tells you nothing about *what* the game does each frame.
**Fix**: Trace forward from the VBlank wait exit. The next block is usually the frame update routine. Set a breakpoint there and step through one full frame.
---
## Use Cases
### Speedrun Research
Reverse engineering reveals frame-perfect mechanics: RNG seeds, hitbox dimensions, level transition triggers. Documenting the `rng_seed` routine and its callers lets speedrunners manipulate luck.
### Translation Projects
Finding the text rendering routine and font data enables text replacement. Games with embedded ASCII strings are trivial; games with custom encoding require reversing the blit-based font routine.
### Save Game Editors
The `dos.library Write` call that saves game state reveals the exact memory structure of the persistent state. Mapping this structure enables external save game editors.
### Modding and Enhancement
Patching the weapon damage table, adding a cheat menu, or replacing audio samples all require understanding the binary's data layout. IRA's re-assemblable output makes this sustainable.
### Preservation and Documentation
Documenting the internal structure of unreleased or poorly documented games contributes to the historical record. TCRF and similar archives rely on this work.
---
## FAQ
### Q1: IRA vs Ghidra — which should I use?
Use **IRA** when you need re-assemblable source code or are working with pure hand-written assembly. Use **Ghidra** when you need cross-references, decompilation, or the game was written in C. Many RE projects use both: Ghidra for exploration, IRA for final patch generation.
### Q2: How do I handle a game with no readable strings?
High entropy and no strings suggest encryption or compression. Let the game boot in an emulator, dump memory after decrunch/decryption, then analyze the dump. The decrypted payload will have strings.
### Q3: Can I reverse a game back to C?
Not really. Generic decompilation of hand-written 68000 assembly produces unreadable pseudocode. The only successful "decompilations" are hand-crafted rewrites based on deep understanding of the assembly (e.g., *GLFrontier*).
### Q4: How do I find the level data format?
Anchor on file read calls (`dos.library Read`) or look for large data tables referenced by the rendering code. Level data often follows audio/graphic assets in memory. Compare memory dumps between levels to find what changes.
### Q5: What if the game uses a custom trackloader I can't read?
Set a WinUAE debugger memory watchpoint on `DSKLEN` (`w 0 dff024 2 w`) to catch every disk DMA start. Trace backward from the break to find the trackloader code. Document the sync word, sector count, and MFM decode routine. See [Custom Loaders](../custom_loaders_and_drm.md).
### Q6: How do I patch a game that checksums itself?
Find the checksum routine (usually a tight loop with `ADD.L` or `EOR.L` over a memory range). NOP it out, or recalculate the checksum to match your patch. The checksum routine is often called from multiple places — patch all callers.
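
If you choose to recalculate, it helps to reproduce the routine on the host first. The sketch below computes both an additive and an XOR checksum over big-endian longwords, the two loop shapes mentioned above. Which one a given game uses, and over which range, is something you have to read from its code.

```python
import struct

def checksums(data: bytes):
    """Return (additive, xor) 32-bit checksums over big-endian longwords."""
    total = 0
    folded = 0
    for (value,) in struct.iter_unpack(">I", data[: len(data) & ~3]):
        total = (total + value) & 0xFFFFFFFF  # ADD.L wraps at 32 bits
        folded ^= value
    return total, folded
```

Run it over the unpatched range to confirm it matches the constant the game compares against, then over the patched range to get the replacement constant.
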
### Q7: Why does my patched game crash on real hardware but work in emulator?
Emulators are more forgiving of timing violations. Your patch may have altered cycle-exact code (e.g., a copper wait or blitter poll). Verify that you haven't changed instruction timing or introduced bus errors.
---
## References
- [Hand-Written Assembly RE](../static/asm68k_binaries.md) — Pure m68k binary methodology
- [Executable Unpacking](../unpacking_and_decrunching.md) — Decruncher archaeology and memory extraction
- [Custom Loaders & DRM](../custom_loaders_and_drm.md) — Trackloaders, copy protection, RawDIC
- [Ghidra Setup](../ghidra_setup.md) — Ghidra + ghidra-amiga extension configuration
- [Anti-Debugging](../anti_debugging.md) — Trace vector abuse, NMI defeat, checksum loops
- [WHDLoad Architecture](../whdload_architecture.md) — Slave authoring and snooping
- [Copper Programming](../../08_graphics/copper_programming.md) — Copper list format
- [Blitter Programming](../../08_graphics/blitter_programming.md) — Blitter register sequences
- [Paula Audio](../../01_hardware/ocs_a500/paula_audio.md) — Audio DMA registers
- *Amiga Hardware Reference Manual* — Custom chip register reference
- *M68000 Programmer's Reference Manual* — Instruction set and cycle timing
- EAB: Small IRA Tutorial — https://eab.abime.net (search "IRA tutorial")
- ghidra-amiga: https://github.com/BartmanAbyss/ghidra-amiga
- Tetracorp Amiga RE Guide — https://tetracorp.github.io/guide/reverse-engineering-amiga.html


@ -0,0 +1,222 @@
[← Home](../README.md) · [Reverse Engineering](README.md)
# WHDLoad Architecture & Reverse Engineering
If [Trackloaders](custom_loaders_and_drm.md) were the developers' way of taking complete control of the Amiga to bypass the OS, **WHDLoad** is the modern reverse engineer's way of taking that control *back*.
WHDLoad is essentially an AmigaOS-compliant "hypervisor" that wraps hardware-banging games, fools them into thinking they have absolute control of an Amiga 500, and intercepts their physical hardware requests to run them from modern hard drives.
Creating a WHDLoad port of a protected game is a multi-stage process involving disk imaging, execution profiling, reverse engineering, and assembly programming.
---
## 1. End-to-End Developer Workflow
Before diving into the low-level details, the entire lifecycle of creating a WHDLoad patch flows from the physical floppy disk through reverse engineering, ending in a deployable hard-drive package.
```mermaid
graph TD
subgraph Phase1 [Phase 1: Imaging]
A[Original Protected Floppy] -->|RawDIC + Imager Slave| B[(Disk.1 Image)]
end
subgraph Phase2 [Phase 2: Reverse Engineering]
B --> C{WHDLoad Snoop Mode}
C -->|Logs hardware traps| D[Analyze Memory & Trackloader]
D -->|Write 68k Assembly| E[Compile Game Slave]
end
subgraph Phase3 [Phase 3: Execution]
F[AmigaOS / Workbench] -->|Launch| G[WHDLoad Host]
G -->|Reads| B
G -->|Reads| E
G -->|Kills OS, Allocates Walled Garden| H[Memory]
E -->|Hooks disk access & memory| I[Game Executable]
I -.->|resload API| G
end
```
---
## 2. The Imaging Phase: From Floppy to File
Before a game can run from a hard drive, its physical floppy disks must be perfectly preserved as data files.
### 2.1 DIC vs. RawDIC
WHDLoad provides two primary tools for imaging:
* **DIC (Disk Image Creator)**: Used for standard AmigaDOS formatted disks. It rips the disk into a plain sector image with the same layout as an `.adf` file (usually named `Disk.1`, `Disk.2`).
* **RawDIC**: Used for games with custom trackloaders or physical DRM (weak bits, long tracks). RawDIC cannot read these formats on its own; it needs a per-game Imager Slave.
### 2.2 Imager Slaves & The Amiga Floppy Controller
To use RawDIC on a protected game, the developer must first reverse-engineer the game's bootblock and write an **Imager Slave**.
An Imager Slave is a **68000 Assembly language program** (never C code). Developers write the `.asm` file using the official [WHDLoad Developer Package](http://whdload.de/), which provides the necessary include files and macros (like `rawdic.i`). The source is assembled with a standard Amiga assembler (like PhxAss or vasm) into an executable binary.
When you run RawDIC from the command line, you pass the Imager Slave to it as an argument:
`RawDIC slave=MyGame.ISlave`
RawDIC loads the Imager Slave, which then dictates exactly how to decode the disk's specific MFM (Modified Frequency Modulation) encoding.
To understand how RawDIC works, you must understand the Amiga's unique floppy architecture, which is radically different from a PC (NEC 765) or ZX Spectrum (WD1793). The Amiga does not use an intelligent, high-level floppy controller. It uses two custom chips:
1. **[CIA-A / CIA-B (8520)](../01_hardware/common/cia_chips.md)**: Handles mechanical logic. The CPU writes to CIA registers to turn on the motor, select a drive, choose head direction, and send step pulses.
2. **[Paula (8364)](../01_hardware/common/floppy_hardware.md)**: Handles data transfer via **DMA (Direct Memory Access)**. Paula does not decode sectors. It simply looks for a 16-bit "Sync Word" (standard is `$4489`) in the magnetic flux. When it sees that word, Paula blindly streams the raw, still-encoded MFM bitstream directly into Chip RAM, completely bypassing the CPU.
RawDIC is a **software tool** that leverages this architecture directly. It bypasses the standard AmigaOS `trackdisk.device`. Instead, it uses the CIA to mechanically step to a track, and then programs Paula's DMA registers (`DSKLEN`, `DSKPTH`, `DSKSYNC`).
If a game uses a proprietary format (e.g., a custom sync word like `$8944` instead of `$4489`), standard AmigaDOS fails. An Imager Slave tells RawDIC exactly what custom parameters to feed Paula. RawDIC then triggers the DMA transfer, pulling the custom MFM stream into memory, where the Imager Slave executes custom 68000 routines to decode the MFM bits into a flat `Disk.1` payload file suitable for WHDLoad.
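
The decode step itself is small. In the standard AmigaDOS track layout, each data longword is stored as two MFM longwords, one carrying the odd-numbered bits and one the even-numbered bits, with clock bits in between. The sketch below shows that recombination on the host. Custom formats may order or mask the halves differently, so treat it as the baseline case only; the helper names are illustrative.

```python
MFM_DATA_MASK = 0x55555555  # data bits sit in the even bit positions of each MFM longword

def mfm_decode_long(odd: int, even: int) -> int:
    """Combine the odd-bits and even-bits MFM longwords into one data longword."""
    return (((odd & MFM_DATA_MASK) << 1) | (even & MFM_DATA_MASK)) & 0xFFFFFFFF

def mfm_split_long(value: int):
    """Inverse helper for testing: split data into (odd, even) halves without clock bits."""
    return (value >> 1) & MFM_DATA_MASK, value & MFM_DATA_MASK
```

An Imager Slave performs the same mask, shift, and OR in 68000 code for every longword of every sector.
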
> **Note: RawDIC vs. Hardware Flux Readers**
> This process is completely separate from modern PC-based hardware devices like **KryoFlux**, **SuperCard Pro**, or **GreaseWeazle**. Those are external USB devices that capture magnetic flux at the hardware level to create archival `.scp` or `.raw` images. RawDIC, conversely, is an Amiga-native software solution that relies on Paula's DMA and the developer's Imager Slave to decode the MFM stream on-the-fly.
### 2.3 Post-Processing
If the game uses a custom filesystem, the developer might not want a massive `Disk.1` image. They might write a script to extract individual files from the raw tracks so the WHDLoad patch can load them natively via the Host OS. They also manually correct any bad checksums caused by original mastering errors.
---
## 3. The Snooping Phase (Execution Profiling)
Once the data is ripped, the developer must figure out exactly what the game is doing to the hardware. WHDLoad has built-in profiling called **Snooping** (activated via `Snoop=1` or `Snoop=2` in the `.info` tooltypes).
When Snooping is enabled, WHDLoad uses the CPU's Memory Management Unit (MMU) to trap all memory accesses. It generates a massive log of:
* **Custom Register Violations**: Intercepts illegal byte-writes to 16-bit custom registers (which legally require word writes, with exceptions like `bltcon0l`). It also traps writes to read-only registers or reads from write-only registers.
* **CIA Hazards**: Detects illegal read-modify-write instructions (like `BCHG`) on CIA Time of Day registers when the Alarm bit is active.
* **Memory Bounds**: Every read/write outside the game's allocated memory.
* **Advanced DMA Validation**: WHDLoad provides granular Snoop flags (`ChkBltSize`, `ChkBltWait`, `ChkCopCon`, `ChkAudPt`) that use instruction tracing and bounds checking to catch Copper and Blitter jobs attempting to read/write outside `BaseMem`, or the Copper illegally attempting to configure the Blitter (`custom.copcon` bit 1).
> **Warning: 68040/060 Snoop Limitations**
> On 68040 and 68060 processors, `MOVEM` (Move Multiple) instructions can sometimes bypass Snoop's Access Fault handler. This occurs because the MMU only verifies the *first* address accessed during the burst transfer, potentially allowing illegal chip accesses further down the block to slip through undetected. **Because of this hardware flaw, a 68030 with a full MMU is considered the "gold standard" hardware for accurately profiling games.**
This log becomes the developer's "To-Do" list. Every illegal or hardware-banging operation in the log must be intercepted and patched.
---
## 4. Writing the Game Slave
The **Game Slave** (`game.slave`) is a small piece of 68000 assembly code written specifically for one game. It is the core reverse-engineering patch.
### 4.1 The Walled Garden & MMU Virtualization
When WHDLoad launches, it allocates a contiguous block of RAM for the game (`BaseMem` and optionally `ExpMem`). Using the MMU, WHDLoad builds a precise translation tree that explicitly marks the following physical regions as **Valid Pages**:
* `$00000000` through the end of allocated `BaseMem`/`ExpMem`.
* `$dff000 - $dff200` (Custom Hardware Registers).
* `$bfd000 - $bff000` (CIA Registers).
Every other memory page is marked as **Invalid**. Any read or write outside these explicitly defined Walled Garden boundaries immediately triggers an Access Fault Exception handled by WHDLoad. This forces the game to believe it is running at absolute address `$00000000`, with full ownership of the 512KB Chip RAM, while absolutely protecting the host OS.
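
The resulting access policy can be modelled in a few lines. This sketch only restates the three valid regions listed above for a game that uses `BaseMem` alone; it ignores `ExpMem` and the page granularity of a real MMU, and the function name is hypothetical.

```python
def is_valid_access(address: int, base_mem_size: int) -> bool:
    """Model the page check described above; ExpMem is left out of this sketch."""
    if 0 <= address < base_mem_size:
        return True  # inside BaseMem
    if 0xDFF000 <= address < 0xDFF200:
        return True  # custom chip registers
    if 0xBFD000 <= address < 0xBFF000:
        return True  # CIA registers
    return False
```

Running a list of addresses from a disassembly through such a check is a cheap way to predict which accesses will fault under WHDLoad.
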
### 4.2 KickEmu (OS Emulation)
Some games bypass the OS for disk access but still rely on `exec.library` or `graphics.library` for initialization. WHDLoad provides **KickEmu**, a set of pre-built modules that load actual Kickstart ROM images (1.3 or 3.1) into Fast RAM and emulate a pristine boot environment just for the game.
### 4.3 Patching the Footprint
The Slave searches the game's loaded memory for the exact byte signature of its custom trackloader. It then overwrites the original `JSR` (Jump to Subroutine) entry points with jumps to the Slave's own code.
```mermaid
graph TD
A[AmigaOS] -->|Runs| B[WHDLoad Host]
B -->|Allocates RAM| C[(Memory Walled Garden)]
B -->|Loads| D[Game.Slave]
B -->|Kills OS & Interrupts| D
D -->|Searches & Patches| E[Game Executable]
D -->|Executes| E
E -.->|Intercepted Hardware Access| D
D -.->|Proxies via Resload API| B
```
### 4.4 Redirection to `resload_DiskLoad`
Inside the Slave's hook, the game's request (e.g., "Cylinder 5, Head 0") is translated into a byte offset within the `Disk.1` image file.
The Slave then calls `resload_DiskLoad`, a callback function provided by the WHDLoad Host.
```mermaid
sequenceDiagram
participant Game
participant Slave
participant Host
participant HardDrive
Game->>Slave: Read Track 5, Head 0 (Patched JSR)
Slave->>Slave: Translate physical Track to file offset
Slave->>Host: resload_DiskLoad(offset, length, dest_ram)
Host->>HardDrive: Read bytes from disk.1
HardDrive-->>Host: Data
Host-->>Slave: Write to dest_ram
Slave-->>Game: RTS (Data loaded!)
```
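
The translation step in the Slave's hook is simple arithmetic when the image uses the standard layout. The sketch below assumes 11 sectors of 512 bytes per track and two heads, which holds for a plain AmigaDOS-style `Disk.1`; images produced by a custom Imager Slave may use a different track size, so the constants are assumptions to adjust.

```python
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 11
HEADS = 2

def track_to_offset(cylinder: int, head: int, sector: int = 0) -> int:
    """Translate a physical disk position into a byte offset within the image."""
    track = cylinder * HEADS + head
    return (track * SECTORS_PER_TRACK + sector) * SECTOR_BYTES
```

For the request in the diagram, `track_to_offset(5, 0)` gives the offset that the Slave would pass on to `resload_DiskLoad`.
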
### 4.5 The Soft Reset (`QuitKey`)
WHDLoad requires that every game can be exited gracefully, returning the user to the AmigaOS Workbench without rebooting.
The Slave implements this by intercepting the keyboard hardware interrupt (Level 2) or the Action Replay NMI (Level 7). When the user presses the designated `QuitKey` (often F10 or PrtScn), the Slave intercepts it and triggers a `resload_Abort` call. WHDLoad then restores the OS interrupt vectors, flushes the caches, and turns multitasking back on.
---
## 5. Hardware Virtualization & Fixes
Beyond disk access, Slaves must fix hardware incompatibilities so an A500 game runs on an A1200 or 68060 accelerator:
1. **MMU Virtualization & Traps**: WHDLoad takes over the MMU to proxy OS functions and validate memory bounds. Because of this, running debugging tools like **Enforcer** or **CyberGuard** simultaneously with WHDLoad causes machine lockups, as WHDLoad intentionally generates hundreds of MMU hits during normal operation.
2. **SMC Defeat & The 68060 Paradox**: The 68000 had no instruction cache; the 68020+ does. Games using Self-Modifying Code (SMC) crash on newer processors because the CPU executes stale instructions. Slaves patch these areas with calls to `resload_FlushCache`.
   * *The 68060 Paradox*: The Motorola 68060 introduced a "Branch Cache" that **completely ignores the MMU setup**. Even if WHDLoad marks a memory page as Non-Cacheable to protect SMC, the 68060 will *still* cache branch instructions. Game Slaves must explicitly use `resload_SetCPU` (or the historical `resload_SetCACR`) to disable or flush this cache, or the game will crash on an 060.
3. **Access Faults**: Games might try to read memory outside their allocated Chip RAM. Slaves `NOP` out these accesses.
4. **CPU-Speed Timing Loops**: If a game relies on the specific speed of a 68000 executing a loop for timing, it runs roughly 10x too fast on a 68030. Slaves replace these loops with `resload_Delay` calls to normalize the speed.
---
## 6. Advanced Debugging & Profiling
### 6.1 Advanced Snooping & Memory Dumps
While basic `Snoop` logs hardware violations, developers can combine `Snoop=1`, `Expert=1`, and `DebugKey` to trigger a total system dump. Pressing the configured `DebugKey` forces WHDLoad to write the entire Walled Garden memory state, CPU registers, and custom chip states to disk.
Developers use the included **SP (Save Picture)** tool to extract raw framebuffer images directly from these dump files by parsing the captured copperlists, which is invaluable for identifying exactly when and where a game crashes or hangs during display routines.
### 6.2 System Monitors & Freezer Integration
WHDLoad directly supports specific software freezers (like HRTmon and ThrillKill). When WHDLoad detects a supported freezer in memory during startup, it modifies its MMU setup to declare the monitor's memory as valid and WriteThrough cacheable.
It forwards all NMI (Non-Maskable Interrupts) to the monitor's vector table. If the VBR (Vector Base Register) is moved, WHDLoad compares the `FreezeKey` at each Level 7 interrupt, transforming the stackframe into an NMI stackframe to safely drop the user into the debugger without disrupting the host OS.
### 6.3 Memory Protection API & Checksum Defeat
Some games implement anti-tampering checksums that scan memory for modifications to the trackloader. Reverse engineers can easily defeat these checks using WHDLoad's `resload_ProtectRead` and `resload_ProtectWrite` APIs.
By declaring the 4KB memory page containing the modified code as protected, WHDLoad modifies the page descriptors in the MMU translation tree. Any subsequent access to that page by the game's protection routine will instantly trigger an Access Fault exception. WHDLoad's exception handler evaluates the access; if it matches the specific patched bytes, it halts execution and drops the developer exactly at the checksum routine's Program Counter (PC), completely exposing the DRM mechanism.
---
## 7. Publishing the Container
Once the Imager Slave and Game Slave are complete, the developer packages the release. A standard WHDLoad container looks like this:
* `Game.slave`: The compiled Game Slave binary.
* `Disk.1`, `Disk.2`: The disk images ripped by RawDIC.
* `Game.info`: The AmigaOS icon file containing WHDLoad Tooltypes (e.g., `Preload=1`, `QuitKey=$59`).
* `ReadMe`: Documentation detailing what protections were removed, what hardware is required, and who wrote the Slave.
**Integrity Checking:** The Slave contains a hardcoded CRC16 or MD5 hash of the original unmodified disk images. When WHDLoad launches, it hashes the `Disk.1` file. If the user tries to use a corrupted dump or an improperly cracked ADF file downloaded from the internet, WHDLoad will throw an integrity error, ensuring that the Slave is only patching the exact bytes it was programmed for.
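The check can be pictured with a short Python sketch. It assumes the CRC16 in question is the common reflected CRC-16/ARC variant (polynomial `0xA001`, initial value 0); that choice, and the helper names, are assumptions for illustration and should be verified against the WHDLoad Autodocs before relying on them.

```python
def crc16_arc(data: bytes) -> int:
    """CRC-16/ARC: reflected polynomial 0xA001, initial value 0, no final XOR."""
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


def image_matches(data: bytes, expected_crc: int) -> bool:
    """Return True when the image hashes to the value hardcoded in the Slave."""
    return crc16_arc(data) == expected_crc


# Standard check string for CRC-16/ARC
print(hex(crc16_arc(b"123456789")))
```

In a real Slave the expected value would sit next to the patch list, so a mismatch stops the install before any offsets are patched.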
---
## 8. Development Resources & SDK
If you want to create your own WHDLoad installs, the official tools are freely available:
### 8.1 Acquiring the DevKit
You must download the **DEV Package** (not the USR package) from the official [WHDLoad Homepage](http://whdload.de/). The USR package only contains the runtime tools for end-users.
* Look for `WHDLoad_dev.lha` or the versioned archive (e.g., `WHDLoad_20.0_dev.lzx`).
### 8.2 What's in the SDK?
The DEV package is the definitive toolkit for reverse engineers:
* **`Include/`**: Contains the critical assembly macros (`rawdic.i` for Imager Slaves, `resload.i` for the host API, and `kickemu.i` for OS faking).
* **`Src/`**: Dozens of open-source Game Slaves and Imager Slaves that act as reference examples.
* **`Autodoc/`**: The detailed API reference for every `resload_` function (e.g., `resload_DiskLoad`, `resload_FlushCache`).
* **CLI Utilities**: Additional command-line tools for low-level patching and analysis:
* **Patcher**: A generic binary patcher to apply standard crack patches.
* **Reloc**: A tool to handle and relocate standard AmigaDOS executables within the Walled Garden.
* **Fa / Ibb / Itd**: Utilities for file analysis and track-disk debugging.
### 8.3 Official Documentation
The complete HTML documentation for WHDLoad development is available online at [http://whdload.de/docs/WHDLoad.html](http://whdload.de/docs/WHDLoad.html). It includes:
* Memory map specifications for the Walled Garden.
* The exact calling conventions for the `resload` API.
* Detailed guides on configuring `Snoop` mode and using KickEmu.
### 8.4 How to Use It
1. Extract the `WHDLoad_dev.lha` archive to your Amiga hard drive (or cross-compilation environment).
2. Copy the `Include/` files to your assembler's standard include directory.
3. Use a 68k macro assembler (like **VASM**, **PhxAss**, or **Barfly**) to assemble your `.asm` source.
4. To build an Imager Slave, include `rawdic.i` and assemble. To build a Game Slave, include `resload.i` and assemble.

View file

@ -72,6 +72,25 @@ For dynamic debugging, the workflow is identical to IDA:
---
## Step 6: GCC Binary Specific Workflows
When dealing with GCC-compiled Amiga binaries (especially those with debug info), there are a few Ghidra-specific workflows to note:
**1. Install `ghidra-gcc2-stabs`** (`RidgeX/ghidra-gcc2-stabs`) if the binary has debug info. After loading:
- Run the script: `Analysis → Run Script → ImportGCC2Stabs.java`
- The script reads `HUNK_DEBUG`, extracts `N_FUN`/`N_SLINE`/`N_LSYM` stabs, and creates function labels, source line annotations, and local variable names automatically.
- Even partial stabs (e.g., `N_SO` + `N_FUN` only) restore function boundaries and names.
**2. PC-relative string handling.** Ghidra's m68k analyzer natively handles `LEA xxx(PC), An` correctly and creates data cross-references. Check the `References` view for `LEA` targets — strings listed there can be viewed and renamed.
**3. Function boundary heuristic.** Ghidra's default analysis finds GCC functions reasonably well. For missed functions:
- Use `Search → For Instruction Patterns` with `MOVEM.L *, -(SP)` (opcode `48E7`) to find all prologues.
- Right-click → `Create Function` at each found address.
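Outside Ghidra, the same heuristic can be tried on a raw code hunk with a few lines of Python. This is a minimal sketch, assuming the hunk has already been extracted to a `bytes` object; the function name is hypothetical. It reports every word-aligned `48E7` opcode, so data that happens to contain that word will show up as a false positive.

```python
import struct

MOVEM_L_PUSH = 0x48E7  # MOVEM.L <register list>,-(SP)


def find_movem_prologues(code: bytes, base: int = 0) -> list[int]:
    """Return addresses of word-aligned MOVEM.L ...,-(SP) opcodes."""
    hits = []
    # The opcode word is followed by a 16-bit register mask, so 4 bytes are needed.
    for off in range(0, len(code) - 3, 2):
        (word,) = struct.unpack_from(">H", code, off)
        if word == MOVEM_L_PUSH:
            hits.append(base + off)
    return hits


# RTS, then MOVEM.L D2-D3/A2-A4,-(SP), then RTS
sample = bytes.fromhex("4e7548e7303a4e75")
print([hex(a) for a in find_movem_prologues(sample, base=0x1000)])
```

Each hit is only a candidate; checking that the preceding instruction is an `RTS` or an unconditional branch filters out most accidental matches.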
**4. Recognizing tail calls.** Ghidra may misidentify `BRA _otherFunc` as a local branch. If Ghidra marks code after a `BRA` as unreachable or creates a new function at the `BRA` target, verify manually: if the `BRA` target is a named function elsewhere in `.text`, it's a tail call — the `BRA` terminates the current function and the target function returns directly to the original caller.
---
## References
- [ghidra-amiga by BartmanAbyss](https://github.com/BartmanAbyss/ghidra-amiga) — The definitive Amiga loader and extension suite for Ghidra.

View file

@ -217,6 +217,28 @@ If C pseudocode generation is a strict requirement for your workflow, you must u
---
## Step 12: GCC Binary Specific Workflows
When analyzing a binary compiled with GCC (often identified by a `.text` hunk or a `LINK A6` in the first function), the standard analysis workflow changes slightly:
**1. Handle `.text` as mixed code+data.** GCC embeds strings and jump tables directly in the code hunk. After auto-analysis:
- Search for `LEA xxx(PC), An` instructions (`Search → Text…` on the mnemonic, or IDAPython)
- For each, check if the target address contains ASCII bytes — if yes, press `A` to define as string
- Mark the string as `DATA` type so IDA doesn't try to disassemble it as code
**2. Function boundary detection without LINK.** IDA's auto-analysis finds most functions via call-graph tracing from the entry point. For stragglers:
- Every `BSR addr` / `JSR addr` target is a function entry — use `Create function` (P key) at those addresses
- Look for `MOVEM.L Dn/An, -(SP)` at addresses following a `RTS` — strong function-start indicator
- Use IDAPython to scan: iterate `idautils.Heads()` and test `idc.print_insn_mnem(ea)` for a `MOVEM` mnemonic, comparing case-insensitively because the exact spelling depends on the processor module
**3. Identify `main()` in stripped builds.** The libnix startup sequence is fixed:
```
Entry → MOVEA.L 4.W, A6 → JSR __startup_SysBase → (open dos.library) → JSR _main
```
The `JSR` immediately after the `dos.library` open is `_main`. Mark it as a function and rename.
---
## References
- IDA Pro 7.x documentation — processor modules, FLIRT

View file

@ -0,0 +1,22 @@
[← Home](../../README.md) · [Reverse Engineering](../README.md)
# Static Analysis & Binary Archaeology
## Contents
### Fundamentals
- [hunk_reconstruction.md](hunk_reconstruction.md) — Understanding the Amiga HUNK structure in disassemblers
- [code_vs_data_disambiguation.md](code_vs_data_disambiguation.md) — Distinguishing instructions from data blocks
- [m68k_codegen_patterns.md](m68k_codegen_patterns.md) — Common 68k assembly sequences and optimizations
### API & Data Analysis
- [library_jmp_table.md](library_jmp_table.md) — Reconstructing library jump tables and LVOs
- [api_call_identification.md](api_call_identification.md) — Identifying system calls in naked disassembly
- [string_xref_analysis.md](string_xref_analysis.md) — Using strings to anchor functional analysis
- [struct_recovery.md](struct_recovery.md) — Identifying AmigaOS structures in memory
### Language-Specific RE
- [asm68k_binaries.md](asm68k_binaries.md) — Hand-written 68k assembly (Demos, Bootblocks)
- [ansi_c_reversing.md](ansi_c_reversing.md) — Recovering C code logic
- [cpp_vtables_reversing.md](cpp_vtables_reversing.md) — C++ Objects and VTables
- [other_languages.md](other_languages.md) — AMOS, Blitz Basic, Amiga E, and more

View file

@ -116,6 +116,61 @@ _large_func:
---
## Debug Information — HUNK_DEBUG and Stabs
GCC 2.95.x for AmigaOS embeds debug info when compiled with `-g`. The format is **stabs** (BSD DBX format) — not DWARF2, which is disabled on this target. Debug data lives in a `HUNK_DEBUG` block (hunk type `0x3F1`) separate from `HUNK_SYMBOL`.
### Hunk Types for Symbols
| Hunk type | Hex | Contents |
|---|---|---|
| `HUNK_SYMBOL` | `0x3F0` | Linker-visible public symbol names + offsets. Present in non-stripped builds. |
| `HUNK_DEBUG` | `0x3F1` | Stabs debug info: source file names, function names, line numbers, type info. Only with `-g`. |
`HUNK_DEBUG` structure:
```
ULONG magic; // BSD a.out magic (checked against ZMAGIC)
ULONG symsz; // size of symbol table (N × sizeof(struct nlist))
ULONG strsz; // size of string table
struct nlist syms[N]; // stabs entries (12 bytes each)
char strings[]; // null-terminated string pool
```
### Key Stabs Entry Types
| Stab type | Decimal | Meaning |
|---|---|---|
| `N_OPT` | 60 | Compiler option — value `"gcc2_compiled."` marks GCC 2.x output |
| `N_SO` | 100 | Source file. Two consecutive `N_SO` entries = directory + filename |
| `N_SOL` | 132 | Included sub-source file (`#include`) |
| `N_FUN` | 36 | Function entry: `"name:Fdesc"` (global) or `"name:fdesc"` (static); `n_value` = start address. Empty name `""` marks function *end*. |
| `N_SLINE` | 68 | Source line: `n_desc` = line number, `n_value` = code offset from function start |
| `N_LSYM` | 128 | Local variable (stack): `"name:type"`, `n_value` = frame offset |
| `N_GSYM` | 32 | Global variable |
| `N_LBRAC` / `N_RBRAC` | 192 / 224 | Open/close scope block |
### Finding Function Names in a Debug Build
1. Locate `HUNK_DEBUG` (type `0x3F1`) in the binary
2. Read the BSD header; verify magic
3. Iterate `struct nlist` entries, looking for `n_type == N_FUN`
4. The string before the `:` in the stabs string is the function name
5. `n_value` is the function's start offset within the hunk
6. The next `N_FUN` with empty name marks the function's end
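The steps above can be sketched in Python. This is a minimal illustration, assuming the `HUNK_DEBUG` payload has already been cut out of the file, that it follows the layout shown earlier (magic, `symsz`, `strsz`, `nlist` entries, string pool), and that `n_strx` is an offset from the start of the string pool. The function name is hypothetical; `GccFindHit` is the tool to trust for real binaries.

```python
import struct

ZMAGIC = 0x010B                   # BSD a.out magic (octal 0413)
N_FUN = 0x24                      # 36: function entry
NLIST = struct.Struct(">IBBHI")   # n_strx, n_type, n_other, n_desc, n_value


def function_stabs(payload: bytes):
    """Return [(name, start_offset)] for every named N_FUN stab."""
    magic, symsz, strsz = struct.unpack_from(">III", payload, 0)
    if magic != ZMAGIC:
        raise ValueError("not a GCC stabs HUNK_DEBUG payload")
    sym_off = 12
    str_off = sym_off + symsz
    strings = payload[str_off:str_off + strsz]
    funcs = []
    for off in range(sym_off, str_off, NLIST.size):
        n_strx, n_type, _other, _desc, n_value = NLIST.unpack_from(payload, off)
        if n_type != N_FUN:
            continue
        end = strings.find(b"\x00", n_strx)
        if end < 0:
            end = len(strings)
        text = strings[n_strx:end].decode("latin-1")
        name = text.split(":", 1)[0]   # text before ':' is the function name
        if name:                       # an empty name marks a function end
            funcs.append((name, n_value))
    return funcs
```

Feeding it a synthetic payload with one `main:F1` stab and one empty `N_FUN` returns a single `("main", offset)` pair, which is a quick way to sanity-check the layout assumptions before pointing it at a real hunk.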
**Tooling:**
- `GccFindHit` (from `cnvogelg/m68k-amigaos-toolchain`) — reads HUNK_DEBUG and maps a crash address to source file + line + function
- `ghidra-gcc2-stabs` (GitHub: `RidgeX/ghidra-gcc2-stabs`) — Ghidra plugin that imports stabs from `HUNK_DEBUG` and creates function labels, line numbers, and local variable annotations automatically
### `gcc2_compiled.` Marker
An `N_OPT` stab with string `"gcc2_compiled."` appears before the first `N_SO` entry in every GCC 2.x debug build. It:
- Confirms GCC 2.x lineage (not SAS/C, VBCC, or GCC 3+)
- Is only present when `-g` was passed — absent in release/stripped builds
- In stripped binaries, use hunk naming (`.text`) and code patterns instead
---
## Calling Conventions
GCC uses a simpler calling convention model than SAS/C — one primary convention with variations controlled by function attributes. However, what GCC lacks in convention count it makes up for in **register allocation flexibility**: every function gets a customized stack frame and register save set based on exactly which variables the compiler decides to keep in registers.
@ -347,6 +402,31 @@ GCC's call-site code reveals whether the caller passes parameters in registers o
> [!NOTE]
> **Varargs functions** (like `Printf`, `sprintf`, custom `Format()`) force ALL arguments onto the stack in GCC 2.95.x — even the first two. This is a reliable disambiguator: if you see a call with 3+ stack pushes and NO register args, the target is likely a varargs function.
#### Varargs Callee — `va_arg()` Expansion
Inside a varargs function, `va_list` is a plain `char *` pointer into the stack frame. `va_start` initializes it; `va_arg(ap, T)` reads the next argument and advances by `sizeof(T)` rounded to 4 bytes.
```asm
; va_arg() for a 32-bit value — canonical pattern:
MOVEA.L -$04(A6), A0 ; load va_list (ap) from stack slot
MOVE.L (A0)+, D0 ; read next arg, advance ap by 4
MOVE.L A0, -$04(A6) ; write back updated ap
; va_arg() inside a loop (ap kept in address register):
.va_loop:
MOVE.L (A2)+, D0 ; A2 = ap; read 32-bit arg, A2 += 4
TST.L D0
BEQ.S .va_done
; process D0 ...
BRA.S .va_loop
```
**Key recognition patterns:**
- `(An)+` post-increment reads in a loop — the defining mark of `va_arg` iteration
- `ap` is either kept in an address register across the loop or reloaded/stored each iteration
- 16-bit types (`short`) are promoted to 32 bits on the stack — `va_arg` still advances by 4, not 2
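- Format strings for `printf`-style calls typically appear as `PEA .LCx(PC)`, or as `LEA .LCx(PC), An` followed by `MOVE.L An, -(SP)` (`LEA` can only target an address register). Because arguments are pushed right to left, the format string is the last push before the call and the first argument the callee sees, at `4(SP)` on entry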
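The 4-byte stepping described above is easy to model. The following Python sketch is only an illustration of the rounding rule; the function names are hypothetical, and sizes are the `sizeof` of the type passed to `va_arg` after default promotion.

```python
def va_arg_step(size: int) -> int:
    """Bytes that ap advances for an argument of the given size."""
    return (size + 3) & ~3   # round up to a multiple of 4


def walk_va_list(ap: int, sizes: list[int]) -> list[int]:
    """Return the address each successive va_arg() reads from."""
    addrs = []
    for size in sizes:
        addrs.append(ap)
        ap += va_arg_step(size)
    return addrs


# A short, a long and an 8-byte value starting at stack address 0x100
print([hex(a) for a in walk_va_list(0x100, [2, 4, 8])])
```

When a loop in the disassembly steps an address register by something other than a multiple of 4, it is probably walking an ordinary array rather than a `va_list`.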
### `__attribute__((interrupt))` — Interrupt Handler
```asm
@ -369,6 +449,76 @@ _exit_func:
; May be followed by ILLEGAL or DC.B 0 padding
```
### AmigaOS-Specific GCC Attributes
GCC 2.95.x for AmigaOS defines attribute macros that map to Amiga calling conventions. These produce fundamentally different code and must be recognized to correctly reconstruct function prototypes.
#### `__regargs` — Register-Based Argument Passing
`__attribute__((regparm(N)))` (available as the `__regargs` macro) passes the first N arguments in registers using a different layout than standard cdecl:
| Arg # | Type | Standard cdecl | `__regargs` |
|---|---|---|---|
| **arg1** | integer | D0 | D0 |
| **arg1** | pointer | D0 | **A0** |
| **arg2** | integer | D1 | D1 |
| **arg2** | pointer | D1 | **A1** |
| **arg3** | any | stack | D2 or A2 |
| **remaining** | any | stack | stack |
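The critical difference: **pointer arguments arrive in address registers (A0, A1), not D0/D1**. If you assume cdecl and a function's first parameter is a pointer, you will look for it in `D0`, but with `__regargs` it arrives in `A0`. In the `WriteEx` example below, registers are taken per class in argument order: `fh` is the first integer (D0), `buf` is the first pointer (A0), and `len` is the second integer (D1).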
```asm
; Standard cdecl: Write(fh, buf, len) — fh/buf/len in D0/D1/stack
MOVE.L #1024, -(SP) ; len on stack
MOVE.L buffer, D1 ; buf in D1 (integer-sized)
MOVE.L fh, D0 ; fh in D0
BSR _Write
; __regargs: WriteEx(fh, buf, len) — fh(int) in D0, buf(ptr) in A0, len in D1
MOVE.L #1024, D1 ; len in D1
MOVEA.L buffer, A0 ; buf (pointer) → A0, not D1!
MOVE.L fh, D0 ; fh in D0
BSR _WriteEx
```
Callee with `__regargs` — how the prologue differs:
```asm
_WriteEx: ; __regargs: D0=fh, A0=buf(ptr), D1=len
MOVEM.L D2-D3/A2, -(SP)
MOVE.L D0, D2 ; save fh (from D0 — same as cdecl)
MOVEA.L A0, A2 ; save buf from A0 — NOT from D1!
MOVE.L D1, D3 ; save len from D1
```
**RE trap**: If you assume cdecl and see `MOVEA.L A0, A2` early in the prologue with no preceding `MOVEA.L D0, A2`, the function is `__regargs` and the first pointer arg arrived directly in A0.
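The register assignment seen in the `WriteEx` example can be modeled with a small Python helper, which is handy when guessing a prototype from a call site. This is a sketch under assumptions: it hands out data and address registers per class in argument order, and the limit of two registers per class is a guess inferred from that example rather than a documented rule, so it is exposed as a parameter. The function name is hypothetical.

```python
def assign_regargs(arg_kinds, regs_per_class=2):
    """Map a list of 'int' / 'ptr' argument kinds to D/A registers or 'stack'."""
    data = [f"D{i}" for i in range(regs_per_class)]
    addr = [f"A{i}" for i in range(regs_per_class)]
    locations = []
    for kind in arg_kinds:
        pool = addr if kind == "ptr" else data
        # Take the next free register of the matching class, else spill.
        locations.append(pool.pop(0) if pool else "stack")
    return locations


# WriteEx(fh, buf, len): int, pointer, int
print(assign_regargs(["int", "ptr", "int"]))
```

If the registers observed at a call site do not match any plausible mapping, the function is more likely plain cdecl or a hand-written assembly routine.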
#### `__saveds` — Small-Data Register Reload
`__attribute__((saveds))` forces the function to reload the small-data base register (A4) at entry from `__DATA_BAS`. Used for library functions callable from a different task context. Recognizable by `LEA __DATA_BAS(PC), A4` as the very first instruction before any other work:
```asm
_saveds_func:
LEA __DATA_BAS(PC), A4 ; reload small-data base — __saveds signature
MOVEM.L D2/A2, -(SP)
; ... normal function body follows ...
```
On most AmigaOS GCC builds without `-msep-data`, `__saveds` is a no-op in the generated code — present in source for SAS/C compatibility, invisible in the binary.
#### `__chip` — Chip RAM Variable Placement
Variables declared `__attribute__((chip))` land in a `.datachip` section. The linker emits this as a separate `HUNK_DATA` block with chip-RAM flag bits set in the hunk size longword (bits 30 and 31 carry the memory flags; bit 30 requests `MEMF_CHIP`):
```
HUNK_DATA 0x3EA size=0x40000040 ; bit 30 set = MEMF_CHIP requested
; ...chip-RAM variable data...
HUNK_END
```
Accesses to chip-RAM variables in disassembly look identical to normal `.data` accesses — the `MEMF_CHIP` flag is only visible in the hunk header, not in the instructions.
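Decoding the flag bits by hand is error-prone, so a small Python helper can do it. This sketch assumes the usual hunk convention that bit 30 requests chip RAM, bit 31 requests fast RAM, both set means an extra longword of memory attributes follows, and the low 30 bits give the size in longwords. The function name and the `"any"` label are hypothetical.

```python
HUNKF_CHIP = 1 << 30
HUNKF_FAST = 1 << 31
SIZE_MASK = 0x3FFFFFFF


def decode_hunk_size(longword: int):
    """Return (size_in_bytes, memory_request) for a hunk size longword."""
    flags = longword & (HUNKF_CHIP | HUNKF_FAST)
    size_bytes = (longword & SIZE_MASK) * 4   # sizes are stored in longwords
    if flags == HUNKF_CHIP:
        mem = "MEMF_CHIP"
    elif flags == HUNKF_FAST:
        mem = "MEMF_FAST"
    elif flags == 0:
        mem = "any"
    else:
        mem = "extended"   # both bits set: attributes follow in the next longword
    return size_bytes, mem


print(decode_hunk_size(0x40000040))
```

Running this over every entry in the hunk size table quickly shows which data hunks hold graphics or audio buffers that must live in chip RAM.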
---
## Library Call Patterns
@ -409,6 +559,109 @@ When `-fPIC` is enabled, globals are accessed through a GOT (Global Offset Table
---
## BOOPSI / MUI Dispatcher Pattern
BOOPSI and MUI custom class dispatchers compiled with GCC require special recognition because the OS invokes them with a **non-GCC calling convention**: arguments arrive in specific m68k registers hardcoded by the OS ABI, not in D0/D1/stack.
### OS Entry Convention vs GCC Normal
| Register | OS-mandated meaning | GCC normal cdecl meaning |
|---|---|---|
| **A0** | `IClass *cl` — the class pointer | arg1 (if pointer, with `__regargs`) |
| **A1** | `Msg msg` — message (first field = MethodID) | arg2 (if pointer, with `__regargs`) |
| **A2** | `Object *obj` — the object being operated on | callee-saved (must be preserved) |
GCC-compiled dispatchers always begin by saving A2 and remapping all three inputs to callee-saved registers before any dispatch logic:
```asm
_MyClass_Dispatcher:
; Entered with: A0=class, A1=msg, A2=obj (OS convention)
MOVEM.L D2-D3/A2-A4, -(SP) ; save callee-saved regs (A2 saved here!)
MOVEA.L A0, A3 ; A3 = cl (callee-saved)
MOVEA.L A2, A4 ; A4 = obj (A2 clobbered next, save first)
MOVEA.L A1, A2 ; A2 = msg (now A2 holds msg, not obj)
MOVE.L (A2), D2 ; D2 = msg->MethodID (first field)
```
### MethodID Dispatch — CMP Chain vs Jump Table
For fewer than ~8 methods, GCC emits a linear comparison chain:
```asm
CMPI.L #$0101, D2 ; OM_NEW?
BEQ .om_new
CMPI.L #$0102, D2 ; OM_DISPOSE?
BEQ .om_dispose
CMPI.L #$0103, D2 ; OM_SET?
BEQ .om_set
CMPI.L #$0104, D2 ; OM_GET?
BEQ.S .om_get
; ... more methods ...
; Default: forward to superclass
MOVEA.L A3, A0 ; restore cl
MOVEA.L A2, A1 ; A1 = msg (copy back from A2 before A2 is restored)
MOVEM.L (SP)+, D2-D3/A2-A4 ; pop saved registers; A2 = obj again
JMP (_IDoSuperMethodA).L ; tail-call superclass dispatcher
```
For dense MethodID ranges with 8+ methods, GCC may emit a jump table. The MethodID base is subtracted, range-checked, then used as a scaled index:
```asm
MOVE.L D2, D0
SUB.L #$0101, D0 ; normalize MethodID to 0-based index
CMPI.L #<max_method_idx>, D0
BHI.S .default_handler ; out of range → superclass
ADD.L D0, D0 ; scale by 2 (word offsets)
MOVE.W .method_table(PC,D0.L), D1
JMP .method_table(PC,D1.W) ; indirect branch through table
.method_table:
DC.W .om_new-.method_table
DC.W .om_dispose-.method_table
; ...
```
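The index arithmetic in the jump-table sequence can be checked with a few lines of Python. This is a sketch of the `SUB.L` / `CMPI.L` / `BHI` / `ADD.L` steps only, with a hypothetical function name; it returns the byte offset into `.method_table`, or `None` when the dispatcher would fall through to the default handler.

```python
def method_table_offset(method_id: int, base: int, max_index: int):
    """Model the normalize, range-check and scale steps of the dispatch."""
    index = (method_id - base) & 0xFFFFFFFF   # SUB.L wraps to unsigned 32 bits
    if index > max_index:                      # BHI is an unsigned comparison
        return None                            # default handler / superclass
    return index * 2                           # ADD.L D0,D0: word-sized entries


print(method_table_offset(0x0102, base=0x0101, max_index=3))
```

Because the comparison is unsigned, MethodIDs below the base wrap to very large values and are rejected by the same single `BHI`, which is why no separate lower-bound test appears in the disassembly.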
### Common MUI Method IDs
These appear in `CMPI.L #$XXXXXXXX, D2` comparisons in MUI class dispatchers:
| MethodID | BOOPSI/MUI Method | Typical handler action |
|---|---|---|
| `0x0101` | `OM_NEW` | Allocate instance data, call superclass OM_NEW |
| `0x0102` | `OM_DISPOSE` | Free resources, call superclass OM_DISPOSE |
| `0x0103` | `OM_SET` | Apply attribute list from msg |
| `0x0104` | `OM_GET` | Return attribute value |
| `0x80426F3F` | `MUIM_Draw` | Render the gadget |
| `0x8042D985` | `MUIM_Cleanup` | Release render resources |
| `0x80428354` | `MUIM_Setup` | Prepare for rendering |
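MUI's own method and attribute IDs live in the `0x8042xxxx` range. If you see a large hex constant starting with `0x8042` as a `CMPI` operand, the dispatcher is handling a built-in MUI method; methods private to a custom class normally use a different `TAG_USER`-based range chosen by the class author.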
### `MakeClass()` / `MUI_CreateCustomClass()` Call Pattern
Class initialization code (often in a global constructor or `LibInit`):
```asm
; MUI_CreateCustomClass(NULL, superclass_name, NULL, inst_size, dispatcher):
PEA _MyClass_Dispatcher ; dispatcher function pointer
MOVE.L #<instance_data_size>, -(SP)
MOVE.L #0, -(SP) ; supermcc (NULL)
PEA .superclass_str(PC) ; "Group.mui" etc.
MOVE.L #0, -(SP) ; base (NULL for public classes)
JSR _MUI_CreateCustomClass
LEA $14(SP), SP ; clean 5 args × 4 bytes
MOVE.L D0, (_MyClass).L ; store struct MUI_CustomClass * globally
```
**RE checklist for dispatcher identification:**
1. Function entered with no MOVEM of D0/D1 first — instead A0, A1, A2 are immediately remapped
2. First read is `MOVE.L (A2), Dn` or `MOVE.L (A1), Dn` — loading MethodID
3. A chain of `CMPI.L #$0101` through `#$0104`, or larger hex values
4. At least one path ends with `JMP (_IDoSuperMethodA).L` or `BSR _DoSuperMethod`
5. Nearby global holds the result of `MUI_CreateCustomClass`/`MakeClass`
---
## C++ Support — What It Means for RE
### Global Constructors and Destructors
@ -459,6 +712,59 @@ See [cpp_vtables_reversing.md](../cpp_vtables_reversing.md) for the complete GCC
- `offset_to_top` at `vtable[-2]`
- C++ name mangling follows GCC 2.95 conventions (different from StormC++)
### C++ Exception Handling — SJLJ Mechanism
GCC 2.95.x on AmigaOS uses **SJLJ (setjmp/longjmp) exception handling**. Zero-cost DWARF2 unwinding is explicitly disabled (`DWARF2_UNWIND_INFO 0` in `amigaos.h`) because AmigaOS has no OS-level stack unwinder.
Every function containing a `try` block gets an exception frame registered on a thread-local EH stack at entry and deregistered at exit:
```asm
; try-block function prologue — SJLJ EH:
_func_with_try:
LINK A6, #-<eh_frame_size> ; allocate ExceptionFrame on stack
; ExceptionFrame layout: {jmp_buf[6], *prev_frame, *exception_type, *handler}
MOVEM.L D2/A2, -(SP)
LEA -<eh_frame_size>(A6), A0 ; A0 = &ExceptionFrame
JSR ___sjljeh_init_handler ; push frame onto __sjlj_eh_stack
JSR _setjmp ; setjmp into frame's jmp_buf
TST.L D0
BEQ.S .normal_path ; D0=0: initial entry, execute try body
; D0≠0: returning from longjmp — exception in flight
BRA .catch_handler
.normal_path:
; ... try block body ...
.function_exit:
JSR ___sjljeh_remove_handler ; pop frame from __sjlj_eh_stack
MOVEM.L (SP)+, D2/A2
UNLK A6
RTS
.catch_handler:
; ... catch block body ...
BRA.S .function_exit
```
**RE identification of try/catch blocks:**
- `JSR ___sjljeh_init_handler` — always marks the start of a try region
- `JSR _setjmp` followed immediately by `TST.L D0` / `BEQ` — the try/catch branch
- `JSR ___sjljeh_remove_handler` — always paired with init, marks the end of the guarded region
- Functions without exceptions have neither; the overhead is obvious (30+ extra instructions)
**Throw site pattern:**
```asm
; throw SomeException():
; allocate exception object (or use static)
MOVE.L D0, _current_exception ; store exception pointer globally
JSR ___sjljeh_throw ; unwind: calls longjmp on innermost frame
; unreachable (noreturn)
```
If you see `___sjljeh_throw` in a function, that function throws an exception. If you see `___sjljeh_init_handler` + `_setjmp`, it catches one.
---
## Optimization Level Fingerprints
@ -717,6 +1023,20 @@ The A6 frame pointer choice (rather than A5) comes from the System V m68k ABI, w
---
## Practical RE Workflow — Stripped Binary Analysis
Quick steps when you have a stripped GCC Amiga binary with zero symbols:
1. **Confirm compiler**: `.text` hunk name? → GCC. `LINK A5` in first function? → SAS/C. `LINK A6`? → GCC with frame pointer.
2. **Find entry**: `MOVEA.L 4.W, A6` near start of `.text` → libnix entry. Follow to `JSR _main`.
3. **Apply FLIRT**: identify C runtime, `dos.library` stubs, math functions; this names 20–40% of functions immediately.
4. **Trace call graph from `main`**: rename each function as you understand it; GCC's per-function register save sets help scope the function boundary.
5. **Look for `__CTOR_LIST__`**: if present, trace it before `main` — global constructors may initialize important state.
6. **Check for `__regargs`**: if a function's first pointer arg seemingly doesn't use D0/D1, check if it arrives in A0 (regparm convention).
7. **BOOPSI/MUI class?**: look for `MakeClass`/`MUI_CreateCustomClass` and trace to the dispatcher function.
---
## FAQ
**Q: How do I tell GCC 2.95.x from GCC 6.x (bebbo) in a binary?**
@ -738,4 +1058,14 @@ A: Search for libnix startup signature: `MOVE.L 4.W, A6` / `JSR ___startup_SysBa
- [startup_code.md](../../../04_linking_and_libraries/startup_code.md) — libnix/clib2 startup internals
- *bebbo's amiga-gcc*: https://codeberg.org/bebbo/amiga-gcc
- *GeekGadgets*: GCC 2.95 for AmigaOS (archived documentation)
- *adtools/amigaos-gcc-2.95.3 — amigaos.h*: https://github.com/adtools/amigaos-gcc-2.95.3/blob/master/gcc/config/m68k/amigaos.h — calling convention, attributes, EH config
- *RidgeX/ghidra-gcc2-stabs*: https://github.com/RidgeX/ghidra-gcc2-stabs — Ghidra plugin for stabs import
- *BartmanAbyss/ghidra-amiga*: https://github.com/BartmanAbyss/ghidra-amiga — Ghidra hunk loader + AmigaOS types
- *cnvogelg/GccFindHit* (stabs parser): https://github.com/cnvogelg/m68k-amigaos-toolchain/blob/master/tools/GccFindHit.c
- *cahirwpz — AmigaOS GCC regparm*: http://cahirwpz.users.sourceforge.net/gcc-amigaos/regparm.html
- *Tetracorp — Reverse Engineering Amiga*: https://tetracorp.github.io/guide/reverse-engineering-amiga.html
- *GCC 2.95 caveats*: https://gcc.gnu.org/gcc-2.95/caveats.html
- *STABS format reference*: https://sourceware.org/gdb/onlinedocs/stabs.html
- *SDI_hook.h (BOOPSI/MUI dispatcher macros)*: https://github.com/amiga-mui/betterstring/blob/master/include/SDI_hook.h
- *BOOPSI documentation*: https://wiki.amigaos.net/wiki/BOOPSI_-_Object_Oriented_Intuition
- See also: [sasc.md](sasc.md), [vbcc.md](vbcc.md) — compare with other compilers