# Compiler Fingerprints — Identifying the Toolchain
## Overview
Knowing which compiler produced an Amiga binary dramatically reduces reverse engineering effort. Each compiler has distinctive code generation patterns at the function prologue, epilogue, calling sequence, and global variable access levels.
---
## SAS/C 6.x Fingerprints
### Function Prologue
```asm
; Classic SAS/C stack frame with A5 as frame pointer:
LINK A5, #-N ; allocate N bytes on stack
MOVEM.L D2-D7/A2-A4, -(SP) ; save callee-saved registers
- C++ methods use custom mangling: `__ct__6Window` (constructor), `__dt__` (destructor)
- C++ Vtable layout starts directly at first method (no `offset_to_top`)
- May contain `PPC_CODE` hunks for WarpOS binaries
---
## Lattice C 3.x/4.x
The predecessor to SAS/C (1985-1989 era).
```asm
; Lattice C 3.x/4.x:
LINK A5, #-$14
MOVEM.L D2-D5/A2-A3, -(SP) ; Saves fewer regs than SAS/C
```
**Distinguishing features:**
- Saves only 6-7 registers instead of SAS/C's 9.
- Uses `MOVE.L #small_val, D0` instead of `MOVEQ`.
- Uses long branches (`BRA`, `BEQ`) instead of short branches (`BRA.S`).
---
## DICE C
Lean, fast compiler by Matt Dillon.
```asm
; DICE C:
MOVEM.L D2-D4/A2-A3, -(SP) ; Per-function save, no frame pointer
```
**Distinguishing features:**
- Extremely similar to VBCC/GCC (no frame pointer).
- Entry point is uniquely named `_mainCRTStartup`.
- Often uses `ADDQ.L #4, SP` to clean up stack arguments.
- Uses `MOVEA.L (_LibBase).L, A6` for library calls.
---
## Hand-Written Assembly (Assembler-Only Code)
Unlike compiler-generated code with predictable prologues and calling sequences, hand-written 68000 assembly (common in demos, games, and bootblocks) is unconstrained.
**Distinguishing features:**
- **No `LINK` or `SUBQ.L #N,SP`** in the entire binary.
- **Custom Hardware Base Pointers:** Authors often dedicate a register (typically `A4` or `A5`) to `$DFF000` (custom chip base) for the entire program: `LEA $DFF000, A4`.
- **Ad-hoc Calling Conventions:** Parameters passed in arbitrary registers. `A6` might be used as a data pointer rather than the library base.
- **Maximally Specified Saves:** `MOVEM.L D0-D7/A0-A6, -(SP)` used aggressively for interrupt handlers or per-routine saves, rather than the compiler's minimal necessary set.
- **Self-Modifying Code (SMC):** `MOVE.W #imm, (next_insn+2, PC)` to patch instructions at runtime.
- **Hardware Register Banging:** Direct immediate access to `$DFFxxx` (custom chips), `$BFExxx` (CIAA), and `$BFDxxx` (CIAB).
- **PC-Relative Data Tables:** `LEA table(PC), An` used for copper lists, sprite data, and audio samples mixed within the `CODE` hunk.
> [!TIP]
> For a deep dive into reversing hand-written Amiga assembly, see the **[Hand-Written Assembly Field Manual](static/asm68k_binaries.md)**.
### Assembler Toolchain Fingerprints
Because macro assemblers translate mnemonics 1:1, they lack the rigid calling conventions of C compilers. However, the choice of assembler (and the era it belongs to) can leave subtle forensic clues in the binary:
| **ASM-One** (and ASM-Pro) | 1990s Demoscene Standard | **Literal Translation:** Early versions did not automatically optimize `MOVE.L #0, Dn` to `MOVEQ`.<br>**Section Merging:** Often outputs a single giant `CODE` hunk containing data, copper lists, and BSS because coders frequently omitted `SECTION` directives.<br>**Symbols:** `HUNK_SYMBOL` tables lack the `_` prefix typical of C linkers.<br>**Relocation Ordering:** Unlike external linkers that group `HUNK_RELOC32` arrays strictly by target hunk, ASM-One's single-pass compilation often emits a single massive relocation block at the end of the file in sequential generation order. |
| **Seka / K-Seka** | 1980s Early Demoscene | **The Literal Extreme:** Absolutely zero optimization. What you write is what you get.<br>**Compact output:** Often used for bootblocks and 4K intros; does not generate standard Amiga hunks natively unless explicitly coded to do so. |
| **Devpac (HiSoft)** | 1980s-90s "Pro" Standard | **Disciplined Hunks:** Devpac encouraged proper `SECTION CODE,CODE` and `SECTION DATA,DATA` usage, resulting in cleanly separated binary hunks.<br>**Optimization:** Featured early peephole optimization (short branches, `MOVEQ`).<br>**Debug Hunks:** Devpac injects proprietary debug structures. Look for `HUNK_DEBUG` ($03F1) blocks containing the `"HCLN"` (HiSoft Compressed Line Numbers) or `"LINE"` ASCII signatures, and unique Devpac-only hunk types like `HUNK_DEXT` ($03F7) and `HUNK_DREL32` ($03F8). |
| **PhxAss** | Late 90s Performance | **Aggressive Optimization:** Automatically shrinks `MOVE.L` to `MOVEQ`, and `LEA/JMP` to PC-relative `BSR/BRA` where possible.<br>**Object Linking:** Often output object files linked via `Blink`. `Blink` leaves its own structural fingerprints, strictly ordering `HUNK_RELOC32` offsets in ascending order per target hunk, and cleanly terminating relocation arrays. |
| **Barfly** | 1990s High-Speed | Extremely fast. Output binaries are functionally similar to PhxAss, often utilizing external linkers and producing highly optimized instruction sequences. |
| **vasm** | Modern Cross-Assembler | Can emulate the syntax and output style of Devpac (`-m68000 -Fhunkexe -phxass`) or ASM-One, making its footprint identical to the legacy assembler it is configured to mimic. |
- **Amiga ROM Kernel Reference Manual (RKRM): Includes and Autodocs** — Definitive source for standard Commodore Hunk IDs (`HUNK_CODE`, `HUNK_RELOC32`, `HUNK_DEBUG`).
- **AmigaDOS Executable Format Documentation** — Details the loader's behavior of skipping unrecognized hunk blocks, which allowed for proprietary debugger extensions.
- **HiSoft Devpac Amiga Assembler Manual** — Primary source for understanding `"HCLN"` (HiSoft Compressed Line Numbers) and its proprietary `HUNK_DEXT` / `HUNK_DREL32` structures.
- **Amiga Development Wiki (Wikidot)** — Excellent community repository documenting the exact bit-layout of the reverse-engineered `HCLN` compression scheme.
- **English Amiga Board (EAB) / Aminet Archives** — Primary historical source for the demoscene evolution of ASM-One, Seka, and PhxAss, including their specific linking behaviors and macro habits.
- **[Per-Compiler RE Field Manuals](static/compilers/README.md)** — In-depth binary analysis for each compiler