[← Home](../../../README.md) · [Reverse Engineering](../../README.md) · [Static Analysis](../README.md) · [Compilers](README.md) # VBCC — Reverse Engineering Field Manual ## Overview **VBCC** (Volker Barthelmann's C Compiler) is a portable, retargetable ISO C89 compiler that produces the smallest binaries among Amiga compilers. Its key RE characteristics are: **no frame pointer** (SP-relative access only), **per-function register saves** (only what's actually used), **PC-relative string addressing**, and a distinctive **`__reg()`** calling convention for AmigaOS library calls. VBCC generates clean, tight code that can look deceptively like hand-optimized assembly. Key constraints: - **No LINK instruction** — VBCC never uses `LINK A5` or `LINK A6`. Locals are accessed via `$offset(SP)`. Function boundaries are defined by `MOVEM.L ... -(SP)` at entry and `RTS` at exit. - **Minimal register saves** — Unlike SAS/C (9 registers always) or GCC (per-function but often substantial), VBCC saves only the exact registers used. A leaf function with no locals has no prologue at all. - **Tail-call optimization** — VBCC uses `BRA.S` to common epilogue blocks and `BRA` to tail-call other functions more aggressively than any other Amiga compiler. - **`__MERGED` hunks** — VBCC sometimes merges CODE and DATA into a single hunk when the small data model is active. - **Hunk names**: `CODE`, `DATA`, `BSS` (+ optional `__MERGED` for small-data) ```mermaid graph TB subgraph "Source (.c)" SRC["C source (C89)"] end subgraph "VBCC Compiler Pipeline" VC["vc (driver)"] VBCCM68K["vbccm68k (code generator)"] VASM["vasm (assembler)"] VLINK["vlink (linker)"] end subgraph "Binary Output" HUNK["Amiga HUNK executable"] CODE["CODE hunk"] DATA["DATA hunk"] MERGED["__MERGED (optional, small-data)"] end SRC --> VC VC --> VBCCM68K --> VASM --> VLINK VLINK --> HUNK HUNK --> CODE & DATA HUNK --> MERGED ``` --- ## Binary Identification — The VBCC Signature ### Function Prologue — Nothing or Minimal ```asm ; VBCC leaf function (no locals, no calls): _simple_func: ; NO prologue at all ; ... function body ... RTS ; VBCC function with locals: _moderate_func: MOVEM.L D2-D3/A2, -(SP) ; saves ONLY the 3 registers used ; ... function body ... MOVEM.L (SP)+, D2-D3/A2 RTS ; VBCC large function: _large_func: MOVEM.L D2-D5/A2-A3, -(SP) ; per-function exact save LEA -$80(SP), SP ; allocate stack frame ; ... function body ... LEA $80(SP), SP MOVEM.L (SP)+, D2-D5/A2-A3 RTS ``` **Key differentiator from GCC**: Both VBCC and GCC use per-function register saves, but VBCC's code is consistently tighter. VBCC uses `BRA.S label` to share common epilogue/cleanup code, where GCC duplicates it. VBCC uses `MOVEQ` and `ADDQ` aggressively for small constants. ### String Addressing Like GCC, VBCC uses PC-relative string addressing: ```asm LEA .str_hello(PC), A0 JSR _Printf .str_hello: DC.B "Hello", $0A, 00 ``` ### The `__reg()` Calling Convention — Unique VBCC Fingerprint VBCC's `__reg()` keyword places C variables in named CPU registers without inline assembly: ```c /* VBCC source: */ BPTR __reg("d0") MyOpen(__reg("d1") CONST_STRPTR name, __reg("d2") LONG accessMode); ``` ```asm ; Generated code for Open("foo", MODE_OLDFILE): MOVEA.L _DOSBase, A6 LEA .str_foo(PC), A0 MOVE.L A0, D1 ; name → D1 MOVEQ #1002, D2 ; MODE_OLDFILE → D2 JSR -$1E(A6) ; Open() LVO ``` **No other Amiga compiler generates this exact register-to-argument mapping without inline assembly stubs.** The `__reg()` assignments are visible only through the register usage pattern — functions that take args in specific registers (D1, D2, D3, etc.) without stack access. --- ## Library Call Patterns VBCC library calls are compact and direct: ```asm ; VBCC library call — minimal code: MOVEA.L (_DOSBase).L, A6 ; load library base (absolute with relocation) MOVE.L fh(SP), D1 ; arg from stack MOVE.L buf(SP), D2 MOVE.L len(SP), D3 JSR -$2A(A6) ; Read() ; Return value check: TST.L D0 BMI.S .error ``` VBCC differs from SAS/C here: SAS/C would load args through A5-relative offsets (`$08(A5)`). VBCC uses SP-relative offsets. Since SP may change within the function (pushing args), VBCC carefully maintains SP offsets. ### `#pragma amicall` — VBCC Library Call Pragmas ```c #pragma amicall(DOSBase, 0x1E, Open(d1, d2)) // VBCC pragma format is simpler than SAS/C: // - Library base name (identifier, not a string) // - LVO in hex // - Function name with argument register list ``` In the binary, these pragmas produce the same `JSR -$XXX(A6)` patterns as any other compiler — the pragma just controls argument register assignment. --- ## Optimization Patterns VBCC prioritizes **code density** over raw speed. Its signatures: | Pattern | VBCC Style | SAS/C Equivalent | |---|---|---| | **Shared epilogue** | `BRA.S .epilogue` from multiple exit points | Duplicated epilogue at each return | | **Tail calls** | `BRA _other_func` (discard own frame first) | `JSR _other_func` / `RTS` | | **Small constant loading** | `MOVEQ #N, Dn` whenever possible | `MOVE.L #N, Dn` for some small values | | **Stack frame** | `LEA -$N(SP), SP` (when frame > 32K or variable) | `LINK A5, #-N` | | **Loop termination** | `DBRA Dn, loop` (when counter fits in 16 bits) | `SUBQ.L #1, Dn` / `BNE loop` | ### Cross-Module Optimization VBCC supports cross-module optimization — when linking, `vlink` can reorder and merge functions across `.o` files. In the binary, this means function layout may NOT match source file order, and small static functions may be inlined at link time. --- ## Same C Function — VBCC Output ```asm ; CountWords() — VBCC, -O -speed: ; C prototype: ULONG CountWords(CONST_STRPTR str) _CountWords: MOVEM.L D2-D3, -(SP) ; only D2, D3 needed MOVEQ #0, D2 ; D2 = count MOVEQ #0, D3 ; D3 = in_word MOVEA.L $0C(SP), A0 ; A0 = str (arg at SP + 12) BRA.S .loop_test .loop_body: CMPI.B #' ', (A0) ; *str == ' '? BEQ.S .not_word CMPI.B #'\t', (A0) BEQ.S .not_word CMPI.B #'\n', (A0) BEQ.S .not_word TST.B D3 BNE.S .next_char ADDQ.L #1, D2 ; count++ MOVEQ #1, D3 ; in_word = TRUE BRA.S .next_char .not_word: MOVEQ #0, D3 ; in_word = FALSE .next_char: ADDQ.L #1, A0 ; str++ .loop_test: TST.B (A0) BNE.S .loop_body MOVE.L D2, D0 ; return count MOVEM.L (SP)+, D2-D3 RTS ``` **VBCC-specific observations**: 1. **`MOVEM.L D2-D3, -(SP)`** — only 2 registers saved. Minimal. 2. **`BRA.S .loop_test`** — unconditional branch to loop condition at top. 3. **`BRA.S .next_char`** — shared increment code reached from two paths. 4. **Identical to GCC** in this function because the function is simple enough that optimization differences don't show. For more complex functions (with multiple return paths, struct access, switch statements), VBCC's shared-epilogue and tail-call patterns emerge. ``` Cross-Compiler Comparison (CountWords, bytes of code): SAS/C -O2: ~52 bytes (LINK A5 + 9-reg save + epilogue overhead) GCC -O2: ~48 bytes (no LINK, minimal save, CMPI.B) VBCC -speed:~46 bytes (no LINK, minimal save, aggressive BRA sharing) DICE C: ~48 bytes (similar to VBCC) ``` --- ## Named Antipatterns ### "The Missing Frame Trap" — Assuming LINK for Function Boundaries ```asm ; VBCC function boundaries are RTS-delimited, not LINK-delimited. ; If your IDA script searches for LINK to find functions, you'll miss ALL VBCC functions. ; VBCC function entry could be any of: ; 1. MOVEM.L ..., -(SP) (most common) ; 2. LEA -$XX(SP), SP (large frame) ; 3. First instruction after previous RTS (leaf functions) ; 4. TST.L D0 / BEQ ... (function that doesn't save any regs) ``` ### "The Register Ghost" — `__reg()` Without Symbols Without source-level `__reg()` declarations, VBCC function arguments appear to use arbitrary register assignments. This can look like a custom ABI. The pattern is actually the VBCC `__reg()` convention encoded via `` headers during compilation. --- ## Pitfalls & Common Mistakes ### 1. Confusing VBCC and GCC Output Both omit frame pointers and use per-function saves. Disambiguate by: - **Hunk names**: VBCC uses `CODE`/`DATA`; GCC uses `.text`/`.data` (usually) - **`__MERGED` hunk**: VBCC-specific — no other compiler produces this - **Function naming**: VBCC emits names like `_funcname`; GCC emits `.Lxxx` internal labels - **BRA density**: VBCC has more `BRA.S` instructions (shared epilogues); GCC tends to duplicate code ### 2. Misreading SP-Relative Offsets ```asm ; At function entry (after MOVEM.L D2-D3, -(SP)): ; SP points 8 bytes below entry SP (D2 and D3 pushed) ; Arg1 is at $0C(SP) (8 bytes regs + 4 bytes return addr) ; But after LEA -$10(SP), SP: ; Arg1 is now at $1C(SP) (8 regs + 4 ret + 16 locals) ; The offset CHANGES when SP is modified — unlike A5-relative offsets ``` Track every `LEA +/-$N(SP), SP` instruction — each one shifts ALL subsequent SP-relative offsets. --- ## Use Cases ### Software Known to Be VBCC-Compiled | Application | Notes | |---|---| | **ScummVM (some ports)** | Large C codebase; VBCC's strict C89 catches portability issues | | **Modern Amiga utilities** | Many 2000s+ CLI tools use VBCC for small binary size | | **AROS system components** | VBCC is a supported AROS build compiler | | **MUI 5 custom classes** | Tight BOOPSI dispatch benefits from VBCC's register allocation | | **AmigaOS 4 system libraries** | Hyperion's SDK supports VBCC for OS4 development | --- ## Historical Context VBCC was created by Volker Barthelmann in the mid-1990s as a lightweight alternative to GCC's growing complexity. While GCC was the "heavy" compiler with C++ support, VBCC targeted developers who wanted a fast, standards-compliant C89 compiler that produced small binaries. Unlike SAS/C (commercial, dead since 1996) and GCC (open source but complex), VBCC occupies a unique niche: actively maintained, free for personal use, with a clean codebase. Its `vlink` linker and `vasm` assembler companion tools form a complete toolchain that has become the de facto standard for modern Amiga development alongside GCC bebbo. --- ## Modern Analogies | VBCC Concept | Modern Equivalent | |---|---| | `__reg()` | `register ... asm("d0")` in GCC/Clang (GNU C extension) | | Per-function register save | Clang's `-O2` with aggressive register allocation | | Cross-module optimization | LTO (Link-Time Optimization) in modern compilers | | `vlink` with `vasm` | LLVM's integrated `lld` linker with `clang` | | Config-driven target system | LLVM's `TargetRegistry` and target description files | --- ## FPGA / Emulation Impact - **No `LINK`/`UNLK`**: VBCC binaries don't use these instructions, reducing test coverage needs for frame pointer ops on FPGA cores. - **Aggressive `LEA` for stack frames**: `LEA -$N(SP), SP` must correctly update SP in a single instruction — verify your FPGA core handles LEA with SP destination correctly. - **Cross-module optimization**: No runtime impact; all inlining and merging happens at link time. --- ## FAQ **Q: How do I distinguish VBCC from GCC output?** A: Check hunk names — VBCC uses `CODE`/`DATA`, GCC typically uses `.text`/`.data`. Check for `__MERGED` hunk (VBCC-only). Check internal labels: VBCC uses `_name` format; GCC uses `.Lxxx`. Check BRA density — VBCC shares epilogues more aggressively. **Q: Does VBCC support C++?** A: No. If you find C++ constructs (vtables, `new`/`delete`, name mangling), it's NOT VBCC. **Q: Can VBCC and GCC object files be mixed?** A: No. They use different calling conventions for internal runtime functions. Link the entire project with one compiler. Assembly (`vasm`) can be mixed with VBCC C code using `vlink`. --- ## References - [13_toolchain/vbcc.md](../../../13_toolchain/vbcc.md) — VBCC usage and `__reg()` details - [compiler_fingerprints.md](../../compiler_fingerprints.md) — Quick identification - [13_toolchain/vasm_vlink.md](../../../13_toolchain/vasm_vlink.md) — vasm/vlink toolchain - VBCC homepage: http://sun.hasenbraten.de/vbcc/ - See also: [sasc.md](sasc.md), [gcc.md](gcc.md) — compare with other compilers