More information added

This commit is contained in:
Ilia Sharin 2026-04-27 18:34:07 -04:00
parent a383d4c065
commit 05452c6c12
10 changed files with 1617 additions and 10 deletions

View file

@ -2,9 +2,34 @@
# Identifying OS API Calls in Disassembly
## Background: How AmigaOS Library Calls Work
## Overview
Before diving into identification techniques, it helps to understand the mechanics from first principles.
You are staring at a disassembly listing of an unknown Amiga binary. You see `JSR (-$48,A6)` and have no idea what it calls. Multiply this by a thousand such instructions across the binary, and you realize: **without a systematic way to identify every OS call, reverse engineering an Amiga program is impossible.**
The AmigaOS library calling convention encodes every public function as a negative byte offset from a library base pointer — the **Library Vector Offset (LVO)**. If you know which library base lives in `A6` and what LVO is being called, you know exactly what the code does. This article covers the complete methodology: from raw `JSR (-N,A6)` to a fully annotated disassembly where every OS call is named.
```mermaid
graph LR
subgraph "Source Code"
SRC["Open(name, mode)"]
end
subgraph "Compiler Output"
ASM["JSR -30(A6)"]
end
subgraph "Runtime Dispatch"
JT["JMP table<br/>entry at -30"]
IMPL["dos.library<br/>Open_impl()"]
end
subgraph ".fd File Mapping"
FD["##bias 30<br/>Open(name,mode)(d1,d2)"]
end
SRC -->|"compiles to"| ASM
ASM -->|"dispatches through"| JT
JT --> IMPL
FD -.->|"documents the<br/>LVO→name mapping"| ASM
```
---
### What is a Shared Library?
@ -287,6 +312,217 @@ If you encounter `JSR (-N,A6)` and don't know which library A6 holds:
---
## Decision Guide — Which Lookup Method?
When you encounter `JSR (-N,A6)`, you have multiple ways to resolve the call. Choose based on what you know:
```mermaid
graph TD
Q[\"JSR -N,A6 —<br/>what does it call?\"/]
Q -->|\"A6 is a known library base?\"| KNOWN["Look up N in<br/>that library's .fd file"]
Q -->|\"A6 is unknown\"| TRACE["Trace A6 back<br/>to its source"]
TRACE -->|\"Found global+OpenLibrary\"| ID["Identify library<br/>from string arg"]
ID --> KNOWN
TRACE -->|\"Can't trace\"| HEUR["Heuristic:<br/>LVO in common ranges?"]
HEUR -->|\"-30 to -300\"| DOS["Likely dos.library"]
HEUR -->|\"-120 to -558\"| EXEC_LIKELY["Likely exec.library"]
HEUR -->|\"Other\"| SEARCH["Search all .fd files<br/>for matching bias"]
```
| Method | Speed | Accuracy | When to Use |
|---|---|---|---|
| **Known A6 + .fd lookup** | Instant | 100% | You've already identified the library base |
| **Trace A6 + find OpenLibrary** | ~2 min | 100% | Unknown library base, need certainty |
| **LVO range heuristic** | ~10 sec | ~80% | Quick triage, common LVOs overlap |
| **Grep all .fd files** | ~1 min | 95% | Unknown library, LVO not in common ranges |
---
## Named Antipatterns
### 1. \"The Kitchen Sink LVO Table\"
**What it looks like** — loading a massive precomputed table covering every LVO in every library into IDA, then blindly applying it without verifying A6:
```python
# BROKEN — applies ALL LVOs globally, ignores which library A6 holds
for lvo, name in ALL_LVOS.items():
idc.set_name(base + lvo, name) # wrong: base = Who knows?
```
**Why it fails:** A6 could hold `DOSBase`, `GfxBase`, or `IntuitionBase` at any point. An LVO `-30` means `dos.library Open()` only when A6=`DOSBase`. Applied to `GfxBase`, it's `graphics.library BltBitMap()` — completely different. You get a disassembly full of confidently wrong labels.
**Correct:** Always identify A6's library first, then apply that specific library's LVO map:
```python
if get_name(global_ptr) == "_DOSBase":
apply_lvos(lib_base, DOS_LVO)
elif get_name(global_ptr) == "_GfxBase":
apply_lvos(lib_base, GFX_LVO)
```
### 2. \"The Ghost Library\"
**What it looks like** — assuming the first `JSR (-N,A6)` after a `MOVEA.L 4.W, A6` uses the exec library base, but A6 was overwritten between the load and the call:
```asm
MOVEA.L 4.W, A6 ; exec base — but this is never used
MOVEA.L (_DOSBase).L, A6 ; A6 overwritten with DOS base
JSR (-552,A6) ; WRONG assumption: this is NOT exec OpenLibrary
; Correct: LVO -552 for dos.library is ExAll
```
**Why it fails:** The disassembler shows `JSR (-552,A6)` and annotates it as `exec.library OpenLibrary()` because that's the most common match. But A6 was reloaded with `_DOSBase` — the actual call is `dos.library ExAll()` at LVO `-552`. Same LVO, different library, completely different behavior.
**Correct:** Track A6's value at every `JSR`. Never assume A6 is static across a function.
```asm
MOVEA.L 4.W, A6 ; A6 = SysBase (verified: exec at $4)
MOVEA.L (_DOSBase).L, A6 ; A6 = DOSBase (verified: global labeled)
JSR (-552,A6) ; LVO -552 in dos.library = ExAll
```
### 3. \"The Stale Base\"
**What it looks like** — calling through A6 after `CloseLibrary()`:
```c
// BROKEN
CloseLibrary(DOSBase);
if (result)
DOSBase->DoSometime(); // DOSBase is now stale — crash or call into freed memory
```
In disassembly, you see `JSR (-N,A6)` after a `JSR (-558,A6)` (CloseLibrary). A6 becomes a dangling pointer. Any subsequent call through it hits freed memory — a crash or, worse, silent corruption.
**Correct:** After `CloseLibrary`, zero the base pointer. In the disassembly, flag any `JSR` that follows a `CloseLibrary` sequence as suspicious.
---
## Pitfalls
### 1. LVO Collisions Across Libraries
**The bug:**
```asm
MOVEA.L (_IntuitionBase).L, A6
JSR (-42,A6) ; Is this Read()? No!
```
**Why:** LVO `-42` is `dos.library Read()` AND `intuition.library DrawImageState()` AND `graphics.library RectFill()`. LVOs are only unique within a single library.
**Correct:** The library base in A6 disambiguates. Always label the library base global first, then resolve LVOs.
### 2. Private LVOs in Third-Party Libraries
**The bug:** Using NDK `.fd` files to resolve calls to a third-party library (e.g., `Miami.library`, `muimaster.library`). The NDK doesn't document these — the LVO table won't match.
**Correct:** Third-party libraries require third-party `.fd` files. Search Aminet for `"libraryname" fd` or reconstruct the LVO table from the library binary itself (see [library_jmp_table.md](library_jmp_table.md)).
### 3. Inline Variants — Bypassing the JMP Table
**The bug:**
```asm
MOVEA.L (_DOSBase).L, A6
MOVEA.L (-30,A6), A0 ; read JMP table entry (NOT the JMP itself)
JSR (A0) ; call directly — you won't see LVO -30 here
```
**Why:** Some compilers (especially GCC with `-fbaserel`) inline the JMP table read. The `JSR (A0)` has no static LVO that grep can match. You must trace A0 back to the `MOVEA.L (-30,A6),A0` to recover the LVO.
**Correct:** When you see `JSR (A0)` or `JSR (An)` with a register, check the immediately preceding instruction for a `MOVEA.L (-N, A6), An` — that `-N` is your LVO.
---
## Use-Case Cookbook
### Find All File I/O Operations
To identify every file open/read/write/close in a binary:
1. Search for `JSR (-552,A6)` (exec OpenLibrary) and identify the `dos.library` open
2. Label the resulting global `_DOSBase`
3. Xref `_DOSBase` — every read is a function that uses dos.library
4. Filter for `JSR (-30,A6)` (Open), `JSR (-42,A6)` (Read), `JSR (-48,A6)` (Write), `JSR (-36,A6)` (Close)
5. Cross-reference the `D1` register before each Open call to identify **which files** are being opened
### Find All Memory Allocations
1. Search for `JSR (-198,A6)` where A6=`SysBase` (exec AllocMem)
2. Note `D0` (size) and `D1` (attributes — `MEMF_CHIP` = `$0002`, `MEMF_FAST` = `$0004`, `MEMF_CLEAR` = `$10000`)
3. Identify allocations that request Chip RAM — these are for audio buffers, copper lists, or bitplanes
4. Trace the returned pointer in `D0` to find what the allocation is used for
### Trace an Unknown Message Flow
1. Find `JSR (-378,A6)` (exec PutMsg) — identifies message senders
2. Find `JSR (-384,A6)` (exec GetMsg) — identifies message receivers
3. Find `JSR (-408,A6)` (exec WaitPort) — identifies blocking receivers
4. Trace `A0` before each `PutMsg` to identify **which port** the message targets
5. Trace `A0` before each `GetMsg` to identify **which port** the receiver listens on
6. If sender and receiver port names match, you've found a communication pair
### Map an Application's Library Dependencies
```python
# IDA Python: dump every library used by a binary
import idautils, idc
def find_library_opens():
"""Find all OpenLibrary calls and print library names."""
for ea in idautils.Heads():
if idc.print_insn_mnem(ea) == 'JSR':
# Check if A6 register is used (LVO-style call)
op = idc.print_operand(ea, 0)
if 'A6' in op and '-552' in op: # OpenLibrary
# Walk back to find LEA with string
prev = idc.prev_head(ea)
if idc.print_insn_mnem(prev) == 'LEA':
str_addr = idc.get_operand_value(prev, 0)
lib_name = idc.get_strlit_contents(str_addr)
if lib_name:
print(f" {idc.here():08X}: OpenLibrary({lib_name.decode()})")
find_library_opens()
```
---
## Cross-Platform Comparison
| AmigaOS Concept | Win32 Equivalent | Linux ELF Equivalent | Notes |
|---|---|---|---|
| LVO-based library call (`JSR -30(A6)`) | IAT (Import Address Table) thunk | PLT (Procedure Linkage Table) stub | Both use indirection; Amiga's is register+offset, modern OSes use memory-based tables |
| `.fd` file (function descriptor) | `.lib` import library + `GetProcAddress` | `.so` ELF symbol table | `.fd` files are human-readable text; PE/ELF symbol tables are binary |
| `OpenLibrary("dos.library", 36)` | `LoadLibrary("kernel32.dll")` | `dlopen("libc.so.6", ...)` | Same pattern: load by name, get base pointer, resolve functions |
| Library base in A6 | DLL base address in EAX/RAX | Shared object handle from `dlopen` | Amiga uses a dedicated register convention; Win32/Linux use a variable |
| JMP table at negative offsets | IAT entries at RVA offsets | `.got.plt` entries | Amiga's table grows downward from base; PE/ELF tables are at positive offsets |
| No runtime linking required (ROM libraries always present) | Delay-load DLLs | Lazy binding via `LD_BIND_NOW` | Amiga ROM libraries are always mapped — no load failure possible |
---
## FAQ
### How do I identify library calls without .fd files?
If the library is a standard AmigaOS library, `.fd` files are in `NDK39/fd/`. For third-party libraries, search the binary for the JMP table using `4EF9` (the `JMP ABS.L` opcode) clustered at regular 6-byte intervals. See [library_jmp_table.md](library_jmp_table.md).
### What if the binary uses a custom calling convention?
Some demos and games bypass the OS calling convention entirely — they call library functions directly by address (no LVO indirection). This is usually done for speed or obfuscation. In these cases, identify calls by the address falling within a known library's code segment, not by LVO pattern.
### Why does the LVO look wrong — it's not a multiple of 6?
Check the `.fd` file's `##bias` value. Bias = `|LVO|`. So `##bias 30` → LVO `30` → slot 4 (`30÷61`). If you see `JSR (-$1E,A6)`, convert to decimal: `-30`. The hex `$1E` = 30 decimal. Always work in decimal when matching `.fd` biases.
### Can the same LVO appear in two different registers?
Yes. `JSR (-30,A5)` and `JSR (-30,A6)` are different calls if A5 and A6 hold different library bases. The LVO alone does not identify the call — the **register + LVO pair** does.
---
## References
- NDK39: `fd/` directory — all library `.fd` files (plain text, open in any editor)