9.2 KiB
Recovering Data Structures
Overview
You see MOVE.L ($17A,A6), A0 in disassembly. You know A6 is SysBase. But +$17A — what field is that? Without structure definitions, every offset is just a number. With them, +$17A becomes SysBase->LibList and the disassembly transforms from arithmetic to narrative.
Amiga executables are built on a deep stack of OS structures — ExecBase, Node, List, Task, Process, IORequest, Message, MsgPort. Recovering these structures in disassembly means matching base register + constant offset patterns against the NDK 3.9 header definitions. This article covers the methodology, the most commonly encountered structures, and the IDA Pro workflows that automate the process.
graph LR
subgraph "Disassembly"
RAW["MOVE.L ($17A,A6), A0"]
end
subgraph "NDK Headers"
HDR["exec/execbase.h<br/>LibList at +0x17A"]
end
subgraph "Annotated"
ANN["SysBase->LibList.lh_Head<br/>= first library node"]
end
RAW -->|"match offset"| HDR
HDR -->|"apply structure type"| ANN
The MOVE.L offset(An),Dm Pattern
Structure field accesses appear as:
MOVEA.L _DOSBase, A6
MOVE.L ($17A,A6), A0 ; SysBase->LibList at offset +0x17A
MOVE.L (A0), A1 ; lh_Head
The key is the base register and constant offset. Match the offset against known structure definitions.
Common Structures and Key Offsets
struct ExecBase (at absolute address $4)
| Offset | Field | Type |
|---|---|---|
| +0 | LibNode |
struct Library |
| +0x128 | TaskReady |
struct List |
| +0x132 | TaskWait |
struct List |
| +0x17A | LibList |
struct List |
| +0x182 | DeviceList |
struct List |
| +0x21E | ChipRevBits0 |
UWORD |
| +0x280 | MemList |
struct List |
struct Node (8 bytes)
| Offset | Field |
|---|---|
| +0 | ln_Succ (next node) |
| +4 | ln_Pred (prev node) |
| +8 | ln_Type (UBYTE) |
| +9 | ln_Pri (BYTE priority) |
| +10 | ln_Name (STRPTR) |
List traversal:
MOVEA.L lh_Head, A0 ; first node
.loop:
TST.L (A0) ; ln_Succ == NULL?
BEQ.S .done
; process node at A0
MOVEA.L (A0), A0 ; A0 = ln_Succ
BRA.S .loop
struct Process (extends struct Task)
| Offset | Field |
|---|---|
| +0 | pr_Task (struct Task) |
| +92 | pr_MsgPort |
| +128 | pr_CLI (BPTR, non-NULL if CLI) |
| +172 | pr_SegList (BPTR to segment list) |
Detection in disassembly:
MOVE.L ($80,A4), D0 ; pr_CLI at offset +0x80
BEQ.S .wb_launch ; NULL = Workbench
struct IORequest / struct IOStdReq
| Offset | Field |
|---|---|
| +0 | io_Message (struct Message) |
| +20 | io_Device |
| +24 | io_Unit |
| +28 | io_Command (UWORD) |
| +30 | io_Flags (UBYTE) |
| +32 | io_Error (BYTE) |
| +36 | io_Length (ULONG) |
| +40 | io_Actual (ULONG) |
| +44 | io_Data (APTR) |
| +48 | io_Offset (ULONG) |
Annotating Structures in IDA Pro
Define a structure type:
View → Open Subviews → Local Types→Insert→ paste C struct definition- IDA parses the struct and creates a type entry
- Navigate to the base register in disassembly
- Press
T(structure offset) on anyoffset(An)operand - Select the struct type → all accesses auto-annotated
Import NDK39 headers:
Use File → Load file → Parse C header file → select exec/execbase.h, exec/tasks.h, etc. from NDK39.
Decision Guide — When to Use Each Approach
| Approach | Speed | Accuracy | Best For |
|---|---|---|---|
.fd file mapping |
Instant | 100% (known libs) | Library function identification — not structure recovery |
| Manual offset matching | 1–5 min per struct | 100% (verified against NDK) | Small structures or one-off field identification |
IDA Structure subview + T hotkey |
30 sec | 100% (if struct defined) | Batch annotation of known structures |
| Parse C header file | 1 min setup | 100% | Importing full NDK type system |
| Heuristic: offset clustering | ~2 min | 70–90% | Unknown structures — group accesses by register, infer field boundaries |
Named Antipatterns
1. "The Offset Hallucination"
What it looks like — assuming ($1A,A6) is always lib_Version because the number looks right:
MOVE.W ($1A,A6), D0 ; looks like version?
; Actually: lib_Version is at +$14 (offset 20), NOT +$1A (offset 26)
; +$1A = lib_Node.ln_Name (upper word of STRPTR)
Why it fails: Hex offsets are deceptive. $14 and $1A differ by 6 bytes — one field apart in a packed structure. Without the header definition, off-by-one-field errors are invisible until runtime.
Correct: Always verify against the NDK header. lib_Version is at +$14 (UWORD), not +$1A.
2. "The Nested Structure Blur"
What it looks like — accessing SysBase->LibList.lh_Head but interpreting it as SysBase->TaskWait.lh_Head:
MOVEA.L ($17A,A6), A0 ; +$17A = LibList (correct)
; Not: +$132 = TaskWait — that's a different list entirely
Why it fails: SysBase has multiple struct List fields. LibList (+$17A), DeviceList (+$182), TaskReady (+$128), TaskWait (+$132), MemList (+$280) — all use the same lh_Head access pattern. Without checking the exact offset, you'll identify the wrong list.
Correct: The offset is the discriminator. +$17A = LibList, +$128 = TaskReady. Never guess.
Use-Case Cookbook
Recover an Unknown Allocator — Trace AllocMem → FreeMem Pair
; Find the alloc:
MOVEQ #$1000, D0 ; size = 4096
MOVE.L #$10002, D1 ; MEMF_CLEAR | MEMF_PUBLIC
MOVEA.L 4.W, A6
JSR (-198,A6) ; AllocMem → D0 = ptr
MOVEA.L D0, A4
; ... later, find the free:
MOVEA.L A4, A1
MOVEQ #$1000, D0
JSR (-210,A6) ; FreeMem(A1=ptr, D0=size)
Trace D0 from AllocMem through the function to identify which struct is being allocated. If the code then accesses ($14,A4), you have a struct Library allocation.
Batch-Annotate All ExecBase Accesses
# IDA Python: apply ExecBase structure to all SysBase-relative accesses
def apply_execbase_structure():
sid = idc.get_struc_id("ExecBase")
if sid == idc.BADADDR:
idc.import_type(-1, "ExecBase")
sid = idc.get_struc_id("ExecBase")
sysbase = idc.get_name_ea_simple("SysBase")
if sysbase == idc.BADADDR:
print("SysBase not found")
return
# Find all instructions referencing SysBase-relative offsets
for xref in idautils.XrefsTo(sysbase):
ea = xref.frm
# Navigate forward looking for offset(An) operands
for i in range(10):
ea = idc.next_head(ea)
for n in range(2):
op = idc.print_operand(ea, n)
if op and '(' in op and 'A6' in op:
idc.op_stroff(ea, n, sid, 0)
print(f"Applied ExecBase at {ea:#010x}")
apply_execbase_structure()
Cross-Platform Comparison
| Amiga Concept | Win32 Equivalent | Linux ELF Equivalent | Notes |
|---|---|---|---|
| Structure recovery by offset matching | PDB symbol file (debug info) | DWARF debug info .debug_info |
Amiga lacks embedded debug info — must match offsets manually |
| NDK headers as ground truth | Windows SDK headers + PDB | GLibc headers + DWARF | Same idea: header defines layout, disassembly shows access pattern |
MOVE.L ($14,A6), D0 |
mov eax, [esi+14h] |
mov rax, [rbp+0x14] |
Universal pattern: base register + constant offset |
IDA T hotkey for struct offset |
IDA T on x86/ARM too |
Same | IDA's struct offset annotation is platform-independent |
FAQ
How do I identify a struct when the base register changes?
Track register writes backward. If A4 holds a struct pointer but you can't tell which struct, find the last MOVEA.L ..., A4. If it came from AllocMem, the size in D0 tells you the struct size — match against known struct sizes from NDK. If it came from OpenLibrary, it's a library base.
What if the OS version changed the struct layout?
Commodore maintained binary compatibility — fields were appended, never reordered. An offset that works on Kickstart 1.3 also works on 3.1, because the earlier fields are at the same positions. However, fields added in later versions only exist in those versions. Always check lib_Version before accessing fields added after OS 1.3.
References
Exec Node Traversal Loops
A recurring pattern: walking the LibList or DeviceList:
; Annotated after struct recovery:
MOVEA.L SysBase, A6
LEA (LibList,A6), A0 ; &SysBase->LibList
MOVEA.L (lh_Head,A0), A1 ; first lib node
.scan:
MOVEA.L (ln_Succ,A1), A2 ; peek next
TST.L A2
BEQ.S .not_found
; compare ln_Name string
MOVEA.L (ln_Name,A1), A0
JSR ___strcmp
TST.L D0
BEQ.S .found
MOVEA.L A2, A1
BRA.S .scan
References
- NDK39:
exec/execbase.h,exec/tasks.h,exec/nodes.h,exec/io.h - exec_base.md — full ExecBase field listing
- lists_nodes.md — MinList/List traversal
- IDA Pro: Structure subview, Local Types, T hotkey for struct offset