More content added

2026-07-28 04:36:28 +00:00 · 2026-04-26 14:46:18 -04:00 · 2026-04-26 14:46:18 -04:00 · 8133b3a6cb
commit 8133b3a6cb
parent 5fac29ccd5
90 changed files with 7794 additions and 705 deletions
--- a/08_graphics/README.md
+++ b/08_graphics/README.md
@ -2,7 +2,7 @@

 # Graphics Subsystem — Overview

-The Amiga graphics system is built on custom DMA-driven hardware (Agnus/Alice + Denise/Lisa) managed through `graphics.library`. It supports planar bitmaps, hardware sprites, a Copper display coprocessor, and a Blitter for fast 2D operations. Three chipset generations (OCS → ECS → AGA) expanded resolution, colour depth, and bandwidth.
+The Amiga graphics system is built on custom DMA-driven hardware (Agnus/Alice + Denise/Lisa) managed through `graphics.library`. It supports planar bitmaps, hardware sprites, a Copper display coprocessor, and a Blitter for fast 2D operations. Three chipset generations (OCS → ECS → AGA) expanded resolution, color depth, and bandwidth.

 ## Section Index

@ -16,7 +16,7 @@ The Amiga graphics system is built on custom DMA-driven hardware (Agnus/Alice +
 | [copper_programming.md](copper_programming.md) | Copper deep dive: architecture, copper list construction, gradient and raster effects |
 | [blitter.md](blitter.md) | Blitter DMA engine, minterms, BltBitMap |
 | [blitter_programming.md](blitter_programming.md) | Blitter deep dive: minterms, cookie-cut masking, line draw, fill mode |
-| [sprites.md](sprites.md) | Hardware sprites: DMA engine, data format, attached 15-colour sprites, multiplexing, AGA enhancements, priority control |
+| [sprites.md](sprites.md) | Hardware sprites: DMA engine, data format, attached 15-color sprites, multiplexing, AGA enhancements, priority control |
 | [rastport.md](rastport.md) | RastPort drawing context: draw modes, patterns, layer clipping, text pipeline, blitter minterms |
 | [views.md](views.md) | View, ViewPort, MakeVPort, display construction |
 | [text_fonts.md](text_fonts.md) | TextFont bitmap layout, baseline rendering, algorithmic styles, AvailFonts enumeration |
--- a/08_graphics/animation.md
+++ b/08_graphics/animation.md
@ -292,11 +292,11 @@ saveSize += sizeof(WORD) * bob->BobVSprite->Height * bob->BobVSprite->Depth;
 bob->SaveBuffer = AllocMem(saveSize, MEMF_CHIP | MEMF_CLEAR);
 ```

-## GEL System Initialisation and Render Loop
+## GEL System Initialization and Render Loop

-The GEL system requires explicit initialisation before use. The core lifecycle is: **init → add objects → sort → draw → sync → repeat → cleanup**.
+The GEL system requires explicit initialization before use. The core lifecycle is: **init → add objects → sort → draw → sync → repeat → cleanup**.

-### Initialisation
+### Initialization

 ```c
 #include <graphics/gels.h>
@ -771,7 +771,7 @@ myVS.PlaneOnOff = 0x00;      /* planes 1-3 get 0 (transparent) */

 | Function | Description |
 |---|---|
-| `InitGels(head, tail, gi)` | Initialise GEL list with sentinel VSprites |
+| `InitGels(head, tail, gi)` | Initialize GEL list with sentinel VSprites |
 | `AddVSprite(vs, rp)` | Add a VSprite to the GEL list |
 | `AddBob(bob, rp)` | Add a BOB (and its backing VSprite) |
 | `AddAnimOb(ao, head, rp)` | Add an AnimOb and all its components |
--- a/08_graphics/bitmap.md
+++ b/08_graphics/bitmap.md
@ -1,17 +1,76 @@
 [← Home](../README.md) · [Graphics](README.md)

-# BitMap — Planar Bitmap Structure and Layout
+# BitMap — Planar Layout, AllocBitMap, Interleaving, and the RastPort Relationship

 ## Overview

-Amiga display memory uses **planar** layout: each bitplane is a separate contiguous memory region. A pixel's colour index is formed by reading one bit from each plane at the same x,y position. This is fundamentally different from chunky (packed-pixel) or interleaved formats.
+The Amiga's display system is built on **planar bitmaps**: rather than storing each pixel's color in a contiguous byte or word (chunky), the display hardware reads one bit from each of several independent memory regions called **bitplanes**. A pixel's color index is the concatenation of bits at the same (x, y) coordinate across all planes. This design was chosen in 1985 because it minimizes DMA bandwidth: a 16-color screen needs only 4 bits per pixel across the bus, and the Copper can manipulate individual bitplanes with simple pointer changes. The trade-off is software complexity — drawing a single pixel requires read-modify-write across multiple planes, and modern developers accustomed to RGB framebuffers must rethink their mental model. The `struct BitMap` is the central data structure that describes this layout, and `graphics.library` provides `AllocBitMap()` (OS 3.0+) to manage its allocation, alignment, and Chip RAM requirements automatically.

 ---

-## struct BitMap
+## Architecture
+
+### BitMap in the Display Pipeline
+
+```mermaid
+graph LR
+    subgraph "Software"
+        APP["Application\nDraw into BitMap"]
+        RP["RastPort\nDrawing context"]
+    end
+
+    subgraph "graphics.library"
+        AB["AllocBitMap()\nChip RAM allocation\nAlignment enforcement"]
+        BL["BltBitMap()\nBlitter DMA copy"]
+    end
+
+    subgraph "Hardware"
+        BP["BitPlane DMA\n(Agnus/Alice)"]
+        DEN["Denise/Lisa\nPixel compositor"]
+        OUT["Video output"]
+    end
+
+    APP --> RP
+    RP --> AB
+    AB --> BL
+    BL --> BP
+    BP --> DEN
+    DEN --> OUT
+
+    style BP fill:#e8f4fd,stroke:#2196f3,color:#333
+    style DEN fill:#fff3e0,stroke:#ff9800,color:#333
+```
+
+### Standard vs Interleaved Layout
+
+```mermaid
+graph TB
+    subgraph "Standard Planar"
+        SP0["Plane 0\n10,240 bytes"]
+        SP1["Plane 1\n10,240 bytes"]
+        SP2["Plane 2\n10,240 bytes"]
+        SP3["Plane 3\n10,240 bytes"]
+    end
+
+    subgraph "Interleaved"
+        I0["Row 0: P0 P1 P2 P3\n160 bytes"]
+        I1["Row 1: P0 P1 P2 P3\n160 bytes"]
+        I2["Row 2: P0 P1 P2 P3\n160 bytes"]
+    end
+
+    style I0 fill:#e8f5e9,stroke:#4caf50,color:#333
+```
+
+In **standard planar**, each plane is a contiguous block. In **interleaved**, planes are striped row-by-row within a single allocation. Interleaving improves cache locality and allows the Blitter to fetch all planes for a given row in one pass.
+
+---
+
+## Data Structures
+
+### struct BitMap

 ```c
-/* graphics/gfx.h — NDK39 */
+/* graphics/gfx.h — NDK 3.9 */
 struct BitMap {
    UWORD  BytesPerRow;    /* bytes per row per plane (must be even) */
    UWORD  Rows;           /* height in pixels */
@ -22,30 +81,40 @@ struct BitMap {
 };
 ```

---
+### Field Reference

-## BMF_ Flags
+| Field | Type | Description | Constraints |
+|---|---|---|---|
+| `BytesPerRow` | `UWORD` | Bytes per scanline **per plane** | Must be even; minimum 2; typically `width / 8` rounded up to even |
+| `Rows` | `UWORD` | Height in pixels | Maximum 1024 on OCS/ECS; 2048+ on AGA with large-modulo tricks |
+| `Flags` | `UBYTE` | `BMF_*` allocation flags | Set by `AllocBitMap`; do not modify directly |
+| `Depth` | `UBYTE` | Number of bitplanes | 1–8 (AGA); 1–6 (OCS/ECS practical limit) |
+| `Planes[]` | `PLANEPTR` | Pointers to each plane's base address | All must be in Chip RAM if displayable; `NULL` for unused planes |
+
+### BMF_ Flags

 ```c
-#define BMF_CLEAR        (1<<0)  /* clear planes on allocation */
-#define BMF_DISPLAYABLE  (1<<1)  /* allocated in displayable (Chip) RAM */
-#define BMF_INTERLEAVED  (1<<2)  /* planes are interleaved in memory */
-#define BMF_STANDARD     (1<<3)  /* use standard allocation */
-#define BMF_MINPLANES    (1<<4)  /* minimum number of planes */
+#define BMF_CLEAR        (1<<0)  /* Zero-fill planes on allocation */
+#define BMF_DISPLAYABLE  (1<<1)  /* Allocate in Chip RAM (DMA-visible) */
+#define BMF_INTERLEAVED  (1<<2)  /* Row-interleaved plane layout */
+#define BMF_STANDARD     (1<<3)  /* Use standard (non-super) allocation */
+#define BMF_MINPLANES    (1<<4)  /* Allocate minimum planes; rest NULL */
 ```

 ---

 ## Planar Memory Layout

-For a 320×256×4 display (16 colours):
+### Standard Planar
+
+For a 320×256×4 display (16 colors):

 ```
 BytesPerRow = 320/8 = 40 bytes
 Rows = 256
 Depth = 4

-Plane 0: 40 × 256 = 10,240 bytes  (bit 0 of colour index)
+Plane 0: 40 × 256 = 10,240 bytes  (bit 0 of color index)
 Plane 1: 40 × 256 = 10,240 bytes  (bit 1)
 Plane 2: 40 × 256 = 10,240 bytes  (bit 2)
 Plane 3: 40 × 256 = 10,240 bytes  (bit 3)
@ -53,66 +122,477 @@ Plane 3: 40 × 256 = 10,240 bytes  (bit 3)
 Total = 4 × 10,240 = 40,960 bytes
 ```
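
The arithmetic above can be checked with a small host-side sketch. The helper names `bytes_per_row` and `bitmap_bytes` are hypothetical and not part of `graphics.library`; the rounding to a whole 16-bit word is an assumption that mirrors the "must be even" constraint on `BytesPerRow`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes per row for ONE plane, rounded up to a
   whole 16-bit word so the result is always even. */
uint32_t bytes_per_row(uint32_t width_px)
{
    return ((width_px + 15u) / 16u) * 2u;
}

/* Hypothetical helper: total bytes for a standard planar bitmap. */
uint32_t bitmap_bytes(uint32_t width_px, uint32_t rows, uint32_t depth)
{
    assert(depth >= 1u && depth <= 8u);   /* 1 to 8 planes */
    return bytes_per_row(width_px) * rows * depth;
}
```

For the 320×256×4 case this gives 40 bytes per row and 40,960 bytes in total, matching the figures above; a 321-pixel width rounds up to 42 bytes.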

-Pixel colour at (x, y):
-```
-bit0 = (Planes[0][y * BytesPerRow + x/8] >> (7 - x%8)) & 1
-bit1 = (Planes[1][y * BytesPerRow + x/8] >> (7 - x%8)) & 1
-bit2 = (Planes[2][y * BytesPerRow + x/8] >> (7 - x%8)) & 1
-bit3 = (Planes[3][y * BytesPerRow + x/8] >> (7 - x%8)) & 1
-colour_index = (bit3 << 3) | (bit2 << 2) | (bit1 << 1) | bit0
-```
-
---
-
-## Allocation
-
+Pixel color at (x, y):
 ```c
-/* OS 3.0+ — AllocBitMap: */
-struct BitMap *bm = AllocBitMap(320, 256, 4,
-                                BMF_CLEAR | BMF_DISPLAYABLE, NULL);
-/* Always in Chip RAM when BMF_DISPLAYABLE */
-
-/* Manual allocation (OS 1.x compatible): */
-struct BitMap bm;
-InitBitMap(&bm, 4, 320, 256);
-for (int i = 0; i < 4; i++)
-    bm.Planes[i] = AllocRaster(320, 256);  /* MEMF_CHIP */
-
-/* Free: */
-FreeBitMap(bm);  /* or FreeRaster per plane */
+UBYTE byte = bm->Planes[p][y * bm->BytesPerRow + x / 8];
+UBYTE bit  = (byte >> (7 - (x & 7))) & 1;
 ```

---
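+> **Bit-order note**: The display hardware shifts each byte out most-significant bit first, so the leftmost pixel of each byte is bit 7, not bit 0. `(7 - (x & 7))` selects the correct bit. This is a bit-numbering convention rather than a byte-endianness issue, but developers used to LSB-first layouts often get it backwards.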
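
To make the per-plane bit extraction concrete, here is a minimal host-side sketch that assembles a full color index from all planes. `struct PlanarView` and `read_pixel` are hypothetical stand-ins for illustration (not the real `struct BitMap` or any `graphics.library` call), and a standard, non-interleaved layout is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the BitMap fields used here. */
struct PlanarView {
    uint16_t bytes_per_row;   /* per plane, standard layout */
    uint8_t  depth;           /* number of planes, 1 to 8 */
    uint8_t *planes[8];       /* one base pointer per plane */
};

/* Gather one bit from each plane; plane p supplies bit p of the index. */
uint8_t read_pixel(const struct PlanarView *v, uint16_t x, uint16_t y)
{
    assert(v->depth >= 1 && v->depth <= 8);

    size_t  offset = (size_t)y * v->bytes_per_row + (x >> 3);
    uint8_t shift  = (uint8_t)(7 - (x & 7));   /* bit 7 is the leftmost pixel */
    uint8_t index  = 0;

    for (uint8_t p = 0; p < v->depth; p++)
        index |= (uint8_t)(((v->planes[p][offset] >> shift) & 1u) << p);

    return index;
}
```

The loop is the read-side mirror of a pixel plot: the same offset and shift are reused for every plane, and only the plane pointer changes.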

-## Interleaved BitMaps
+### Interleaved Planar
+
+With `BMF_INTERLEAVED`, memory is organized as:

-With `BMF_INTERLEAVED`, all planes are stored sequentially row by row:
 ```
-Row 0, Plane 0: 40 bytes
-Row 0, Plane 1: 40 bytes
-Row 0, Plane 2: 40 bytes
-Row 0, Plane 3: 40 bytes
-Row 1, Plane 0: 40 bytes
+BytesPerRow = 40 * 4 = 160   /* covers all planes for one row */
+
+Row 0: [Plane0 bytes][Plane1 bytes][Plane2 bytes][Plane3 bytes]
+Row 1: [Plane0 bytes][Plane1 bytes][Plane2 bytes][Plane3 bytes]
 ...
 ```

-BytesPerRow becomes `40 × Depth = 160`, and each `Planes[i]` pointer is offset by `i * 40` from the base. This layout is more cache-friendly and allows single-pass blits.
+Pointer arithmetic:
+```c
+/* Planes[i] points to Plane i's data within the interleaved block */
+Planes[0] = base;
+Planes[1] = base + 40;
+Planes[2] = base + 80;
+Planes[3] = base + 120;
+
+/* Accessing pixel (x, y) in plane p: */
+UBYTE *row = base + y * 160;
+UBYTE byte = row[p * 40 + x / 8];
+```
+
+**Why interleave?**
+- **Cache efficiency**: All planes for a row are contiguous
+- **Blitter speed**: Single modulo value advances to next row; fewer setup registers
+- **ScrollRaster**: `ScrollRaster()` scrolls by blitting (it is not a hardware scroll), so it benefits from moving all planes of a row together
+
+**Trade-off**: Interleaved BitMaps are harder to manipulate with custom CPU rendering because plane pointers are not independent.
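
The pointer arithmetic for the interleaved case can be sketched on the host as follows. The constants match the 320×256×4 example; `setup_interleaved` and `interleaved_offset` are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PLANE_BYTES = 40, DEPTH = 4, ROW_STRIDE = PLANE_BYTES * DEPTH };

/* Fill planes[] with pointers into one interleaved block. */
void setup_interleaved(uint8_t *base, uint8_t *planes[DEPTH])
{
    assert(base != NULL);
    for (int p = 0; p < DEPTH; p++)
        planes[p] = base + (size_t)p * PLANE_BYTES;
}

/* Byte offset (from the block base) of the byte holding pixel (x, y)
   in plane p. */
size_t interleaved_offset(unsigned x, unsigned y, unsigned p)
{
    assert(p < DEPTH);
    return (size_t)y * ROW_STRIDE + (size_t)p * PLANE_BYTES + (x >> 3);
}
```

Because each `planes[p]` already carries the `p * 40` offset, indexing `planes[p][y * 160 + x / 8]` reaches the same byte as `base + interleaved_offset(x, y, p)`. This is why code written against `Planes[]` with `BytesPerRow` as the stride works on both layouts.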

 ---

-## AGA 8-Bit Bitmaps
+## API Reference
+
+### Allocation

-AGA (A1200/A4000) supports up to 8 bitplanes = 256 colours:
 ```c
-struct BitMap *bm = AllocBitMap(320, 256, 8, BMF_CLEAR | BMF_DISPLAYABLE, NULL);
-/* 8 planes × 10,240 = 81,920 bytes of Chip RAM */
+/* OS 3.0+ — preferred method */
+struct BitMap *AllocBitMap(ULONG width, ULONG height, ULONG depth,
+                           ULONG flags, struct BitMap *friend);
+
+/* Free */
+void FreeBitMap(struct BitMap *bm);
 ```

+| Parameter | Description |
+|---|---|
+| `width` | Width in pixels |
+| `height` | Height in pixels |
+| `depth` | Number of bitplanes (1–8) |
+| `flags` | `BMF_*` flags |
+| `friend` | Optional "friend" BitMap for compatibility (usually `NULL` or a screen's BitMap) |
+
+> [!WARNING]
+> **Requires Chip RAM**: When `BMF_DISPLAYABLE` is set, `AllocBitMap()` allocates from Chip RAM (`MEMF_CHIP`). The Blitter, Copper, bitplane DMA, and sprite DMA cannot access Fast RAM. Pointing display hardware at a Fast RAM BitMap produces silent corruption.
+
+### Manual Allocation (OS 1.x Compatible)
+
+```c
+struct BitMap bm;
+InitBitMap(&bm, 4, 320, 256);
+for (int i = 0; i < 4; i++)
+    bm.Planes[i] = AllocRaster(320, 256);  /* AllocMem(..., MEMF_CHIP) */
+
+/* Free: */
+for (int i = 0; i < 4; i++)
+    FreeRaster(bm.Planes[i], 320, 256);
+```
+
+### RastPort Relationship
+
+A `RastPort` is the drawing context; it contains a pointer to a `BitMap` plus pen, draw mode, and layer state:
+
+```c
+/* graphics/rastport.h — NDK 3.9 */
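+struct RastPort {
+    struct Layer    *Layer;
+    struct BitMap   *BitMap;     /* Target bitmap for drawing */
+    UWORD           *AreaPtrn;   /* Areafill pattern pointer */
+    struct TmpRas   *TmpRas;
+    struct AreaInfo *AreaInfo;
+    struct GelsInfo *GelsInfo;
+    UBYTE  Mask;                 /* Write mask for this raster */
+    BYTE   FgPen;                /* Foreground pen (set with SetAPen) */
+    BYTE   BgPen;                /* Background pen (set with SetBPen) */
+    BYTE   AOlPen;               /* Areafill outline pen */
+    BYTE   DrawMode;             /* JAM1, JAM2, COMPLEMENT, INVERSVID */
+    BYTE   AreaPtSz;             /* 2^n words for areafill pattern */
+    BYTE   linpatcnt;
+    BYTE   dummy;
+    UWORD  Flags;
+    UWORD  LinePtrn;             /* 16 bits for textured lines */
+    WORD   cp_x, cp_y;           /* Current pen position */
+    /* ... additional fields (minterms, font, text state) ... */
+};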
+```
+
+```c
+/* Initialize a RastPort for a BitMap */
+struct RastPort rp;
+InitRastPort(&rp);
+rp.BitMap = myBitMap;
+
+/* Now draw: */
+SetAPen(&rp, 3);
+Move(&rp, 10, 10);
+Draw(&rp, 100, 50);   /* Line rendered into myBitMap */
+```
+
+---
+
+## Decision Guide: Standard vs Interleaved
+
+| Criterion | Standard Planar | Interleaved |
+|---|---|---|
+| **When to use** | Custom CPU rendering, per-plane effects, easy pointer math | Blitter-heavy code, scrolling, OS-friendly rendering |
+| **BytesPerRow** | `width / 8` (rounded up) | `width / 8 * depth` |
+| `AllocBitMap()` flag | None (default) | `BMF_INTERLEAVED` |
+| `ScrollRaster()` | Requires manual plane loop | Works with single call |
+| `BltBitMap()` | Multiple blits or per-plane loops | Single blit with modulo |
+| **CPU pixel access** | Simple: `Planes[p][offset]` | Complex: `base + row * bpr * depth + p * bpr` |
+| **Memory fragmentation** | `depth` separate allocations | One contiguous block |
+| **Display hardware** | Identical — DMA doesn't care | Identical — DMA doesn't care |
+
+---
+
+## Practical Examples
+
+### Example 1: Allocate and Clear a Displayable BitMap
+
+```c
+#include <graphics/gfx.h>
+#include <proto/graphics.h>
+
+struct BitMap *CreateDisplayBitMap(ULONG width, ULONG height, ULONG depth)
+{
+    struct BitMap *bm = AllocBitMap(width, height, depth,
+                                    BMF_CLEAR | BMF_DISPLAYABLE,
+                                    NULL);
+    if (!bm)
+    {
+        /* Out of Chip RAM — this is common on stock A500 */
+        return NULL;
+    }
+
+    /* Verify allocation succeeded for all requested planes */
+    for (int i = 0; i < depth; i++)
+    {
+        if (!bm->Planes[i])
+        {
+            FreeBitMap(bm);
+            return NULL;
+        }
+    }
+
+    return bm;
+}
+```
+
+### Example 2: CPU Pixel Plot (Standard Planar)
+
+```c
+void PutPixel(struct BitMap *bm, WORD x, WORD y, UBYTE color)
+{
+    if (x < 0 || x >= bm->BytesPerRow * 8 || y < 0 || y >= bm->Rows)
+        return;
+
+    ULONG byteOffset = (ULONG)y * bm->BytesPerRow + (x >> 3);   /* ULONG: a UWORD wraps once a plane exceeds 64 KB */
+    UBYTE bitMask    = 0x80 >> (x & 7);   /* bit 7 = leftmost pixel */
+
+    for (int p = 0; p < bm->Depth; p++)
+    {
+        if (color & (1 << p))
+            bm->Planes[p][byteOffset] |= bitMask;   /* Set bit */
+        else
+            bm->Planes[p][byteOffset] &= ~bitMask;  /* Clear bit */
+    }
+}
+```
+
+### Example 3: CPU Pixel Plot (Interleaved)
+
+```c
+void PutPixelInterleaved(struct BitMap *bm, WORD x, WORD y, UBYTE color)
+{
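+    if (!(bm->Flags & BMF_INTERLEAVED)) return;
+
+    UWORD rowBytes   = bm->BytesPerRow;        /* total per row, all planes */
+    UWORD planeBytes = rowBytes / bm->Depth;   /* bytes per plane per row */
+    if (x < 0 || x >= planeBytes * 8 || y < 0 || y >= bm->Rows)
+        return;                                /* same bounds check as Example 2 */
+
+    ULONG byteOffset = (ULONG)y * rowBytes + (x >> 3);   /* ULONG avoids 16-bit wrap */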
+    UBYTE bitMask    = 0x80 >> (x & 7);
+
+    for (int p = 0; p < bm->Depth; p++)
+    {
+        UBYTE *planePtr = bm->Planes[0] + byteOffset + p * planeBytes;
+        if (color & (1 << p))
+            *planePtr |= bitMask;
+        else
+            *planePtr &= ~bitMask;
+    }
+}
+```
+
+### Example 4: Blitter Copy Between BitMaps
+
+```c
+/* Copy a rectangle from source to destination */
+void CopyRect(struct BitMap *src, WORD sx, WORD sy,
+              struct BitMap *dst, WORD dx, WORD dy,
+              WORD width, WORD height)
+{
+    BltBitMap(src, sx, sy, dst, dx, dy, width, height,
+              0xC0,        /* minterm: straight copy of source to destination */
+              0xFF,        /* mask: all planes (0x01 would copy plane 0 only) */
+              NULL);       /* TempA: no temporary buffer */
+}
+```
+
+### Example 5: Attach BitMap to ViewPort
+
+```c
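+struct ViewPort vp;
+struct RasInfo  ri;
+struct BitMap *bm = CreateDisplayBitMap(320, 256, 5);
+
+/* Describe the bitmap first so vp.RasInfo points at valid data */
+ri.BitMap   = bm;
+ri.RxOffset = 0;
+ri.RyOffset = 0;
+ri.Next     = NULL;
+
+InitVPort(&vp);
+vp.DWidth   = 320;
+vp.DHeight  = 256;
+vp.DxOffset = 0;
+vp.DyOffset = 0;
+vp.RasInfo  = &ri;
+vp.Modes    = SPRITES;   /* 320 pixels wide is low resolution, so HIRES is not set */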
+
+---
+
+## When to Use / When NOT to Use
+
+### When to Use AllocBitMap
+
+| Scenario | Why It Fits |
+|---|---|
+| **OS 3.0+ application** | `AllocBitMap()` handles alignment, Chip RAM, and friend-BitMap compatibility |
+| **Off-screen buffers** | Allocate non-displayable BitMaps for pre-rendering, then blit to screen |
+| **Double buffering** | Two displayable BitMaps swapped per frame via `ChangeVPBitMap()` |
+| **Interleaved scrolling** | `BMF_INTERLEAVED` + `ScrollRaster()` is the correct path for smooth scroll |
+
+### When NOT to Use AllocBitMap / Manual BitMaps
+
+| Scenario | Problem | Better Approach |
+|---|---|---|
+| **Direct hardware banging** | `AllocBitMap()` may allocate structures you don't need | Direct `AllocMem(MEMF_CHIP)` and manual `Planes[]` setup |
+| **Copper-only displays** | If the CPU never draws, a raw bitplane array is sufficient | Manual `AllocRaster()` per plane |
+| **Chunky-to-planar rendering** | C2P output needs specific plane alignment; `AllocBitMap` may not match | Allocate manually with `MEMF_CHIP` and verify alignment |
+| **Custom DMA tricks** | Some demo effects need non-standard `BytesPerRow` or plane spacing | Manual allocation with exact sizes |
+
+---
+
+## Best Practices & Antipatterns
+
+### Best Practices
+
+1. **Always check `AllocBitMap()` return value** — Chip RAM exhaustion is common on stock machines.
+2. **Verify all `Planes[]` are non-NULL** — `BMF_MINPLANES` can leave upper planes unset.
+3. **Use `TypeOfMem(bm->Planes[0])` to confirm Chip RAM** when debugging DMA issues.
+4. **Round width up to 16-pixel boundaries** for Blitter efficiency: `width = ((width + 15) / 16) * 16`.
+5. **Prefer interleaved for scrolling games** — `ScrollRaster()` and the Blitter work optimally.
+6. **Use standard planar for per-plane effects** — color cycling, parallax, and palette tricks are easier.
+7. **Free with `FreeBitMap()`** if allocated with `AllocBitMap()` — do not mix manual and automatic free.
+8. **Set `friend` BitMap when possible** — improves compatibility with graphics cards and RTG systems.
+
+### Antipatterns
+
+#### 1. The Odd-Width Trap
+
+```c
+/* ANTIPATTERN: hand-computed BytesPerRow that is not even */
+struct BitMap bm;
+InitBitMap(&bm, 4, 321, 200);      /* InitBitMap itself rounds up to a whole word */
+bm.BytesPerRow = (321 + 7) / 8;    /* 41 bytes (odd!) overrides the safe value */
+/* Blitter requires even alignment; DMA may corrupt adjacent memory */
+
+/* CORRECT: round up to a whole 16-bit word */
+bm.BytesPerRow = ((321 + 15) / 16) * 2;   /* 42 bytes: even, safe */
+```
+
+#### 2. The Fast RAM BitMap
+
+```c
+/* ANTIPATTERN — allocating display BitMap in Fast RAM */
+struct BitMap bm;
+InitBitMap(&bm, 4, 320, 256);
+for (int i = 0; i < 4; i++)
+    bm.Planes[i] = AllocMem(10240, MEMF_ANY);  /* May return Fast RAM! */
+
+/* CORRECT — force Chip RAM for displayable bitmaps */
+for (int i = 0; i < 4; i++)
+    bm.Planes[i] = AllocMem(10240, MEMF_CHIP | MEMF_CLEAR);
+```
+
+#### 3. The Uninitialized RastPort
+
+```c
+/* ANTIPATTERN — using a RastPort without initialization */
+struct RastPort rp;
+rp.BitMap = myBitMap;
+Move(&rp, 0, 0);   /* rp contains garbage pen, mode, layer ptr → crash */
+
+/* CORRECT — always InitRastPort() */
+struct RastPort rp;
+InitRastPort(&rp);
+rp.BitMap = myBitMap;
+```
+
+---
+
+## Pitfalls & Common Mistakes
+
+### 1. Modulo Misalignment in Blitter Operations
+
+The Blitter uses **word-aligned** addressing. If `BytesPerRow` is odd, or if you compute offsets incorrectly for interleaved BitMaps, the Blitter wraps to the wrong memory location.
+
+```c
+/* PITFALL — wrong modulo for interleaved BitMap */
+struct BitMap *bm = AllocBitMap(320, 256, 4, BMF_INTERLEAVED, NULL);
+/* BytesPerRow = 40 * 4 = 160 */
+
+/* If you step between rows of one plane by the per-plane width (40 bytes),
+   each step lands in the next plane's data instead of the next row. */
+
+/* CORRECT: the row stride within a plane is the total row bytes; a Blitter
+   modulo register then holds this stride minus the blit width in bytes */
+UWORD stride = bm->BytesPerRow;   /* 160, not 40 */
+```
+
+### 2. Depth Mismatch in BltBitMap
+
+```c
+/* PITFALL — copying from 5-plane to 3-plane BitMap */
+BltBitMap(src, 0, 0, dst, 0, 0, 320, 200, 0xC0, 0x1F, NULL);
+/* If dst->Depth < 5, the mask 0x1F names planes that dst does not have */
+
+/* CORRECT — ensure destination depth >= source depth, or mask
+   to only the planes that exist: */
+UBYTE planeMask = (1 << dst->Depth) - 1;
+BltBitMap(src, 0, 0, dst, 0, 0, 320, 200, 0xC0, planeMask, NULL);
+```
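
The mask computation generalizes to any pair of depths. The sketch below is a host-side illustration with a hypothetical helper name, `common_plane_mask`; it selects only the planes that exist in both bitmaps, which is the same idea as the `planeMask` line above applied to the smaller of the two depths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one mask bit per plane present in BOTH bitmaps. */
uint8_t common_plane_mask(uint8_t src_depth, uint8_t dst_depth)
{
    assert(src_depth >= 1 && dst_depth >= 1);

    uint8_t depth = src_depth < dst_depth ? src_depth : dst_depth;
    if (depth > 8)
        depth = 8;   /* a BitMap holds at most 8 plane pointers */

    return (uint8_t)((1u << depth) - 1u);
}
```

A 5-plane source and 3-plane destination yield `0x07`, so only planes 0 to 2 take part in the copy.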
+
+### 3. Forgetting to Clear Planes on Manual Allocations
+
+```c
+/* PITFALL — uninitialized planes contain garbage */
+struct BitMap bm;
+InitBitMap(&bm, 4, 320, 256);
+for (int i = 0; i < 4; i++)
+    bm.Planes[i] = AllocRaster(320, 256);
+/* Planes contain random data → visual garbage on first display */
+
+/* CORRECT — clear after allocation */
+for (int i = 0; i < 4; i++)
+    memset(bm.Planes[i], 0, bm.BytesPerRow * bm.Rows);
+```
+
+### 4. Width vs BytesPerRow Confusion
+
+```c
+/* PITFALL — using width in bytes where pixels are expected */
+UWORD widthInBytes = bm->BytesPerRow;   /* 40 bytes = 320 pixels */
+BltBitMap(src, 0, 0, dst, 0, 0, widthInBytes, 200, ...);
+/* BltBitMap expects PIXELS, not bytes! Copies only 40 pixels. */
+
+/* CORRECT */
+BltBitMap(src, 0, 0, dst, 0, 0, 320, 200, ...);
+```
+
+---
+
+## Use Cases
+
+| Software Pattern | BitMap Approach | Why |
+|---|---|---|
+| **Scrolling platformer** | Interleaved, `ScrollRaster()` | One blit per scroll step moves all planes together |
+| **Double-buffered animation** | Two standard BitMaps + `ChangeVPBitMap()` | Clean VBlank swap with no tearing |
+| **Parallax background** | Multiple standard BitMaps at different depths | Independent scroll per layer via separate ViewPorts |
+| **Off-screen sprite preshift** | Standard planar, 16 copies per frame | CPU pre-renders shifted frames for fast Blitter copy |
+| **Chunky-rendered 3D** | Standard planar + C2P conversion | Render in Fast RAM chunky buffer, convert to BitMap planes |
+| **Color-cycling water/sky** | Standard planar, modify one plane only | Animate palette index via plane manipulation |
+
+---
+
+## Historical Context & Modern Analogies
+
+### Why Planar?
+
+In 1985, DRAM bandwidth was the bottleneck. A 320×200 display at 60 Hz requires:
+
+| Format | Bits/Pixel | Bytes/Frame | DMA Bandwidth |
+|---|---|---|---|
+| **Chunky 256-color** | 8 | 64,000 | ~3.8 MB/s |
+| **Planar 32-color (5 planes)** | 5 | 40,000 | ~2.4 MB/s |
+| **Planar 16-color (4 planes)** | 4 | 32,000 | ~1.9 MB/s |
+| **Planar 8-color (3 planes)** | 3 | 24,000 | ~1.4 MB/s |
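
The table's figures follow from simple arithmetic, sketched here as host-side C. The function names are hypothetical, and the calculation ignores blanking intervals and other DMA traffic, as the table does.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one frame at the given bits per pixel. */
uint32_t frame_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    assert(bits_per_pixel >= 1u);
    return width * height * bits_per_pixel / 8u;
}

/* Display fetch bandwidth in bytes per second at a given refresh rate. */
uint32_t bytes_per_second(uint32_t width, uint32_t height,
                          uint32_t bits_per_pixel, uint32_t hz)
{
    return frame_bytes(width, height, bits_per_pixel) * hz;
}
```

For 320×200 at 4 bits per pixel and 60 Hz this gives 32,000 bytes per frame and 1,920,000 bytes per second, which the table rounds to ~1.9 MB/s.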
+
+Planar layout reduces DMA bandwidth proportionally to color depth. It also enables tricks impossible in chunky:
+- **Color cycling**: Change palette entries, not pixels
+- **Parallax**: Scroll one plane while others stay fixed
+- **Transparency**: Omit a plane to see through to background
+
+### Modern Analogies
+
+| Amiga Planar Concept | Modern Equivalent | Shared Concept |
+|---|---|---|
+| `BitMap` + `Planes[]` | **OpenGL texture array / Vulkan image layers** | Multiple memory planes composited into final pixel |
+| Interleaved planar | **GPU tile-based render target** | Contiguous per-row layout for cache efficiency |
+| `BytesPerRow` | **Vulkan `VkSubresourceLayout.rowPitch`** | Stride between scanlines, often larger than width |
+| `AllocBitMap(BMF_DISPLAYABLE)` | **`vkAllocateMemory` with `VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT`** | Explicit allocation in GPU-visible (DMA-able) memory |
+| `BltBitMap()` | **OpenGL `glBlitFramebuffer`** | Hardware-accelerated rectangular copy between surfaces |
+| Planar color cycling | **Palette-based texture animation** | Modify lookup table instead of texels for animated effects |
+
+### Where Analogies Break Down
+
+- **No alpha channel**: Planar pixels are color indices, not RGBA. Transparency comes from color 0 (in sprites and dual-playfield mode) or from Blitter cookie-cut masks, not from per-pixel alpha.
+- **No arbitrary pixel writes**: Drawing a single pixel requires RMW across all planes — no simple `framebuffer[y*w+x] = color`.
+- **Blitter as coprocessor**: Unlike modern GPUs, the Blitter is a fixed-function DMA engine with no shader programmability.
+
+---
+
+## FAQ
+
+**Q: Can I use `AllocBitMap()` on OS 1.3?**
+> No. `AllocBitMap()` was introduced in OS 3.0 (V39). On 1.3 and 2.x, use `InitBitMap()` + `AllocRaster()` / `AllocMem(MEMF_CHIP)`.
+
+**Q: How do I detect if a BitMap is interleaved?**
+> Check `bm->Flags & BMF_INTERLEAVED`. If set, `bm->BytesPerRow` is the total bytes per row across all planes.
+
+**Q: Why does my Blitter copy look garbled?**
+> Most likely causes: (1) `BytesPerRow` is odd, (2) source/destination BitMaps are not in Chip RAM, (3) modulo values are wrong for interleaved layout, (4) plane mask includes nonexistent planes.
+
+**Q: Can I draw directly into a screen's BitMap?**
+> Yes, but only through a `RastPort` obtained from the window (`win->RPort`) or by creating your own RastPort pointing to the screen's BitMap. Never write to `screen->RastPort.BitMap` directly without proper locking, because doing so bypasses layer clipping.
+
+**Q: What is the maximum BitMap size?**
+> Theoretical: 32767×32767 (16-bit Rows/BytesPerRow). Practical: Chip RAM limits. A 640×512×8 BitMap consumes 320 KB — large but feasible on a 2 MB Chip RAM AGA machine.
+
+---
+
+## Impact on FPGA/Emulation
+
+For MiSTer and emulator developers, planar BitMap emulation has specific requirements:
+
+- **Bitplane DMA must honor the row stride**: The hardware never sees `struct BitMap`. Agnus/Alice fetches each plane's data for a scanline and then applies the modulo registers (`BPL1MOD`/`BPL2MOD`), which the OS derives from `BytesPerRow`. Emulators must implement the modulo step correctly, not assume contiguous layout.
+- **Interleaved is a software convention**: The hardware does not distinguish interleaved from standard — it only sees plane pointers. Interleaving is achieved by software setting `Planes[i]` to offsets within a single block.
+- **Alignment enforcement**: `AllocBitMap()` ensures even `BytesPerRow` and proper Chip RAM alignment. Emulators need not enforce this for manually constructed BitMaps, but should document that misaligned BitMaps produce undefined behavior on real hardware.
+- **AGA 64-bit fetches**: Alice can fetch 64 bits (8 bytes) per DMA cycle when `FMODE` is set. Emulators must support wider fetches for correct AGA high-resolution modes.
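
As an illustration of the "hardware only sees plane pointers" point, the sketch below converts one scanline from per-plane rows into one color index per pixel, roughly what an emulator's compositor stage has to produce. `planar_row_to_chunky` is a hypothetical name; the sketch ignores fetch timing, scroll delay, and dual-playfield or HAM decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one scanline to one color index per pixel.
   plane_rows[p] points at the first byte of this scanline in plane p,
   so the same code serves standard and interleaved layouts. */
void planar_row_to_chunky(const uint8_t *const plane_rows[], unsigned depth,
                          size_t width_px, uint8_t *chunky_out)
{
    assert(depth >= 1u && depth <= 8u);

    for (size_t x = 0; x < width_px; x++) {
        unsigned shift = 7u - (unsigned)(x & 7u);   /* bit 7 = leftmost pixel */
        uint8_t  index = 0;

        for (unsigned p = 0; p < depth; p++)
            index |= (uint8_t)(((plane_rows[p][x >> 3] >> shift) & 1u) << p);

        chunky_out[x] = index;
    }
}
```

The caller advances each `plane_rows[p]` by its own stride between scanlines, which is where the per-plane modulo would be applied in a fuller model.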
+
 ---

 ## References

- NDK39: `graphics/gfx.h`
- ADCD 2.1: `AllocBitMap`, `FreeBitMap`, `InitBitMap`
- HRM: *Amiga Hardware Reference Manual* — bitplane DMA chapter
- See also: [memory_types.md](../01_hardware/common/memory_types.md) — why bitmaps must be in Chip RAM (DMA accessibility)
+- NDK 3.9: `graphics/gfx.h`, `graphics/rastport.h`
+- ADCD 2.1: `AllocBitMap()`, `FreeBitMap()`, `InitBitMap()`, `InitRastPort()`, `BltBitMap()`
+- *Amiga Hardware Reference Manual* — Bitplane DMA chapter
+- See also: [memory_types.md](../01_hardware/common/memory_types.md) — Chip RAM requirements for DMA-visible BitMaps
+- See also: [blitter.md](blitter.md) — Blitter DMA operations on BitMaps
+- See also: [blitter_programming.md](blitter_programming.md) — Advanced Blitter minterms and cookie-cut
+- See also: [views.md](views.md) — Attaching BitMaps to ViewPorts for display
+- See also: [rastport.md](rastport.md) — RastPort drawing context and primitives
--- a/08_graphics/blitter_programming.md
+++ b/08_graphics/blitter_programming.md
@ -568,7 +568,7 @@ BltBitMap(srcBM, 0, 0,       /* source bitmap, x, y */
          NULL);             /* no temp buffer needed */

 /* Draw a filled rectangle (uses the Blitter internally): */
-SetAPen(rp, 3);              /* Set pen colour to index 3 */
+SetAPen(rp, 3);              /* Set pen color to index 3 */
 RectFill(rp, 10, 10, 100, 50); /* Filled rectangle */
 ```

--- a/08_graphics/copper.md
+++ b/08_graphics/copper.md
@ -4,7 +4,7 @@

 ## Overview

-The **Copper** is a simple coprocessor in the Amiga custom chips that executes a list of instructions synchronised to the video beam. It can write to any custom chip register at any beam position, enabling per-scanline colour changes, split screens, and hardware-level display effects without CPU intervention.
+The **Copper** is a simple coprocessor in the Amiga custom chips that executes a list of instructions synchronized to the video beam. It can write to any custom chip register at any beam position, enabling per-scanline color changes, split screens, and hardware-level display effects without CPU intervention.

 ---

@ -45,7 +45,7 @@ If the beam has already passed the specified position, skip the next instruction

 ## Standard Copper Patterns

-### Per-Scanline Colour Change (Rainbow)
+### Per-Scanline Color Change (Rainbow)

 ```
 WAIT   $2C01,$FFFE    ; wait for line $2C (44)
@ -75,7 +75,7 @@ The OS manages copper lists through `GfxBase`:

 | Pointer | Description |
 |---|---|
-| `GfxBase->copinit` | System initialisation copper list |
+| `GfxBase->copinit` | System initialization copper list |
 | `GfxBase->LOFlist` | Long-frame copper list (even fields) |
 | `GfxBase->SHFlist` | Short-frame copper list (odd fields, interlace) |

--- a/08_graphics/copper_programming.md
+++ b/08_graphics/copper_programming.md
@ -18,7 +18,7 @@ graph LR
        Beam["Beam Counter"]
    end
    subgraph Denise/Lisa ["Denise / Lisa Chip"]
-        Palette["Colour Registers"]
+        Palette["Color Registers"]
        BPL["Bitplane Control"]
        SPR["Sprite Control"]
    end
@ -37,14 +37,14 @@ graph LR
 **Key points:**
 - The Copper reads its program from **Chip RAM** via DMA — no CPU involvement
 - It writes directly to custom chip registers (the same `$DFF000–$DFF1FE` space)
- It synchronises with the **beam counter** — it knows exactly where the electron beam is
+- It synchronizes with the **beam counter** — it knows exactly where the electron beam is
 - The CPU can modify the copper list in memory at any time; changes take effect next frame

 ### What the Copper Can Do

 | Capability | How | Typical Use |
 |---|---|---|
-| **Per-line colour changes** | WAIT for line → MOVE to COLORxx | Gradient skies, rainbow bars, water effects |
+| **Per-line color changes** | WAIT for line → MOVE to COLORxx | Gradient skies, rainbow bars, water effects |
 | **Split-screen displays** | Change bitplane pointers mid-frame | Status bar + scrolling game area |
 | **Parallax scrolling** | Change BPLCON1 scroll offset at different lines | Multi-layer side-scrollers |
 | **Resolution mixing** | Change BPLCON0 mid-frame | HiRes title bar + LoRes gameplay |
@ -61,7 +61,7 @@ graph LR
 | No branching/loops | Executes linearly top-to-bottom; no jumps or calls |
 | No memory read | Can only WRITE to registers — cannot read anything |
 | No CPU memory access | Writes only to custom chip registers (`$DFF000`+), not RAM or CIA |
-| No sub-pixel timing | Horizontal resolution: 4 colour clocks (~8 low-res pixels) |
+| No sub-pixel timing | Horizontal resolution: 4 color clocks (~8 low-res pixels) |
 | V counter wraps at 255 | PAL lines 256–311 require a double-WAIT trick |
 | Chip RAM only | The copper list itself must reside in Chip RAM (DMA-accessible) |

@ -69,9 +69,9 @@ graph LR

 **AmigaOS** — `graphics.library` builds the system copper list automatically when you call `MakeVPort()` / `LoadView()`. This list sets up bitplane pointers, sprite pointers, display window, and palette for every ViewPort. User code adds instructions via `UCopList`.

-**Games (system takeover)** — Disable the OS display system, point COP1LC to your own copper list, and have total control. The copper list typically sets up the display, changes colours per line, and handles sprite multiplexing.
+**Games (system takeover)** — Disable the OS display system, point COP1LC to your own copper list, and have total control. The copper list typically sets up the display, changes colors per line, and handles sprite multiplexing.

-**Demos** — Push the Copper to its limits: hundreds of colour changes per frame, dynamic copper list generation, and tricks like "copper bars" (changing colours mid-scanline using horizontal WAITs).
+**Demos** — Push the Copper to its limits: hundreds of color changes per frame, dynamic copper list generation, and tricks like "copper bars" (changing colors mid-scanline using horizontal WAITs).

 ---
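
The WAIT-then-MOVE pattern and the V-counter wrap described in the tables above can be sketched in C. This is a minimal host-side sketch, not code from the repository: the `CopBuilder` type and the `cop_*` helper names are hypothetical, and the code only fills a plain `uint16_t` buffer. On a real Amiga that buffer would have to sit in Chip RAM and be installed through COP1LC. It also assumes WAITs are emitted in ascending line order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG_COLOR00 0x0180u   /* COLOR00 offset from $DFF000 */

/* Hypothetical builder state for one copper list. */
typedef struct {
    uint16_t *words;    /* destination buffer (Chip RAM on real hardware) */
    size_t    cap;      /* capacity in 16-bit words */
    size_t    count;    /* words emitted so far */
    int       past255;  /* set once the line-255 wrap WAIT has been emitted */
} CopBuilder;

/* Every Copper instruction is two 16-bit words. */
static void cop_emit(CopBuilder *b, uint16_t w0, uint16_t w1)
{
    assert(b->count + 2 <= b->cap);
    b->words[b->count++] = w0;
    b->words[b->count++] = w1;
}

/* MOVE: word 0 = even register offset, word 1 = data to write. */
static void cop_move(CopBuilder *b, uint16_t reg, uint16_t value)
{
    assert((reg & 1u) == 0 && reg < 0x0200u);
    cop_emit(b, reg, value);
}

/* WAIT: word 0 = VP<<8 | HP | 1, word 1 = $FFFE (compare all bits). */
static void cop_wait(CopBuilder *b, unsigned vpos, unsigned hpos)
{
    assert(vpos <= 0xFFu && hpos <= 0xFEu);
    cop_emit(b, (uint16_t)((vpos << 8) | (hpos & 0xFEu) | 1u), 0xFFFEu);
}

/* Wait for the start of a scanline, inserting the $FFDF,$FFFE wrap WAIT
   once before the first line above 255 (the "double-WAIT trick"). */
static void cop_wait_line(CopBuilder *b, unsigned line)
{
    assert(line < 312u);
    if (line > 255u && !b->past255) {
        cop_wait(b, 0xFFu, 0xDEu);
        b->past255 = 1;
    }
    cop_wait(b, line & 0xFFu, 0u);
}

/* End of list: a WAIT for a beam position that is never reached. */
static void cop_end(CopBuilder *b)
{
    cop_emit(b, 0xFFFFu, 0xFFFEu);
}
```

With a 16-word buffer, calling `cop_wait_line(&b, 47)`, `cop_move(&b, REG_COLOR00, 0x0C30)`, `cop_wait_line(&b, 300)` and `cop_end(&b)` emits `$2F01,$FFFE`, `$0180,$0C30`, `$FFDF,$FFFE`, `$2C01,$FFFE`, `$FFFF,$FFFE`. Keeping the wrap state in the builder means callers never have to remember where line 255 falls.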

@ -150,15 +150,15 @@ Full PAL:   312 lines, but copper V wraps at 256

 ## Complete Examples

-### Example 1: Rainbow Bars (Colour Per Scanline)
+### Example 1: Rainbow Bars (Color Per Scanline)

 ```asm
-; copperlist.s — 256-colour rainbow using Copper
+; copperlist.s — 256-color rainbow using Copper
    SECTION copperlist,DATA_C    ; MUST be in Chip RAM!

 CopperList:
    ; Set up a basic display first
-    dc.w    $0100, $1200    ; BPLCON0: 1 bitplane, colour on
+    dc.w    $0100, $1200    ; BPLCON0: 1 bitplane, color on
    dc.w    $0092, $0038    ; DDFSTRT
    dc.w    $0094, $00D0    ; DDFSTOP
    dc.w    $008E, $2C81    ; DIWSTRT
@ -177,7 +177,7 @@ CopperList:
    dc.w    $2F01, $FFFE    ; WAIT line 47
    dc.w    $0180, $0C30    ; COLOR00 = yellow-orange

-    ; ... repeat for each line with incrementing colours ...
+    ; ... repeat for each line with incrementing colors ...

    dc.w    $FFFF, $FFFE    ; end of copper list

@ -275,8 +275,8 @@ FreeCopList(ucl);

 | Item | Cycles |
 |---|---|
-| Each Copper instruction | 4 colour clocks (= 8 low-res pixels) |
-| WAIT resolution (horizontal) | 4 colour clocks minimum |
+| Each Copper instruction | 4 color clocks (= 8 low-res pixels) |
+| WAIT resolution (horizontal) | 4 color clocks minimum |
 | Maximum instructions per line | ~112 (NTSC) / ~114 (PAL) |
 | PAL visible lines | 256 (lines 44–300) |
 | NTSC visible lines | 200 (lines 44–244) |
@ -304,7 +304,7 @@ Reposition sprites mid-frame to display more than 8 sprites:
 ### Copper-Driven Palette Animation

 ```asm
-    ; Animate copper list by modifying colour values each frame
+    ; Animate copper list by modifying color values each frame
    ; (DMA reads new values each frame automatically)
    ; Just update the data words in the copper list in Chip RAM
    move.w  d0, CopperList+6     ; modify the MOVE data word
--- a/08_graphics/display_modes.md
+++ b/08_graphics/display_modes.md
@ -4,7 +4,7 @@

 ## Overview

-The Amiga's display system evolved through three generations of custom chips: **OCS** (Original Chip Set, A1000/A500/A2000), **ECS** (Enhanced, A3000/A600), and **AGA** (Advanced Graphics Architecture, A1200/A4000). Each generation expanded resolution, colour depth, and display flexibility while maintaining backward compatibility.
+The Amiga's display system evolved through three generations of custom chips: **OCS** (Original Chip Set, A1000/A500/A2000), **ECS** (Enhanced, A3000/A600), and **AGA** (Advanced Graphics Architecture, A1200/A4000). Each generation expanded resolution, color depth, and display flexibility while maintaining backward compatibility.

 OS 3.0+ provides a **display database** that abstracts these capabilities. Applications query available modes by `ModeID` rather than hardcoding chipset-specific flags.

@ -15,15 +15,15 @@ OS 3.0+ provides a **display database** that abstracts these capabilities. Appli
 | Feature | OCS (Agnus/Denise) | ECS (Fat Agnus/Super Denise) | AGA (Alice/Lisa) |
 |---|---|---|---|
 | **Max Chip RAM** | 512 KB (8372) / 1 MB (8372A) | 2 MB (8375) | 2 MB (8374) |
-| **Bitplanes** | 6 (32 colours, lowres) | 6 | 8 (256 colours) |
+| **Bitplanes** | 6 (32 colors, lowres) | 6 | 8 (256 colors) |
 | **Palette entries** | 32 (4096 total, 12-bit RGB) | 32 (4096) | 256 (16.7M, 24-bit RGB) |
 | **Max lowres** | 320×256 (PAL) | 320×256 | 320×256 |
 | **Max hires** | 640×256 | 640×256 | 640×256 |
 | **Super hires** | — | 1280×256 | 1280×256 |
 | **Scan-doubled** | — | — | 640×512 non-interlaced |
-| **HAM** | HAM6 (4096 colours) | HAM6 | HAM8 (262,144 colours) |
-| **EHB** | EHB (64 colours) | EHB | EHB (superseded by 8 planes) |
-| **Sprites** | 8 × 16px × 3 colours | 8 × 16px × 3 colours | 8 × 16/32/64px × 3/15 colours |
+| **HAM** | HAM6 (4096 colors) | HAM6 | HAM8 (262,144 colors) |
+| **EHB** | EHB (64 colors) | EHB | EHB (superseded by 8 planes) |
+| **Sprites** | 8 × 16px × 3 colors | 8 × 16px × 3 colors | 8 × 16/32/64px × 3/15 colors |
 | **Fetch modes** | 1× | 1× | 1×, 2×, 4× (wider data bus) |
 | **Bandwidth** | 3.58 MHz pixel clock | 3.58/7.16/14.32 MHz | Up to 28.64 MHz (4× fetch) |

@ -40,8 +40,8 @@ Line frequency:     15,625 Hz
 Frame frequency:    50 Hz (25 Hz interlaced)
 Lines per frame:    312.5 (625 interlaced)
 Active lines:       ~256 (non-interlaced) / ~512 (interlaced)
-Colour clock:       3,546,895 Hz
-Pixel clock (lores): 7,093,790 Hz (1 pixel = 2 colour clocks)
+Color clock:       3,546,895 Hz
+Pixel clock (lores): 7,093,790 Hz (1 pixel = 2 color clocks)
 Pixel clock (hires): 14,187,580 Hz
 ```

@ -52,7 +52,7 @@ Line frequency:     15,734 Hz
 Frame frequency:    60 Hz (30 Hz interlaced)
 Lines per frame:    262.5 (525 interlaced)
 Active lines:       ~200 (non-interlaced) / ~400 (interlaced)
-Colour clock:       3,579,545 Hz
+Color clock:       3,579,545 Hz
 Pixel clock (lores): 7,159,090 Hz
 Pixel clock (hires): 14,318,180 Hz
 ```
@ -161,7 +161,7 @@ while ((modeID = NextDisplayInfo(modeID)) != INVALID_ID)
        GetDisplayInfoData(NULL, (UBYTE *)&mon, sizeof(mon),
                           DTAG_MNTR, modeID);

-        Printf("$%08lx: %ldx%ld, %ld colours, %s\n",
+        Printf("$%08lx: %ldx%ld, %ld colors, %s\n",
                modeID,
                dims.Nominal.MaxX - dims.Nominal.MinX + 1,
                dims.Nominal.MaxY - dims.Nominal.MinY + 1,
@ -203,7 +203,7 @@ The display system shares DMA bandwidth with other custom chips. Each scanline h
 | Blitter | Variable (steals from CPU) |
 | CPU | Whatever is left |

-> In high-resolution 4-plane mode, bitplane DMA alone consumes 80 words per line — nearly the entire available bandwidth. This is why OCS/ECS hires is limited to 4 planes (16 colours) and AGA needed wider fetch modes.
+> In high-resolution 4-plane mode, bitplane DMA alone consumes 80 words per line — nearly the entire available bandwidth. This is why OCS/ECS hires is limited to 4 planes (16 colors) and AGA needed wider fetch modes.

 ---

--- a/08_graphics/gfx_base.md
+++ b/08_graphics/gfx_base.md
@ -37,7 +37,7 @@ flowchart TD
 struct GfxBase {
    struct Library   LibNode;
    struct View     *ActiView;        /* currently active View */
-    struct copinit   *copinit;        /* system copper list initialisation */
+    struct copinit   *copinit;        /* system copper list initialization */
    LONG            *cia;             /* CIA base (deprecated) */
    LONG            *blitter;         /* blitter base (deprecated) */
    UWORD           *LOFlist;         /* long-frame copper list pointer */
@ -101,7 +101,7 @@ struct GfxBase *GfxBase = (struct GfxBase *)
 if (GfxBase->ChipRevBits0 & GFXF_AA_ALICE)
 {
    /* AGA chipset (A1200/A4000) */
-    /* 8-bit planar, 256 colours, 24-bit palette */
+    /* 8-bit planar, 256 colors, 24-bit palette */
 }
 else if (GfxBase->ChipRevBits0 & GFXF_HR_DENISE)
 {
@ -116,7 +116,7 @@ else if (GfxBase->ChipRevBits0 & GFXF_HR_AGNUS)
 else
 {
    /* OCS chipset (original A500/A1000/A2000) */
-    /* 512 KB Chip RAM, 4096 colour palette */
+    /* 512 KB Chip RAM, 4096 color palette */
 }
 ```

--- a/08_graphics/ham_ehb_modes.md
+++ b/08_graphics/ham_ehb_modes.md
@ -4,7 +4,7 @@

 ## Overview

-The Amiga offers two unique display modes that squeeze many more colours from limited bitplane hardware: **EHB** (Extra Half-Brite) and **HAM** (Hold-And-Modify). These modes have no direct equivalent on other platforms and are critical for understanding Amiga graphics capability and for FPGA implementation.
+The Amiga offers two unique display modes that squeeze many more colors from limited bitplane hardware: **EHB** (Extra Half-Brite) and **HAM** (Hold-And-Modify). These modes have no direct equivalent on other platforms and are critical for understanding Amiga graphics capability and for FPGA implementation.

 ---

@ -13,8 +13,8 @@ The Amiga offers two unique display modes that squeeze many more colours from li
 ### How It Works

 Uses **6 bitplanes** (64 possible values):
-- Bitplane values 0–31: index into the 32-colour palette normally
-- Bitplane values 32–63: display the colour from register (value − 32) at **half brightness** (all RGB components shifted right by 1)
+- Bitplane values 0–31: index into the 32-color palette normally
+- Bitplane values 32–63: display the color from register (value − 32) at **half brightness** (all RGB components shifted right by 1)

 ```mermaid
 flowchart LR
@ -32,12 +32,12 @@ flowchart LR
 Example pixel value = 37 (binary: 100101):
  Bit 5 = 1 → half-brite
  Bits 4-0 = 00101 = palette index 5
-  Output colour = palette[5] >> 1 (each R,G,B component halved)
+  Output color = palette[5] >> 1 (each R,G,B component halved)

 Example pixel value = 5 (binary: 000101):
  Bit 5 = 0 → normal
  Bits 4-0 = 00101 = palette index 5
-  Output colour = palette[5] (full brightness)
+  Output color = palette[5] (full brightness)
 ```

 ### Programming EHB
@ -51,16 +51,16 @@ struct Screen *scr = OpenScreenTags(NULL,
    SA_DisplayID, EXTRAHALFBRITE_KEY,
    TAG_DONE);

-/* Set the 32 base colours: */
-ULONG colours32[32 * 3 + 2];
-colours32[0] = 32 << 16;  /* count = 32, first = 0 */
+/* Set the 32 base colors: */
+ULONG colors32[32 * 3 + 2];
+colors32[0] = 32 << 16;  /* count = 32, first = 0 */
 /* ... fill RGB values ... */
-colours32[32 * 3 + 1] = 0;  /* terminator */
-LoadRGB32(&scr->ViewPort, colours32);
+colors32[32 * 3 + 1] = 0;  /* terminator */
+LoadRGB32(&scr->ViewPort, colors32);

 /* Pixels 0–31 use base palette directly.
   Pixels 32–63 are automatically half-brightness versions.
-   No need to set colours 32–63 — hardware does it. */
+   No need to set colors 32–63 — hardware does it. */
 ```

 ---
@ -111,25 +111,25 @@ flowchart LR
 ```

 > [!IMPORTANT]
-> **Each scanline starts fresh** — the first pixel of each line has no "previous pixel" to modify. The hardware resets to the background colour (register 0) at the start of each line. This is why HAM images often have a visible "colour ramp" at the left edge.
+> **Each scanline starts fresh** — the first pixel of each line has no "previous pixel" to modify. The hardware resets to the background color (register 0) at the start of each line. This is why HAM images often have a visible "color ramp" at the left edge.

 ### Practical Example — Encoding a HAM6 Scanline

-Suppose we want to display these colours on a scanline:
+Suppose we want to display these colors on a scanline:

 ```
 Target:  RGB(A,7,3) → RGB(A,7,F) → RGB(F,7,F) → RGB(F,0,8)

 Encoding:
-  Pixel 0: 00 xxxx (SET palette[n] = A,7,3)  → SET to base colour
+  Pixel 0: 00 xxxx (SET palette[n] = A,7,3)  → SET to base color
  Pixel 1: 01 1111 (MOD BLUE = F)            → A,7,3 → A,7,F  ✓
  Pixel 2: 10 1111 (MOD RED = F)              → A,7,F → F,7,F  ✓
-  Pixel 3: 00 xxxx (SET palette[m] = F,0,8)  → SET to nearest base colour
+  Pixel 3: 00 xxxx (SET palette[m] = F,0,8)  → SET to nearest base color

 Note: pixel 3 needs to change ALL THREE components.
 Since HAM can only modify ONE component per pixel, we must either:
  a) Use 3 pixels to transition (changing R, G, B separately) → "fringing"
-  b) Pick a base palette colour that's close to the target → "SET"
+  b) Pick a base palette color that's close to the target → "SET"
 ```

 ### The Fringing Problem
@ -150,24 +150,24 @@ flowchart LR
    style H4 fill:#ffcdd2,stroke:#c62828,color:#333
 ```

-Pixels H3 and H4 are **fringing artifacts** — wrong colours visible during the transition. The encoder must change R, G, B individually (one per pixel), so sharp multi-component transitions always produce visible intermediate colours.
+Pixels H3 and H4 are **fringing artifacts** — wrong colors visible during the transition. The encoder must change R, G, B individually (one per pixel), so sharp multi-component transitions always produce visible intermediate colors.

-The encoder (usually offline) optimises palette choice and pixel encoding to minimise fringing. Common strategies:
-- Choose 16 base palette colours via **median-cut** from the image histogram
+The encoder (usually offline) optimizes palette choice and pixel encoding to minimize fringing. Common strategies:
+- Choose 16 base palette colors via **median-cut** from the image histogram
 - Use SET pixels at strong edges
 - Sequence MODIFY commands to approach target in fewest steps

 ```mermaid
 flowchart TD
    IMG["Source Image<br/>(24-bit RGB)"] --> HIST["Histogram Analysis"]
-    HIST --> MEDCUT["Median-Cut<br/>Select 16 base colours"]
+    HIST --> MEDCUT["Median-Cut<br/>Select 16 base colors"]
    MEDCUT --> PAL["Optimal 16-entry palette"]

    IMG --> SCAN["Process scanlines<br/>left to right"]
    PAL --> SCAN

    SCAN --> DECIDE{"Distance to target?"}
-    DECIDE -->|"Close base colour exists"| SET["SET command<br/>(no fringing)"]
+    DECIDE -->|"Close base color exists"| SET["SET command<br/>(no fringing)"]
    DECIDE -->|"Only 1 component differs"| MOD["MODIFY command<br/>(no fringing)"]
    DECIDE -->|"2-3 components differ"| FRINGE["2-3 MODIFY sequence<br/>(fringing visible)"]

@ -187,7 +187,7 @@ struct Screen *scr = OpenScreenTags(NULL,
    SA_DisplayID, HAM_KEY,
    TAG_DONE);

-/* Set the 16 base palette colours: */
+/* Set the 16 base palette colors: */
 ULONG hamPalette[16 * 3 + 2];
 hamPalette[0] = 16 << 16;  /* count=16, first=0 */
 /* Palette entry 0: R=$A0, G=$70, B=$30 (12-bit values scaled to 32-bit) */
@ -221,7 +221,7 @@ void SetHAMPixel(UBYTE *plane[], int x, int y, UBYTE cmd, UBYTE data)
    }
 }

-/* Example: SET colour 5, then modify blue to $F: */
+/* Example: SET color 5, then modify blue to $F: */
 SetHAMPixel(plane, 0, 0, 0x00, 5);    /* 00 0101 = SET palette[5] */
 SetHAMPixel(plane, 1, 0, 0x01, 0xF);  /* 01 1111 = MOD BLUE = $F */
 ```
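
The EHB and HAM6 rules described in the hunks above can be checked with a small software decoder. This is a sketch under stated assumptions rather than the repository's code: `ehb_decode` and `ham6_decode_line` are hypothetical names, colors are 12-bit `$0RGB` words, pixels are already assembled into one 6-bit value per byte, and control code `11` is taken to modify green, following the standard HAM6 encoding (the text above only shows `01` = blue and `10` = red).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EHB: bit 5 selects the half-bright copy of palette[bits 4-0]. */
static uint16_t ehb_decode(const uint16_t palette[32], uint8_t pixel)
{
    uint16_t rgb = palette[pixel & 0x1Fu];
    if (pixel & 0x20u)
        rgb = (uint16_t)((rgb >> 1) & 0x0777u);  /* halve R, G, B separately */
    return rgb;
}

/* HAM6: the top 2 bits pick SET or which component to modify,
   the low 4 bits carry the palette index or the new component value. */
static void ham6_decode_line(const uint16_t palette[16],
                             const uint8_t *pixels, size_t n, uint16_t *out)
{
    uint16_t held = palette[0];  /* each scanline starts from register 0 */
    for (size_t i = 0; i < n; i++) {
        uint16_t data = (uint16_t)(pixels[i] & 0x0Fu);
        switch ((pixels[i] >> 4) & 0x03u) {
        case 0: /* SET: load a base palette entry */
            held = palette[data];
            break;
        case 1: /* MODIFY blue, keep red and green */
            held = (uint16_t)((held & 0x0FF0u) | data);
            break;
        case 2: /* MODIFY red, keep green and blue */
            held = (uint16_t)((held & 0x00FFu) | (data << 8));
            break;
        case 3: /* MODIFY green, keep red and blue */
            held = (uint16_t)((held & 0x0F0Fu) | (data << 4));
            break;
        }
        out[i] = held;
    }
}
```

Feeding the worked scanline from the encoding example (SET `$A73`, MOD BLUE = F, MOD RED = F, SET `$F08`) through `ham6_decode_line` yields `$A73`, `$A7F`, `$F7F`, `$F08`. Keeping the held color in a single variable mirrors the one-pixel dependency that causes fringing.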
@ -246,15 +246,15 @@ Uses **8 bitplanes**. Same principle, wider data:
 | Aspect | HAM6 | HAM8 |
 |---|---|---|
 | Base palette entries | 16 | 64 |
-| Colour component precision | 4-bit (16 levels) | 6-bit (64 levels) |
-| Total colour space | 12-bit (4,096) | 18-bit (262,144) |
-| Fringing severity | Severe | Mild (more base colours to SET from) |
+| Color component precision | 4-bit (16 levels) | 6-bit (64 levels) |
+| Total color space | 12-bit (4,096) | 18-bit (262,144) |
+| Fringing severity | Severe | Mild (more base colors to SET from) |
 | Memory per 320×256 screen | 6 × 40 × 256 = 60 KB | 8 × 40 × 256 = 80 KB |

 ### HAM8 Palette Setup

 ```c
-/* HAM8 uses 64 of the 256 AGA palette entries as base colours: */
+/* HAM8 uses 64 of the 256 AGA palette entries as base colors: */
 struct Screen *scr = OpenScreenTags(NULL,
    SA_Width,     320,
    SA_Height,    256,
@ -319,10 +319,10 @@ DMA slots consumed per bitplane per lowres line:
 The HAM decoder operates **one pixel clock behind** the bitplane data output:

 ```
-Bitplane DMA → Bitplane shift registers → HAM decoder → Colour register → DAC → Video out
+Bitplane DMA → Bitplane shift registers → HAM decoder → Color register → DAC → Video out
                                          ↑
                                    1-pixel delay
-                                    (needs previous pixel's colour)
+                                    (needs previous pixel's color)
 ```

 For FPGA implementation, the HAM decoder is a simple combinational circuit:
@ -359,20 +359,20 @@ end

 ## Standard Palette Modes — For Comparison

-### Setting Palette Colours (Non-HAM)
+### Setting Palette Colors (Non-HAM)

 ```c
 /* OS 3.0+ — 24-bit precision (AGA): */
-ULONG colours[3 * 3 + 2];  /* 3 colours */
-colours[0] = 3 << 16;  /* count=3, first entry=0 */
+ULONG colors[3 * 3 + 2];  /* 3 colors */
+colors[0] = 3 << 16;  /* count=3, first entry=0 */
 /* Entry 0: black */
-colours[1] = 0x00000000; colours[2] = 0x00000000; colours[3] = 0x00000000;
+colors[1] = 0x00000000; colors[2] = 0x00000000; colors[3] = 0x00000000;
 /* Entry 1: bright red */
-colours[4] = 0xFF000000; colours[5] = 0x00000000; colours[6] = 0x00000000;
+colors[4] = 0xFF000000; colors[5] = 0x00000000; colors[6] = 0x00000000;
 /* Entry 2: pure blue */
-colours[7] = 0x00000000; colours[8] = 0x00000000; colours[9] = 0xFF000000;
-colours[10] = 0;  /* terminator */
-LoadRGB32(vp, colours);
+colors[7] = 0x00000000; colors[8] = 0x00000000; colors[9] = 0xFF000000;
+colors[10] = 0;  /* terminator */
+LoadRGB32(vp, colors);

 /* OCS/ECS — 12-bit precision: */
 UWORD oldPalette[] = { 0x000, 0xF00, 0x00F };  /* 4 bits per channel */
@ -385,7 +385,7 @@ custom->color[2] = 0x00F;   /* $DFF184: COLOR02 */
 /* AGA: extra bits via BPLCON3 bank select */
 ```

-### Colour Cycling (Palette Animation)
+### Color Cycling (Palette Animation)

 ```c
 /* Rotate palette entries for animation — common demo/game technique: */
@ -413,7 +413,7 @@ void CyclePalette(struct ViewPort *vp, int first, int last)
 ```

 > [!TIP]
-> Colour cycling is extremely cheap — only palette registers change, not pixel data. A single `SetRGB32` call costs a few microseconds vs redrawing the entire screen. This is why palette animation was so popular on the Amiga.
+> Color cycling is extremely cheap — only palette registers change, not pixel data. A single `SetRGB32` call costs a few microseconds vs redrawing the entire screen. This is why palette animation was so popular on the Amiga.

 ---

@ -423,9 +423,9 @@ void CyclePalette(struct ViewPort *vp, int first, int last)
 |---|---|---|---|---|
 | Bitplanes | 5 | 6 | 6 | 8 |
 | Chipset | OCS/ECS/AGA | OCS/ECS/AGA | OCS/ECS/AGA | AGA only |
-| Programmable colours | 32 | 32 | 16 | 64 |
+| Programmable colors | 32 | 32 | 16 | 64 |
 | Total on-screen | 32 | 64 | 4,096 | 262,144 |
-| Colour depth | 12-bit (OCS) / 24-bit (AGA) | 12/24-bit | 12-bit | 18-bit |
+| Color depth | 12-bit (OCS) / 24-bit (AGA) | 12/24-bit | 12-bit | 18-bit |
 | Fringing | None | None | Significant | Mild |
 | Good for | GUI, games | GUI with shadows | Photos, static art | Photos, video stills |
 | Bad for | Photo-realism | Limited palette control | Animation, scrolling | Memory: 80 KB/frame |
--- a/08_graphics/pixel_conversion.md
+++ b/08_graphics/pixel_conversion.md
@ -111,7 +111,7 @@ A chunky buffer is the **natural intermediate format** for a GPU-style rendering

 ### Chunky (Packed Pixel)

-Every pixel's complete colour index is stored contiguously. For 8-bit (256 colour) pixels:
+Every pixel's complete color index is stored contiguously. For 8-bit (256 color) pixels:

 ```
 Address:  $0000  $0001  $0002  $0003  $0004  $0005  $0006  $0007
@ -123,7 +123,7 @@ Each byte = one pixel. Linear, simple, cache-friendly for rendering. This is how

 ### Planar (Bitplane)

-Each pixel's colour index is **split across N separate memory regions** (bitplanes). For 8-bit pixels (8 bitplanes), each bitplane stores one bit of every pixel:
+Each pixel's color index is **split across N separate memory regions** (bitplanes). For 8-bit pixels (8 bitplanes), each bitplane stores one bit of every pixel:

 ```
 Bitplane 0: 1 0 1 1 0 0 1 0  ← bit 0 of pixels 0–7
@ -136,7 +136,7 @@ Bitplane 6: 0 0 1 0 0 0 0 1  ← bit 6
 Bitplane 7: 0 0 0 0 1 0 1 0  ← bit 7
 ```

-To read pixel 0's colour: collect bit 0 from each of the 8 planes → `10101100` = `$AC`. The 8 planes are **not interleaved** in standard Amiga layout — each is a separate contiguous memory block.
+To read pixel 0's color: collect bit 0 from each of the 8 planes → `10101100` = `$AC`. The 8 planes are **not interleaved** in standard Amiga layout — each is a separate contiguous memory block.

 > [!WARNING]
 > The Amiga's planar format means memory addresses in bitplane memory don't correspond to pixel positions linearly. Plane 0 byte 0 contains bits for pixels 0–7. Plane 1 byte 0 contains bits for the same pixels 0–7. The byte offset for pixel N is `(N / 8)` in **every** plane. The bit position is `7 - (N mod 8)`. This is the fundamental indirection all planar-format API developers must internalize.
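
The addressing rule in the warning above (byte offset `N / 8` in every plane, bit position `7 - (N mod 8)`) can be sketched as a pair of C helpers. The names `planar_get_pixel` and `planar_set_pixel` are hypothetical, and the sketch assumes non-interleaved planes that all share the same `bytes_per_row`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Gather one bit from each plane to rebuild the pixel's color index. */
static uint8_t planar_get_pixel(uint8_t *const planes[], int depth,
                                size_t bytes_per_row, size_t x, size_t y)
{
    size_t  offset = y * bytes_per_row + x / 8;    /* same in every plane */
    uint8_t mask   = (uint8_t)(0x80u >> (x % 8));  /* bit 7 - (x mod 8) */
    uint8_t index  = 0;

    assert(depth >= 1 && depth <= 8);
    for (int p = 0; p < depth; p++) {
        if (planes[p][offset] & mask)
            index |= (uint8_t)(1u << p);
    }
    return index;
}

/* Scatter the bits of a color index across the planes. */
static void planar_set_pixel(uint8_t *const planes[], int depth,
                             size_t bytes_per_row, size_t x, size_t y,
                             uint8_t index)
{
    size_t  offset = y * bytes_per_row + x / 8;
    uint8_t mask   = (uint8_t)(0x80u >> (x % 8));

    assert(depth >= 1 && depth <= 8);
    for (int p = 0; p < depth; p++) {
        if ((index >> p) & 1u)
            planes[p][offset] |= mask;
        else
            planes[p][offset] &= (uint8_t)~mask;
    }
}
```

Writing index `$AC` to pixel 0 of an 8-plane bitmap sets bit 7 of byte 0 in planes 2, 3, 5 and 7 and clears it in the others. One chunky write becomes up to eight read-modify-write accesses, which is the per-pixel cost that C2P routines avoid by transposing many pixels at once.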
@ -817,10 +817,15 @@ The CD32's Akiko chip implements C2P in dedicated silicon. The CPU feeds 8 longw
 | CPU load | 100% | 100% | ~50% (register I/O) | **2x CPU freed** |
 | 320x256x8bpl | ~1.1 s | ~35 ms | ~35 ms | **~31x** |

-Akiko's throughput is approximately the same as optimised software C2P on the 68020 because both are limited by the Chip RAM bus bandwidth (~3.5 MB/s shared). On faster CPUs (68040/060), software C2P **outperforms** Akiko because the CPU can process data faster than the register interface can shuttle it.
+Akiko's throughput is approximately the same as optimized software C2P on the 68020 because both are limited by the Chip RAM bus bandwidth (~3.5 MB/s shared). On faster CPUs (68040/060), software C2P **outperforms** Akiko because the CPU can process data faster than the register interface can shuttle it.

 Full Akiko protocol: [Akiko — CD32 C2P Hardware](../01_hardware/aga_a1200_a4000/akiko_cd32.md#chunky-to-planar-c2p-conversion)

+> [!NOTE]
+> **FPGA Implementation**: On MiSTer, Akiko C2P must be implemented as a state machine triggered by register writes to `$B80030`. The CPU writes 8 longwords to the same address; the state machine reads them sequentially, performs bit transposition in hardware, and presents the 8 planar longwords on subsequent reads from `$B80030`. Throughput is bounded by Chip RAM bus bandwidth (~3.5 MB/s shared), not by the state machine speed — a naive FPGA Akiko implementation that runs at bus speed is already cycle-accurate.
+>
+> **Reference**: MiSTer Minimig-AGA Akiko implementation — [`rtl/akiko.v`](https://github.com/MiSTer-devel/Minimig-AGA_MiSTer/blob/MiSTer/rtl/akiko.v) (Verilog)
+
 ---

 ## Solution 4 — Blitter-Assisted C2P
@ -912,6 +917,9 @@ Most games used a hybrid: 1-2 bitplanes for UI/HUD elements, reserving `COLOR00`
 > [!NOTE]
 > Copper Chunky and C2P are not mutually exclusive. Some demos use Copper Chunky for one screen region while simultaneously using C2P for another. The Copperlist can intermix WAIT/MOVE instructions with normal bitplane display controls.

+> [!WARNING]
+> **FPGA/Emulation Timing Sensitivity**: Copper Chunky is extremely sensitive to Copper timing accuracy. Each `WAIT` must compare against the exact beam counter value, and each `MOVE` to `COLOR00` must take effect at the correct pixel position. DMA contention between Copper and bitplane fetches shifts pixel placement, and emulators must model the Copper's 2-cycle instruction latency (WAIT=2 cycles, MOVE=2 cycles). A one-pixel offset produces visible image shearing. The Minimig-AGA core on MiSTer implements this, but early UAE versions did not — if your Copper Chunky output shows "striped" patterns under emulation, test on MiSTer or real hardware before debugging the algorithm.
+
 ---

 ## Solution 5 — WriteChunkyPixels (AmigaOS)
@ -1038,8 +1046,8 @@ The Amiga's planar format is **SoA**: each bitplane is an array of one field (on
 | **Amiga graphics** | Bitplanes (Agnus DMA) | Chunky pixel buffer (CPU render) | C2P algorithm |
 | **GPU compute shaders** | SoA buffer layouts (SSBO) | Vertex attributes (interleaved VBO) | Shader transpose |
 | **SIMD / AVX-512** | Separate float arrays (vectorisable) | Struct arrays (gather/scatter) | `_mm512_transpose` intrinsics |
-| **Database engines** | Columnar storage (Parquet, Arrow) | Row-oriented storage (MySQL) | Column↔row materialisation |
-| **Image compression** | Colour planes (JPEG YCbCr) | RGB pixels (BMP) | MCU block decomposition |
+| **Database engines** | Columnar storage (Parquet, Arrow) | Row-oriented storage (MySQL) | Column↔row materialization |
+| **Image compression** | Color planes (JPEG YCbCr) | RGB pixels (BMP) | MCU block decomposition |
 | **GPU texture memory** | Block-compressed (BC/ASTC) | Linear RGBA | Hardware texture unit decode |
 | **Neural network inference** | NCHW tensor layout (channels first) | NHWC (channels last) | Layout transposition kernel |

@ -1047,10 +1055,10 @@ The Amiga's planar format is **SoA**: each bitplane is an array of one field (on

 | Layout | Optimal For | Reason |
 |---|---|---|
-| **SoA / Planar** | Streaming one field across many elements | Maximises cache line utilisation, enables SIMD vectorisation |
+| **SoA / Planar** | Streaming one field across many elements | Maximizes cache line utilization, enables SIMD vectorization |
 | **AoS / Chunky** | Random-access to complete elements | All fields of one element in one cache line |

-The Amiga's custom DMA engine streams bitplane data to the display sequentially — plane 0 for the whole line, then plane 1, etc. This is a **SoA access pattern**, perfectly matched by the planar layout. The CPU, which wants to set a single pixel's complete colour, has the opposite need — it wants **AoS**.
+The Amiga's custom DMA engine streams bitplane data to the display sequentially — plane 0 for the whole line, then plane 1, etc. This is a **SoA access pattern**, perfectly matched by the planar layout. The CPU, which wants to set a single pixel's complete color, has the opposite need — it wants **AoS**.

 ### Modern Hardware Parallels

@ -1059,7 +1067,7 @@ The Amiga's custom DMA engine streams bitplane data to the display sequentially
 | **Akiko C2P register** | GPU texture swizzle unit | Hardware layout transposition |
 | **Blitter + merge algorithm** | CUDA shared memory transpose kernel | CPU/coprocessor-assisted transpose |
 | **RTG (planar bypass)** | Unified chunky framebuffer (since VGA) | Eliminates the problem entirely |
-| **Copper palette cycling** | GPU palette shader / LUT texture | Colour manipulation without pixel writes |
+| **Copper palette cycling** | GPU palette shader / LUT texture | Color manipulation without pixel writes |
 | **FMODE (fetch width)** | GPU memory bus width (256/384/512-bit) | Wider bus = more data per DMA cycle |

 ### GPU Texture Swizzle — The Modern Akiko
@ -1086,7 +1094,7 @@ When you call `glTexImage2D()` or `vkCmdCopyBufferToImage()`, the GPU driver per
 | A1200 (1992, 14 MHz 68020) | C2P 320×256×8bpp | ~1.5 MB/s | CPU merge, 8 planes |
 | CD32 (1993, 14 MHz + Akiko) | C2P 320×256×8bpp | ~1.5 MB/s | Akiko hardware |
 | 486 DX2/66 (1992) | No conversion needed | N/A | VGA Mode 13h = chunky |
-| Pentium MMX (1997) | Colour space (YUV→RGB) | ~200 MB/s | MMX SIMD |
+| Pentium MMX (1997) | Color space (YUV→RGB) | ~200 MB/s | MMX SIMD |
 | GTX 1080 (2016) | Texture swizzle (linear→tiled) | ~300 GB/s | Hardware TMU |
 | Apple M2 (2022) | SoA↔AoS for ML tensors | ~100 GB/s | Hardware AMX |

@ -1100,9 +1108,9 @@ The throughput gap tells the story: what consumed 100% of a 68020's capability i
 |---|---|
 | 1985 | Amiga launches with planar display. C2P not needed — all software renders directly to bitplanes |
 | 1989 | First 3D demos appear (Juggler, etc.). Rendering in chunky buffers starts |
-| 1991 | Demoscene coders develop first optimised C2P routines for 68000 |
+| 1991 | Demoscene coders develop first optimized C2P routines for 68000 |
 | 1992 | AGA ships (A1200/A4000). 8 bitplanes = C2P problem gets 2× harder |
-| 1993 | CD32 ships with Akiko — first hardware C2P. Mikael Kalms publishes optimised CPU routines |
+| 1993 | CD32 ships with Akiko — first hardware C2P. Mikael Kalms publishes optimized CPU routines |
 | 1994 | Kalms C2P library becomes the de facto standard. Multiple variants for 020/030/040/060 |
 | 1995 | RTG cards (Picasso II, CyberVision 64) begin to make C2P irrelevant for productivity |
 | 1996 | CyberVision 64 ships with Roxxler P2C chip — the reverse problem, solved in hardware |
@ -1432,41 +1440,6 @@ ULONG measure_c2p_time(void) {
 }
 ```

----
-
-## Impact on FPGA/Emulation — MiSTer & UAE Developers
-
-Since this knowledge base targets MiSTer FPGA core developers, here are implementation concerns specific to hardware reproduction:
-
-### C2P in FPGA Cores
-
-The Minimig-AGA core on MiSTer provides both:
-- **Native planar output** — matches real Amiga bitplane DMA timing
-- **RTG framebuffer via uaegfx** — chunky framebuffer in DDR memory, no C2P needed
-
-When running software that uses C2P on the MiSTer:
-1. The CPU merge algorithm runs on the emulated 68020 (TG68K or fx68k core)
-2. Memory timing must accurately model Chip RAM vs Fast RAM contention
-3. The Blitter must be cycle-accurate for Blitter-assisted C2P variants
-4. Akiko C2P must be implemented as a state machine triggered by register writes to `$B80030`
-
-### Copper Chunky Accuracy
-
-Copper Chunky is extremely sensitive to Copper timing:
-- Each WAIT must compare against the exact beam counter value
-- MOVE to COLOR00 must take effect at the correct pixel
-- DMA contention between Copper and bitplane fetches affects pixel placement
-- Emulators must model the Copper's 2-cycle instruction latency
-
-### 68040/060 Cache Coherency
-
-On FPGA cores implementing 68040+, the data cache must be coherent with DMA writes:
-- `MOVE16` writes should bypass or update the data cache
-- `CACR` flush instructions must invalidate cache lines matching DMA-visible addresses
-- Missed coherency bugs manifest as "shimmering" pixels in C2P output
-
----
-
 ## FAQ

 ### Why not just use the Blitter for C2P?
@ -1479,7 +1452,7 @@ Bitplane modulo calculations on non-aligned rows force the display DMA controlle

 ### Can I use Akiko on non-CD32 hardware?

-No. Akiko is a custom ASIC that physically only exists in the CD32; it is integrated with the CD-ROM controller on the same die. There is no expansion card addressing `$B80000` on any other Amiga model. On MiSTer, Akiko can be implemented as a soft peripheral in the FPGA core.
+No. Akiko is a custom ASIC that physically only exists in the CD32; it is integrated with the CD-ROM controller on the same die. There is no expansion card addressing `$B80000` on any other Amiga model. On MiSTer, Akiko can be implemented as a soft peripheral in the FPGA core — see the FPGA implementation note in [Solution 3](#solution-3--akiko-hardware-c2p-cd32-only).

 ### Why doesn't C2P scale linearly with 68060 clock speed?

--- a/08_graphics/rastport.md
+++ b/08_graphics/rastport.md
@ -4,7 +4,7 @@

 ## Overview

-`RastPort` is the primary drawing context in AmigaOS — the equivalent of a "device context" (Windows) or "graphics context" (X11). All graphics primitives (pixel, line, rectangle, polygon, text) operate through a RastPort, which bundles together a target `BitMap`, drawing pen colours, patterns, font, draw mode, and an optional `Layer` for clipping.
+`RastPort` is the primary drawing context in AmigaOS — the equivalent of a "device context" (Windows) or "graphics context" (X11). All graphics primitives (pixel, line, rectangle, polygon, text) operate through a RastPort, which bundles together a target `BitMap`, drawing pen colors, patterns, font, draw mode, and an optional `Layer` for clipping.

 Every Intuition window and screen has its own RastPort. When you draw into a window, you're drawing through its RastPort.

@ -44,8 +44,8 @@ struct RastPort {
    struct AreaInfo *AreaInfo;   /* area fill vertex buffer */
    struct GelsInfo *GelsInfo;   /* GEL (BOB/VSprite) list */
    UBYTE           Mask;        /* plane mask (which planes to draw to) */
-    BYTE            FgPen;       /* foreground pen colour index */
-    BYTE            BgPen;       /* background pen colour index */
+    BYTE            FgPen;       /* foreground pen color index */
+    BYTE            BgPen;       /* background pen color index */
    BYTE            AOlPen;      /* area outline pen */
    BYTE            DrawMode;    /* JAM1, JAM2, COMPLEMENT, INVERSVID */
    BYTE            AreaPtSz;    /* area pattern size (log2) */
@ -111,12 +111,12 @@ flowchart LR
 ### Pen and Position Setup

 ```c
-/* Set pen colours: */
-SetAPen(rp, 1);       /* foreground = colour register 1 */
-SetBPen(rp, 0);       /* background = colour register 0 */
+/* Set pen colors: */
+SetAPen(rp, 1);       /* foreground = color register 1 */
+SetBPen(rp, 0);       /* background = color register 0 */
 SetDrMd(rp, JAM1);    /* transparent background mode */

-/* OS 3.0+ — use named pen for correct Workbench colours: */
+/* OS 3.0+ — use named pen for correct Workbench colors: */
 SetAPen(rp, screen->RastPort.BitMap->Depth > 1 ?
    ObtainBestPen(screen->ViewPort.ColorMap,
                  0xFF000000, 0x00000000, 0x00000000,  /* red */
@ -153,7 +153,7 @@ Draw(rp, 10, 10);     /* close: left edge */

 /* Single pixel: */
 WritePixel(rp, 160, 120);
-LONG colour = ReadPixel(rp, 160, 120);
+LONG color = ReadPixel(rp, 160, 120);

 /* Dashed lines: */
 SetDrPt(rp, 0xF0F0);  /* 16-bit pattern: 1111000011110000 */
@ -240,8 +240,8 @@ FreeRaster(tmpRasData, 320, 256);
 /* Flood fill from a seed point: */
 /* Requires TmpRas (same setup as area fills) */
 Flood(rp, 1, 50, 50);
-/* mode 1 = fill until FgPen colour boundary */
-/* mode 0 = fill all connected pixels of same colour as seed */
+/* mode 0 = outline fill: fill outward until a boundary drawn in AOlPen is reached */
+/* mode 1 = color fill: fill all connected pixels of the same color as the seed */
 ```

 ### Blitting (Block Transfer)
@ -384,9 +384,9 @@ rp->Mask = 0xFF;  /* all planes — default */
 rp->Mask = 0x01;  /* only plane 0 — fast for single-plane effects */
 rp->Mask = 0x03;  /* planes 0 and 1 only */

-/* Use case: draw a 2-colour overlay without disturbing other planes: */
+/* Use case: draw a 2-color overlay without disturbing other planes: */
 rp->Mask = 0x04;  /* only plane 2 */
-SetAPen(rp, 4);   /* colour index with bit 2 set */
+SetAPen(rp, 4);   /* color index with bit 2 set */
 RectFill(rp, 0, 0, 319, 255);
 rp->Mask = 0xFF;  /* restore */
 ```
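
 The effect of `rp->Mask` on a solid JAM1 fill can be modeled one pixel at a time. The sketch below is host-side C rather than AmigaOS code: `masked_write` is a hypothetical helper, and it assumes the fill simply writes the pen's bit into every plane enabled in the mask and leaves the other planes alone.

 ```c
 #include <stdint.h>

 /* Hypothetical model (not a graphics.library call): the color index a pixel
  * ends up with after a solid fill with `pen` under plane mask `mask`.
  * Planes whose mask bit is 0 keep their old bit. */
 uint8_t masked_write(uint8_t old_pixel, uint8_t pen, uint8_t mask)
 {
     uint8_t kept    = (uint8_t)(old_pixel & (uint8_t)~mask); /* protected planes */
     uint8_t written = (uint8_t)(pen & mask);                 /* enabled planes   */
     return (uint8_t)(kept | written);
 }
 ```

 With `Mask = 0x04` and pen 4, a pixel that held color 3 ends up as color 7: planes 0 and 1 keep their old bits and only plane 2 changes.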
--- a/08_graphics/sprites.md
+++ b/08_graphics/sprites.md
@ -4,7 +4,7 @@

 ## Overview

-The Amiga has **8 hardware sprites**, each 16 pixels wide with 3 colours + transparent. Sprites are entirely DMA-driven — the custom chips fetch sprite data from Chip RAM and composite them over the playfield with zero CPU overhead. The Copper reloads sprite pointers every frame.
+The Amiga has **8 hardware sprites**, each 16 pixels wide with 3 colors + transparent. Sprites are entirely DMA-driven — the custom chips fetch sprite data from Chip RAM and composite them over the playfield with zero CPU overhead. The Copper reloads sprite pointers every frame.

 Sprite 0 is reserved by Intuition for the **mouse pointer**. Sprites 1–7 are available for application use.

@ -19,7 +19,7 @@ flowchart LR
    subgraph "Custom Chips (Denise/Lisa)"
        DMA["Sprite DMA<br/>(8 channels)"] --> MUX["Priority MUX"]
        PF["Playfield<br/>(bitplane data)"] --> MUX
-        MUX --> DAC["Colour DAC<br/>→ Video Out"]
+        MUX --> DAC["Color DAC<br/>→ Video Out"]
    end

    SD0 --> DMA
@ -58,15 +58,15 @@ Each sprite is stored as a contiguous block in Chip RAM:
 └──────────────────────────────────────────┘
 ```

-### Pixel Colour Encoding
+### Pixel Color Encoding

 ```
-Pixel colour = (DATB_bit << 1) | DATA_bit
+Pixel color = (DATB_bit << 1) | DATA_bit

  00 = transparent (playfield shows through)
-  01 = sprite colour 1
-  10 = sprite colour 2
-  11 = sprite colour 3
+  01 = sprite color 1
+  10 = sprite color 2
+  11 = sprite color 3
 ```
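
 The encoding above can be checked on any host with a small decoder. This is an illustrative sketch: `sprite_pixel` is a hypothetical helper, not part of any Amiga API, and it assumes bit 15 of each word is the leftmost pixel, as with other Amiga planar data.

 ```c
 #include <assert.h>
 #include <stdint.h>

 /* Hypothetical helper: return the 2-bit color value of pixel x
  * (0 = leftmost) from one sprite line's DATA and DATB words. */
 unsigned sprite_pixel(uint16_t data, uint16_t datb, unsigned x)
 {
     assert(x < 16);
     unsigned shift = 15u - x;              /* bit 15 is the leftmost pixel */
     unsigned lo = (data >> shift) & 1u;    /* DATA supplies bit 0 */
     unsigned hi = (datb >> shift) & 1u;    /* DATB supplies bit 1 */
     return (hi << 1) | lo;                 /* 0 = transparent, 1..3 = sprite colors */
 }
 ```

 For example, `sprite_pixel(0x8000, 0x0000, 0)` yields 1 (sprite color 1), and a pixel whose bit is clear in both words yields 0 (transparent).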

 ### Header Bit Layout
@ -93,11 +93,11 @@ Word 1 (SPRxCTL):

 ---

-## Sprite Colour Palette
+## Sprite Color Palette

-Each pair of sprites shares 3 colour registers (colour 0 = transparent for all):
+Each pair of sprites shares 3 color registers (color 0 = transparent for all):

-| Sprite Pair | Colour Registers | Custom Addresses | Notes |
+| Sprite Pair | Color Registers | Custom Addresses | Notes |
 |---|---|---|---|
 | 0–1 | `COLOR17`–`COLOR19` | `$DFF1A2`–`$DFF1A6` | Pair with mouse pointer |
 | 2–3 | `COLOR21`–`COLOR23` | `$DFF1AA`–`$DFF1AE` | |
@ -105,7 +105,7 @@ Each pair of sprites shares 3 colour registers (colour 0 = transparent for all):
 | 6–7 | `COLOR29`–`COLOR31` | `$DFF1BA`–`$DFF1BE` | |
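
 The register numbers and addresses in the table follow a regular pattern: each pair owns a block of four color registers starting at 16, and each color register occupies two bytes from `$DFF180`. A minimal sketch of that arithmetic, using hypothetical helper names:

 ```c
 #include <assert.h>
 #include <stdint.h>

 /* Hypothetical helper: number of the first usable color register for a
  * sprite (0..7). Register 16 + 4*pair itself is the transparent slot. */
 unsigned sprite_first_color_reg(unsigned sprite)
 {
     assert(sprite < 8);
     unsigned pair = sprite / 2u;           /* sprites 0-1 -> 0, 2-3 -> 1, ... */
     return 16u + pair * 4u + 1u;           /* 17, 21, 25, 29 */
 }

 /* Hypothetical helper: custom-chip address of COLORnn (n = 0..31). */
 uint32_t color_reg_address(unsigned n)
 {
     assert(n < 32);
     return 0xDFF180u + 2u * n;             /* 16-bit registers, 2 bytes apart */
 }
 ```

 For sprite 6 this gives register 29, and `color_reg_address(29)` gives `0xDFF1BA`, matching the last table row.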

 ```c
-/* Set sprite 0-1 colours directly: */
+/* Set sprite 0-1 colors directly: */
 custom->color[17] = 0xF00;  /* red */
 custom->color[18] = 0x0F0;  /* green */
 custom->color[19] = 0xFFF;  /* white */
@ -113,24 +113,24 @@ custom->color[19] = 0xFFF;  /* white */

 ---

-## Attached Sprites — 15 Colours
+## Attached Sprites — 15 Colors

-Two sprites from the same pair can be **attached** to form a single 15-colour (+ transparent) sprite:
+Two sprites from the same pair can be **attached** to form a single 15-color (+ transparent) sprite:

 ```mermaid
 flowchart LR
    subgraph "Normal (2 independent sprites)"
-        S0["Sprite 0<br/>3 colours"] --- S1["Sprite 1<br/>3 colours"]
+        S0["Sprite 0<br/>3 colors"] --- S1["Sprite 1<br/>3 colors"]
    end

-    subgraph "Attached (1 wide-colour sprite)"
-        SA["Sprites 0+1 attached<br/>4 bits per pixel<br/>15 colours + transparent"]
+    subgraph "Attached (1 wide-color sprite)"
+        SA["Sprites 0+1 attached<br/>4 bits per pixel<br/>15 colors + transparent"]
    end

    style SA fill:#c8e6c9,stroke:#2e7d32,color:#333
 ```

-When attached, the even sprite provides bits 0–1 and the odd sprite provides bits 2–3 of the colour index. The 4-bit value indexes into colour registers 16–31.
+When attached, the even sprite provides bits 0–1 and the odd sprite provides bits 2–3 of the color index. The 4-bit value indexes into color registers 16–31.
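
 The bit combination can be written out as a small host-side function. This is a sketch for illustration only; `attached_color_reg` is a hypothetical name, and the inputs are the 2-bit values each sprite would produce on its own for that pixel.

 ```c
 #include <assert.h>

 /* Hypothetical helper: combine the 2-bit pixel values of an attached pair
  * into a color register number. Returns -1 for a transparent pixel. */
 int attached_color_reg(unsigned even_bits, unsigned odd_bits)
 {
     assert(even_bits < 4 && odd_bits < 4);
     unsigned index = (odd_bits << 2) | even_bits;  /* odd = bits 2-3, even = bits 0-1 */
     if (index == 0)
         return -1;                                 /* index 0 is transparent */
     return (int)(16u + index);                     /* COLOR17 .. COLOR31 */
 }
 ```

 An even-sprite value of 1 with an odd-sprite value of 0 selects `COLOR17`; both at 3 select `COLOR31`.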

 ```c
 /* Enable attachment: set the ATTACH bit (bit 7) of the odd sprite's CTL word */
@ -178,8 +178,8 @@ The Copper waits for a line after one sprite ends, then reprograms the sprite po
 | Feature | OCS/ECS | AGA |
 |---|---|---|
 | Width | 16 pixels | 16, 32, or 64 pixels (via FMODE) |
-| Colours (single) | 3 + transparent | 3 + transparent |
-| Colours (attached) | 15 + transparent | 15 + transparent |
+| Colors (single) | 3 + transparent | 3 + transparent |
+| Colors (attached) | 15 + transparent | 15 + transparent |
 | Horizontal resolution | Low-res pixels (140 ns) | Low-res, hi-res, or super-hi-res (selectable via the `BPLCON3` SPRES bits) |

 ```c
--- a/08_graphics/views.md
+++ b/08_graphics/views.md
@ -26,7 +26,7 @@ struct ViewPort {
    struct ColorMap *ColorMap;   /* palette for this viewport */
    struct CopList  *DspIns;     /* display copper instructions */
    struct CopList  *SprIns;     /* sprite copper instructions */
-    struct CopList  *ClrIns;     /* colour copper instructions */
+    struct CopList  *ClrIns;     /* color copper instructions */
    struct UCopList *UCopIns;    /* user copper instructions */
    WORD            DWidth;      /* display width */
    WORD            DHeight;     /* display height */
@ -54,8 +54,8 @@ struct RasInfo {
 /* graphics/view.h */
 #define HIRES       0x8000   /* 640 pixel mode */
 #define LACE        0x0004   /* interlaced */
-#define HAM         0x0800   /* Hold-And-Modify (4096 colours) */
-#define EXTRA_HALFBRITE 0x0080 /* Extra Half-Brite (64 colours) */
+#define HAM         0x0800   /* Hold-And-Modify (4096 colors) */
+#define EXTRA_HALFBRITE 0x0080 /* Extra Half-Brite (64 colors) */
 #define DUALPF      0x0400   /* dual playfield */
 #define PFBA        0x0040   /* playfield B has priority */
 #define SUPERHIRES  0x0020   /* 1280 pixel mode (ECS+) */
@ -81,7 +81,7 @@ vp.DWidth = 320;
 vp.DHeight = 256;
 vp.Modes = 0;   /* lores */

-/* Build colour map: */
+/* Build color map: */
 vp.ColorMap = GetColorMap(32);

 /* Compile to copper: */