Restructure: move copper and blitter docs into dedicated subfolders to leave room to expand

Ilia Sharin 2026-05-31 13:52:45 -04:00
parent 616add20cf
commit a0fc3e05db
25 changed files with 1578 additions and 43 deletions

View file

@ -366,6 +366,6 @@ AGA has 256 colors but still only 32 color registers visible at a time. To load
## References
- HRM: *Copper* chapter — authoritative register descriptions
- [copper.md](../../08_graphics/copper.md) — graphics.library UCopList API
- [copper_programming.md](../../08_graphics/copper_programming.md) — additional examples
- [copper.md](../../08_graphics/copper/copper.md) — graphics.library UCopList API
- [copper_programming.md](../../08_graphics/copper/copper_programming.md) — additional examples
- [copper.md](../ocs_a500/copper.md) — OCS-level register reference

View file

@ -6,7 +6,7 @@
The Amiga's memory architecture is fundamentally different from any other home computer of its era. Rather than treating all RAM as equal, the system divides memory into **distinct classes** based on which hardware can access it. This division exists because the custom chipset (Agnus/Alice, Denise/Lisa, Paula) has its own DMA engine that operates on a dedicated bus — and that bus only reaches certain RAM.
Understanding this distinction is not optional. It determines where screen buffers live, why games run faster with expansion RAM, why the [Blitter](../../08_graphics/blitter_programming.md) can't touch Fast RAM, and why a $50 accelerator card with 8 MB of Fast RAM can feel like a new machine.
Understanding this distinction is not optional. It determines where screen buffers live, why games run faster with expansion RAM, why the [Blitter](../../08_graphics/blitter/blitter_programming.md) can't touch Fast RAM, and why a $50 accelerator card with 8 MB of Fast RAM can feel like a new machine.
> [!WARNING]
> The 68000 is **Big-Endian**. All multi-byte values in memory (pointers, word-sized registers, structure fields) are stored most-significant byte first. Modern developers working with Amiga memory dumps or binary formats will misread data if they assume little-endian layout.
@ -419,7 +419,7 @@ A: WHDLoad patches old games that assume all memory is Chip RAM. It redirects al
- See also: [address_space.md](address_space.md) — full 24-bit/32-bit address map
- See also: [chip_ram_expansion.md](../ecs_a600_a3000/chip_ram_expansion.md) — 2 MB Chip RAM with Super Agnus
- See also: [zorro_bus.md](zorro_bus.md) — Zorro II/III expansion bus (Fast RAM cards)
- See also: [blitter_programming.md](../../08_graphics/blitter_programming.md) — Blitter DMA (Chip RAM only)
- See also: [blitter_programming.md](../../08_graphics/blitter/blitter_programming.md) — Blitter DMA (Chip RAM only)
- See also: [exec_memory.md](../../06_exec_os/exec_memory.md) — AmigaOS memory management API
- See also: [dma_architecture.md](dma_architecture.md) — DMA slot allocation, bus arbitration, why Chip RAM bandwidth matters
- See also: [bus_architecture.md](bus_architecture.md) — Bus hierarchy, Chip↔Fast RAM transfer techniques, cache coherency

View file

@ -15,7 +15,7 @@ This article documents **the complete signal path** from crystal to screen — t
> - **Clock derivation and signal generation** — primary coverage here
> - **DMA slot allocation and bandwidth** — see [DMA Architecture](dma_architecture.md)
> - **ModeID selection and OS display API** — see [Display Modes](../../08_graphics/display_modes.md)
> - **Copper instruction programming** — see [Copper Programming](../../08_graphics/copper_programming.md)
> - **Copper instruction programming** — see [Copper Programming](../../08_graphics/copper/copper_programming.md)
---
@ -732,8 +732,8 @@ A: For basic display, ±10 ns per clock edge is sufficient. For genlock compatib
- [DMA Architecture](dma_architecture.md) — scanline slot allocation, bus arbitration, bandwidth calculations
- [Display Modes](../../08_graphics/display_modes.md) — ModeID system, OS display API, chipset comparison
- [Copper Programming](../../08_graphics/copper_programming.md) — beam-synchronized register writes
- [Copper — UCopList](../../08_graphics/copper.md) — system copper list management
- [Copper Programming](../../08_graphics/copper/copper_programming.md) — beam-synchronized register writes
- [Copper — UCopList](../../08_graphics/copper/copper.md) — system copper list management
- [ECS Productivity Modes](../ecs_a600_a3000/productivity_modes.md) — BEAMCON0 programming examples
- [CIA Chips](cia_chips.md) — E-clock, timers, TOD counter
- [Memory Types](memory_types.md) — Chip RAM vs Fast RAM, DMA visibility

View file

@ -156,5 +156,5 @@ WaitBlit(); /* graphics.library — waits and resets to safe state */
## See Also
- [DMA Architecture](../common/dma_architecture.md) — Blitter-Nasty (BLTPRI), bus arbitration, CPU starvation mechanics
- [Blitter Programming](../../08_graphics/blitter_programming.md) — Advanced minterms, cookie-cut, area fill
- [Blitter Programming](../../08_graphics/blitter/blitter_programming.md) — Advanced minterms, cookie-cut, area fill
- [AGA Blitter](../aga_a1200_a4000/aga_blitter.md) — 64-bit FMODE blitter

View file

@ -12,10 +12,10 @@ The Amiga graphics system is built on custom DMA-driven hardware (Agnus/Alice +
| [bitmap.md](bitmap.md) | BitMap structure, planar layout, allocation |
| [display_modes.md](display_modes.md) | Full chipset comparison, ModeID selection flowchart, CRT vs flat-panel, interlace/progressive tradeoffs, named antipatterns, FPGA/MiSTer impact, historical context, modern analogies, FAQ |
| [ham_ehb_modes.md](ham_ehb_modes.md) | HAM6/HAM8 encoding pipeline, EHB half-brite, fringing, palette programming, FPGA decoder logic |
| [copper.md](copper.md) | Copper coprocessor, instruction format, UCopList |
| [copper_programming.md](copper_programming.md) | Copper deep dive: architecture, copper list construction, gradient and raster effects |
| [blitter.md](blitter.md) | Blitter DMA engine, minterms, BltBitMap |
| [blitter_programming.md](blitter_programming.md) | Blitter deep dive: minterms, cookie-cut masking, line draw, fill mode |
| [copper.md](copper/copper.md) | Copper coprocessor, instruction format, UCopList |
| [copper_programming.md](copper/copper_programming.md) | Copper deep dive: architecture, copper list construction, gradient and raster effects |
| [blitter.md](blitter/blitter.md) | Blitter DMA engine, minterms, BltBitMap |
| [blitter_programming.md](blitter/blitter_programming.md) | Blitter deep dive: minterms, cookie-cut masking, line draw, fill mode |
| [sprites.md](sprites.md) | Hardware sprites: DMA engine, data format, attached 15-color sprites, multiplexing, AGA enhancements, priority control |
| [rastport.md](rastport.md) | RastPort drawing context: draw modes, patterns, layer clipping, text pipeline, blitter minterms |
| [views.md](views.md) | View + ViewPort pipeline: 3-stage Mermaid diagram, ViewPort chaining (split screens), ColorMap/LoadRGB4, named antipatterns, Copper vs Views decision flowchart, modern GPU analogies |

View file

@ -592,8 +592,8 @@ For MiSTer and emulator developers, planar BitMap emulation has specific require
- ADCD 2.1: `AllocBitMap()`, `FreeBitMap()`, `InitBitMap()`, `InitRastPort()`, `BltBitMap()`
- *Amiga Hardware Reference Manual* — Bitplane DMA chapter
- See also: [memory_types.md](../01_hardware/common/memory_types.md) — Chip RAM requirements for DMA-visible BitMaps
- See also: [blitter.md](blitter.md) — Blitter DMA operations on BitMaps
- See also: [blitter_programming.md](blitter_programming.md) — Advanced Blitter minterms and cookie-cut
- See also: [blitter.md](blitter/blitter.md) — Blitter DMA operations on BitMaps
- See also: [blitter_programming.md](blitter/blitter_programming.md) — Advanced Blitter minterms and cookie-cut
- See also: [views.md](views.md) — Attaching BitMaps to ViewPorts for display
- See also: [rastport.md](rastport.md) — RastPort drawing context and primitives
- See also: [dma_architecture.md](../01_hardware/common/dma_architecture.md) — bitplane DMA slot budget, DDFSTRT/DDFSTOP registers, bandwidth calculations

View file

@ -0,0 +1,109 @@
[← Home](../../README.md) · [Graphics](../README.md)
# Blitter — DMA Engine, Minterms, BltBitMap
## Overview
The **Blitter** is a DMA engine in the custom chips that performs bulk memory operations: block copies, line drawing, area fills, and arbitrary boolean combinations of up to three source bitmaps. It operates independently of the CPU, freeing the 68k for other work.
---
## Channels
The blitter has four DMA channels:
| Channel | Name | Direction | Description |
|---|---|---|---|
| A | Source A | Read | First source bitmap |
| B | Source B | Read | Second source (often mask/pattern) |
| C | Source C | Read | Third source (typically destination for read-modify-write) |
| D | Destination | Write | Output |
Each channel has: pointer register, modulo register, shift register (A/B only), and first/last word masks (A only).
---
## Minterm Logic
The blitter combines A, B, C inputs using an 8-bit **minterm** value. Each bit selects whether the output is 1 for a specific combination:
| Bit | A | B | C | Common Use |
|---|---|---|---|---|
| 7 | 1 | 1 | 1 | — |
| 6 | 1 | 1 | 0 | — |
| 5 | 1 | 0 | 1 | — |
| 4 | 1 | 0 | 0 | — |
| 3 | 0 | 1 | 1 | — |
| 2 | 0 | 1 | 0 | — |
| 1 | 0 | 0 | 1 | — |
| 0 | 0 | 0 | 0 | — |
Common minterm values:
| Hex | Logic | Operation |
|---|---|---|
| `$F0` | `A` | Copy A to D (straight copy) |
| `$CA` | `AB + ~AC` | Cookie-cut: A=mask, B=source, C=background |
| `$5A` | `A XOR C` | XOR blit (sprite toggle) |
| `$0F` | `NOT A` | Invert source |
| `$00` | `0` | Clear destination |
| `$FF` | `1` | Fill destination with 1s |
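
The values in this table do not have to be memorized: each one follows from evaluating the desired logic function over the eight rows of the truth table. The following is a minimal sketch of that derivation. The names `logic_fn`, `minterm_from`, `copy_a`, `cookie_cut` and `xor_ac` are illustrative and not part of any Amiga API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: any Boolean function of the three blitter inputs. */
typedef int (*logic_fn)(int a, int b, int c);

/* Build the 8-bit minterm: bit number (a<<2 | b<<1 | c) holds the output
   for that input combination, matching the bit table above. */
uint8_t minterm_from(logic_fn f)
{
    uint8_t lf = 0;
    assert(f);
    for (int i = 0; i < 8; i++) {
        int a = (i >> 2) & 1, b = (i >> 1) & 1, c = i & 1;
        if (f(a, b, c))
            lf |= (uint8_t)(1u << i);
    }
    return lf;
}

/* Example logic functions (illustrative). */
int copy_a(int a, int b, int c)     { (void)b; (void)c; return a; }
int cookie_cut(int a, int b, int c) { return a ? b : c; }
int xor_ac(int a, int b, int c)     { (void)b; return a ^ c; }
```
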
---
## Register Map
| Address | Name | Description |
|---|---|---|
| `$DFF040` | `BLTCON0` | Control: use-channels + minterm + shift-A |
| `$DFF042` | `BLTCON1` | Control: direction, fill mode, line mode |
| `$DFF044` | `BLTAFWM` | First word mask for channel A |
| `$DFF046` | `BLTALWM` | Last word mask for channel A |
| `$DFF048` | `BLTCPTH` | Channel C pointer (high word) |
| `$DFF04A` | `BLTCPTL` | Channel C pointer (low word) |
| `$DFF04C` | `BLTBPT` | Channel B pointer |
| `$DFF050` | `BLTAPT` | Channel A pointer |
| `$DFF054` | `BLTDPT` | Channel D pointer (destination) |
| `$DFF058` | `BLTSIZE` | Size and start: `(height << 6) \| width_words` |
| `$DFF060` | `BLTCMOD` | Channel C modulo |
| `$DFF062` | `BLTBMOD` | Channel B modulo |
| `$DFF064` | `BLTAMOD` | Channel A modulo |
| `$DFF066` | `BLTDMOD` | Channel D modulo |
---
## OS-Level Blitter Functions
```c
/* graphics.library */
/* Copy rectangular region between bitmaps: */
LONG BltBitMap(
struct BitMap *srcBM, WORD srcX, WORD srcY,
struct BitMap *dstBM, WORD dstX, WORD dstY,
WORD sizeX, WORD sizeY,
UBYTE minterm, /* usually $C0 = copy */
UBYTE mask, /* plane mask */
APTR tempA /* temp buffer or NULL */
);
/* Blit into RastPort (clips to layer): */
void BltBitMapRastPort(struct BitMap *src, WORD srcX, WORD srcY,
struct RastPort *rp, WORD dstX, WORD dstY,
WORD sizeX, WORD sizeY, UBYTE minterm);
/* Wait for blitter completion: */
void WaitBlit(void); /* must call before freeing blit buffers */
/* Gain exclusive blitter access: */
void OwnBlitter(void);
void DisownBlitter(void);
```
---
## References
- HRM: *Amiga Hardware Reference Manual* — Blitter chapter
- NDK39: `hardware/blit.h`, `graphics/gfx.h`
- ADCD 2.1: `BltBitMap`, `BltBitMapRastPort`, `OwnBlitter`

View file

@ -0,0 +1,981 @@
[← Home](../../README.md) · [Graphics](../README.md)
# Blitter Programming — Deep Dive
## Overview
The **[Blitter](../../01_hardware/ocs_a500/blitter.md)** (Block Image Transferrer) is a DMA coprocessor inside the Agnus chip that performs raster operations on rectangular memory blocks at bus speed — **without CPU involvement**. While the 68000 executes game logic, physics, or AI, the Blitter simultaneously clears screens, copies bitmap regions, composites masked sprites ("cookie-cut"), draws lines, and fills polygons. This parallelism is fundamental to why the Amiga could deliver arcade-quality 2D graphics on a 7 MHz processor with 512 KB of RAM.
The Blitter operates on up to **4 DMA channels** (A, B, C → D) using a programmable **8-bit minterm** truth table that encodes any Boolean function of three inputs. Combined with per-channel shift, modulo, and first/last word masking, this makes the Blitter a general-purpose 2D rasterization engine — not merely a memory copier.
> [!WARNING]
> The Blitter can **only** access Chip RAM. Pointing any channel register at Fast RAM causes silent data corruption or system crashes. Always allocate blitter-visible memory with `AllocMem(size, MEMF_CHIP)`.
```
Channel A ──→ ┐
Channel B ──→ ├──→ Minterm Logic ──→ Channel D (output)
Channel C ──→ ┘
A = mask/pattern (e.g., cookie shape, font glyph)
B = source image data
C = background / destination read-back
D = output destination
```
---
## Architecture
The Blitter sits inside **Agnus** (OCS/ECS) or **Alice** (AGA), sharing the DMA bus with the Copper, bitplane fetches, sprite DMA, disk, and audio. It accesses memory through 4 independent DMA channels, each with its own pointer and modulo register:
```mermaid
graph LR
subgraph "Agnus / Alice"
A["Channel A<br/>(mask/pattern)"] --> ML["Minterm Logic<br/>(8-bit truth table)"]
B["Channel B<br/>(source data)"] --> ML
C["Channel C<br/>(background read-back)"] --> ML
ML --> D["Channel D<br/>(output)"]
end
CRAM["Chip RAM"] --> A
CRAM --> B
CRAM --> C
D --> CRAM
style ML fill:#fff9c4,stroke:#f9a825
style CRAM fill:#e8f4fd,stroke:#2196f3
```
The **Minterm Logic** block is the Blitter's core innovation. It takes the current bit from channels A, B, and C (three Boolean inputs) and produces one output bit for channel D according to a programmable **8-bit truth table** stored in BLTCON0 bits 7–0. Since 3 inputs have 8 possible combinations (2³), the 8-bit minterm encodes **any** Boolean function of three variables: that is 256 possible logic operations in a single register write. This is what lets one piece of hardware do copies (`D=A`, minterm `$F0`), clears (`D=0`, minterm `$00`), cookie-cut compositing (`D=A·B+¬A·C`, minterm `$CA`), XOR highlighting (`D=A⊕C`, minterm `$5A`), and any other combination, all by changing only the 8-bit minterm value. See [Minterm Logic](#minterm-logic) below for the full truth table and common values.
Each channel reads (or writes, for D) from a different memory pointer with independent modulo, allowing operations on sub-rectangles within larger bitmaps. **Writing to `BLTSIZE` ($DFF058) starts the blit immediately** — always configure all other registers first.
### Channel Roles
| Channel | DMA Direction | Typical Role | Has Shift? | Has Mask? |
|---|---|---|---|---|
| **A** | Read | Mask, cookie shape, font glyph, line texture | Yes (ASH, 0–15 px) | Yes (BLTAFWM/BLTALWM) |
| **B** | Read | Source image data | Yes (BSH, 0–15 px) | No |
| **C** | Read | Background / destination read-back | No | No |
| **D** | Write | Output destination | No | No |
> [!NOTE]
> Any channel can be disabled per operation via BLTCON0 bits 11–8 (USEA/B/C/D). Disabling unused channels **saves DMA cycles**: a D-only clear needs far fewer bus cycles per word than a full ABCD blit (roughly half, going by the HRM's blitter cycle table).
### CPU / Blitter Bus Interaction
The Blitter and the 68000 CPU share the **Chip RAM bus** — they cannot access it simultaneously. Agnus arbitrates access on a cycle-by-cycle basis:
```
┌────────────────────────────────────────────────────────────┐
│ Chip RAM Bus (16-bit) │
├──────────┬──────────┬──────────┬──────────┬────────────────┤
│ Bitplane │ Sprite │ Copper │ Blitter │ CPU (left- │
│ DMA │ DMA │ DMA │ DMA │ over slots) │
├──────────┴──────────┴──────────┴──────────┴────────────────┤
│ Fixed priority (high → low) │
└────────────────────────────────────────────────────────────┘
```
- **Without `BLTPRI`**: The Blitter still takes most free DMA slots, but Agnus hands the CPU a slot once it has been held off for three consecutive bus cycles, so code touching Chip RAM keeps running at reduced speed.
- **With `BLTPRI` (nasty mode)**: The Blitter takes **all** free DMA slots. The CPU is completely frozen on any Chip RAM access until the blit completes. The CPU can still execute from Fast RAM or ROM — but any Chip RAM read/write stalls.
- **Display DMA always wins**: Bitplane, sprite, and audio DMA have fixed priority above the Blitter. In high-resolution modes, display DMA alone consumes most of the bus, leaving few slots for blitter operations.
### Chip RAM vs. Fast RAM
The Blitter is physically wired to the Chip RAM bus inside Agnus. It has **no connection** to the Fast RAM (Zorro) bus:
| Memory Type | Blitter Access? | CPU Access? | Notes |
|---|---|---|---|
| **Chip RAM** (first 512 KB–2 MB) | ✓ Yes | ✓ Yes (contended) | Screen buffers, audio, sprites, all DMA-visible data |
| **Fast RAM** (Zorro II/III) | ✗ No | ✓ Yes (uncontended) | Code, variables, non-DMA data |
| **ROM** ($F80000–$FFFFFF) | ✗ No | ✓ Yes | Kickstart, libraries |
This creates the key optimization opportunity on accelerated Amigas (A1200, A3000, A4000): **the CPU can execute code and access Fast RAM at full speed while the Blitter simultaneously works on Chip RAM**. On a stock A500 with only Chip RAM, the CPU and Blitter always contend for the same bus.
> [!IMPORTANT]
> There is no hardware error when blitter pointer registers are loaded with Fast RAM addresses. Agnus implements only the address bits needed for Chip RAM (512 KB to 2 MB, depending on the revision) and ignores the rest, so the address wraps into Chip RAM space and the blit silently corrupts whatever lives at the resulting location.
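
The wrap-around can be pictured as a simple address mask. The sketch below is a simplified model under stated assumptions, not a description of the exact silicon: it assumes the Chip RAM size is a power of two and that bit 0 of a blitter pointer is ignored because the Blitter works on whole words. The function name `blitter_effective_addr` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (assumption): only the address bits that fit the Chip RAM
   size are kept, and bit 0 is dropped because blitter pointers are word-aligned. */
uint32_t blitter_effective_addr(uint32_t ptr, uint32_t chip_ram_bytes)
{
    /* chip_ram_bytes must be a power of two, e.g. 512 KB, 1 MB or 2 MB. */
    assert(chip_ram_bytes != 0 && (chip_ram_bytes & (chip_ram_bytes - 1)) == 0);
    return ptr & (chip_ram_bytes - 1) & ~(uint32_t)1;
}
```

Under this model a pointer to the start of Zorro II Fast RAM at `$200000` lands on Chip RAM address 0 on a 512 KB machine, which is where the exception vectors live on a 68000.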
## Minterm Logic
The minterm is an **8-bit value** stored in BLTCON0 (bits 7–0) that tells the Blitter what to do with each pixel. Think of it as a tiny program: for every pixel position, the Blitter reads the current bit from channels A, B, and C, looks up the answer in the minterm, and writes that answer to channel D (destination memory).
Since there are 3 inputs (A, B, C), each either 0 or 1, there are exactly **8 possible input combinations**. The 8-bit minterm has one bit for each combination — that bit decides whether the output pixel is on (1) or off (0):
| Minterm Bit | Input A (mask) | Input B (source) | Input C (background) | "If these inputs look like this…" |
|---|---|---|---|---|
| Bit 7 | 1 | 1 | 1 | …mask on, source on, background on |
| Bit 6 | 1 | 1 | 0 | …mask on, source on, background off |
| Bit 5 | 1 | 0 | 1 | …mask on, source off, background on |
| Bit 4 | 1 | 0 | 0 | …mask on, source off, background off |
| Bit 3 | 0 | 1 | 1 | …mask off, source on, background on |
| Bit 2 | 0 | 1 | 0 | …mask off, source on, background off |
| Bit 1 | 0 | 0 | 1 | …mask off, source off, background on |
| Bit 0 | 0 | 0 | 0 | …mask off, source off, background off |
Each bit is a simple yes/no: **"should the output pixel be on for this combination?"**
### Worked Example: Cookie-Cut (`$CA`)
The most important minterm is `$CA` — the cookie-cut blit used for sprite compositing. In binary, `$CA` = `11001010`. Let's read each bit:
| Bit | A (mask) | B (source) | C (background) | `$CA` bit value | Output pixel | Why |
|---|---|---|---|---|---|---|
| 7 | on | on | on | **1** | **on** | Inside the shape, source pixel is on → show it |
| 6 | on | on | off | **1** | **on** | Inside the shape, source pixel is on → show it |
| 5 | on | off | on | **0** | **off** | Inside the shape, source pixel is off → show it (it's dark) |
| 4 | on | off | off | **0** | **off** | Inside the shape, source pixel is off → show it |
| 3 | off | on | on | **1** | **on** | Outside the shape → keep background (it's on) |
| 2 | off | on | off | **0** | **off** | Outside the shape → keep background (it's off) |
| 1 | off | off | on | **1** | **on** | Outside the shape → keep background (it's on) |
| 0 | off | off | off | **0** | **off** | Outside the shape → keep background (it's off) |
The pattern: **where the mask (A) is set → take the source pixel (B). Where the mask is clear → keep the background pixel (C).** That's a sprite draw with transparency — exactly what every Amiga game uses.
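
The truth-table lookup can be modelled in software by treating each set minterm bit as one AND term over A, B, C or their complements. The sketch below applies a minterm to 16 pixels at once. `blit_logic` is an illustrative name, and the code models only the logic unit: shifting, masking and DMA are left out.

```c
#include <stdint.h>

/* Software model of the logic unit: apply an 8-bit minterm to 16 pixels at once. */
uint16_t blit_logic(uint8_t minterm, uint16_t a, uint16_t b, uint16_t c)
{
    uint16_t d = 0;
    for (int bit = 0; bit < 8; bit++) {
        if (!(minterm & (1u << bit)))
            continue;                       /* this truth-table row outputs 0 */
        /* Pick each input or its complement according to the row number. */
        uint16_t ta = (bit & 4) ? a : (uint16_t)~a;
        uint16_t tb = (bit & 2) ? b : (uint16_t)~b;
        uint16_t tc = (bit & 1) ? c : (uint16_t)~c;
        d |= (uint16_t)(ta & tb & tc);
    }
    return d;
}
```

With minterm `$CA`, a mask of `$FF00`, a source of `$0F0F` and a background of `$3333`, the left byte of the result comes from the source and the right byte from the background, giving `$0F33`.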
### Common Minterms
| Minterm | Hex | Operation | Description | Real-World Use Case |
|---|---|---|---|---|
| `D = A` | `$F0` | Copy A | Output is a copy of channel A — every A-set pixel appears in D | **Block copy**: duplicate a screen region, copy a font glyph to the display |
| `D = B` | `$CC` | Copy B | Output is a copy of channel B regardless of A and C | **Shifted copy**: B has a barrel shift, so this copies with pixel-level repositioning |
| `D = C` | `$AA` | Copy C | Output is a copy of the destination read-back | **No-op / readback**: useful for fill mode where C→D with fill carry toggling |
| `D = A·B + ¬A·C` | `$CA` | Cookie-cut | Where mask (A) is 1: show source (B). Where mask is 0: show background (C) | **Sprite compositing**: draw a player character with transparency onto the game world |
| `D = 0` | `$00` | Clear | Output is always 0 regardless of inputs | **Screen clear**: zero out a bitplane, erase a region |
| `D = $FFFF` | `$FF` | Set all | Output is always 1 | **Fill with 1s**: set all pixels in a region (useful for masks) |
| `D = A XOR C` | `$5A` | XOR | Output toggles wherever A has a set bit | **Cursor blink**: XOR the cursor shape to toggle it on/off without saving background |
| `D = A OR C` | `$FA` | OR | Output is set wherever either A or C has a set bit | **Overlay**: stamp a shape onto the background without erasing existing pixels |
| `D = ¬A AND C` | `$0A` | Mask out | Output keeps C pixels only where A is clear — erases through the mask | **Erase shape**: cut a hole in the background matching the mask shape (first pass of two-pass sprite draw) |
| `D = A AND B` | `$C0` | AND | Output is set only where both A and B are set | **Masked pattern**: apply a fill pattern (B) clipped to a shape (A) |
| `D = A XOR B` | `$3C` | XOR (A,B) | Output is set wherever A and B differ | **Difference detection**: find which pixels changed between two frames |
| `D = NOT A` | `$0F` | Invert | Output is the bitwise complement of A | **Mask inversion**: generate a negative mask from a positive one |
### Cookie-Cut Explained
```
A = mask (1 = sprite pixel, 0 = transparent)
B = sprite image data
C = background
D = result
Minterm $CA:
Where A=1: D = B (show sprite)
Where A=0: D = C (show background)
```
---
## Register Reference
| Address | Name | R/W | Description |
|---------|------|-----|-------------|
| `$DFF040` | BLTCON0 | W | Control: ASH (bits 15–12), channel enables (bits 11–8), minterm (bits 7–0) |
| `$DFF042` | BLTCON1 | W | Control: BSH (bits 15–12), fill/line mode (bits 4–0) |
| `$DFF044` | BLTAFWM | W | First word mask for channel A |
| `$DFF046` | BLTALWM | W | Last word mask for channel A |
| `$DFF048` | BLTCPTH/L | W | Channel C pointer (32-bit) |
| `$DFF04C` | BLTBPTH/L | W | Channel B pointer (32-bit) |
| `$DFF050` | BLTAPTH/L | W | Channel A pointer (32-bit) |
| `$DFF054` | BLTDPTH/L | W | Channel D pointer (32-bit) |
| `$DFF058` | BLTSIZE | W | Blit dimensions + **START** (write triggers blit!) |
| `$DFF05C` | BLTSIZV | W | Blit height, **ECS/AGA only** (15-bit, up to 32768 lines) |
| `$DFF05E` | BLTSIZH | W | Blit width + START, **ECS/AGA only** (11-bit, up to 2048 words) |
| `$DFF060` | BLTCMOD | W | Channel C modulo (bytes to skip per row) |
| `$DFF062` | BLTBMOD | W | Channel B modulo |
| `$DFF064` | BLTAMOD | W | Channel A modulo |
| `$DFF066` | BLTDMOD | W | Channel D modulo |
| `$DFF070` | BLTCDAT | W | Channel C data register (preload) |
| `$DFF072` | BLTBDAT | W | Channel B data register (preload) |
| `$DFF074` | BLTADAT | W | Channel A data register (preload / line texture) |
| `$DFF002` | DMACONR | R | DMA status — bit 14 (BBUSY) = blitter busy |
### BLTCON0 Encoding
```
Bits 15–12: ASH - A channel barrel shift (0–15 pixels right)
Bit 11: USEA - enable channel A DMA
Bit 10: USEB - enable channel B DMA
Bit 9: USEC - enable channel C DMA
Bit 8: USED - enable channel D DMA (almost always 1)
Bits 7–0: LF - minterm (logic function truth table)
```
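
The bit layout above can be packed with a small helper, which is how constants such as `$09F0` and `$0FCA` in the examples below come about. This is a minimal sketch: `make_bltcon0` and the enum are illustrative names, not definitions taken from the NDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Channel-enable bits of BLTCON0 (bit positions from the layout above). */
enum { USEA = 0x0800, USEB = 0x0400, USEC = 0x0200, USED = 0x0100 };

/* Pack A shift (0-15), channel enables and minterm into a BLTCON0 value. */
uint16_t make_bltcon0(unsigned ash, unsigned channels, uint8_t minterm)
{
    assert(ash <= 15);
    assert((channels & ~0x0F00u) == 0);   /* only bits 11-8 may be set */
    return (uint16_t)((ash << 12) | channels | minterm);
}
```

For instance, `make_bltcon0(0, USEA | USED, 0xF0)` yields `$09F0`, the plain A-to-D copy.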
### BLTCON1 Encoding
```
Bits 15–12: BSH - B channel barrel shift (0–15 pixels right)
Bit 4: EFE - exclusive fill enable
Bit 3: IFE - inclusive fill enable
Bit 2: FCI - fill carry input (initial state)
Bit 1: DESC - descending mode (blit bottom-right → top-left)
Bit 0: LINE - line draw mode
```
### BLTSIZE Encoding (OCS/ECS)
```
Bits 15–6: Height in lines (1–1024, 0 = 1024)
Bits 5–0: Width in words (1–64, 0 = 64)
```
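
Because the maximum height and width are encoded as 0, a packing helper only needs to mask each value to its field width. The sketch below assumes the OCS/ECS `BLTSIZE` layout shown above; `make_bltsize` is an illustrative name.

```c
#include <assert.h>
#include <stdint.h>

/* Pack blit height (1-1024 lines) and width (1-64 words) into a BLTSIZE value. */
uint16_t make_bltsize(unsigned height_lines, unsigned width_words)
{
    assert(height_lines >= 1 && height_lines <= 1024);
    assert(width_words >= 1 && width_words <= 64);
    /* 1024 and 64 do not fit their fields; masking wraps them to 0,
       which is the encoding the hardware uses for the maximum. */
    return (uint16_t)(((height_lines & 0x3FF) << 6) | (width_words & 0x3F));
}
```

A 320×256 single-bitplane clear uses `make_bltsize(256, 20)`, which is `$4014`.
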
> [!WARNING]
> **Writing BLTSIZE starts the blit!** Always configure all other registers (pointers, modulos, control, masks) before writing BLTSIZE. On ECS/AGA, write BLTSIZV first, then BLTSIZH (which triggers the blit).
### Ascending vs. Descending Mode
When source and destination overlap in memory, the blit direction determines whether data is corrupted:
```
Ascending (default, DESC=0):
Reads/writes top-left → bottom-right
Use when: dest address < source address (or the blocks do not overlap)
Descending (DESC=1):
Reads/writes bottom-right → top-left
Use when: dest address > source address
Pointers must be set to the LAST word of the block
Modulos are subtracted instead of added
```
This is critical for **scrolling** — shifting the screen contents by a few pixels requires an overlapping copy, and using the wrong direction produces garbage.
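
The choice of direction follows the same reasoning as `memmove`: a copy must never overwrite source words it has not read yet. The sketch below expresses that test for one contiguous block. It is a simplification, since real blits with modulos cover rectangles rather than linear ranges, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when an overlapping copy must run in descending mode (DESC=1).
   src, dst and len are byte quantities describing one contiguous block. */
bool blit_needs_descending(uint32_t src, uint32_t dst, uint32_t len)
{
    /* Only a destination that starts inside the source block, above its
       first byte, would overwrite source words before they are read. */
    return dst > src && dst < src + len;
}

/* In descending mode the pointer registers take the address of the last word. */
uint32_t last_word_addr(uint32_t start, uint32_t len)
{
    assert(len >= 2 && (len & 1) == 0);   /* whole words only */
    return start + len - 2;
}
```

A copy from `Screen+2` to `Screen` (scroll left) returns false and can run ascending, while the reverse copy (scroll right) returns true.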
### Shift and Alignment
The Blitter is a **word-aligned** (16-bit) processor. Moving objects to arbitrary pixel positions requires the barrel shifter:
- **ASH** (channel A shift) and **BSH** (channel B shift) shift data 0–15 pixels to the right
- A rectangle N pixels wide at a non-aligned X position spans `⌈(N + shift) / 16⌉` words, which can be one more than the aligned case
- **BLTAFWM** (first word mask) and **BLTALWM** (last word mask) prevent the shifted data from corrupting pixels outside the target area
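
The word-count formula from the list above can be checked with a few lines of integer arithmetic. This is a small sketch with illustrative function names. Note that many BOB routines skip the calculation and always blit one extra word.

```c
#include <assert.h>

/* Pixel shift for the barrel shifter: position inside the 16-pixel word. */
unsigned blit_shift(unsigned x)
{
    return x & 15;
}

/* Words per row needed to cover width_px pixels starting at pixel column x:
   ceil((shift + width_px) / 16), computed with integer arithmetic. */
unsigned blit_width_words(unsigned x, unsigned width_px)
{
    assert(width_px > 0);
    return (blit_shift(x) + width_px + 15) / 16;
}
```
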
---
## Complete Examples
### Example 1: Clear Screen (320×256, 1 bitplane)
```asm
lea $DFF000,a5
; Wait for blitter idle:
.bwait:
btst #14,$002(a5) ; DMACONR bit 14 = BBUSY
bne.s .bwait
; D channel only, minterm $00 (clear):
move.l #$01000000,$040(a5) ; BLTCON0: USED=1, minterm=$00
clr.w $042(a5) ; BLTCON1: 0
move.l #ScreenMem,$054(a5) ; BLTDPT
clr.w $066(a5) ; BLTDMOD: 0 (contiguous)
move.w #(256<<6)|20,$058(a5) ; BLTSIZE: 256 lines × 20 words (320/16)
; Blit is now running!
```
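
The same register sequence can be written in C. The sketch below is an illustration rather than production code: `blit_clear` is a hypothetical helper that takes the register base as a parameter so it can be exercised against a plain array on any host. On real hardware the base would be `(volatile uint16_t *)0xDFF000`, and system-friendly code would bracket the call with `OwnBlitter()` and `DisownBlitter()`.

```c
#include <stdint.h>

/* Byte offsets from the custom-chip base ($DFF000), as in the listing above. */
enum { DMACONR = 0x002, BLTCON0 = 0x040, BLTCON1 = 0x042,
       BLTDPTH = 0x054, BLTDPTL = 0x056, BLTSIZE = 0x058, BLTDMOD = 0x066 };
#define BBUSY 0x4000u   /* DMACONR bit 14 */

/* Clear height_lines x width_words words starting at Chip RAM address dst.
   custom points at the register block, viewed as 16-bit words. */
void blit_clear(volatile uint16_t *custom, uint32_t dst,
                unsigned height_lines, unsigned width_words)
{
    while (custom[DMACONR / 2] & BBUSY)
        ;                                      /* wait for any previous blit */
    custom[BLTCON0 / 2] = 0x0100;              /* USED only, minterm $00 */
    custom[BLTCON1 / 2] = 0;
    custom[BLTDPTH / 2] = (uint16_t)(dst >> 16);
    custom[BLTDPTL / 2] = (uint16_t)dst;
    custom[BLTDMOD / 2] = 0;                   /* contiguous rows */
    /* BLTSIZE goes last: this write starts the blit. */
    custom[BLTSIZE / 2] = (uint16_t)((height_lines << 6) | width_words);
}
```

Writing `BLTSIZE` last mirrors the assembly version and keeps the start trigger separate from the setup writes.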
### Example 2: Block Copy (No Shift)
```asm
; Copy 64×64 pixel block from source to dest (1 bitplane)
; Source and dest are in contiguous bitmap, 320 pixels wide
; Width = 64 pixels = 4 words
; Modulo = (320 - 64) / 16 = 16 words = 32 bytes
lea $DFF000,a5
.bwait:
btst #14,$002(a5)
bne.s .bwait
move.l #$09F00000,$040(a5) ; BLTCON0: USEA+USED, minterm=$F0 (A→D)
clr.w $042(a5) ; BLTCON1
move.w #$FFFF,$044(a5) ; BLTAFWM = all bits
move.w #$FFFF,$046(a5) ; BLTALWM = all bits
move.l #SourceAddr,$050(a5) ; BLTAPT
move.l #DestAddr,$054(a5) ; BLTDPT
move.w #32,$064(a5) ; BLTAMOD = 32 bytes
move.w #32,$066(a5) ; BLTDMOD = 32 bytes
move.w #(64<<6)|4,$058(a5) ; BLTSIZE: 64 lines × 4 words GO!
```
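
The modulo arithmetic in the comments above generalizes to any bitmap and blit width: it is the row size in bytes minus the bytes the blit actually touches per row. A minimal sketch, with `blit_modulo_bytes` as an illustrative name:

```c
#include <assert.h>

/* Bytes to skip at the end of each row so the pointer lands on the next row.
   A negative result is valid: it makes the Blitter step back, which is used
   when the blit is wider than the source data. */
int blit_modulo_bytes(int bitmap_width_px, int blit_width_words)
{
    assert(bitmap_width_px > 0 && bitmap_width_px % 16 == 0);
    assert(blit_width_words > 0);
    return bitmap_width_px / 8 - blit_width_words * 2;
}
```

For the 64-pixel block on a 320-pixel bitmap this gives 32 bytes, matching the value loaded into `BLTAMOD` and `BLTDMOD`.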
### Example 3: Cookie-Cut Blit (Masked Sprite)
```asm
; Blit a 16×16 masked sprite onto background
; A = mask, B = sprite data, C = background, D = destination
lea $DFF000,a5
.bwait:
btst #14,$002(a5)
bne.s .bwait
move.l #$0FCA0000,$040(a5) ; BLTCON0: A+B+C+D, minterm=$CA
clr.w $042(a5) ; BLTCON1
move.w #$FFFF,$044(a5) ; BLTAFWM
move.w #$FFFF,$046(a5) ; BLTALWM
move.l #MaskData,$050(a5) ; BLTAPT = mask
move.l #SpriteData,$04C(a5) ; BLTBPT = sprite imagery
move.l #ScreenPos,$048(a5) ; BLTCPT = background (read-back)
move.l #ScreenPos,$054(a5) ; BLTDPT = same as C (overwrite)
clr.w $064(a5) ; BLTAMOD = 0 (mask is 16px = 1 word wide)
clr.w $062(a5) ; BLTBMOD = 0
move.w #38,$060(a5) ; BLTCMOD = (320-16)/8 = 38 bytes
move.w #38,$066(a5) ; BLTDMOD = 38
move.w #(16<<6)|1,$058(a5) ; BLTSIZE: 16 lines × 1 word GO!
```
### Example 4: Line Drawing
```asm
; Draw a line from (x1,y1) to (x2,y2) using blitter line mode
; This is complex — blitter line mode uses a Bresenham-style algorithm
; implemented in hardware
; BLTCON1 bit 0 = LINE mode
; Channel A = single word (texture pattern)
; Channel C/D = destination bitmap
; See HRM for the full algorithm; here's the concept:
move.l #$0B4A0000,$040(a5) ; BLTCON0: A+C+D, minterm=$4A (XOR); OR in ASH = x1 mod 16 (bits 15–12)
move.w #$0001,$042(a5) ; BLTCON1: LINE=1; OR in the octant bits (4–2) for the slope
move.w #$8000,$074(a5) ; BLTADAT: single pixel pattern
move.w #$FFFF,$044(a5) ; BLTAFWM
move.l #StartPos,$048(a5) ; BLTCPT: line start position in bitmap
move.l #StartPos,$054(a5) ; BLTDPT: same
move.w #Modulo,$060(a5) ; BLTCMOD
move.w #Modulo,$066(a5) ; BLTDMOD
move.w #(len<<6)|2,$058(a5) ; BLTSIZE: length × 2 GO!
```
---
## Advanced Use Cases & Cookbook
### Use Case 1: Shifted BOB (Sprite at Arbitrary X Position)
The most common real-world blitter task: draw a 16×16 sprite at pixel position (x, y) on a 320-pixel-wide screen. Since x may not be word-aligned, the barrel shifter handles sub-word positioning:
```asm
; Draw 16×16 BOB at pixel (x, y) on a 320px wide screen
; Inputs: d0.w = x position, d1.w = y position
; a0 = mask data, a1 = sprite data, a2 = screen base
lea $DFF000,a5
; Calculate screen byte offset:
move.w d1,d2
mulu #40,d2 ; y × 40 bytes/row (320 pixels / 8)
move.w d0,d3
lsr.w #3,d3 ; x / 8 = byte offset in row
and.w #$FFFE,d3 ; word-align (drop bit 0)
add.w d3,d2 ; total byte offset into screen
lea (a2,d2.w),a3 ; a3 = screen pointer for this BOB
; Calculate shift amount:
move.w d0,d3
and.w #$000F,d3 ; shift = x mod 16 (0–15 pixels)
ror.w #4,d3 ; move to bits 15–12
move.w d3,d4 ; d4 = BLTCON1: BSH must match ASH so image and mask stay aligned
or.w #$0FCA,d3 ; d3 = BLTCON0: ASH + channels A+B+C+D + minterm $CA
.bwait:
btst #14,$002(a5)
bne.s .bwait
move.w d3,$040(a5) ; BLTCON0: shift + channels + minterm
move.w d4,$042(a5) ; BLTCON1: BSH = same shift, ascending, no fill
move.w #$FFFF,$044(a5) ; BLTAFWM: all bits in first word
move.w #$0000,$046(a5) ; BLTALWM: mask off last word (shift overflow)
move.l a0,$050(a5) ; BLTAPT = mask
move.l a1,$04C(a5) ; BLTBPT = sprite imagery
move.l a3,$048(a5) ; BLTCPT = background read-back
move.l a3,$054(a5) ; BLTDPT = write back to same position
move.w #-2,$064(a5) ; BLTAMOD = -2 (mask rows are 1 word, but the blit fetches 2)
move.w #-2,$062(a5) ; BLTBMOD = -2 (same for the sprite data)
move.w #36,$060(a5) ; BLTCMOD = 40 - (2 words × 2) = 36 bytes
move.w #36,$066(a5) ; BLTDMOD = 36
move.w #(16<<6)|2,$058(a5) ; BLTSIZE: 16 lines × 2 words (1 extra for shift) GO!
```
**Key insight**: the blit is 2 words wide even though the sprite is only 16 pixels (1 word). The barrel shift pushes bits into the second word, so we need that extra word — and `BLTALWM=$0000` masks it so we don't corrupt adjacent pixels.
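
The address and shift arithmetic of this routine can be summarized in C. The sketch below is a hedged illustration: `struct bob_setup` and `bob_setup_for` are hypothetical names, and it assumes that mask and image are shifted by the same amount, so the B shift in `BLTCON1` mirrors the A shift in `BLTCON0`.

```c
#include <stdint.h>

/* Hypothetical container for the values a shifted cookie-cut blit needs. */
struct bob_setup {
    uint32_t offset;   /* byte offset of the first destination word */
    uint16_t bltcon0;  /* ASH | USEA..USED | minterm $CA */
    uint16_t bltcon1;  /* BSH, assumed equal to the A shift */
};

struct bob_setup bob_setup_for(unsigned x, unsigned y, unsigned row_bytes)
{
    struct bob_setup s;
    unsigned shift = x & 15;                    /* pixels into the 16-bit word */
    s.offset  = y * row_bytes + (x >> 4) * 2;   /* word-aligned byte offset */
    s.bltcon0 = (uint16_t)((shift << 12) | 0x0FCA);
    s.bltcon1 = (uint16_t)(shift << 12);
    return s;
}
```

For a BOB at pixel (21, 10) on a 40-byte-wide screen this gives byte offset 402, `BLTCON0 = $5FCA` and `BLTCON1 = $5000`.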
### Use Case 2: Hardware Scroll (Left by N Pixels)
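Scrolling the screen left copies each word to a lower address than its source. Because the destination trails the source, **ascending mode** is safe here; scrolling right reverses the relationship and requires **descending mode** to avoid overwriting source data before it is read: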
```asm
; Scroll 320×256 screen left by 16 pixels (1 word = fastest case)
; Source: screen + 2 bytes (one word right)
; Dest: screen base
; No shift needed for 16-pixel increments
lea $DFF000,a5
.bwait:
btst #14,$002(a5)
bne.s .bwait
move.l #$09F00000,$040(a5) ; BLTCON0: A+D, minterm $F0 (copy)
clr.w $042(a5) ; BLTCON1: ascending (dest < source is OK here)
move.w #$FFFF,$044(a5) ; BLTAFWM
move.w #$FFFF,$046(a5) ; BLTALWM
move.l #Screen+2,$050(a5) ; BLTAPT: source is 1 word to the right
move.l #Screen,$054(a5) ; BLTDPT: destination is screen start
clr.w $064(a5) ; BLTAMOD = 0 (full-width rows)
clr.w $066(a5) ; BLTDMOD = 0
move.w #(256<<6)|20,$058(a5) ; BLTSIZE: 256 lines × 20 words GO!
; After blit: draw new column at right edge (column 19)
```
For sub-word scrolling (1–15 pixels), combine this with the barrel shifter and draw the new edge column from tile data.
### Use Case 3: Area Fill (Filled Polygon)
The blitter's fill mode is a two-step process: (1) draw the polygon outline with XOR lines, (2) fill the region. This is how games like *Carrier Command* and *Starglider 2* achieved real-time filled 3D:
```asm
; Step 1: Draw polygon edges using blitter line mode (XOR, single-bit)
; (Repeat for each edge of the polygon)
; Use minterm $4A (A XOR C) and BLTCON1 bit 0 = LINE, bit 1 = SING
; Step 2: Fill the outlined region
; Fill works RIGHT-TO-LEFT, BOTTOM-TO-TOP — requires descending mode
; Pointers must point to the LAST word of the bitmap region
lea $DFF000,a5
.bwait:
btst #14,$002(a5)
bne.s .bwait
; Set up inclusive fill (IFE):
move.l #$09F00000,$040(a5) ; BLTCON0: A+D, minterm $F0 (copy with fill)
move.w #$000A,$042(a5) ; BLTCON1: DESC=1 (bit 1), IFE=1 (bit 3)
; IFE = inclusive fill enable
move.w #$FFFF,$044(a5) ; BLTAFWM
move.w #$FFFF,$046(a5) ; BLTALWM
; Pointers to LAST word of the fill region (descending!):
move.l #FillBufferEnd,$050(a5) ; BLTAPT: last word of source
move.l #FillBufferEnd,$054(a5) ; BLTDPT: last word of dest (same buffer)
clr.w $064(a5) ; BLTAMOD = 0
clr.w $066(a5) ; BLTDMOD = 0
move.w #(Height<<6)|Width,$058(a5) ; BLTSIZE GO!
```
**How it works**: the fill carry starts each row at the value of `FCI` (fill carry in) and toggles on every set pixel as the row is scanned right to left. Between two outline pixels on the same scanline the carry stays on, filling the interior. This is why the outline must use **single-bit mode** (SING=1): if an edge leaves more than one pixel on a row, the toggle count goes wrong and the fill leaks along that row.
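
The carry toggle can be modelled in software. The following is a sketch of inclusive fill on a single 16-bit word, assuming bit 0 is the rightmost pixel; `inclusive_fill_word` is a hypothetical name and the code is a model of the behaviour, not a description of the hardware implementation:

```c
#include <stdint.h>

/* Software model of inclusive fill for one word, scanned right to left.
 * *carry is the fill carry on entry and on exit, so consecutive words of
 * a row can be chained from the rightmost word to the leftmost. */
static uint16_t inclusive_fill_word(uint16_t in, unsigned *carry)
{
    uint16_t out = in;

    for (unsigned bit = 0; bit < 16; bit++) {
        uint16_t mask = (uint16_t)(1u << bit);
        if (*carry)
            out |= mask;        /* inside the outline: set the pixel */
        if (in & mask)
            *carry ^= 1u;       /* outline pixel: toggle the carry */
    }
    return out;
}
```

With the carry cleared, an input of `$0810` (outline pixels at bits 4 and 11) produces `$0FF0`: both boundary pixels and everything between them are set, and the carry is off again when the word ends.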
### Use Case 4: Interleaved Bitplane BOBs
Standard bitplane layout stores all of plane 0, then all of plane 1, etc. **Interleaved** layout stores one row of plane 0, then one row of plane 1, alternating. This allows a single blit to draw a BOB across all bitplanes at once:
```asm
; Interleaved screen layout:
; Row 0, Plane 0 (40 bytes)
; Row 0, Plane 1 (40 bytes)
; Row 0, Plane 2 (40 bytes)
; Row 0, Plane 3 (40 bytes)
; Row 0, Plane 4 (40 bytes)
; Row 1, Plane 0 (40 bytes)
; ...
; Blit a 16×16 cookie-cut BOB across all 5 bitplanes in ONE operation:
; Height = 16 lines × 5 planes = 80 rows
; Modulo = 40 - 2 = 38 bytes per interleaved row (skip rest of scanline row)
; BOB data is also stored interleaved
lea $DFF000,a5
.bwait:
btst #14,$002(a5)
bne.s .bwait
move.l #$0FCA0000,$040(a5) ; BLTCON0: A+B+C+D, minterm $CA
clr.w $042(a5) ; BLTCON1
move.w #$FFFF,$044(a5) ; BLTAFWM
move.w #$FFFF,$046(a5) ; BLTALWM
move.l #BOBMask,$050(a5) ; BLTAPT (interleaved mask: same mask for all planes)
move.l #BOBData,$04C(a5) ; BLTBPT (interleaved sprite data)
move.l a3,$048(a5) ; BLTCPT (screen position)
move.l a3,$054(a5) ; BLTDPT (same)
clr.w $064(a5) ; BLTAMOD = 0 (mask repeats)
clr.w $062(a5) ; BLTBMOD = 0
move.w #38,$060(a5) ; BLTCMOD = 38 (skip to next interleaved row)
move.w #38,$066(a5) ; BLTDMOD = 38
move.w #(80<<6)|1,$058(a5) ; BLTSIZE: 80 rows (16×5) × 1 word GO!
```
**Why this matters**: without interleaving, drawing one BOB on a 5-plane screen requires **5 separate blits** (one per plane), each with its own WaitBlit + register setup overhead. Interleaving does it in **1 blit** — 5× less setup time, critical when drawing 15+ BOBs per frame.
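
The two layouts differ only in how a (row, plane) pair maps to a byte offset. A small sketch under the assumptions used above (contiguous planes for the planar case, fixed bytes per row); the function names are illustrative:

```c
#include <stddef.h>

/* Planar: all rows of plane 0, then all rows of plane 1, and so on. */
static size_t planar_offset(unsigned x_byte, unsigned y, unsigned plane,
                            unsigned bytes_per_row, unsigned height)
{
    return ((size_t)plane * height + y) * bytes_per_row + x_byte;
}

/* Interleaved: row 0 of every plane, then row 1 of every plane, and so on. */
static size_t interleaved_offset(unsigned x_byte, unsigned y, unsigned plane,
                                 unsigned bytes_per_row, unsigned depth)
{
    return ((size_t)y * depth + plane) * bytes_per_row + x_byte;
}

/* Height to program for a single all-planes blit on an interleaved bitmap. */
static unsigned interleaved_blit_rows(unsigned bob_height, unsigned depth)
{
    return bob_height * depth;
}
```

On a 5-plane, 40-byte-wide interleaved screen, row 1 of plane 0 starts at offset 200, and a 16-line BOB becomes an 80-row blit.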
### Use Case 5: Double-Buffered Game Loop
The standard pattern for flicker-free game rendering:
```asm
MainLoop:
; --- Wait for vertical blank ---
bsr WaitVBL ; Wait for beam to reach line 0
; --- Swap display buffer ---
; Copper list points to the currently visible buffer
; We draw into the hidden back buffer
move.l BackBuffer,a0
move.l FrontBuffer,a1
move.l a0,FrontBuffer ; Back buffer becomes front (display)
move.l a1,BackBuffer ; Old front becomes new back (draw target)
; Update Copper list bitplane pointers to show new front buffer:
bsr UpdateCopperBPLPTRs
; --- Clear back buffer ---
bsr WaitBlit
move.l #$01000000,$040(a5) ; D-only, minterm $00
clr.w $042(a5)
move.l a1,$054(a5) ; BLTDPT = back buffer
clr.w $066(a5)
move.w #(256<<6)|20,$058(a5) ; Clear 320×256 GO!
; --- Draw all BOBs ---
; CPU can process game logic while the clear blit runs!
bsr UpdateGameLogic ; Physics, AI, input — runs on CPU
bsr WaitBlit ; Wait for clear to finish
bsr DrawAllBOBs ; Chain of cookie-cut blits
bra MainLoop
```
**Key optimization**: `UpdateGameLogic` runs on the CPU *while* the screen clear runs on the Blitter. This is the core of the Amiga's parallelism — ~1.5 ms of free CPU time per frame from a single D-only clear.
### Use Case 6: GUI Window Drag (System-Friendly)
Workbench and applications use `graphics.library` for window dragging, icon rendering, and menu drawing. The OS handles Blitter synchronization:
```c
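#include <graphics/gfx.h>
#include <graphics/rastport.h>
#include <proto/graphics.h> /* prototypes for ScrollRaster(), BltBitMap(), SetAPen(), RectFill() */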
/* Scroll a window's contents up by 8 pixels (text scroll): */
ScrollRaster(rp, /* RastPort */
0, 8, /* dx=0, dy=8 (scroll up by 8 pixels) */
0, 0, /* top-left corner of scroll area */
319, 199); /* bottom-right */
/* The OS automatically uses an ascending/descending blit, sets modulos, */
/* and clears the exposed bottom strip. */
/* Copy a rectangular region between two bitmaps: */
BltBitMap(srcBM, 0, 0, /* source bitmap, x, y */
dstBM, 100, 50, /* dest bitmap, x, y */
64, 32, /* width, height */
0xC0, /* minterm $C0: plain copy, source replaces destination */
0xFF, /* all bitplanes */
NULL); /* no temp buffer needed */
/* Draw a filled rectangle (uses the Blitter internally): */
SetAPen(rp, 3); /* Set pen color to index 3 */
RectFill(rp, 10, 10, 100, 50); /* Filled rectangle */
```
### Use Case 7: Tile Map Renderer
Games like *The Settlers*, *Cannon Fodder*, and most platformers render backgrounds from tile maps. Each tile is a 16×16 (or 32×32) block blitted to screen coordinates:
```asm
; Render a 20×16 tile map (320×256 screen, 16×16 tiles)
; TileMap: array of 320 bytes (20×16), each byte = tile index
; TileGfx: tile graphics, 16×16 pixels × 5 planes, interleaved
lea TileMap,a0
lea Screen,a2
moveq #16-1,d7 ; 16 tile rows
.tilerow:
moveq #20-1,d6 ; 20 tiles per row
.tilecol:
moveq #0,d0
move.b (a0)+,d0 ; Get tile index
mulu #16*5*2,d0 ; Tile data offset (16 rows × 5 planes × 2 bytes)
lea TileGfx,a1
add.l d0,a1 ; a1 = tile graphics pointer
bsr WaitBlit
move.l #$09F00000,$040(a5) ; BLTCON0: A+D, minterm $F0 (straight copy)
clr.w $042(a5) ; BLTCON1
move.w #$FFFF,$044(a5) ; BLTAFWM
move.w #$FFFF,$046(a5) ; BLTALWM
move.l a1,$050(a5) ; BLTAPT = tile data (interleaved)
move.l a2,$054(a5) ; BLTDPT = screen position
clr.w $064(a5) ; BLTAMOD = 0 (tile data is contiguous)
move.w #38,$066(a5) ; BLTDMOD = 40 - 2 = 38 (interleaved screen)
move.w #(80<<6)|1,$058(a5) ; BLTSIZE: 80 rows (16×5) × 1 word GO!
addq.l #2,a2 ; Next tile position (1 word right)
dbf d6,.tilecol
; Move to next tile row: advance screen pointer by 16 scanlines × 5 planes × 40 bytes
add.l #16*5*40-40,a2 ; Subtract the 40 bytes already advanced by 20 tiles
dbf d7,.tilerow
```
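The address arithmetic in the loop above can be checked with a short C sketch. It assumes the same layout as the assembly (16×16 tiles, 5 interleaved planes, 40 bytes per screen row); the constant and function names are illustrative only:

```c
#include <stddef.h>

enum {
    TILE_PX = 16,               /* tile width and height in pixels */
    DEPTH = 5,                  /* interleaved bitplanes */
    SCREEN_BYTES_PER_ROW = 40   /* 320 pixels / 8 */
};

/* Byte offset of a tile's graphics inside TileGfx. */
static size_t tile_gfx_offset(unsigned tile_index)
{
    return (size_t)tile_index * TILE_PX * DEPTH * (TILE_PX / 8);
}

/* Byte offset of a tile's top-left word inside the interleaved screen. */
static size_t tile_screen_offset(unsigned col, unsigned row)
{
    return (size_t)row * TILE_PX * DEPTH * SCREEN_BYTES_PER_ROW
         + (size_t)col * (TILE_PX / 8);
}
```

Each tile occupies 160 bytes of graphics data, and consecutive tile rows start 3200 bytes apart on screen, which is the 40 bytes advanced by the column loop plus the 3160 added at the end of each row.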
---
## Good and Bad Patterns
### ✓ Pattern: "Blit and Compute" — Overlap CPU and Blitter Work
```asm
; Start a blit, then do CPU work while it runs:
bsr SetupAndStartBlit ; Triggers BLTSIZE write
bsr UpdatePlayerPhysics ; CPU work — runs in parallel!
bsr ProcessInput ; More CPU work
bsr WaitBlit ; NOW wait for blit to finish
bsr SetupNextBlit ; Safe to touch registers
```
This is the **entire point** of having a Blitter. Any code that busy-waits immediately after starting a blit wastes the Amiga's key advantage.
### ✗ Antipattern: "The Busy-Wait Hog"
```asm
; ✗ BAD: Wait immediately after every blit — wastes CPU cycles
bsr StartBlit
.wait1: btst #14,$002(a5)
bne.s .wait1 ; CPU does NOTHING while blitter runs
bsr StartNextBlit
.wait2: btst #14,$002(a5)
bne.s .wait2 ; More wasted time
```
### ✓ Pattern: "Batch Then Wait" — Chain Setup, Single Sync Point
```asm
; Process all game logic FIRST:
bsr RunAI
bsr RunPhysics
bsr AnimateFrames
; THEN start the rendering blits in sequence:
bsr WaitBlit
bsr BlitBOB1
bsr WaitBlit
bsr BlitBOB2
bsr WaitBlit
bsr BlitBOB3
; The CPU-intensive work happened during the previous frame's display time
```
### ✗ Antipattern: "The Single-Plane-At-A-Time"
```asm
; ✗ BAD: Blit each bitplane separately (5× setup overhead)
lea Plane0,a0
bsr BlitBOBOnePlane
lea Plane1,a0
bsr BlitBOBOnePlane
lea Plane2,a0
bsr BlitBOBOnePlane
lea Plane3,a0
bsr BlitBOBOnePlane
lea Plane4,a0
bsr BlitBOBOnePlane ; 5 blits, 5 WaitBlit calls, 5× register setup
```
```asm
; ✓ GOOD: Use interleaved bitplanes — ONE blit for all planes
bsr BlitBOBInterleaved ; 1 blit, 1 WaitBlit, 1× register setup
```
### ✗ Antipattern: "System-Unfriendly Direct Access"
```c
/* ✗ BAD: Hit blitter registers directly from a Workbench app */
custom.bltcon0 = 0x09F0; /* bltcon0 is a 16-bit register */
/* The OS may be using the blitter RIGHT NOW for window operations */
```
```c
/* ✓ GOOD: Use OwnBlitter/DisownBlitter for exclusive access */
OwnBlitter(); /* Wait for and lock the blitter */
WaitBlit(); /* Ensure previous blit is done */
/* ... safe to program registers directly ... */
DisownBlitter(); /* Release for OS use */
```
### ✗ Antipattern: "Hardcoded 320-Pixel Modulo"
```asm
; ✗ BAD: Assumes screen width is always 320 pixels (modulo = 40 - blit_width*2)
move.w #36,$066(a5) ; BLTDMOD = 36 (hardcoded for 320px)
```
Many Amiga programs run on PAL overscan (352 or 384 pixels), productivity modes (640+), or RTG screens. Always calculate modulo from the actual screen byte width:
```asm
; ✓ GOOD: Calculate modulo from actual bitmap width
move.w ScreenBytesPerRow,d0
sub.w BlitWidthBytes,d0
move.w d0,$066(a5) ; BLTDMOD = dynamic
```
### ✗ Antipattern: "Ignoring the DMA Budget"
The Blitter shares the DMA bus with display, audio, and disk. In high-bandwidth display modes, there are fewer free DMA slots:
| Display Mode | DMA Slots Used by Display | Remaining for Blitter | Effect |
|---|---|---|---|
| Lores 320×256 × 5 planes | ~100 per line | ~126 per line | Full blitter speed |
| Hires 640×256 × 4 planes | ~160 per line | ~66 per line | Blitter runs at ~50% speed |
| Super Hires 1280 × 4 planes | ~200+ per line | ~26 per line | Blitter barely runs |
| HAM8 (AGA) | ~200 per line | ~26 per line | Blitter barely runs |
**Rule of thumb**: if your game stutters in hires modes, it's probably DMA contention, not CPU speed.
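
The first two rows of the table follow from simple arithmetic: bitplane fetches per line are words per row times planes. A rough sketch, assuming about 226 usable DMA slots per scanline (the total implied by the table) and ignoring refresh, audio, disk, and sprite slots; it does not model the fetch limits behind the Super Hires and HAM8 rows, and the names are illustrative:

```c
/* Assumption: usable DMA slots per scanline, as implied by the table. */
static const unsigned kSlotsPerLine = 226u;

/* Bitplane DMA fetches needed for one scanline. */
static unsigned display_slots(unsigned words_per_row, unsigned planes)
{
    return words_per_row * planes;
}

/* Slots left over for the Blitter, Copper, and CPU on that scanline. */
static unsigned free_slots(unsigned words_per_row, unsigned planes)
{
    unsigned used = display_slots(words_per_row, planes);
    return used >= kSlotsPerLine ? 0u : kSlotsPerLine - used;
}
```

Lores with 5 planes uses 20 × 5 = 100 slots and leaves about 126; hires with 4 planes uses 40 × 4 = 160 and leaves about 66.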
---
## Practical Limitations
| Limitation | Detail | Workaround |
|---|---|---|
| **Max blit size (OCS, `BLTSIZE`)** | 1024 lines × 64 words (1024×1024 pixels) | Split into multiple blits |
| **Max blit size (ECS/AGA, `BLTSIZV`/`BLTSIZH`)** | 32768 lines × 2048 words | Rarely a practical issue |
| **Word alignment** | Blitter operates on 16-bit word boundaries only | Use barrel shift + masks for sub-word positioning; costs 1 extra word of width |
| **No scaling** | Cannot scale or rotate — purely rectangular block ops | Use CPU for affine transforms, then blit the result |
| **No clipping** | Blitter will happily write outside the screen bitmap | Implement clipping in software before setting up the blit |
| **Single operation at a time** | Only one blit can run at a time — no queue | Pipeline setup: compute next blit's parameters on CPU while current blit runs |
| **Chip RAM only** | All 4 channels must point to Chip RAM | Use `MEMF_CHIP` for all blitter-visible allocations; see [memory_types.md](../../01_hardware/common/memory_types.md) |
| **Fill carry direction** | Fill mode only works right-to-left (descending) | Always use DESC=1 with fill; set pointers to the end of the data |
| **No transparency levels** | Boolean operations only — 1-bit masking, no alpha | Dithering or multiple passes for graduated transparency |
| **Line mode limitations** | Lines drawn with SING=1 for fill prep have one dot per row, so shallow (near-horizontal) lines show visible gaps | Use non-SING mode for visible lines, SING only for fill boundaries |
---
## Performance Analysis
### DMA Cycle Costs
The Blitter consumes DMA cycles proportional to the number of **active channels**. Each active channel adds 1 DMA cycle per word per row:
| Channels Active | Cycles/Word | Example Operation | Ideal time for 320×256 (1 plane, 5120 words) |
|---|---|---|---|
| D only | 1 cycle | Screen clear | ~1.4 ms |
| A + D | 2 cycles | Simple copy (A→D) | ~2.9 ms |
| A + B + D | 3 cycles | Masked copy | ~4.3 ms |
| A + B + C + D | 4 cycles | Cookie-cut blit | ~5.7 ms |
> At 3.58 MHz DMA clock, 1 cycle ≈ 280 ns, so time ≈ words × cycles/word × 280 ns when nothing else contends for the bus. A full 320×256×5-plane screen clear takes ~7.2 ms (D-only × 5 planes).
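
As a sanity check on timing figures, the estimate is just words × cycles per word × cycle time. A minimal sketch, assuming the 280 ns cycle quoted above and no bus contention, so the result is a lower bound; `blit_time_ns` is an illustrative name:

```c
#include <stdint.h>

enum { DMA_CYCLE_NS = 280 };  /* one DMA cycle at ~3.58 MHz */

/* Ideal blit duration in nanoseconds. cycles_per_word comes from the
 * Cycles/Word column for the active channel combination. */
static uint64_t blit_time_ns(unsigned width_words, unsigned height,
                             unsigned cycles_per_word)
{
    return (uint64_t)width_words * height * cycles_per_word * DMA_CYCLE_NS;
}
```

One 320×256 plane is 20 × 256 = 5120 words, so a blit at 1 cycle per word cannot finish in less than about 1.43 ms, and a 4-cycle cookie-cut over the same area needs about 5.7 ms.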
### CPU vs. Blitter Crossover
The Blitter is not always faster than the 68000:
| Operation Size | Winner | Why |
|---|---|---|
| < ~40 words | CPU (68000) | Blitter setup overhead (~20 cycles) exceeds the DMA savings |
| 40–200 words | Tie | Depends on whether CPU needs the bus |
| > 200 words | Blitter | DMA runs independently; CPU can compute in parallel |
| Any size (A1200) | **Measure** | 68020 can access 32-bit Fast RAM while Blitter uses Chip RAM bus — often faster to do both |
### Nasty Mode (`BLTPRI`)
Setting bit 10 of DMACON (`BLTPRI`) gives the Blitter priority over the CPU for every Chip RAM bus cycle: a CPU that needs the chip bus is **stalled** until the blit completes. This maximizes blitter throughput but:
- Delays interrupt servicing for the duration of the blit
- Breaks timing-sensitive code (audio, serial)
- Most professional software avoids it; demos use it freely
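
DMACON is a set/clear register: bit 15 of the written word selects whether the other 1 bits are set or cleared. A small sketch of the values involved; the macro names mirror the NDK's `hardware/dmabits.h` but are defined locally here so the snippet stands alone, and the two helper functions are hypothetical:

```c
#include <stdint.h>

#define DMAF_SETCLR  0x8000u  /* bit 15: 1 = set the given bits, 0 = clear them */
#define DMAF_BLITHOG 0x0400u  /* bit 10: BLTPRI, "nasty mode" */

/* Word to write to DMACON to turn the given bits on. */
static uint16_t dmacon_set(uint16_t bits)
{
    return (uint16_t)(DMAF_SETCLR | bits);
}

/* Word to write to DMACON to turn the given bits off. */
static uint16_t dmacon_clear(uint16_t bits)
{
    return (uint16_t)(bits & ~DMAF_SETCLR);
}
```

Enabling nasty mode therefore means writing `$8400` to DMACON, and disabling it means writing `$0400`.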
---
## When to Use / When NOT to Use
### When to Use the Blitter
- **Screen clearing** — D-only blit at 1 cycle/word is unbeatable
- **BOB/sprite compositing** — cookie-cut blit is the standard technique for all Amiga game objects
- **Scrolling** — overlapping copy with correct ascending/descending mode
- **Polygon filling** — exclusive/inclusive fill after boundary line drawing
- **Large memory copies** — any block > ~40 words benefits from DMA parallelism
- **Line drawing** — hardware Bresenham is faster than any software implementation on 68000
### When NOT to Use
- **Small copies (< 40 words)** — 68000 `MOVEM` or `MOVE.L` loop is faster due to blitter setup overhead
- **Fast RAM operations** — the Blitter cannot access Fast RAM at all; use CPU
- **Pixel-level operations** — the Blitter works on word-aligned rectangles; per-pixel logic requires CPU
- **A1200/A4000 with Fast RAM** — the 68020/030 running from 32-bit Fast RAM can often outperform the Blitter on Chip RAM, especially if you can overlap CPU work with display DMA
### Applicability Ranges
- **BOBs**: Practical limit ~15–20 per frame at 320×256×5 planes before exhausting DMA bandwidth
- **Fill mode**: Works on single bitplanes only — filling a 5-plane display requires 5 passes
- **Line mode**: Maximum line length limited by the blit height field (1024 with `BLTSIZE` on OCS, 32768 with `BLTSIZV` on ECS/AGA)
---
## Historical Context — The 1985 Competitive Landscape
The Blitter was architecturally unprecedented in 1985. No competing home computer shipped with a comparable 2D rasterization coprocessor:
| Feature | Amiga (1985) | Atari ST (1985) | PC EGA (1984) | Mac 128K (1984) | C64 (1982) |
|---|---|---|---|---|---|
| **Hardware blitter** | Yes — 4-channel DMA with minterm logic | No (added in Mega ST/STE, 1987 — 1 source only) | No | No | No |
| **Channels** | 3 source + 1 dest | 1 source + 1 dest (STE) | — | — | — |
| **Boolean ops** | 256 minterms (arbitrary 3-input logic) | 16 logic ops (STE) | — | — | — |
| **Line drawing** | Hardware Bresenham | No | No | No | No |
| **Area fill** | Hardware inclusive/exclusive fill | No | No | No | No |
| **Shift/mask** | Per-channel barrel shift + first/last word masks | Shift + endmask (STE) | — | — | — |
| **CPU relief** | Full DMA — CPU free during blit | Partial — CPU still involved (STE) | CPU does everything | CPU does everything | CPU does everything |
### Pros (in 1985 context)
- **Parallelism**: The 68000 could execute game logic while the Blitter handled all rendering — this was the Amiga's key advantage over every competitor
- **Generality**: 256 minterm combinations meant any Boolean compositing operation was a single register write, not a software loop
- **Integration**: Shared DMA bus with Copper and sprites meant the entire display pipeline was hardware-driven
- **Line + fill in hardware**: Enabled real-time filled polygon rendering (used in games like Carrier Command, Starglider 2) that was impossible on competing platforms
### Cons (in 1985 context)
- **Chip RAM only**: All blitter-visible data had to live in the first 512 KB (later 1–2 MB), competing with screen memory, audio, and disk buffers
- **Word alignment**: Positioning at anything other than a 16-pixel boundary required shift + extra word width + masking, a complex setup for simple operations
- **No scaling/rotation**: Purely rectangular block operations; affine transforms required CPU
- **DMA contention**: Heavy blitter use starved the CPU of bus cycles even without nasty mode
---
## Modern Analogies
| Amiga Blitter Concept | Modern Equivalent | Notes |
|---|---|---|
| 4-channel minterm blit | GPU blend equations (Vulkan `VkBlendOp`) | The minterm is a fixed-function Boolean blend; modern GPUs use programmable shaders but the concept of combining sources through a logic function is identical |
| Cookie-cut (A·B + ¬A·C) | Alpha compositing / Porter-Duff `SrcOver` | The Amiga used 1-bit masks; modern systems use 8-bit alpha channels, but the compositing algebra is the same |
| DMA-driven blit | `vkCmdCopyImage` / `MTLBlitCommandEncoder` | Modern GPUs have dedicated DMA/copy engines that run asynchronously, exactly like the Blitter ran independently of the 68000 |
| OwnBlitter/DisownBlitter | Vulkan queue submission / Metal command buffer | Exclusive access to a shared hardware resource, then release — the synchronization pattern is identical |
| BLTPRI (nasty mode) | GPU preemption priority | Giving the transfer engine absolute bus priority at the cost of starving other consumers |
| Fill mode | GPU rasterizer fill | Hardware polygon fill is now done by the rasterizer stage; the Blitter's XOR-toggle fill was a clever 1985 approximation |
| BLTSIZE triggers blit | Command buffer submission | Writing the final register starts execution — analogous to `vkQueueSubmit` or `[commandBuffer commit]` |
| Barrel shift + word masks | Texture sampling with sub-texel offset | Positioning at arbitrary pixel offsets within a word through hardware shift and masking |
---
## Pitfalls & Common Mistakes
### Pitfall 1: "The Silent Corruption" — Fast RAM Pointers
```asm
; ✗ BAD: Buffer allocated in Fast RAM
move.l #FastRAMBuffer,$054(a5) ; BLTDPT points to Fast RAM
move.w #(256<<6)|20,$058(a5) ; Blit runs... but writes garbage
```
The Blitter's DMA engine is wired to the Chip RAM bus only. Fast RAM addresses silently alias to Chip RAM addresses or produce random data. **There is no error signal** — the blit completes "successfully" with corrupt output.
```asm
; ✓ GOOD: Buffer in Chip RAM
move.l #ChipRAMBuffer,$054(a5) ; Allocated with MEMF_CHIP
```
### Pitfall 2: "The Race Condition" — Missing WaitBlit
```asm
; ✗ BAD: Start a new blit without waiting for previous one
move.l #$09F00000,$040(a5) ; Overwrite BLTCON0 while previous blit runs!
move.l #NewSource,$050(a5) ; Corrupt the in-progress blit
move.w #(64<<6)|4,$058(a5) ; Start another blit: undefined behavior
```
Modifying blitter registers while a blit is in progress produces unpredictable results — partial data, corrupted pointers, or system crashes.
```asm
; ✓ GOOD: Always wait
.bwait:
btst #14,$002(a5) ; Test BBUSY in DMACONR
bne.s .bwait
; Now safe to set up the next blit
```
### Pitfall 3: "The Wrong Direction" — Overlapping Copy Corruption
```asm
; ✗ BAD: Scrolling right (dest > source) with ascending mode
; Source at offset 0, dest at offset 2: ascending overwrites source data
; before it's read, producing smeared garbage
```
```asm
; ✓ GOOD: Use descending mode when dest > source and the blocks overlap
move.w #$0002,$042(a5) ; BLTCON1: DESC=1
; Set pointers to LAST word of block, not first
```
### Pitfall 4: "The Off-By-One Word" — Forgetting Shift Width Expansion
```asm
; ✗ BAD: 32-pixel wide blit at non-aligned X — width still set to 2 words
; Shifted data spills into adjacent word, corrupting neighboring pixels
move.w #(16<<6)|2,$058(a5) ; Only 2 words wide but shift needs 3!
```
```asm
; ✓ GOOD: Add 1 word when shift > 0
move.w #(16<<6)|3,$058(a5) ; 3 words: 2 for data + 1 for shift overflow
move.w #$0000,$046(a5) ; BLTALWM = $0000 masks out the extra (padding) source word
```
### Pitfall 5: "The Stale Pointer" — Reusing Registers After a Blit
After a blit completes, all pointer registers have advanced to the **end** of the data. A second blit with the same pointers starts where the first one left off — not at the original position.
```asm
; ✓ GOOD: Always reload all pointers before each blit
move.l #SourceAddr,$050(a5) ; Reload BLTAPT
move.l #DestAddr,$054(a5) ; Reload BLTDPT
```
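Where a pointer ends up can be estimated from the blit geometry. The sketch below covers ascending mode only and assumes the modulo is added after every row, including the last one; `blit_end_pointer` is a hypothetical helper, not a hardware or OS facility:

```c
#include <stdint.h>

/* Estimated value of a channel pointer after an ascending blit. */
static uint32_t blit_end_pointer(uint32_t start, unsigned width_words,
                                 unsigned height, int16_t modulo)
{
    /* Each row advances the pointer by the row width plus the modulo. */
    int32_t row_advance = (int32_t)width_words * 2 + modulo;
    return start + (uint32_t)((int32_t)height * row_advance);
}
```

For a 2-word by 16-line blit with a modulo of 36, the pointer ends 640 bytes past its starting value, which is exactly 16 screen rows further down on a 40-byte-wide bitmap.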
---
## Impact on FPGA/Emulation
The Blitter is one of the most complex subsystems to reproduce accurately in an FPGA core:
- **DMA slot timing**: The Blitter shares DMA slots with bitplane, sprite, Copper, disk, and audio DMA. Incorrect slot allocation produces visible glitches in demos that count cycles
- **Barrel shifter pipeline**: The A and B channel shifts operate on a word pipeline — off-by-one in the shift register produces 1-pixel horizontal offset errors visible in scrolling
- **Fill mode carry propagation**: The fill carry bit (`FCI`) must propagate correctly from right to left within each word and across word boundaries; errors produce "zebra stripe" artifacts
- **Line mode octant handling**: The Bresenham algorithm implementation requires precise handling of 8 octants with correct sign and direction — many emulators get diagonal lines wrong in edge cases
- **BLTSIZE write-trigger**: The blit must start on the exact cycle that BLTSIZE is written, not one cycle later; demos that chain blits back-to-back depend on this timing
- **Nasty mode interaction**: `BLTPRI` must correctly freeze the CPU *and* still allow DMA from other sources (Copper, bitplanes) — freezing everything breaks display output
---
## Real-World Software Usage
| Software | Blitter Usage | Notes |
|---|---|---|
| **Deluxe Paint** | Brush compositing, flood fill, line tools | Canonical use of BltBitMap + BltMaskBitMapRastPort through the OS |
| **Shadow of the Beast** | Multi-layer parallax scrolling | Custom blitter routines for layer compositing, bypasses OS |
| **Carrier Command** | Filled polygon 3D rendering | Blitter line draw + fill mode for real-time vector graphics |
| **Lemmings** | Terrain destruction, character compositing | Cookie-cut blits for each lemming; XOR blits for terrain modification |
| **Workbench** | Window dragging, icon rendering, menu drawing | All through graphics.library — system-friendly blitter usage |
| **Demo scene** | Virtually everything | Chunky-to-planar conversion, texture mapping, copper+blitter co-programming |
---
## Best Practices
1. **Always call `WaitBlit()` or poll BBUSY before touching any blitter register**
2. **Write BLTSIZE last** — it triggers the blit; all other registers must be configured first
3. **Use `OwnBlitter()`/`DisownBlitter()`** for system-friendly code — never assume you have exclusive access
4. **Disable unused channels** — fewer channels = fewer DMA cycles = faster blit
5. **Set BLTAFWM and BLTALWM to `$FFFF`** for word-aligned blits — forgetting this produces partial-word masking bugs
6. **Account for shift width expansion** — non-aligned blits are 1 word wider than you expect
7. **Choose ascending/descending correctly** for overlapping copies — test both scroll directions
8. **Interleave CPU work with blitter operations** — the whole point of DMA is parallelism; don't busy-wait when you could be computing
9. **Profile before choosing Blitter vs CPU** — on accelerated Amigas, the 68020+ with Fast RAM often wins
---
## References
- HRM: *Amiga Hardware Reference Manual* — Blitter chapter (complete register descriptions and timing)
- NDK 3.9: `hardware/blit.h`, `hardware/custom.h`, `graphics/gfx.h`
- ADCD 2.1: Hardware Manual — [Blitter chapter](http://amigadev.elowar.com/read/ADCD_2.1/Hardware_Manual_guide/node006D.html)
- See also: [blitter.md](../../01_hardware/ocs_a500/blitter.md) — hardware register reference
- See also: [animation.md](../animation.md) — GEL system (BOBs use the Blitter internally)
- See also: [copper.md](../copper/copper.md) — Copper coprocessor (often co-programmed with the Blitter)
- See also: [rastport.md](../rastport.md) — RastPort drawing context (uses Blitter for all draw operations)
- See also: [display_modes.md](../display_modes.md) — DMA slot budget (Blitter competes for bus bandwidth)
- See also: [Akiko — CD32 C2P](../../01_hardware/aga_a1200_a4000/akiko_cd32.md) — hardware Chunky-to-Planar conversion (CD32 alternative to CPU/Blitter C2P)
- **Scoopex Amiga Hardware Programming** (Photon) — [YouTube: Blitter episodes](https://www.youtube.com/playlist?list=PLc3ltHgmiidpK-s0eP5hTKJnjdTHz0_bW) — Video walkthroughs of Blitter setup, cookie-cut masking, line draw, and fill mode. Companion articles: [coppershade.org](http://coppershade.org/articles/)

@ -0,0 +1,124 @@
[← Home](../README.md) · [Graphics](README.md)
# Copper — Coprocessor Instructions and UCopList
## Overview
The **Copper** is a simple coprocessor in the Amiga custom chips that executes a list of instructions synchronized to the video beam. It can write to any custom chip register at any beam position, enabling per-scanline color changes, split screens, and hardware-level display effects without CPU intervention.
---
## Instruction Format
The Copper has only three instructions, each 32 bits (one longword):
### MOVE — Write a Register
```
[register_offset (9 bits, bit 0 = 0)] [value (16 bits)]
Bit layout: 0000000RRRRRRRR0 VVVVVVVVVVVVVVVV
```
- Register offset is relative to `$DFF000` (custom chip base)
- Only even registers can be written (bit 0 = 0)
- Example: `$0180, $0FFF` → write `$0FFF` to `COLOR00` (`$DFF180`)
### WAIT — Wait for Beam Position
```
[vpos (8 bits)] [hpos (7 bits)] [1] [BFD (1 bit)] [vmask (7 bits)] [hmask (7 bits)] [0]
Bit layout: VVVVVVVVHHHHHHH1 Bvvvvvvvhhhhhhh0
B = BFD (blitter-finished disable): 1 = do not wait for the blitter
```
- Pauses until the beam reaches at least the specified (vpos, hpos)
- Masks allow waiting on partial positions (e.g. any horizontal, specific vertical)
### SKIP — Conditional Skip
```
Same as WAIT but bit 0 of second word = 1
```
If the beam has already passed the specified position, skip the next instruction.
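
The three encodings can be sketched as small C helpers that build the two instruction words. This is an illustration only: `CopIns`, `cop_move`, `cop_wait`, and `cop_skip` are hypothetical names, the mask is fixed at the usual `$FFFE`, and `hpos` is passed as the low byte of word 1 with bit 0 forced on.

```c
#include <stdint.h>

typedef struct {
    uint16_t w1;  /* first instruction word */
    uint16_t w2;  /* second instruction word */
} CopIns;

/* MOVE: register offset (even, $000-$1FE) and 16-bit value. */
static CopIns cop_move(uint16_t reg_offset, uint16_t value)
{
    CopIns i;
    i.w1 = (uint16_t)(reg_offset & 0x01FEu);  /* bit 0 = 0 marks a MOVE */
    i.w2 = value;
    return i;
}

/* WAIT: beam position with every compare bit enabled. */
static CopIns cop_wait(uint8_t vpos, uint8_t hpos)
{
    CopIns i;
    i.w1 = (uint16_t)(((unsigned)vpos << 8) | (hpos & 0xFEu) | 1u);
    i.w2 = 0xFFFEu;                           /* bit 0 = 0 marks a WAIT */
    return i;
}

/* SKIP: same as WAIT, but bit 0 of the second word is set. */
static CopIns cop_skip(uint8_t vpos, uint8_t hpos)
{
    CopIns i = cop_wait(vpos, hpos);
    i.w2 = (uint16_t)(i.w2 | 1u);
    return i;
}
```

`cop_move(0x0180, 0x0F00)` gives `$0180,$0F00`, and `cop_wait(0x2C, 0x00)` gives `$2C01,$FFFE`.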
---
## Standard Copper Patterns
### Per-Scanline Color Change (Rainbow)
```
WAIT $2C01,$FFFE ; wait for line $2C (44)
MOVE $0180,$0F00 ; COLOR00 = red
WAIT $2D01,$FFFE ; wait for line $2D (45)
MOVE $0180,$00F0 ; COLOR00 = green
WAIT $2E01,$FFFE ; wait for line $2E (46)
MOVE $0180,$000F ; COLOR00 = blue
...
WAIT $FFDF,$FFFE ; wait past line 255 (enables access to lines 256+)
WAIT $FFFF,$FFFE ; end-of-list (impossible position = halt)
```
### End of Copper List
```
$FFFF, $FFFE ; WAIT for beam position $FFFF — never reached
```
This is the standard "stop" marker. The Copper loops back to the start on the next vertical blank.
---
## System Copper Lists
The OS manages copper lists through `GfxBase`:
| Pointer | Description |
|---|---|
| `GfxBase->copinit` | System initialization copper list |
| `GfxBase->LOFlist` | Long-frame copper list (even fields) |
| `GfxBase->SHFlist` | Short-frame copper list (odd fields, interlace) |
---
## UCopList — User Copper Instructions
Applications can inject copper instructions into the system list via `UCopList`:
```c
struct UCopList *ucl = AllocMem(sizeof(struct UCopList), MEMF_PUBLIC|MEMF_CLEAR);
if (ucl != NULL) { /* AllocMem() returns NULL on failure */
    CINIT(ucl, 100); /* init, max 100 instructions */
    CWAIT(ucl, 44, 0); /* wait for line 44 */
    CMOVE(ucl, *((UWORD *)0xDFF180), 0x0F00); /* COLOR00 = red */
    CWAIT(ucl, 100, 0);
    CMOVE(ucl, *((UWORD *)0xDFF180), 0x000F); /* COLOR00 = blue */
    CEND(ucl); /* end of list */
    viewport->UCopIns = ucl;
    RethinkDisplay(); /* intuition.library: rebuild system copper list */
}
```
---
## Custom Chip Register Addresses (Copper-Relevant)
| Address | Name | Description |
|---|---|---|
| `$DFF180`–`$DFF1BE` | `COLOR00`–`COLOR31` | OCS/ECS palette (12-bit RGB) |
| `$DFF100` | `BPLCON0` | Bitplane control (depth, resolution) |
| `$DFF102` | `BPLCON1` | Scroll offsets |
| `$DFF104` | `BPLCON2` | Priority control |
| `$DFF08E` | `DIWSTRT` | Display window start |
| `$DFF090` | `DIWSTOP` | Display window stop |
| `$DFF092` | `DDFSTRT` | Data fetch start |
| `$DFF094` | `DDFSTOP` | Data fetch stop |
| `$DFF0E0`–`$DFF0FE` | `BPL1PT`–`BPL8PT` | Bitplane pointers |
---
## References
- HRM: *Amiga Hardware Reference Manual* — Copper chapter
- NDK39: `graphics/copper.h`, `graphics/gfxmacros.h`
- ADCD 2.1: `CINIT`, `CMOVE`, `CWAIT`, `CEND`

@ -0,0 +1,321 @@
[← Home](../README.md) · [Graphics](README.md)
# Copper Programming — Deep Dive
## What Is the Copper?
The **Copper** (Co-Processor) is a tiny DMA-driven programmable engine inside Agnus (OCS/ECS) or Alice (AGA). It executes a list of instructions — the **copper list** — in lockstep with the video beam as it sweeps across the CRT. Its sole purpose is to write values to custom chip registers at precise screen positions.
Despite having only **3 instructions** and no arithmetic, general-purpose branching, or data-read capability, the Copper is what gives the Amiga its distinctive visual character.
### Where It Lives in the System
```mermaid
graph LR
subgraph Agnus/Alice ["Agnus / Alice Chip"]
Copper["Copper<br/>(reads copper list via DMA)"]
DMA["DMA Controller"]
Beam["Beam Counter"]
end
subgraph Denise/Lisa ["Denise / Lisa Chip"]
Palette["Color Registers"]
BPL["Bitplane Control"]
SPR["Sprite Control"]
end
subgraph Memory
CopList["Copper List<br/>(Chip RAM only!)"]
end
Beam -- "current V,H position" --> Copper
CopList -- "DMA fetch" --> Copper
Copper -- "register writes" --> Palette
Copper -- "register writes" --> BPL
Copper -- "register writes" --> SPR
Copper -- "register writes" --> DMA
```
**Key points:**
- The Copper reads its program from **Chip RAM** via DMA — no CPU involvement
- It writes directly to custom chip registers (the same `$DFF000–$DFF1FE` space)
- It synchronizes with the **beam counter** — it knows exactly where the electron beam is
- The CPU can modify the copper list in memory at any time; changes take effect next frame
### What the Copper Can Do
| Capability | How | Typical Use |
|---|---|---|
| **Per-line color changes** | WAIT for line → MOVE to COLORxx | Gradient skies, rainbow bars, water effects |
| **Split-screen displays** | Change bitplane pointers mid-frame | Status bar + scrolling game area |
| **Parallax scrolling** | Change BPLCON1 scroll offset at different lines | Multi-layer side-scrollers |
| **Resolution mixing** | Change BPLCON0 mid-frame | HiRes title bar + LoRes gameplay |
| **Sprite multiplexing** | Repoint sprite DMA pointers after sprite ends | 24+ sprites using 8 physical slots |
| **Palette animation** | CPU modifies copper list words each frame | Cycling water, fire, lava |
| **Display window shaping** | Change DIWSTRT/DIWSTOP | Overscan, borders, letterbox |
| **DMA scheduling** | Enable/disable bitplane/sprite DMA per line | Hide artifacts during setup |
### What the Copper Cannot Do
| Limitation | Detail |
|---|---|
| No arithmetic | Cannot add, subtract, multiply, or compare values |
| No general branching/loops | Executes linearly top-to-bottom; the only flow control is SKIP and the COPJMP1/COPJMP2 strobes, which restart execution at COP1LC/COP2LC |
| No memory read | Can only WRITE to registers — cannot read anything |
| No CPU memory access | Writes only to custom chip registers (`$DFF000`+), not RAM or CIA |
| No sub-pixel timing | Horizontal resolution: 4 color clocks (~8 low-res pixels) |
| V counter wraps at 255 | PAL lines 256–311 require a double-WAIT trick |
| Chip RAM only | The copper list itself must reside in Chip RAM (DMA-accessible) |
### How the System Uses It
**AmigaOS** — `graphics.library` builds the system copper list automatically when you call `MakeVPort()` / `LoadView()`. This list sets up bitplane pointers, sprite pointers, display window, and palette for every ViewPort. User code adds instructions via `UCopList`.
**Games (system takeover)** — Disable the OS display system, point COP1LC to your own copper list, and have total control. The copper list typically sets up the display, changes colors per line, and handles sprite multiplexing.
**Demos** — Push the Copper to its limits: hundreds of color changes per frame, dynamic copper list generation, and tricks like "copper bars" (changing colors mid-scanline using horizontal WAITs).
---
## Instruction Set
The Copper has exactly **3 instructions**, each 32 bits (2 words):
### MOVE — Write to Register
```
Word 1: 0000000R RRRRRRR0 R = register address bits 8–1 (bit 0 always 0)
Word 2: DDDDDDDD DDDDDDDD D = 16-bit data value
Constraints:
- Register address must be even ($000–$1FE range)
- Registers below $080 are protected by default
- COPCON ($02E) bit 1 (CDANG) unlocks the blitter registers ($040–$07E) on OCS and all registers on ECS/AGA
```
**Example:** Set COLOR00 to red
```
dc.w $0180, $0F00 ; MOVE $0F00 → COLOR00 ($DFF180)
```
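When a copper list is generated at run time rather than written as `dc.w` data, the MOVE encoding above reduces to storing two words. The following is a minimal sketch in C; `cop_move` is a hypothetical helper name, not part of any Amiga API, and it assumes the buffer is an array of 16-bit words in Chip RAM (on a little-endian host building a list for later use, the words would also need byte-swapping).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: append one Copper MOVE to a word buffer.
   reg is the register offset from $DFF000 (even, $000-$1FE). */
static uint16_t *cop_move(uint16_t *p, uint16_t reg, uint16_t value)
{
    assert((reg & 1) == 0 && reg <= 0x1FE); /* bit 0 must be 0 for a MOVE */
    *p++ = reg;    /* word 1: register offset */
    *p++ = value;  /* word 2: data written to that register */
    return p;      /* next free slot in the list */
}
```

Calling `cop_move(list, 0x180, 0x0F00)` produces the same two words as the `dc.w $0180, $0F00` line above.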
### WAIT — Wait for Beam Position
```
Word 1:  VVVVVVVV HHHHHHH1    V = vertical pos (8 bits), H = horizontal pos, bit 0 = 1
Word 2:  BMMMMMMM MMMMMMM0    M = compare-enable mask bits, bit 0 = 0 (WAIT marker)
         B (bit 15) = 0: also wait for the Blitter to finish
         Default word 2: $FFFE (compare all V and H bits, ignore the Blitter)
**Example:** Wait for the start of line 100
```
    dc.w $6401, $FFFE       ; WAIT V=$64 (100), H=$00 (bit 0 of word 1 is the WAIT/SKIP marker)
```
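A WAIT can be built the same way from a line number and a horizontal position. This is a sketch under the assumption that the default `$FFFE` mask is wanted; `cop_wait` is a hypothetical helper name, and the `$E2` upper bound for the horizontal value is taken from the encoding notes in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: append a WAIT for (line, hpos) with the default mask.
   line: 0-255, hpos: even horizontal value 0-$E2. */
static uint16_t *cop_wait(uint16_t *p, unsigned line, unsigned hpos)
{
    assert(line <= 0xFF && hpos <= 0xE2);
    /* word 1: V in the high byte, H in bits 7-1, bit 0 set (not a MOVE) */
    *p++ = (uint16_t)((line << 8) | (hpos & 0xFE) | 0x01);
    /* word 2: compare every bit, bit 15 set so the Blitter is ignored */
    *p++ = 0xFFFE;
    return p;
}
```

`cop_wait(list, 100, 0)` yields `$6401, $FFFE`, matching the example above.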
### SKIP — Conditional Skip
```
Word 1:  VVVVVVVV HHHHHHH1    same format as WAIT
Word 2:  BMMMMMMM MMMMMMM1    same masks as WAIT, but bit 0 = 1 (SKIP marker)
         If beam position ≥ specified position, skip next instruction.
```
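Since the three instructions differ only in bit 0 of each word, a disassembler or list validator can tell them apart with two tests. A minimal sketch, assuming the bit assignments from the Hardware Reference Manual (bit 0 of word 1 clear for MOVE and set for WAIT/SKIP, bit 0 of word 2 selecting SKIP); the type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { COP_MOVE, COP_WAIT, COP_SKIP } cop_kind;

/* Classify one 32-bit Copper instruction from its two words. */
static cop_kind cop_classify(uint16_t w1, uint16_t w2)
{
    if ((w1 & 1) == 0)
        return COP_MOVE;                  /* word 1 bit 0 clear: MOVE */
    return (w2 & 1) ? COP_SKIP : COP_WAIT; /* word 2 bit 0 picks SKIP/WAIT */
}
```

Note that the common `$FFFE` second word ends in a 0 bit, which is why it denotes a WAIT.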
---
## Beam Position Encoding
```
Vertical:   bits 15–8 of word 1 = V7–V0 (range 0–255)
            For PAL lines > 255, use two WAITs (see below)
Horizontal: bits 7–1 of word 1 = H8–H1 (range 0–$E2, step 2)
            Bit 0 is always 1 in WAIT/SKIP word 1 (distinguishes it from MOVE)
Full PAL: 312 lines, but the Copper's V compare wraps at 256
  Lines 0–255:  V = line number
  Lines 256+:   V wraps; use two WAITs:
    WAIT $FFDF,$FFFE   (end of line 255)
    WAIT for actual line - 256
```
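The double-WAIT rule for lines past 255 is easy to get wrong by hand, so list generators usually hide it in a helper. The sketch below assumes waits are emitted in ascending line order and uses the widely used `$FFDF,$FFFE` pair to reach the end of line 255; `cop_wait_line` and the `crossed` flag are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: append a WAIT for any PAL line 0-311.
   *crossed remembers whether the list has already passed line 255. */
static uint16_t *cop_wait_line(uint16_t *p, unsigned line, int *crossed)
{
    assert(line < 312);
    if (line > 255 && !*crossed) {
        *p++ = 0xFFDF;          /* wait for the end of line 255 ... */
        *p++ = 0xFFFE;
        *crossed = 1;           /* ... only once per list */
    }
    *p++ = (uint16_t)(((line & 0xFF) << 8) | 0x01); /* V = line mod 256 */
    *p++ = 0xFFFE;
    return p;
}
```

For line 300 this emits `$FFDF,$FFFE` followed by `$2C01,$FFFE` (300 - 256 = 44 = `$2C`); a later wait for line 301 emits only `$2D01,$FFFE`.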
---
## Copper List Termination
```
; End-of-list sentinel (wait for impossible position):
dc.w $FFFF, $FFFE ; WAIT $FF,$FF — never reached
```
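Tools that copy, relocate, or patch a list need to know where it ends, and the sentinel gives them a simple stop condition. A sketch of such a scan, with `cop_length` as a hypothetical name and an explicit upper bound so that a list missing its terminator cannot run the scan off the end of the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the number of instructions up to and including the
   $FFFF,$FFFE terminator, or 0 if none is found within max_instr. */
static size_t cop_length(const uint16_t *list, size_t max_instr)
{
    assert(list != NULL);
    for (size_t n = 0; n < max_instr; n++) {
        if (list[2 * n] == 0xFFFF && list[2 * n + 1] == 0xFFFE)
            return n + 1;
    }
    return 0; /* unterminated within the given bound */
}
```

Applied to the split-screen list from Example 2 (MOVE, WAIT, MOVE, terminator), this returns 4.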
---
## Complete Examples
### Example 1: Rainbow Bars (Color Per Scanline)
```asm
; copperlist.s: per-scanline rainbow using the Copper
SECTION copperlist,DATA_C ; MUST be in Chip RAM!
CopperList:
; Set up a basic display first
dc.w $0100, $1200 ; BPLCON0: 1 bitplane, color on
dc.w $0092, $0038 ; DDFSTRT
dc.w $0094, $00D0 ; DDFSTOP
dc.w $008E, $2C81 ; DIWSTRT
dc.w $0090, $2CC1 ; DIWSTOP
; Line 44 ($2C): start of visible display
dc.w $2C01, $FFFE ; WAIT line 44
dc.w $0180, $0F00 ; COLOR00 = bright red
dc.w $2D01, $FFFE ; WAIT line 45
dc.w $0180, $0E10 ; COLOR00 = red-orange
dc.w $2E01, $FFFE ; WAIT line 46
dc.w $0180, $0D20 ; COLOR00 = orange
dc.w $2F01, $FFFE ; WAIT line 47
dc.w $0180, $0C30 ; COLOR00 = yellow-orange
; ... repeat for each line with incrementing colors ...
dc.w $FFFF, $FFFE ; end of copper list
SECTION code,CODE
start:
move.l 4.w,a6 ; SysBase
lea $DFF000,a5 ; custom chips base
; Install copper list:
move.l #CopperList,$080(a5) ; COP1LCH/COP1LCL
move.w #0,$088(a5) ; COPJMP1 — strobe to restart
; Enable DMA:
move.w #$8280,$096(a5) ; DMACON: SET + COPEN + DMAEN
.wait:
btst #6,$BFE001 ; left mouse button
bne.s .wait
rts
```
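The "repeat for each line" part of Example 1 is normally generated rather than typed. The sketch below reproduces the same red-to-green ramp (`$0F00`, `$0E10`, `$0D20`, ...) for up to 16 lines; `build_gradient` is a hypothetical name, and the caller is assumed to supply a Chip RAM buffer with room for `lines * 4 + 2` words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill list with WAIT + MOVE COLOR00 pairs, one per scanline, then the
   end-of-list WAIT. Returns the number of words written. */
static size_t build_gradient(uint16_t *list, unsigned first_line, unsigned lines)
{
    uint16_t *p = list;
    assert(lines >= 1 && lines <= 16 && first_line + lines <= 256);
    for (unsigned i = 0; i < lines; i++) {
        /* red falls from 15 while green rises from 0 (12-bit RGB) */
        uint16_t color = (uint16_t)(((15 - i) << 8) | (i << 4));
        *p++ = (uint16_t)(((first_line + i) << 8) | 0x01); /* WAIT line */
        *p++ = 0xFFFE;
        *p++ = 0x0180;                                     /* MOVE COLOR00 */
        *p++ = color;
    }
    *p++ = 0xFFFF;  /* terminator: wait for an unreachable position */
    *p++ = 0xFFFE;
    return (size_t)(p - list);
}
```

`build_gradient(buf, 44, 4)` writes 18 words whose first four pairs match the hand-written lines 44 to 47 above.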
### Example 2: Split Screen (Two Different Backgrounds)
```asm
SplitCopperList:
; Top half: blue background
dc.w $0180, $000F ; COLOR00 = blue
; Wait for middle of screen (line 128)
dc.w $8001, $FFFE ; WAIT line $80 = 128
; Bottom half: green background
dc.w $0180, $00F0 ; COLOR00 = green
dc.w $FFFF, $FFFE ; end
```
### Example 3: Parallax Scrolling via Copper
```asm
ParallaxCopperList:
; Background layer: scroll position 0
dc.w $0102, $0000 ; BPLCON1 = no scroll
; Wait for horizon line
dc.w $6001, $FFFE ; WAIT line 96
; Middle layer: scroll by 2 pixels
dc.w $0102, $0022 ; BPLCON1 = scroll 2px both playfields
; Wait for ground layer
dc.w $A001, $FFFE ; WAIT line 160
; Ground layer: scroll by 4 pixels
dc.w $0102, $0044 ; BPLCON1 = scroll 4px
dc.w $FFFF, $FFFE
```
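The BPLCON1 data words in Example 3 pack two 4-bit scroll values: playfield 1 in bits 3–0 and playfield 2 in bits 7–4. A small sketch of that packing, with `bplcon1_scroll` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Pack the fine-scroll delays (0-15 low-res pixels each) for
   playfield 1 and playfield 2 into a BPLCON1 value. */
static uint16_t bplcon1_scroll(unsigned pf1, unsigned pf2)
{
    assert(pf1 <= 15 && pf2 <= 15);
    return (uint16_t)((pf2 << 4) | pf1);
}
```

`bplcon1_scroll(2, 2)` gives `$0022` and `bplcon1_scroll(4, 4)` gives `$0044`, the values used for the middle and ground layers. Scrolling further than 15 pixels additionally requires moving the bitplane pointers.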
---
## System-Friendly Copper (via graphics.library)
For programs that coexist with the OS:
```c
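#include <exec/memory.h>
#include <graphics/copper.h>
#include <graphics/gfxmacros.h>     /* CINIT, CMOVE, CWAIT, CEND macros */
#include <hardware/custom.h>
extern struct Custom custom;         /* custom chip registers at $DFF000 */
/* vp points at the target ViewPort, e.g. &screen->ViewPort */
/* Allocate a user copper list (check the result for NULL in real code): */
struct UCopList *ucl = AllocMem(sizeof(struct UCopList), MEMF_PUBLIC | MEMF_CLEAR);
/* Build instructions: */
CINIT(ucl, 100);                     /* init, max 100 instructions */
CWAIT(ucl, 0, 0);                    /* wait for top of display */
CMOVE(ucl, custom.color[0], 0x00F);  /* set COLOR00 = blue */
CWAIT(ucl, 128, 0);                  /* wait for line 128 */
CMOVE(ucl, custom.color[0], 0x0F0);  /* set COLOR00 = green */
CEND(ucl);                           /* end */
/* Install on ViewPort: */
Forbid();
vp->UCopIns = ucl;
Permit();
RethinkDisplay();  /* rebuild system copper list with our additions */
/* Cleanup while the screen stays open: detach, rebuild, then free.
   (CloseScreen() frees a still-attached user copper list itself;
   FreeVPortCopLists() is only for ViewPorts you built yourself.) */
Forbid();
vp->UCopIns = NULL;
Permit();
RethinkDisplay();
FreeCopList(ucl->FirstCopList);      /* FreeCopList takes a CopList, not a UCopList */
FreeMem(ucl, sizeof(struct UCopList));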
```
---
## Copper Timing
| Item | Value |
|---|---|
| MOVE execution time | 4 color clocks (= 8 low-res pixels) |
| WAIT resolution (horizontal) | 2 color clocks (= 4 low-res pixels) |
| Maximum MOVEs per line | ~56 (about 227 color clocks ÷ 4), fewer when bitplane DMA takes slots |
| PAL visible lines | 256 (lines 44–299) |
| NTSC visible lines | 200 (lines 44–243) |
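
The per-instruction cost in the table translates directly into how many register writes fit across a given stretch of scanline. A back-of-the-envelope sketch, assuming 4 color clocks per MOVE and 2 low-res pixels per color clock, and ignoring DMA contention; the names are illustrative only.

```c
#include <assert.h>

enum { COLOR_CLOCKS_PER_MOVE = 4, LORES_PIXELS_PER_COLOR_CLOCK = 2 };

/* How many back-to-back Copper MOVEs fit across width_lores_px pixels. */
static unsigned moves_across(unsigned width_lores_px)
{
    assert(width_lores_px > 0);
    return width_lores_px /
           (COLOR_CLOCKS_PER_MOVE * LORES_PIXELS_PER_COLOR_CLOCK);
}
```

For a 320-pixel low-res display this gives 40, which is the practical ceiling on color changes within one visible line.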
---
## Advanced Techniques
### Copper-Driven Sprite Multiplexing
Reposition sprites mid-frame to display more than 8 sprites:
```asm
; Display sprite 0 at Y=50
; (many assemblers cannot split a relocatable address with >> at assembly
;  time; in practice the CPU patches these data words at startup)
    dc.w $3001, $FFFE           ; WAIT line 48 (before sprite)
    dc.w $0120, SprData1>>16    ; SPR0PTH
    dc.w $0122, SprData1&$FFFF  ; SPR0PTL
; After sprite 0 finishes at Y=66, reuse it at Y=100
    dc.w $6201, $FFFE           ; WAIT line 98 (before the second appearance)
    dc.w $0120, SprData2>>16    ; SPR0PTH: repoint to different data
    dc.w $0122, SprData2&$FFFF  ; SPR0PTL
```
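Because the sprite data address is normally only known after loading, the two pointer data words are usually patched into the list at run time. A minimal sketch of that patch, assuming the list holds the high-word MOVE immediately followed by the low-word MOVE as in the listing above; `cop_set_ptr` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Patch a pair of pointer MOVEs laid out as:
   [0] reg high, [1] data high, [2] reg low, [3] data low. */
static void cop_set_ptr(uint16_t *move_pair, uint32_t chip_addr)
{
    assert((chip_addr & 1) == 0);                   /* DMA data is word-aligned */
    move_pair[1] = (uint16_t)(chip_addr >> 16);     /* e.g. SPR0PTH data */
    move_pair[3] = (uint16_t)(chip_addr & 0xFFFF);  /* e.g. SPR0PTL data */
}
```

The same routine serves for bitplane pointers (`BPLxPTH`/`BPLxPTL`), which use the identical high/low register layout.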
### Copper-Driven Palette Animation
```asm
; Animate the copper list by rewriting color data words each frame.
; The Copper re-reads the list from Chip RAM every frame, so the CPU only
; has to update the data word of the MOVE it wants to change.
    move.w  d0,CopperList+26    ; data word of the first COLOR00 MOVE in Example 1
                                ; (5 setup MOVEs + 1 WAIT = 24 bytes, then +2)
```
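A classic use of this is color cycling: rotating the color data words of a gradient by one position each frame so the bars appear to move. The sketch below assumes the colors sit at a fixed word stride in the list (4 words for repeated WAIT + MOVE pairs); `cop_cycle_colors` is a hypothetical name, and in a real program it would run once per vertical blank.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate count color data words up by one slot. data points at the
   first color data word; stride is the distance in words to the next. */
static void cop_cycle_colors(uint16_t *data, size_t count, size_t stride)
{
    assert(data != NULL && count >= 1 && stride >= 1);
    uint16_t first = data[0];
    for (size_t i = 0; i + 1 < count; i++)
        data[i * stride] = data[(i + 1) * stride];
    data[(count - 1) * stride] = first;  /* wrap the first color to the end */
}
```

Only the data words change; the WAIT and register words in between are left untouched, so the list stays valid while the Copper is reading it.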
---
## References
- HRM: *Copper* chapter — complete instruction encoding
- [copper.md](../../01_hardware/ocs_a500/copper.md) — register-level reference
- [copper.md](copper.md) — graphics.library UCopList API
- [Video Signal & Timing](../../01_hardware/common/video_timing.md) — beam counters, scanline anatomy, clock tree
- **Scoopex Amiga Hardware Programming** (Photon) — [YouTube: Copper tutorials](https://www.youtube.com/playlist?list=PLc3ltHgmiidpK-s0eP5hTKJnjdTHz0_bW) — Video walkthroughs of copper list construction, copper bars, and raster effects. Companion articles: [coppershade.org](http://coppershade.org/articles/)

View file

@ -639,7 +639,7 @@ LACE alternates two 256-line fields at 25/30 Hz each to produce 512 visible line
- HRM: *Amiga Hardware Reference Manual* — Display, Custom Chip chapters
- ADCD 2.1: `NextDisplayInfo`, `GetDisplayInfoData`, `BestModeIDA`, MonitorSpec autodocs
- See also: [views.md](views.md) — ViewPort and View construction, mode transitions at ViewPort boundaries
- See also: [copper.md](copper.md) — Copper display list programming, mid-screen mode switching
- See also: [copper.md](copper/copper.md) — Copper display list programming, mid-screen mode switching
- See also: [ham_ehb_modes.md](ham_ehb_modes.md) — HAM6/HAM8/EHB in depth
- See also: [AGA Display Modes](../01_hardware/aga_a1200_a4000/aga_display_modes.md) — AGA-specific capabilities
- See also: [Chipset AGA](../01_hardware/aga_a1200_a4000/chipset_aga.md) — Lisa chip, FMODE, 24-bit palette

View file

@ -233,5 +233,5 @@ for (node = GfxBase->TextFonts.lh_Head;
- NDK39: `graphics/gfxbase.h`
- ADCD 2.1: graphics.library autodocs
- See also: [views.md](views.md) — View/ViewPort construction
- See also: [copper.md](copper.md) — Copper coprocessor
- See also: [copper.md](copper/copper.md) — Copper coprocessor
- See also: [display_modes.md](display_modes.md) — display mode database

View file

@ -400,4 +400,4 @@ rp->Mask = 0xFF; /* restore */
- See also: [bitmap.md](bitmap.md) — BitMap structure and allocation
- See also: [layers.md](../../11_libraries/layers.md) — layers.library detailed reference
- See also: [text_fonts.md](text_fonts.md) — font loading and rendering
- See also: [blitter.md](blitter.md) — hardware Blitter used by BltBitMap
- See also: [blitter.md](blitter/blitter.md) — hardware Blitter used by BltBitMap

View file

@ -703,6 +703,6 @@ A: `OpenFont()` (graphics.library) only finds fonts already loaded in memory. `O
- [diskfont.md](../11_libraries/diskfont.md) — font file format, disk loading pipeline, ColorFont memory layout
- [rastport.md](rastport.md) — RastPort text rendering context and drawing modes
- [bitmap.md](bitmap.md) — BitMap structure (font glyph data is a BitMap)
- [blitter.md](blitter.md) — hardware that performs the actual glyph blitting
- [blitter.md](blitter/blitter.md) — hardware that performs the actual glyph blitting
- [console.md](../10_devices/console.md) — console text rendering uses these fonts
- [utility.md](../11_libraries/utility.md) — `Hook` structure (used by font enumeration)

View file

@ -462,8 +462,8 @@ The Amiga shipped with a hardware-composited, multi-resolution desktop in 1985.
- ADCD 2.1: `InitView`, `InitVPort`, `MakeVPort`, `MrgCop`, `LoadView`, `LoadRGB4`, `FreeVPortCopLists`, `FreeCprList`, `GetColorMap`, `FreeColorMap`
- HRM: *Amiga Hardware Reference Manual* — Display Construction chapter, Copper chapter
- See also: [GfxBase — Graphics Library Global State](gfx_base.md) — View lifecycle, ActiView, system Copper lists
- See also: [Copper — Coprocessor Instructions and UCopList](copper.md) — `CMOVE`/`CWAIT` macros, UCopList construction, instruction format
- See also: [Copper Programming — Deep Dive](copper_programming.md) — gradient effects, raster bars, mid-frame palette changes
- See also: [Copper — Coprocessor Instructions and UCopList](copper/copper.md) — `CMOVE`/`CWAIT` macros, UCopList construction, instruction format
- See also: [Copper Programming — Deep Dive](copper/copper_programming.md) — gradient effects, raster bars, mid-frame palette changes
- See also: [Display Modes](display_modes.md) — ModeID system, resolution/color-depth combinations, DMA slot budget
- See also: [BitMap](bitmap.md) — planar allocation, interleaved layout, `BMF_DISPLAYABLE`
- See also: [Screens — Intuition](screens.md) — OS-managed screen lifecycle built on Views

View file

@ -12,7 +12,7 @@ This is one of the defining features of the Amiga. No other mainstream personal
## How Screens Work — The Copper Connection
The hardware trick that makes Amiga screens possible is the **[Copper](../08_graphics/copper.md)** (coprocessor), a DMA-driven programmable sequencer inside the Agnus/Alice chip. The Copper executes a list of instructions synchronized to the CRT electron beam position (see also: [Copper Programming](../08_graphics/copper_programming.md), [OCS Copper](../01_hardware/ocs_a500/copper.md), [AGA Copper](../01_hardware/aga_a1200_a4000/aga_copper.md)):
The hardware trick that makes Amiga screens possible is the **[Copper](../08_graphics/copper/copper.md)** (coprocessor), a DMA-driven programmable sequencer inside the Agnus/Alice chip. The Copper executes a list of instructions synchronized to the CRT electron beam position (see also: [Copper Programming](../08_graphics/copper/copper_programming.md), [OCS Copper](../01_hardware/ocs_a500/copper.md), [AGA Copper](../01_hardware/aga_a1200_a4000/aga_copper.md)):
| Instruction | Action |
|---|---|
@ -982,8 +982,8 @@ A: `BestModeID()` requires ECS or later to support queries beyond the basic mode
### Related Knowledge Base Articles
- [Copper](../08_graphics/copper.md) — the hardware mechanism that makes multi-screen possible
- [Copper Programming](../08_graphics/copper_programming.md) — writing Copper lists
- [Copper](../08_graphics/copper/copper.md) — the hardware mechanism that makes multi-screen possible
- [Copper Programming](../08_graphics/copper/copper_programming.md) — writing Copper lists
- [Views](../08_graphics/views.md) — View/ViewPort/ViewPortExtra hierarchy
- [Bitmap](../08_graphics/bitmap.md) — screen bitmap layout and bitplane organization
- [Windows](windows.md) — windows live on screens

View file

@ -735,5 +735,5 @@ A: No — layers manage overlapping regions on a single `BitMap`. For off-screen
- [Screens](../09_intuition/screens.md) — Intuition screen layer management, LayerInfo deadlock
- [Windows](../09_intuition/windows.md) — every window is a layer
- [Bitmaps](../08_graphics/bitmap.md) — the pixel storage that layers clip onto
- [Blitter](../08_graphics/blitter.md) — hardware that performs the actual copy operations for layers
- [Blitter](../08_graphics/blitter/blitter.md) — hardware that performs the actual copy operations for layers
- [Utility](utility.md) — `Hook` structure used by backfill hooks

View file

@ -1014,7 +1014,7 @@ else
- UAE RTG source: `uaegfx.card` — reference virtual card implementation
- Village Tronic Picasso IV documentation — scan doubler and video scaler internals
- See also: [display_modes.md](../08_graphics/display_modes.md) — native display modes
- See also: [blitter.md](../08_graphics/blitter.md) — native Blitter (not RTG blitter)
- See also: [blitter.md](../08_graphics/blitter/blitter.md) — native Blitter (not RTG blitter)
- See also: [device_driver_basics.md](device_driver_basics.md) — general Amiga device driver framework
- See also: [screens.md](../09_intuition/screens.md) — Intuition screen management

View file

@ -857,8 +857,8 @@ A: A voxel (volume pixel) display renders 3D data as a grid of points. On the Am
### Related Knowledge Base Articles
- [Pixel Conversion](../08_graphics/pixel_conversion.md) — C2P algorithms (Kalms, Blitter, Akiko)
- [Blitter Programming](../08_graphics/blitter_programming.md) — Fill mode, line draw, minterms
- [Blitter](../08_graphics/blitter.md) — Blitter hardware architecture
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Fill mode, line draw, minterms
- [Blitter](../08_graphics/blitter/blitter.md) — Blitter hardware architecture
- [Bitmap](../08_graphics/bitmap.md) — Bitplane memory layout, interleaving
- [Copper Effects](copper_effects.md) — Copper-driven display effects
- [Timing Optimization](timing_optimization.md) — Cycle counting, Blitter-CPU interleaving

View file

@ -121,10 +121,10 @@ The key insight of demoscene coding: **the Copper, Blitter, Sprites, and Audio a
### Related Knowledge Base Articles
- [Copper](../08_graphics/copper.md) — Copper coprocessor hardware: instruction format, UCopList
- [Copper Programming](../08_graphics/copper_programming.md) — Building copper lists, gradients, raster effects
- [Blitter](../08_graphics/blitter.md) — Blitter DMA engine: channels, minterms
- [Blitter Programming](../08_graphics/blitter_programming.md) — Cookie-cut masking, line draw, fill mode
- [Copper](../08_graphics/copper/copper.md) — Copper coprocessor hardware: instruction format, UCopList
- [Copper Programming](../08_graphics/copper/copper_programming.md) — Building copper lists, gradients, raster effects
- [Blitter](../08_graphics/blitter/blitter.md) — Blitter DMA engine: channels, minterms
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Cookie-cut masking, line draw, fill mode
- [Sprites](../08_graphics/sprites.md) — Hardware sprites: DMA, attached sprites, multiplexing
- [Pixel Conversion](../08_graphics/pixel_conversion.md) — Chunky↔Planar: Kalms, Copper Chunky, Akiko
- [Video Timing](../01_hardware/common/video_timing.md) — Scanline anatomy, beam counters, per-frame budgets

View file

@ -6,7 +6,7 @@
The Copper is the single most important tool in the demoscene coder's arsenal. With only three instructions — `WAIT`, `MOVE`, and `SKIP` — it can repaint the entire screen 50 times per second, changing color registers, bitplane pointers, scroll offsets, and sprite positions at exact scanline boundaries. Every iconic Amiga demo effect, from the rainbow copper bars in [Red Sector's **Megademo**](https://www.pouet.net/prod.php?which=3119) (1989) to the sinus-scrolling message waves in [**Desert Dream**](https://www.pouet.net/prod.php?which=1483) (1993, [Demozoo](https://demozoo.org/productions/142/)), traces back to someone figuring out how to make the Copper do something Commodore's engineers never intended.
This article covers the specific techniques demoscene coders developed for the Copper: classic copper bars, raster splits for multi-resolution screens, gradient shading, sine-based color cycling, and advanced tricks like copper-generated chunky pixels and mid-frame copper list swaps. For the Copper's hardware architecture and basic programming model, see [Copper](../08_graphics/copper.md) and [Copper Programming](../08_graphics/copper_programming.md).
This article covers the specific techniques demoscene coders developed for the Copper: classic copper bars, raster splits for multi-resolution screens, gradient shading, sine-based color cycling, and advanced tricks like copper-generated chunky pixels and mid-frame copper list swaps. For the Copper's hardware architecture and basic programming model, see [Copper](../08_graphics/copper/copper.md) and [Copper Programming](../08_graphics/copper/copper_programming.md).
```mermaid
graph TB
@ -807,7 +807,7 @@ See [pixel_tricks.md](pixel_tricks.md) for the full technique.
A: Yes, with caveats. AGA adds an 8-bit horizontal position (vs OCS 7-bit), controlled by the `BPC` bit in `FMODE`. AGA also has 256 color registers (`COLOR00`–`COLOR255`) instead of 32, allowing much more complex copper effects. However, the higher bandwidth of AGA bitplane DMA leaves fewer slots for the Copper.
**Q: Can I use copper effects from AmigaOS?**
A: Yes, via `UCopList` — the user copper list attached to a `ViewPort`. Intuition merges your copper instructions with its own. See [Copper Programming](../08_graphics/copper_programming.md) for the OS-friendly approach. For full copper control (demos), you take over the hardware directly.
A: Yes, via `UCopList` — the user copper list attached to a `ViewPort`. Intuition merges your copper instructions with its own. See [Copper Programming](../08_graphics/copper/copper_programming.md) for the OS-friendly approach. For full copper control (demos), you take over the hardware directly.
**Q: What happens if the Copper runs past the end of a scanline before finishing?**
A: The Copper simply continues executing on the next scanline. There is no error or trap. The WAIT instruction's purpose is to synchronize — if you don't WAIT, the Copper runs as fast as DMA allows. Effects that don't need per-line synchronization can skip WAITs entirely.
@ -818,8 +818,8 @@ A: The Copper simply continues executing on the next scanline. There is no error
### Related Knowledge Base Articles
- [Copper](../08_graphics/copper.md) — Copper coprocessor hardware: instruction format, UCopList
- [Copper Programming](../08_graphics/copper_programming.md) — Building copper lists, gradients, raster effects
- [Copper](../08_graphics/copper/copper.md) — Copper coprocessor hardware: instruction format, UCopList
- [Copper Programming](../08_graphics/copper/copper_programming.md) — Building copper lists, gradients, raster effects
- [Pixel Conversion](../08_graphics/pixel_conversion.md) — Copper chunky technique, C2P algorithms
- [Sprites](../08_graphics/sprites.md) — Sprite multiplexing (Copper repositions sprites)
- [Video Timing](../01_hardware/common/video_timing.md) — Scanline anatomy, beam counters

View file

@ -656,7 +656,7 @@ A: EHB is a 6-bitplane mode where the 32nd palette entry is automatically genera
- [Copper Effects](copper_effects.md) — Copper bars, gradients, raster splits
- [Display Modes](../08_graphics/display_modes.md) — ModeID, BPLCON0 settings
- [Video Timing](../01_hardware/common/video_timing.md) — Scanline structure, beam position
- [Blitter Programming](../08_graphics/blitter_programming.md) — Blitter fill for HAM rendering
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Blitter fill for HAM rendering
### External Resources

View file

@ -567,7 +567,7 @@ A: `CLXCON` (`$DFF098`) configures which sprite and bitplane bits are included i
- [AGA Sprites](../01_hardware/aga_a1200_a4000/aga_sprites.md) — AGA sprite enhancements: 32/64px, FMODE, color banks
- [Copper Effects](copper_effects.md) — Copper-driven sprite repositioning
- [Pixel Tricks](pixel_tricks.md) — Chunky pixel techniques using sprites
- [Blitter Programming](../08_graphics/blitter_programming.md) — BOB rendering (sprite alternative)
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — BOB rendering (sprite alternative)
- [Animation](../08_graphics/animation.md) — GEL system: VSprites (software sprites)
- [DMA Architecture](../01_hardware/common/dma_architecture.md) — DMA slot allocation

View file

@ -725,7 +725,7 @@ A: **Blitter-CPU interleaving**. On a stock A500, the Blitter and CPU share the
- [Bus Architecture](../01_hardware/common/bus_architecture.md) — CPU/DMA bus sharing, wait states
- [Cache Management](../15_fpu_mmu_cache/cache_management.md) — CACR, CacheClearU, coherency
- [68040/68060 Libraries](../15_fpu_mmu_cache/68040_68060_libraries.md) — Line-F trap handlers, FPU emulation
- [Blitter Programming](../08_graphics/blitter_programming.md) — Fill mode, line draw, Blitter timing
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Fill mode, line draw, Blitter timing
- [3D Rendering](3d_rendering.md) — Fixed-point math, C2P costs
- [Copper Effects](copper_effects.md) — Copper list timing, DMA budgets

View file

@ -38,7 +38,7 @@ The Amiga's documentation was scattered across out-of-print manuals, Usenet post
|---|---|
| **New to Amiga** | [History & chipsets](00_overview/history.md) → [Boot sequence](02_boot_sequence/cold_boot.md) → [Exec kernel](06_exec_os/exec_base.md) |
| **Writing code** | [Toolchain setup](13_toolchain/gcc_amiga.md) → [Calling conventions](04_linking_and_libraries/register_conventions.md) → [.fd files](04_linking_and_libraries/fd_files.md) |
| **Doing hardware** | [Address space](01_hardware/common/address_space.md) → [Memory types](01_hardware/common/memory_types.md) → [Custom registers](01_hardware/ocs_a500/custom_registers.md) → [Copper programming](08_graphics/copper_programming.md) |
| **Doing hardware** | [Address space](01_hardware/common/address_space.md) → [Memory types](01_hardware/common/memory_types.md) → [Custom registers](01_hardware/ocs_a500/custom_registers.md) → [Copper programming](08_graphics/copper/copper_programming.md) |
| **Reverse engineering** | [RE methodology](05_reversing/methodology.md) → [Game RE](05_reversing/games/game_reversing.md) → [IDA/Ghidra setup](05_reversing/ida_setup.md) |
| **Building an FPGA core** | [Hardware models](00_overview/hardware_models.md) → [AGA chipset](01_hardware/aga_a1200_a4000/chipset_aga.md) → [68040/060 libs](15_fpu_mmu_cache/68040_68060_libraries.md) |
@ -208,10 +208,10 @@ The Amiga's documentation was scattered across out-of-print manuals, Usenet post
|---|---|
| [gfx_base.md](08_graphics/gfx_base.md) | GfxBase, chip detection, PAL/NTSC |
| [bitmap.md](08_graphics/bitmap.md) | Planar BitMap, pixel layout, allocation |
| [copper.md](08_graphics/copper.md) | Copper coprocessor, MOVE/WAIT/SKIP, UCopList |
| [copper_programming.md](08_graphics/copper_programming.md) | Copper deep dive: architecture, system diagram, examples |
| [blitter.md](08_graphics/blitter.md) | Blitter DMA, minterms, BltBitMap |
| [blitter_programming.md](08_graphics/blitter_programming.md) | Blitter deep dive: cookie-cut, fill, line draw |
| [copper.md](08_graphics/copper/copper.md) | Copper coprocessor, MOVE/WAIT/SKIP, UCopList |
| [copper_programming.md](08_graphics/copper/copper_programming.md) | Copper deep dive: architecture, system diagram, examples |
| [blitter.md](08_graphics/blitter/blitter.md) | Blitter DMA, minterms, BltBitMap |
| [blitter_programming.md](08_graphics/blitter/blitter_programming.md) | Blitter deep dive: cookie-cut, fill, line draw |
| [sprites.md](08_graphics/sprites.md) | Hardware sprites, SimpleSprite |
| [display_modes.md](08_graphics/display_modes.md) | ModeID selection flowchart, CRT vs flat-panel, interlace/progressive tradeoffs, named antipatterns, FPGA/MiSTer impact, historical context, modern analogies, FAQ |
| [ham_ehb_modes.md](08_graphics/ham_ehb_modes.md) | HAM6/HAM8 encoding pipeline, EHB half-brite, fringing, palette programming, FPGA decoder logic |