# 3D Rendering — Fixed-Point Math, Blitter Polygons, Rotozoom, Dot Tunnels, and Voxel Space
## Overview
In 1985, the Amiga had no 3D hardware. No matrix engine, no floating-point unit, no texture mapper — just a 7 MHz integer-only 68000, a Blitter that could copy rectangles, and a Copper that could change display registers. Yet by 1990, demoscene coders were rendering real-time filled 3D objects, and by 1994 they were flying through voxel landscapes at playable framerates. The entire 3D pipeline — projection, clipping, rasterization, fill — was built from scratch in hand-tuned 68000 assembly using fixed-point arithmetic.
This article covers the demoscene 3D rendering techniques that made it possible: fixed-point math, Blitter-filled polygons, texture-mapped rotozoom, dot tunnels, and voxel space rendering. Each technique maps to a specific hardware capability — and the demoscene's creative abuse of it.
```mermaid
graph TB
subgraph "Math Foundation"
FP["Fixed-Point<br/>16.16 arithmetic"]
MATRIX["Matrix Ops<br/>Rotation × projection"]
SINCOS["Sine/Cosine<br/>Table lookup"]
end
subgraph "Rendering"
FILL["Filled Polygons<br/>Blitter line + fill"]
TEXTURE["Rotozoom<br/>Affine texture map"]
DOT["Dot Tunnel<br/>Z-ordered circles"]
VOXEL["Voxel Space<br/>Raycast heightmap"]
end
subgraph "Optimization"
C2P["Chunky-to-Planar<br/>Kalms/Blitter C2P"]
DIV["Fast Division<br/>Reciprocal table"]
CLIP["Screen-Space Clip<br/>Cohen-Sutherland"]
end
FP --> FILL
FP --> TEXTURE
MATRIX --> FILL
SINCOS --> MATRIX
FILL --> DOT
TEXTURE --> VOXEL
C2P --> TEXTURE
DIV --> VOXEL
```
---
## Foundation: Fixed-Point Arithmetic
The 68000 has no floating-point unit. All 3D math must use integers. The solution is **fixed-point** — encoding fractional values as integers with an implicit decimal point.
*sy = SCREEN_CY - FIXED_TO_INT(fixed_mul(y, scale)); /* Y flipped */
} else {
*sx = SCREEN_CX; /* Behind camera */
*sy = SCREEN_CY;
}
}
```
---
## Technique 1: Blitter-Filled Polygons
The Blitter's **line-draw + fill mode** combination is the foundation of Amiga 3D rendering. The process:
1. Draw polygon edges using Blitter line mode (sets pixels at boundaries)
2. Activate Blitter fill mode (fills between set pixels, even→odd fill rule)
3. Result: a filled polygon with zero CPU pixel writing
```mermaid
sequenceDiagram
participant CPU as 68000 CPU
participant Blitter as Blitter
participant Bitmap as Bitplane Memory
Note over CPU: Polygon: 4 vertices
CPU->>Blitter: Configure line mode (BLTCON0=$0B7A)
CPU->>Blitter: Draw edge V1→V2
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V2→V3
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V3→V4
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V4→V1
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Configure fill mode (BLTCON0=$01F2)
CPU->>Blitter: Fill bitmap (even-odd rule)
Blitter->>Bitmap: Fill interior pixels
Note over Bitmap: Filled polygon ready
```
### Blitter Fill Mode Detail
The Blitter fill uses the **inclusive-odd** fill rule: scanning left to right, it inverts pixels each time it encounters a set bit. This means it fills between pairs of edge pixels:
```asm
; blit_fill.asm — Blitter fill for a single bitplane
; Assumes edges already drawn in the bitmap
; Fill from top to bottom of polygon
lea $DFF000,a0 ; Custom registers base
; Set up fill-mode blit
move.w #$01F2,BLTCON0(a0) ; Fill mode: A→D, fill enabled
move.w #$0000,BLTCON1(a0) ; No line mode, ascending
move.w #$FFFF,BLTAFWM(a0) ; First word mask = all bits
move.w #$FFFF,BLTALWM(a0) ; Last word mask = all bits
; Source A = the bitmap data (edge pixels)
move.l bitmap_start,BLTAPTH(a0) ; Source address
; Destination = same bitmap (fill in-place)
move.l bitmap_start,BLTDPTH(a0) ; Dest = source
; Blit size: height × width
; height = number of scanlines, width = words per line
move.w #(HEIGHT<<6)|(WORDS_PER_LINE),BLTSIZE(a0)
; Blitter starts immediately
```
### Multiple-Face Sorting (Painter's Algorithm)
For solid 3D objects, faces must be drawn back-to-front (painter's algorithm):
```c
/* face_sort.c — Sort polygon faces by depth for painter's algorithm */
typedef struct {
WORD num_vertices;
WORD *vertices; /* Index into vertex array */
FIXED avg_z; /* Average Z depth (for sorting) */
UWORD color; /* Face color */
} Face;
int compare_faces(const void *a, const void *b) {
FIXED za = ((const Face *)a)->avg_z;
FIXED zb = ((const Face *)b)->avg_z;
/* Sort far-to-near (painter's algorithm) */
if (za > zb) return -1; /* a is farther, draw first */
Rotozoom renders a texture-mapped rectangle that can rotate and scale in real-time. The name comes from **rotate + zoom**. It works by computing a texture coordinate (U,V) for each screen pixel using an affine transform.
### Algorithm
For each screen pixel (x, y), compute texture coordinates:
```
U = U_start + x × dU_dx + y × dU_dy
V = V_start + x × dV_dx + y × dV_dy
```
Where `dU_dx`, `dV_dx`, `dU_dy`, `dV_dy` are the rotation+scale matrix coefficients.
The 68000 assembly inner loop is highly optimized. The key insight: texture coordinates wrap at power-of-2 boundaries, so masking with `$FF` (256-wide texture) is free using byte-level addressing:
```asm
; rotozoom_inner.asm — Optimized inner loop
; a0 = destination (chunky buffer)
; a1 = texture base (256×256)
; d0 = U (16.16 fixed)
; d1 = V (16.16 fixed)
; d2 = dU/dx (16.16 fixed)
; d3 = dV/dx (16.16 fixed)
; d4 = loop counter (SCREEN_W)
.roto_inner:
move.l d0,d5 ; Copy U
swap d5 ; d5.w = integer part of U
move.l d1,d6 ; Copy V
swap d6 ; d6.w = integer part of V
; Texture lookup: tex[v & 0xFF][u & 0xFF]
and.w #$FF,d5 ; U mask (free wrap)
and.w #$FF,d6 ; V mask (free wrap)
lsl.w #8,d6 ; V × 256 (row offset)
move.b (a1,d5.w),d7 ; Texel = tex[v*256+u]
move.b d7,(a0)+ ; Write to chunky buffer
add.l d2,d0 ; U += dU/dx
add.l d3,d1 ; V += dV/dx
dbra d4,.roto_inner
```
---
## Technique 3: Dot Tunnel
The dot tunnel renders concentric rings that appear to fly toward the viewer, creating the illusion of traveling through a tunnel. Each ring is a circle rendered at a specific Z-depth.
### Algorithm
```c
/* dot_tunnel.c — Z-ordered ring tunnel effect */
#define NUM_RINGS 30
#define MAX_Z 1024
#define RING_POINTS 32
typedef struct {
FIXED z; /* Depth (0=near, far=background) */
FIXED radius; /* Apparent radius (decreases with z) */
WORD cx, cy; /* Center (can be animated) */
UWORD color; /* Ring color */
} Ring;
void render_dot_tunnel(Ring *rings, int num_rings, ULONG frame) {
FIXED sa = fast_sin(angle * TRIG_TABLE_SIZE / 360);
FIXED ca = fast_cos(angle * TRIG_TABLE_SIZE / 360);
WORD px = cx + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), sa));
WORD py = cy + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), ca));
/* Plot pixel or draw Blitter circle at (px, py) */
plot_dot(px, py, rings[i].color);
}
}
}
```
---
## Technique 4: Voxel Space
Voxel space renders a 3D landscape from a 2D heightmap and colormap. The algorithm casts rays from the viewer, one per screen column, and draws vertical strips of pixels. The result is a fly-over landscape effect, as seen in the 1994 demo "Space Rangers" by Rebels.
### Algorithm (Column-Based Raycasting)
```mermaid
sequenceDiagram
participant CPU as 68000 CPU
participant HM as Heightmap
participant CM as Colormap
participant Screen as Screen Buffer
Note over CPU: For each screen column x (0-319):
CPU->>CPU: Calculate ray direction for column x
CPU->>HM: Sample height at (ray_x, ray_z)
HM-->>CPU: height value h
CPU->>CPU: Project h to screen Y: y = horizon - h/z
CPU->>CM: Get color at (ray_x, ray_z)
CM-->>CPU: color value
CPU->>Screen: Draw vertical line from y to previous y in color
Using floating-point math on the 68000. The 68881 FPU is optional — most Amigas don't have one. Software floating-point emulation is **100× slower** than fixed-point.
**Broken:**
```c
/* Don't do this — requires FPU or slow software emulation */
float x = sin(angle) * distance;
float y = cos(angle) * distance;
```
**Fixed:**
```c
/* Use fixed-point with pre-calculated tables */
FIXED x = fixed_mul(fast_sin(angle), distance);
FIXED y = fixed_mul(fast_cos(angle), distance);
```
### 2. The Per-Pixel Divide
Calling `fixed_div()` for every pixel in a rotozoom or voxel renderer. Division is the most expensive operation on the 68000 (~140 cycles for 16-bit, ~280 for 32-bit).
**Broken:**
```c
for (x = 0; x <320;x++){
for (y = 0; y <256;y++){
FIXED u = fixed_div(x, z); /* DIVIDE PER PIXEL! */
FIXED v = fixed_div(y, z);
}
}
```
**Fixed:**
```c
/* Pre-compute step values (multiply instead of divide) */
FIXED du_dx = fixed_mul(scale, inv_z); /* One divide per frame */
FIXED dv_dy = fixed_mul(scale, inv_z);
for (y = 0; y <256;y++){
FIXED u = u_start;
for (x = 0; x <320;x++){
u += du_dx; /* ADD, not multiply/divide */
}
u_start += du_dy;
}
```
### 3. The Backface Cull Miss
Skipping backface culling for convex objects. Every polygon drawn behind other polygons is wasted Blitter time. A simple dot-product test rejects ~50% of faces.
**Broken:**
```c
/* Draw all faces — 50% are facing away! */
for (i = 0; i <num_faces;i++){
draw_filled_polygon(&faces[i]); /* Wastes time on hidden faces */
}
```
**Fixed:**
```c
for (i = 0; i <num_faces;i++){
/* Backface cull: if face normal points away, skip it */
FIXED nx = compute_normal_x(&faces[i]);
FIXED nz = compute_normal_z(&faces[i]);
if (nz <0)continue;/*Facingawayfromcamera*/
draw_filled_polygon(&faces[i]);
}
```
### 4. The Unsorted Z-Fight
Drawing faces in random order without depth sorting. Overlapping polygons flicker as they overwrite each other unpredictably each frame.
**Broken:**
```c
/* Draw faces in arbitrary order → z-fighting */
for (i = 0; i <num_faces;i++){
draw_filled_polygon(&faces[i]);
}
```
**Fixed:**
```c
/* Sort by average Z depth (painter's algorithm) */
Using a naive chunky-to-planar conversion for rotozoom/voxel output. The naive method processes each pixel individually with bit shifts, taking over 1 second per frame on a stock 68000.
| **Blitter fill timing** | Fill must use exact inclusive-odd rule | Emulators must match Blitter fill behavior precisely |
| **Line-draw accuracy** | Blitter Bresenham must match real hardware | Affects polygon edge positions |
| **C2P pipeline** | Chunky→Planar timing affects frame rate | Must be accounted for in demo timing |
| **Fixed-point overflow** | 68000 MULS.L/DIVS.L edge cases | 32-bit overflow behavior must match hardware |
| **Blitter-CPU interleaving** | BLTPRI affects CPU stall duration | Must match real Blitter busy-wait timing |
---
## FAQ
**Q: Why not use the FPU for 3D math?**
A: The 68881/68882 FPU is optional hardware that most Amiga models don't have. Software FPU emulation is 50-100× slower than fixed-point integer math. Only 68030/040/060 accelerated Amigas typically have an FPU, and even then, fixed-point is faster for many operations because the 68000's integer multiply is well-optimized.
**Q: What is the fastest C2P for rotozoom?**
A: The Kalms C2P is the standard. For AGA machines with 32-bit Blitter access, a Blitter-assisted C2P can be even faster. For RTG cards, C2P is unnecessary — write directly to chunky VRAM. See [Pixel Conversion](../08_graphics/pixel_conversion.md) for benchmarks.
**Q: How do I handle polygon clipping?**
A: For simple 3D objects, screen-space clipping (Cohen-Sutherland or Sutherland-Hodgman) is sufficient. Clip against the four screen edges. For objects that can go behind the camera, you need near-plane clipping in 3D space — this is much more complex and most demos avoid it by keeping objects in front of the camera.
**Q: Can I do texture-mapped polygons (not just rotozoom)?**
A: Yes, but affine texture mapping (per-polygon UV) produces visible distortion on large polygons. Correct perspective texture mapping requires per-pixel division, which is too slow on a 68000. Most demos use subdivision (split large polygons into smaller ones) or simply use rotozoom for the entire screen.
**Q: What is a dot matrix / voxel display?**
A: A voxel (volume pixel) display renders 3D data as a grid of points. On the Amiga, this typically means rendering heightmap terrain as vertical columns (voxel space) or rendering 3D point clouds. The Blitter's line-draw mode can efficiently render individual dots.