amiga-bootcamp/17_demoscene/3d_rendering.md

875 lines
31 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

[← Home](../README.md) · [Demoscene Techniques](README.md)
# 3D Rendering — Fixed-Point Math, Blitter Polygons, Rotozoom, Dot Tunnels, and Voxel Space
## Overview
In 1985, the Amiga had no 3D hardware. No matrix engine, no floating-point unit, no texture mapper — just a 7 MHz integer-only 68000, a Blitter that could copy rectangles, and a Copper that could change display registers. Yet by 1990, demoscene coders were rendering real-time filled 3D objects, and by 1994 they were flying through voxel landscapes at playable framerates. The entire 3D pipeline — projection, clipping, rasterization, fill — was built from scratch in hand-tuned 68000 assembly using fixed-point arithmetic.
This article covers the demoscene 3D rendering techniques that made it possible: fixed-point math, Blitter-filled polygons, texture-mapped rotozoom, dot tunnels, and voxel space rendering. Each technique maps to a specific hardware capability — and the demoscene's creative abuse of it.
```mermaid
graph TB
subgraph "Math Foundation"
FP["Fixed-Point<br/>16.16 arithmetic"]
MATRIX["Matrix Ops<br/>Rotation × projection"]
SINCOS["Sine/Cosine<br/>Table lookup"]
end
subgraph "Rendering"
FILL["Filled Polygons<br/>Blitter line + fill"]
TEXTURE["Rotozoom<br/>Affine texture map"]
DOT["Dot Tunnel<br/>Z-ordered circles"]
VOXEL["Voxel Space<br/>Raycast heightmap"]
end
subgraph "Optimization"
C2P["Chunky-to-Planar<br/>Kalms/Blitter C2P"]
DIV["Fast Division<br/>Reciprocal table"]
CLIP["Screen-Space Clip<br/>Cohen-Sutherland"]
end
FP --> FILL
FP --> TEXTURE
MATRIX --> FILL
SINCOS --> MATRIX
FILL --> DOT
TEXTURE --> VOXEL
C2P --> TEXTURE
DIV --> VOXEL
```
---
## Foundation: Fixed-Point Arithmetic
The 68000 has no floating-point unit. All 3D math must use integers. The solution is **fixed-point** — encoding fractional values as integers with an implicit decimal point.
### 16.16 Fixed-Point Format
```
┌──────────────────────────────────┬──────────────────────────────────┐
│ Upper 16 bits: integer part │ Lower 16 bits: fractional part │
│ (signed, two's complement) │ (unsigned, represents 0 to ~1) │
└──────────────────────────────────┴──────────────────────────────────┘
Example: 1.5 = $00018000
0.25 = $00004000
-1.0 = $FFFF0000
π = $0003243F (3.14159...)
```
### Fixed-Point Operations
```c
/* fixedpoint.h — 16.16 fixed-point arithmetic for 68000 */
typedef LONG FIXED; /* 32-bit signed: 16.16 format */
#define INT_TO_FIXED(x) ((FIXED)((x) << 16))
#define FIXED_TO_INT(x) ((WORD)((x) >> 16)) /* Truncate */
#define FIXED_TO_INT_R(x) (WORD)(((x) + 0x8000) >> 16) /* Round */
#define FLOAT_TO_FIXED(f) ((FIXED)((f) * 65536.0))
/* Multiply: result = a × b / 65536
On 68000, use 32×32→64 MULS.L then shift right 16 */
static inline FIXED fixed_mul(FIXED a, FIXED b) {
/* 68000 asm:
move.l a, d0
muls.l b, d0:d1 ; d0:d1 = 64-bit result
swap d0 ; d0 = upper 32 bits (already >> 16)
; d0 contains the result
*/
return (FIXED)(((LONG)a * (LONG)b) >> 16);
}
/* Divide: result = a × 65536 / b
Must multiply first to avoid losing precision */
static inline FIXED fixed_div(FIXED a, FIXED b) {
/* 68000: be careful with overflow!
Use reciprocal table for perspective division */
return (FIXED)(((LONG)a << 16) / (LONG)b);
}
```
### Sine/Cosine Tables
Pre-calculated lookup tables are essential — computing `sin()` at runtime is too slow:
```c
/* trig_tables.c — Pre-calculated 16.16 sine/cosine tables */
/* 1024 entries covering 0-2π, index = angle × 1024 / (2π) */
#define TRIG_TABLE_SIZE 1024
#define ANGLE_2PI 1024 /* Full circle = 1024 units */
/* 16.16 fixed-point: sin values range from -1.0 ($FFFF0000) to 1.0 ($00010000) */
static const FIXED sin_table[TRIG_TABLE_SIZE]; /* Generated at build time */
/* Fast lookup with wrapping */
static inline FIXED fast_sin(int angle) {
return sin_table[angle & (TRIG_TABLE_SIZE - 1)];
}
static inline FIXED fast_cos(int angle) {
return sin_table[(angle + 256) & (TRIG_TABLE_SIZE - 1)]; /* cos = sin(x+π/2) */
}
```
---
## Foundation: Matrix Operations
3D rotation uses 3×3 matrices multiplied with vertex coordinates. Each rotation (X, Y, Z axis) is a matrix multiply:
### Rotation Matrix Construction
```c
/* matrix3d.c — 3D rotation matrices using fixed-point */
typedef struct {
FIXED m[3][3]; /* 3×3 rotation matrix */
} Matrix3D;
/* Build rotation matrix from Euler angles */
void build_rotation_matrix(Matrix3D *mat, int ax, int ay, int az) {
FIXED sx = fast_sin(ax), cx = fast_cos(ax);
FIXED sy = fast_sin(ay), cy = fast_cos(ay);
FIXED sz = fast_sin(az), cz = fast_cos(az);
/* Combined Z×Y×X rotation (standard demoscene order) */
mat->m[0][0] = fixed_mul(cy, cz);
mat->m[0][1] = fixed_mul(fixed_mul(sx, sy), cz) - fixed_mul(cx, sz);
mat->m[0][2] = fixed_mul(fixed_mul(cx, sy), cz) + fixed_mul(sx, sz);
mat->m[1][0] = fixed_mul(cy, sz);
mat->m[1][1] = fixed_mul(fixed_mul(sx, sy), sz) + fixed_mul(cx, cz);
mat->m[1][2] = fixed_mul(fixed_mul(cx, sy), sz) - fixed_mul(sx, cz);
mat->m[2][0] = -sy;
mat->m[2][1] = fixed_mul(sx, cy);
mat->m[2][2] = fixed_mul(cx, cy);
}
/* Transform a vertex: result = matrix × vertex */
void transform_vertex(const Matrix3D *mat, FIXED vx, FIXED vy, FIXED vz,
FIXED *rx, FIXED *ry, FIXED *rz) {
*rx = fixed_mul(mat->m[0][0], vx) +
fixed_mul(mat->m[0][1], vy) +
fixed_mul(mat->m[0][2], vz);
*ry = fixed_mul(mat->m[1][0], vx) +
fixed_mul(mat->m[1][1], vy) +
fixed_mul(mat->m[1][2], vz);
*rz = fixed_mul(mat->m[2][0], vx) +
fixed_mul(mat->m[2][1], vy) +
fixed_mul(mat->m[2][2], vz);
}
```
### Perspective Projection
```c
/* project.c — Perspective projection to screen coordinates */
#define SCREEN_CX 160 /* Center X (320 wide) */
#define SCREEN_CY 128 /* Center Y (256 tall) */
#define FOCAL_LEN 256 /* Focal length in fixed-point */
void project_vertex(FIXED x, FIXED y, FIXED z,
WORD *sx, WORD *sy) {
/* Perspective divide: screen = world × focal / z */
if (z > INT_TO_FIXED(1)) { /* Avoid division by zero */
FIXED scale = fixed_div(INT_TO_FIXED(FOCAL_LEN), z);
*sx = SCREEN_CX + FIXED_TO_INT(fixed_mul(x, scale));
*sy = SCREEN_CY - FIXED_TO_INT(fixed_mul(y, scale)); /* Y flipped */
} else {
*sx = SCREEN_CX; /* Behind camera */
*sy = SCREEN_CY;
}
}
```
---
## Technique 1: Blitter-Filled Polygons
The Blitter's **line-draw + fill mode** combination is the foundation of Amiga 3D rendering. The process:
1. Draw polygon edges using Blitter line mode (sets pixels at boundaries)
2. Activate Blitter fill mode (fills between set pixels, even→odd fill rule)
3. Result: a filled polygon with zero CPU pixel writing
```mermaid
sequenceDiagram
participant CPU as 68000 CPU
participant Blitter as Blitter
participant Bitmap as Bitplane Memory
Note over CPU: Polygon: 4 vertices
CPU->>Blitter: Configure line mode (BLTCON0=$0B7A)
CPU->>Blitter: Draw edge V1→V2
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V2→V3
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V3→V4
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Draw edge V4→V1
Blitter->>Bitmap: Write edge pixels
CPU->>Blitter: Configure fill mode (BLTCON0=$01F2)
CPU->>Blitter: Fill bitmap (even-odd rule)
Blitter->>Bitmap: Fill interior pixels
Note over Bitmap: Filled polygon ready
```
### Blitter Fill Mode Detail
The Blitter fill uses the **inclusive-odd** fill rule: scanning left to right, it inverts pixels each time it encounters a set bit. This means it fills between pairs of edge pixels:
```asm
; blit_fill.asm — Blitter fill for a single bitplane
; Assumes edges already drawn in the bitmap
; Fill from top to bottom of polygon
lea $DFF000,a0 ; Custom registers base
; Set up fill-mode blit
move.w #$01F2,BLTCON0(a0) ; Fill mode: A→D, fill enabled
move.w #$0000,BLTCON1(a0) ; No line mode, ascending
move.w #$FFFF,BLTAFWM(a0) ; First word mask = all bits
move.w #$FFFF,BLTALWM(a0) ; Last word mask = all bits
; Source A = the bitmap data (edge pixels)
move.l bitmap_start,BLTAPTH(a0) ; Source address
; Destination = same bitmap (fill in-place)
move.l bitmap_start,BLTDPTH(a0) ; Dest = source
; Blit size: height × width
; height = number of scanlines, width = words per line
move.w #(HEIGHT<<6)|(WORDS_PER_LINE),BLTSIZE(a0)
; Blitter starts immediately
```
### Multiple-Face Sorting (Painter's Algorithm)
For solid 3D objects, faces must be drawn back-to-front (painter's algorithm):
```c
/* face_sort.c — Sort polygon faces by depth for painter's algorithm */
typedef struct {
WORD num_vertices;
WORD *vertices; /* Index into vertex array */
FIXED avg_z; /* Average Z depth (for sorting) */
UWORD color; /* Face color */
} Face;
int compare_faces(const void *a, const void *b) {
FIXED za = ((const Face *)a)->avg_z;
FIXED zb = ((const Face *)b)->avg_z;
/* Sort far-to-near (painter's algorithm) */
if (za > zb) return -1; /* a is farther, draw first */
if (za < zb) return 1;
return 0;
}
void render_object(Face *faces, int num_faces,
FIXED *transformed_z) {
int i;
/* Calculate average Z for each face */
for (i = 0; i < num_faces; i++) {
FIXED sum = 0;
int j;
for (j = 0; j < faces[i].num_vertices; j++) {
sum += transformed_z[faces[i].vertices[j]];
}
faces[i].avg_z = sum / faces[i].num_vertices;
}
/* Sort back-to-front */
qsort(faces, num_faces, sizeof(Face), compare_faces);
/* Draw each face */
for (i = 0; i < num_faces; i++) {
draw_filled_polygon(&faces[i]);
}
}
```
---
## Technique 2: Rotozoom (Affine Texture Mapping)
Rotozoom renders a texture-mapped rectangle that can rotate and scale in real-time. The name comes from **rotate + zoom**. It works by computing a texture coordinate (U,V) for each screen pixel using an affine transform.
### Algorithm
For each screen pixel (x, y), compute texture coordinates:
```
U = U_start + x × dU_dx + y × dU_dy
V = V_start + x × dV_dx + y × dV_dy
```
Where `dU_dx`, `dV_dx`, `dU_dy`, `dV_dy` are the rotation+scale matrix coefficients.
```c
/* rotozoom.c — Affine texture mapping (rotzoom) */
#define SCREEN_W 320
#define SCREEN_H 256
#define TEX_SIZE 256 /* Texture is 256×256 */
extern UBYTE texture[TEX_SIZE][TEX_SIZE]; /* Chunky texture */
extern UBYTE *chunky_buffer; /* Output chunky buffer */
void render_rotozoom(FIXED cx, FIXED cy, /* Texture center offset */
FIXED angle, FIXED zoom) {
FIXED cos_a = fast_cos(angle);
FIXED sin_a = fast_sin(angle);
FIXED inv_zoom = fixed_div(INT_TO_FIXED(1), zoom);
/* Rotation × inverse zoom matrix coefficients */
FIXED du_dx = fixed_mul(cos_a, inv_zoom);
FIXED dv_dx = fixed_mul(sin_a, inv_zoom);
FIXED du_dy = fixed_mul(-sin_a, inv_zoom);
FIXED dv_dy = fixed_mul(cos_a, inv_zoom);
/* Start position: center of screen maps to texture center */
FIXED u_start = cx - fixed_mul(INT_TO_FIXED(SCREEN_W/2), du_dx)
- fixed_mul(INT_TO_FIXED(SCREEN_H/2), du_dy);
FIXED v_start = cy - fixed_mul(INT_TO_FIXED(SCREEN_W/2), dv_dx)
- fixed_mul(INT_TO_FIXED(SCREEN_H/2), dv_dy);
UBYTE *dst = chunky_buffer;
int y;
for (y = 0; y < SCREEN_H; y++) {
FIXED u = u_start;
FIXED v = v_start;
int x;
for (x = 0; x < SCREEN_W; x++) {
/* Texture lookup with wrapping */
*dst++ = texture[FIXED_TO_INT(u) & 0xFF]
[FIXED_TO_INT(v) & 0xFF];
u += du_dx;
v += dv_dx;
}
u_start += du_dy;
v_start += dv_dy;
}
/* Convert chunky buffer to planar bitplanes (C2P) */
chunky_to_planar(chunky_buffer, bitplane_data, SCREEN_W, SCREEN_H);
}
```
### Rotozoom in Assembly (Inner Loop)
The 68000 assembly inner loop is highly optimized. The key insight: texture coordinates wrap at power-of-2 boundaries, so masking with `$FF` (256-wide texture) is free using byte-level addressing:
```asm
; rotozoom_inner.asm — Optimized inner loop
; a0 = destination (chunky buffer)
; a1 = texture base (256×256)
; d0 = U (16.16 fixed)
; d1 = V (16.16 fixed)
; d2 = dU/dx (16.16 fixed)
; d3 = dV/dx (16.16 fixed)
; d4 = loop counter (SCREEN_W)
.roto_inner:
move.l d0,d5 ; Copy U
swap d5 ; d5.w = integer part of U
move.l d1,d6 ; Copy V
swap d6 ; d6.w = integer part of V
; Texture lookup: tex[v & 0xFF][u & 0xFF]
and.w #$FF,d5 ; U mask (free wrap)
and.w #$FF,d6 ; V mask (free wrap)
lsl.w #8,d6 ; V × 256 (row offset)
move.b (a1,d5.w),d7 ; Texel = tex[v*256+u]
move.b d7,(a0)+ ; Write to chunky buffer
add.l d2,d0 ; U += dU/dx
add.l d3,d1 ; V += dV/dx
dbra d4,.roto_inner
```
---
## Technique 3: Dot Tunnel
The dot tunnel renders concentric rings that appear to fly toward the viewer, creating the illusion of traveling through a tunnel. Each ring is a circle rendered at a specific Z-depth.
### Algorithm
```c
/* dot_tunnel.c — Z-ordered ring tunnel effect */
#define NUM_RINGS 30
#define MAX_Z 1024
#define RING_POINTS 32
typedef struct {
FIXED z; /* Depth (0=near, far=background) */
FIXED radius; /* Apparent radius (decreases with z) */
WORD cx, cy; /* Center (can be animated) */
UWORD color; /* Ring color */
} Ring;
void render_dot_tunnel(Ring *rings, int num_rings, ULONG frame) {
int i;
/* Update ring positions (move toward viewer) */
for (i = 0; i < num_rings; i++) {
rings[i].z -= INT_TO_FIXED(4); /* Speed toward viewer */
/* If ring passed camera, reset to far end */
if (rings[i].z < INT_TO_FIXED(1)) {
rings[i].z = INT_TO_FIXED(MAX_Z);
}
/* Perspective projection of radius */
rings[i].radius = fixed_div(
INT_TO_FIXED(200), /* Base radius */
rings[i].z /* Divide by depth */
);
}
/* Sort rings far-to-near (painter's algorithm) */
/* ... sort by rings[i].z descending ... */
/* Draw each ring */
for (i = 0; i < num_rings; i++) {
int p;
int radius = FIXED_TO_INT(rings[i].radius);
int cx = rings[i].cx + FIXED_TO_INT(
fixed_mul(fast_sin(frame * 3), INT_TO_FIXED(30)));
int cy = rings[i].cy + FIXED_TO_INT(
fixed_mul(fast_cos(frame * 5), INT_TO_FIXED(20)));
for (p = 0; p < RING_POINTS; p++) {
int angle = p * 360 / RING_POINTS;
FIXED sa = fast_sin(angle * TRIG_TABLE_SIZE / 360);
FIXED ca = fast_cos(angle * TRIG_TABLE_SIZE / 360);
WORD px = cx + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), sa));
WORD py = cy + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), ca));
/* Plot pixel or draw Blitter circle at (px, py) */
plot_dot(px, py, rings[i].color);
}
}
}
```
---
## Technique 4: Voxel Space
Voxel space renders a 3D landscape from a 2D heightmap and colormap. The algorithm casts rays from the viewer, one per screen column, and draws vertical strips of pixels. The result is a fly-over landscape effect, as seen in the 1994 demo "Space Rangers" by Rebels.
### Algorithm (Column-Based Raycasting)
```mermaid
sequenceDiagram
participant CPU as 68000 CPU
participant HM as Heightmap
participant CM as Colormap
participant Screen as Screen Buffer
Note over CPU: For each screen column x (0-319):
CPU->>CPU: Calculate ray direction for column x
CPU->>HM: Sample height at (ray_x, ray_z)
HM-->>CPU: height value h
CPU->>CPU: Project h to screen Y: y = horizon - h/z
CPU->>CM: Get color at (ray_x, ray_z)
CM-->>CPU: color value
CPU->>Screen: Draw vertical line from y to previous y in color
Note over CPU: Advance ray (step outward)
CPU->>HM: Sample next point...
```
### Voxel Space Implementation
```c
/* voxelspace.c — Column-based voxel landscape rendering */
#define SCREEN_W 320
#define SCREEN_H 256
#define MAP_SIZE 256
#define HORIZON 100 /* Horizon line Y position */
#define MAX_DEPTH 200 /* Maximum ray distance */
extern UBYTE heightmap[MAP_SIZE][MAP_SIZE];
extern UBYTE colormap[MAP_SIZE][MAP_SIZE];
extern UBYTE *chunky_buffer;
void render_voxel_space(FIXED cam_x, FIXED cam_z,
FIXED cam_angle, FIXED cam_height) {
int x;
for (x = 0; x < SCREEN_W; x++) {
/* Ray angle: camera angle + column offset */
FIXED column_offset = INT_TO_FIXED(x - SCREEN_W/2);
FIXED ray_angle = cam_angle + fixed_div(column_offset,
INT_TO_FIXED(FOCAL_LEN));
FIXED ray_dx = fast_cos(ray_angle); /* Direction X */
FIXED ray_dz = fast_sin(ray_angle); /* Direction Z */
FIXED ray_x = cam_x;
FIXED ray_z = cam_z;
WORD prev_draw_y = SCREEN_H; /* Bottom of column */
int distance;
for (distance = 1; distance < MAX_DEPTH; distance++) {
FIXED dz = fixed_div(INT_TO_FIXED(distance), ray_dz);
FIXED dx = fixed_div(INT_TO_FIXED(distance), ray_dx);
/* Current map position */
int mx = (FIXED_TO_INT(cam_x + dx)) & (MAP_SIZE - 1);
int mz = (FIXED_TO_INT(cam_z + dz)) & (MAP_SIZE - 1);
/* Height at this point */
FIXED terrain_h = INT_TO_FIXED(heightmap[mz][mx]);
/* Project to screen Y */
FIXED height_diff = terrain_h - cam_height;
WORD draw_y = HORIZON -
FIXED_TO_INT(fixed_div(height_diff,
INT_TO_FIXED(distance)));
/* Only draw if above previously drawn pixel */
if (draw_y < prev_draw_y) {
UBYTE color = colormap[mz][mx];
int y;
for (y = draw_y; y < prev_draw_y; y++) {
chunky_buffer[y * SCREEN_W + x] = color;
}
prev_draw_y = draw_y;
}
/* Step ray outward */
ray_x += ray_dx;
ray_z += ray_dz;
}
}
/* C2P conversion for planar display */
chunky_to_planar(chunky_buffer, bitplane_data, SCREEN_W, SCREEN_H);
}
```
### Voxel Space Optimization
The naive algorithm is too slow for 50 FPS on a 68000. Key optimizations:
| Optimization | Speedup | How |
|-------------|---------|-----|
| **Reciprocal table** | 2× | Pre-compute 1/z values, avoid division |
| **Step doubling** | 3-4× | Double step size beyond certain depth (less detail needed) |
| **Height caching** | 1.5× | Cache last N height lookups |
| **Reduced resolution** | 2-4× | Render at 160×128 and scale up (acceptable for landscape) |
| **Fast C2P** | 30× | Use Kalms C2P instead of naive conversion |
---
## Performance Budget
### 3D Rendering Costs on Stock A500 (7 MHz 68000)
| Operation | Cycles (approx.) | Notes |
|-----------|------------------|-------|
| Fixed-point multiply | ~28 | `MULS.W` (16×16→32) |
| Fixed-point divide | ~140 | `DIVS.W` — very expensive! |
| Sine table lookup | ~12 | Table indexed by angle |
| Vertex transform | ~300 | 3 multiplies + 3 adds per axis |
| Perspective divide | ~160 | 2 divides per vertex |
| Blitter line draw | ~200/edge | DMA time for edge |
| Blitter fill | ~2000/polygon | Depends on polygon size |
| Full C2P (Kalms) | ~35ms | 320×256 × 8bpp |
| Voxel column | ~500/col | Heightmap lookup + draw |
### Frame Budget (PAL: 20ms per frame)
| Effect | Vertices | Time | FPS |
|--------|----------|------|-----|
| Single flat-shaded cube | 8 | ~3ms | 50 |
| 100-face object | 30+ | ~12ms | 30-50 |
| Rotozoom 320×256 | 0 (per pixel) | ~40ms (with C2P) | 15-25 |
| Dot tunnel 30 rings | 960 dots | ~8ms | 50 |
| Voxel space 320×256 | 64K cols | ~80ms | 6-12 |
---
## Antipatterns
### 1. The Floating-Point Temptation
Using floating-point math on the 68000. The 68881 FPU is optional — most Amigas don't have one. Software floating-point emulation is **100× slower** than fixed-point.
**Broken:**
```c
/* Don't do this — requires FPU or slow software emulation */
float x = sin(angle) * distance;
float y = cos(angle) * distance;
```
**Fixed:**
```c
/* Use fixed-point with pre-calculated tables */
FIXED x = fixed_mul(fast_sin(angle), distance);
FIXED y = fixed_mul(fast_cos(angle), distance);
```
### 2. The Per-Pixel Divide
Calling `fixed_div()` for every pixel in a rotozoom or voxel renderer. Division is the most expensive operation on the 68000 (~140 cycles for 16-bit, ~280 for 32-bit).
**Broken:**
```c
for (x = 0; x < 320; x++) {
for (y = 0; y < 256; y++) {
FIXED u = fixed_div(x, z); /* DIVIDE PER PIXEL! */
FIXED v = fixed_div(y, z);
}
}
```
**Fixed:**
```c
/* Pre-compute step values (multiply instead of divide) */
FIXED du_dx = fixed_mul(scale, inv_z); /* One divide per frame */
FIXED dv_dy = fixed_mul(scale, inv_z);
for (y = 0; y < 256; y++) {
FIXED u = u_start;
for (x = 0; x < 320; x++) {
u += du_dx; /* ADD, not multiply/divide */
}
u_start += du_dy;
}
```
### 3. The Backface Cull Miss
Skipping backface culling for convex objects. Every polygon drawn behind other polygons is wasted Blitter time. A simple dot-product test rejects ~50% of faces.
**Broken:**
```c
/* Draw all faces — 50% are facing away! */
for (i = 0; i < num_faces; i++) {
draw_filled_polygon(&faces[i]); /* Wastes time on hidden faces */
}
```
**Fixed:**
```c
for (i = 0; i < num_faces; i++) {
/* Backface cull: if face normal points away, skip it */
FIXED nx = compute_normal_x(&faces[i]);
FIXED nz = compute_normal_z(&faces[i]);
if (nz < 0) continue; /* Facing away from camera */
draw_filled_polygon(&faces[i]);
}
```
### 4. The Unsorted Z-Fight
Drawing faces in random order without depth sorting. Overlapping polygons flicker as they overwrite each other unpredictably each frame.
**Broken:**
```c
/* Draw faces in arbitrary order → z-fighting */
for (i = 0; i < num_faces; i++) {
draw_filled_polygon(&faces[i]);
}
```
**Fixed:**
```c
/* Sort by average Z depth (painter's algorithm) */
qsort(faces, num_faces, sizeof(Face), compare_faces_back_to_front);
for (i = 0; i < num_faces; i++) {
draw_filled_polygon(&faces[i]);
}
```
### 5. The Naive C2P
Using a naive chunky-to-planar conversion for rotozoom/voxel output. The naive method processes each pixel individually with bit shifts, taking over 1 second per frame on a stock 68000.
**Broken:**
```c
/* Naive C2P: ~70,000 pixels/sec — 0.9 FPS for 320×256 */
for (i = 0; i < 320*256; i++) {
int pixel = chunky[i];
for (bit = 0; bit < 8; bit++) {
planes[bit][i/8] |= ((pixel >> bit) & 1) << (7 - (i & 7));
}
}
```
**Fixed:**
```c
/* Use Kalms C2P or Blitter-assisted C2P: ~30× faster */
kalms_c2p(chunky_buffer, planar_data, 320, 256);
/* See pixel_conversion.md for full implementation */
```
---
## Decision Guide
```mermaid
flowchart TD
START[Need 3D rendering] --> Q1{What are you rendering?}
Q1 -->|Solid objects| Q2{Convex or arbitrary?}
Q1 -->|Landscapes| VOXEL[Voxel Space]
Q1 -->|Textures| ROTZ[Rotozoom]
Q1 -->|Abstract/Tunnel| DOT[Dot Tunnel]
Q2 -->|Convex| Q3{Number of faces?}
Q2 -->|Arbitrary| BSP[BSP-tree or Z-buffer approach]
Q3 -->|<100| SIMPLE[Simple painter's algorithm<br/>+ backface cull]
Q3 -->|100-500| SORT[Sorted faces + Blitter fill]
Q3 -->|>500| Q4{Stock A500 or accelerated?}
Q4 -->|Stock A500| REDUCE[Reduce geometry or<br/>use wireframe only]
Q4 -->|Accelerated| OPT[Optimized fill + C2P pipeline]
VOXEL --> Q5{Resolution?}
Q5 -->|Full 320×256| SLOW[~8 FPS on stock A500]
Q5 -->|Half 160×128| OK[~15-25 FPS on A500]
ROTZ --> Q6{Platform?}
Q6 -->|OCS/ECS| C2P_PATH[C2P required<br/>~15-25 FPS]
Q6 -->|RTG/AGA| CHUNKY[Direct chunky write<br/>~50 FPS possible]
```
---
## Historical Timeline
```mermaid
timeline
title 3D Rendering on Amiga
1987 : First wireframe 3D demos (line drawing only)
1988 : Hidden line removal algorithms
: Flat-shaded 3D in demoscene
1989 : Phenomena — filled polygon 3D objects
: Fixed-point math becomes standard
1990 : Complex demo — multiple filled objects
: First rotozoom effects appear
1991 : Texture-mapped rotozoom at acceptable framerates
: 3D starfields with depth sorting
1992 : Voxel space demos appear (low resolution)
: Dot tunnels with Blitter-optimized rendering
1993 : Spaceballs — state-of-the-art dot tunnel
: Higher-resolution voxel landscapes
1994 : Rebels — smooth voxel space fly-over
: 68040-accelerated 3D at full frame rate
1995 : Polka Brothers — CPU raytracer (proof of concept)
: 68060-accelerated demos with lighting
2000+ : Demoscene pushes 3D on stock A500
: Group-optimized rotozoom achieves new speed records
: MiSTer preserves exact Blitter timing for fill accuracy
```
---
## Modern Analogies
| Amiga 3D Concept | Modern Equivalent | Why It Maps |
|-----------------|-------------------|-------------|
| Fixed-point 16.16 | Half-precision float (FP16) | Both trade precision for speed |
| Sine lookup table | GPU SFU (Special Function Unit) | Both use hardware-assisted transcendental |
| Blitter fill mode | GPU rasterizer | Both fill polygon interiors |
| Painter's algorithm | Z-buffer / depth test | Both solve polygon visibility |
| Backface culling | GPU backface culling | Both skip invisible faces |
| Rotozoom | Affine texture sampling | Both use 2×2 matrix transform per pixel |
| Voxel space raycasting | Heightfield terrain shader | Both cast rays through a heightmap |
| C2P conversion | Texture swizzle/deswizzle | Both convert between memory layouts |
| Reciprocal table | GPU reciprocal approximation | Both avoid expensive division |
| Chunky buffer | Render-to-texture (FBO) | Both render to off-screen buffer |
---
## Use Cases
| Use Case | Technique | Notable Examples |
|----------|-----------|-----------------|
| 3D game objects | Filled polygons | Flight simulators, Elite clones |
| Rotating logo | Rotozoom | Every demo with a bitmap logo |
| Tunnel fly-through | Dot tunnel | Spaceballs, numerous demos |
| Landscape fly-over | Voxel space | Rebels, numerous demos |
| 3D chess/board games | Filled polygons + sorting | Various Amiga games |
| Virtual reality scenes | Combined techniques | Various demo compos |
| Star field | Z-ordered point rendering | Standard demo effect |
| Wavy floor/ceiling | Rotozoom variant | Doom-like perspective tricks |
---
## FPGA / Emulation Impact
| Concern | Impact | Notes |
|---------|--------|-------|
| **Blitter fill timing** | Fill must use exact inclusive-odd rule | Emulators must match Blitter fill behavior precisely |
| **Line-draw accuracy** | Blitter Bresenham must match real hardware | Affects polygon edge positions |
| **C2P pipeline** | Chunky→Planar timing affects frame rate | Must be accounted for in demo timing |
| **Fixed-point overflow** | 68000 MULS.L/DIVS.L edge cases | 32-bit overflow behavior must match hardware |
| **Blitter-CPU interleaving** | BLTPRI affects CPU stall duration | Must match real Blitter busy-wait timing |
---
## FAQ
**Q: Why not use the FPU for 3D math?**
A: The 68881/68882 FPU is optional hardware that most Amiga models don't have. Software FPU emulation is 50-100× slower than fixed-point integer math. Only 68030/040/060 accelerated Amigas typically have an FPU, and even then, fixed-point is faster for many operations because the 68000's integer multiply is well-optimized.
**Q: What is the fastest C2P for rotozoom?**
A: The Kalms C2P is the standard. For AGA machines with 32-bit Blitter access, a Blitter-assisted C2P can be even faster. For RTG cards, C2P is unnecessary — write directly to chunky VRAM. See [Pixel Conversion](../08_graphics/pixel_conversion.md) for benchmarks.
**Q: How do I handle polygon clipping?**
A: For simple 3D objects, screen-space clipping (Cohen-Sutherland or Sutherland-Hodgman) is sufficient. Clip against the four screen edges. For objects that can go behind the camera, you need near-plane clipping in 3D space — this is much more complex and most demos avoid it by keeping objects in front of the camera.
**Q: Can I do texture-mapped polygons (not just rotozoom)?**
A: Yes, but affine texture mapping (per-polygon UV) produces visible distortion on large polygons. Correct perspective texture mapping requires per-pixel division, which is too slow on a 68000. Most demos use subdivision (split large polygons into smaller ones) or simply use rotozoom for the entire screen.
**Q: What is a dot matrix / voxel display?**
A: A voxel (volume pixel) display renders 3D data as a grid of points. On the Amiga, this typically means rendering heightmap terrain as vertical columns (voxel space) or rendering 3D point clouds. The Blitter's line-draw mode can efficiently render individual dots.
---
## References
### Related Knowledge Base Articles
- [Pixel Conversion](../08_graphics/pixel_conversion.md) — C2P algorithms (Kalms, Blitter, Akiko)
- [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Fill mode, line draw, minterms
- [Blitter](../08_graphics/blitter/blitter.md) — Blitter hardware architecture
- [Bitmap](../08_graphics/bitmap.md) — Bitplane memory layout, interleaving
- [Copper Effects](copper_effects.md) — Copper-driven display effects
- [Timing Optimization](timing_optimization.md) — Cycle counting, Blitter-CPU interleaving
- [FPU Architecture](../15_fpu_mmu_cache/fpu_architecture.md) — 68881/68882 floating-point
### External Resources
- **Amiga Hardware Reference Manual** — Blitter fill mode, line-draw mode
- **Scoopex Amiga Hardware Programming** (Photon) — [YouTube playlist](https://www.youtube.com/playlist?list=PLc3ltHgmiidpK-s0eP5hTKJnjdTHz0_bW) — Blitter fill mode and line-draw video walkthroughs; companion articles at [coppershade.org](http://coppershade.org/)
- **Pouet.net** — https://www.pouet.net — 3D demo releases with source code
- **Demozoo** — https://demozoo.org — Demoscene production encyclopedia
- **Amiga Graphics Archive** — https://amiga.lychesis.net — Copper-enhanced 3D rendering analysis in commercial games
- **Kalms C2P** — Standard chunky-to-planar implementation
- **Comanche Voxel Engine** — Original voxel space algorithm reference (NovaLogic)