[← Home](../README.md) · [Demoscene Techniques](README.md) # 3D Rendering — Fixed-Point Math, Blitter Polygons, Rotozoom, Dot Tunnels, and Voxel Space ## Overview In 1985, the Amiga had no 3D hardware. No matrix engine, no floating-point unit, no texture mapper — just a 7 MHz integer-only 68000, a Blitter that could copy rectangles, and a Copper that could change display registers. Yet by 1990, demoscene coders were rendering real-time filled 3D objects, and by 1994 they were flying through voxel landscapes at playable framerates. The entire 3D pipeline — projection, clipping, rasterization, fill — was built from scratch in hand-tuned 68000 assembly using fixed-point arithmetic. This article covers the demoscene 3D rendering techniques that made it possible: fixed-point math, Blitter-filled polygons, texture-mapped rotozoom, dot tunnels, and voxel space rendering. Each technique maps to a specific hardware capability — and the demoscene's creative abuse of it. ```mermaid graph TB subgraph "Math Foundation" FP["Fixed-Point
16.16 arithmetic"] MATRIX["Matrix Ops
Rotation × projection"] SINCOS["Sine/Cosine
Table lookup"] end subgraph "Rendering" FILL["Filled Polygons
Blitter line + fill"] TEXTURE["Rotozoom
Affine texture map"] DOT["Dot Tunnel
Z-ordered circles"] VOXEL["Voxel Space
Raycast heightmap"] end subgraph "Optimization" C2P["Chunky-to-Planar
Kalms/Blitter C2P"] DIV["Fast Division
Reciprocal table"] CLIP["Screen-Space Clip
Cohen-Sutherland"] end FP --> FILL FP --> TEXTURE MATRIX --> FILL SINCOS --> MATRIX FILL --> DOT TEXTURE --> VOXEL C2P --> TEXTURE DIV --> VOXEL ``` --- ## Foundation: Fixed-Point Arithmetic The 68000 has no floating-point unit. All 3D math must use integers. The solution is **fixed-point** — encoding fractional values as integers with an implicit decimal point. ### 16.16 Fixed-Point Format ``` ┌──────────────────────────────────┬──────────────────────────────────┐ │ Upper 16 bits: integer part │ Lower 16 bits: fractional part │ │ (signed, two's complement) │ (unsigned, represents 0 to ~1) │ └──────────────────────────────────┴──────────────────────────────────┘ Example: 1.5 = $00018000 0.25 = $00004000 -1.0 = $FFFF0000 π = $0003243F (3.14159...) ``` ### Fixed-Point Operations ```c /* fixedpoint.h — 16.16 fixed-point arithmetic for 68000 */ typedef LONG FIXED; /* 32-bit signed: 16.16 format */ #define INT_TO_FIXED(x) ((FIXED)((x) << 16)) #define FIXED_TO_INT(x) ((WORD)((x) >> 16)) /* Truncate */ #define FIXED_TO_INT_R(x) (WORD)(((x) + 0x8000) >> 16) /* Round */ #define FLOAT_TO_FIXED(f) ((FIXED)((f) * 65536.0)) /* Multiply: result = a × b / 65536 On 68000, use 32×32→64 MULS.L then shift right 16 */ static inline FIXED fixed_mul(FIXED a, FIXED b) { /* 68000 asm: move.l a, d0 muls.l b, d0:d1 ; d0:d1 = 64-bit result swap d0 ; d0 = upper 32 bits (already >> 16) ; d0 contains the result */ return (FIXED)(((LONG)a * (LONG)b) >> 16); } /* Divide: result = a × 65536 / b Must multiply first to avoid losing precision */ static inline FIXED fixed_div(FIXED a, FIXED b) { /* 68000: be careful with overflow! Use reciprocal table for perspective division */ return (FIXED)(((LONG)a << 16) / (LONG)b); } ``` ### Sine/Cosine Tables Pre-calculated lookup tables are essential — computing `sin()` at runtime is too slow: ```c /* trig_tables.c — Pre-calculated 16.16 sine/cosine tables */ /* 1024 entries covering 0-2π, index = angle × 1024 / (2π) */ #define TRIG_TABLE_SIZE 1024 #define ANGLE_2PI 1024 /* Full circle = 1024 units */ /* 16.16 fixed-point: sin values range from -1.0 ($FFFF0000) to 1.0 ($00010000) */ static const FIXED sin_table[TRIG_TABLE_SIZE]; /* Generated at build time */ /* Fast lookup with wrapping */ static inline FIXED fast_sin(int angle) { return sin_table[angle & (TRIG_TABLE_SIZE - 1)]; } static inline FIXED fast_cos(int angle) { return sin_table[(angle + 256) & (TRIG_TABLE_SIZE - 1)]; /* cos = sin(x+π/2) */ } ``` --- ## Foundation: Matrix Operations 3D rotation uses 3×3 matrices multiplied with vertex coordinates. Each rotation (X, Y, Z axis) is a matrix multiply: ### Rotation Matrix Construction ```c /* matrix3d.c — 3D rotation matrices using fixed-point */ typedef struct { FIXED m[3][3]; /* 3×3 rotation matrix */ } Matrix3D; /* Build rotation matrix from Euler angles */ void build_rotation_matrix(Matrix3D *mat, int ax, int ay, int az) { FIXED sx = fast_sin(ax), cx = fast_cos(ax); FIXED sy = fast_sin(ay), cy = fast_cos(ay); FIXED sz = fast_sin(az), cz = fast_cos(az); /* Combined Z×Y×X rotation (standard demoscene order) */ mat->m[0][0] = fixed_mul(cy, cz); mat->m[0][1] = fixed_mul(fixed_mul(sx, sy), cz) - fixed_mul(cx, sz); mat->m[0][2] = fixed_mul(fixed_mul(cx, sy), cz) + fixed_mul(sx, sz); mat->m[1][0] = fixed_mul(cy, sz); mat->m[1][1] = fixed_mul(fixed_mul(sx, sy), sz) + fixed_mul(cx, cz); mat->m[1][2] = fixed_mul(fixed_mul(cx, sy), sz) - fixed_mul(sx, cz); mat->m[2][0] = -sy; mat->m[2][1] = fixed_mul(sx, cy); mat->m[2][2] = fixed_mul(cx, cy); } /* Transform a vertex: result = matrix × vertex */ void transform_vertex(const Matrix3D *mat, FIXED vx, FIXED vy, FIXED vz, FIXED *rx, FIXED *ry, FIXED *rz) { *rx = fixed_mul(mat->m[0][0], vx) + fixed_mul(mat->m[0][1], vy) + fixed_mul(mat->m[0][2], vz); *ry = fixed_mul(mat->m[1][0], vx) + fixed_mul(mat->m[1][1], vy) + fixed_mul(mat->m[1][2], vz); *rz = fixed_mul(mat->m[2][0], vx) + fixed_mul(mat->m[2][1], vy) + fixed_mul(mat->m[2][2], vz); } ``` ### Perspective Projection ```c /* project.c — Perspective projection to screen coordinates */ #define SCREEN_CX 160 /* Center X (320 wide) */ #define SCREEN_CY 128 /* Center Y (256 tall) */ #define FOCAL_LEN 256 /* Focal length in fixed-point */ void project_vertex(FIXED x, FIXED y, FIXED z, WORD *sx, WORD *sy) { /* Perspective divide: screen = world × focal / z */ if (z > INT_TO_FIXED(1)) { /* Avoid division by zero */ FIXED scale = fixed_div(INT_TO_FIXED(FOCAL_LEN), z); *sx = SCREEN_CX + FIXED_TO_INT(fixed_mul(x, scale)); *sy = SCREEN_CY - FIXED_TO_INT(fixed_mul(y, scale)); /* Y flipped */ } else { *sx = SCREEN_CX; /* Behind camera */ *sy = SCREEN_CY; } } ``` --- ## Technique 1: Blitter-Filled Polygons The Blitter's **line-draw + fill mode** combination is the foundation of Amiga 3D rendering. The process: 1. Draw polygon edges using Blitter line mode (sets pixels at boundaries) 2. Activate Blitter fill mode (fills between set pixels, even→odd fill rule) 3. Result: a filled polygon with zero CPU pixel writing ```mermaid sequenceDiagram participant CPU as 68000 CPU participant Blitter as Blitter participant Bitmap as Bitplane Memory Note over CPU: Polygon: 4 vertices CPU->>Blitter: Configure line mode (BLTCON0=$0B7A) CPU->>Blitter: Draw edge V1→V2 Blitter->>Bitmap: Write edge pixels CPU->>Blitter: Draw edge V2→V3 Blitter->>Bitmap: Write edge pixels CPU->>Blitter: Draw edge V3→V4 Blitter->>Bitmap: Write edge pixels CPU->>Blitter: Draw edge V4→V1 Blitter->>Bitmap: Write edge pixels CPU->>Blitter: Configure fill mode (BLTCON0=$01F2) CPU->>Blitter: Fill bitmap (even-odd rule) Blitter->>Bitmap: Fill interior pixels Note over Bitmap: Filled polygon ready ``` ### Blitter Fill Mode Detail The Blitter fill uses the **inclusive-odd** fill rule: scanning left to right, it inverts pixels each time it encounters a set bit. This means it fills between pairs of edge pixels: ```asm ; blit_fill.asm — Blitter fill for a single bitplane ; Assumes edges already drawn in the bitmap ; Fill from top to bottom of polygon lea $DFF000,a0 ; Custom registers base ; Set up fill-mode blit move.w #$01F2,BLTCON0(a0) ; Fill mode: A→D, fill enabled move.w #$0000,BLTCON1(a0) ; No line mode, ascending move.w #$FFFF,BLTAFWM(a0) ; First word mask = all bits move.w #$FFFF,BLTALWM(a0) ; Last word mask = all bits ; Source A = the bitmap data (edge pixels) move.l bitmap_start,BLTAPTH(a0) ; Source address ; Destination = same bitmap (fill in-place) move.l bitmap_start,BLTDPTH(a0) ; Dest = source ; Blit size: height × width ; height = number of scanlines, width = words per line move.w #(HEIGHT<<6)|(WORDS_PER_LINE),BLTSIZE(a0) ; Blitter starts immediately ``` ### Multiple-Face Sorting (Painter's Algorithm) For solid 3D objects, faces must be drawn back-to-front (painter's algorithm): ```c /* face_sort.c — Sort polygon faces by depth for painter's algorithm */ typedef struct { WORD num_vertices; WORD *vertices; /* Index into vertex array */ FIXED avg_z; /* Average Z depth (for sorting) */ UWORD color; /* Face color */ } Face; int compare_faces(const void *a, const void *b) { FIXED za = ((const Face *)a)->avg_z; FIXED zb = ((const Face *)b)->avg_z; /* Sort far-to-near (painter's algorithm) */ if (za > zb) return -1; /* a is farther, draw first */ if (za < zb) return 1; return 0; } void render_object(Face *faces, int num_faces, FIXED *transformed_z) { int i; /* Calculate average Z for each face */ for (i = 0; i < num_faces; i++) { FIXED sum = 0; int j; for (j = 0; j < faces[i].num_vertices; j++) { sum += transformed_z[faces[i].vertices[j]]; } faces[i].avg_z = sum / faces[i].num_vertices; } /* Sort back-to-front */ qsort(faces, num_faces, sizeof(Face), compare_faces); /* Draw each face */ for (i = 0; i < num_faces; i++) { draw_filled_polygon(&faces[i]); } } ``` --- ## Technique 2: Rotozoom (Affine Texture Mapping) Rotozoom renders a texture-mapped rectangle that can rotate and scale in real-time. The name comes from **rotate + zoom**. It works by computing a texture coordinate (U,V) for each screen pixel using an affine transform. ### Algorithm For each screen pixel (x, y), compute texture coordinates: ``` U = U_start + x × dU_dx + y × dU_dy V = V_start + x × dV_dx + y × dV_dy ``` Where `dU_dx`, `dV_dx`, `dU_dy`, `dV_dy` are the rotation+scale matrix coefficients. ```c /* rotozoom.c — Affine texture mapping (rotzoom) */ #define SCREEN_W 320 #define SCREEN_H 256 #define TEX_SIZE 256 /* Texture is 256×256 */ extern UBYTE texture[TEX_SIZE][TEX_SIZE]; /* Chunky texture */ extern UBYTE *chunky_buffer; /* Output chunky buffer */ void render_rotozoom(FIXED cx, FIXED cy, /* Texture center offset */ FIXED angle, FIXED zoom) { FIXED cos_a = fast_cos(angle); FIXED sin_a = fast_sin(angle); FIXED inv_zoom = fixed_div(INT_TO_FIXED(1), zoom); /* Rotation × inverse zoom matrix coefficients */ FIXED du_dx = fixed_mul(cos_a, inv_zoom); FIXED dv_dx = fixed_mul(sin_a, inv_zoom); FIXED du_dy = fixed_mul(-sin_a, inv_zoom); FIXED dv_dy = fixed_mul(cos_a, inv_zoom); /* Start position: center of screen maps to texture center */ FIXED u_start = cx - fixed_mul(INT_TO_FIXED(SCREEN_W/2), du_dx) - fixed_mul(INT_TO_FIXED(SCREEN_H/2), du_dy); FIXED v_start = cy - fixed_mul(INT_TO_FIXED(SCREEN_W/2), dv_dx) - fixed_mul(INT_TO_FIXED(SCREEN_H/2), dv_dy); UBYTE *dst = chunky_buffer; int y; for (y = 0; y < SCREEN_H; y++) { FIXED u = u_start; FIXED v = v_start; int x; for (x = 0; x < SCREEN_W; x++) { /* Texture lookup with wrapping */ *dst++ = texture[FIXED_TO_INT(u) & 0xFF] [FIXED_TO_INT(v) & 0xFF]; u += du_dx; v += dv_dx; } u_start += du_dy; v_start += dv_dy; } /* Convert chunky buffer to planar bitplanes (C2P) */ chunky_to_planar(chunky_buffer, bitplane_data, SCREEN_W, SCREEN_H); } ``` ### Rotozoom in Assembly (Inner Loop) The 68000 assembly inner loop is highly optimized. The key insight: texture coordinates wrap at power-of-2 boundaries, so masking with `$FF` (256-wide texture) is free using byte-level addressing: ```asm ; rotozoom_inner.asm — Optimized inner loop ; a0 = destination (chunky buffer) ; a1 = texture base (256×256) ; d0 = U (16.16 fixed) ; d1 = V (16.16 fixed) ; d2 = dU/dx (16.16 fixed) ; d3 = dV/dx (16.16 fixed) ; d4 = loop counter (SCREEN_W) .roto_inner: move.l d0,d5 ; Copy U swap d5 ; d5.w = integer part of U move.l d1,d6 ; Copy V swap d6 ; d6.w = integer part of V ; Texture lookup: tex[v & 0xFF][u & 0xFF] and.w #$FF,d5 ; U mask (free wrap) and.w #$FF,d6 ; V mask (free wrap) lsl.w #8,d6 ; V × 256 (row offset) move.b (a1,d5.w),d7 ; Texel = tex[v*256+u] move.b d7,(a0)+ ; Write to chunky buffer add.l d2,d0 ; U += dU/dx add.l d3,d1 ; V += dV/dx dbra d4,.roto_inner ``` --- ## Technique 3: Dot Tunnel The dot tunnel renders concentric rings that appear to fly toward the viewer, creating the illusion of traveling through a tunnel. Each ring is a circle rendered at a specific Z-depth. ### Algorithm ```c /* dot_tunnel.c — Z-ordered ring tunnel effect */ #define NUM_RINGS 30 #define MAX_Z 1024 #define RING_POINTS 32 typedef struct { FIXED z; /* Depth (0=near, far=background) */ FIXED radius; /* Apparent radius (decreases with z) */ WORD cx, cy; /* Center (can be animated) */ UWORD color; /* Ring color */ } Ring; void render_dot_tunnel(Ring *rings, int num_rings, ULONG frame) { int i; /* Update ring positions (move toward viewer) */ for (i = 0; i < num_rings; i++) { rings[i].z -= INT_TO_FIXED(4); /* Speed toward viewer */ /* If ring passed camera, reset to far end */ if (rings[i].z < INT_TO_FIXED(1)) { rings[i].z = INT_TO_FIXED(MAX_Z); } /* Perspective projection of radius */ rings[i].radius = fixed_div( INT_TO_FIXED(200), /* Base radius */ rings[i].z /* Divide by depth */ ); } /* Sort rings far-to-near (painter's algorithm) */ /* ... sort by rings[i].z descending ... */ /* Draw each ring */ for (i = 0; i < num_rings; i++) { int p; int radius = FIXED_TO_INT(rings[i].radius); int cx = rings[i].cx + FIXED_TO_INT( fixed_mul(fast_sin(frame * 3), INT_TO_FIXED(30))); int cy = rings[i].cy + FIXED_TO_INT( fixed_mul(fast_cos(frame * 5), INT_TO_FIXED(20))); for (p = 0; p < RING_POINTS; p++) { int angle = p * 360 / RING_POINTS; FIXED sa = fast_sin(angle * TRIG_TABLE_SIZE / 360); FIXED ca = fast_cos(angle * TRIG_TABLE_SIZE / 360); WORD px = cx + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), sa)); WORD py = cy + FIXED_TO_INT(fixed_mul(INT_TO_FIXED(radius), ca)); /* Plot pixel or draw Blitter circle at (px, py) */ plot_dot(px, py, rings[i].color); } } } ``` --- ## Technique 4: Voxel Space Voxel space renders a 3D landscape from a 2D heightmap and colormap. The algorithm casts rays from the viewer, one per screen column, and draws vertical strips of pixels. The result is a fly-over landscape effect, as seen in the 1994 demo "Space Rangers" by Rebels. ### Algorithm (Column-Based Raycasting) ```mermaid sequenceDiagram participant CPU as 68000 CPU participant HM as Heightmap participant CM as Colormap participant Screen as Screen Buffer Note over CPU: For each screen column x (0-319): CPU->>CPU: Calculate ray direction for column x CPU->>HM: Sample height at (ray_x, ray_z) HM-->>CPU: height value h CPU->>CPU: Project h to screen Y: y = horizon - h/z CPU->>CM: Get color at (ray_x, ray_z) CM-->>CPU: color value CPU->>Screen: Draw vertical line from y to previous y in color Note over CPU: Advance ray (step outward) CPU->>HM: Sample next point... ``` ### Voxel Space Implementation ```c /* voxelspace.c — Column-based voxel landscape rendering */ #define SCREEN_W 320 #define SCREEN_H 256 #define MAP_SIZE 256 #define HORIZON 100 /* Horizon line Y position */ #define MAX_DEPTH 200 /* Maximum ray distance */ extern UBYTE heightmap[MAP_SIZE][MAP_SIZE]; extern UBYTE colormap[MAP_SIZE][MAP_SIZE]; extern UBYTE *chunky_buffer; void render_voxel_space(FIXED cam_x, FIXED cam_z, FIXED cam_angle, FIXED cam_height) { int x; for (x = 0; x < SCREEN_W; x++) { /* Ray angle: camera angle + column offset */ FIXED column_offset = INT_TO_FIXED(x - SCREEN_W/2); FIXED ray_angle = cam_angle + fixed_div(column_offset, INT_TO_FIXED(FOCAL_LEN)); FIXED ray_dx = fast_cos(ray_angle); /* Direction X */ FIXED ray_dz = fast_sin(ray_angle); /* Direction Z */ FIXED ray_x = cam_x; FIXED ray_z = cam_z; WORD prev_draw_y = SCREEN_H; /* Bottom of column */ int distance; for (distance = 1; distance < MAX_DEPTH; distance++) { FIXED dz = fixed_div(INT_TO_FIXED(distance), ray_dz); FIXED dx = fixed_div(INT_TO_FIXED(distance), ray_dx); /* Current map position */ int mx = (FIXED_TO_INT(cam_x + dx)) & (MAP_SIZE - 1); int mz = (FIXED_TO_INT(cam_z + dz)) & (MAP_SIZE - 1); /* Height at this point */ FIXED terrain_h = INT_TO_FIXED(heightmap[mz][mx]); /* Project to screen Y */ FIXED height_diff = terrain_h - cam_height; WORD draw_y = HORIZON - FIXED_TO_INT(fixed_div(height_diff, INT_TO_FIXED(distance))); /* Only draw if above previously drawn pixel */ if (draw_y < prev_draw_y) { UBYTE color = colormap[mz][mx]; int y; for (y = draw_y; y < prev_draw_y; y++) { chunky_buffer[y * SCREEN_W + x] = color; } prev_draw_y = draw_y; } /* Step ray outward */ ray_x += ray_dx; ray_z += ray_dz; } } /* C2P conversion for planar display */ chunky_to_planar(chunky_buffer, bitplane_data, SCREEN_W, SCREEN_H); } ``` ### Voxel Space Optimization The naive algorithm is too slow for 50 FPS on a 68000. Key optimizations: | Optimization | Speedup | How | |-------------|---------|-----| | **Reciprocal table** | 2× | Pre-compute 1/z values, avoid division | | **Step doubling** | 3-4× | Double step size beyond certain depth (less detail needed) | | **Height caching** | 1.5× | Cache last N height lookups | | **Reduced resolution** | 2-4× | Render at 160×128 and scale up (acceptable for landscape) | | **Fast C2P** | 30× | Use Kalms C2P instead of naive conversion | --- ## Performance Budget ### 3D Rendering Costs on Stock A500 (7 MHz 68000) | Operation | Cycles (approx.) | Notes | |-----------|------------------|-------| | Fixed-point multiply | ~28 | `MULS.W` (16×16→32) | | Fixed-point divide | ~140 | `DIVS.W` — very expensive! | | Sine table lookup | ~12 | Table indexed by angle | | Vertex transform | ~300 | 3 multiplies + 3 adds per axis | | Perspective divide | ~160 | 2 divides per vertex | | Blitter line draw | ~200/edge | DMA time for edge | | Blitter fill | ~2000/polygon | Depends on polygon size | | Full C2P (Kalms) | ~35ms | 320×256 × 8bpp | | Voxel column | ~500/col | Heightmap lookup + draw | ### Frame Budget (PAL: 20ms per frame) | Effect | Vertices | Time | FPS | |--------|----------|------|-----| | Single flat-shaded cube | 8 | ~3ms | 50 | | 100-face object | 30+ | ~12ms | 30-50 | | Rotozoom 320×256 | 0 (per pixel) | ~40ms (with C2P) | 15-25 | | Dot tunnel 30 rings | 960 dots | ~8ms | 50 | | Voxel space 320×256 | 64K cols | ~80ms | 6-12 | --- ## Antipatterns ### 1. The Floating-Point Temptation Using floating-point math on the 68000. The 68881 FPU is optional — most Amigas don't have one. Software floating-point emulation is **100× slower** than fixed-point. **Broken:** ```c /* Don't do this — requires FPU or slow software emulation */ float x = sin(angle) * distance; float y = cos(angle) * distance; ``` **Fixed:** ```c /* Use fixed-point with pre-calculated tables */ FIXED x = fixed_mul(fast_sin(angle), distance); FIXED y = fixed_mul(fast_cos(angle), distance); ``` ### 2. The Per-Pixel Divide Calling `fixed_div()` for every pixel in a rotozoom or voxel renderer. Division is the most expensive operation on the 68000 (~140 cycles for 16-bit, ~280 for 32-bit). **Broken:** ```c for (x = 0; x < 320; x++) { for (y = 0; y < 256; y++) { FIXED u = fixed_div(x, z); /* DIVIDE PER PIXEL! */ FIXED v = fixed_div(y, z); } } ``` **Fixed:** ```c /* Pre-compute step values (multiply instead of divide) */ FIXED du_dx = fixed_mul(scale, inv_z); /* One divide per frame */ FIXED dv_dy = fixed_mul(scale, inv_z); for (y = 0; y < 256; y++) { FIXED u = u_start; for (x = 0; x < 320; x++) { u += du_dx; /* ADD, not multiply/divide */ } u_start += du_dy; } ``` ### 3. The Backface Cull Miss Skipping backface culling for convex objects. Every polygon drawn behind other polygons is wasted Blitter time. A simple dot-product test rejects ~50% of faces. **Broken:** ```c /* Draw all faces — 50% are facing away! */ for (i = 0; i < num_faces; i++) { draw_filled_polygon(&faces[i]); /* Wastes time on hidden faces */ } ``` **Fixed:** ```c for (i = 0; i < num_faces; i++) { /* Backface cull: if face normal points away, skip it */ FIXED nx = compute_normal_x(&faces[i]); FIXED nz = compute_normal_z(&faces[i]); if (nz < 0) continue; /* Facing away from camera */ draw_filled_polygon(&faces[i]); } ``` ### 4. The Unsorted Z-Fight Drawing faces in random order without depth sorting. Overlapping polygons flicker as they overwrite each other unpredictably each frame. **Broken:** ```c /* Draw faces in arbitrary order → z-fighting */ for (i = 0; i < num_faces; i++) { draw_filled_polygon(&faces[i]); } ``` **Fixed:** ```c /* Sort by average Z depth (painter's algorithm) */ qsort(faces, num_faces, sizeof(Face), compare_faces_back_to_front); for (i = 0; i < num_faces; i++) { draw_filled_polygon(&faces[i]); } ``` ### 5. The Naive C2P Using a naive chunky-to-planar conversion for rotozoom/voxel output. The naive method processes each pixel individually with bit shifts, taking over 1 second per frame on a stock 68000. **Broken:** ```c /* Naive C2P: ~70,000 pixels/sec — 0.9 FPS for 320×256 */ for (i = 0; i < 320*256; i++) { int pixel = chunky[i]; for (bit = 0; bit < 8; bit++) { planes[bit][i/8] |= ((pixel >> bit) & 1) << (7 - (i & 7)); } } ``` **Fixed:** ```c /* Use Kalms C2P or Blitter-assisted C2P: ~30× faster */ kalms_c2p(chunky_buffer, planar_data, 320, 256); /* See pixel_conversion.md for full implementation */ ``` --- ## Decision Guide ```mermaid flowchart TD START[Need 3D rendering] --> Q1{What are you rendering?} Q1 -->|Solid objects| Q2{Convex or arbitrary?} Q1 -->|Landscapes| VOXEL[Voxel Space] Q1 -->|Textures| ROTZ[Rotozoom] Q1 -->|Abstract/Tunnel| DOT[Dot Tunnel] Q2 -->|Convex| Q3{Number of faces?} Q2 -->|Arbitrary| BSP[BSP-tree or Z-buffer approach] Q3 -->|<100| SIMPLE[Simple painter's algorithm
+ backface cull] Q3 -->|100-500| SORT[Sorted faces + Blitter fill] Q3 -->|>500| Q4{Stock A500 or accelerated?} Q4 -->|Stock A500| REDUCE[Reduce geometry or
use wireframe only] Q4 -->|Accelerated| OPT[Optimized fill + C2P pipeline] VOXEL --> Q5{Resolution?} Q5 -->|Full 320×256| SLOW[~8 FPS on stock A500] Q5 -->|Half 160×128| OK[~15-25 FPS on A500] ROTZ --> Q6{Platform?} Q6 -->|OCS/ECS| C2P_PATH[C2P required
~15-25 FPS] Q6 -->|RTG/AGA| CHUNKY[Direct chunky write
~50 FPS possible] ``` --- ## Historical Timeline ```mermaid timeline title 3D Rendering on Amiga 1987 : First wireframe 3D demos (line drawing only) 1988 : Hidden line removal algorithms : Flat-shaded 3D in demoscene 1989 : Phenomena — filled polygon 3D objects : Fixed-point math becomes standard 1990 : Complex demo — multiple filled objects : First rotozoom effects appear 1991 : Texture-mapped rotozoom at acceptable framerates : 3D starfields with depth sorting 1992 : Voxel space demos appear (low resolution) : Dot tunnels with Blitter-optimized rendering 1993 : Spaceballs — state-of-the-art dot tunnel : Higher-resolution voxel landscapes 1994 : Rebels — smooth voxel space fly-over : 68040-accelerated 3D at full frame rate 1995 : Polka Brothers — CPU raytracer (proof of concept) : 68060-accelerated demos with lighting 2000+ : Demoscene pushes 3D on stock A500 : Group-optimized rotozoom achieves new speed records : MiSTer preserves exact Blitter timing for fill accuracy ``` --- ## Modern Analogies | Amiga 3D Concept | Modern Equivalent | Why It Maps | |-----------------|-------------------|-------------| | Fixed-point 16.16 | Half-precision float (FP16) | Both trade precision for speed | | Sine lookup table | GPU SFU (Special Function Unit) | Both use hardware-assisted transcendental | | Blitter fill mode | GPU rasterizer | Both fill polygon interiors | | Painter's algorithm | Z-buffer / depth test | Both solve polygon visibility | | Backface culling | GPU backface culling | Both skip invisible faces | | Rotozoom | Affine texture sampling | Both use 2×2 matrix transform per pixel | | Voxel space raycasting | Heightfield terrain shader | Both cast rays through a heightmap | | C2P conversion | Texture swizzle/deswizzle | Both convert between memory layouts | | Reciprocal table | GPU reciprocal approximation | Both avoid expensive division | | Chunky buffer | Render-to-texture (FBO) | Both render to off-screen buffer | --- ## Use Cases | Use Case | Technique | Notable Examples | |----------|-----------|-----------------| | 3D game objects | Filled polygons | Flight simulators, Elite clones | | Rotating logo | Rotozoom | Every demo with a bitmap logo | | Tunnel fly-through | Dot tunnel | Spaceballs, numerous demos | | Landscape fly-over | Voxel space | Rebels, numerous demos | | 3D chess/board games | Filled polygons + sorting | Various Amiga games | | Virtual reality scenes | Combined techniques | Various demo compos | | Star field | Z-ordered point rendering | Standard demo effect | | Wavy floor/ceiling | Rotozoom variant | Doom-like perspective tricks | --- ## FPGA / Emulation Impact | Concern | Impact | Notes | |---------|--------|-------| | **Blitter fill timing** | Fill must use exact inclusive-odd rule | Emulators must match Blitter fill behavior precisely | | **Line-draw accuracy** | Blitter Bresenham must match real hardware | Affects polygon edge positions | | **C2P pipeline** | Chunky→Planar timing affects frame rate | Must be accounted for in demo timing | | **Fixed-point overflow** | 68000 MULS.L/DIVS.L edge cases | 32-bit overflow behavior must match hardware | | **Blitter-CPU interleaving** | BLTPRI affects CPU stall duration | Must match real Blitter busy-wait timing | --- ## FAQ **Q: Why not use the FPU for 3D math?** A: The 68881/68882 FPU is optional hardware that most Amiga models don't have. Software FPU emulation is 50-100× slower than fixed-point integer math. Only 68030/040/060 accelerated Amigas typically have an FPU, and even then, fixed-point is faster for many operations because the 68000's integer multiply is well-optimized. **Q: What is the fastest C2P for rotozoom?** A: The Kalms C2P is the standard. For AGA machines with 32-bit Blitter access, a Blitter-assisted C2P can be even faster. For RTG cards, C2P is unnecessary — write directly to chunky VRAM. See [Pixel Conversion](../08_graphics/pixel_conversion.md) for benchmarks. **Q: How do I handle polygon clipping?** A: For simple 3D objects, screen-space clipping (Cohen-Sutherland or Sutherland-Hodgman) is sufficient. Clip against the four screen edges. For objects that can go behind the camera, you need near-plane clipping in 3D space — this is much more complex and most demos avoid it by keeping objects in front of the camera. **Q: Can I do texture-mapped polygons (not just rotozoom)?** A: Yes, but affine texture mapping (per-polygon UV) produces visible distortion on large polygons. Correct perspective texture mapping requires per-pixel division, which is too slow on a 68000. Most demos use subdivision (split large polygons into smaller ones) or simply use rotozoom for the entire screen. **Q: What is a dot matrix / voxel display?** A: A voxel (volume pixel) display renders 3D data as a grid of points. On the Amiga, this typically means rendering heightmap terrain as vertical columns (voxel space) or rendering 3D point clouds. The Blitter's line-draw mode can efficiently render individual dots. --- ## References ### Related Knowledge Base Articles - [Pixel Conversion](../08_graphics/pixel_conversion.md) — C2P algorithms (Kalms, Blitter, Akiko) - [Blitter Programming](../08_graphics/blitter/blitter_programming.md) — Fill mode, line draw, minterms - [Blitter](../08_graphics/blitter/blitter.md) — Blitter hardware architecture - [Bitmap](../08_graphics/bitmap.md) — Bitplane memory layout, interleaving - [Copper Effects](copper_effects.md) — Copper-driven display effects - [Timing Optimization](timing_optimization.md) — Cycle counting, Blitter-CPU interleaving - [FPU Architecture](../15_fpu_mmu_cache/fpu_architecture.md) — 68881/68882 floating-point ### External Resources - **Amiga Hardware Reference Manual** — Blitter fill mode, line-draw mode - **Scoopex Amiga Hardware Programming** (Photon) — [YouTube playlist](https://www.youtube.com/playlist?list=PLc3ltHgmiidpK-s0eP5hTKJnjdTHz0_bW) — Blitter fill mode and line-draw video walkthroughs; companion articles at [coppershade.org](http://coppershade.org/) - **Pouet.net** — https://www.pouet.net — 3D demo releases with source code - **Demozoo** — https://demozoo.org — Demoscene production encyclopedia - **Amiga Graphics Archive** — https://amiga.lychesis.net — Copper-enhanced 3D rendering analysis in commercial games - **Kalms C2P** — Standard chunky-to-planar implementation - **Comanche Voxel Engine** — Original voxel space algorithm reference (NovaLogic)