translator.library — English-to-Phonetic Translation for Speech Synthesis
Overview
translator.library is the front half of the Amiga's built-in text-to-speech pipeline: a single-function library that converts unrestricted English text into phonetic strings — the expanded ARPABET phoneme codes used by narrator.device to generate human-like speech through the Amiga's audio hardware. Introduced with AmigaOS 1.2 and distributed as a disk-based library in LIBS:, it encapsulates over 450 context-sensitive pronunciation rules, an exception dictionary for irregular words (through, though, cough), abbreviation expansion (Dr., Prof., lb.), and automatic content-word accentuation — all in a single call: Translate(). The output is a string of space-delimited phoneme codes with stress markers that can be passed directly to narrator.device via CMD_WRITE, stored for later playback, or analyzed for phonetic research. While hand-coded phonetics always produce higher-quality speech, Translate() is the only practical option when the input is arbitrary user text at runtime.
Architecture
The Amiga Speech Pipeline
flowchart LR
subgraph INPUT["Input Layer"]
ENG["English Text<br/>(ASCII)"]
PHON["Hand-Coded<br/>Phonetic String"]
end
subgraph TRANSLATOR["translator.library"]
TR["Translate()<br/>English → Phonetic"]
RULES["450+ Context Rules<br/>Exception Dictionary<br/>Abbreviation Expansion"]
end
subgraph NARRATOR["narrator.device"]
SYNTH["Speech Synthesizer<br/>Formant Model"]
MOUTH["Mouth Shape<br/>Generator"]
end
subgraph OUTPUT["Output Layer"]
AUDIO["audio.device<br/>DMA Audio Channels"]
MOUTHDATA["mouth_rb<br/>Width/Height"]
end
ENG --> TR
TR --> RULES
RULES --> TR
TR -->|"Phonetic String"| SYNTH
PHON --> SYNTH
SYNTH --> AUDIO
SYNTH --> MOUTH
MOUTH --> MOUTHDATA
style TR fill:#e8f4fd,stroke:#2196f3,color:#333
style SYNTH fill:#fff3e0,stroke:#ff9800,color:#333
Library Base
| Name | Type | Description |
|---|---|---|
| TranslatorBase | struct Library * | Library base pointer returned by OpenLibrary() |
| ITranslator | Interface pointer (OS 4.x+) | Interface-based access for AmigaOS 4+ |
translator.library is a disk-based library — it lives in LIBS:translator.library, not in ROM. This means OpenLibrary() can fail if the file is missing, and the library can be expunged from memory under low-memory conditions.
Key Design Decisions
| Decision | Rationale |
|---|---|
| Single-function API | Translation is inherently stateless — input text, output phonetics. No session, no configuration |
| Disk-based, not ROM | Phonetic dictionary is large (~20+ KB of rules); keeping it out of ROM saves Kickstart space |
| Negative return codes for overflow | Allows progressive translation of long texts without pre-allocating a huge buffer |
| Rule-based, not neural | 1985 technology couldn't run a neural TTS; the 450 context-sensitive rules were state-of-the-art for the era |
API Reference
Opening and Closing
/* Classic AmigaOS (1.x–3.x) — LVO -30 */
struct Library *TranslatorBase;
TranslatorBase = OpenLibrary("translator.library", 0);
if (!TranslatorBase) { /* LIBS:translator.library not found */ }
/* ... use Translate() ... */
CloseLibrary(TranslatorBase);
/* AmigaOS 4.x: interface-based */
struct Library *TranslatorBase;
struct TranslatorIFace *ITranslator;
TranslatorBase = IExec->OpenLibrary("translator.library", 0);
if (TranslatorBase)
{
    ITranslator = (struct TranslatorIFace *)
        IExec->GetInterface(TranslatorBase, "main", 1, NULL);
    if (ITranslator)
    {
        /* ... use ITranslator->Translate() ... */
        /* Drop the interface only if it was actually obtained */
        IExec->DropInterface((struct Interface *)ITranslator);
    }
    /* Close the library only if it was actually opened */
    IExec->CloseLibrary(TranslatorBase);
}
Translate()
/* LVO -30: converts English text to a phonetic string */
LONG Translate(STRPTR input,    /* a0: English input string */
               LONG inputLen,   /* d0: length of input */
               STRPTR output,   /* a1: output buffer for phonetics */
               LONG outputSize  /* d1: size of output buffer */);
| Parameter | Description |
|---|---|
| input | Null-terminated or length-delimited English ASCII string. Case-insensitive; punctuation is preserved where it affects pronunciation |
| inputLen | Number of characters to translate from input. Use strlen(input) for the full string |
| output | Pre-allocated buffer to receive the phonetic string. Must be large enough: phonetics are typically 2–4× the input length |
| outputSize | Size of the output buffer in bytes |
Return value:
| Return | Meaning |
|---|---|
| 0 | Full translation succeeded; output buffer was large enough |
| Negative value | Buffer overflow: translation stopped at a word boundary. -(rtnCode) is the character offset in the input string where translation ended. Resume by calling Translate(input + offset, inputLen - offset, output, outputSize) |
| Other non-zero | Translation error (unlikely, since the library tries to translate literally if rules fail) |
Note
When the return value is negative, translation has stopped at a word boundary (space or punctuation), never mid-word. This prevents split phonemes and makes resumption seamless.
Output Format
The output is a space-delimited string of ARPABET phoneme codes with stress markers appended to vowels:
Input: "This is Amiga speaking."
Output: "DH IH1 Z IH1 Z AE1 M IH0 G AH0 S P IY1 K IH0 NG ."
└─ "This" ─┘ └"is"─┘ └─── "Amiga" ───┘ └─── "speaking" ───┘
| Marker | Meaning | Example |
|---|---|---|
| 0 | No stress (unstressed vowel) | IH0 = unstressed "i" (as in "rabbit") |
| 1 | Primary stress | IY1 = stressed "ee" (as in "speak") |
| 2 | Secondary stress | OW2 = secondary "oh" (as in "overflow") |
| 3 | Emphatic stress (rare) | Used for contrastive emphasis |
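As a rough illustration of how the stress markers could be consumed by a program, the sketch below counts tokens that end in a given stress digit. It assumes the format exactly as described above (space-delimited tokens, digit appended to the vowel code); `count_stress` is a hypothetical helper name, not part of any Amiga API, and the code is plain portable C.

```c
#include <ctype.h>

/* Count phoneme tokens that carry the given stress digit ('0'..'3').
 * Assumes space-delimited tokens with the digit as the final character,
 * as in the output format described in this section. */
static int count_stress(const char *phon, char digit)
{
    int count = 0;
    const char *p = phon;

    while (*p) {
        const char *start;

        while (*p == ' ')            /* skip separators */
            p++;
        start = p;
        while (*p && *p != ' ')      /* scan one token */
            p++;

        /* A stressed vowel is a letter code followed by the digit;
         * single-character tokens such as "." never match. */
        if (p > start + 1 && p[-1] == digit &&
            isalpha((unsigned char)start[0]))
            count++;
    }
    return count;
}
```

For the sample string shown earlier, `count_stress(s, '1')` would count the four primary-stressed vowels (IH1, IH1, AE1, IY1).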
Phonetic Output Examples
| English Input | Phonetic Output (approx.) |
|---|---|
| Hello world. | HH EH0 L OW1 W ER1 L D . |
| The quick brown fox. | DH AH0 K W IH1 K B R AW1 N F AA1 K S . |
| Amiga | AE1 M IH0 G AH0 or AH0 M IY1 G AH0 (both valid) |
| Commodore | K AA1 M AH0 D AO1 R |
| Guru Meditation | G UH1 R UW0 M EH2 D IH0 T EY1 SH AH0 N |
Warning
The translator library was designed for American English pronunciation. British spellings (colour, centre) and non-English words will be translated using American phonetic rules and may sound odd.
Integration with narrator.device
The standard workflow:
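#include <exec/types.h>
#include <exec/io.h>
#include <devices/narrator.h>
#include <clib/exec_protos.h>
#include <clib/alib_protos.h> /* CreatePort(), CreateExtIO() */
#include <clib/translator_protos.h>
#include <string.h>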
/* 1. Open translator */
struct Library *TranslatorBase = OpenLibrary("translator.library", 0);
/* 2. Open narrator device */
struct MsgPort *mp = CreatePort(NULL, 0);
struct narrator_rb *voiceIO = (struct narrator_rb *)
CreateExtIO(mp, sizeof(struct narrator_rb));
OpenDevice("narrator.device", 0, (struct IORequest *)voiceIO, 0);
/* 3. Translate English → phonetic */
#define PHONBUF_SIZE 2048
STRPTR english = "Welcome to the Amiga speech system.";
UBYTE phonBuffer[PHONBUF_SIZE];
LONG result = Translate(english, strlen(english),
(STRPTR)phonBuffer, PHONBUF_SIZE);
if (result == 0)
{
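/* 4. Configure voice parameters */
static UBYTE chanMasks[] = { 3, 5, 10, 12 }; /* stereo pairs on channels 0-3 */
voiceIO->ch_masks = chanMasks; /* channel masks must be set before CMD_WRITE */
voiceIO->nm_masks = sizeof(chanMasks);
voiceIO->rate = 150; /* words per minute */
voiceIO->pitch = 110; /* Hz baseline */
voiceIO->sex = 0; /* 0=male, 1=female */
voiceIO->volume = 64; /* 0–64 */
voiceIO->sampfreq = 22200; /* Hz (Amiga native rate) */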
/* 5. Send to narrator */
voiceIO->message.io_Command = CMD_WRITE;
voiceIO->message.io_Data = phonBuffer;
voiceIO->message.io_Length = strlen((STRPTR)phonBuffer);
DoIO((struct IORequest *)voiceIO);
}
/* 6. Cleanup */
CloseDevice((struct IORequest *)voiceIO);
DeleteExtIO((struct IORequest *)voiceIO);
DeletePort(mp);
CloseLibrary(TranslatorBase);
When to Use / When NOT to Use
| Scenario | Use Translate()? | Rationale |
|---|---|---|
| Unrestricted user input (text editor, terminal, chat) | ✅ Yes | Only practical option — you can't pre-code phonetics for arbitrary text |
| Fixed application strings (game dialog, error messages) | ❌ No | Hand-code phonetics once; ship the phonetic strings. Much better quality |
| Accessibility screen reader | ✅ Yes | Essential — must speak whatever is on screen |
| Demo/game with iconic lines | ❌ No | Hand-tune phonetics, stress, and timing for maximum impact |
| Multi-language support | ❌ No | translator.library is English-only; use a third-party TTS or pre-recorded samples |
| Phonetic research/analysis | ⚠️ Maybe | Output is useful for analysis but not linguistically rigorous — use as a starting point |
| Speaking numbers/dates | ⚠️ Maybe | Library handles some abbreviations but not all; pre-process complex formats into spelled-out words |
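For the numbers/dates row, one possible pre-processing step is to spell out digits before the text reaches Translate(). The sketch below is a deliberately simple, portable C illustration: it speaks each digit individually ("42" becomes "four two") and makes no attempt at dates or place values. `spell_digits` is a hypothetical helper, not a library function.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing every ASCII digit with its spelled-out name.
 * cap is the capacity of dst including the terminator.
 * Returns the output length, or -1 if dst is too small. */
static long spell_digits(const char *src, char *dst, size_t cap)
{
    static const char *const names[10] = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"
    };
    size_t out = 0;

    if (cap == 0)
        return -1;

    for (; *src; src++) {
        if (*src >= '0' && *src <= '9') {
            const char *w = names[*src - '0'];
            size_t n = strlen(w);
            /* Separate the word from neighbouring letters and digits. */
            size_t lead  = (out > 0 && dst[out - 1] != ' ') ? 1 : 0;
            size_t trail = isalnum((unsigned char)src[1]) ? 1 : 0;

            if (out + lead + n + trail + 1 > cap)
                return -1;
            if (lead)
                dst[out++] = ' ';
            memcpy(dst + out, w, n);
            out += n;
            if (trail)
                dst[out++] = ' ';
        } else {
            if (out + 2 > cap)       /* one character plus terminator */
                return -1;
            dst[out++] = *src;
        }
    }
    dst[out] = '\0';
    return (long)out;
}
```

A real application would likely want number-aware expansion ("forty-two") and locale-specific date handling; this only shows where such a step would sit in the pipeline.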
Pitfalls & Common Mistakes
1. Underestimating Phonetic Buffer Size
The phonetic representation is almost always longer than the English input. A 100-character sentence typically produces 200–400 bytes of phonetics:
/* BAD: Buffer sized like the input; overflows on the first long word */
UBYTE phonBuf[64];
STRPTR english = "The extraordinarily complicated implementation...";
LONG result = Translate(english, strlen(english), (STRPTR)phonBuf, sizeof(phonBuf));
/* result will be negative: the phonetics for "extraordinarily" alone are ~40 chars */
/* CORRECT: Allocate 4× the input length plus 512 bytes of headroom */
#define PHONBUF_SIZE(maxInput) (((maxInput) * 4) + 512)
ULONG phonSize = PHONBUF_SIZE(strlen(english));
UBYTE *phonBuf = AllocMem(phonSize, MEMF_ANY);
if (phonBuf)
{
    LONG result = Translate(english, strlen(english), (STRPTR)phonBuf, phonSize);
    /* ... use phonBuf, then release it with the same size ... */
    FreeMem(phonBuf, phonSize);
}
2. Ignoring Negative Return Code
A negative return from Translate() is a resumption offset, not a fatal error:
/* BAD: Treats partial translation as failure */
LONG rtn = Translate(text, len, buf, size);
if (rtn != 0) { /* panic — but text was partially translated! */ }
/* CORRECT: Consume each chunk, then resume from the offset on negative return */
LONG offset = 0;
while (offset < len)
{
    LONG rtn = Translate(text + offset, len - offset, buf, BUF_SIZE);
    if (rtn > 0) break; /* unexpected error */
    /* ... speak or store buf here, before the next call overwrites it ... */
    if (rtn == 0) break; /* done */
    offset += (-rtn); /* resume from word boundary */
}
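The resume logic can be exercised off-target with a stand-in for Translate(). In the portable C sketch below, `mock_translate` is a hypothetical substitute that merely upper-cases its input but follows the return contract described in this document (0 when everything fit, a negative offset at a word boundary on overflow, positive on error). `translate_in_chunks` is likewise an illustrative name. It is a test harness idea under those assumptions, not a model of the real rule engine.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Translate(): upper-cases the input, but stops
 * at a word boundary when the next word would overflow `out`. */
static long mock_translate(const char *in, long inLen, char *out, long outSize)
{
    long i = 0, o = 0;

    if (outSize <= 0)
        return 1;
    while (i < inLen) {
        long w = i;
        while (w < inLen && in[w] != ' ') w++;   /* end of word */
        while (w < inLen && in[w] == ' ') w++;   /* trailing spaces */
        if (o + (w - i) + 1 > outSize) {         /* word does not fit */
            out[o] = '\0';
            return (i > 0) ? -i : 1;             /* 1: nothing fit at all */
        }
        for (; i < w; i++)
            out[o++] = (char)toupper((unsigned char)in[i]);
    }
    out[o] = '\0';
    return 0;
}

/* Translate `text` through a small chunk buffer, appending every chunk to
 * `dest`. Returns the number of translate calls made, or -1 on error. */
static int translate_in_chunks(const char *text, char *dest, size_t destCap,
                               long chunkSize)
{
    char chunk[64];
    long len = (long)strlen(text), offset = 0;
    int calls = 0;

    if (destCap == 0)
        return -1;
    if (chunkSize > (long)sizeof chunk)
        chunkSize = (long)sizeof chunk;
    dest[0] = '\0';

    while (offset < len) {
        long rc = mock_translate(text + offset, len - offset, chunk, chunkSize);
        calls++;
        if (rc > 0)
            return -1;                           /* unexpected error */
        if (strlen(dest) + strlen(chunk) + 1 > destCap)
            return -1;
        strcat(dest, chunk);                     /* consume before reuse */
        if (rc == 0)
            break;                               /* final chunk */
        offset += -rc;                           /* resume at word boundary */
    }
    return calls;
}
```

The key point the loop illustrates is ordering: each chunk is consumed before the offset advances and the buffer is overwritten by the next call.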
3. Passing Non-Null-Terminated Input with Wrong Length
If inputLen doesn't match the actual string, Translate() reads garbage or stops early:
/* BAD: strlen() on a buffer that may not be null-terminated */
UBYTE buf[256];
Read(fh, buf, 256); /* may fill entire buffer — no terminator */
Translate((STRPTR)buf, strlen((STRPTR)buf), out, 1024);
/* strlen() may read past the buffer! */
/* CORRECT: Use the explicit read count, and skip errors (-1) and EOF (0) */
LONG actual = Read(fh, buf, 256);
if (actual > 0)
    Translate((STRPTR)buf, actual, out, 1024);
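When a buffer may or may not contain a terminator, the length handed to Translate() can be bounded by the number of bytes actually filled. The following portable C sketch shows one way to do that; `safe_input_len` is a hypothetical helper name, and `filled` stands for whatever count the read call reported.

```c
#include <stddef.h>
#include <string.h>

/* Length to pass as inputLen for a buffer holding `filled` valid bytes.
 * Stops at an embedded terminator if there is one, and never looks
 * past `filled`. Error/EOF counts (<= 0) yield 0. */
static long safe_input_len(const char *buf, long filled)
{
    const char *nul;

    if (buf == NULL || filled <= 0)
        return 0;
    nul = memchr(buf, '\0', (size_t)filled);
    return nul ? (long)(nul - buf) : filled;
}
```

Unlike strlen(), memchr() is given an explicit bound, so a completely full buffer with no terminator is handled safely.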
4. Not Checking for Missing Disk-Based Library
Unlike ROM libraries, translator.library may not be present:
/* BAD: Assumes library is always available */
struct Library *TranslatorBase = OpenLibrary("translator.library", 0);
Translate("Hello", 5, buf, 512); /* crash if TranslatorBase == NULL! */
/* CORRECT: Always check the return */
struct Library *TranslatorBase = OpenLibrary("translator.library", 0);
if (TranslatorBase)
{
Translate("Hello", 5, buf, 512);
CloseLibrary(TranslatorBase);
}
else
{
Printf("Speech not available — translator.library missing\n");
}
Named Antipatterns
"The Mumbler" — Unrealistic Rate/Pitch
Setting rate extremely high makes speech unintelligible, but the translator itself has nothing to do with it — the problem is feeding valid phonetics to a misconfigured narrator:
/* BAD: Chipmunk speech */
voiceIO->rate = 400; /* 400 words/min — unintelligible */
voiceIO->pitch = 255; /* extremely high pitch */
/* Sensible defaults: */
voiceIO->rate = 150; /* natural conversational speed */
voiceIO->pitch = 110; /* male baseline (85–110 for male, 160–220 for female) */
voiceIO->sex = 0; /* 0=male, 1=female */
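A small guard can keep user-supplied settings inside the range the device accepts before they are copied into the request. The limits below are assumptions, intended to mirror the MINRATE/MAXRATE and MINPITCH/MAXPITCH constants of devices/narrator.h as commonly documented (40–400 words per minute, 65–320 Hz); check them against your NDK. `sane_rate` and `sane_pitch` are hypothetical helper names, and the code is plain portable C.

```c
/* Assumed limits; verify against devices/narrator.h in your NDK. */
#define SKETCH_MIN_RATE   40    /* words per minute */
#define SKETCH_MAX_RATE   400
#define SKETCH_MIN_PITCH  65    /* Hz */
#define SKETCH_MAX_PITCH  320

static long clamp_long(long v, long lo, long hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Clamp a requested speaking rate to the assumed device range. */
static unsigned short sane_rate(long wpm)
{
    return (unsigned short)clamp_long(wpm, SKETCH_MIN_RATE, SKETCH_MAX_RATE);
}

/* Clamp a requested baseline pitch to the assumed device range. */
static unsigned short sane_pitch(long hz)
{
    return (unsigned short)clamp_long(hz, SKETCH_MIN_PITCH, SKETCH_MAX_PITCH);
}
```

Clamping only prevents out-of-range values; a rate of 400 is still accepted and still hard to follow, so an application may want a narrower band of its own.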
"The Silent Speaker" — Mismatched Audio Allocation
The narrator device must allocate audio channels before it can speak. If ch_masks does not name usable channel combinations, or another application holds all four channels, OpenDevice("narrator.device", ...) can still succeed while the subsequent CMD_WRITE fails or produces no audible speech:
/* BAD: No channel masks set, no check of the write result */
OpenDevice("narrator.device", 0, (struct IORequest *)voiceIO, 0);
/* Speech may be silent if no channels can be allocated */
/* CORRECT: Offer the four stereo pairs; the Amiga has only channels 0-3 */
UBYTE chanMasks[] = { 3, 5, 10, 12 }; /* channels 0+1, 0+2, 1+3, 2+3: one left, one right each */
voiceIO->ch_masks = chanMasks;
voiceIO->nm_masks = sizeof(chanMasks);
/* After DoIO(), check voiceIO->message.io_Error for allocation failures */
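Channel masks are bit fields over the Amiga's four hardware audio channels (bit n = channel n), where channels 0 and 3 feed the left output and channels 1 and 2 the right. The portable C sketch below checks whether a mask is usable on that basis; the helper names are hypothetical and exist only for illustration.

```c
/* Bit n set = hardware audio channel n (only channels 0-3 exist). */
#define LEFT_BITS   0x09   /* channels 0 and 3 -> left output  */
#define RIGHT_BITS  0x06   /* channels 1 and 2 -> right output */

/* A mask is usable if it names at least one channel and none above 3. */
static int is_valid_mask(unsigned char m)
{
    return m != 0 && (m & 0xF0) == 0;
}

/* A stereo pair names at least one left and one right channel. */
static int is_stereo_pair(unsigned char m)
{
    return is_valid_mask(m) && (m & LEFT_BITS) != 0 && (m & RIGHT_BITS) != 0;
}
```

Under this check the conventional masks 3, 5, 10 and 12 all qualify as stereo pairs, while values such as 0x30 or 0xC0 are rejected because they refer to channels that do not exist.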
"The Echo" — Forgetting io_Data Lifetime
When you send a CMD_WRITE to the narrator device, the io_Data pointer must remain valid until the I/O completes. Using a stack buffer with DoIO() is fine (blocking); using SendIO() (asynchronous) with a stack buffer is not:
/* BAD: Stack buffer with async I/O */
void SpeakAsync(STRPTR text)
{
    UBYTE phonBuf[512]; /* stack: disappears on return! */
    Translate(text, strlen(text), (STRPTR)phonBuf, 512);
    voiceIO->message.io_Command = CMD_WRITE;
    voiceIO->message.io_Data = phonBuf;
    voiceIO->message.io_Length = strlen((STRPTR)phonBuf);
    SendIO((struct IORequest *)voiceIO); /* async: phonBuf is gone once SpeakAsync() returns */
}
/* CORRECT: Allocate or use a static buffer for async */
static UBYTE phonBuf[2048]; /* static: stays valid */
void SpeakAsync(STRPTR text)
{
    Translate(text, strlen(text), (STRPTR)phonBuf, sizeof(phonBuf));
    voiceIO->message.io_Command = CMD_WRITE;
    voiceIO->message.io_Data = phonBuf;
    voiceIO->message.io_Length = strlen((STRPTR)phonBuf);
    SendIO((struct IORequest *)voiceIO);
    /* Do not modify phonBuf or reuse voiceIO until CheckIO()/WaitIO() reports completion */
}
FAQ
Q: Can I use translator.library without narrator.device?
Yes. The phonetic output is a plain ASCII string — you can save it, analyze it, send it over a network, or use it as input to a custom speech synthesizer. The translator and narrator are independent.
Q: Why does the same word sometimes translate differently?
The translator uses context-sensitive rules. The pronunciation of "read" depends on surrounding tense markers; "record" as a noun vs. verb gets different stress. The same word in different sentences may produce different phonetics — this is correct behavior.
Q: How do I make the narrator sound female?
Set voiceIO->sex = 1 (female). This adjusts formant frequencies and baseline pitch. For manual fine-tuning, adjust voiceIO->pitch (160–220 Hz for female) and voiceIO->F1adj through F3adj (formant shifts).
Q: Can translator.library handle multiple languages?
No. The rule set and exception dictionary are English-only. German, French, or other languages will be treated as misspelled English and produce garbled phonetics. Use locale-specific TTS solutions for non-English speech.
Q: How big does the output buffer really need to be?
Empirically, 4× the input length plus a 512-byte safety margin is enough. A typical sentence expands 2.5–3×, and even a very long single word such as "supercalifragilisticexpialidocious" (34 characters) produces only roughly 80 characters of phonetics.
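If the input length comes from an untrusted source, the sizing rule can be wrapped so that the multiplication cannot overflow. This is a portable C sketch of the 4× + 512 rule from the answer above; `phon_buf_size` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

/* Output buffer size for `inputLen` characters of English text, using the
 * empirical 4x + 512 rule. Returns 0 if the size would overflow size_t. */
static size_t phon_buf_size(size_t inputLen)
{
    if (inputLen > (SIZE_MAX - 512) / 4)
        return 0;
    return inputLen * 4 + 512;
}
```

A return of 0 signals that the text should be translated in chunks instead of in one call.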
Q: Does Translate() handle punctuation?
Yes. Punctuation marks (., ,, ?, !, ;, :) are passed through to the phonetic output. The narrator device interprets them as prosody cues: . = falling intonation, ? = rising intonation.
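Because punctuation survives translation, a program that post-processes phonetic strings can inspect the final mark to anticipate the intonation cue described above. The portable C sketch below does only that; the enum and `final_intonation` are hypothetical names for illustration, and other punctuation marks are simply reported as "none".

```c
#include <stddef.h>
#include <string.h>

enum intonation { INTONATION_NONE, INTONATION_FALLING, INTONATION_RISING };

/* Classify a phonetic string by its last non-space character:
 * '.' is treated as falling intonation, '?' as rising. */
static enum intonation final_intonation(const char *phon)
{
    size_t n = strlen(phon);

    while (n > 0 && phon[n - 1] == ' ')   /* ignore trailing spaces */
        n--;
    if (n == 0)
        return INTONATION_NONE;

    switch (phon[n - 1]) {
    case '.': return INTONATION_FALLING;
    case '?': return INTONATION_RISING;
    default:  return INTONATION_NONE;
    }
}
```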
Use-Case Cookbook
1. Simple One-Shot Speech
The blocking pattern — suitable for alert messages, game notifications, short announcements:
void Say(STRPTR english)
{
    static UBYTE chanMasks[] = { 3, 5, 10, 12 }; /* stereo pairs on channels 0-3 */
    UBYTE phonBuf[2048];
    struct MsgPort *mp;
    struct narrator_rb *vio;
    struct Library *TranslatorBase = OpenLibrary("translator.library", 0);
    if (!TranslatorBase) return;
    if (Translate(english, strlen(english), (STRPTR)phonBuf, sizeof(phonBuf)) == 0
        && (mp = CreatePort(NULL, 0)) != NULL)
    {
        vio = (struct narrator_rb *)CreateExtIO(mp, sizeof(struct narrator_rb));
        if (vio)
        {
            if (OpenDevice("narrator.device", 0, (struct IORequest *)vio, 0) == 0)
            {
                vio->ch_masks = chanMasks; /* must be set before CMD_WRITE */
                vio->nm_masks = sizeof(chanMasks);
                vio->rate = 150;
                vio->pitch = 110;
                vio->volume = 64;
                vio->sampfreq = 22200;
                vio->message.io_Command = CMD_WRITE;
                vio->message.io_Data = phonBuf;
                vio->message.io_Length = strlen((STRPTR)phonBuf);
                DoIO((struct IORequest *)vio); /* blocks until speech finishes */
                CloseDevice((struct IORequest *)vio);
            }
            DeleteExtIO((struct IORequest *)vio);
        }
        DeletePort(mp);
    }
    CloseLibrary(TranslatorBase);
}
/* Usage: */
Say("Game over. Insert coin to continue.");
2. Animated Talking Head (with Mouth Shapes)
The narrator can generate mouth width/height data while speaking:
/* Two I/O requests: one writes speech, one reads mouth shapes */
struct narrator_rb *voiceIO = /* ... */;
struct mouth_rb *mouthIO = (struct mouth_rb *)
    CreateExtIO(mp, sizeof(struct mouth_rb));
/* Enable mouth shape generation */
voiceIO->mouths = 1; /* non-zero = generate mouth data */
/* The read request shares the device and unit of the opened write request */
mouthIO->voice.message.io_Device = voiceIO->message.io_Device;
mouthIO->voice.message.io_Unit = voiceIO->message.io_Unit;
mouthIO->voice.message.io_Command = CMD_READ;
mouthIO->voice.message.io_Error = 0;
mouthIO->width = 0;
mouthIO->height = 0;
/* Send speech command asynchronously */
voiceIO->message.io_Command = CMD_WRITE;
voiceIO->message.io_Data = phonBuf;
voiceIO->message.io_Length = strlen((STRPTR)phonBuf);
SendIO((struct IORequest *)voiceIO);
/* Each CMD_READ returns when the mouth shape changes; ND_NoWrite means speech has finished */
for (DoIO((struct IORequest *)mouthIO);
     mouthIO->voice.message.io_Error != ND_NoWrite;
     DoIO((struct IORequest *)mouthIO))
{
    /* width and height are UBYTE fields describing the current mouth opening */
    AnimateMouth(mouthIO->width, mouthIO->height); /* application-supplied drawing routine */
}
WaitIO((struct IORequest *)voiceIO); /* collect the completed write request */
3. Progressive Translation of Long Text
For documents or long-form text where a single 2 KB buffer won't suffice:
LONG TranslateLongText(STRPTR text, LONG totalLen, BPTR outputFH)
{
UBYTE phonBuf[2048];
LONG offset = 0;
while (offset < totalLen)
{
LONG bytesAvail = totalLen - offset;
LONG rtn = Translate(text + offset, bytesAvail,
(STRPTR)phonBuf, sizeof(phonBuf));
if (rtn == 0)
{
/* Final chunk — write and done */
LONG phonLen = strlen((STRPTR)phonBuf);
Write(outputFH, phonBuf, phonLen);
break;
}
else if (rtn < 0)
{
/* Write completed portion, resume at word boundary */
LONG phonLen = strlen((STRPTR)phonBuf);
Write(outputFH, phonBuf, phonLen);
offset += (-rtn);
}
else
{
/* unexpected error */
return rtn;
}
}
return 0;
}
Modern Analogies
| Amiga Concept | Modern Equivalent | Why It Maps | Where It Diverges |
|---|---|---|---|
| translator.library | macOS NSSpeechSynthesizer / Windows SAPI Text-to-Speech | Both accept English text on the way to speech. The API philosophy (text in, audio out) is the same | Modern APIs bundle translation and synthesis; Amiga splits them into library (translate) and device (speak) |
| ARPABET phonemes | IPA (International Phonetic Alphabet) | Both encode pronunciation as discrete symbols. ARPABET is an ASCII encoding of the phonemes of American English | ARPABET is English-only; IPA is universal. ARPABET uses ASCII, IPA uses Unicode |
| 450 context-sensitive rules | Modern TTS neural networks (Tacotron, FastSpeech) | Both map spelling to pronunciation; the rules are a 1985 hand-crafted "model" rather than one learned from data | Neural TTS requires gigabytes of training data; rule-based works with zero training |
| narrator.device formant synthesis | Vocaloid / singing synthesis | Both use formant models (F0, F1, F2...) to generate vocal sounds | Narrator.device is a 1985-era 8-bit formant synth; Vocaloid uses concatenative sampling + ML |
| Say command / speak: handler | say command on macOS / espeak on Linux | Both provide command-line text-to-speech | Amiga Say feeds translator.library → narrator.device; macOS say uses a system-wide speech server |
References
- ADCD 2.1: ROM Kernel Reference Manual: Libraries — Chapter 36: Translator Library
- ADCD 2.1: ROM Kernel Reference Manual: Devices — Chapter 8: Narrator Device
- NDK 3.9: devices/narrator.h (narrator_rb and mouth_rb structures)
- NDK 3.9: clib/translator_protos.h (Translate() prototype)
- AmigaOS Documentation Wiki: Narrator Device (complete phoneme table and phonetic writing guide)
- AmigaOS Documentation Wiki: Translator Library — OS 4.x interface reference
- See also: audio.md — audio.device DMA channel allocation used by narrator
- See also: iffparse.md — IFF FTXT parsing (the AmigaGuide format sometimes wraps speech metadata in IFF chunks)