Background

Sometimes a word is not pronounced the way you expect. This happens most often with proper nouns (e.g. “Weber” can be pronounced “web-er” or “wee-ber”) and homographs (e.g. “lead” can be pronounced “led” or “leed”).

Pronunciation tags

When this happens, you can use pronunciation tags to override the default pronunciation. These tags are inserted into your input text and tell the API how to pronounce a word.

The tag format is [word : arpabet], where word is the word you want pronounced differently and arpabet is a space-separated list of ARPABET phonetic symbols. Tags are inserted inline with your input text.

Example: The [quick: K W IH1 K] brown fox jumps over the lazy dog.
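
If you generate input text programmatically, a small helper can build the tag for you. The following is a minimal Python sketch, not part of the API: the function names make_tag and tag_first are hypothetical, and the spacing inside the tag follows the example above.

```python
import re

def make_tag(word, phonemes):
    """Build a tag such as '[quick: K W IH1 K]' from a word and its phonemes."""
    return f"[{word}: {' '.join(phonemes)}]"

def tag_first(text, word, phonemes):
    """Replace the first whole-word occurrence of `word` in `text` with its tag."""
    pattern = r"\b" + re.escape(word) + r"\b"
    # A function replacement keeps the tag text from being read as a regex template.
    return re.sub(pattern, lambda m: make_tag(m.group(0), phonemes), text, count=1)

tagged = tag_first(
    "The quick brown fox jumps over the lazy dog.",
    "quick",
    ["K", "W", "IH1", "K"],
)
print(tagged)  # The [quick: K W IH1 K] brown fox jumps over the lazy dog.
```

Only the first match is tagged here; whether you tag every occurrence or just one is up to you.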

What is ARPABET?

ARPABET is a set of symbols that correspond to phonemes in English. It’s like a “phonetic alphabet” that’s used to indicate how a word is pronounced.

We support a simplified version of ARPABET as shown below. We also use stress markers (a number after each vowel) to indicate which syllable is emphasized. For example, EMphasis → EH1 M F AH0 S IH0 S, whereas emphaSIS → EH2 M F AH0 S IH1 S.

Vowels: AA AE AH AO AW AY EH ER EY IH IY OW OY UH UW

Consonants: B CH D DH F G HH JH K L M N NG P R S SH T TH V W Y Z ZH

Every vowel must be followed by a number that indicates its level of stress:

number  meaning
0       not stressed
1       primary stress
2       secondary stress

e.g. AA0, AA1, AA2

Take care to include only one primary-stress vowel in any given pronunciation tag to avoid inconsistent results.
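
These rules are easy to check before you send a request. The following is a minimal Python sketch of such a check, assuming the symbol lists above are complete. The function name check_pronunciation is hypothetical, and the API itself may validate tags differently.

```python
VOWELS = set("AA AE AH AO AW AY EH ER EY IH IY OW OY UH UW".split())
CONSONANTS = set(
    "B CH D DH F G HH JH K L M N NG P R S SH T TH V W Y Z ZH".split()
)

def check_pronunciation(arpabet):
    """Return a list of problems found in a space-separated ARPABET string."""
    problems = []
    primary = 0
    for symbol in arpabet.split():
        if symbol in CONSONANTS:
            continue
        base, stress = symbol[:-1], symbol[-1:]
        if base in VOWELS and stress in ("0", "1", "2"):
            primary += stress == "1"  # count primary-stress vowels
        elif symbol in VOWELS:
            problems.append(f"{symbol}: vowel is missing its stress number")
        else:
            problems.append(f"{symbol}: not a supported symbol")
    if primary > 1:
        problems.append(f"more than one primary stress ({primary} found)")
    return problems

print(check_pronunciation("EH1 M F AH0 S IH0 S"))  # []
print(check_pronunciation("EH1 M F AH0 S IH1 S"))
# ['more than one primary stress (2 found)']
```

An empty list means the string passed every check in this sketch. It does not mean the pronunciation is the one you intended.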

Example pronunciations:

  • Quack → K W AE1 K
  • Spider → S P AY1 D ER0
  • Mango → M AE1 NG G OW0