Understanding phonetic symbols
All languages and voices of the IBM Watson® Text to Speech service support both the standard International Phonetic Alphabet (IPA) and IBM Symbolic Phonetic Representation (SPR) notations to represent the sounds of words. Both notations provide phonetic encoding that represents the pronunciation of a word, the sounds that make up the word, how the sounds are divided into syllables, and which syllables are stressed. Phonetic symbols for supported languages provides links to topics that document the phonetic symbols for each language.
Defining a word pronunciation
To define the phonetic pronunciation for a word, either within input text or for a custom model, you use the <phoneme> element of the Speech Synthesis Markup Language (SSML) or equivalent method parameters. The <phoneme> element has two attributes:
- The alphabet attribute specifies the notation of the pronunciation. Use the value ibm to indicate that the pronunciation is defined in SPR. Use the value ipa to indicate that the pronunciation is defined in IPA.
- The ph attribute defines the pronunciation. It consists of a sequence of allowable symbols for a given language. The symbols define how the word that is enclosed in the <phoneme> element is to be pronounced.
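As an illustration of how the two attributes fit together, the following Python sketch builds a <phoneme> element as a string. The helper name phoneme is hypothetical and is not part of the service or of any IBM SDK. The sketch checks only that the alphabet value is one of the two described above; it does not check the symbols in ph.

```python
from xml.sax.saxutils import escape, quoteattr

# The two alphabet values described above.
ALLOWED_ALPHABETS = {"ibm", "ipa"}


def phoneme(word, ph, alphabet="ipa"):
    """Return an SSML <phoneme> element for `word` (hypothetical helper)."""
    if alphabet not in ALLOWED_ALPHABETS:
        raise ValueError(f"alphabet must be one of {sorted(ALLOWED_ALPHABETS)}")
    # quoteattr adds the surrounding quotes and escapes XML special characters.
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )


print(phoneme("through", ".1Tru", alphabet="ibm"))
```

Building the element through an escaping function instead of plain string concatenation keeps the markup well formed when the word or the pronunciation contains characters such as & or quotation marks.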
Follow these rules when you define a pronunciation:
- Use only the documented SPR or IPA symbols. The service considers invalid any definition that contains phonetic symbols that are not allowed in a language. An SPR or IPA entry that does not conform to the required specification is invalid.
- When multiple IPA symbols (or symbol combinations) are documented for an SPR symbol, all of the IPA symbols are equivalent to the single SPR symbol. The service treats all of these IPA symbols the same and does not realize the subtle or regional differences that IPA is meant to describe.
For more information, see
Working with IBM SPR
IBM SPR is an alternative representation to standard IPA. The following examples of valid SPR notations define the words through and shocking in US English:
<phoneme alphabet="ibm" ph=".1Tru">through</phoneme>
<phoneme alphabet="ibm" ph=".1Sa.0kIG">shocking</phoneme>
In the definitions, the letters represent specific sounds of US English speech. A period (.) signals the beginning of a new syllable, and the digits 1 and 0 indicate syllable stress. For more information, see Specifying syllables.
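To make the structure of these strings concrete, the following Python sketch splits an SPR pronunciation into syllables and separates the stress digit from the sounds. It is an illustration only, not the service's parser. The function name spr_syllables is hypothetical, and the sketch assumes that the digits 0, 1, and 2 appear in SPR only as stress markers.

```python
STRESS_DIGITS = "012"  # assumption: these digits occur only as stress markers


def spr_syllables(ph):
    """Split an SPR string into (stress, sounds) pairs, one per syllable."""
    syllables = []
    for chunk in ph.split("."):
        if not chunk:
            continue  # a leading period produces an empty first chunk
        # The first stress digit in the chunk, or None if the syllable has none.
        stress = next((int(c) for c in chunk if c in STRESS_DIGITS), None)
        sounds = "".join(c for c in chunk if c not in STRESS_DIGITS)
        syllables.append((stress, sounds))
    return syllables


print(spr_syllables(".1Sa.0kIG"))
```

For the word shocking, the sketch yields one pair for the stressed syllable Sa and one for the unstressed syllable kIG.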
Speech sound symbols
Each language uses its own inventory of SPR symbols to represent the speech sounds of that language. The following rules apply to specifying an SPR symbol:
- Letters are case-sensitive, so e and E, for example, represent two different sounds.
- Two- and three-character symbols must be enclosed in single quotes when indicated in the symbol tables. The single quotes indicate that the multiple characters are actually a single symbol. For example, the symbol 'aj' in the German word heim is specified as "h'aj'm".
- Some three-character symbols include single quotes around only two of the characters. The single quotes indicate that the two characters are a single symbol. So the SPR consists of two symbols. For example, the symbol 'a:'n in the Netherlands Dutch word dependances contains two symbols, 'a:' and n, and is specified as d'e:'.pEn.1d'a:'n.s@s.
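The quoting rules can be sketched as a small tokenizer. The following Python example is an illustration under stated assumptions, not the service's implementation: the name spr_symbols is hypothetical, periods and digits are treated as syllable and stress markers and are skipped, and an unbalanced single quote is silently ignored instead of being reported as an error.

```python
import re

# A quoted run such as 'aj' or 'a:' is one symbol. Any other character that is
# not a quote, a period, or a digit is a symbol on its own.
SPR_TOKEN = re.compile(r"'[^']+'|[^'.\d]")


def spr_symbols(ph):
    """Return the sound symbols of an SPR string, in order."""
    return SPR_TOKEN.findall(ph)


print(spr_symbols("h'aj'm"))
print(spr_symbols("d'e:'.pEn.1d'a:'n.s@s"))
```

Because the pattern is case-sensitive, e and E remain distinct symbols, and the sequence 'a:'n is returned as the two symbols 'a:' and n.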
Also consider the following when defining a word's pronunciation in SPR format:
- The sounds of every language have specific distributional patterns within that language. For example, in all dialects of English, the sound G in sing (".1sIG") does not occur at the beginning of a word. Other US English sounds that have a particularly narrow distribution are the glottal stop (?), the flap (F), and the syllabic nasal (N). If you enter a sound symbol in a context in which it does not normally occur, the resulting speech might sound unnatural.
- The service applies a sophisticated set of linguistic rules to its input to reflect the processes by which sounds change in specific contexts in natural language. For example, in US English, the sound t of the word write (".1rYt") is pronounced as a flap (F) in writer (".1rY.0FR"). SPR input undergoes these modifications just as ordinary input text does. In this example, whether you enter ".1rY.0tR" or ".1rY.0FR" does not affect the speech that is generated.
Working with IPA
You can define IPA pronunciations by using phonetic symbols or Unicode values. IPA is an industry standard notation. The following are examples of valid IPA notations for the word tomato in phonetic symbols and Unicode:
<phoneme alphabet="ipa" ph="təˈmeɪ.ɾoʊ">tomato</phoneme>
<phoneme alphabet="ipa" ph="t&#x0259;&#x02C8;me&#x026A;.&#x027E;o&#x028A;">tomato</phoneme>
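The Unicode form of a pronunciation can be derived mechanically from the symbol form. The following Python sketch replaces each non-ASCII character with a hexadecimal character reference. The function name to_char_refs is hypothetical, and the &#xNNNN; spelling is the standard XML character reference syntax, assumed here to be the form in which Unicode values are written in the ph attribute.

```python
def to_char_refs(ipa):
    """Replace each non-ASCII character with an XML hexadecimal character reference."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in ipa
    )


symbols = "təˈmeɪ.ɾoʊ"
print(f'<phoneme alphabet="ipa" ph="{to_char_refs(symbols)}">tomato</phoneme>')
```

ASCII characters such as t, m, e, o, and the period are left unchanged, so only the IPA-specific symbols and the stress mark are rewritten.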
Specifying syllables
You can specify syllable boundaries and stress in both SPR and IPA.
Syllable boundaries
You can use a . (period, IPA Unicode 002E) to mark the beginning of each syllable in SPR or IPA. However, to preserve the valid phonetics of a language, the service can elect not to honor periods in some cases (for example, if a syllable boundary is placed at an illegal or unnatural position for a language). In general, in cases where you can indicate a valid preference for a syllable boundary or other aspect of a word's pronunciation, the service honors such requests.
Syllable stress
Table 1 identifies the symbols that you can use to indicate syllable stress for a pronunciation. IBM recommends that you indicate primary stress for pronunciations in either SPR or IPA. However, indicating syllable stress is optional for both formats; the service determines where stress occurs if you do not indicate it.
Stress | SPR symbol | IPA symbol | IPA Unicode |
---|---|---|---|
Primary stress | 1 | ˈ | 02C8 |
Secondary stress | 2 | ˌ | 02CC |
No stress | 0 | No symbol | No value |
You must place a syllable stress marker within a syllable boundary but always to the left of the syllable's vowel. You can place a marker anywhere to the left of the stressed vowel. For example, each of the following SPR examples places the primary stress (1) on the correct vowel of the word construction:
<phoneme alphabet="ibm" ph="kXn1strHkSXn">construction</phoneme>
<phoneme alphabet="ibm" ph="kXns1trHkSXn">construction</phoneme>
<phoneme alphabet="ibm" ph="kXnst1rHkSXn">construction</phoneme>
<phoneme alphabet="ibm" ph="kXnstr1HkSXn">construction</phoneme>
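The equivalence of these four placements can be checked with a short Python sketch that finds the first vowel to the right of the stress marker. This is an illustration only. The function name stressed_vowel is hypothetical, and the vowel set contains just the two vowel symbols that occur in this one word (X and H), which is an assumption made for the example and not the US English vowel inventory. The sketch also does not check syllable boundaries, because the examples contain none.

```python
# Assumption: the only vowel symbols that occur in this example word.
VOWELS = {"X", "H"}


def stressed_vowel(ph, vowels=VOWELS):
    """Return (index, symbol) of the vowel that the primary-stress marker applies to.

    The index refers to the pronunciation with the marker removed.
    """
    marker = ph.index("1")          # position of the primary-stress marker
    bare = ph.replace("1", "", 1)   # pronunciation without the marker
    for i in range(marker, len(bare)):
        if bare[i] in vowels:
            return i, bare[i]
    raise ValueError("no vowel follows the stress marker")


variants = ["kXn1strHkSXn", "kXns1trHkSXn", "kXnst1rHkSXn", "kXnstr1HkSXn"]
print({stressed_vowel(v) for v in variants})
```

All four variants resolve to the same vowel at the same position, so the printed set contains a single entry.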
Language-specific rules for using syllable stress
Table 2 lists language-specific considerations that apply to specifying syllable stress. Unless the table qualifies the rules for a language, you can use the syllable stress symbols described in the previous section.
Language | Notation | Language-specific rules |
---|---|---|
French and Canadian French | SPR | All syllable stress symbols are honored. But syllable stress must immediately precede the vowel of the syllable. Syllable stress for French is much stricter than for other languages. An error occurs if you place the stress symbol in an invalid location. |
French and Canadian French | IPA | All syllable stress symbols are ignored. |
Italian | SPR and IPA | You can specify only 1 (primary stress). An error occurs if you specify secondary or no stress. |
Japanese | SPR and IPA | You can specify only 1 (primary stress) and 0 (no stress). An error occurs if you specify secondary stress. |
Spanish | SPR and IPA | You can specify only 1 (primary stress). An error occurs if you specify secondary or no stress. |
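
The restrictions in Table 2 on which stress symbols a language accepts can be written as a lookup, which makes it possible to check an SPR pronunciation before it is sent to the service. The following Python sketch covers only the SPR digits for Italian, Japanese, and Spanish. It does not model the French placement rule. The function name disallowed_stress is hypothetical, the sketch assumes that the digits 0, 1, and 2 occur in SPR only as stress markers, and the sample strings are made-up placeholders, not real pronunciations.

```python
ALL_STRESS = {"0", "1", "2"}

# Allowed SPR stress digits, transcribed from Table 2. Languages that are not
# listed accept all three digits.
ALLOWED_STRESS = {
    "Italian": {"1"},
    "Japanese": {"0", "1"},
    "Spanish": {"1"},
}


def disallowed_stress(language, spr):
    """Return the stress digits in `spr` that Table 2 does not allow for `language`."""
    allowed = ALLOWED_STRESS.get(language, ALL_STRESS)
    used = {c for c in spr if c in ALL_STRESS}
    return sorted(used - allowed)


sample = ".1ka.0sa"  # placeholder string, not a documented pronunciation
print(disallowed_stress("Italian", sample))
print(disallowed_stress("Japanese", sample))
```

An empty result means that the pronunciation uses only stress symbols that the table allows for that language. A non-empty result lists the digits that the table says cause an error.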
Phonetic symbols for supported languages
Table 3 lists the languages that the service supports and provides links to topics that describe their SPR symbols, IPA symbols, and IPA Unicode values. The topics provide examples of each symbol in words from the language. Because of dialectal differences, the examples might not always match your pronunciation.
The Availability column indicates whether each voice is available for IBM Cloud, IBM Cloud Pak for Data, or both (All versions). For more information about the supported voices, see Languages and voices.
Language | Availability |
---|---|
Dutch (Netherlands) symbols | All versions |
English (Australian) symbols | All versions |
English (United Kingdom) symbols | All versions |
English (United States) symbols | All versions |
French (Canadian) symbols | All versions |
French (France) symbols | All versions |
German symbols | All versions |
Italian symbols | All versions |
Japanese symbols | All versions |
Korean symbols | All versions |
Portuguese (Brazilian) symbols | All versions |
Spanish symbols | All versions |