Using a custom prompt for speech synthesis
The Tune by Example feature is beta functionality that is supported only for US English custom models and voices.
To use a custom prompt in a speech synthesis request, include the <ibm:prompt> element as the text of the request. This element is an IBM-specific extension to SSML. The element has one attribute, id, which is a string that identifies a predefined prompt:
<ibm:prompt id="{prompt_id}"/>
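As a minimal sketch, the element can be assembled from a prompt ID in JavaScript. The helper name promptElement is hypothetical and is not part of the service or any SDK. The escaping is purely defensive; whether the service permits such characters in a prompt ID is not covered here.

```javascript
// Hypothetical helper (not part of the service API): builds the
// <ibm:prompt> element for a given prompt ID.
function promptElement(promptId) {
  // Defensive escaping of characters that would break a
  // double-quoted XML attribute value.
  var escaped = String(promptId)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/"/g, '&quot;');
  return '<ibm:prompt id="' + escaped + '"/>';
}

console.log(promptElement('goodbye'));
// prints: <ibm:prompt id="goodbye"/>
```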
In addition to the Rules for creating custom prompts, the following restrictions apply to the use of a prompt in a speech synthesis request:
- Only a single prompt can be used in a synthesis request. You cannot include two prompts in the same request.
- A prompt must be the only thing that appears in a synthesis request. You cannot include additional text with the prompt.
- A prompt can include only fixed text, not variable data that can change for different uses of the prompt. For example, "Your account balance is $500" contains variable data: "$500." The account balance is variable data that changes depending on a specific user's account. The prompt needs to speak "Your account balance is," and a second synthesis request needs to say the balance.
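The last restriction can be illustrated with a small sketch that splits the account-balance message into two request bodies: one that contains only the prompt and one that contains only the variable text. The function name buildBalanceRequests and the prompt ID balance_intro are hypothetical. Each returned object is meant to be sent as the JSON body of its own synthesis request.

```javascript
// Hypothetical helper: returns the JSON bodies for two separate synthesis
// requests, because a prompt cannot share a request with other text.
function buildBalanceRequests(promptId, balanceText) {
  return [
    // Request 1: the prompt alone, which speaks the fixed text.
    { text: '<ibm:prompt id="' + promptId + '"/>' },
    // Request 2: the variable data, sent as ordinary text.
    { text: balanceText }
  ];
}

// 'balance_intro' is an assumed ID for a prompt that says
// "Your account balance is".
var bodies = buildBalanceRequests('balance_intro', '$500');
bodies.forEach(function (body) {
  console.log(JSON.stringify(body));
});
```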
Examples of using a custom prompt
The following examples show speech synthesis requests for the goodbye prompt that was created in Creating a custom prompt. The examples use the en-US_AllisonV3Voice voice to speak the prompt and accept audio that is in the audio/ogg;codecs=opus format. You could use these same calls to evaluate the prompt before using it in a production application.
For the customization_id, substitute the GUID of the custom model that contains the prompt. Note that the speaker ID is specified only when you create a prompt, not when you include the prompt in a synthesis request.
The following examples use the HTTP interface to synthesize the prompt:
- This example calls the HTTP POST /v1/synthesize method to synthesize the prompt:
  IBM Cloud
  curl -X POST -u "apikey:{apikey}" \
    --header "Content-Type: application/json" \
    --header "Accept: audio/ogg;codecs=opus" \
    --data "{\"text\":\"<ibm:prompt id='goodbye'/>\"}" \
    "{url}/v1/synthesize?customization_id={customization_id}&voice=en-US_AllisonV3Voice"
  IBM Cloud Pak for Data
  curl -X POST \
    --header "Authorization: Bearer {token}" \
    --header "Content-Type: application/json" \
    --header "Accept: audio/ogg;codecs=opus" \
    --data "{\"text\":\"<ibm:prompt id='goodbye'/>\"}" \
    "{url}/v1/synthesize?customization_id={customization_id}&voice=en-US_AllisonV3Voice"
- This example calls the HTTP GET /v1/synthesize method to synthesize the prompt. With this method, the prompt is passed as the text query parameter and must be URL-encoded:
  IBM Cloud
  curl -X GET -u "apikey:{apikey}" \
    --header "Accept: audio/ogg;codecs=opus" \
    "{url}/v1/synthesize?customization_id={customization_id}&voice=en-US_AllisonV3Voice&text=%3Cibm%3Aprompt%20id%3D%22goodbye%22%2F%3E"
  IBM Cloud Pak for Data
  curl -X GET \
    --header "Authorization: Bearer {token}" \
    --header "Accept: audio/ogg;codecs=opus" \
    "{url}/v1/synthesize?customization_id={customization_id}&voice=en-US_AllisonV3Voice&text=%3Cibm%3Aprompt%20id%3D%22goodbye%22%2F%3E"
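The URL-encoded text value in the GET examples can be reproduced with the standard encodeURIComponent function. The following sketch shows the encoding and a hypothetical helper, buildSynthesizeQuery, that assembles the query string; the helper name is an illustration only and is not part of the service.

```javascript
// Encode the prompt element so that it is safe to pass in a query string.
var promptText = '<ibm:prompt id="goodbye"/>';
var encoded = encodeURIComponent(promptText);
console.log(encoded);
// prints: %3Cibm%3Aprompt%20id%3D%22goodbye%22%2F%3E

// Hypothetical helper: builds the query string used by the GET examples.
function buildSynthesizeQuery(customizationId, voice, text) {
  return '?customization_id=' + encodeURIComponent(customizationId)
    + '&voice=' + encodeURIComponent(voice)
    + '&text=' + encodeURIComponent(text);
}
```

Appending the result of buildSynthesizeQuery to {url}/v1/synthesize gives a request URL of the same form as in the curl commands.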
The following snippet of JavaScript code uses the WebSocket interface to synthesize the prompt:
var access_token = '{access_token}';
var wsURI = '{ws_url}/v1/synthesize'
  + '?access_token=' + access_token
  + '&customization_id={customization_id}'
  + '&voice=en-US_AllisonV3Voice';
var websocket = new WebSocket(wsURI);

// Send the prompt as the only text of the request once the connection opens.
function onOpen(evt) {
  var message = {
    text: '<ibm:prompt id="goodbye"/>',
    accept: 'audio/ogg;codecs=opus'
  };
  websocket.send(JSON.stringify(message));
}

// Register the handler so that it runs when the connection is established.
websocket.onopen = onOpen;
. . .
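The snippet leaves out the handling of the response. As a sketch only, and assuming that the service returns audio as binary messages and any status or error information as text messages, a message handler could keep the two apart as follows. The names createAudioCollector, audioParts, and textMessages are hypothetical.

```javascript
// Hypothetical helper: collects WebSocket messages, keeping binary audio
// chunks apart from text messages. Assumes audio arrives as binary data.
function createAudioCollector() {
  var collector = {
    audioParts: [],   // binary chunks, in arrival order
    textMessages: [], // raw text messages, unparsed
    onMessage: function (evt) {
      if (typeof evt.data === 'string') {
        collector.textMessages.push(evt.data);
      } else {
        collector.audioParts.push(evt.data);
      }
    }
  };
  return collector;
}

// Wiring, given the websocket variable from the snippet above:
//   var collector = createAudioCollector();
//   websocket.onmessage = collector.onMessage;
```

Keeping the chunks in arrival order lets the application concatenate them into a single audio stream after the connection closes.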