Introduction

The IBM Watson™ Speech to Text: Customer Care service provides APIs that use IBM's speech-recognition capabilities to produce transcripts of spoken audio. The Speech to Text: Customer Care service is an on-premises solution that is available only on IBM Cloud Private.

The service can transcribe speech from various languages and audio formats. In addition to basic transcription, the service can produce detailed information about many different aspects of the audio. For all languages, the service supports two sampling rates, broadband (16 kHz) and narrowband (8 kHz). It returns all JSON response content in the UTF-8 character set.

For speech recognition, the service supports synchronous and asynchronous HTTP Representational State Transfer (REST) interfaces. It also supports a WebSocket interface that provides a full-duplex, low-latency communication channel: Clients send requests and audio to the service and receive results over a single connection asynchronously.

The service also supports a batch-processing interface for performing speech recognition on multiple audio files. Batch processing also provides speech analytics capabilities, which return results for conversations and for each individual speaker turn from a conversation. Batch processing requires the use of a Cloud Object Storage (COS) server to manage the input and output buckets in which the service reads and writes files.

The service also offers two customization interfaces. Use language model customization to expand the vocabulary of a base model with domain-specific terminology. Use acoustic model customization to adapt a base model for the acoustic characteristics of your audio. For language model customization, the service also supports grammars. A grammar is a formal language specification that lets you restrict the phrases that the service can recognize.

Language model customization is generally available for production use with all supported languages. Acoustic model customization is beta functionality that is available for all supported languages.

Java SDK version 7.0.0 only requires additional configuration to set the target. For details, see https://github.com/watson-developer-cloud/java-sdk#installation.

Beginning with version 4.0.0, the Node SDK returns a Promise for all methods when a callback is not specified.

The package location moved to ibm-watson. It remains available at watson-developer-cloud but is not updated there. Use ibm-watson to stay up to date.

The code examples on this tab use the client library that is provided for Java.

Maven

<dependency>
  <groupId>com.ibm.watson</groupId>
  <artifactId>ibm-watson</artifactId>
  <version>7.1.1</version>
</dependency>

Gradle

compile 'com.ibm.watson:ibm-watson:7.1.1'

GitHub

The code examples on this tab use the client library that is provided for Node.js.

Installation

npm install ibm-watson

GitHub

The code examples on this tab use the client library that is provided for Python.

Installation

pip install --upgrade "ibm-watson>=3.1.1"

GitHub

The code examples on this tab use the client library that is provided for Ruby.

Installation

gem install ibm_watson

GitHub

The code examples on this tab use the client library that is provided for Go.

go get -u github.com/watson-developer-cloud/go-sdk/...

GitHub

The code examples on this tab use the client library that is provided for Swift.

Cocoapods

pod 'IBMWatsonSpeechToTextV1', '~> 2.1.0'

Carthage

github "watson-developer-cloud/swift-sdk" ~> 2.1.0

Swift Package Manager

.package(url: "https://github.com/watson-developer-cloud/swift-sdk", from: "2.1.0")

GitHub

Authentication

You authenticate by providing the API key for your service instance. A Speech to Text: Customer Care cluster has a single instance of the service and a single API key.

The SDK manages the lifecycle of the tokens for the API key.

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

curl -u "apikey:{apikey}" -X {request_method} "https://{icp_cluster_host}{:port}/speech-to-text/api/v1/{method}"

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

SpeechToText speechToText = new SpeechToText();
speechToText.setUsernameAndPassword("apikey","{apikey}");
speechToText.setEndPoint("https://{icp_cluster_host}{:port}/speech-to-text/api");

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');

const speechToText = new SpeechToTextV1({
  username: 'apikey',
  password: '{apikey}',
  url: 'https://{icp_cluster_host}{:port}/speech-to-text/api',
});

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

from ibm_watson import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

require "ibm_watson"

speech_to_text = IBMWatson::SpeechToTextV1.new(
  username: "apikey",
  password: "{apikey}",
  url:"https://{icp_cluster_host}{:port}/speech-to-text/api"
)

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

import "github.com/watson-developer-cloud/go-sdk/speechtotextv1"

speechToText, speechToTextErr := speechtotextv1.NewSpeechToTextV1(&speechtotextv1.SpeechToTextV1Options{
  Username: "apikey",
  Password: "{apikey}",
  URL:      "https://{icp_cluster_host}{:port}/speech-to-text/api",
})

Replace {apikey} with the API key for your service instance. Replace {icp_cluster_host} and {port} with the name or IP address of the host on which your Speech to Text: Customer Care cluster is deployed and the port number on which the service listens. The default port is 443.

let speechToText = SpeechToText(apiKey: "{apikey}")
speechToText.serviceURL = "https://{icp_cluster_host}{:port}/speech-to-text/api"

Service endpoint

The service endpoint for Speech to Text: Customer Care is always the following base URL:

https://{icp_cluster_host}{:port}/speech-to-text/api

Replace {icp_cluster_host} with the name or IP address of the host on which your cluster is deployed. Replace {port} with the port number on which the service listens. The default port is 443.

Use that URL in your requests to Speech to Text: Customer Care.

Set the correct service URL by calling the setEndPoint() method of the service instance.

Set the correct service URL by specifying the url parameter when you create the service instance.

Set the correct service URL by specifying the url parameter when you create the service instance or by calling the set_url() method of the service instance.

Set the correct service URL by specifying the url parameter when you create the service instance or by calling the url= method of the service instance.

Set the correct service URL by specifying the URL parameter when you create the service instance or by calling the SetURL method of the service instance.

Set the correct service URL by setting the serviceURL property of the service instance.

Example

curl -u "apikey:{apikey}" -X {request_method} "https://{icp_cluster_host}{:port}/speech-to-text/api/v1/{method}"

Example

SpeechToText speechToText = new SpeechToText();
speechToText.setUsernameAndPassword("apikey","{apikey}");
speechToText.setEndPoint("https://{icp_cluster_host}{:port}/speech-to-text/api");

Example

const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');

const speechToText = new SpeechToTextV1({
  username: 'apikey',
  password: '{apikey}',
  url: 'https://{icp_cluster_host}{:port}/speech-to-text/api',
});

Examples in the constructor and after instantiation

from ibm_watson import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

or

speech_to_text.set_url('https://{icp_cluster_host}{:port}/speech-to-text/api')

Examples in the constructor and after instantiation

require "ibm_watson"

speech_to_text = IBMWatson::SpeechToTextV1.new(
  username: "apikey",
  password: "{apikey}",
  url:"https://{icp_cluster_host}{:port}/speech-to-text/api"
)

or

speech_to_text.url = "https://{icp_cluster_host}{:port}/speech-to-text/api"

Examples in the constructor and after instantiation

import "github.com/watson-developer-cloud/go-sdk/speechtotextv1"

speechToText, speechToTextErr := speechtotextv1.NewSpeechToTextV1(&speechtotextv1.SpeechToTextV1Options{
  Username: "apikey",
  Password: "{apikey}",
  URL:      "https://{icp_cluster_host}{:port}/speech-to-text/api",
})

or

speechToText.SetURL("https://{icp_cluster_host}{:port}/speech-to-text/api")

Example

let speechToText = SpeechToText(apiKey: "{apikey}")
speechToText.serviceURL = "https://{icp_cluster_host}{:port}/speech-to-text/api"

Disabling SSL verification

All Watson services use Secure Sockets Layer (SSL) (or Transport Layer Security (TLS)) for secure connections between the client and server. The connection is verified against the local certificate store to ensure authentication, integrity, and confidentiality.

If you use a self-signed certificate, you need to disable SSL verification to make a successful connection.

Enabling SSL verification is highly recommended. Disabling SSL jeopardizes the security of the connection and data. Disable SSL only if absolutely necessary, and take steps to enable SSL as soon as possible.

To disable SSL verification for a curl request, use the --insecure (-k) option with the request.

To disable SSL verification, create an HttpConfigOptions object and set the disableSslVerification property to true. Then pass the object to the service instance by using the configureClient method.

To disable SSL verification, set the disable_ssl_verification parameter to true when you create the service instance.

To disable SSL verification, call the disable_SSL_verification method on the service instance.

To disable SSL verification, call the configure_http_client method on the service instance and set the disable_ssl parameter to true.

To disable SSL verification, call the DisableSSLVerification method on the service instance.

To disable SSL verification, call the disableSSLVerification() method on the service instance. You cannot disable SSL verification on Linux.

Example that disables SSL verification

curl -k -u "apikey:{apikey}" -X {request_method} "https://{icp_cluster_host}{:port}/speech-to-text/api/v1/{method}"

Example that disables SSL verification

SpeechToText speechToText = new SpeechToText();
speechToText.setUsernameAndPassword("apikey","{apikey}");
speechToText.setEndPoint("https://{icp_cluster_host}{:port}/speech-to-text/api");

HttpConfigOptions configOptions = new HttpConfigOptions.Builder()
  .disableSslVerification(true)
  .build();
speechToText.configureClient(configOptions);

Example that disables SSL verification

const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');

const speechToText = new SpeechToTextV1({
  username: 'apikey',
  password: '{apikey}',
  url: 'https://{icp_cluster_host}{:port}/speech-to-text/api',
  disable_ssl_verification: true,
});

Example that disables SSL verification

from ibm_watson import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)
speech_to_text.disable_SSL_verification()

Example that disables SSL verification

require "ibm_watson"

speech_to_text = IBMWatson::SpeechToTextV1.new(
  username: "apikey",
  password: "{apikey}",
  url:"https://{icp_cluster_host}{:port}/speech-to-text/api"
)
speech_to_text.configure_http_client(disable_ssl: true)

Example that disables SSL verification

import "github.com/watson-developer-cloud/go-sdk/speechtotextv1"

speechToText, speechToTextErr := speechtotextv1.NewSpeechToTextV1(&speechtotextv1.SpeechToTextV1Options{
  Username: "apikey",
  Password: "{apikey}",
  URL:      "https://{icp_cluster_host}{:port}/speech-to-text/api",
})
speechToText.DisableSSLVerification()

Example that disables SSL verification


let speechToText = SpeechToText(apiKey: "{apikey}")
speechToText.disableSSLVerification()

Error handling

Speech to Text: Customer Care uses standard HTTP response codes to indicate whether a method completed successfully. HTTP response codes in the 2xx range indicate success. A response in the 4xx range is some sort of failure, and a response in the 5xx range usually indicates an internal system error that cannot be resolved by the user. Response codes are listed with the method.

ErrorResponse

Name Description
error (string) Description of the problem.
code (integer) HTTP response code.
code_description (string) Response message.
warnings (string) Warnings associated with the error.

The Java SDK generates an exception for any unsuccessful method invocation. All methods that accept an argument can also throw an IllegalArgumentException.

Exception Description
IllegalArgumentException An illegal argument was passed to the method.

When the Java SDK receives an error response from the Speech to Text: Customer Care service, it generates an exception from the com.ibm.watson.developer_cloud.service.exception package. All service exceptions contain the following fields.

Field Description
statusCode The HTTP response code returned.
message A message that describes the error.

When the Node SDK receives an error response from the Speech to Text: Customer Care service, it creates an Error object with information that describes the error that occurred. This error object is passed as the first parameter to the callback function for the method. The contents of the error object are as shown in the following table.

Error

Field Description
code The HTTP response code returned.
message A message that describes the error.

The Python SDK generates an exception for any unsuccessful method invocation. When the Python SDK receives an error response from the Speech to Text: Customer Care service, it generates an ApiException that contains the following fields.

Field Description
code The HTTP response code returned.
message A message that describes the error.
info A dictionary of additional information about the error.

When the Ruby SDK receives an error response from the Speech to Text: Customer Care service, it generates an ApiException that contains the following fields.

Field Description
code The HTTP response code returned.
message A message that describes the error.
info A dictionary of additional information about the error.

The Go SDK generates an error for any unsuccessful service instantiation and method invocation. You can check for the error immediately. The contents of the error object are as shown in the following table.

Error

Field Description
code The HTTP response code returned.
message A message that describes the error.

The Swift SDK returns a WatsonError in the completionHandler for any unsuccessful method invocation. This error type is an enum that conforms to LocalizedError and contains an errorDescription property that returns an error message. Some of the WatsonError cases contain associated values that reveal more information about the error.

Field Description
errorDescription A message that describes the error.

Example error handling

try {
  // Invoke a Speech to Text: Customer Care method
} catch (NotFoundException e) {
  // Handle Not Found (404) exception
} catch (RequestTooLargeException e) {
  // Handle Request Too Large (413) exception
} catch (ServiceResponseException e) {
  // Base class for all exceptions caused by error responses from the service
  System.out.println("Service returned status code "
    + e.getStatusCode() + ": " + e.getMessage());
}

Example error handling

speechToText.method(params)
  .catch(err => {
    console.log('error:', err);
  });

Example error handling

from ibm_watson import ApiException
try:
    pass  # Invoke a Speech to Text: Customer Care method
except ApiException as ex:
    print("Method failed with status code " + str(ex.code) + ": " + ex.message)

Example error handling

require "ibm_watson"
begin
  # Invoke a Speech to Text: Customer Care method
rescue IBMWatson::ApiException => ex
  print "Method failed with status code #{ex.code}: #{ex.error}"
end

Example error handling

import "github.com/watson-developer-cloud/go-sdk/speechtotextv1"

// Instantiate a service
speechToText, speechToTextErr := speechtotextv1.NewSpeechToTextV1(&speechtotextv1.SpeechToTextV1Options{})

// Check for error
if speechToTextErr != nil {
  panic(speechToTextErr)
}

// Call a method
response, responseErr := speechToText.methodName(&methodOptions)

// Check for error
if responseErr != nil {
  panic(responseErr)
}

Example error handling

speechToText.method() {
  response, error in

  if let error = error {
    switch error {
    case let .http(statusCode, message, metadata):
      switch statusCode {
      case .some(404):
        // Handle Not Found (404) exception
        print("Not found")
      case .some(413):
        // Handle Request Too Large (413) exception
        print("Payload too large")
      default:
        if let statusCode = statusCode {
          print("Error - code: \(statusCode), \(message ?? "")")
        }
      }
    default:
      print(error.localizedDescription)
    }
    return
  }

  guard let result = response?.result else {
    print(error?.localizedDescription ?? "unknown error")
    return
  }

  print(result)
}

Data handling

Additional headers

Some Watson services accept special parameters in headers that are passed with the request.

You can pass request header parameters in all requests or in a single request to the service.

To pass a request header, use the --header (-H) option with a curl request.

To pass header parameters with every request, use the setDefaultHeaders method of the service object.

To pass header parameters in a single request, use the addHeader method as a modifier on the request before you execute the request.

To pass header parameters with every request, specify the headers parameter when you create the service object.

To pass header parameters in a single request, include a headers object in the parameters that you pass to the method.

To pass header parameters with every request, use the set_default_headers method of the service object.

To pass header parameters in a single request, include headers as a dict in the request.

To pass header parameters with every request, use the add_default_headers method of the service object.

To pass header parameters in a single request, use the headers method as a chainable method on the request.

To pass header parameters with every request, use the SetDefaultHeaders method of the service object.

To pass header parameters in a single request, include Headers as a map in the request options.

To pass header parameters with every request, add them to the defaultHeaders property of the service object.

To pass header parameters in a single request, pass the headers parameter to the request method.

Example header parameter in a request

curl -u "apikey:{apikey}" -X {request_method} --header "Request-Header: {header_value}" "https://{icp_cluster_host}{:port}/speech-to-text/api/v1/{method}"

Example header parameter in a request

ReturnType returnValue = speechToText.methodName(parameters)
  .addHeader("Custom-Header", "{header_value}")
  .execute();

Example header parameter in a request

const parameters = {
  {parameters},
  headers: {
    'Custom-Header': '{header_value}'
  }
};

speechToText.methodName(parameters)
  .then(result => {
    console.log(result);
  })
  .catch(err => {
    console.log('error:', err);
  });

Example header parameter in a request

response = speech_to_text.methodName(
    parameters,
    headers = {
        'Custom-Header': '{header_value}'
    })

Example header parameter in a request

response = speech_to_text.headers(
  "Custom-Header" => "{header_value}"
).methodName(parameters)

Example header parameter in a request

response, _ := speechToText.methodName(
  &methodOptions{
    Headers: map[string]string{
      "Accept": "application/json",
    },
  },
)

Example header parameter in a request

let customHeader: [String: String] = ["Custom-Header": "{header_value}"]
speechToText.methodName(parameters, headers: customHeader) {
  response, error in
}

Response details

Speech to Text: Customer Care might return information to the application in response headers.

To access all response headers that the service returns, include the --include (-i) option with a curl request. To see detailed response data for the request, including request headers, response headers, and additional debugging information, include the --verbose (-v) option with the request.

Example request to access response headers

curl -u "apikey:{apikey}" -X {request_method} --include "https://{icp_cluster_host}{:port}/speech-to-text/api/v1/{method}"

To access information in the response headers, use one of the request methods that returns details with the response: executeWithDetails(), enqueueWithDetails(), or rxWithDetails(). These methods return a Response<T> object, where T is the expected response model. Use the getResult() method to access the response object for the method, and use the getHeaders() method to access information in response headers.

Example request to access response headers

Response<ReturnType> response = speechToText.methodName(parameters)
  .executeWithDetails();
// Access response from methodName
ReturnType returnValue = response.getResult();
// Access information in response headers
Headers responseHeaders = response.getHeaders();

To access information in the response headers, set the return_response parameter to true in the request. The method then returns the full response object. To access information in the response object, use the following properties.

Property Description
result Returns the response for the service-specific method.
headers Returns the response header information.
status Returns the HTTP status code.

Example request to access response headers

const parameters = {
  {parameters}
};

parameters.return_response = true;

speechToText.methodName(parameters)
  .then(response => {
    console.log(response.headers);
  })
  .catch(err => {
    console.log('error:', err);
  });

The return value from all service methods is a DetailedResponse object. To access information in the result object or response headers, use the following methods.

DetailedResponse

Method Description
get_result() Returns the response for the service-specific method.
get_headers() Returns the response header information.
get_status_code() Returns the HTTP status code.

Example request to access response headers

speech_to_text.set_detailed_response(True)
response = speech_to_text.methodName(parameters)
# Access response from methodName
print(json.dumps(response.get_result(), indent=2))
# Access information in response headers
print(response.get_headers())
# Access HTTP response status
print(response.get_status_code())

The return value from all service methods is a DetailedResponse object. To access information in the response object, use the following properties.

DetailedResponse

Property Description
result Returns the response for the service-specific method.
headers Returns the response header information.
status Returns the HTTP status code.

Example request to access response headers

response = speech_to_text.methodName(parameters)
# Access response from methodName
print response.result
# Access information in response headers
print response.headers
# Access HTTP response status
print response.status

The return value from all service methods is a DetailedResponse object. To access information in the response object or response headers, use the following methods.

DetailedResponse

Method Description
GetResult() Returns the response for the service-specific method.
GetHeaders() Returns the response header information.
GetStatusCode() Returns the HTTP status code.

Example request to access response headers

import "github.com/IBM/go-sdk-core/core"
response, _ := speechToText.methodName(&methodOptions{})

// Access result
core.PrettyPrint(response.GetResult(), "Result ")

// Access response headers
core.PrettyPrint(response.GetHeaders(), "Headers ")

// Access status code
core.PrettyPrint(response.GetStatusCode(), "Status Code ")

All response data is available in the WatsonResponse<T> object returned in each method's completionHandler.

Example request to access response headers

speechToText.methodName(parameters) {
  response, error in

  guard let result = response?.result else {
    print(error?.localizedDescription ?? "unknown error")
    return
  }
  print(result) // The data returned by the service
  print(response?.statusCode)
  print(response?.headers)
}

Data labels

You can remove customer data if you associate the customer and the data when you send the information to a service. First you label the data with a customer ID, and then you can delete the data by the ID.

  • Use the X-Watson-Metadata header to associate a customer ID with the data. By adding a customer ID to a request, you indicate that it contains data that belongs to that customer.

    Specify a random or generic string for the customer ID. Do not include personal data, such as an email address. Pass the string customer_id={id} as the argument of the header.

  • Use the Delete labeled data method to remove data that is associated with a customer ID.

Labeling data is used only by methods that accept customer data. For more information about Speech to Text: Customer Care and labeling data, see Information security.

For more information about how to pass headers, see Additional headers.
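
For example, the following sketch uses the Python SDK to label the data for a single recognition request by passing the X-Watson-Metadata header. The audio file name and customer ID are illustrative values, and the sketch assumes the authentication setup that is shown in the Authentication section.

from ibm_watson import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

# Label the data that is passed with this request with a customer ID.
# 'my_customer_ID' is an illustrative, generic string.
with open('audio-file.flac', 'rb') as audio_file:
    response = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/flac',
        headers={
            'X-Watson-Metadata': 'customer_id=my_customer_ID'
        }
    )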

Synchronous and asynchronous requests

The Java SDK supports both synchronous (blocking) and asynchronous (non-blocking) execution of service methods. All service methods implement the ServiceCall interface.

  • To call a method synchronously, use the execute method of the ServiceCall interface. You can call the execute method directly from an instance of the service.
  • To call a method asynchronously, use the enqueue method of the ServiceCall interface to receive a callback when the response arrives. The ServiceCallback interface of the method's argument provides onResponse and onFailure methods that you override to handle the callback.

The Ruby SDK supports both synchronous (blocking) and asynchronous (non-blocking) execution of service methods. All service methods implement the Concurrent::Async module. When you use the synchronous or asynchronous methods, an IVar object is returned. You access the DetailedResponse object by calling ivar_object.value.

For more information about the Ivar object, see the IVar class docs.

  • To call a method synchronously, either call the method directly or use the .await chainable method of the Concurrent::Async module.

    Calling a method directly (without .await) returns a DetailedResponse object.

  • To call a method asynchronously, use the .async chainable method of the Concurrent::Async module.

You can call the .await and .async methods directly from an instance of the service.

Example synchronous request

ReturnType returnValue = speechToText.method(parameters).execute();

Example asynchronous request

speechToText.method(parameters).enqueue(new ServiceCallback<ReturnType>() {
  @Override public void onResponse(ReturnType response) {
    . . .
  }
  @Override public void onFailure(Exception e) {
    . . .
  }
});

Example synchronous request

response = speech_to_text.method_name(parameters)

or

response = speech_to_text.await.method_name(parameters)

Example asynchronous request

response = speech_to_text.async.method_name(parameters)

WebSockets

Recognize audio (WebSockets)

Sends audio and returns transcription results for recognition requests over a WebSocket connection. Requests and responses are enabled over a single TCP connection that abstracts much of the complexity of the request to offer efficient implementation, low latency, high throughput, and an asynchronous response. The endpoint for the WebSocket API is

wss://{icp_cluster_host}/speech-to-text/api/v1/recognize

You can pass a maximum of 100 MB and a minimum of 100 bytes of audio per utterance (per recognition request). You can send multiple utterances over a single WebSocket connection. The service automatically detects the endianness of the incoming audio and, for audio that includes multiple channels, downmixes the audio to one-channel mono during transcoding.

By default, the service returns only final results for any request. To enable interim results, set the interim_results (or interimResults) parameter to true.

The WebSocket interface cannot be called from curl. Use a client-side scripting language to call the interface. The example request uses JavaScript to invoke the WebSocket recognize method.

You cannot use JavaScript to call the WebSocket interface from a browser. The watson-token parameter that is available with the /v1/recognize method does not accept API keys. For information about working around this limitation, see the Release notes.

The createRecognizeStream method is deprecated. Use the equivalent recognizeUsingWebSocket method instead.

The recognize_with_websocket method is deprecated. Use the equivalent recognize_using_websocket method instead.

Audio formats (content types)

The service accepts audio in the following formats (MIME types).

  • For formats that are labeled Required, you must use the content-type (contentType or content_type) parameter with the request to specify the format of the audio.
  • For all other formats, you can omit the content-type (contentType or content_type) parameter or specify application/octet-stream with the parameter to have the service automatically detect the format of the audio.

Where indicated, the format that you specify must include the sampling rate and can optionally include the number of channels and the endianness of the audio.

  • application/octet-stream
  • audio/basic (Required. Use only with narrowband models.)
  • audio/flac
  • audio/g729 (Use only with narrowband models.)
  • audio/l16 (Required. Specify the sampling rate (rate) and optionally the number of channels (channels) and endianness (endianness) of the audio.)
  • audio/mp3
  • audio/mpeg
  • audio/mulaw (Required. Specify the sampling rate (rate) of the audio.)
  • audio/ogg (The service automatically detects the codec of the input audio.)
  • audio/ogg;codecs=opus
  • audio/ogg;codecs=vorbis
  • audio/wav (Provide audio with a maximum of nine channels.)
  • audio/webm (The service automatically detects the codec of the input audio.)
  • audio/webm;codecs=opus
  • audio/webm;codecs=vorbis

See also: Audio formats

The Python recognize_using_websocket method requires the content_type parameter.

URI /v1/recognize
okhttp3.WebSocket recognizeUsingWebSocket(RecognizeOptions options,
  RecognizeCallback callback)
RecognizeStream recognizeUsingWebSocket(params)
dict recognize_using_websocket(audio, content_type,
  recognize_callback, model=None,
  language_customization_id=None, acoustic_customization_id=None,
  customization_weight=None, base_model_version=None,
  inactivity_timeout=None, interim_results=None,
  keywords=None, keywords_threshold=None,
  max_alternatives=None, word_alternatives_threshold=None,
  word_confidence=None, timestamps=None, profanity_filter=None,
  smart_formatting=None, speaker_labels=None, http_proxy_host=None,
  http_proxy_port=None, **kwargs)
WebSocketClient recognize_using_websocket(content_type:,
  recognize_callback:, audio: nil, chunk_data: false, model: nil,
  language_customization_id: nil, acoustic_customization_id: nil,
  customization_weight: nil, base_model_version: nil,
  inactivity_timeout: nil, interim_results: nil,
  keywords: nil, keywords_threshold: nil,
  max_alternatives: nil, word_alternatives_threshold: nil,
  word_confidence: nil, timestamps: nil, profanity_filter: nil,
  smart_formatting: nil, speaker_labels: nil)

Request

The client calls the recognize method to obtain a string that contains the URI for the WebSocket interface. The call to the recognize method sets basic parameters for the connection and for all recognition requests that are sent over it. See the Parameters of recognize method table.

The client then establishes a connection with the service by passing the URI to the WebSocket constructor, which returns a WebSocket connection object. The client initiates and manages recognition requests by sending JSON-formatted text messages to the service over the connection. The text messages can include all other parameters of the recognition request. The required action parameter tells the service which action is to be performed. See the Parameters of WebSocket text messages table.

After sending the text message to initiate a request, the client sends the audio data to be transcribed as a binary message (blob) over the connection.

Parameters of recognize method

  • Provides an authentication token for the service.

    Important: You cannot use JavaScript to call the WebSocket interface from a browser. The watson-token parameter does not accept API keys. For information about working around this limitation, see the Release notes.

  • The identifier of the model that is to be used for all recognition requests sent over the connection. See Languages and models.

    Allowable values: [en-US_BroadbandModel,en-US_NarrowbandModel,en-US_ShortForm_NarrowbandModel,es-ES_BroadbandModel,es-ES_NarrowbandModel,fr-FR_BroadbandModel,fr-FR_NarrowbandModel,ja-JP_BroadbandModel,ja-JP_NarrowbandModel,ko-KR_BroadbandModel,ko-KR_NarrowbandModel]

    Default: en-US_BroadbandModel

  • The customization ID (GUID) of a custom language model that is to be used for all requests sent over the connection. The base model of the specified custom language model must match the model specified with the model parameter. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to use the specified model with no custom language model. See Custom models.

    Note: Use this parameter instead of the deprecated customization_id parameter.

  • The customization ID (GUID) of a custom acoustic model that is to be used for the request. The base model of the specified custom acoustic model must match the model specified with the model parameter. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to use the specified model with no custom acoustic model. See Custom models.

  • The version of the specified base model that is to be used for all requests sent over the connection. Multiple versions of a base model can exist when a model is updated for internal improvements. The parameter is intended primarily for use with custom models that have been upgraded for a new base model. The default value depends on whether the parameter is used with or without a custom model. See Base model version.

  • Associates a customer ID with all data that is passed over the connection. The parameter accepts the argument customer_id={id}, where {id} is a random or generic string that is to be associated with the data. URL-encode the argument to the parameter, for example customer_id%3dmy_ID. By default, no customer ID is associated with the data. See Data labels.

  • Deprecated. Use the language_customization_id parameter to specify the customization ID (GUID) of a custom language model that is to be used with all requests sent over the connection. Do not specify both parameters with a request.

Call the recognizeUsingWebSocket method to initiate a recognition request. Use the recognizeOptions argument to pass a RecognizeOptions object that provides the parameters for the request, including the audio. Use the callback argument to pass a Java BaseRecognizeCallback object to handle events from the WebSocket connection.

Call the recognizeUsingWebSocket method to initiate a recognition request. The method returns a RecognizeStream object to which you pipe the audio that is to be transcribed. You also use the object's on method to define event handlers for the request. You pass all other parameters of the request as arguments of the method.

Call the recognize_using_websocket method to initiate a recognition request. Pass the audio and all parameters of the request, including the RecognizeCallback and AudioSource objects, as arguments of the method.

Call the recognize_using_websocket method to create a WebSocketClient object. Pass the audio and all parameters of the request, including the RecognizeCallback object, as arguments of the method.

Parameters of WebSocket text messages

Parameters

  • The action that is to be performed.

    Allowable values:

    • start initiates a recognition request. The message can also include any other optional parameters described in this table. After sending this text message, the client sends the data as a binary message (blob).

      Between recognition requests, the client can send new start messages to modify the parameters that are to be used for subsequent requests. By default, the service continues to use the parameters that were specified with the previous start message.

    • stop indicates that all audio data for the request has been sent to the service. The client can send additional requests with the same or different parameters.

  • Indicates how the data event handler is to return the response from the service:

    • If false, the event handler returns only a string with the final transcription of the recognition results, regardless of the parameters that you pass with the request. You must set the encoding for your instance of the RecognizeStream object to UTF-8 by including a call that is similar to the following line of code in your application:

      recognizeStream.setEncoding('utf8');

      Do not include this call if you set the objectMode parameter to true.

    • If true, the event handler returns the recognition results exactly as it receives them from the service: as one or more instances of a SpeechRecognitionResults object.

    For more information, see the Example request for the method.

  • The audio that is to be transcribed.

    An AudioSource object that provides the audio that is to be transcribed.

  • The format (MIME type) of the audio. For more information about specifying an audio format, see Audio formats (content types) in the method description.

    Allowable values: [application/octet-stream, audio/basic, audio/flac, audio/g729, audio/l16, audio/mp3, audio/mpeg, audio/mulaw, audio/ogg, audio/ogg;codecs=opus, audio/ogg;codecs=vorbis, audio/wav, audio/webm, audio/webm;codecs=opus, audio/webm;codecs=vorbis]

  • A BaseRecognizeCallback object that implements the RecognizeCallback interface to handle events from the WebSocket connection. Override the definitions of the object's default methods to respond to events as needed by your application.

    A RecognizeCallback object that defines methods to handle events from the WebSocket connection. Override the definitions of the object's default methods to respond to events as needed by your application.

  • The audio that is to be transcribed.

  • If true, the WebSocketClient expects to receive data in chunks rather than as a single audio file. See Audio transmission.

    Default: false

  • The identifier of the model that is to be used for the recognition request. See Languages and models.

    Allowable values: [en-US_BroadbandModel,en-US_NarrowbandModel,en-US_ShortForm_NarrowbandModel,es-ES_BroadbandModel,es-ES_NarrowbandModel,fr-FR_BroadbandModel,fr-FR_NarrowbandModel,ja-JP_BroadbandModel,ja-JP_NarrowbandModel,ko-KR_BroadbandModel,ko-KR_NarrowbandModel]

    Default: en-US_BroadbandModel

  • The customization ID (GUID) of a custom language model that is to be used for the request. The base model of the specified custom language model must match the model specified with the model parameter. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to use the specified model with no custom language model. See Custom models.

    Note: Use this parameter instead of the deprecated customization_id (customizationId) parameter.

  • The customization ID (GUID) of a custom acoustic model that is to be used for the request. The base model of the specified custom acoustic model must match the model specified with the model parameter. You must make the request with service credentials created for the instance of the service that owns the custom model. Omit the parameter to use the specified model with no custom acoustic model. See Custom models.

  • If you specify a customization ID (either when you open the connection or with an individual request), you can use the customization weight to tell the service how much weight to give to words from the custom language model compared to those from the base model for the current request.

    Specify a value between 0.0 and 1.0. Unless a different customization weight was specified for the custom model when it was trained, the default value is 0.3. A customization weight that you specify overrides a weight that was specified when the custom model was trained.

    The default value yields the best performance in general. Assign a higher value if your audio makes frequent use of out-of-vocabulary (OOV) words from the custom model. Use caution when setting the weight: a higher value can improve the accuracy of phrases from the custom model's domain, but it can negatively affect performance on non-domain phrases.

    See Custom models.

  • The version of the specified base model that is to be used for the request. Multiple versions of a base model can exist when a model is updated for internal improvements. The parameter is intended primarily for use with custom models that have been upgraded for a new base model. The default value depends on whether the parameter is used with or without a custom model. See Base model version.

  • The time in seconds after which, if only silence (no speech) is detected in the audio, the connection is closed. The default is 30 seconds. The parameter is useful for stopping audio submission from a live microphone when a user simply walks away. Use -1 for infinity. See Inactivity timeout.

    Default: 30

  • If true, the service returns interim results as a stream of JSON SpeechRecognitionResults objects. If false, the service returns a single SpeechRecognitionResults object with final results only. See Interim results. (See the objectMode parameter for information about controlling the response from the method.)

    Default: false

  • An array of keyword strings to spot in the audio. Each keyword string can include one or more string tokens. Keywords are spotted only in the final results, not in interim hypotheses. If you specify any keywords, you must also specify a keywords threshold. You can spot a maximum of 1000 keywords. Omit the parameter or specify an empty array if you do not need to spot keywords. See Keyword spotting.

  • A confidence value that is the lower bound for spotting a keyword. A word is considered to match a keyword if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. No keyword spotting is performed if you omit the parameter. If you specify a threshold, you must also specify one or more keywords. See Keyword spotting.

  • The maximum number of alternative transcripts that the service is to return. By default, a single transcription is returned. See Maximum alternatives.

    Default: 1

  • A confidence value that is the lower bound for identifying a hypothesis as a possible word alternative (also known as "Confusion Networks"). An alternative word is considered if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. No alternative words are computed if you omit the parameter. See Word alternatives.

  • If true, the service returns a confidence measure in the range of 0.0 to 1.0 for each word. By default, no word confidence measures are returned. See Word confidence.

    Default: false

  • If true, the service returns time alignment for each word. By default, no timestamps are returned. See Word timestamps.

    Default: false

  • If true, the service filters profanity from all output except for keyword results by replacing inappropriate words with a series of asterisks. Set the parameter to false to return results with no censoring. Applies to US English transcription only. See Profanity filtering.

    Default: true

  • If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations in the final transcript of a recognition request. For US English, the service also converts certain keyword strings to punctuation symbols. By default, no smart formatting is performed. Applies to US English and Japanese transcription only. See Smart formatting.

    Default: false

  • If true, the response includes labels that identify which words were spoken by which participants in a multi-person exchange. By default, no speaker labels are returned. Specifying true forces the timestamps parameter to be true, regardless of whether you specify false for that parameter.

    To determine whether a language model supports speaker labels, use the Get a model method and check that the attribute speaker_labels is set to true. See Speaker labels.

    Default: false

  • If you are passing requests through a proxy, specify the host name of the proxy server. Use the http_proxy_port parameter to specify the port number at which the proxy listens. Omit both parameters if you are not using a proxy.

    Default: None

  • If you are passing requests through a proxy, specify the port number at which the proxy service listens. Use the http_proxy_host parameter to specify the host name of the proxy. Omit both parameters if you are not using a proxy.

    Default: None

  • Deprecated. Use the language_customization_id (languageCustomizationId) parameter to specify the customization ID (GUID) of a custom language model that is to be used with the request. Do not specify both parameters with a request.

  • The name of a grammar that is to be used with the recognition request. If you specify a grammar, you must also use the language_customization_id parameter to specify the name of the custom language model for which the grammar is defined. The service recognizes only strings that are recognized by the specified grammar; it does not recognize other custom words from the model's words resource. See Grammars.

  • If true, the service redacts, or masks, numeric data from final transcripts. The feature redacts any number that has three or more consecutive digits by replacing each digit with an X character. It is intended to redact sensitive numeric data, such as credit card numbers. By default, no redaction is performed.

    When you enable redaction, the service automatically enables smart formatting, regardless of whether you explicitly disable that feature. To ensure maximum security, the service also disables keyword spotting (ignores the keywords and keywords_threshold parameters) and returns only a single final transcript (forces the max_alternatives parameter to be 1).

    See Numeric redaction.

    Default: false

Example request

var token = "{authentication-token}";
var wsURI = "wss://{icp_cluster_host}/speech-to-text/api/v1/recognize"
  + "?watson-token=" + token + '&model=en-US_BroadbandModel';

var websocket = new WebSocket(wsURI);
websocket.onopen = function(evt) { onOpen(evt) };
websocket.onclose = function(evt) { onClose(evt) };
websocket.onmessage = function(evt) { onMessage(evt) };
websocket.onerror = function(evt) { onError(evt) };

function onOpen(evt) {
  var message = {
    action: 'start',
    keywords: ['colorado', 'tornado', 'tornadoes'],
    keywords_threshold: 0.5,
    max_alternatives: 3
  };
  websocket.send(JSON.stringify(message));

  // Prepare and send the audio file.
  websocket.send(blob);

  websocket.send(JSON.stringify({action: 'stop'}));
}

function onClose(evt) {
  console.log(evt.data);
}

function onMessage(evt) {
  console.log(evt.data);
}

function onError(evt) {
  console.log(evt.data);
}

Example request

SpeechToText speechToText = new SpeechToText();
speechToText.setUsernameAndPassword("apikey", "{apikey}");
speechToText.setEndPoint("https://{icp_cluster_host}{:port}/speech-to-text/api");

try {
  RecognizeOptions recognizeOptions = new RecognizeOptions.Builder()
    .audio(new FileInputStream("audio-file.flac"))
    .contentType("audio/flac")
    .model("en-US_BroadbandModel")
    .keywords(Arrays.asList("colorado", "tornado", "tornadoes"))
    .keywordsThreshold((float) 0.5)
    .maxAlternatives(3)
    .build();

  BaseRecognizeCallback baseRecognizeCallback =
    new BaseRecognizeCallback() {

      @Override
      public void onTranscription
        (SpeechRecognitionResults speechRecognitionResults) {
          System.out.println(speechRecognitionResults);
      }

      @Override
      public void onDisconnected() {
        System.exit(0);
      }

    };

  speechToText.recognizeUsingWebSocket(recognizeOptions,
    baseRecognizeCallback);
} catch (FileNotFoundException e) {
  e.printStackTrace();
}

Example request

var SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');
var fs = require('fs');

var speechToText = new SpeechToTextV1({
  username: 'apikey',
  password: '{apikey}',
  url: 'https://{icp_cluster_host}{:port}/speech-to-text/api'
});

var params = {
  objectMode: true,
  content_type: 'audio/flac',
  model: 'en-US_BroadbandModel',
  keywords: ['colorado', 'tornado', 'tornadoes'],
  keywords_threshold: 0.5,
  max_alternatives: 3
};

// Create the stream.
var recognizeStream = speechToText.recognizeUsingWebSocket(params);

// Pipe in the audio.
fs.createReadStream('audio-file.flac').pipe(recognizeStream);

/*
 * Uncomment the following two lines of code ONLY if `objectMode` is `false`.
 *
 * WHEN USED TOGETHER, the two lines pipe the final transcript to the named
 * file and produce it on the console.
 *
 * WHEN USED ALONE, the following line pipes just the final transcript to
 * the named file but produces numeric values rather than strings on the
 * console.
 */
// recognizeStream.pipe(fs.createWriteStream('transcription.txt'));

/*
 * WHEN USED ALONE, the following line produces just the final transcript
 * on the console.
 */
// recognizeStream.setEncoding('utf8');

// Listen for events.
recognizeStream.on('data', function(event) { onEvent('Data:', event); });
recognizeStream.on('error', function(event) { onEvent('Error:', event); });
recognizeStream.on('close', function(event) { onEvent('Close:', event); });

// Display events on the console.
function onEvent(name, event) {
    console.log(name, JSON.stringify(event, null, 2));
};

Example request

from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import RecognizeCallback, AudioSource
from os.path import join, dirname
import json

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

class MyRecognizeCallback(RecognizeCallback):
    def __init__(self):
        RecognizeCallback.__init__(self)

    def on_data(self, data):
        print(json.dumps(data, indent=2))

    def on_error(self, error):
        print('Error received: {}'.format(error))

    def on_inactivity_timeout(self, error):
        print('Inactivity timeout: {}'.format(error))

myRecognizeCallback = MyRecognizeCallback()

with open(join(dirname(__file__), './.', 'audio-file.flac'),
              'rb') as audio_file:
    audio_source = AudioSource(audio_file)
    speech_to_text.recognize_using_websocket(
        audio=audio_source,
        content_type='audio/flac',
        recognize_callback=myRecognizeCallback,
        model='en-US_BroadbandModel',
        keywords=['colorado', 'tornado', 'tornadoes'],
        keywords_threshold=0.5,
        max_alternatives=3)

Example request

require("ibm_watson/speech_to_text_v1")
require("ibm_watson/websocket/recognize_callback")
include IBMWatson

speech_to_text = IBMWatson::SpeechToTextV1.new(
  username: "apikey",
  password: "{apikey}",
  url: "https://{icp_cluster_host}{:port}/speech-to-text/api"
)

class MyRecognizeCallback < IBMWatson::RecognizeCallback
  def initialize
    super
  end

  def on_error(error:)
    puts "Error received: #{error}"
  end

  def on_inactivity_timeout(error:)
    puts "Inactivity timeout: #{error}"
  end

  def on_data(data:)
    puts data.to_s
  end
end

mycallback = MyRecognizeCallback.new
File.open(Dir.getwd + "/resources/speech.wav") do |audio_file|
  speech_to_text.recognize_using_websocket(
    audio: audio_file,
    recognize_callback: mycallback,
    content_type: "audio/wav"
  ).start
end

Response

Successful recognition returns one or more instances of a SpeechRecognitionResults object. The contents of the response depend on the parameters you send with the recognition request, including the interim_results parameter. For more information, see the results for the Recognize audio method.

If the objectMode parameter is true, successful recognition returns one or more instances of a SpeechRecognitionResults object. The contents of the response depend on the parameters you send with the recognition request, including the interim_results parameter. For more information, see the results for the Recognize audio method.

If the objectMode parameter is false, successful recognition returns only a single string with the final transcription results.

Response handling

Response handling for the WebSocket interface is different from HTTP response handling. The WebSocket constructor returns an instance of a WebSocket connection object. You assign application-specific functions to the following methods of the object to handle events that are associated with the connection. Each event handler accepts a single argument: the event from the connection that causes it to execute.

Methods

  • onopen: The status of the connection's opening.

  • onmessage: Response messages from the service, including the results of the request as one or more JSON SpeechRecognitionResults objects.

  • onerror: Errors for the connection or request.

  • onclose: The status of the connection's closing.

The callback parameter of the recognizeUsingWebSocket method accepts a Java object of type BaseRecognizeCallback, which implements the RecognizeCallback interface to handle events from the WebSocket connection. You override the definitions of the following default empty methods of the object to handle events that are associated with the connection and the request. The methods are called when their associated events occur.

Methods

  • The WebSocket connection is established.

  • The service is listening for audio.

  • Results for the request are received from the service.
  • Final results for the request have been returned by the service.

  • An error occurs in the WebSocket connection.

  • An inactivity timeout occurs for the request.

  • The WebSocket connection is closed.

You handle events that are associated with the WebSocket connection and the request by defining event-handler methods on the RecognizeCallback object that is returned by the recognizeUsingWebSocket method. The methods are called when their associated events occur. You can define handlers for the following events by using the object's on method. For more information about streams and events, see the Node.js documentation.

Events

  • Results for the request are received on the stream.

  • Data is available to be read from the stream.

  • No data remains to be read from the stream.
  • The WebSocket connection is closed.

  • An error occurs in the WebSocket connection.

The recognize_callback parameter of the recognize_using_websocket method accepts an object of type RecognizeCallback. The object defines the methods that handle events from the WebSocket connection. You can override the definitions of the following default empty methods of the object to handle events that are associated with the connection and the request. The methods are called when their associated events occur.

Methods

  • The WebSocket connection is established.

  • The service is listening for audio.

  • Returns all response data for the request from the service.
  • Returns interim results or maximum alternatives from the service when those responses are requested.

  • Returns final transcription results for the request from the service.

  • The service has returned final results for the request.

  • Reports an error in the WebSocket connection.

  • Reports an inactivity timeout for the request.
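
The same callback pattern in the Python SDK is sketched below. This is a minimal, illustrative example only: it assumes the Python SDK's recognize_using_websocket method and the ibm_watson.websocket helpers (RecognizeCallback and AudioSource), which are not shown elsewhere in this reference, and it reuses the credential and URL placeholders from the Ruby example.

from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import AudioSource, RecognizeCallback

class MyRecognizeCallback(RecognizeCallback):
    def on_data(self, data):
        # All response data for the request, including interim results if requested
        print(data)

    def on_error(self, error):
        print('Error received: {0}'.format(error))

    def on_inactivity_timeout(self, error):
        print('Inactivity timeout: {0}'.format(error))

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

# The file path is a placeholder; supply your own audio.
with open('resources/speech.wav', 'rb') as audio_file:
    speech_to_text.recognize_using_websocket(
        audio=AudioSource(audio_file),
        content_type='audio/wav',
        recognize_callback=MyRecognizeCallback()
    )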

The connection can produce the following return codes.

Return codes

  • The connection closed normally.

  • The connection closed because the remote peer is leaving.

  • The connection closed due to a protocol error.

  • The connection closed because the service could not process the input from the client.

  • Reserved response code.

  • The connection closed for a reason other than those defined by the remaining return codes.

  • The connection closed abnormally.

  • The connection closed because the service received invalid data.

  • The connection closed due to a policy violation.

  • The connection closed because the frame size exceeded the 4 MB limit.

  • The connection closed because the client requested a required extension that is not available.

  • The connection closed because the service encountered an unexpected internal condition that prevents it from fulfilling the request.

  • The connection was not established due to a TLS handshake error.

Example response

{
  "results": [
    {
      "final": true,
      "alternatives": [
        {
          "transcript": "several tornadoes touch down as a line of severe thunderstorms swept through Colorado on Sunday ",
          "confidence": 0.889
        },
        {
          "transcript": "several tornadoes touch down is a line of severe thunderstorms swept through Colorado on Sunday "
        },
        {
          "transcript": "several tornadoes touched down as a line of severe thunderstorms swept through Colorado on Sunday "
        }
      ],
      "keywords_result": {
        "tornadoes": [
          {
            "normalized_text": "tornadoes",
            "start_time": 1.52,
            "end_time": 2.15,
            "confidence": 1.0
          }
        ],
        "colorado": [
          {
            "normalized_text": "Colorado",
            "start_time": 4.95,
            "end_time": 5.59,
            "confidence": 0.978
          }
        ]
      }
    }
  ],
  "result_index": 0
}

Methods

List models

Lists all language models that are available for use with the service. The information includes the name of the model and its minimum sampling rate in Hertz, among other things.

See also: Languages and models.

GET /v1/models
(speechToText *SpeechToTextV1) ListModels(listModelsOptions *ListModelsOptions) (*core.DetailedResponse, error)
ServiceCall<SpeechModels> listModels()
listModels(params, [ callback() ])
list_models(self, **kwargs)
list_models
func listModels(
    headers: [String: String]? = nil,
    completionHandler: @escaping (WatsonResponse<SpeechModels>?, WatsonError?) -> Void)
Request

No Request Parameters

This method does not accept any request parameters.
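
Example request

The following Python sketch lists the available models. It assumes the ibm-watson Python client and the same credential and URL placeholders that the Ruby example earlier in this reference uses; adjust them for your installation.

import json
from ibm_watson import SpeechToTextV1

speech_to_text = SpeechToTextV1(
    username='apikey',
    password='{apikey}',
    url='https://{icp_cluster_host}{:port}/speech-to-text/api'
)

# list_models() returns a DetailedResponse; get_result() extracts the JSON body.
speech_models = speech_to_text.list_models().get_result()
print(json.dumps(speech_models, indent=2))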

Response

Information about the available language models.

Status Code

  • OK. The request succeeded.

  • Not Acceptable. The request specified an Accept header with an incompatible content type.

  • Unsupported Media Type. The request specified an unacceptable media type.

Example responses

Get a model

Gets information for a single specified language model that is available for use with the service. The information includes the name of the model and its minimum sampling rate in Hertz, among other things.

See also: Languages and models.

GET /v1/models/{model_id}
(speechToText *SpeechToTextV1) GetModel(getModelOptions *GetModelOptions) (*core.DetailedResponse, error)
ServiceCall<SpeechModel> getModel(GetModelOptions getModelOptions)
getModel(params, [ callback() ])
get_model(self, model_id, **kwargs)
get_model(model_id:)
func getModel(
    modelID: String,
    headers: [String: String]? = nil,
    completionHandler: @escaping (WatsonResponse<SpeechModel>?, WatsonError?) -> Void)
Request

Instantiate the GetModelOptions struct and set the fields to provide parameter values for the GetModel method.

Use the GetModelOptions.Builder to create a GetModelOptions object that contains the parameter values for the getModel method.

Path Parameters

  • model_id: The identifier of the model in the form of its name from the output of the List models method.

    Allowable values: [en-US_BroadbandModel,en-US_NarrowbandModel,en-US_ShortForm_NarrowbandModel,es-ES_BroadbandModel,es-ES_NarrowbandModel,fr-FR_BroadbandModel,fr-FR_NarrowbandModel,ja-JP_BroadbandModel,ja-JP_NarrowbandModel,ko-KR_BroadbandModel,ko-KR_NarrowbandModel]

The GetModel options.

The getModel options.
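
Example request

A comparable Python sketch for this method follows; the model name is illustrative, and speech_to_text is the client constructed in the List models example.

import json

# speech_to_text is the SpeechToTextV1 client from the List models example.
speech_model = speech_to_text.get_model('en-US_BroadbandModel').get_result()
print(json.dumps(speech_model, indent=2))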

Response

Information about an available language model.

Status Code

  • OK. The request succeeded.

  • Not Found. The specified model_id was not found.

  • Not Acceptable. The request specified an Accept header with an incompatible content type.

  • Unsupported Media Type. The request specified an unacceptable media type.

Example responses

Recognize audio

Sends audio and returns transcription results for a recognition request. You can pass a maximum of 100 MB and a minimum of 100 bytes of audio with a request. The service automatically detects the endianness of the incoming audio and, for audio that includes multiple channels, downmixes the audio to one-channel mono during transcoding. The method returns only final results; to enable interim results, use the WebSocket API. (With the curl command, use the --data-binary option to upload the file for the request.)

See also: Making a basic HTTP request.

Streaming mode

For requests to transcribe live audio as it becomes available, you must set the Transfer-Encoding header to chunked to use streaming mode. In streaming mode, the service closes the connection (status code 408) if it does not receive at least 15 seconds of audio (including silence) in any 30-second period. The service also closes the connection (status code 400) if it detects no speech for inactivity_timeout seconds of streaming audio; use the inactivity_timeout parameter to change the default of 30 seconds.

See also: Audio transmission and Timeouts.

Audio formats (content types)

The service accepts audio in the following formats (MIME types).

  • For formats that are labeled Required, you must use the Content-Type header with the request to specify the format of the audio.
  • For all other formats, you can omit the Content-Type header or specify application/octet-stream with the header to have the service automatically detect the format of the audio. (With the curl command, you can specify either "Content-Type:" or "Content-Type: application/octet-stream".)

Where indicated, the format that you specify must include the sampling rate and can optionally include the number of channels and the endianness of the audio.

  • audio/basic (Required. Use only with narrowband models.)
  • audio/flac
  • audio/g729 (Use only with narrowband models.)
  • audio/l16 (Required. Specify the sampling rate (rate) and optionally the number of channels (channels) and endianness (endianness) of the audio.)
  • audio/mp3
  • audio/mpeg
  • audio/mulaw (Required. Specify the sampling rate (rate) of the audio.)
  • audio/ogg (The service automatically detects the codec of the input audio.)
  • audio/ogg;codecs=opus
  • audio/ogg;codecs=vorbis
  • audio/wav (Provide audio with a maximum of nine channels.)
  • audio/webm (The service automatically detects the codec of the input audio.)
  • audio/webm;codecs=opus
  • audio/webm;codecs=vorbis

The sampling rate of the audio must match the sampling rate of the model for the recognition request: for broadband models, at least 16 kHz; for narrowband models, at least 8 kHz. If the sampling rate of the audio is higher than the minimum required rate, the service down-samples the audio to the appropriate rate. If the sampling rate of the audio is lower than the minimum required rate, the request fails.

See also: Audio formats.
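
For example, the following Python sketch passes 16 kHz, single-channel, little-endian L16 (PCM) audio by including the required parameters in the content type. The file name is a placeholder, and speech_to_text is the client constructed in the List models example earlier in this reference.

# speech_to_text is the SpeechToTextV1 client from the List models example.
with open('audio-file.l16', 'rb') as audio_file:
    results = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/l16;rate=16000;channels=1;endianness=little-endian'
    ).get_result()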

Multipart speech recognition

Note: The Watson SDKs do not support multipart speech recognition.

The HTTP POST method of the service also supports multipart speech recognition. With multipart requests, you pass all audio data as multipart form data. You specify some parameters as request headers and query parameters, but you pass JSON metadata as form data to control most aspects of the transcription. You can use multipart recognition to pass multiple audio files with a single request.

Use the multipart approach with browsers for which JavaScript is disabled or when the parameters used with the request are greater than the 8 KB limit imposed by most HTTP servers and proxies. You can encounter this limit, for example, if you want to spot a very large number of keywords.

See also: Making a multipart HTTP request.

POST /v1/recognize
(speechToText *SpeechToTextV1) Recognize(recognizeOptions *RecognizeOptions) (*core.DetailedResponse, error)
ServiceCall<SpeechRecognitionResults> recognize(RecognizeOptions recognizeOptions)
recognize(params, [ callback() ])
recognize(self, audio, content_type=None, model=None, language_customization_id=None, acoustic_customization_id=None, base_model_version=None, customization_weight=None, inactivity_timeout=None, keywords=None, keywords_threshold=None, max_alternatives=None, word_alternatives_threshold=None, word_confidence=None, timestamps=None, profanity_filter=None, smart_formatting=None, speaker_labels=None, customization_id=None, grammar_name=None, redaction=None, **kwargs)
recognize(audio:, content_type: nil, model: nil, language_customization_id: nil, acoustic_customization_id: nil, base_model_version: nil, customization_weight: nil, inactivity_timeout: nil, keywords: nil, keywords_threshold: nil, max_alternatives: nil, word_alternatives_threshold: nil, word_confidence: nil, timestamps: nil, profanity_filter: nil, smart_formatting: nil, speaker_labels: nil, customization_id: nil, grammar_name: nil, redaction: nil)
func recognize(
    audio: Data,
    contentType: String? = nil,
    model: String? = nil,
    languageCustomizationID: String? = nil,
    acousticCustomizationID: String? = nil,
    baseModelVersion: String? = nil,
    customizationWeight: Double? = nil,
    inactivityTimeout: Int? = nil,
    keywords: [String]? = nil,
    keywordsThreshold: Double? = nil,
    maxAlternatives: Int? = nil,
    wordAlternativesThreshold: Double? = nil,
    wordConfidence: Bool? = nil,
    timestamps: Bool? = nil,
    profanityFilter: Bool? = nil,
    smartFormatting: Bool? = nil,
    speakerLabels: Bool? = nil,
    customizationID: String? = nil,
    grammarName: String? = nil,
    redaction: Bool? = nil,
    headers: [String: String]? = nil,
    completionHandler: @escaping (WatsonResponse<SpeechRecognitionResults>?, WatsonError?) -> Void)
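
Before the parameter reference, here is a minimal Python sketch of a recognition request that spots keywords and requests three alternative transcripts. The file name and parameter values are illustrative, and speech_to_text is the client constructed in the List models example.

import json

# speech_to_text is the SpeechToTextV1 client from the List models example.
with open('audio-file2.flac', 'rb') as audio_file:
    speech_recognition_results = speech_to_text.recognize(
        audio=audio_file,
        content_type='audio/flac',
        model='en-US_BroadbandModel',
        keywords=['colorado', 'tornado', 'tornadoes'],
        keywords_threshold=0.5,
        max_alternatives=3
    ).get_result()
print(json.dumps(speech_recognition_results, indent=2))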
Request

Instantiate the RecognizeOptions struct and set the fields to provide parameter values for the Recognize method.

Use the RecognizeOptions.Builder to create a RecognizeOptions object that contains the parameter values for the recognize method.

Custom Headers

  • Transfer-Encoding: Set to chunked to send the audio in streaming mode. The data does not need to exist fully before being streamed to the service. See Audio transmission.

    Allowable values: [chunked]

  • Content-Type: The format (MIME type) of the audio. For more information about specifying an audio format, see Audio formats (content types) in the method description.

    Allowable values: [application/octet-stream,audio/basic,audio/flac,audio/g729,audio/l16,audio/mp3,audio/mpeg,audio/mulaw,audio/ogg,audio/ogg;codecs=opus,audio/ogg;codecs=vorbis,audio/wav,audio/webm,audio/webm;codecs=opus,audio/webm;codecs=vorbis]

Query Parameters

  • The identifier of the model that is to be used for the recognition request. See Languages and models.

    Allowable values: [en-US_BroadbandModel,en-US_NarrowbandModel,en-US_ShortForm_NarrowbandModel,es-ES_BroadbandModel,es-ES_NarrowbandModel,fr-FR_BroadbandModel,fr-FR_NarrowbandModel,ja-JP_BroadbandModel,ja-JP_NarrowbandModel,ko-KR_BroadbandModel,ko-KR_NarrowbandModel]

    Default: en-US_BroadbandModel

  • The customization ID (GUID) of a custom language model that is to be used with the recognition request. The base model of the specified custom language model must match the model specified with the model parameter. You must make the request with credentials for the instance of the service that owns the custom model. By default, no custom language model is used. See Custom models.

    Note: Use this parameter instead of the deprecated customization_id parameter.

  • The customization ID (GUID) of a custom acoustic model that is to be used with the recognition request. The base model of the specified custom acoustic model must match the model specified with the model parameter. You must make the request with credentials for the instance of the service that owns the custom model. By default, no custom acoustic model is used. See Custom models.

  • The version of the specified base model that is to be used with the recognition request. Multiple versions of a base model can exist when a model is updated for internal improvements. The parameter is intended primarily for use with custom models that have been upgraded for a new base model. The default value depends on whether the parameter is used with or without a custom model. See Base model version.

  • If you specify the customization ID (GUID) of a custom language model with the recognition request, the customization weight tells the service how much weight to give to words from the custom language model compared to those from the base model for the current request.

    Specify a value between 0.0 and 1.0. Unless a different customization weight was specified for the custom model when it was trained, the default value is 0.3. A customization weight that you specify overrides a weight that was specified when the custom model was trained.

    The default value yields the best performance in general. Assign a higher value if your audio makes frequent use of OOV words from the custom model. Use caution when setting the weight: a higher value can improve the accuracy of phrases from the custom model's domain, but it can negatively affect performance on non-domain phrases.

    See Custom models.

  • The time in seconds after which, if only silence (no speech) is detected in streaming audio, the connection is closed with a 400 error. The parameter is useful for stopping audio submission from a live microphone when a user simply walks away. Use -1 for infinity. See Inactivity timeout.

    Default: 30

  • An array of keyword strings to spot in the audio. Each keyword string can include one or more string tokens. Keywords are spotted only in the final results, not in interim hypotheses. If you specify any keywords, you must also specify a keywords threshold. You can spot a maximum of 1000 keywords. Omit the parameter or specify an empty array if you do not need to spot keywords. See Keyword spotting.

  • A confidence value that is the lower bound for spotting a keyword. A word is considered to match a keyword if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold, you must also specify one or more keywords. The service performs no keyword spotting if you omit either parameter. See Keyword spotting.

  • The maximum number of alternative transcripts that the service is to return. By default, the service returns a single transcript. See Maximum alternatives.

    Default: 1

  • A confidence value that is the lower bound for identifying a hypothesis as a possible word alternative (also known as "Confusion Networks"). An alternative word is considered if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. By default, the service computes no alternative words. See Word alternatives.

  • If true, the service returns a confidence measure in the range of 0.0 to 1.0 for each word. By default, the service returns no word confidence scores. See Word confidence.

    Default: false

  • If true, the service returns time alignment for each word. By default, no timestamps are returned. See Word timestamps.

    Default: false

  • If true, the service filters profanity from all output except for keyword results by replacing inappropriate words with a series of asterisks. Set the parameter to false to return results with no censoring. Applies to US English transcription only. See Profanity filtering.

    Default: true

  • If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations in the final transcript of a recognition request. For US English, the service also converts certain keyword strings to punctuation symbols. By default, the service performs no smart formatting.

    Note: Applies to US English, Japanese, and Spanish transcription only.

    See Smart formatting.

    Default: false

  • If true, the response includes labels that identify which words were spoken by which participants in a multi-person exchange. By default, the service returns no speaker labels. Setting speaker_labels to true forces the timestamps parameter to be true, regardless of whether you specify false for the parameter.

    Note: Applies to US English, Japanese, and Spanish transcription only. To determine whether a language model supports speaker labels, you can also use the Get a model method and check that the attribute speaker_labels is set to true.

    See Speaker labels.

    Default: false

  • Deprecated. Use the language_customization_id parameter to specify the customization ID (GUID) of a custom language model that is to be used with the recognition request. Do not specify both parameters with a request.

  • The name of a grammar that is to be used with the recognition request. If you specify a grammar, you must also use the language_customization_id parameter to specify the name of the custom language model for which the grammar is defined. The service recognizes only strings that are recognized by the specified grammar; it does not recognize other custom words from the model's words resource. See Grammars.

  • If true, the service redacts, or masks, numeric data from final transcripts. The feature redacts any number that has three or more consecutive digits by replacing each digit with an X character. It is intended to redact sensitive numeric data, such as credit card numbers. By default, the service performs no redaction.

    When you enable redaction, the service automatically enables smart formatting, regardless of whether you explicitly disable that feature. To ensure maximum security, the service also disables keyword spotting (ignores the keywords and keywords_threshold parameters) and returns only a single final transcript (forces the max_alternatives parameter to be 1).

    Note: Applies to US English, Japanese, and Korean transcription only.

    See Numeric redaction.

    Default: false

The audio to transcribe.

The Recognize options.

The recognize options.

parameters

  • The audio to transcribe.

  • The format (MIME type) of the audio. For more information about specifying an audio format, see Audio formats (content types) in the method description.

    Allowable values: [application/octet-stream,audio/basic,audio/flac,audio/g729,audio/l16,audio/mp3,audio/mpeg,audio/mulaw,audio/ogg,audio/ogg;codecs=opus,audio/ogg;codecs=vorbis,audio/wav,audio/webm,audio/webm;codecs=opus,audio/webm;codecs=vorbis]

  • The identifier of the model that is to be used for the recognition request. See Languages and models.

    Allowable values: [en-US_BroadbandModel,en-US_NarrowbandModel,en-US_ShortForm_NarrowbandModel,es-ES_BroadbandModel,es-ES_NarrowbandModel,fr-FR_BroadbandModel,fr-FR_NarrowbandModel,ja-JP_BroadbandModel,ja-JP_NarrowbandModel,ko-KR_BroadbandModel,ko-KR_NarrowbandModel]

    Default: en-US_BroadbandModel

  • The customization ID (GUID) of a custom language model that is to be used with the recognition request. The base model of the specified custom language model must match the model specified with the model parameter. You must make the request with credentials for the instance of the service that owns the custom model. By default, no custom language model is used. See Custom models.

    Note: Use this parameter instead of the deprecated customization_id parameter.

  • The customization ID (GUID) of a custom acoustic model that is to be used with the recognition request. The base model of the specified custom acoustic model must match the model specified with the model parameter. You must make the request with credentials for the instance of the service that owns the custom model. By default, no custom acoustic model is used. See Custom models.

  • The version of the specified base model that is to be used with the recognition request. Multiple versions of a base model can exist when a model is updated for internal improvements. The parameter is intended primarily for use with custom models that have been upgraded for a new base model. The default value depends on whether the parameter is used with or without a custom model. See Base model version.

  • If you specify the customization ID (GUID) of a custom language model with the recognition request, the customization weight tells the service how much weight to give to words from the custom language model compared to those from the base model for the current request.

    Specify a value between 0.0 and 1.0. Unless a different customization weight was specified for the custom model when it was trained, the default value is 0.3. A customization weight that you specify overrides a weight that was specified when the custom model was trained.

    The default value yields the best performance in general. Assign a higher value if your audio makes frequent use of OOV words from the custom model. Use caution when setting the weight: a higher value can improve the accuracy of phrases from the custom model's domain, but it can negatively affect performance on non-domain phrases.

    See Custom models.

  • The time in seconds after which, if only silence (no speech) is detected in streaming audio, the connection is closed with a 400 error. The parameter is useful for stopping audio submission from a live microphone when a user simply walks away. Use -1 for infinity. See Inactivity timeout.

  • An array of keyword strings to spot in the audio. Each keyword string can include one or more string tokens. Keywords are spotted only in the final results, not in interim hypotheses. If you specify any keywords, you must also specify a keywords threshold. You can spot a maximum of 1000 keywords. Omit the parameter or specify an empty array if you do not need to spot keywords. See Keyword spotting.

  • A confidence value that is the lower bound for spotting a keyword. A word is considered to match a keyword if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. If you specify a threshold, you must also specify one or more keywords. The service performs no keyword spotting if you omit either parameter. See Keyword spotting.

  • The maximum number of alternative transcripts that the service is to return. By default, the service returns a single transcript. See Maximum alternatives.

  • A confidence value that is the lower bound for identifying a hypothesis as a possible word alternative (also known as "Confusion Networks"). An alternative word is considered if its confidence is greater than or equal to the threshold. Specify a probability between 0.0 and 1.0. By default, the service computes no alternative words. See Word alternatives.

  • If true, the service returns a confidence measure in the range of 0.0 to 1.0 for each word. By default, the service returns no word confidence scores. See Word confidence.

    Default: false

  • If true, the service returns time alignment for each word. By default, no timestamps are returned. See Word timestamps.

    Default: false

  • If true, the service filters profanity from all output except for keyword results by replacing inappropriate words with a series of asterisks. Set the parameter to false to return results with no censoring. Applies to US English transcription only. See Profanity filtering.

    Default: true

  • If true, the service converts dates, times, series of digits and numbers, phone numbers, currency values, and internet addresses into more readable, conventional representations in the final transcript of a recognition request. For US English, the service also converts certain keyword strings to punctuation symbols. By default, the service performs no smart formatting.

    Note: Applies to US English, Japanese, and Spanish transcription only.

    See Smart formatting.

    Default: false

  • If true, the response includes labels that identify which words were spoken by which participants in a multi-person exchange. By default, the service returns no speaker labels. Setting speaker_labels to true forces the timestamps parameter to be true, regardless of whether you specify false for the parameter.

    Note: Applies to US English, Japanese, and Spanish transcription only. To determine whether a language model supports speaker labels, you can also use the Get a model method and check that the attribute speaker_labels is set to true.

    See Speaker labels.

    Default: false

  • Deprecated. Use the language_customization_id parameter to specify the customization ID (GUID) of a custom language model that is to be used with the recognition request. Do not specify both parameters with a request.

  • The name of a grammar that is to be used with the recognition request. If you specify a grammar, you must also use the language_customization_id parameter to specify the name of the custom language model for which the grammar is defined. The service recognizes only strings that are recognized by the specified grammar; it does not recognize other custom words from the model's words resource. See Grammars.

  • If true, the service redacts, or masks, numeric data from final transcripts. The feature redacts any number that has three or more consecutive digits by replacing each digit with an X character. It is intended to redact sensitive numeric data, such as credit card numbers. By default, the service performs no redaction.

    When you enable redaction, the service automatically enables smart formatting, regardless of whether you explicitly disable that feature. To ensure maximum security, the service also disables keyword spotting (ignores the keywords and keywords_threshold parameters) and returns only a single final transcript (forces the max_alternatives parameter to be 1).

    Note: Applies to US English, Japanese, and Korean transcription only.

    See Numeric redaction.

    Default: false

Response

The complete results for a speech recognition request.
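
The sketch below, which continues the hypothetical Python client shown earlier, illustrates the general shape in which those results are returned and how a transcript and its confidence can be read out of them; confirm the exact fields against the responses from your own deployment.

import json

# 'response' is the dictionary returned by recognize(...).get_result() above.
print(json.dumps(response, indent=2))

for result in response.get('results', []):
    best = result['alternatives'][0]   # highest-ranked alternative
    print(best['transcript'], best.get('confidence'))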

Status Code

  • 200 OK. The request succeeded.

  • 400 Bad Request. The request failed because of a user input error. For example, the request passed audio that does not match the indicated format or failed to specify a required audio format; specified a custom language or custom acoustic model that is not in the available state; or experienced an inactivity timeout. Specific messages include:

    • Model {model} not found
    • Requested model is not available
    • This 8000hz audio input requires a narrow band model. See /v1/models for a list of available models.
    • speaker_labels is not a supported feature for model {model}
    • You cannot specify both 'customization_id' and 'language_customization_id' parameter!
    • No speech detected for 30s
    • Unable to transcode data stream application/octet-stream -> audio/x-float-array
    • Stream was {number} bytes but needs to be at least 100 bytes.
  • 406 Not Acceptable. The request specified an Accept header with an incompatible content type.

  • 408 Request Timeout. The connection was closed after 30 seconds of inactivity (session timeout).

  • 413 Payload Too Large. The request passed an audio file that exceeded the currently supported data limit.

  • 415 Unsupported Media Type. The request specified an unacceptable media type.

  • 500 Internal Server Error. The service experienced an internal error.

  • 503 Service Unavailable. The service is currently unavailable.
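
In the Python SDK, non-2xx status codes such as these surface as exceptions. A minimal sketch, assuming the hypothetical client above and the SDK's ApiException type:

from ibm_watson import ApiException

try:
    with open('audio-sample.flac', 'rb') as audio_file:
        result = speech_to_text.recognize(
            audio=audio_file,
            content_type='audio/flac').get_result()
except ApiException as ex:
    # ex.code carries the HTTP status code (for example, 400 or 413);
    # ex.message carries the error message returned by the service.
    print('Recognition failed: {0} {1}'.format(ex.code, ex.message))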

Register a callback

Registers a callback URL with the service for use with subsequent asynchronous recognition requests. If the callback URL is not already registered, the service attempts to register, or white-list, it by sending a GET request to the callback URL. The service passes a random alphanumeric challenge string via the challenge_string parameter of the request. The request includes an Accept header that specifies text/plain as the required response type.

To be registered successfully, the callback URL must respond to the GET request from the service. The response must send status code 200 and must include the challenge string in its body. Set the Content-Type response header to text/plain. Upon receiving this response, the service responds to the original registration request with response code 201.

The service sends only a single GET request to the callback URL. If the service does not receive a reply with a response code of 200 and a body that echoes the challenge string sent by the service within five seconds, it does not white-list the URL; it instead sends status code 400 in response to the Register a callback request. If the requested callback URL is already white-listed, the service responds to the initial registration request with response code 200.

If you specify a user secret with the request, the service uses it as a key to calculate an HMAC-SHA1 signature of the challenge string in its response to the POST request. It sends this signature in the X-Callback-Signature header of its GET request to the URL during registration. It also uses the secret to calculate a signature over the payload of every callback notification that uses the URL. The signature provides authentication and data integrity for HTTP communications.

After you successfully register a callback URL, you can use it with an indefinite number of recognition requests. You can register a maximum of 20 callback URLs in a one-hour span of time.

See also: Registering a callback URL.
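
The verification handshake can be sketched with only the Python standard library. The handler below echoes the challenge_string query parameter as text/plain so that the URL can be white-listed, and it checks X-Callback-Signature on the assumption that the signature is a base64-encoded HMAC-SHA1 digest of the challenge string; the port, secret, and signature encoding are assumptions to verify for your environment.

import base64
import hashlib
import hmac
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

USER_SECRET = b'example-user-secret'  # same value passed as user_secret at registration

class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        query = parse_qs(urlparse(self.path).query)
        challenge = query.get('challenge_string', [''])[0]

        # Optionally verify that the request really came from the service.
        # Encoding of the signature is assumed to be base64.
        expected = base64.b64encode(
            hmac.new(USER_SECRET, challenge.encode('utf-8'), hashlib.sha1).digest()
        ).decode('ascii')
        received = self.headers.get('X-Callback-Signature', '')
        if received and not hmac.compare_digest(received, expected):
            self.send_error(400, 'Bad signature')
            return

        # Echo the challenge string as plain text with status 200.
        body = challenge.encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == '__main__':
    HTTPServer(('', 8080), CallbackHandler).serve_forever()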

POST /v1/register_callback
(speechToText *SpeechToTextV1) RegisterCallback(registerCallbackOptions *RegisterCallbackOptions) (*core.DetailedResponse, error)
ServiceCall<RegisterStatus> registerCallback(RegisterCallbackOptions registerCallbackOptions)
registerCallback(params, [ callback() ])
register_callback(self, callback_url, user_secret=None, **kwargs)
register_callback(callback_url:, user_secret: nil)
func registerCallback(
    callbackURL: String,
    userSecret: String? = nil,
    headers: [String: String]? = nil,
    completionHandler: @escaping (WatsonResponse<RegisterStatus>?, WatsonError?) -> Void)
Request

Instantiate the RegisterCallbackOptions struct and set the fields to provide parameter values for the RegisterCallback method.

Use the RegisterCallbackOptions.Builder to create a RegisterCallbackOptions object that contains the parameter values for the registerCallback method.

Query Parameters

  • An HTTP or HTTPS URL to which callback notifications are to be sent. To be white-listed, the URL must successfully echo the challenge string during URL verification. During verification, the client can also check the signature that the service sends in the X-Callback-Signature header to verify the origin of the request.

  • A user-specified string that the service uses to generate the HMAC-SHA1 signature that it sends via the X-Callback-Signature header. The service includes the header during URL verification and with every notification sent to the callback URL. It calculates the signature over the payload of the notification. If you omit the parameter, the service does not send the header.

The RegisterCallback options.

The registerCallback options.
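
A minimal registration sketch with the hypothetical Python client used earlier; the callback URL and user secret are placeholders, and the user secret must match the one that the callback endpoint uses to verify signatures.

status = speech_to_text.register_callback(
    callback_url='https://example.com/stt-callback',
    user_secret='example-user-secret').get_result()

# 'created' for a newly white-listed URL (HTTP 201);
# 'already created' if the URL was registered earlier (HTTP 200).
print(status['status'], status['url'])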

Response

Information about a request to register a callback for asynchronous speech recognition.

Status Code

  • 200 OK. The callback was already registered (white-listed). The status included in the response is already created.

  • 201 Created. The callback was successfully registered (white-listed). The status included in the response is created.

  • 400 Bad Request. The callback registration failed. The request was missing a required parameter or specified an invalid argument; the client sent an invalid response to the service's GET request during the registration process; or the client failed to respond to the service's request before the five-second timeout.

  • 503 Service Unavailable. The service is currently unavailable.

Unregister a callback

Unregisters a callback URL that was previously white-listed with a Register a callback request for use with the asynchronous interface. Once unregistered, the URL can no longer be used with asynchronous recognition requests.

See also: Unregistering a callback URL.
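
A corresponding sketch with the same hypothetical Python client; the method returns no response body.

speech_to_text.unregister_callback(
    callback_url='https://example.com/stt-callback')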

POST /v1/unregister_callback
(speechToText *SpeechToTextV1) UnregisterCallback(unregisterCallbackOptions *UnregisterCallbackOptions) (*core.DetailedResponse, error)
ServiceCall<Void> unregisterCallback(UnregisterCallbackOptions unregisterCallbackOptions)
unregisterCallback(params, [ callback() ])
unregister_callback(self, callback_url, **kwargs)
unregister_callback(callback_url:)
func unregisterCallback(
    callbackURL: String,
    headers: [String: String]? = nil,
    completionHandler: @escaping (WatsonResponse<Void>?, WatsonError?) ->