watsonx.ai | IBM Cloud API Docs

Introduction to IBM watsonx.ai software

Last updated: 2025-06-10

Using the IBM watsonx.ai software APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).

If you are looking for the IBM watsonx.ai as a Service APIs, see here.

Step-by-step instructions on how to use IBM watsonx.ai software can be found here.

A specialized Python library is available to access this REST API.

Endpoint URLs

The base URLs for API endpoints come from the cluster and add-on service instance. The URL follows this pattern:

https://{cluster_url}/ml/v1

{cluster_url} represents the name or IP address of your deployed cluster. Use a hostname that resolves to an IP address in the cluster.

To find the base URL, view the details for the service instance from the Cloud Pak for Data web client.

For prompts, notebooks, and vector indexes, the base URL ends in /wx instead of /ml/v1:

https://{cluster_url}/wx

Use that URL in your requests to the API.

Endpoint example

curl -k -X {request_method} -H "Authorization: Bearer {token}" "https://{cluster_url}/ml/v1/text/generation"
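
The URL pattern above can be sketched as a small helper. This is a minimal illustration rather than part of the API: the function name `base_url` and the `area` argument are hypothetical, and the host `cpd.example.com` in the usage note is a placeholder for your own cluster.

```python
from urllib.parse import urlsplit


def base_url(cluster_url: str, area: str = "ml") -> str:
    """Return the base URL for one of the two path prefixes described above.

    area="ml" -> https://{cluster_url}/ml/v1
    area="wx" -> https://{cluster_url}/wx (prompts, notebooks, vector indexes)
    """
    host = cluster_url.strip().rstrip("/")
    # Accept either a bare hostname/IP or a full https:// URL.
    if "://" in host:
        host = urlsplit(host).netloc
    if not host:
        raise ValueError("cluster_url must not be empty")
    paths = {"ml": "/ml/v1", "wx": "/wx"}
    if area not in paths:
        raise ValueError(f"unknown area: {area!r}")
    return f"https://{host}{paths[area]}"
```

For example, `base_url("cpd.example.com") + "/text/generation"` yields the URL used in the endpoint example.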

Disabling SSL verification

Watson Machine Learning uses Secure Sockets Layer (SSL) (or Transport Layer Security (TLS)) for secure connections between the client and server. The connection is verified against the local certificate store to ensure authentication, integrity, and confidentiality.

If you use a self-signed certificate, you need to disable SSL verification to make a successful connection.

Enabling SSL verification is highly recommended. Disabling SSL verification jeopardizes the security of the connection and data. Disable it only if necessary, and take steps to re-enable it as soon as possible.

To disable SSL verification for a curl request, use the --insecure (-k) option with the request.
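
For clients written in Python rather than curl, the same choice can be sketched with the standard `ssl` module. The helper name `make_ssl_context` and its arguments are hypothetical. Passing a CA file that contains the cluster's certificate is an alternative to disabling verification; whether that works for your deployment is an assumption to check.

```python
import ssl
from typing import Optional


def make_ssl_context(ca_file: Optional[str] = None, verify: bool = True) -> ssl.SSLContext:
    """Build an SSL context with verification on (default) or off."""
    if verify:
        # Verification stays on. If ca_file is given, certificates are
        # checked against that file instead of the system store.
        return ssl.create_default_context(cafile=ca_file)
    # Equivalent in spirit to curl --insecure (-k): no certificate checks.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be cleared before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

The returned context can be passed to `urllib.request.urlopen(request, context=ctx)`.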

Authentication

A bearer token is required to use any of the watsonx.ai APIs.

For more information, see the Authorization section of the Platform API reference.

Use the value of the access_token property from the example request. Set the access_token value as the authorization header parameter for requests to the APIs. The format is Authorization: Bearer {access_token_value}:

Authorization: Bearer eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6IlJTMjU2In0...

Example request that uses a username and password to retrieve the token

curl -k -X POST "https://{cluster_url}/icp4d-api/v1/authorize"   -H "cache-control: no-cache"   -H "content-type: application/json"   -d "{\"username\":\"admin\",\"password\":\"password\"}"

Response

{
  "username": "admin",
  "role": "Admin",
  "permissions": [
    "administrator"
  ],
  "sub": "admin",
  "iss": "KNOXSSO",
  "aud": "DSX",
  "uid": "999",
  "authenticator": "default",
  "access_token": "eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6...",
  "_messageCode_": "success"
}
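
The token exchange can be sketched in Python with the standard library only. The function names are hypothetical. The request mirrors the curl example, and the header is built from the `access_token` property shown in the response.

```python
import json
import urllib.request


def build_authorize_request(cluster_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST to /icp4d-api/v1/authorize."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"https://{cluster_url}/icp4d-api/v1/authorize",
        data=body,
        headers={"cache-control": "no-cache", "content-type": "application/json"},
        method="POST",
    )


def bearer_header(authorize_response: dict) -> dict:
    """Turn the parsed authorize response into an Authorization header."""
    token = authorize_response.get("access_token")
    if not token:
        raise ValueError("response has no access_token property")
    return {"Authorization": f"Bearer {token}"}
```

To send the request, pass it to `urllib.request.urlopen` and parse the body with `json.load` before calling `bearer_header`.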

Error handling

This API uses standard HTTP response codes to indicate whether a method completed successfully. A 2xx response code indicates success.

HTTP Code	Description	Recovery
`200`	Success	The request was successful.
`400`	Bad Request	The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body.
`401`	Unauthorized	You are not authorized to make this request. Log in and try again or provide a valid token. For more information about logging in, see the Authentication section. If this error persists, contact the account owner to check your permissions.
`403`	Forbidden	The supplied authentication is not authorized.
`404`	Not Found	The requested resource could not be found.

Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.

Error response

Name	Description
trace	An identifier that can be used to trace the request. This can be set using `X-Global-Transaction-Id`.
errors	The list of errors.

Errors

Name	Description
code	A simple string code that should convey the general sense of the error.
message	The message that describes the error.
more_info	A reference to a more detailed explanation when available.
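
As an illustration of the error response fields above, the following sketch flattens an error body into one log line. The function names are hypothetical. Treating 429 and 503 as candidates for a retry is an assumption based on the note about overloaded models, not something this reference prescribes.

```python
def summarize_error(status_code: int, body: dict) -> str:
    """Format the trace and errors fields of an error response as one line."""
    trace = body.get("trace", "unknown")
    parts = []
    for err in body.get("errors", []):
        text = f"{err.get('code', 'unknown')}: {err.get('message', '')}"
        if err.get("more_info"):
            text += f" (see {err['more_info']})"
        parts.append(text)
    detail = "; ".join(parts) or "no error details"
    return f"HTTP {status_code} [trace {trace}] {detail}"


def is_retryable(status_code: int) -> bool:
    """Assumption: 429 and 503 may clear up if the call is retried later."""
    return status_code in (429, 503)
```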

Additional headers

Some additional headers might be required to make successful requests to the API. Those additional headers are described below.

An optional transaction ID can be passed to your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.

If no transaction ID is passed in, one is generated randomly.
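
A minimal sketch of setting the header on the client side follows. The helper name is hypothetical, and using a UUID as the value is only one choice, since any value is accepted.

```python
import uuid
from typing import Optional


def with_transaction_id(headers: dict, transaction_id: Optional[str] = None) -> dict:
    """Return a copy of headers with X-Global-Transaction-Id set."""
    merged = dict(headers)
    # Generate a random 32-character hex ID when the caller supplies none.
    merged["X-Global-Transaction-Id"] = transaction_id or uuid.uuid4().hex
    return merged
```

Reusing the same ID across related calls makes it easier to match them with the `trace` field of an error response.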

API change log

In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API. The change log lists changes that have been made, ordered by the date they were released. Changes to existing API versions are designed to be compatible with existing client applications; if a change is not compatible, a new version date is created.

14 March 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically.

Versioning

API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.

When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.

Active Version Dates

Version date	Summary of changes
`2024-03-14`	Publication of the `/ml/v1` APIs.
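
The version parameter can be attached with a small helper such as the one below. The function name is hypothetical, and the strict format check is a client-side convenience rather than documented server behavior.

```python
import re
from datetime import date
from urllib.parse import urlencode


def with_version(url: str, version: str, **params: str) -> str:
    """Append version=YYYY-MM-DD (plus any other query parameters) to url."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("version must have the form YYYY-MM-DD")
    date.fromisoformat(version)  # raises ValueError for dates like 2024-13-40
    query = urlencode({**params, "version": version})
    return f"{url}?{query}"
```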

Data References

Accessing data in a remote location (such as a Cloud Object Storage bucket or an SQL/NoSQL database) requires a connection_asset or data_asset reference type. These reference types are created within a space or a project and are referenced in requests to represent input data and results locations. They contain two parameter objects, connection and location, whose required values depend on the reference type. A data_asset requires an href in the location object, whereas a connection_asset requires the connection ID in the connection object and location fields that depend on the data source type.

Example connection_asset payload:

{
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": {
        "id": "<connection_guid>"
      },
      "location": {
        "<wdp-properties depending on the type>": "<value depending on the type>"
      }
    }
  ]
}

Example data_asset payload:

{
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  ]
}

Example fs payload:

project_id

{
  "training_data_references": [
    {
      "type":"fs",
      "location":{
        "path":"/projects/<project_id>/assets/<fs_path>"
      }
    }
  ]
}

space_id

{
  "training_data_references": [
    {
      "type":"fs",
      "location":{
        "path":"/spaces/<space_id>/assets/<fs_path>"
      }
    }
  ]
}
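
The three reference shapes above can be produced with small builder functions. This is a sketch with hypothetical function names. The location keys for a connection_asset depend on the data source type, so the caller supplies them as-is.

```python
def data_asset_ref(asset_id: str, space_id: str) -> dict:
    """Reference a data asset by href, as in the data_asset payload."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def connection_asset_ref(connection_id: str, location: dict) -> dict:
    """Reference data through a connection; location fields vary by source."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def fs_ref(scope: str, scope_id: str, fs_path: str) -> dict:
    """Reference a file system path under /projects/... or /spaces/..."""
    if scope not in ("projects", "spaces"):
        raise ValueError("scope must be 'projects' or 'spaces'")
    return {
        "type": "fs",
        "location": {"path": f"/{scope}/{scope_id}/assets/{fs_path}"},
    }
```

A request body then wraps one or more of these, for example `{"training_data_references": [fs_ref("spaces", space_id, "data/train.csv")]}`, where the file path is a placeholder.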

Create a new AI service with the given payload. An AI service is code that can be deployed as a deployment.

POST /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

AIServiceRequest

Payload for creating the AI service. Either space_id or project_id must be provided.

Examples:

Create request AI Service

curl --request POST 'https://{cluster_url}/ml/v4/ai_services?version=2024-10-17' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
-d '{
  "name": "ai-service-1",
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "software_spec": {
    "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
  },
  "documentation": {
    "request": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "parameters": {
            "properties": {
              "max_new_tokens": {
                "type": "integer"
              },
              "top_p": {
                "type": "number"
              }
            },
            "required": [
              "max_new_tokens",
              "top_p"
            ]
          }
        },
        "required": [
          "query"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "image": {
            "type": "string",
            "format": "binary"
          }
        },
        "required": [
          "image"
        ]
      }
    },
    "response": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "result": {
            "type": "string"
          }
        },
        "required": [
          "query",
          "result"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "string",
        "format": "binary"
      }
    }
  }
}'
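
When the request body is assembled in code rather than written by hand, a builder along these lines can enforce the space_id or project_id requirement before the call is made. The function name is hypothetical, and only the top-level fields that appear in the example request are covered.

```python
from typing import Optional


def ai_service_payload(
    name: str,
    software_spec_id: str,
    space_id: Optional[str] = None,
    project_id: Optional[str] = None,
    documentation: Optional[dict] = None,
) -> dict:
    """Build the JSON body for POST /ml/v4/ai_services."""
    if not (space_id or project_id):
        raise ValueError("either space_id or project_id must be provided")
    payload = {"name": name, "software_spec": {"id": software_spec_id}}
    if space_id:
        payload["space_id"] = space_id
    if project_id:
        payload["project_id"] = project_id
    if documentation is not None:
        # JSON Schemas for the request and response, keyed by media type.
        payload["documentation"] = documentation
    return payload
```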

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

201
AI service created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201: The AI service response.

The response with the result.

{
  "metadata": {
    "id": "b53c5118-b1ca-43ef-a597-ef839ff7129f",
    "name": "ai-app-1",
    "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "created_at": "2023-05-02T16:27:51Z"
  },
  "entity": {
    "software_spec": {
      "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
    },
    "documentation": {
      "request": {
        "application/json": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "query": {
              "type": "string"
            },
            "parameters": {
              "properties": {
                "max_new_tokens": {
                  "type": "integer"
                },
                "top_p": {
                  "type": "number"
                }
              },
              "required": [
                "max_new_tokens",
                "top_p"
              ]
            }
          },
          "required": [
            "query"
          ]
        },
        "application/png": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "image": {
              "type": "string",
              "format": "binary"
            }
          },
          "required": [
            "image"
          ]
        }
      },
      "response": {
        "application/json": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "query": {
              "type": "string"
            },
            "result": {
              "type": "string"
            }
          },
          "required": [
            "query",
            "result"
          ]
        },
        "application/png": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "string",
          "format": "binary"
        }
      }
    }
  }
}

Retrieve the AI services for the specified space or project.

GET /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
tag.value
string
Return only the resources with the given tag values. Separate multiple tags with "or" or "and".

Example: tf2.0 or tf2.1
search
string
Returns only resources that match this search string. The path to the field must be the complete path to the field, and this field must be one of the indexed fields for this resource type. Note that the search string must be URL encoded.

These are the fields that can be searched in the metadata:
- /metadata/name
Note that tags are filtered using the tag query parameter and the tag query parameter takes precedence over the search query parameter.

The metadata fields, on all assets, can be searched like this:
- /metadata/name=asset2 -> search=%2Fmetadata%2Fname%3Dasset2
These are the fields that can be searched in the entity and that depend on the asset type:
- model
  - /entity/type
  - /entity/software_spec.id
- function
  - /entity/software_spec.id
- ai_service
  - /entity/software_spec.id
The entity fields can be searched like this:

- /entity/type=tensorflow_2.14 -> search=%2Fentity%2Ftype%3Dtensorflow_2.14
Possible values: length ≥ 1
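
The URL encoding that the search parameter requires can be done with `urllib.parse.quote`. The helper name below is hypothetical. Passing `safe=""` makes sure that `/` and `=` are percent-encoded too.

```python
from urllib.parse import quote


def encode_search(field_path: str, value: str) -> str:
    """URL-encode a '<field path>=<value>' search expression."""
    return quote(f"{field_path}={value}", safe="")
```

For example, `encode_search("/metadata/name", "asset2")` produces the value to place after `search=` in the query string.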

Retrieve all AI services

curl --request GET 'https://{cluster_url}/ml/v4/ai_services?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service with the specified identifier. If the rev query parameter is provided, rev=latest fetches the latest revision and rev={revision_number} fetches the given revision. Either space_id or project_id must be provided.

GET /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.read

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
rev
string
The revision number of the resource.

Example: 2

Retrieve an AI service

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Update the AI service with the provided patch data. The following fields can be patched:

/tags
/name
/description
/custom

PATCH /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.update

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Request Body

Required*

JsonPatchOperation[]

Input For Patch. This is the patch body which corresponds to the JavaScript Object Notation (JSON) Patch standard (RFC 6902).

Update AI Services

curl --request PATCH "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '[
  {
    "op": "replace",
    "path": "/description",
    "value": "New Description"
  }
]'
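
A patch body can also be assembled programmatically. The sketch below limits operations to the four patchable fields listed for this method. The names `replace_op` and `patch_body` are hypothetical, and allowing sub-paths such as `/custom/key` is an assumption.

```python
import json

# Fields that this method allows to be patched.
PATCHABLE_PATHS = ("/tags", "/name", "/description", "/custom")


def replace_op(path: str, value) -> dict:
    """Build one RFC 6902 'replace' operation for a patchable field."""
    if not any(path == p or path.startswith(p + "/") for p in PATCHABLE_PATHS):
        raise ValueError(f"{path} is not a patchable field")
    return {"op": "replace", "path": path, "value": value}


def patch_body(*ops: dict) -> str:
    """Serialize the operations as the JSON array the endpoint expects."""
    return json.dumps(list(ops))
```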

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

200
AI service has been patched successfully
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the AI service with the specified identifier. This also deletes all revisions of the AI service and, for each revision, all attachments.

DELETE /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.delete

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Delete the AI service

curl --request DELETE "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Status Code

204
AI service deleted
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Upload the flow code. AI services expect a zip file that contains the code files that make up the flow.

PUT /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.add

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Request Body

Required*

application/gzip (binary)

A gzip file containing code files.

Upload the flow code

curl --request PUT "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
  -H "Content-Type: application/gzip"
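
The upload can be sketched in Python with the archive bytes as the request body. This is a sketch under assumptions: the function name is hypothetical, and how the archive should be packaged is not specified here, so the caller passes bytes that were prepared elsewhere.

```python
import urllib.request
from urllib.parse import urlencode


def build_code_upload_request(
    cluster_url: str,
    ai_service_id: str,
    archive: bytes,
    token: str,
    version: str,
    space_id: str,
) -> urllib.request.Request:
    """Prepare (but do not send) the PUT to /ml/v4/ai_services/{id}/code."""
    query = urlencode({"space_id": space_id, "version": version})
    return urllib.request.Request(
        f"https://{cluster_url}/ml/v4/ai_services/{ai_service_id}/code?{query}",
        data=archive,  # raw bytes of the code archive
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/gzip",
        },
        method="PUT",
    )
```

Pass the returned object to `urllib.request.urlopen` to perform the upload.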

Response

Status Code

201
AI service code uploaded
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Download the AI service code. It is possible to download the code for a given revision of the flow. AI services expect a zip file that contains the code files that make up the flow.

GET /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.read

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
rev
string
The revision number of the resource.

Example: 2

Download the AI service code

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&rev=1&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

binary

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new AI service revision. The current metadata and content for id are used to create the new revision. Either space_id or project_id must be provided.

POST /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.create

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

RevisionEntitySpaceProjectRequest

The details for the revision.

Examples:

Create new AI service revision

curl --request POST "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "commit_message": "New Code"
}'

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

201
AI service revision created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service revisions.

GET /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.list

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Retrieve AI service revisions

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
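
Token-based pagination with the start parameter can be sketched as follows. The description of start says the token is set in the href of the next field. The exact response shape, `{"next": {"href": "..."}}`, is an assumption here, and the function name is hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def next_start_token(page: dict) -> Optional[str]:
    """Extract the start token from the next.href of a list response.

    Returns None when there is no further page.
    """
    href = (page.get("next") or {}).get("href")
    if not href:
        return None
    values = parse_qs(urlsplit(href).query).get("start")
    return values[0] if values else None
```

A client would repeat the list call with `start` set to the returned token until the function returns `None`.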

Response

Status Code

200
AI service revisions
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Transcribe audio into text.

Since watsonx.ai 2.2.1.

POST /ml/v1/audio/transcriptions

Auditing

Calling this method generates the following auditing event.

pm-20.audio-transcriptions.send

Request

Form Parameters

model
Required*
string
The model to use for audio transcriptions.
file
Required*
string
The path to an MP3 or WAV audio file to transcribe.
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
language
string
Optional target language to which to transcribe; for example, fr for French. Default is English.

Response

Response Body

object

Audio transcriptions response fields.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: audio_transcriptions

An audio transcriptions example.

{
  "model": "openai/whisper-tiny",
  "text": "the ending was terrific.",
  "created_at": "2023-07-21T16:52:32.190Z",
  "token_count": 8
}

Create a new AutoAI RAG that will find the best RAG pattern from the data that is provided in the request.

Since watsonx.ai 2.1.0.

POST /ml/v1/autoai/rags

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

AutoAIRAGRequest

The details of the AutoAI RAG run with the data used to find the best RAG patterns.

Create AutoAI RAG job

curl --request POST 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
  "name": "AutoAI RAG Example",
  "description": "AutoAI RAG Example description",
  "parameters": {
    "constraints": {
      "max_number_of_rag_patterns": 4
    },
    "optimization": {
      "metrics": [
        "answer_correctness"
      ]
    },
    "output_logs": true
  },
  "project_id": "dc178286-21d1-4262-9000-e543cf4c7742",
  "input_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/4cc2f990-cd83-4e62-bd61-33b21605cf0e?project_id=dc178286-21d1-4262-9000-e543cf4c7742",
        "id": "4cc2f990-cd83-4e62-bd61-33b21605cf0e"
      }
    }
  ],
  "test_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/d0d1607f-1ac1-4a88-8098-c5c8b6e4b78a?project_id=dc178286-21d1-4262-9000-e543cf4c7742",
        "id": "d0d1607f-1ac1-4a88-8098-c5c8b6e4b78a"
      }
    }
  ],
  "results_reference": {
    "type": "fs",
    "location": {
      "path": "/projects/dc178286-21d1-4262-9000-e543cf4c7742/assets/auto_ml/auto_ml.e274332a-cd3f-4d31-83bc-5072d6dfb535/wml_data"
    }
  },
  "hardware_spec": {
    "id": "a6c4923b-b8e4-444c-9f43-8a7ec3020110",
    "name": "L"
  }
}'

Response

Response Body

AutoAIRAGResponse

The response of an AutoAI RAG run.

Status Code

201
Created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of AutoAI RAG requests for the specified space or project.

This operation does not save the history; any requests that were deleted or purged do not appear in this list.

Since watsonx.ai 2.1.0.

GET /ml/v1/autoai/rags

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Retrieve the AutoAI RAG runs

curl --request GET 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Accept: application/json'

Response

Response Body

AutoRAGResultResources

A paginated list of training definitions.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Get the results of an AutoAI RAG run, or details if the job failed.

Since watsonx.ai 2.1.0.

GET /ml/v1/autoai/rags/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Get AutoAI RAG job

curl --request GET 'https://{cluster_url}/ml/v1/autoai/rags/{id}?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Accept: application/json'

Response

Response Body

AutoAIRAGResponse

The response of an AutoAI RAG run.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "description": "My autoai rag experiment for 2023 financial documents",
    "name": "AutoAI RAG"
  },
  "entity": {
    "timestamp": "2023-09-22T02:52:03.324Z",
    "hardware_spec": {
      "id": "c076e82c-b2a7-4d20-9c0f-1f0c2fdf5a24",
      "name": "L"
    },
    "parameters": {
      "constraints": {
        "embedding_models": [
          "ibm/slate-125m-english-rtrvr"
        ],
        "generation": {
          "foundation_models": [
            {
              "model_id": "meta-llama/llama-3-3-70b-instruct"
            },
            {
              "model_id": "mistralai/mixtral-8x7b-instruct-v01"
            }
          ]
        },
        "max_number_of_rag_patterns": 8
      },
      "optimization": {
        "metrics": [
          "answer_correctness"
        ]
      },
      "output_logs": true
    },
    "input_data_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "path": "files/document.pdf"
        }
      }
    ],
    "test_data_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "path": "files/qa_document.json"
        }
      }
    ],
    "vector_store_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        }
      }
    ],
    "results_reference": {
      "type": "container",
      "location": {
        "path": "results_autoai",
        "training": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5",
        "training_status": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training-status.json",
        "assets_path": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets",
        "training_log": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training.log"
      }
    },
    "results": [
      {
        "metrics": {
          "test_data": [
            {
              "metric_name": "answer_correctness",
              "mean": 0.51,
              "ci_high": 0.68,
              "ci_low": 0.43
            }
          ]
        },
        "context": {
          "rag_pattern": {
            "composition_steps": [
              "vector_store",
              "chunking",
              "embeddings",
              "retrieval",
              "generation"
            ],
            "location": {
              "evaluation_results": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/evaluation_results.json",
              "indexing_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/indexing_notebook.ipynb",
              "inference_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_notebook.ipynb",
              "inference_service_code": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_service_code.gz",
              "inference_service_metadata": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_service_metadata.json"
            },
            "name": "Pattern 1",
            "settings": {
              "vector_store": {
                "datasource_type": "milvus",
                "index_name": "autoai_rag_1234_iteration_5_index",
                "distance_metric": "euclidean",
                "operation": "upsert",
                "schema": {
                  "id": "autoai_rag_1.0.0",
                  "name": "AutoAI RAG document schema",
                  "type": "struct",
                  "fields": [
                    {
                      "name": "text",
                      "description": "text field",
                      "type": "string",
                      "role": "text"
                    },
                    {
                      "name": "document_id",
                      "description": "document name field",
                      "type": "string",
                      "role": "document_name"
                    },
                    {
                      "name": "start_index",
                      "description": "chunk starting token position in the source document",
                      "type": "number",
                      "role": "start_index"
                    },
                    {
                      "name": "sequence_number",
                      "description": "chunk number per document",
                      "type": "number",
                      "role": "sequence_number"
                    },
                    {
                      "name": "vector",
                      "description": "vector embeddings",
                      "type": "array",
                      "role": "vector_embeddings"
                    }
                  ]
                }
              },
              "chunking": {
                "method": "recursive",
                "chunk_size": 256,
                "chunk_overlap": 64
              },
              "embeddings": {
                "truncate_strategy": "left",
                "truncate_input_tokens": 384,
                "model_id": "ibm/slate-125m-english-rtrvr"
              },
              "retrieval": {
                "method": "simple",
                "number_of_chunks": 5
              },
              "generation": {
                "model_id": "meta-llama/llama-3-1-70b-instruct",
                "prompt_template_text": "Answer the following questions based on provided context:\\n ...",
                "context_template_text": "[Document]\n{document}\n[End]",
                "word_to_token_ratio": 2.2
              }
            }
          },
          "iteration": 1,
          "max_combinations": 160
        }
      }
    ],
    "status": {
      "state": "running",
      "step": "vector_store",
      "message": {
        "level": "info",
        "text": "Pipeline 1 of 8 is completed."
      },
      "running_at": "2023-08-04T13:22:48.000Z"
    }
  }
}
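A client usually wants the best-scoring pattern from the results array. The following Python sketch shows one way to select it, assuming the field names in the sample response (entity.results, metrics.test_data, context.rag_pattern.name). The second pattern and its score are invented for illustration and are not part of the sample.

```python
def best_pattern(run: dict, metric: str = "answer_correctness"):
    """Return (pattern_name, mean) for the highest mean of `metric`, or None."""
    best = None
    for result in run.get("entity", {}).get("results", []):
        name = result.get("context", {}).get("rag_pattern", {}).get("name")
        for entry in result.get("metrics", {}).get("test_data", []):
            if entry.get("metric_name") != metric:
                continue
            if best is None or entry["mean"] > best[1]:
                best = (name, entry["mean"])
    return best


# Trimmed response body; "Pattern 2" and its mean are hypothetical values.
run = {
    "entity": {
        "results": [
            {
                "metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.51}]},
                "context": {"rag_pattern": {"name": "Pattern 1"}},
            },
            {
                "metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.62}]},
                "context": {"rag_pattern": {"name": "Pattern 2"}},
            },
        ]
    }
}
print(best_pattern(run))
```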

Cancel or delete the specified AutoAI RAG run. Once deleted, all trace of the run job is gone.

Since watsonx.ai 2.1.0.

DELETE /ml/v1/autoai/rags/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.
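The query parameters above can be assembled as in the following Python sketch. It assumes that exactly one of space_id or project_id is sent and that hard_delete is serialized as the string true. The helper name and the host are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def rag_delete_url(cluster_url: str, run_id: str, version: str,
                   space_id: Optional[str] = None,
                   project_id: Optional[str] = None,
                   hard_delete: bool = False) -> str:
    """Build the DELETE URL for an AutoAI RAG run."""
    # Assumption: exactly one of space_id / project_id is supplied.
    if (space_id is None) == (project_id is None):
        raise ValueError("provide exactly one of space_id or project_id")
    query = {"version": version}
    if space_id is not None:
        query["space_id"] = space_id
    else:
        query["project_id"] = project_id
    if hard_delete:
        # Only sent when the job or request metadata should be removed too.
        query["hard_delete"] = "true"
    return f"https://{cluster_url}/ml/v1/autoai/rags/{run_id}?{urlencode(query)}"


url = rag_delete_url(
    "cluster.example.com",  # placeholder for {cluster_url}
    "RUN_ID",               # placeholder for {id}
    "2024-10-17",
    space_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
    hard_delete=True,
)
print(url)
```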

Cancel or delete an AutoAI RAG run

curl --request DELETE "https://{cluster_url}/ml/v1/autoai/rags/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Status Code

204
Deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the custom foundation models.

To deploy a custom foundation model using one of the models in this list, complete the following steps:

Create a model asset, in a space or a project, by providing the custom foundation model details as shown below:

curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{
            "name": "replace_with_a_meaningful_name",
            "space_id": "replace_with_your_space_id",
            "foundation_model": {
              "model_id": "replace_with_your_model_id"
            },
            "type": "custom_foundation_model_1.0",
            "software_spec": {
              "name": "watsonx-cfm-caikit-1.0"
            }
          }'

Notes:

The model type must be custom_foundation_model_1.0.
The software spec name must be watsonx-cfm-caikit-1.0.

Create a custom foundation model deployment as described in the create deployment documentation.
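The model asset body from the first step can also be built programmatically. The following Python sketch pins the two values that the notes require and leaves the rest to the caller. The function name is hypothetical, and the name, space id, and model id in the usage lines are placeholders.

```python
import json

# Fixed values required by the notes above.
MODEL_TYPE = "custom_foundation_model_1.0"
SOFTWARE_SPEC_NAME = "watsonx-cfm-caikit-1.0"


def custom_model_asset(name: str, space_id: str, model_id: str) -> dict:
    """Build the request body for POST /ml/v4/models."""
    return {
        "name": name,
        "space_id": space_id,
        "foundation_model": {"model_id": model_id},
        "type": MODEL_TYPE,
        "software_spec": {"name": SOFTWARE_SPEC_NAME},
    }


body = custom_model_asset(
    "my_flan_asset",                         # placeholder name
    "63dc4cf1-252f-424b-b52d-5cdd9814987f",  # placeholder space id
    "my_flan_t5_xl",                         # a model_id returned by this endpoint
)
print(json.dumps(body, indent=2))
```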

Since watsonx.ai 1.1.x.

GET /ml/v4/custom_foundation_models

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The end user cannot determine this token. The service generates it and sets it in the href of the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Get custom models

curl --request GET 'https://{cluster_url}/ml/v4/custom_foundation_models?version=2023-05-02&limit=10' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Accept: application/json'

Response

Response Body

CustomFoundationModelResources

Pagination information and list of models and common parameters.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: The custom foundation models.

The list of custom foundation models that were created and registered.

{
  "total_count": 1,
  "limit": 10,
  "first": {
    "href": "https://{cluster_url}/ml/v4/custom_foundation_models"
  },
  "resources": [
    {
      "model_id": "my_flan_t5_xl",
      "description": "A tuned version of flan_t5_xl",
      "tags": [
        "flan_t5_xl"
      ],
      "parameters": [
        {
          "name": "max_batch_weight",
          "display_name": "Maximum batch weight",
          "default": 10000,
          "description": "The maximum batch weight that is allowed for this model.",
          "type": "number",
          "min": 0,
          "max": 100000
        }
      ]
    }
  ],
  "parameters": [
    {
      "name": "max_batch_weight",
      "display_name": "Maximum batch weight",
      "default": 1000,
      "description": "The maximum batch weight that is allowed for all models.",
      "type": "number",
      "min": 0,
      "max": 10000
    }
  ]
}
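Each entry in parameters carries min and max bounds, so a client can check a value before it sends a deployment request. The following is a minimal Python sketch under the assumption that every parameter of interest declares both bounds, as in the sample. The function name is hypothetical.

```python
def parameter_in_range(model: dict, name: str, value: float) -> bool:
    """Check `value` against the min/max that the model declares for `name`."""
    for param in model.get("parameters", []):
        if param.get("name") == name:
            return param["min"] <= value <= param["max"]
    raise KeyError(f"model does not declare parameter {name!r}")


# One entry of `resources`, trimmed from the sample response.
model = {
    "model_id": "my_flan_t5_xl",
    "parameters": [
        {"name": "max_batch_weight", "default": 10000, "type": "number",
         "min": 0, "max": 100000}
    ],
}
print(parameter_in_range(model, "max_batch_weight", 10000))
print(parameter_in_range(model, "max_batch_weight", 200000))
```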

Create a new deployment. Currently the only supported type is online.

If this is a deployment for a prompt template, the prompt_template object must exist and its id must be the id of the prompt template to be deployed.

If this is a deployment for a custom foundation model, the online object must exist, the asset object must exist and point to the model object that describes the custom foundation model, and the hardware_spec is mandatory. Note that the base_model_id will be returned and will be the base model id that is defined in the model asset (asset.id).

If this is a deployment for a fine tuned model, the asset.id must point to the model that was created after the fine tuning. For a fine tuned model with a template, the base_deployment_id field identifies the tuned model deployment.

Pre-defined hardware specifications are provided for custom and base foundation model deployments:

WX-S: 1 GPU, Request 1 CPU, Limit 2 CPU and 60 GB (Request and Limit) - 1B to 20B parameters
WX-M: 2 GPU, Request 2 CPU, Limit 3 CPU and 120 GB (Request and Limit) - 21B to 40B parameters
WX-L: 4 GPU, Request 4 CPU, Limit 5 CPU and 240 GB (Request and Limit) - 41B to 80B parameters
WX-XL: 8 GPU, Request 8 CPU, Limit 9 CPU and 600 GB (Request and Limit) - 81B to 200B parameters
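The size ranges above map directly to a lookup. The following Python sketch returns the spec name for a model size in billions of parameters. It assumes that sizes falling between two listed ranges (for example 20.5B) belong to the next larger spec, which the list does not state. The function name is hypothetical.

```python
# Upper bound of each listed range (billions of parameters) and the spec name.
HARDWARE_SPECS = [(20, "WX-S"), (40, "WX-M"), (80, "WX-L"), (200, "WX-XL")]


def pick_hardware_spec(billions_of_parameters: float) -> str:
    """Return the pre-defined hardware spec name for a model size."""
    if billions_of_parameters < 1:
        raise ValueError("the listed ranges start at 1B parameters")
    for upper_bound, name in HARDWARE_SPECS:
        if billions_of_parameters <= upper_bound:
            return name
    raise ValueError("no pre-defined spec covers models above 200B parameters")


for size in (13, 34, 70):
    print(size, pick_hardware_spec(size))
```

The returned name is the value to send as hardware_spec.name in the deployment request.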

A prompt template can be used in conjunction with a custom foundation model by specifying the prompt_template object with the id pointing to the prompt template.

POST /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentResourcePrototype

The deployment request entity.

The following important fields are described for each use case:

Prompt template:
- base_model_id: required
- prompt_template.id: required
- online: required
- hardware_spec: forbidden
- hardware_request: forbidden
- response deployed_asset_type: foundation_model
Custom foundation model:
- asset.id: required
- online: required
- online.parameters.foundation_model: optional
- hardware_spec: required
- hardware_request: forbidden
- base_model_id: forbidden
- base_deployment_id: forbidden
- response deployed_asset_type: custom_foundation_model
Custom foundation model with template:
- base_deployment_id: required
- prompt_template.id: required
- online: required
- online.parameters.foundation_model: forbidden
- hardware_spec: forbidden
- hardware_request: forbidden
- asset.id: forbidden
- base_model_id: forbidden
- response deployed_asset_type: custom_foundation_model
Fine tuned model:
- asset.id: required
- online: required
- online.parameters.foundation_model: optional
- hardware_spec: required
- hardware_request: forbidden
- base_model_id: forbidden
- base_deployment_id: forbidden
- response deployed_asset_type: fine_tune
Fine tuned model with template:
- base_deployment_id: required
- prompt_template.id: required
- online: required
- online.parameters.foundation_model: forbidden
- hardware_spec: forbidden
- hardware_request: forbidden
- asset.id: forbidden
- base_model_id: forbidden
- response deployed_asset_type: fine_tune
Base Foundation model:
- asset.id: required
- online: required
- online.parameters.foundation_model: optional
- hardware_spec: required
- base_model_id: forbidden
- base_deployment_id: forbidden
- response deployed_asset_type: base_foundation_model
Base Foundation model for LoRA:
- asset.id: required
- online: required
- online.parameters.foundation_model.enable_lora: required
- hardware_spec: required
- base_model_id: forbidden
- base_deployment_id: forbidden
- response deployed_asset_type: base_foundation_model
LoRA adapter model:
- asset.id: required
- base_deployment_id: required
- online: required
- online.parameters.foundation_model: forbidden
- hardware_spec: forbidden
- base_model_id: forbidden
- response deployed_asset_type: lora_adapter
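The required and forbidden fields can be checked on the client before the request is sent. The following Python sketch encodes three of the use cases listed above; the remaining ones follow the same shape. The rule keys and function names are hypothetical labels, and the check covers only field presence, not the values.

```python
# Required / forbidden request fields for three of the use cases listed above.
RULES = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_spec"],
        "forbidden": ["hardware_request", "base_model_id", "base_deployment_id"],
    },
    "lora_adapter": {
        "required": ["asset.id", "base_deployment_id", "online"],
        "forbidden": ["online.parameters.foundation_model", "hardware_spec",
                      "base_model_id"],
    },
}


def has_path(payload: dict, dotted: str) -> bool:
    """True if every key of a dotted path such as 'asset.id' is present."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_payload(use_case: str, payload: dict) -> list:
    """Return a list of problems; an empty list means the payload passes."""
    rules = RULES[use_case]
    problems = [f"missing {p}" for p in rules["required"] if not has_path(payload, p)]
    problems += [f"forbidden {p}" for p in rules["forbidden"] if has_path(payload, p)]
    return problems


lora_adapter = {
    "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
    "name": "my_lora_adapter",
    "asset": {"id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"},
    "online": {},
    "base_deployment_id": "bdda3999-1012-45bd-a726-045d8774e622",
}
print(check_payload("lora_adapter", lora_adapter))
```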

Examples:

Deploy base foundation model for inference

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
  "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
  "name": "my_fm",
  "asset": {
    "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
  },
  "online": {
    "parameters": {
      "foundation_model": {
        "max_batch_weight": 10000,
        "max_sequence_length": 8192
      }
    }
  },
  "hardware_spec": {
    "name": "WX-S",
    "num_nodes": 1
  }
}'

Deploy base foundation model for LoRA

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
  "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
  "name": "my_fm",
  "asset": {
    "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
  },
  "online": {
    "parameters": {
      "foundation_model": {
        "max_batch_weight": 10000,
        "max_sequence_length": 8192,
        "enable_lora": true,
        "max_gpu_loras": 8,
        "max_cpu_loras": 16,
        "max_lora_rank": 32
      }
    }
  },
  "hardware_spec": {
    "name": "WX-S",
    "num_nodes": 1
  }
}'

Deploy LoRA adapter

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
  "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
  "name": "my_lora_adapter",
  "asset": {
    "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
  },
  "online": {},
  "base_deployment_id": "bdda3999-1012-45bd-a726-045d8774e622"
}'

A prompt tune deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt tuned model deployment",
    "tags": ["classification"],
    "asset": {
        "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {}
}'

A prompt template deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": ["classification"],
    "prompt_template": {
        "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "base_model_id": "google/flan-t5-xl",
    "online": {}
}'

A custom foundation model deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan",
    "asset": {
        "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "online": {
        "parameters": {
            "serving_name": "myflan"
        }
    }
}'

A curated foundational model deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
    "name": "my_granite_13b_chat_v2",
    "asset": {
        "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
    },
    "base_model_id": "ibm/granite-13b-chat-v2-curated",
    "hardware_request": {
        "size": "gpu_s",
        "num_nodes": 1
    },
    "online": {
        "parameters": {
            "serving_name": "granite_13b_chat_v2"
        }
    }
}'

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

202
Deployment created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 202: A prompt template deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "prompt_template": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 202: A custom foundation model deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan"
  },
  "entity": {
    "asset": {
      "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "online": {
      "parameters": {
        "serving_name": "myflan",
        "foundation_model": {
          "max_batch_weight": 10000,
          "max_sequence_length": 8192
        }
      }
    },
    "deployed_asset_type": "custom_foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/myflan/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/myflan/text/generation_stream",
          "sse": true,
          "uses_serving_name": true
        }
      ]
    }
  }
}
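The status.inference array lists one URL per combination of identifier (deployment id or serving name) and mode (plain or server-sent events). The following Python sketch picks one of them, assuming the sse and uses_serving_name flags are simply absent when false, as in the sample. The function name and the host are hypothetical.

```python
from typing import Optional


def inference_url(deployment: dict, stream: bool = False,
                  prefer_serving_name: bool = True) -> Optional[str]:
    """Pick an inference URL from entity.status.inference."""
    entries = deployment["entity"]["status"]["inference"]
    # Keep only the entries whose streaming mode matches the request.
    matches = [e for e in entries if bool(e.get("sse")) == stream]
    if prefer_serving_name:
        for entry in matches:
            if entry.get("uses_serving_name"):
                return entry["url"]
    for entry in matches:
        if not entry.get("uses_serving_name"):
            return entry["url"]
    return matches[0]["url"] if matches else None


base = "https://cluster.example.com/ml/v1/deployments"  # placeholder host
deployment = {"entity": {"status": {"inference": [
    {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
    {"url": f"{base}/myflan/text/generation", "uses_serving_name": True},
    {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
     "sse": True},
    {"url": f"{base}/myflan/text/generation_stream", "sse": True,
     "uses_serving_name": True},
]}}}
print(inference_url(deployment))
print(inference_url(deployment, stream=True, prefer_serving_name=False))
```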

Status 202: A prompt template deployment with a custom foundation model.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan_template"
  },
  "entity": {
    "base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    "prompt_template": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {
      "parameters": {
        "serving_name": "myflan_template"
      }
    },
    "deployed_asset_type": "custom_foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/myflan_template/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        },
        {
          "url": "https://<cluster_url>/ml/v1/deployments/myflan_template/text/generation_stream",
          "sse": true,
          "uses_serving_name": true
        }
      ]
    }
  }
}

Retrieve the list of deployments for the specified space or project.

GET /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
serving_name
string
Retrieves the deployment, if any, that contains this serving_name.

Example: classification
tag.value
string
Retrieves only the resources with the given tag value.
asset_id
string
Retrieves only the resources with the given asset_id. The asset_id is the model id.
prompt_template_id
string
Retrieves only the resources with the given prompt_template_id.
name
string
Retrieves only the resources with the given name.
type
string
Retrieves the resources filtered with the given type. The value is a deployment type, optionally combined with the additional flag prompt_template if the deployment includes a prompt template.

The supported deployment types are (see the description for deployed_asset_type in the deployment entity):
1. foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
2. custom_foundation_model - when a custom foundation model is deployed.
3. lora_adapter - when a LoRA adapter model is deployed.
4. fine_tune - when a fine tune model is deployed.

These can be combined with the flag prompt_template like this:
1. type=foundation_model - return all prompt template deployments.
2. type=foundation_model and prompt_template - return all prompt template deployments. This is the same as the previous query because a foundation_model can only exist with a prompt template.
3. type=custom_foundation_model - return all custom model deployments.
4. type=custom_foundation_model and prompt_template - return all custom model deployments with a prompt template.
5. type=prompt_template - return all deployments with a prompt template.
state
string
Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.
conflict
boolean
Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.

Default: false

Retrieve list of deployments

curl --request GET 'https://{cluster_url}/ml/v4/deployments?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&serving_name=ibm&asset_id=259efabd-7850-40fc-843d-6dddcfc286d1&state=ready&version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Response Body

DeploymentResourceCollection

The deployment resources.

Status Code

200
OK.
204
serving_name is available for use. Returned when serving_name and conflict query parameters are used.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.
409
Returned when serving_name and conflict query parameters are used. The response body will contain the reason.
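The 204 and 409 codes together answer whether a serving_name is free. The following Python sketch builds the availability query and interprets the status code. It assumes that version is still sent alongside serving_name and conflict because it is marked as required, and that conflict is serialized as the string true. The function names and the host are hypothetical.

```python
from urllib.parse import urlencode


def conflict_check_url(cluster_url: str, serving_name: str, version: str) -> str:
    """Build the GET URL that asks whether `serving_name` is available."""
    query = urlencode({
        "version": version,  # assumption: still required with conflict
        "serving_name": serving_name,
        "conflict": "true",
    })
    return f"https://{cluster_url}/ml/v4/deployments?{query}"


def serving_name_is_free(status_code: int) -> bool:
    """Map the documented status codes of the conflict check to a boolean."""
    if status_code == 204:
        return True   # serving_name is available for use
    if status_code == 409:
        return False  # serving_name is taken; the body contains the reason
    raise ValueError(f"unexpected status code for a conflict check: {status_code}")


print(conflict_check_url("cluster.example.com", "classification", "2023-07-07"))
print(serving_name_is_free(204))
```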

No Sample Response

This method does not specify any sample responses.

Retrieve the deployment details with the specified identifier.

GET /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.read

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Retrieve deployment details

curl --request GET "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

200
Deployment details.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A prompt template deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "prompt_template": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 200: A custom foundation model deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan"
  },
  "entity": {
    "asset": {
      "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "hardware_spec": {
      "id": "WX-S",
      "num_nodes": 1
    },
    "online": {
      "parameters": {
        "serving_name": "myflan",
        "foundation_model": {
          "max_batch_weight": 10000,
          "max_sequence_length": 8192
        }
      }
    },
    "deployed_asset_type": "custom_foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
          "sse": true,
          "uses_serving_name": true
        }
      ]
    }
  }
}
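The inference entries in the sample above differ only in their sse and uses_serving_name flags. A minimal Python sketch of choosing one of them follows; the helper name pick_inference_url is hypothetical, and treating an absent flag as false is an assumption drawn from the samples, not a documented rule.

```python
from typing import Optional


def pick_inference_url(deployment: dict, *, sse: bool = False,
                       uses_serving_name: bool = False) -> Optional[str]:
    """Return the first inference URL whose flags match, or None."""
    entries = deployment.get("entity", {}).get("status", {}).get("inference", [])
    for entry in entries:
        # Assumption: a missing flag means False, as the samples suggest.
        if (entry.get("sse", False) == sse
                and entry.get("uses_serving_name", False) == uses_serving_name):
            return entry["url"]
    return None


# Trimmed copy of the custom foundation model sample.
sample = {"entity": {"status": {"inference": [
    {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
     "uses_serving_name": True},
    {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
     "sse": True, "uses_serving_name": True},
]}}}

stream_url = pick_inference_url(sample, sse=True, uses_serving_name=True)
print(stream_url)
```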

Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.

/name
/description
/tags
/custom
/online/parameters
/asset - replace only
/prompt_template - replace only
/hardware_spec
/hardware_request
/base_model_id - replace only (applicable only to prompt template deployments that refer to IBM base foundation models). Since Cloud Pak for Data 5.0.3.

The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
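As an illustration of the rules above, the following Python sketch assembles a JSON Patch document and rejects paths or operations that the list does not allow. The helper patch_op, the serving name myflan_v2, and the shape of the /online/parameters value are assumptions for illustration; the check also matches paths exactly, which may be stricter than the service.

```python
import json

# Paths listed above; REPLACE_ONLY_PATHS accept only the "replace" operation.
SUPPORTED_PATHS = {
    "/name", "/description", "/tags", "/custom", "/online/parameters",
    "/asset", "/prompt_template", "/hardware_spec", "/hardware_request",
    "/base_model_id",
}
REPLACE_ONLY_PATHS = {"/asset", "/prompt_template", "/base_model_id"}


def patch_op(op: str, path: str, value=None) -> dict:
    """Build one JSON Patch operation after checking it against the list."""
    if path not in SUPPORTED_PATHS:
        raise ValueError(f"unsupported patch path: {path}")
    if path in REPLACE_ONLY_PATHS and op != "replace":
        raise ValueError(f"{path} only supports 'replace'")
    operation = {"op": op, "path": path}
    if op != "remove":  # a JSON Patch "remove" carries no value
        operation["value"] = value
    return operation


patch = [
    patch_op("replace", "/description", "New Description"),
    # Assumed value shape, mirroring online.parameters in the sample response.
    patch_op("replace", "/online/parameters", {"serving_name": "myflan_v2"}),
]
body = json.dumps(patch, indent=2)
print(body)
```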

PATCH /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.update

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
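The constraints on these query parameters can be checked on the client before a request is sent. The sketch below is one way to do that in Python; the function name deployment_query is hypothetical, and rejecting requests that give both space_id and project_id is an assumption, since the text only says that one of them has to be given.

```python
import re
from datetime import date
from typing import Optional
from urllib.parse import urlencode

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")


def deployment_query(version: str, space_id: Optional[str] = None,
                     project_id: Optional[str] = None) -> str:
    """Validate the query parameters and return an encoded query string."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("version must have the form YYYY-MM-DD")
    date.fromisoformat(version)  # rejects impossible dates such as 2023-02-30

    scopes = {"space_id": space_id, "project_id": project_id}
    given = {k: v for k, v in scopes.items() if v is not None}
    if len(given) != 1:  # assumption: exactly one scope parameter
        raise ValueError("give exactly one of space_id or project_id")

    name, value = next(iter(given.items()))
    if len(value) != 36 or not ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must be 36 characters matching {ID_PATTERN.pattern}")
    return urlencode({name: value, "version": version})


query = deployment_query("2023-07-07",
                         space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f")
print(query)
```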

Request Body

Required*

application/json-patch+json
JsonPatchOperation[]

The json patch.

Update the deployment metadata.

curl --request PATCH "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '[
  {
    "op": "replace",
    "path": "/description",
    "value": "New Description"
  }
]'

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

202
Deployment accepted
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the deployment with the specified identifier.

DELETE /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.delete

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Delete deployment

curl --request DELETE 'https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Status Code

204
Deployment deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

This API is legacy; consider using Deployment Text Chat instead.

Return options

There is currently a limitation in this operation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
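Because id_or_name accepts either a deployment id or a serving name, the request URL can be assembled the same way for both. A small Python sketch follows; the function name generation_url and the host cpd.example.com are placeholders, not values from this reference.

```python
from urllib.parse import quote, urlencode


def generation_url(cluster_url: str, id_or_name: str, version: str,
                   stream: bool = False) -> str:
    """Build the text generation URL for a deployment id or serving name."""
    operation = "generation_stream" if stream else "generation"
    # Percent-encode the path segment in case the name has unusual characters.
    segment = quote(id_or_name, safe="")
    path = f"/ml/v1/deployments/{segment}/text/{operation}"
    return f"https://{cluster_url}{path}?{urlencode({'version': version})}"


# "cpd.example.com" is a placeholder host; "myflan" is a serving name.
url = generation_url("cpd.example.com", "myflan", "2023-07-07", stream=True)
print(url)
```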

Request Body

Required*

DeploymentTextGenRequest

From a given prompt, infer the next tokens.

Examples:

prompt tune

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  }
}'

prompt template

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000,
    "prompt_variables": {
      "name": "joe",
      "count": 3
    }
  }
}'

Response

Response Body

TextGenResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A prompt template response.

The generated text from the model along with other details for a prompt template.

{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}
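A response of this shape can be read with a few lines of Python. The sketch below parses the sample above and adds up the token counts; the variable names are illustrative only.

```python
import json

# The sample response shown above, as it would arrive on the wire.
raw = """
{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}
"""

response = json.loads(raw)
first = response["results"][0]

# Total tokens handled for this result: prompt tokens plus generated tokens.
total_tokens = first["input_token_count"] + first["generated_token_count"]

print(first["generated_text"], first["stop_reason"], total_tokens)
```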

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

This API is legacy; consider using Deployment Text Chat Stream instead.

Return options

There is currently a limitation in this operation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned, and neither are rank and top_tokens.

POST /ml/v1/deployments/{id_or_name}/text/generation_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextGenRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:

prompt tune

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  }
}'

prompt template

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000,
    "prompt_variables": {
      "name": "joe",
      "count": 3
    }
  }
}'

Response

Response Body

TextGenResponse[]

A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
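A minimal Python sketch of reading such a stream is shown here. It assumes each event arrives as a single data: line; multi-line data fields are not handled. The sample events are hypothetical and only shaped like the non-streaming response, since this method specifies no sample response.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every 'data:' line."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical events, for illustration only.
sample_stream = [
    'data: {"results": [{"generated_text": "4,000"}]}',
    "",
    'data: {"results": [{"generated_text": " km", "stop_reason": "eos_token"}]}',
    "",
]

# Concatenate the text fragments carried by the individual events.
text = "".join(event["results"][0]["generated_text"]
               for event in iter_sse_json(sample_stream))
print(text)
```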

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infers the next chat message for a given deployment. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.

Related guides:

Deployment
Prompt template
Text chat

If stream is true, this operation will return the output tokens in a server-sent events (SSE) stream.

POST /ml/v1/deployments/{id_or_name}/chat/completions

Auditing

Calling this method generates the following auditing event.

chat-completions.send

Request

Custom Headers

Accept
string
Allowable values: [application/json,text/event-stream]

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Request Body

Required*

DeploymentTextChatRequest

From a given prompt, infer the next chat message.

Examples:

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation. Content-Type: text/event-stream if stream is true.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A chat prompt template response.

The generated text from the model along with other details for a prompt template.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "ibm/granite-3-2b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at the Globe Life Field in Arlington, Texas.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 27,
    "prompt_tokens": 186,
    "total_tokens": 213
  }
}

Status 200: A chat prompt template with system_prompt and context response.

The generated text from the model along with other details for a prompt template.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "ibm/granite-3-2b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I am Granite Chat, created by IBM. I am here to assist you. Today is Wednesday.tomorrow is Thursday.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 32,
    "prompt_tokens": 154,
    "total_tokens": 186
  }
}
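The chat responses above can be consumed along the following lines. This Python sketch reads the assistant message from the first sample and checks that the usage counts add up; the helper name first_reply is hypothetical.

```python
# Trimmed copy of the first sample response above.
response = {
    "model_id": "ibm/granite-3-2b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The 2020 World Series was played at the Globe Life "
                           "Field in Arlington, Texas.\n",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"completion_tokens": 27, "prompt_tokens": 186, "total_tokens": 213},
}


def first_reply(chat_response: dict) -> str:
    """Return the content of the choice with the lowest index, trimmed."""
    choice = min(chat_response["choices"], key=lambda c: c["index"])
    return choice["message"]["content"].strip()


reply = first_reply(response)
usage = response["usage"]
# In the samples, prompt and completion tokens sum to total_tokens.
usage_consistent = (usage["prompt_tokens"] + usage["completion_tokens"]
                    == usage["total_tokens"])
print(reply, usage_consistent)
```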

Infers the next chat message for a given deployment. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.

Related guides:

Deployment
Prompt template
Text chat

You can also use Deployment Chat Completions to achieve the same result.

POST /ml/v1/deployments/{id_or_name}/text/chat

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextChatRequest

From a given prompt, infer the next chat message.

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infers the next chat message for a given deployment. This operation will return the output tokens as a stream of events. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.

Related guides:

Deployment
Prompt template
Text chat

You can also use Deployment Chat Completions with the stream option set to true to achieve the same result.

POST /ml/v1/deployments/{id_or_name}/text/chat_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextChatRequest

From a given prompt, infer the next chat message in a server-sent events (SSE) stream.

Response

Response Body

TextChatStreamItem[]

A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Generate forecasts, or predictions for future time points, given historical time series data.

POST /ml/v1/deployments/{id_or_name}/time_series/forecast

Auditing

Calling this method generates the following auditing event.

pm-20.time-series-forecast.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTSForecastResource

The forecast request.

Examples:

Response

Response Body

TSForecastResponse

The time series forecast response.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "model_id": "bc35d16e-dd21-472e-9cde-c6c3ad88e3b5",
  "created_at": "2020-05-02T16:27:51Z",
  "results": [
    {
      "date": [
        "2020-01-05T02:00:00",
        "2020-01-05T03:00:00",
        "2020-01-06T00:00:00"
      ],
      "ID1": [
        "D1",
        "D1",
        "D1"
      ],
      "TARGET1": [
        1.86,
        3.24,
        6.78
      ]
    }
  ],
  "input_data_points": 512,
  "output_data_points": 1024
}
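The forecast results are columnar: each key holds a list with one entry per predicted time point. A short Python sketch of turning the sample above into one record per time point follows; the helper name to_rows is hypothetical, and it assumes all columns have the same length.

```python
# The "results" entry from the sample response above.
result = {
    "date": ["2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.86, 3.24, 6.78],
}


def to_rows(columnar: dict) -> list:
    """Convert {column: [values]} into a list of {column: value} records."""
    columns = list(columnar)
    # zip stops at the shortest column, so unequal lengths would be truncated.
    return [dict(zip(columns, values)) for values in zip(*columnar.values())]


rows = to_rows(result)
for row in rows:
    print(row["date"], row["ID1"], row["TARGET1"])
```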

Create a fine tuning job that will fine tune an LLM.

Since Cloud Pak for Data 5.0.3.

POST /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

FineTuningRequest

The details of the fine tuning job with the data used to tune the LLM.

Examples:

Fine Tuning File System

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
-d '{
  "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
  "name": "Example - Full fine tuning",
  "auto_update_model": true,
  "parameters": {
    "base_model": {"model_id": "meta-llama/llama-3-1-8b"},
    "task_id": "classification",
    "accumulate_steps": 1,
    "num_epochs": 10,
    "learning_rate": 0.00005,
    "batch_size": 16,
    "max_seq_length": 2048,
    "response_template": "\n### Response:",
    "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
    "gpu": {"num": 1},
    "gradient_checkpointing": true
  },
  "results_reference": {
    "connection": {},
    "location": {"path": "fine-tuning/experiment3"},
    "type": "fs"
  },
  "training_data_references": [
    {
      "connection": {},
      "location": {
        "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
        "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
      },
      "type": "data_asset"
    }
  ]
}'

Lora Fine Tuning File System

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
-d '{
  "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
  "name": "Example - Lora fine tuning",
  "auto_update_model": true,
  "parameters": {
    "base_model": {"model_id": "meta-llama/llama-3-1-8b"},
    "task_id": "classification",
    "accumulate_steps": 1,
    "num_epochs": 10,
    "learning_rate": 0.00005,
    "batch_size": 16,
    "max_seq_length": 2048,
    "response_template": "\n### Response:",
    "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
    "gpu": {"num": 1},
    "peft_parameters": {
      "type": "lora",
      "rank": 16,
      "target_modules": ["all-linear"],
      "lora_alpha": 32,
      "lora_dropout": 0.05
    },
    "gradient_checkpointing": true
  },
  "results_reference": {
    "connection": {},
    "location": {"path": "fine-tuning/experiment4"},
    "type": "fs"
  },
  "training_data_references": [
    {
      "connection": {},
      "location": {
        "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
        "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
      },
      "type": "data_asset"
    }
  ]
}'

QLora Fine Tuning File System

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
-d '{
  "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
  "name": "Example - QLora fine tuning",
  "auto_update_model": false,
  "parameters": {
    "base_model": {"model_id": "meta-llama/llama-3-1-70b-gptq"},
    "gpu": {"num": 1},
    "peft_parameters": {"type": "qlora"}
  },
  "results_reference": {
    "connection": {},
    "location": {"path": "fine-tuning/experiment5"},
    "type": "fs"
  },
  "training_data_references": [
    {
      "connection": {},
      "location": {
        "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
        "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
      },
      "type": "data_asset"
    }
  ]
}'
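The response_template and verbalizer values contain newlines, which must appear as \n escapes inside the JSON body. One way to avoid escaping by hand is to build the body in a script and let the JSON encoder do it. The Python sketch below reproduces the LoRA request body from the examples above; it only builds the body and does not send it.

```python
import json

asset_id = "69f07f10-ccfa-4137-816c-7a781f8c6b74"
project_id = "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"

payload = {
    "project_id": project_id,
    "name": "Example - Lora fine tuning",
    "auto_update_model": True,
    "parameters": {
        "base_model": {"model_id": "meta-llama/llama-3-1-8b"},
        "task_id": "classification",
        "accumulate_steps": 1,
        "num_epochs": 10,
        "learning_rate": 0.00005,
        "batch_size": 16,
        "max_seq_length": 2048,
        # Real newlines here; json.dumps writes them as \n escapes.
        "response_template": "\n### Response:",
        "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
        "gpu": {"num": 1},
        "peft_parameters": {
            "type": "lora",
            "rank": 16,
            "target_modules": ["all-linear"],
            "lora_alpha": 32,
            "lora_dropout": 0.05,
        },
        "gradient_checkpointing": True,
    },
    "results_reference": {
        "connection": {},
        "location": {"path": "fine-tuning/experiment4"},
        "type": "fs",
    },
    "training_data_references": [{
        "connection": {},
        "location": {
            "id": asset_id,
            # {cluster_url} is left as the placeholder used in this reference.
            "href": "https://{cluster_url}/v2/assets/" + asset_id
                    + "?project_id=" + project_id,
        },
        "type": "data_asset",
    }],
}

body = json.dumps(payload)
print(body)
```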

Response

Response Body

FineTuningResource

The response of a fine tuning job.

Status Code

201
Created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201: full fine tuning with file system results reference

A fine tuning response with a file system result reference.

Since watsonx.ai 2.2.1.

{
  "entity": {
    "auto_update_model": true,
    "parameters": {
      "accumulate_steps": 1,
      "base_model": {
        "model_id": "meta-llama/llama-3-1-8b"
      },
      "batch_size": 2,
      "gpu": {
        "num": 1
      },
      "gradient_checkpointing": true,
      "learning_rate": 0.00005,
      "max_seq_length": 2048,
      "num_epochs": 10,
      "response_template": "\n### Response:",
      "task_id": "classification",
      "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}"
    },
    "results_reference": {
      "connection": {},
      "location": {
        "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3",
        "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53",
        "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/training-status.json",
        "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/model",
        "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets/320e8a31-a696-4ea1-afa6-440fd3ac8e53/resources/wml_model/request.json",
        "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/data/fine_tunings/training.log",
        "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets"
      },
      "type": "fs"
    },
    "status": {
      "completed_at": "2025-07-31T15:16:44.876Z",
      "metrics": [
        {
          "context": {
            "fine_tuning": {
              "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets/320e8a31-a696-4ea1-afa6-440fd3ac8e53/resources/training_logs.jsonl"
            }
          },
          "fine_tuning_metrics": {
            "training_loss": [
              {
                "epoch": 1,
                "step": 101,
                "timestamp": "2025-07-31T15:09:31.354986",
                "value": 2.3847
              },
              {
                "epoch": 2,
                "step": 202,
                "timestamp": "2025-07-31T15:10:14.620318",
                "value": 0.8851
              },
              {
                "epoch": 3,
                "step": 303,
                "timestamp": "2025-07-31T15:10:58.228559",
                "value": 0.457
              },
              {
                "epoch": 4,
                "step": 404,
                "timestamp": "2025-07-31T15:11:41.689645",
                "value": 0.2944
              },
              {
                "epoch": 5,
                "step": 505,
                "timestamp": "2025-07-31T15:12:24.963389",
                "value": 0.2052
              },
              {
                "epoch": 6,
                "step": 606,
                "timestamp": "2025-07-31T15:13:08.119829",
                "value": 0.1561
              },
              {
                "epoch": 7,
                "step": 707,
                "timestamp": "2025-07-31T15:13:51.672343",
                "value": 0.1255
              },
              {
                "epoch": 8,
                "step": 808,
                "timestamp": "2025-07-31T15:14:35.192001",
                "value": 0.1089
              },
              {
                "epoch": 9,
                "step": 909,
                "timestamp": "2025-07-31T15:15:18.613572",
                "value": 0.1013
              },
              {
                "epoch": 10,
                "step": 1010,
                "timestamp": "2025-07-31T15:16:01.777923",
                "value": 0.0983
              }
            ]
          },
          "timestamp": "2025-07-31T15:16:02.286Z"
        }
      ],
      "running_at": "2025-07-31T15:05:15.787Z",
      "state": "completed"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2",
          "id": "c195ff4c-c287-475c-a7e0-7159c9381511"
        },
        "type": "data_asset"
      }
    ],
    "tuned_model": {
      "id": "1b0cb90f-c1b8-4e27-afc4-f59797c62bcf",
      "name": "model-320e8a31-a696-4ea1-afa6-440fd3ac8e53"
    }
  },
  "metadata": {
    "created_at": "2025-07-31T14:59:29.150Z",
    "id": "320e8a31-a696-4ea1-afa6-440fd3ac8e53",
    "modified_at": "2025-07-31T15:16:44.887Z",
    "name": "Example - Full fine tuning",
    "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2"
  }
}
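A response like the one above can be summarized with a short script. The Python sketch below works on a trimmed copy of the status object (first and last loss points only) and reports the run time and the final training loss; the helper name parse_ts is hypothetical.

```python
from datetime import datetime

# Trimmed copy of entity.status from the sample response above.
status = {
    "running_at": "2025-07-31T15:05:15.787Z",
    "completed_at": "2025-07-31T15:16:44.876Z",
    "state": "completed",
    "metrics": [{"fine_tuning_metrics": {"training_loss": [
        {"epoch": 1, "step": 101, "value": 2.3847},
        {"epoch": 10, "step": 1010, "value": 0.0983},
    ]}}],
}


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as 2025-07-31T15:05:15.787Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


duration = parse_ts(status["completed_at"]) - parse_ts(status["running_at"])
duration_seconds = duration.total_seconds()

losses = status["metrics"][-1]["fine_tuning_metrics"]["training_loss"]
final_point = max(losses, key=lambda point: point["step"])

print(status["state"], duration_seconds, final_point["value"])
```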

Status 201: lora fine tuning with file system results reference

A lora fine tuning response with a file system result reference.

Since watsonx.ai 2.2.1.

{
  "entity": {
    "auto_update_model": true,
    "parameters": {
      "accumulate_steps": 1,
      "base_model": {
        "model_id": "meta-llama/llama-3-1-8b"
      },
      "batch_size": 32,
      "gpu": {
        "num": 1
      },
      "gradient_checkpointing": true,
      "learning_rate": 0.00005,
      "max_seq_length": 2048,
      "num_epochs": 10,
      "peft_parameters": {
        "lora_alpha": 32,
        "lora_dropout": 0.05,
        "rank": 16,
        "target_modules": [
          "all-linear"
        ],
        "type": "lora"
      },
      "response_template": "\n### Response:",
      "task_id": "classification",
      "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}"
    },
    "results_reference": {
      "connection": {},
      "location": {
        "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4",
        "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd",
        "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/training-status.json",
        "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/model",
        "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets/928bb582-7872-40bc-875f-c56d8676b9bd/resources/wml_model/request.json",
        "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/data/fine_tunings/training.log",
        "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets"
      },
      "type": "fs"
    },
    "status": {
      "completed_at": "2025-07-31T15:25:57.603Z",
      "metrics": [
        {
          "context": {
            "fine_tuning": {
              "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets/928bb582-7872-40bc-875f-c56d8676b9bd/resources/training_logs.jsonl"
            }
          },
          "fine_tuning_metrics": {
            "training_loss": [
              {
                "epoch": 1,
                "step": 7,
                "timestamp": "2025-07-31T15:24:49.806135",
                "value": 2.9094
              },
              {
                "epoch": 2,
                "step": 14,
                "timestamp": "2025-07-31T15:24:55.845440",
                "value": 2.102
              },
              {
                "epoch": 3,
                "step": 21,
                "timestamp": "2025-07-31T15:25:01.923967",
                "value": 1.7618
              },
              {
                "epoch": 4,
                "step": 28,
                "timestamp": "2025-07-31T15:25:07.940105",
                "value": 1.7091
              },
              {
                "epoch": 5,
                "step": 35,
                "timestamp": "2025-07-31T15:25:14.032330",
                "value": 1.5757
              },
              {
                "epoch": 6,
                "step": 42,
                "timestamp": "2025-07-31T15:25:20.094345",
                "value": 1.4984
              },
              {
                "epoch": 7,
                "step": 49,
                "timestamp": "2025-07-31T15:25:26.189688",
                "value": 1.4543
              },
              {
                "epoch": 8,
                "step": 56,
                "timestamp": "2025-07-31T15:25:32.269618",
                "value": 1.4443
              },
              {
                "epoch": 9,
                "step": 63,
                "timestamp": "2025-07-31T15:25:38.357029",
                "value": 1.3861
              },
              {
                "epoch": 10,
                "step": 70,
                "timestamp": "2025-07-31T15:25:44.431574",
                "value": 1.3456
              }
            ]
          },
          "timestamp": "2025-07-31T15:25:45.535Z"
        }
      ],
      "running_at": "2025-07-31T15:22:17.260Z",
      "state": "completed"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2",
          "id": "c195ff4c-c287-475c-a7e0-7159c9381511"
        },
        "type": "data_asset"
      }
    ],
    "tuned_model": {
      "id": "5079b31b-a050-4ae2-a55c-864b91055ad8",
      "name": "model-928bb582-7872-40bc-875f-c56d8676b9bd"
    }
  },
  "metadata": {
    "created_at": "2025-07-31T15:21:18.328Z",
    "id": "928bb582-7872-40bc-875f-c56d8676b9bd",
    "modified_at": "2025-07-31T15:25:57.613Z",
    "name": "Example - Lora fine tuning",
    "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2"
  }
}

Status 201: qlora fine tuning with file system results reference

A qlora tuning response with a file system result reference.

Since watsonx.ai 2.2.1.

{
  "entity": {
    "auto_update_model": false,
    "parameters": {
      "accumulate_steps": 1,
      "base_model": {
        "model_id": "meta-llama/llama-3-1-70b-gptq"
      },
      "batch_size": 5,
      "gpu": {
        "num": 1
      },
      "gradient_checkpointing": true,
      "learning_rate": 0.00001,
      "max_seq_length": 1024,
      "num_epochs": 10,
      "peft_parameters": {
        "lora_alpha": 32,
        "lora_dropout": 0.05,
        "rank": 8,
        "target_modules": [],
        "type": "qlora"
      },
      "response_template": "\n### Response:",
      "task_id": "generation",
      "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}"
    },
    "results_reference": {
      "connection": {},
      "location": {
        "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5",
        "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e",
        "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/training-status.json",
        "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/model",
        "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets/b3ed41c5-d7e1-47ec-b110-d1796f64577e/resources/wml_model/request.json",
        "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/data/fine_tunings/training.log",
        "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets"
      },
      "type": "fs"
    },
    "status": {
      "completed_at": "2025-07-31T16:38:38.856Z",
      "metrics": [
        {
          "context": {
            "fine_tuning": {
              "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets/b3ed41c5-d7e1-47ec-b110-d1796f64577e/resources/training_logs.jsonl"
            }
          },
          "fine_tuning_metrics": {
            "training_loss": [
              {
                "epoch": 1,
                "step": 41,
                "timestamp": "2025-07-31T16:28:05.466593",
                "value": 2.9058
              },
              {
                "epoch": 2,
                "step": 82,
                "timestamp": "2025-07-31T16:29:13.709110",
                "value": 2.5783
              },
              {
                "epoch": 3,
                "step": 123,
                "timestamp": "2025-07-31T16:30:21.861881",
                "value": 1.9994
              },
              {
                "epoch": 4,
                "step": 164,
                "timestamp": "2025-07-31T16:31:29.929725",
                "value": 1.7515
              },
              {
                "epoch": 5,
                "step": 205,
                "timestamp": "2025-07-31T16:32:38.147384",
                "value": 1.6546
              },
              {
                "epoch": 6,
                "step": 246,
                "timestamp": "2025-07-31T16:33:46.835779",
                "value": 1.6284
              },
              {
                "epoch": 7,
                "step": 287,
                "timestamp": "2025-07-31T16:34:55.447423",
                "value": 1.6256
              },
              {
                "epoch": 8,
                "step": 328,
                "timestamp": "2025-07-31T16:36:04.004780",
                "value": 1.6072
              },
              {
                "epoch": 9,
                "step": 369,
                "timestamp": "2025-07-31T16:37:12.342551",
                "value": 1.5981
              },
              {
                "epoch": 10,
                "step": 410,
                "timestamp": "2025-07-31T16:38:21.105589",
                "value": 1.5792
              }
            ]
          },
          "timestamp": "2025-07-31T16:38:22.615Z"
        }
      ],
      "running_at": "2025-07-31T16:23:29.325Z",
      "state": "completed"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2",
          "id": "c195ff4c-c287-475c-a7e0-7159c9381511"
        },
        "type": "data_asset"
      }
    ],
    "tuned_model": {
      "name": "model-b3ed41c5-d7e1-47ec-b110-d1796f64577e"
    }
  },
  "metadata": {
    "created_at": "2025-07-31T16:22:51.651Z",
    "id": "b3ed41c5-d7e1-47ec-b110-d1796f64577e",
    "modified_at": "2025-07-31T16:38:38.866Z",
    "name": "Example - QLora fine tuning",
    "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2"
  }
}
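
The training loss in a response like the one above can be read programmatically. The following is a minimal sketch, assuming the response has already been parsed from JSON; the helper name `loss_by_epoch` is hypothetical, and the sample is a trimmed copy of the response that keeps only the fields the helper reads.

```python
import json

# Trimmed copy of the sample response; only the fields read below are kept.
sample = json.loads("""
{
  "entity": {
    "status": {
      "state": "completed",
      "metrics": [
        {"fine_tuning_metrics": {"training_loss": [
          {"epoch": 1, "step": 41, "value": 2.9058},
          {"epoch": 5, "step": 205, "value": 1.6546},
          {"epoch": 10, "step": 410, "value": 1.5792}
        ]}}
      ]
    }
  }
}
""")


def loss_by_epoch(response):
    """Return {epoch: training loss} collected from every metrics entry."""
    losses = {}
    for metric in response["entity"]["status"].get("metrics", []):
        points = metric.get("fine_tuning_metrics", {}).get("training_loss", [])
        for point in points:
            losses[point["epoch"]] = point["value"]
    return losses


losses = loss_by_epoch(sample)
final_epoch = max(losses)
print(sample["entity"]["status"]["state"], final_epoch, losses[final_epoch])
```

The helper tolerates a missing `metrics` array, which may be the case while a job is still queued or running.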

Retrieve the list of fine tuning jobs for the specified space or project.

Since Cloud Pak for Data 5.0.3.

GET /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The end user cannot determine this token; the service generates it and sets it in the href of the next field.
limit
integer
How many resources should be returned.

Possible values: value ≤ 200
Default: 100
total_count
boolean
Compute the total count. May have performance impact.
tag.value
string
Return only the resources with the given tag value.
state
string
Filter based on the job state: queued, running, completed, failed, etc.
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
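
The query parameters above can be combined into a request URL. The following is a minimal sketch that only builds the URL and sends nothing; the function name `fine_tunings_list_url` and the host `cluster.example.com` are placeholders, and sending booleans as lowercase `true`/`false` is an assumption.

```python
from urllib.parse import urlencode


def fine_tunings_list_url(cluster_url, version, *, project_id=None, space_id=None,
                          limit=100, state=None, start=None, total_count=None):
    """Build the GET /ml/v1/fine_tunings URL from the documented query parameters."""
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    if limit > 200:
        raise ValueError("limit must not exceed 200")
    params = {"version": version, "limit": limit}
    if project_id is not None:
        params["project_id"] = project_id
    if space_id is not None:
        params["space_id"] = space_id
    if state is not None:
        params["state"] = state
    if start is not None:
        # Opaque pagination token taken from the href of the next field.
        params["start"] = start
    if total_count is not None:
        # Assumption: booleans are sent as lowercase "true"/"false".
        params["total_count"] = "true" if total_count else "false"
    return f"https://{cluster_url}/ml/v1/fine_tunings?{urlencode(params)}"


url = fine_tunings_list_url(
    "cluster.example.com",  # placeholder host
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    state="completed",
)
print(url)
```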

Response

Response Body

FineTuningResources

System details.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authorization error, including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Get the results of a fine tuning job, or details if the job failed.

Since Cloud Pak for Data 5.0.3.

GET /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.get

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
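
The constraints on space_id and project_id can be checked on the client before a request is sent. The following is a minimal sketch of such a check; the helper names `is_valid_scope_id` and `scope_params` are hypothetical, and the rule is applied as "at least one of the two must be given".

```python
import re

# Same character class as the documented expression ^[a-zA-Z0-9-]*$.
SCOPE_ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def is_valid_scope_id(value):
    """Check the documented constraints: length 36 and allowed characters."""
    return len(value) == 36 and SCOPE_ID_PATTERN.fullmatch(value) is not None


def scope_params(*, space_id=None, project_id=None):
    """Return the scope query parameters, requiring at least one valid ID."""
    candidates = {"space_id": space_id, "project_id": project_id}
    params = {name: value for name, value in candidates.items() if value is not None}
    if not params:
        raise ValueError("either space_id or project_id has to be given")
    for name, value in params.items():
        if not is_valid_scope_id(value):
            raise ValueError(f"{name} is not a valid 36-character ID")
    return params


params = scope_params(space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f")
print(params)
```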

Response

Response Body

FineTuningResource

The response of a fine tuning job.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authorization error, including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete a fine tuning job if it exists. Once deleted, all trace of the job is gone.

Since Cloud Pak for Data 5.0.3.

DELETE /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.delete

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job, request output and metadata.
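
A delete call with hard_delete can be prepared as follows. This is a minimal sketch that builds the request object without sending it; the function name `build_delete_request`, the host, and the token value are placeholders, and passing the boolean as the string `true` is an assumption.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_delete_request(cluster_url, job_id, token, version, project_id,
                         hard_delete=False):
    """Prepare DELETE /ml/v1/fine_tunings/{id} without sending it."""
    params = {"version": version, "project_id": project_id}
    if hard_delete:
        # Assumption: the boolean is sent as the lowercase string "true".
        params["hard_delete"] = "true"
    path = f"/ml/v1/fine_tunings/{quote(job_id, safe='')}"
    url = f"https://{cluster_url}{path}?{urlencode(params)}"
    return Request(url, method="DELETE",
                   headers={"Authorization": f"Bearer {token}"})


request = build_delete_request(
    "cluster.example.com",  # placeholder host
    "928bb582-7872-40bc-875f-c56d8676b9bd",
    "TOKEN",  # placeholder bearer token
    "2023-07-07",
    "1d59ce3d-d55f-4e38-a515-6cda5af08ab2",
    hard_delete=True,
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would send it; a successful call is documented to return status 204 with no body.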

Response

Status Code

204
Deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authorization error, including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of deployed foundation models.

GET /ml/v1/foundation_model_specs

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The end user cannot determine this token; the service generates it and sets it in the href of the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

filters

string

A set of filters that restricts the list of models. Filters follow the pattern shown below.

 pattern: tfilter[,tfilter][:(or|and)]
 tfilter: filter | !filter
   filter: Requires existence of the filter.
   !filter: Requires absence of the filter.
 filter: one of
   modelid_*:     Filters by model id.
                  Namely, select a model with a specific model id.
   provider_*:    Filters by provider.
                  Namely, select all models with a specific provider.
   source_*:      Filters by source.
                  Namely, select all models with a specific source.
   input_tier_*:  Filters by input tier.
                  Namely, select all models with a specific input tier.
   output_tier_*: Filters by output tier.
                  Namely, select all models with a specific output tier.
   tier_*:        Filters by tier.
                  Namely, select all models with a specific input or output tier.
   task_*:        Filters by task id.
                  Namely, select all models that support a specific task id.
   lifecycle_*:   Filters by lifecycle state.
                  Namely, select all models that are currently in the specified lifecycle state.
   function_*:    Filters by function. Since Cloud Pak for Data `5.0.0`.
                  Namely, select all models that support a specific function.

Possible values: 1 ≤ length ≤ 1000, Value must match regular expression ^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$

Example: modelid_ibm/granite-13b-instruct-v1,modelid_ibm/granite-13b-instruct-v2:or
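
The filters value can be assembled and checked against the documented regular expression before it is sent. The following is a minimal sketch; the helper name `build_filters` is hypothetical, and the second example combines filter values that appear elsewhere on this page purely for illustration.

```python
import re

# The regular expression documented for the filters parameter.
FILTERS_PATTERN = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")


def build_filters(filters, combine=None):
    """Join filters with commas and optionally append ':or' or ':and'."""
    if combine not in (None, "or", "and"):
        raise ValueError("combine must be 'or', 'and' or None")
    value = ",".join(filters)
    if combine is not None:
        value += f":{combine}"
    if not 1 <= len(value) <= 1000 or FILTERS_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid filters value: {value!r}")
    return value


# The documented example: either of two model ids.
example = build_filters(
    ["modelid_ibm/granite-13b-instruct-v1", "modelid_ibm/granite-13b-instruct-v2"],
    combine="or",
)
# Illustrative only: a required filter combined with a negated one.
negated = build_filters(["function_time_series_forecast", "!task_code"], combine="and")
print(example)
print(negated)
```

The explicit check on `combine` is needed because the documented expression alone does not reject an unknown suffix: a colon is a legal character inside a filter.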

tech_preview
boolean
See all the Tech Preview models if entitled.

Default: false

get foundation models

curl --request GET 'https://{cluster_url}/ml/v1/foundation_model_specs?version=2019-10-25&filters=function_time_series_forecast' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Accept: application/json'

Response

Response Body

FoundationModels

System details.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
404
The specified resource was not found.

Example responses

Status 200: The list of models.

The models that are currently deployed in the cluster.

{
  "total_count": 1,
  "limit": 100,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02"
  },
  "resources": [
    {
      "model_id": "bigcode/starcoder",
      "label": "starcoder-15.5b",
      "provider": "BigCode",
      "source": "Hugging Face",
      "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions",
      "tasks": [
        {
          "id": "code",
          "ratings": {
            "quality": 3
          }
        }
      ],
      "min_shot_size": 0,
      "input_tier": "class_2",
      "output_tier": "class_2",
      "number_params": "15.5b"
    }
  ]
}
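
A listing like the one above can be narrowed on the client side. The following is a minimal sketch that selects model ids by task id from an already parsed response; the helper name `models_for_task` is hypothetical, and the sample is a trimmed copy of the response above.

```python
# Trimmed copy of the sample listing; only the fields read below are kept.
listing = {
    "total_count": 1,
    "resources": [
        {
            "model_id": "bigcode/starcoder",
            "provider": "BigCode",
            "tasks": [{"id": "code", "ratings": {"quality": 3}}],
        }
    ],
}


def models_for_task(listing, task_id):
    """Return the ids of the models whose tasks array contains task_id."""
    return [
        model["model_id"]
        for model in listing.get("resources", [])
        if any(task.get("id") == task_id for task in model.get("tasks", []))
    ]


print(models_for_task(listing, "code"))
```

The same selection can likely be done on the server with a `task_*` filter; client-side filtering is shown only to illustrate the response shape.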

Retrieve the list of tasks that are supported by the foundation models.

GET /ml/v1/foundation_model_tasks

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The end user cannot determine this token; the service generates it and sets it in the href of the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Response

Response Body

FoundationModelTasks

System details.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
404
The specified resource was not found.

Example responses

Status 200: The list of tasks.

The tasks that are currently supported by models deployed in the cluster.

{
  "total_count": 1,
  "limit": 100,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02"
  },
  "resources": [
    {
      "task_id": "question_answering",
      "label": "Question answering",
      "rank": 1,
      "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance."
    }
  ]
}

Create a new notebook, either from scratch or by copying another notebook.

To create a notebook from scratch, first upload the notebook content (ipynb format) to your project or space storage using the Assets-files API, then reference it with the attribute file_reference. The other required attributes are name, project or space, and runtime. The attribute runtime specifies the environment on which the notebook runs. Either project or space must be specified in the request body.

To copy a notebook, you only need to provide name and source_guid in the request body.
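
The two request bodies described above might look as follows. This is a minimal sketch under stated assumptions: the attribute names come from the description above, but the nesting of `runtime` as an object with an `environment` key is inferred from the sample responses, and the `file_reference` path and helper names are hypothetical. The sketch also treats project and space as mutually exclusive, which the text does not state explicitly.

```python
import json


def notebook_from_scratch(name, file_reference, environment, *, project=None, space=None):
    """Body for creating a notebook from previously uploaded ipynb content."""
    # Assumption: exactly one of project or space is supplied.
    if (project is None) == (space is None):
        raise ValueError("specify exactly one of project or space")
    body = {
        "name": name,
        "file_reference": file_reference,
        # Assumption: runtime is an object carrying the environment name.
        "runtime": {"environment": environment},
    }
    if project is not None:
        body["project"] = project
    else:
        body["space"] = space
    return body


def notebook_copy(name, source_guid):
    """Body for copying an existing notebook."""
    return {"name": name, "source_guid": source_guid}


scratch_body = notebook_from_scratch(
    "my notebook",
    "notebooks/my_notebook.ipynb",  # hypothetical storage path
    "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    project="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
)
copy_body = notebook_copy("my notebook", "ca3c0e27-46ca-83d4-a646-d49b11c14de9")
print(json.dumps(scratch_body, indent=2))
print(json.dumps(copy_body, indent=2))
```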

POST /v2/notebooks

Request

Request Body

Required*

One of

Specification of the notebook to be created.

Example:

Response

Response Body

One of

Notebook information in a project as returned by a GET request.

Status Code

201
Success. Created and returned a new notebook asset. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.
429
The number of requests has exceeded the rate limit.

Example responses

Status 201: A notebook created in a project from scratch

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python3",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 201: A notebook created in a space from scratch

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "space_id": "92ae0e27-9b11-4de9-a646-d46ca3c183d4"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python3",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "spark33py39-92ae0e27-9b11-4de9-a646-d46ca3c183d4",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?space_id=92ae0e27-9b11-4de9-a646-d46ca3c183d4"
  }
}

Status 201: A notebook created by copying another notebook

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python3",
        "language": "python3"
      },
      "originates_from": {
        "type": "notebook",
        "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}
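
The notebook endpoints share one error envelope: a `trace` value plus an `errors` array whose entries carry `code`, `message`, and sometimes `target`. The following is a minimal sketch that turns such a body into readable lines; the helper name `describe_errors` is hypothetical, and the samples are copied from the example responses on this page.

```python
bad_request = {
    "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
    "errors": [
        {
            "code": "invalid_type",
            "message": "The `project` field needs to be a uuid v4, but is 12345.",
            "target": {"type": "field", "name": "project"},
        }
    ],
}
rate_limited = {
    "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
    "errors": [{"code": "rate_limit", "message": "Please try again later."}],
}


def describe_errors(body):
    """Format each error as 'code (target type name): message'."""
    lines = []
    for error in body.get("errors", []):
        target = error.get("target")
        where = f" ({target['type']} {target['name']})" if target else ""
        lines.append(f"{error['code']}{where}: {error['message']}")
    return lines


for line in describe_errors(bad_request) + describe_errors(rate_limited):
    print(line)
```

Logging the `trace` value alongside these lines is usually worthwhile, since it is the only field that identifies the individual request.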

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Status 429: Rate limit error with status code 429

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "rate_limit",
      "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later."
    }
  ]
}

Retrieve the details of a large number of notebooks inside a project.

POST /v2/notebooks/list

Request

Query Parameters

project_id
Required*
string
The guid of the project.
include
Required*
string
Additional information to include in the notebook details. Possible values are:
- runtime

Request Body

Required*

NotebookListBody

Payload for a notebook list request.

Examples:

Response

Response Body

NotebooksResourceList

A list of notebook info as returned by a list query.

Status Code

200
Success. Returned a list of notebook assets. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A list of notebooks

{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d"
      },
      "entity": {
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "asset": {
          "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
          "asset_type": "notebook",
          "created_at": "2021-07-01T12:37:01Z",
          "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
          "version": 2,
          "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
        }
      }
    }
  ]
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Delete a particular notebook, including the notebook asset.

DELETE /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Status Code

204
Successful request. Notebook is deleted.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Revert the main notebook to a version.

PUT /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the main notebook.

Request Body

Required*

NotebookRevertBody

Payload for a request to revert to a specific notebook version.

Examples:

Response

Response Body

One of

Notebook information in a project as returned by a GET request.

Status Code

200
Success. Reverted the main notebook to a version. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A reverted notebook

{
  "metadata": {
    "name": "my notebook v4.2",
    "description": "this is my notebook v4.2",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python39",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Update a particular notebook.

PATCH /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Request Body

Required*

NotebookUpdateBody

Payload for a notebook update request.

Examples:

Response

Response Body

Notebook

Notebook information as returned by a GET request.

Status Code

200
Success. Updated the notebook. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: An updated notebook

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python39",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
      "spark_monitoring_enabled": false
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Promote a notebook from project to space.

POST /v2/notebooks/{notebook_guid}/promote

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Query Parameters

version_guid
Required*
string
The guid of the notebook version.
project_id
Required*
string
The id of the project from which a notebook will be promoted.

Request Body

Required*

NotebookPromoteBody

Body parameters for promoting a notebook. space_id is required; name and description are optional. If they are not specified, the name and description of the source notebook in the project are used.
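
A promote call combines the path parameter, the two required query parameters, and the body. The following is a minimal sketch that prepares the URL and body without sending anything; the function name `promote_request` is hypothetical, the base URL is passed in by the caller because the introduction gives a separate base URL for notebooks, and the IDs are sample values reused from this page.

```python
from urllib.parse import quote, urlencode


def promote_request(base_url, notebook_guid, version_guid, project_id, space_id,
                    name=None, description=None):
    """Return (url, body) for POST /v2/notebooks/{notebook_guid}/promote."""
    query = urlencode({"version_guid": version_guid, "project_id": project_id})
    path = f"/v2/notebooks/{quote(notebook_guid, safe='')}/promote"
    url = f"{base_url.rstrip('/')}{path}?{query}"
    body = {"space_id": space_id}
    # name and description are optional; when omitted, the values of the
    # source notebook are used.
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    return url, body


promote_url, promote_body = promote_request(
    "https://cluster.example.com",  # placeholder base URL
    "41d09a9a-f771-48a2-9534-50c0c622356d",
    "ca3c0e27-46ca-83d4-a646-d49b11c14de9",  # placeholder version guid
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "92ae0e27-9b11-4de9-a646-d46ca3c183d4",
)
print(promote_url)
print(promote_body)
```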

Response

Response Body

NotebookInSpaceWithoutRuntime

Notebook information in a space as returned by promoting a notebook from project to space.

Status Code

200
Success. Returned the notebook asset in the space.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.
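
To illustrate how the path, query, and body parameters above fit together, the following Python sketch assembles a promote request without sending it. The function name build_promote_request and the example host are hypothetical, and placing the endpoint under the /wx base URL is an assumption based on the note about notebook base URLs in the introduction.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_promote_request(cluster_url, token, notebook_guid, version_guid,
                          project_id, space_id, name=None, description=None):
    """Assemble a POST request for promoting a notebook; nothing is sent."""
    # Assumption: notebook endpoints live under the /wx base URL.
    query = urlencode({"version_guid": version_guid, "project_id": project_id})
    url = (f"https://{cluster_url}/wx/v2/notebooks/"
           f"{quote(notebook_guid, safe='')}/promote?{query}")
    # space_id is required; name and description are sent only when given.
    body = {"space_id": space_id}
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

The returned object can be passed to urllib.request.urlopen to send the request.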

Example responses

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Create a version of a given notebook.

POST /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Response Body

One of

A notebook version in a project.

Status Code

200
Success. Returned the notebook version definition.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A notebook version in a project

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 200: A notebook version in a space

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

List all versions of a particular notebook.

GET /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Response Body

One of

A list of notebook versions in a project.

Status Code

200
Success. Returned a list of versions of the notebook.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.
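
The list response in the examples for this method wraps each version in a resources array with metadata and entity objects. A minimal Python sketch for reducing such a response to a short summary might look like this. The function name summarize_versions is hypothetical, and reading created_at as milliseconds since the Unix epoch is an assumption based on the magnitude of the example value.

```python
from datetime import datetime, timezone


def summarize_versions(payload):
    """Return one summary dict per notebook version in a list response."""
    summaries = []
    for resource in payload.get("resources", []):
        metadata = resource.get("metadata", {})
        entity = resource.get("entity", {})
        # Assumption: created_at is milliseconds since the Unix epoch.
        created = datetime.fromtimestamp(metadata["created_at"] / 1000,
                                         tz=timezone.utc)
        summaries.append({
            "guid": metadata["guid"],
            "created_at": created.isoformat(),
            # The entity carries either project_id or space_id.
            "scope": "project" if "project_id" in entity else "space",
            "rev_id": entity.get("rev_id"),
        })
    return summaries
```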

Example responses

Status 200: A list of notebook versions in a project

{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  ]
}

Status 200: A list of notebook versions in a space

{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  ]
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Retrieve a particular version of a notebook.

GET /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.
version_guid
Required*
string
The guid of the version.

Response

Response Body

One of

A notebook version in a project.

Status Code

200
Success. Returned the version definition.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A notebook version in a project

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 200: A notebook version in a space

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Delete a particular version of a given notebook.

DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.
version_guid
Required*
string
The guid of the version.

Response

Status Code

204
Success. The version is deleted.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.
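
The 400, 401, and 403 responses of the notebook methods share one body shape: a trace value plus an errors array whose items carry code, message, and sometimes target. As a rough sketch, a client could flatten such a body into log-friendly lines as follows. The function name format_errors and the output format are illustrative choices, not part of the API.

```python
def format_errors(error_body):
    """Turn an error response body into a list of one-line messages."""
    trace = error_body.get("trace", "unknown")
    lines = []
    for err in error_body.get("errors", []):
        # The target object is absent in the 403 example, so it is optional here.
        target = err.get("target")
        where = f" ({target['type']} '{target['name']}')" if target else ""
        lines.append(
            f"[{err.get('code')}]{where} {err.get('message')} trace={trace}"
        )
    return lines
```

Including the trace value in every line can make it easier to correlate a failed call with server-side logs.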

Example responses

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

This creates a new prompt with the provided parameters.

POST /v1/prompts

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptPost

Response

Response Body

wxPromptResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
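
Because the prompt methods take either project_id or space_id as the target, a client may want to validate the target before building a request. The sketch below is one way to do that in Python. It reads "One target must be supplied per request" as exactly one of the two parameters, which is an interpretation, and the helper name build_target_params is hypothetical.

```python
import re

# Pattern quoted from the "Possible values" entries of both parameters.
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def build_target_params(project_id=None, space_id=None):
    """Return the query parameters that select a project or a space."""
    # Assumption: exactly one of the two targets is sent per request.
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    if project_id is not None:
        name, value = "project_id", project_id
    else:
        name, value = "space_id", space_id
    if not ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must match [a-zA-Z0-9-]*")
    return {name: value}
```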

This retrieves a prompt / prompt template with the given id.

GET /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
restrict_model_parameters
string
Return only the set of model parameters that is compatible with inferencing.

Default: true

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This updates a prompt / prompt template with the given id.

PATCH /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptPatch

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt / prompt template with the given id.

DELETE /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Modifies the current locked state of a prompt.

PUT /v1/prompts/{prompt_id}/lock

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
force
boolean
Override a lock if it is currently taken.

Request Body

Required*

promptLock

Response

Response Body

promptLock

Status Code

200
Ok - Returned when lock change is successful
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
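
As a small illustration of the optional force query parameter, the following Python sketch builds the URL for the lock method. The function name build_lock_url is hypothetical, the /wx base URL follows the note on prompt base URLs in the introduction, and sending the boolean as the string true is an assumption. The promptLock request body is not shown because its fields are not listed in this section.

```python
from urllib.parse import quote, urlencode


def build_lock_url(cluster_url, prompt_id, project_id, force=False):
    """Return the URL for PUT /v1/prompts/{prompt_id}/lock."""
    params = {"project_id": project_id}
    if force:
        # Added only when the caller wants to override an existing lock.
        # Assumption: the boolean is serialized as the string "true".
        params["force"] = "true"
    # Assumption: prompt endpoints live under the /wx base URL.
    return (f"https://{cluster_url}/wx/v1/prompts/"
            f"{quote(prompt_id, safe='')}/lock?{urlencode(params)}")
```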

Retrieves the current locked state of a prompt.

GET /v1/prompts/{prompt_id}/lock

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

promptLock

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.

POST /v1/prompts/{prompt_id}/input

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

wxPromptInputRequest

Response

Response Body

object

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This adds new chat items to the given prompt.

POST /v1/prompts/{prompt_id}/chat_items

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

chatItem[]

Response

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This creates a new prompt session.

POST /v1/prompt_sessions

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptSession

Response

Response Body

wxPromptResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This retrieves a prompt session with the given id.

GET /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
prefetch
boolean
Include the most recent entry

Response

Response Body

wxPromptSession

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This updates a prompt session with the given id.

PATCH /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

object

Response

Response Body

wxPromptSession

Status Code

200
OK - Returned when the update succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt session with the given id.

DELETE /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This creates a new prompt associated with the given session.

POST /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptSessionEntry

Response

Response Body

wxPromptSessionEntry

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

List entries from a given session.

GET /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
bookmark
string
Bookmark returned by a previous GET request whose results were limited

Possible values: Value must match regular expression [a-zA-Z0-9-]*
limit
string
Maximum number of results to retrieve. Default: 20

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

wxPromptSessionEntryList

Status Code

200
Success - Returned when search completes
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
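
The bookmark and limit parameters suggest cursor-style paging. A minimal Python sketch of a paging loop is shown below, under two loud assumptions that this section does not confirm: that the response lists its entries under a results key, and that it returns the next cursor under a bookmark key. The fetch_page callable is supplied by the caller and performs the actual GET request.

```python
def iter_entries(fetch_page, limit=20):
    """Yield session entries page by page until no bookmark is returned.

    fetch_page(params) is a caller-supplied function that performs the GET
    and returns the decoded JSON body as a dict.
    """
    # limit is documented as a string, so it is sent as one.
    params = {"limit": str(limit)}
    while True:
        page = fetch_page(dict(params))
        # Assumption: entries are under "results", the cursor under "bookmark".
        yield from page.get("results", [])
        bookmark = page.get("bookmark")
        if not bookmark:
            break
        params["bookmark"] = bookmark
```

Keeping the HTTP call behind fetch_page lets the loop be exercised with canned pages before it is pointed at a real cluster.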

This adds new chat items to the given entry.

POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

chatItem[]

Response

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Modifies the current locked state of a prompt session.

PUT /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
force
boolean
Override a lock if it is currently taken.

Request Body

Required*

promptLock

Response

Response Body

promptLock

Status Code

200
Ok - Returned when lock change is successful
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Retrieves the current locked state of a prompt session.

GET /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

promptLock

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This retrieves a prompt session entry with the given id.

GET /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt session entry with the given id.

DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters.

If stream is true, this operation will return the output tokens in a server-sent events (SSE) stream.

Since watsonx.ai 2.2.0.

POST /ml/v1/chat/completions

Auditing

Calling this method generates the following auditing event.

chat-completions.send

Request

Custom Headers

Accept
string
Allowable values: [application/json,text/event-stream]
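
When stream is true and the response is a text/event-stream, the client has to split the stream into events. The following Python sketch parses already-decoded text lines according to the general server-sent events format. It assumes that the data field of each event holds one JSON document, which this section does not state, and the function name iter_sse_data is hypothetical.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each event in an SSE line stream."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops a single leading space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields (id, event, retry) are skipped.
    if data_parts:
        # Flush an event that was not followed by a final blank line.
        yield json.loads("\n".join(data_parts))
```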

Request Body

Required*

ChatCompletionsRequest

From a given prompt, infer the next tokens.

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation. Content-Type: text/event-stream if stream is true.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden. An authorization error, for example when trying to access a space or project that the caller is not permitted to use.
404
The specified resource was not found.
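
In the tool_call example response for this method, the arguments value of each tool call is a JSON document encoded as a string, so it needs a second decoding step. A minimal Python sketch that collects the tool calls from a non-streamed response could look like this. The function name extract_tool_calls is hypothetical, and the field names are taken from the example response only.

```python
import json


def extract_tool_calls(response):
    """Return (name, arguments) pairs for every tool call in a chat response."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        # Plain chat messages have "content" and no "tool_calls" key.
        for call in message.get("tool_calls", []):
            function = call["function"]
            # "arguments" is a JSON document encoded as a string.
            calls.append((function["name"], json.loads(function["arguments"])))
    return calls
```

A response that contains only message content yields an empty list, so the same helper can be applied to every response.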

Example responses

Status 200: chat_completions

A chat example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model": "meta-llama/llama-3-8b-instruct",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 47,
    "prompt_tokens": 59,
    "total_tokens": 106
  }
}

Status 200: tool_call

A tool calling example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model": "meta-llama/llama-3-8b-instruct",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 18,
    "prompt_tokens": 19,
    "total_tokens": 37
  }
}

Status 200: json_mode

A chat example with JSON output.

{
  "id": "cmpl-09945b25c805491fb49e15439b8e5d84",
  "model": "meta-llama/llama-3-8b-instruct",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 20,
    "total_tokens": 55
  }
}

Infer the next tokens for a given deployed model with a set of parameters.

You can also use /ml/v1/chat/completions to achieve the same result.

Since watsonx.ai 2.1.0.

POST /ml/v1/text/chat

Auditing

Calling this method generates the following auditing event.

pm-20.text-chat.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextChatRequest

From a given prompt, infer the next tokens.

model_id
Required*
string
The model to use for the chat completion.

Please refer to the list of models.
messages
Required*
The messages for this chat session.

Possible values: 1 ≤ number of items ≤ 1000
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
tools
Tool functions that can be called with the response.

Possible values: 1 ≤ number of items ≤ 128
tool_choice_option
string
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.

Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.

Using none means the model will not call any tool and instead generates a message.

Using required means the model must call one or more tools.

Allowable values: [auto,none,required]
tool_choice
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
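
The relationship between tools, tool_choice_option, and tool_choice can be sketched in Python as follows. This is a minimal illustration, not part of the API: the helper name `build_tool_fields` and the placeholder tool are assumptions, and rejecting requests that set both fields is one reading of the "specify either ... or ..." wording above.

```python
ALLOWED_TOOL_CHOICE_OPTIONS = {"auto", "none", "required"}


def build_tool_fields(tools, tool_choice_option=None, forced_tool_name=None):
    """Return the tool-related fields of a chat request body.

    Pass tool_choice_option to let the model pick, or forced_tool_name to
    force one tool. Passing both is rejected (an assumption of this sketch).
    """
    if not 1 <= len(tools) <= 128:
        raise ValueError("tools must contain between 1 and 128 items")
    if tool_choice_option is not None and forced_tool_name is not None:
        raise ValueError("specify tool_choice_option or tool_choice, not both")

    fields = {"tools": tools}
    if forced_tool_name is not None:
        # Shape taken from the tool_choice description above.
        fields["tool_choice"] = {
            "type": "function",
            "function": {"name": forced_tool_name},
        }
    elif tool_choice_option is not None:
        if tool_choice_option not in ALLOWED_TOOL_CHOICE_OPTIONS:
            raise ValueError(f"unsupported tool_choice_option: {tool_choice_option!r}")
        fields["tool_choice_option"] = tool_choice_option
    return fields


# Placeholder tool definition for illustration only.
my_tool = {
    "type": "function",
    "function": {
        "name": "my_function",
        "description": "A placeholder tool",
        "parameters": {"type": "object", "properties": {}},
    },
}

print(build_tool_fields([my_tool], tool_choice_option="required"))
print(build_tool_fields([my_tool], forced_tool_name="my_function"))
```

The returned dictionary is meant to be merged into the rest of the request body alongside model_id and messages.
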
guided_choice
string[]
If specified, the output will be exactly one of the choices.
guided_regex
string
If specified, the output will follow the regex pattern.
guided_grammar
string
If specified, the output will follow the context-free grammar.
guided_json
object
If specified, the output will follow the JSON schema. See the JSON Schema reference for documentation about the format.
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
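
The two key_ref patterns from the setup steps can be assembled with simple string formatting. The sketch below is illustrative only: the function names and the sample values are hypothetical, and the ciphertext is passed through unchanged because the notes above say to keep it base64-encoded.

```python
def key_protect_key_ref(region, account_id, service_instance, key_id, ciphertext):
    """Format a key_ref for IBM Key Protect using the pattern from Step 2."""
    return (
        f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
        f"{service_instance}:key:{key_id}:wdek:{ciphertext}"
    )


def aws_kms_key_ref(region, account_id, key_id, ciphertext_blob):
    """Format a key_ref for AWS KMS using the pattern from Step 2."""
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


# All values below are made-up placeholders.
print(key_protect_key_ref("us-south", "12345", "67890", "abcd-1234", "BASE64CIPHERTEXT"))
print(aws_kms_key_ref("us-east-1", "111122223333", "abcd-1234", "BASE64CIPHERTEXTBLOB"))
```
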
chat_template_kwargs
object
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples:
{ "thinking": true }
frequency_penalty
number
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Possible values: -2 < value < 2
Default: 0
include_reasoning
boolean
Whether to include reasoning_content in the response. Default is true.
logit_bias
object
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples:
{ "1003": -100, "1004": -100 }
logprobs
boolean
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.

Default: false
top_logprobs
integer
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.

Possible values: 0 ≤ value ≤ 20
max_completion_tokens
integer
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.

Default: 1024
max_tokens
integer
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.

This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.

Default: 1024
n
integer
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

Default: 1
presence_penalty
number
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Possible values: -2 < value < 2
Default: 0
reasoning_effort
string
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.

Allowable values: [low,medium,high]
response_format
The chat response format parameters.
seed
integer
Random number generator seed to use in sampling mode for experimental repeatability.

Example: 41
stop
string[]
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.

Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples:
[ "this", "the" ]
stream
boolean
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.

Example: true
temperature
number
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.

Possible values: 0 < value < 2
Default: 1
top_p
number
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.

Possible values: 0 < value < 1
Default: 1
time_limit
integer
Time limit in milliseconds. If generation is not completed within this time, it stops, and the text generated so far is returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.

Possible values: value > 0
Example: 600000
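
Several of the constraints listed above can be checked on the client before a request is sent. The following is a rough sketch under stated assumptions: `check_chat_request` is a hypothetical helper, it covers only a handful of the documented rules, and it reads "either space_id or project_id has to be given" as "at least one of them". The service performs its own validation regardless.

```python
def check_chat_request(body):
    """Return a list of problems found in a chat request body (empty if none)."""
    problems = []

    if not body.get("model_id"):
        problems.append("model_id is required")

    messages = body.get("messages") or []
    if not 1 <= len(messages) <= 1000:
        problems.append("messages must contain between 1 and 1000 items")

    if not (body.get("space_id") or body.get("project_id")):
        problems.append("either space_id or project_id has to be given")

    stop = body.get("stop") or []
    if len(stop) > 4 or len(set(stop)) != len(stop):
        problems.append("stop allows at most 4 unique sequences")

    if "top_logprobs" in body:
        if not body.get("logprobs"):
            problems.append("top_logprobs requires logprobs to be true")
        if not 0 <= body["top_logprobs"] <= 20:
            problems.append("top_logprobs must be between 0 and 20")

    if "max_tokens" in body and "max_completion_tokens" in body:
        problems.append("max_tokens is ignored when max_completion_tokens is set")

    return problems


body = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "user", "content": [{"type": "text", "text": "Where was it played?"}]}
    ],
    "stop": ["this", "the"],
    "top_logprobs": 5,
}
print(check_chat_request(body))  # prints ['top_logprobs requires logprobs to be true']
```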

text chat

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    },
    {
      "role": "assistant",
      "content": "The Los Angeles Dodgers won the World Series in 2020."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Where was it played?"
        }
      ]
    }
  ],
  "max_tokens": 100,
  "temperature": 0,
  "time_limit": 1000
}'
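
For readers who prefer Python over curl, the same request can be prepared with the standard library. This is only a sketch: `build_chat_request`, the host name `cpd.example.com`, and the `<TOKEN>` placeholder are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(cluster_url, token, body, version="2023-10-25"):
    """Build (but do not send) a POST request for /ml/v1/text/chat."""
    url = f"https://{cluster_url}/ml/v1/text/chat?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


chat_body = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [{"type": "text", "text": "Who won the world series in 2020?"}],
        },
    ],
    "max_tokens": 100,
    "time_limit": 1000,
}

# Hypothetical host and token placeholder; substitute your own values.
request = build_chat_request("cpd.example.com", "<TOKEN>", chat_body)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```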

tool call

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the weather like in Boston today?"
        }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "description": "The city, e.g. San Francisco, CA",
              "type": "string"
            },
            "unit": {
              "enum": [
                "celsius",
                "fahrenheit"
              ],
              "type": "string"
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "get_current_weather"
    }
  }
}'

json mode

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    }
  ]
}'

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: text_chat

A text chat example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 47,
    "prompt_tokens": 59,
    "total_tokens": 106
  }
}

Status 200: tool_call

A tool calling example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 18,
    "prompt_tokens": 19,
    "total_tokens": 37
  }
}
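
In the tool calling response, the arguments field is a JSON document encoded as a string, so it needs a second decoding step before use. A minimal sketch, assuming the response shape shown above (the helper name `extract_tool_calls` is hypothetical):

```python
import json

# Trimmed copy of the tool calling example response.
tool_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n",
                        },
                    }
                ],
            },
            "finish_reason": "stop",
        }
    ]
}


def extract_tool_calls(response):
    """Return (call id, function name, decoded arguments) for each tool call."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls", []):
            function = call["function"]
            # arguments is a JSON string, not a nested object.
            calls.append((call["id"], function["name"], json.loads(function["arguments"])))
    return calls


for call_id, name, arguments in extract_tool_calls(tool_response):
    print(call_id, name, arguments)
```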

Status 200: json_mode

A text chat example with JSON output.

{
  "id": "cmpl-09945b25c805491fb49e15439b8e5d84",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 20,
    "total_tokens": 55
  }
}
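
With the json_object response format, the message content is itself a JSON string. A small sketch of decoding it follows. The helper name `parse_json_content` is hypothetical, and returning None on a decode failure is a design choice of this example, made on the assumption that output cut short (for example by a token or time limit) may not be valid JSON.

```python
import json

# Trimmed copy of the JSON output example response.
json_mode_response = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]",
            },
            "finish_reason": "stop",
        }
    ]
}


def parse_json_content(response, choice_index=0):
    """Decode the content of one choice as JSON, or return None if it is not valid JSON."""
    content = response["choices"][choice_index]["message"]["content"]
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        return None


print(parse_json_content(json_mode_response))
```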

Infer the next tokens for a given deployed model with a set of parameters. This operation sets stream in the request body to true and returns the output tokens as a stream of events.

You can also use /v1/chat/completions with stream option set to true to achieve the same result.

Since watsonx.ai 2.1.0.

POST /ml/v1/text/chat_stream

Auditing

Calling this method generates the following auditing event.

pm-20.text-chat.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextChatRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

model_id
Required*
string
The model to use for the chat completion.

Please refer to the list of models.
messages
Required*
The messages for this chat session.

Possible values: 1 ≤ number of items ≤ 1000
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
tools
Tool functions that can be called with the response.

Possible values: 1 ≤ number of items ≤ 128
tool_choice_option
string
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.

Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.

Using none means the model will not call any tool and instead generates a message.

Using required means the model must call one or more tools.

Allowable values: [auto,none,required]
tool_choice
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
guided_choice
string[]
If specified, the output will be exactly one of the choices.
guided_regex
string
If specified, the output will follow the regex pattern.
guided_grammar
string
If specified, the output will follow the context-free grammar.
guided_json
object
If specified, the output will follow the JSON schema. See the JSON Schema reference for documentation about the format.
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
chat_template_kwargs
object
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples:
{ "thinking": true }
frequency_penalty
number
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Possible values: -2 < value < 2
Default: 0
include_reasoning
boolean
Whether to include reasoning_content in the response. Default is true.
logit_bias
object
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples:
{ "1003": -100, "1004": -100 }
logprobs
boolean
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.

Default: false
top_logprobs
integer
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.

Possible values: 0 ≤ value ≤ 20
max_completion_tokens
integer
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.

Default: 1024
max_tokens
integer
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.

This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.

Default: 1024
n
integer
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.

Default: 1
presence_penalty
number
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Possible values: -2 < value < 2
Default: 0
reasoning_effort
string
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.

Allowable values: [low,medium,high]
response_format
The chat response format parameters.
seed
integer
Random number generator seed to use in sampling mode for experimental repeatability.

Example: 41
stop
string[]
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.

Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples:
[ "this", "the" ]
stream
boolean
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.

Example: true
temperature
number
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

We generally recommend altering this or top_p but not both.

Possible values: 0 < value < 2
Default: 1
top_p
number
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

We generally recommend altering this or temperature but not both.

Possible values: 0 < value < 1
Default: 1
time_limit
integer
Time limit in milliseconds. If generation is not completed within this time, it stops, and the text generated so far is returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.

Possible values: value > 0
Example: 600000

Response

Response Body

TextChatStreamItem[]

A set of server-sent events, where each event contains a response for one or more tokens. The result is an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
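
The data: {<json event>} framing can be decoded line by line. The sketch below is illustrative only: `iter_sse_events` is a hypothetical helper, it handles single-line data fields only, and the sample events (including the `delta` field) are an assumed shape, since this page gives no sample response for the streaming operation.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON object from each `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Assumed event shape, for illustration only.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Globe "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Life Field"}}]}',
    "",
]

text = "".join(
    event["choices"][0]["delta"].get("content", "")
    for event in iter_sse_events(sample_stream)
)
print(text)  # prints Globe Life Field
```

In practice the lines would come from the HTTP response body, decoded as UTF-8 and read incrementally.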

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

This operation is used mainly for hate, abuse, and profanity (HAP) and personally identifiable information (PII) filtering.

This is a detection-only endpoint. It supports natural language input and output and returns the result of the detection. It can be configured for HAP, PII, or any combination with other available detectors.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/text/detection

Auditing

Calling this method generates the following auditing event.

pm-20.text-detection-content.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextDetectionContentRequest

Run the selected detectors against the given text content.

input
Required*
string
The text to be examined.
detectors
Required*
The detectors to use. These can be IBM-provided HAP or PII detectors or a custom content detector.
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }

detect pii

curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "input": "my text to check",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "detectors": {
    "pii": {}
  }
}'

detect hap

curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "input": "my text to check",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "detectors": {
    "hap": {
      "threshold": 0.5
    }
  }
}'

detect multiple

curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "input": "my text to check",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "detectors": {
    "pii": {},
    "hap": {
      "threshold": 0.6
    }
  }
}'

Response

Response Body

TextDetectionContentResponse

The response for text detection.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: text PII detection

A PII text detection example.

{
  "detections": [
    {
      "start": 20,
      "end": 24,
      "detection_type": "pii",
      "detection": "xxxx",
      "score": 0.846
    }
  ]
}

Status 200: text HAP detection

A HAP text detection example.

{
  "detections": [
    {
      "start": 122,
      "end": 239,
      "detection_type": "hap",
      "detection": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
      "score": 0.846
    }
  ]
}

Status 200: text detection with multiple detectors

A text detection example with multiple detectors.

{
  "detections": [
    {
      "start": 20,
      "end": 24,
      "detection_type": "pii",
      "detection": "xxxx",
      "score": 0.846
    },
    {
      "start": 122,
      "end": 239,
      "detection_type": "hap",
      "detection": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
      "score": 0.846
    }
  ]
}
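
One way to act on a detection response is to mask the reported spans in the original input. The sketch below assumes that start and end are character offsets into the input with end exclusive, which matches the four-character PII example above but is not stated explicitly on this page. The helper name `mask_detections` and the sample text are hypothetical.

```python
def mask_detections(text, detections, mask_char="x", min_score=0.0):
    """Replace each detected span in text with mask_char.

    Assumes start/end are character offsets with end exclusive.
    Detections scoring below min_score are left untouched.
    """
    chars = list(text)
    for detection in detections:
        if detection.get("score", 0.0) < min_score:
            continue
        start = max(detection["start"], 0)
        end = min(detection["end"], len(chars))
        for i in range(start, end):
            chars[i] = mask_char
    return "".join(chars)


sample_text = "My phone number is: 1234, please call."
sample_detections = [
    {"start": 20, "end": 24, "detection_type": "pii", "detection": "1234", "score": 0.846}
]
print(mask_detections(sample_text, sample_detections))
```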

This operation supports context relevance and faithfulness (or groundedness).

The input is analyzed, along with the context information, and the model will return any detections that it found.

POST /ml/v1/text/detection/context

Auditing

Calling this method generates the following auditing event.

pm-20.text-detection-context.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextDetectionContextRequest

Run a detection model against the given text content.

Since Cloud Pak for Data 5.1.0.

input
Required*
string
The text to be examined.
detectors
Required*
The detectors to use. This is a map of detector names, each with a map of optional key/value pairs.
context_type
Required*
string
The type of the context.

Allowable values: [docs]
context
Required*
Context documents.

Possible values: 1 ≤ number of items ≤ 100
Examples:
[ "https://en.wikipedia.org/wiki/IBM", "https://research.ibm.com/" ]
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
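The two key_ref layouts above differ only in their prefix and separators, so they can be assembled with plain string formatting. The following Python sketch is illustrative only: the function names and sample values are hypothetical, and the default algorithm simply mirrors the example object above rather than a documented default.

```python
def key_protect_key_ref(region, account_id, service_instance, key_id, ciphertext):
    """Assemble a Key Protect key_ref in the CRN layout shown in Step 2."""
    return (
        f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
        f"{service_instance}:key:{key_id}:wdek:{ciphertext}"
    )


def aws_kms_key_ref(region, account_id, key_id, ciphertext_blob):
    """Assemble an AWS KMS key_ref in the ARN layout shown in Step 2."""
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


def crypto_object(key_ref, algorithm="AES-256"):
    """Wrap a key_ref in an object shaped like the example (assumed field names)."""
    return {"key_ref": key_ref, "algorithm": algorithm}


# Placeholder values only; the ciphertext stays base64-encoded, as noted above.
example = crypto_object(
    key_protect_key_ref("us-south", "12345", "67890", "abcd-1234", "Zm9v")
)
print(example["key_ref"])
```

Keeping the formatting in one place avoids hand-editing long references when a key is rotated.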

Response

Response Body

TextDetectionContextResponse

The response for text context detection.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

This operation supports answer relevance.

The prompt is analyzed together with the generated text, and the model returns any detections that it finds.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/text/detection/generated

Auditing

Calling this method generates the following auditing event.

pm-20.text-detection-generated.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextDetectionGeneratedRequest

Detect text in the supplied content by using a detection model.

prompt
Required*
string
The text prompt.
generated_text
Required*
string
The generated text.
detectors
Required*
The detectors to use. This is a map from each detector name to a map of optional key/value pairs.
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
crypto
Information required to retrieve a data encryption key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before you make inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create an IBM Key Protect instance, then generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
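The request-body fields above can be checked on the client before a call is made. The following Python sketch is a rough illustration, not the service's own validation: the function names are hypothetical, the detector name answer_relevance is an assumed placeholder (the reference does not list detector keys), and the sketch reads "either space_id or project_id" as exactly one of the two.

```python
import re

# Documented constraint for space_id and project_id: 36 characters of [a-zA-Z0-9-].
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def check_container_id(value):
    """Return value if it meets the documented length and pattern, else raise."""
    if len(value) != 36 or not ID_PATTERN.fullmatch(value):
        raise ValueError(f"expected 36 characters of [a-zA-Z0-9-], got {value!r}")
    return value


def build_generated_detection_request(prompt, generated_text, detectors,
                                      space_id=None, project_id=None):
    """Assemble a request body from the required fields plus one container id."""
    if (space_id is None) == (project_id is None):
        raise ValueError("give exactly one of space_id or project_id")
    body = {
        "prompt": prompt,
        "generated_text": generated_text,
        "detectors": detectors,
    }
    if space_id is not None:
        body["space_id"] = check_container_id(space_id)
    else:
        body["project_id"] = check_container_id(project_id)
    return body


body = build_generated_detection_request(
    "How far is Paris from Bangalore?",
    "About 7,800 km.",
    {"answer_relevance": {}},  # assumed detector name with no options
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(sorted(body))
```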

Response

Response Body

TextDetectionGeneratedResponse

The response for generated text detection.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Generate embeddings from text input.

See the documentation for a description of text embeddings.

Since watsonx.ai 2.0.0.

POST /ml/v1/text/embeddings

Auditing

Calling this method generates the following auditing event.

pm-20.text-embeddings.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

EmbeddingsRequest

The text input for a given model to be used to generate the embeddings.

model_id
Required*
string
The id of the model to be used for this request. Please refer to the list of models.
inputs
Required*
string[]
The input text.

Possible values: number of items ≤ 1000
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
parameters
Parameters for text embedding requests.
crypto
Information required to retrieve a data encryption key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before you make inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create an IBM Key Protect instance, then generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
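Because inputs accepts at most 1000 items per request, larger collections have to be split across several calls. The following Python sketch shows one way to do that under stated assumptions: the function name is hypothetical, and the model id slate is taken from the example in this section.

```python
def embedding_batches(texts, model_id, project_id, max_items=1000):
    """Yield request bodies whose inputs lists hold at most max_items strings each."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    for start in range(0, len(texts), max_items):
        yield {
            "model_id": model_id,
            "inputs": texts[start:start + max_items],
            "project_id": project_id,
        }


texts = [f"sentence {i}" for i in range(2500)]
batches = list(
    embedding_batches(texts, "slate", "12ac4cf1-252f-424b-b52d-5cdd9814987f")
)
print([len(batch["inputs"]) for batch in batches])
```

Each yielded body can then be sent as a separate POST; the order of the batches matches the order of the original list.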

generate embeddings

curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "inputs": [
    "Youth craves thrills while adulthood cherishes wisdom.",
    "Youth seeks ambition while adulthood finds contentment.",
    "Dreams chased in youth while goals pursued in adulthood."
  ],
  "model_id": "slate",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

EmbeddingsResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

An array of embeddings for each input string.

{
  "model_id": "slate",
  "results": [
    {
      "embedding": [
        -0.006929283,
        -0.005336422,
        -0.024047505
      ]
    }
  ],
  "created_at": "2024-02-21T17:32:28Z",
  "input_token_count": 10
}

Start a request to extract text and metadata from documents.

See the documentation for a description of text extraction.

Since watsonx.ai 2.1.0.

POST /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextExtractionRequest

The input for the text extraction request.

Examples:

simple request

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "tables_processing": {
      "enabled": true
    }
  }
}'

ocr request

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "ocr": {
      "languages_list": [
        "en"
      ]
    },
    "tables_processing": {
      "enabled": false
    }
  }
}'

multiple outputs

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "parameters": {
    "requested_outputs": [
      "assembly",
      "md"
    ],
    "mode": "high_quality",
    "ocr_mode": "enabled"
  }
}'
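The three request examples above share the same document_reference and results_reference shape and differ only in whether they carry steps or parameters. The following Python sketch builds such bodies programmatically. It is a minimal illustration: the helper names are hypothetical, and the field names and values are copied from the examples rather than from a full schema.

```python
import json


def connection_reference(connection_id, file_name):
    """Build a connection_asset reference shaped like the examples above."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": {"file_name": file_name},
    }


def build_extraction_request(project_id, document, results,
                             steps=None, parameters=None):
    """Assemble a request body; steps and parameters are added only when supplied."""
    body = {
        "project_id": project_id,
        "document_reference": document,
        "results_reference": results,
    }
    if steps is not None:
        body["steps"] = steps
    if parameters is not None:
        body["parameters"] = parameters
    return body


request_body = build_extraction_request(
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    connection_reference("6f5688fd-f3bf-42c2-a18b-49c0d8a1920d", "files/document.pdf"),
    connection_reference("2a7c11bc-2913-48d0-9581-a8d9f40fa159", "results"),
    parameters={
        "requested_outputs": ["assembly", "md"],
        "mode": "high_quality",
        "ocr_mode": "enabled",
    },
)
print(json.dumps(request_body, indent=2))
```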

Response

Response Body

TextExtractionResponse

The text extraction response.

Status Code

201
Created. The Content-Location header will contain the URI reference to the created resource.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201: A simple response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "tables_processing": {
        "enabled": true
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 201: A container response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "container",
      "location": {
        "path": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "container",
      "location": {
        "path": "results/"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled"
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 201: An OCR response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "ocr": {
        "languages_list": [
          "en",
          "fr"
        ]
      },
      "tables_processing": {
        "enabled": false
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 201: Multiple outputs.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly",
        "md"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled"
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Retrieve the list of text extraction requests for the specified space or project.

This operation does not keep a history: requests that were deleted or purged do not appear in this list.

Since watsonx.ai 2.1.0.

GET /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. The end user cannot determine this token; the service generates it and sets it in the href of the next field.
limit
integer
The number of resources to return. The default is 100 and the maximum is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Response

Response Body

TextExtractionResources

A paginated list of resources.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: Get all text extraction requests.

{
  "limit": 10,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text_extractions"
  },
  "resources": [
    {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "results": {
          "status": "completed",
          "number_pages_processed": 3,
          "running_at": "2023-05-02T16:28:03Z",
          "completed_at": "2023-05-02T16:29:31Z"
        }
      }
    }
  ]
}

Retrieve the text extraction request with the specified identifier.

Requests are retained for 2 days. After the retention period ends, the request is deleted and its results are no longer available; this operation then returns 404.

Since watsonx.ai 2.1.0.

GET /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.get

Request

Path Parameters

id
Required*
string
The identifier of the extraction request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

get results

curl --request GET 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Accept: application/json'

Response

Response Body

TextExtractionResponse

The text extraction response.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "tables_processing": {
        "enabled": true
      }
    },
    "results": {
      "status": "running",
      "number_pages_processed": 2,
      "running_at": "2023-05-02T16:28:03Z"
    }
  }
}

Status 200: An OCR response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "ocr": {
        "languages_list": [
          "en",
          "fr"
        ]
      },
      "tables_processing": {
        "enabled": false
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 200: Multiple outputs.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly",
        "md"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled"
    },
    "results": {
      "status": "running",
      "number_pages_processed": 2,
      "running_at": "2023-05-02T16:28:03Z"
    }
  }
}
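The sample responses show a request moving through the submitted, running, and completed statuses, so a client usually polls this operation until the status changes. The following Python sketch is one possible polling loop, not a prescribed client. It assumes that any status other than submitted or running is terminal (only the three statuses above appear in the examples), and fetch is a hypothetical caller-supplied function that performs the GET and returns the parsed JSON body.

```python
import time

# Statuses shown in the examples that indicate the work is not finished yet.
IN_PROGRESS = {"submitted", "running"}


def wait_for_extraction(fetch, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch() until entity.results.status leaves the in-progress statuses."""
    for attempt in range(max_polls):
        response = fetch()
        status = response["entity"]["results"]["status"]
        if status not in IN_PROGRESS:
            return response
        if attempt + 1 < max_polls:
            sleep(interval_seconds)
    raise TimeoutError(f"extraction still in progress after {max_polls} polls")


# Stand-in for real GET calls: three canned responses in sequence.
_canned = iter(
    {"entity": {"results": {"status": s}}}
    for s in ("submitted", "running", "completed")
)
final = wait_for_extraction(lambda: next(_canned), sleep=lambda seconds: None)
print(final["entity"]["results"]["status"])
```

Keep the total polling time well inside the 2-day retention period, because a deleted request returns 404 rather than a status.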

Cancel the specified text extraction request and delete any associated results.

Since watsonx.ai 2.1.0.

DELETE /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.delete

Request

Path Parameters

id
Required*
string
The identifier of the extraction request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

delete results

curl --request DELETE 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
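The same call can be prepared in Python, which makes it easier to add the optional hard_delete flag. The following sketch only builds the request object and does not send it. The function name, host name, and identifier are hypothetical placeholders, and the flag is sent as the literal string true on the assumption that the service accepts that form for boolean query parameters.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_delete_request(cluster_url, extraction_id, token, version,
                         project_id, hard_delete=False):
    """Build (but do not send) a DELETE request for one extraction request."""
    query = {"version": version, "project_id": project_id}
    if hard_delete:
        query["hard_delete"] = "true"  # also remove the job or request metadata
    url = (
        f"https://{cluster_url}/ml/v1/text/extractions/"
        f"{quote(extraction_id, safe='')}?{urlencode(query)}"
    )
    return Request(url, method="DELETE", headers={"Authorization": f"Bearer {token}"})


req = build_delete_request(
    "cluster.example.com",  # placeholder for {cluster_url}
    "extraction-id",        # placeholder for {id}
    "TOKEN",
    "2023-10-25",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    hard_delete=True,
)
print(req.get_method(), req.full_url)
```

Passing the object to urllib.request.urlopen would perform the call; a successful deletion returns 204 with no body.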

Response

Status Code

204
Request deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters.

This API is legacy; consider using Text Chat instead.

POST /ml/v1/text/generation

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextGenRequest

From a given prompt, infer the next tokens.

input
Required*
string
The prompt for which to generate completions. Note: The method tokenizes the input internally, so it is recommended not to leave any trailing spaces.
model_id
Required*
string
The id of the model to be used for this request. Please refer to the list of models.

Example: google/flan-ul2
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f

parameters

Properties that control the model and response.

moderations
Properties that control the moderations, for uses such as hate and profanity (HAP) and personally identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples:
{ "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
crypto
Information required to retrieve a data encryption key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before you make inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create an IBM Key Protect instance, then generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
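The request-body fields above combine into a single JSON object. The following Python sketch assembles one and applies the recommendation about trailing spaces before the prompt is sent. The function name is hypothetical; the parameter and moderation values are copied from the examples in this section and are not recommended settings.

```python
import json


def build_text_gen_request(model_id, prompt, project_id,
                           parameters=None, moderations=None):
    """Assemble a text generation body, trimming trailing spaces from the prompt."""
    body = {
        "model_id": model_id,
        "input": prompt.rstrip(" "),
        "project_id": project_id,
    }
    if parameters is not None:
        body["parameters"] = parameters
    if moderations is not None:
        body["moderations"] = moderations
    return body


gen_body = build_text_gen_request(
    "google/flan-ul2",
    "how far is paris from bangalore:  ",
    "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    parameters={"max_new_tokens": 100, "time_limit": 1000},
    moderations={
        "hap": {"output": {"enabled": True, "threshold": 0.5}},
        "pii": {"output": {"enabled": True}},
    },
)
print(json.dumps(gen_body, indent=2))
```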

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
  "model_id": "google/flan-t5-xxl",
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextGenResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A response without moderations.

The generated text from the model along with other details.

{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}

Status 200: A response with moderations.

The generated text from the model along with other details.

{
  "model_id": "google/flan-t5-xl",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
      "generated_token_count": 118,
      "input_token_count": 11,
      "stop_reason": "eos_token",
      "moderations": {
        "pii": [
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 74,
              "end": 88
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 200,
              "end": 212
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 244,
              "end": 259
            },
            "entity": "EmailAddress"
          }
        ]
      }
    }
  ]
}
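
The `position` values in each moderation entry can be used to locate the flagged span in `generated_text`. The sketch below assumes that `start` and `end` are 0-based character offsets with an exclusive end; the first entry in the sample above (74 to 88) is consistent with that reading, but the source does not state it. The function name `moderated_spans` and the sample data in the code are hypothetical.

```python
def moderated_spans(result):
    """Return (moderation_type, entity, flagged_text) for each output-side entry."""
    text = result["generated_text"]
    spans = []
    for kind, entries in result.get("moderations", {}).items():
        for entry in entries:
            if entry.get("input"):
                continue  # the position refers to the input, not to generated_text
            pos = entry["position"]
            # Assumption: 0-based offsets, end exclusive.
            spans.append((kind, entry.get("entity"), text[pos["start"]:pos["end"]]))
    return spans


# Hypothetical, shortened result in the same shape as the sample response.
example = {
    "generated_text": "call ****",
    "moderations": {
        "pii": [
            {
                "score": 0.8,
                "input": False,
                "position": {"start": 5, "end": 9},
                "entity": "PhoneNumber",
            }
        ]
    },
}
spans = moderated_spans(example)
```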

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

This API is legacy, consider using Text Chat Stream.

POST /ml/v1/text/generation_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextGenRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

input
Required*
string
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
model_id
Required*
string
The id of the model to be used for this request. Please refer to the list of models.

Example: google/flan-ul2
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f

parameters

Properties that control the model and response.

moderations
Properties that control the moderations, for uses such as hate and profanity (HAP) and personally identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples:
{ "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
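
The two key_ref formats above can be assembled with simple string formatting. The following is a minimal sketch; the function names and all sample values are hypothetical, and the `algorithm` value is copied from the example above. The ciphertext is passed through unchanged, in line with the note not to decode the base64 output.

```python
def key_protect_key_ref(region, account_id, service_instance, key_id, ciphertext):
    """Format a key_ref for IBM Key Protect from its parts."""
    return (
        f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
        f"{service_instance}:key:{key_id}:wdek:{ciphertext}"
    )


def aws_kms_key_ref(region, account_id, key_id, ciphertext_blob):
    """Format a key_ref for AWS KMS from its parts."""
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


def crypto_object(key_ref, algorithm="AES-256"):
    """Build the crypto object for a request body."""
    return {"key_ref": key_ref, "algorithm": algorithm}


# Hypothetical values for illustration only.
crypto = crypto_object(
    aws_kms_key_ref("us-east-1", "111122223333", "abcd-1234", "BASE64BLOB==")
)
```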

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "model_id": "google/flan-t5-xxl",
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextGenResponse[]

A set of server-sent events, where each event contains a response for one or more tokens. The results are returned as a sequence of events of the form data: {<json event>}, where the schema of each JSON event is described below.

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
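
Because the stream is described as events of the form `data: {<json event>}`, a client can decode it line by line. The sketch below is one way to do that, assuming each event arrives as a single `data:` line of text. The function name `parse_sse_events` and the sample events are hypothetical; the events are shaped like the TextGenResponse shown for the non-streaming operation.

```python
import json


def parse_sse_events(lines):
    """Yield the decoded JSON object from each `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        body = line[len("data:"):].strip()
        if body:
            yield json.loads(body)


# Hypothetical stream content for illustration only.
sample_stream = [
    'data: {"model_id": "google/flan-t5-xxl", "results": [{"generated_text": "4,000"}]}',
    "",
    'data: {"model_id": "google/flan-t5-xxl", "results": [{"generated_text": " km"}]}',
]
text = "".join(
    result["generated_text"]
    for event in parse_sse_events(sample_stream)
    for result in event["results"]
)
```

When reading from a network response, decode each line from bytes to text before passing it to the parser.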

Rerank texts based on some queries.

POST /ml/v1/text/rerank

Auditing

Calling this method generates the following auditing event.

pm-20.text-rerank.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

RerankRequest

The input texts and the queries for reranking.

model_id
Required*
string
The id of the model to be used for this request. Please refer to the list of models.
inputs
Required*
The rank input strings.

Possible values: 0 ≤ number of items ≤ 1000
query
Required*
string
The rank query.
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
parameters
The properties used for reranking.
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }

sample request

curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    {
      "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
    },
    {
      "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
    }
  ],
  "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
  "parameters": {
    "return_options": {
      "top_n": 2
    }
  }
}'

Response

Response Body

RerankResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

The rerank results, with an index and a score for each returned input.

{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "results": [
    {
      "index": 1,
      "score": 0.7461
    },
    {
      "index": 0,
      "score": 0.8274
    }
  ],
  "created_at": "2024-02-21T17:32:28Z",
  "input_token_count": 20
}
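
A client typically wants the input texts back in score order. The sketch below assumes that `index` is the 0-based position of the text in the request's `inputs` array, which the source does not state explicitly. The function name `ranked_inputs` and the short input texts are hypothetical; the indexes and scores are taken from the sample response.

```python
def ranked_inputs(inputs, response):
    """Pair each result with its input text, highest score first."""
    ordered = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    # Assumption: index is the 0-based position in the request's inputs array.
    return [(r["score"], inputs[r["index"]]["text"]) for r in ordered]


inputs = [{"text": "first input"}, {"text": "second input"}]
response = {
    "results": [
        {"index": 1, "score": 0.7461},
        {"index": 0, "score": 0.8274},
    ]
}
ranking = ranked_inputs(inputs, response)
```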

The text tokenize operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which then are converted to ids through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.

POST /ml/v1/text/tokenization

Auditing

Calling this method generates the following auditing event.

pm-20.text-tokenization.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextTokenizeRequest

The input string to tokenize.

model_id
Required*
string
The id of the model to be used for this request. Please refer to the list of models.

Example: google/flan-ul2
input
Required*
string
The input string to tokenize.

Example: Write a tagline for an alumni association: Together we
space_id
string
The space that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
parameters
The parameters for text tokenization.
crypto
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.

Setup Instructions

To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.

For IBM Key Protect

Step 1: Generate Wrapped DEK

Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
1. Navigate to your root key in the Key Protect dashboard
2. Click the three-dot menu → Select Envelope encryption
3. Select Wrap key for me
4. Click Wrap key
5. Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.

Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
```
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
```
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{
  "name": "task credentials",
  "description": "This is my task credentials",
  "type": "key_manager_api_key"
}'
```
Replace name and description fields accordingly.

For AWS KMS

Step 1: Generate Encrypted DEK (CiphertextBlob):

Use the AWS CLI to generate a wrapped DEK:
```
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
```
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)

Construct the key_ref parameter using the format below:
```
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
```
Step 3: Follow the official guide to configure Account Delegation.

This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "model_id": "google/flan-ul2",
  "input": "Write a tagline for an alumni association: Together we",
  "parameters": {
    "return_tokens": true
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextTokenizeResponse

The tokenization result.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: The response with the token count.

The response with the token count and the tokens, if requested.

{
  "model_id": "google/flan-ul2",
  "result": {
    "token_count": 11,
    "tokens": [
      "Write",
      "a",
      "tag",
      "line",
      "for",
      "an",
      "alumni",
      "associ",
      "ation:",
      "Together",
      "we"
    ]
  }
}
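
The response can be read with a small helper that also checks the count against the token list when the tokens were requested. This is a sketch only; the function name `token_summary` is hypothetical, and the sample data is the response shown above.

```python
def token_summary(response):
    """Return (token_count, tokens); tokens is empty when none were returned."""
    result = response["result"]
    tokens = result.get("tokens")  # present only when the tokens were requested
    if tokens is not None and len(tokens) != result["token_count"]:
        raise ValueError("token_count does not match the number of tokens")
    return result["token_count"], tokens or []


sample = {
    "model_id": "google/flan-ul2",
    "result": {
        "token_count": 11,
        "tokens": [
            "Write", "a", "tag", "line", "for", "an",
            "alumni", "associ", "ation:", "Together", "we",
        ],
    },
}
count, tokens = token_summary(sample)
```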

Generate forecasts, or predictions for future time points, given historical time series data.

POST /ml/v1/time_series/forecast

Auditing

Calling this method generates the following auditing event.

pm-20.time-series-forecast.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TSForecastRequest

The forecast request.

Response

Response Body

TSForecastResponse

The time series forecast response.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "model_id": "ibm/ttm-1024-96-r2",
  "created_at": "2020-05-02T16:27:51Z",
  "results": [
    {
      "date": [
        "2020-01-05T02:00:00",
        "2020-01-05T03:00:00",
        "2020-01-06T00:00:00"
      ],
      "ID1": [
        "D1",
        "D1",
        "D1"
      ],
      "TARGET1": [
        1.86,
        3.24,
        6.78
      ]
    }
  ],
  "input_data_points": 512,
  "output_data_points": 1024
}
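
Each entry in `results` is columnar: one list per column, all of the same length. For row-wise processing, the columns can be zipped together. The sketch below assumes every column in an entry has the same number of values, as in the sample; the function name `forecast_rows` is hypothetical.

```python
def forecast_rows(result):
    """Turn one columnar results entry into a list of row dictionaries."""
    columns = list(result)
    # Assumption: every column holds the same number of values.
    return [
        dict(zip(columns, values))
        for values in zip(*(result[column] for column in columns))
    ]


sample_result = {
    "date": ["2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.86, 3.24, 6.78],
}
rows = forecast_rows(sample_result)
```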

Create a new watsonx.ai training in a project or a space.

To deploy the tuned model, follow these steps:

Create a WML model asset, in a space or a project, by providing the request.json as shown below:

curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{
     "name": "replace_with_a_meaningful_name",
     "space_id": "replace_with_your_space_id",
     "type": "prompt_tune_1.0",
     "software_spec": {
       "name": "watsonx-textgen-fm-1.0"
     },
     "metrics": [ from the training job ],
     "training": {
       "id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
       "base_model": {
         "model_id": "google/flan-t5-xl"
       },
       "task_id": "generation",
       "verbalizer": "Input: {{input}} Output:"
     },
     "training_data_references": [
       {
         "connection": {
           "id": "20933468-7e8a-4706-bc90-f0a09332b263"
         },
         "id": "file_to_tune1.json",
         "location": {
           "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
           "path": "file_to_tune1.json"
         },
         "type": "connection_asset"
       }
     ]
   }'

Notes:

If you set the training request field auto_update_model: true, you can skip this step because the model is saved at the end of the training job.
Rather than creating the model payload yourself, you can use the generated request.json that is stored under the results_reference field. Look for its path in the field entity.results_reference.location.model_request_path.
The model type must be prompt_tune_1.0.
The software spec name must be watsonx-textgen-fm-1.0.

Create a tuned model deployment as described in the create deployment documentation.

POST /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

pm-20.training.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TrainingResourcePrototype

The training_data_references contain the training datasets and the results_reference the connection where results will be stored.

Examples:

Prompt tuning

curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "name": "my-prompt-tune-training",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "prompt_tuning": {
    "base_model": {
      "model_id": "google/flan-t5-xl"
    },
    "task_id": "classification",
    "tuning_type": "prompt_tuning",
    "num_epochs": 30,
    "learning_rate": 0.4,
    "accumulate_steps": 3,
    "batch_size": 10,
    "max_input_tokens": 100,
    "max_output_tokens": 100
  },
  "training_data_references": [
    {
      "id": "tune1_data.json",
      "location": {
        "path": "tune1_data.json"
      },
      "type": "container"
    }
  ],
  "auto_update_model": true,
  "results_reference": {
    "location": {
      "path": "tune1/results"
    },
    "type": "container"
  }
}'

Response

Response Body

TrainingResource

Training resource.

Status Code

201
The training job has been created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of trainings for the specified space or project.

GET /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

pm-20.training.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
total_count
boolean
Compute the total count. May have performance impact.
tag.value
string
Return only the resources with the given tag value.
state
string
Filter based on the training job state.

Allowable values: [queued,pending,running,storing,completed,failed,canceled]
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
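
The constraints listed for these query parameters can be checked on the client before the call is made. The following is a minimal sketch; the function name `training_list_query` is hypothetical, and treating space_id and project_id as mutually exclusive is an assumption, since the source only says that one of them has to be given.

```python
import re
from urllib.parse import urlencode

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")
STATES = {"queued", "pending", "running", "storing", "completed", "failed", "canceled"}


def training_list_query(version, space_id=None, project_id=None, limit=100, state=None):
    """Return a validated query string for GET /ml/v4/trainings."""
    scope = {k: v for k, v in (("space_id", space_id), ("project_id", project_id)) if v}
    if len(scope) != 1:
        # Assumption: exactly one of the two identifiers is expected.
        raise ValueError("give either space_id or project_id")
    for name, value in scope.items():
        if len(value) != 36 or not ID_PATTERN.match(value):
            raise ValueError(f"{name} must be 36 letters, digits, or hyphens")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if state is not None and state not in STATES:
        raise ValueError(f"unknown state: {state}")
    params = {"version": version, **scope, "limit": limit}
    if state is not None:
        params["state"] = state
    return urlencode(params)


query = training_list_query(
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    limit=50,
    state="running",
)
```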

Response

Response Body

TrainingResourceCollection

Information for paging when querying resources.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the training with the specified identifier.

GET /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

pm-20.training.get

Request

Path Parameters

training_id
Required*
string
The training identifier.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Response Body

TrainingResource

Training resource.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
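
A training is usually polled with this operation until it finishes. The sketch below keeps the HTTP call out of scope: `fetch_state` is a hypothetical callable that returns the current state string, because the source does not show where the state appears in the response. Treating completed, failed, and canceled as final states is also an assumption, based only on the allowable state values listed for the list operation.

```python
import time

# Assumption: these states are final and no further transitions follow.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(fetch_state, interval_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call fetch_state until it returns a terminal state, then return that state."""
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        sleep(interval_seconds)
    raise TimeoutError("training did not reach a terminal state in time")


# Simulated sequence of states in place of real API calls.
states = iter(["queued", "running", "storing", "completed"])
final_state = wait_for_training(lambda: next(states), sleep=lambda _: None)
```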

Cancel or delete the specified training. Once deleted, all trace of the job is gone.

DELETE /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

pm-20.training.delete

Request

Path Parameters

training_id
Required*
string
The training identifier.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

Response

Status Code

204
Training cancelled.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
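
The request URL for this operation combines the path parameter with the query parameters described above. The following is a minimal sketch of building it; the function name `training_delete_url` and the host `cluster.example.com` are hypothetical, and the training ID is reused from the model asset example earlier in this section.

```python
from urllib.parse import quote, urlencode


def training_delete_url(cluster_url, training_id, version,
                        project_id=None, space_id=None, hard_delete=False):
    """Build the URL for DELETE /ml/v4/trainings/{training_id}."""
    params = {"version": version}
    if project_id:
        params["project_id"] = project_id
    if space_id:
        params["space_id"] = space_id
    if hard_delete:
        # Also delete the job or request metadata.
        params["hard_delete"] = "true"
    path = quote(training_id, safe="")
    return f"https://{cluster_url}/ml/v4/trainings/{path}?{urlencode(params)}"


url = training_delete_url(
    "cluster.example.com",  # hypothetical host
    "05859469-b25b-420e-aefe-4a5cb6b595eb",
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    hard_delete=True,
)
```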

This creates a new vector index with the provided parameters.

POST /v1/vector_indexes

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPost

Response

Response Body

vectorIndexResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 201: create_vector_index

Create a vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
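
The `created_at` and `last_updated_at` values are 13-digit integers, which suggests milliseconds since the Unix epoch; the source does not state the unit, so the sketch below treats that as an assumption. The function name `from_epoch_millis` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_millis(millis):
    """Convert integer milliseconds since the Unix epoch to an aware UTC datetime."""
    # Assumption: the API reports timestamps in milliseconds.
    return EPOCH + timedelta(milliseconds=millis)


created = from_epoch_millis(1739888788777)
updated = from_epoch_millis(1739888804362)
elapsed_seconds = (updated - created).total_seconds()
```

Integer arithmetic with `timedelta` avoids the rounding that can occur when dividing the value by 1000 as a float.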

This retrieves a vector index with the given id.

GET /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: get_vector_indexes

Get vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}

This updates a vector index with the given id.

PATCH /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPatch

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned from PATCH when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: patch_vector_indexes

Response with updated vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-Patched",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}

This deletes a vector index with the given id.

DELETE /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

TO BE USED ONLY WITH IN-MEMORY VECTOR STORE. This is to update the attachments/objects associated with the vector index.

PUT /v1/vector_indexes/{index_id}/attachment

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPut

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned when the attachment update is successful.
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This creates a new vector index with the provided parameters.

POST /v1/transactional_vector_indexes

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexTransactionalPost

Response

Response Body

vectorIndexResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 201: create_vector_index

Create a vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
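To show how a client might read a response shaped like the sample above, here is a small Python sketch that extracts a few fields. The helper name `summarize_vector_index` is hypothetical, and treating `created_at` as a Unix timestamp in milliseconds is an assumption based on the magnitude of the sample value, not something the reference states.

```python
import json
from datetime import datetime, timezone


def summarize_vector_index(payload):
    """Pick out a few fields from a vectorIndexResponse-shaped JSON string."""
    doc = json.loads(payload)
    return {
        "id": doc["id"],
        "name": doc["name"],
        "store_type": doc["store"]["type"],
        "embedding_model_id": doc["settings"]["embedding_model_id"],
        # Assumption: created_at is a Unix timestamp in milliseconds.
        "created_at": datetime.fromtimestamp(
            doc["created_at"] / 1000, tz=timezone.utc
        ),
        "status": doc["status"],
    }


# Trimmed copy of the sample 201 response.
sample = """
{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "created_at": 1739888788777,
  "store": {"type": "watsonx.data", "database": "default"},
  "settings": {
    "chunk_size": 2000,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2"
  },
  "status": "ready"
}
"""

summary = summarize_vector_index(sample)
```

Reading only the fields a client needs keeps the code tolerant of additional properties that the service may return.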

This updates a vector index with the given id.

PATCH /v1/transactional_vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPatch

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned when the update succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: patch_vector_index_transactional

Response with updated vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-Patched",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
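The PATCH call can be sketched in Python as follows. The schema of `wxVectorIndexPatch` is not shown in this section, so the `{"name": ...}` body is only an assumption inferred from the renamed index in the sample response. The `/wx` base URL, the helper name `build_patch_request`, and the host are likewise assumptions. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_patch_request(cluster_url, index_id, project_id, token, changes):
    """Build (but do not send) a PATCH request for a transactional vector index."""
    query = urllib.parse.urlencode({"project_id": project_id})
    # Assumption: the endpoint lives under the /wx base URL.
    url = (
        f"https://{cluster_url}/wx/v1/transactional_vector_indexes/"
        f"{index_id}?{query}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(changes).encode("utf-8"),  # JSON request body
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


patch_request = build_patch_request(
    "cluster.example.com",  # hypothetical cluster host
    "43499a2a-7656-43d6-8ce0-374d34449d4f",
    "my-project-id",  # hypothetical project ID
    "my-token",  # placeholder bearer token
    {"name": "Milvus-VI-Patched"},  # hypothetical body; check the schema
)
```

Check the `wxVectorIndexPatch` schema for the fields that can actually be updated before relying on this body.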

This retrieves the status of a vector index with the given id.

GET /v1/transactional_vector_indexes/{index_id}/status

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

vectorIndexStatusResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: get_vector_index_status

Get vector index status.

{
  "status": "COMPLETED",
  "asset": {
    "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
    "name": "Milvus-VI-New",
    "description": "",
    "created_at": 1739888788777,
    "created_by": "IBMid-6910003SE8",
    "last_updated_at": 1739888804362,
    "last_updated_by": "IBMid-6910003SE8",
    "data_assets": [
      "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
    ],
    "store": {
      "type": "watsonx.data",
      "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
      "index": "wx_test_collection_japanese",
      "new_index": true,
      "database": "default"
    },
    "settings": {
      "chunk_size": 2000,
      "chunk_overlap": 200,
      "split_pdf_pages": true,
      "top_k": 3,
      "rerank": false,
      "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
      "schema_fields": {
        "document_name": "document_name",
        "text": "text",
        "page_number": "page"
      }
    },
    "build": {
      "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
      "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
    },
    "status": "ready"
  }
}
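A client that creates a transactional vector index will probably want to poll this status endpoint until the build finishes. The sketch below shows one way to do that in Python. Only the `COMPLETED` value appears in the sample above; the `RUNNING` value used in the demonstration, the helper name `wait_until_completed`, and the retry settings are assumptions. The HTTP call is passed in as a function so the polling logic can run without a cluster.

```python
import time


def wait_until_completed(fetch_status, max_attempts=10, delay_seconds=2.0,
                         sleep=time.sleep):
    """Call fetch_status() until the top-level status is COMPLETED.

    fetch_status must return the parsed JSON body of the status endpoint.
    Returns the nested "asset" object, or raises TimeoutError.
    """
    for attempt in range(max_attempts):
        body = fetch_status()
        if body.get("status") == "COMPLETED":
            return body["asset"]
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # wait before the next poll
    raise TimeoutError(f"index not completed after {max_attempts} attempts")


# Demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"status": "RUNNING"},  # hypothetical intermediate status
    {"status": "COMPLETED",
     "asset": {"id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
               "status": "ready"}},
])
asset = wait_until_completed(lambda: next(canned), sleep=lambda _: None)
```

In real use, `fetch_status` would issue the GET request with the bearer token and `project_id` query parameter and return the decoded JSON.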
