watsonx.ai | IBM Cloud API Docs

Introduction to IBM watsonx.ai as a Service

Last updated: 2025-07-02

Using the IBM watsonx.ai as a Service APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).

If you are looking for the IBM watsonx.ai software APIs, see here.

Step-by-step instructions on how to use IBM watsonx.ai as a Service can be found here.

A dedicated Python library is available for accessing this REST API.

Endpoint URLs

The following URLs are the base URLs for the watsonx.ai API endpoints. When you call the API, take the base URL for your region and append the path for each method to form the complete API endpoint for your request.

Dallas - https://us-south.ml.cloud.ibm.com
Frankfurt - https://eu-de.ml.cloud.ibm.com
London - https://eu-gb.ml.cloud.ibm.com
Tokyo - https://jp-tok.ml.cloud.ibm.com
Sydney - https://au-syd.ml.cloud.ibm.com
Toronto - https://ca-tor.ml.cloud.ibm.com
Mumbai - https://ap-south-1.aws.wxai.ibm.com

Note that for prompts, notebooks, vector indexes and agent tools the base URLs are the following:

Dallas - https://api.dataplatform.cloud.ibm.com/wx
Frankfurt - https://api.eu-de.dataplatform.cloud.ibm.com/wx
London - https://api.eu-gb.dataplatform.cloud.ibm.com/wx
Tokyo - https://api.jp-tok.dataplatform.cloud.ibm.com/wx
Sydney - https://api.au-syd.dai.cloud.ibm.com/wx
Toronto - https://api.ca-tor.dai.cloud.ibm.com/wx
Mumbai - https://api.ap-south-1.aws.data.ibm.com/wx

Example request to a Dallas endpoint:

curl -H "Authorization: Bearer {token}" -X {request_method} "https://us-south.ml.cloud.ibm.com/{method_endpoint}"

Replace {request_method} and {method_endpoint} in this example with the values for your particular API call. See the Authentication section below for more details about the bearer {token}.
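The way a base URL and a method path combine can be sketched in Python. This is a minimal illustration, not part of any IBM library: the `BASE_URLS` keys and the `build_endpoint` helper are hypothetical names, and only the URLs themselves come from the list above.

```python
from urllib.parse import urlencode

# Base URLs copied from the endpoint list; the lowercase region keys are
# hypothetical labels chosen for this sketch.
BASE_URLS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
    "london": "https://eu-gb.ml.cloud.ibm.com",
    "tokyo": "https://jp-tok.ml.cloud.ibm.com",
    "sydney": "https://au-syd.ml.cloud.ibm.com",
    "toronto": "https://ca-tor.ml.cloud.ibm.com",
    "mumbai": "https://ap-south-1.aws.wxai.ibm.com",
}


def build_endpoint(region: str, method_path: str, **query: str) -> str:
    """Join a regional base URL, a method path, and optional query parameters."""
    base = BASE_URLS[region.lower()]  # raises KeyError for an unknown region
    url = base.rstrip("/") + "/" + method_path.lstrip("/")
    if query:
        url += "?" + urlencode(query)
    return url


print(build_endpoint("dallas", "/ml/v4/ai_services", version="2024-10-17"))
```

Keeping the region lookup in one place means a deployment can switch regions without touching the method paths.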

Authentication

This API uses IBM Cloud Identity and Access Management (IAM) to authenticate requests.

To work with the API, authenticate your application or service by including your IBM Cloud IAM access token in API requests.

IAM authentication. Replace {token}, {request_method}, and {url}/{method} with the values for your call.

curl -H "Authorization: Bearer {token}" -X {request_method} "{url}/{method}"

Authorization: Bearer {token}

For example, if the token is tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5 in the service credentials, include the credentials in your call like this:

curl -H "Authorization: Bearer tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5" -X GET "https://us-south.ml.cloud.ibm.com/ml/v4/models"
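The same bearer-token header can be attached from Python with the standard library. The sketch below only builds the request object and does not send it; `authorized_request` is a hypothetical helper name, and obtaining the IAM token itself is outside its scope.

```python
import urllib.request


def authorized_request(url: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that carries an IAM bearer token."""
    if not token:
        raise ValueError("an IAM access token is required")
    return urllib.request.Request(
        url,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


# "{token}" is left as a placeholder, exactly as in the curl example.
req = authorized_request("https://us-south.ml.cloud.ibm.com/ml/v4/models", "{token}")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; that step is omitted here because it needs a valid token.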

Error handling

This API uses standard HTTP response codes to indicate whether a method completed successfully. A 2xx response indicates success.

HTTP Code	Description	Recovery
`200`	Success	The request was successful.
`400`	Bad Request	The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body.
`401`	Unauthorized	You are not authorized to make this request. Log in and try again or provide a valid token. See Authenticating with IAM tokens for instructions on logging in. If this error persists, contact the account owner to check your permissions.
`403`	Forbidden	The supplied authentication is not authorized.
`404`	Not Found	The requested resource could not be found.

Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.

Error response

Name	Description
trace	An identifier that can be used to trace the request. This can be set using `X-Global-Transaction-Id`.
errors	The list of errors.

Errors

Name	Description
code	A simple string code that should convey the general sense of the error.
message	The message that describes the error.
more_info	A reference to a more detailed explanation when available.
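A client might read the error response fields described above along these lines. This is a sketch under assumptions: the `ApiError` class and `parse_error_response` function are hypothetical names, and the sample body is invented for illustration rather than captured from the service.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ApiError:
    code: str
    message: str
    more_info: Optional[str] = None


def parse_error_response(body: str) -> Tuple[Optional[str], List[ApiError]]:
    """Return (trace, errors) extracted from an error response body."""
    doc = json.loads(body)
    errors = [
        ApiError(e.get("code", ""), e.get("message", ""), e.get("more_info"))
        for e in doc.get("errors", [])
    ]
    return doc.get("trace"), errors


# Hypothetical body: the field names follow the tables above, the values do not
# come from a real response.
sample = json.dumps({
    "trace": "example-trace-id",
    "errors": [{"code": "invalid_request", "message": "version is required"}],
})
trace, errors = parse_error_response(sample)
for err in errors:
    print(f"[{trace}] {err.code}: {err.message}")
```

Logging the trace value alongside each error makes it easier to correlate a failed call with service-side records.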

Additional headers

Some additional headers might be required to make successful requests to the API. Those additional headers are described below.

An optional transaction ID can be passed to your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.

If no transaction ID is passed in, one is generated randomly.
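As an illustration, a caller could set the header as follows. The `with_transaction_id` helper is a hypothetical name, and using a UUID as the default value is an assumption of this sketch, since any value is accepted.

```python
import uuid
from typing import Dict, Optional


def with_transaction_id(headers: Dict[str, str],
                        transaction_id: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of headers with X-Global-Transaction-Id set."""
    merged = dict(headers)  # leave the caller's dictionary untouched
    # Fall back to a random hex string when the caller supplies no identifier.
    merged["X-Global-Transaction-Id"] = transaction_id or uuid.uuid4().hex
    return merged


headers = with_transaction_id({"Authorization": "Bearer {token}"}, "order-1234")
print(headers["X-Global-Transaction-Id"])
```

Reusing one identifier across several related calls lets them be traced together.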

API change log

This change log describes the latest changes, improvements, and updates for the watsonx.ai API. Changes are listed in order of release date. Changes to existing API versions are designed to be compatible with existing client applications; if a change is not compatible, a new version date is created.

14 March 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically.

18 April 2024

The /ml/v1/text/embeddings API was added to watsonx.ai. This is a non-breaking change that adds a single API operation.

Versioning

API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.

When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.
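One way to make sure every request carries a well-formed version parameter is sketched below. The `with_version` helper is a hypothetical name; it checks only the date format and does not know which version dates the service accepts.

```python
import re
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str) -> str:
    """Return url with its version query parameter set to a YYYY-MM-DD date."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("version must use the form YYYY-MM-DD")
    date.fromisoformat(version)  # rejects impossible dates such as 2024-02-30
    parts = urlsplit(url)
    # Drop any existing version value so the parameter appears exactly once.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("https://us-south.ml.cloud.ibm.com/ml/v1/text/embeddings", "2024-03-14"))
```

Pinning the date in one helper keeps the whole client on a single version until you decide to move to a newer one.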

Active Version Dates

Version date	Summary of changes
`2024-03-14`	Publication of the `/ml/v1` APIs.

Data References

Accessing data in a remote location (such as a Cloud Object Storage bucket or an SQL or NoSQL database) requires a connection_asset or data_asset reference type. These reference types are created within a space or a project and are referenced in requests to represent input data and results locations. They contain two parameter objects, connection and location, whose required values depend on the reference type. A data_asset requires an href in the location object, whereas a connection_asset requires the connection's id in the connection object and location fields that depend on the data source type.

Example connection_asset payload:

{
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": {
        "id": "<connection_guid>"
      },
      "location": {
        "<wdp-properties depending on the type>": "<value depending on the type>"
      }
    }
  ]
}

Example data_asset payload:

{
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  ]
}

Example container payload:

{
  "training_data_references": [
    {
      "location":{
        "path":"filename_in_project_or_space"
      },
      "type":"container"
    }
  ]
}
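The three payload shapes above can be produced by small builder functions, sketched here in Python. The function names are hypothetical, and the `bucket` and `file_name` location fields are assumed for a Cloud Object Storage connection; other data sources need different location fields.

```python
import json
from typing import Any, Dict


def connection_asset_ref(connection_id: str, location: Dict[str, Any]) -> Dict[str, Any]:
    """Reference data through a connection; location fields depend on the source type."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def data_asset_ref(asset_id: str, space_id: str) -> Dict[str, Any]:
    """Reference a data asset by its href within a space."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def container_ref(path: str) -> Dict[str, Any]:
    """Reference a file stored in the project or space container."""
    return {"type": "container", "location": {"path": path}}


payload = {
    "training_data_references": [
        connection_asset_ref("<connection_guid>", {"bucket": "<bucket>", "file_name": "<file>"}),
        data_asset_ref("<asset_id>", "<space_id>"),
        container_ref("filename_in_project_or_space"),
    ]
}
print(json.dumps(payload, indent=2))
```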

Activity Tracker events

You can monitor API activity within your account by using the IBM Cloud Activity Tracker service. Whenever an API method is called, an event is generated that you can then track and audit from within Activity Tracker. The specific event type is listed for each individual method.

Create a new AI service with the given payload. An AI service is code that can be deployed as a deployment.

POST /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

AIServiceRequest

Payload for creating the AI service. Either space_id or project_id must be provided.

Examples:

Create request AI Service

curl --request POST 'https://{cluster_url}/ml/v4/ai_services?version=2024-10-17' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "name": "ai-service-1",
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "software_spec": {
    "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
  },
  "documentation": {
    "request": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "parameters": {
            "properties": {
              "max_new_tokens": {
                "type": "integer"
              },
              "top_p": {
                "type": "number"
              }
            },
            "required": [
              "max_new_tokens",
              "top_p"
            ]
          }
        },
        "required": [
          "query"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "image": {
            "type": "string",
            "format": "binary"
          }
        },
        "required": [
          "image"
        ]
      }
    },
    "response": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "result": {
            "type": "string"
          }
        },
        "required": [
          "query",
          "result"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "string",
        "format": "binary"
      }
    }
  }
}'

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

201
AI service created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201: The AI service response.

The response with the result.

{
  "metadata": {
    "id": "b53c5118-b1ca-43ef-a597-ef839ff7129f",
    "name": "ai-app-1",
    "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "created_at": "2023-05-02T16:27:51Z"
  },
  "entity": {
    "software_spec": {
      "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
    },
    "documentation": {
      "request": {
        "application/json": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "query": {
              "type": "string"
            },
            "parameters": {
              "properties": {
                "max_new_tokens": {
                  "type": "integer"
                },
                "top_p": {
                  "type": "number"
                }
              },
              "required": [
                "max_new_tokens",
                "top_p"
              ]
            }
          },
          "required": [
            "query"
          ]
        },
        "application/png": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "image": {
              "type": "string",
              "format": "binary"
            }
          },
          "required": [
            "image"
          ]
        }
      },
      "response": {
        "application/json": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "object",
          "properties": {
            "query": {
              "type": "string"
            },
            "result": {
              "type": "string"
            }
          },
          "required": [
            "query",
            "result"
          ]
        },
        "application/png": {
          "$schema": "http://json-schema.org/draft-07/schema#",
          "type": "string",
          "format": "binary"
        }
      }
    }
  }
}

Retrieve the AI services for the specified space or project.

GET /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
tag.value
string
Return only the resources with the given tag values. Separate multiple tags with `or` or `and`.

Example: tf2.0 or tf2.1
search
string
Returns only resources that match this search string. The path to the field must be the complete path to the field, and this field must be one of the indexed fields for this resource type. Note that the search string must be URL encoded.

Possible values: length ≥ 1

Retrieve all AI services

curl --request GET 'https://{cluster_url}/ml/v4/ai_services?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
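The list parameters and the token-based pagination described above can be sketched in Python. Several things here are assumptions: the helper names are hypothetical, the sketch treats space_id and project_id as mutually exclusive, and it assumes a page body shaped like `{"next": {"href": "..."}}`, based on the description of the start parameter.

```python
from typing import Any, Dict, Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def list_query(version: str, space_id: Optional[str] = None,
               project_id: Optional[str] = None, limit: int = 100,
               start: Optional[str] = None) -> str:
    """Build the query string for a list call, enforcing the documented limit range."""
    if (space_id is None) == (project_id is None):
        raise ValueError("provide exactly one of space_id or project_id")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"version": version, "limit": str(limit)}
    params["space_id" if space_id else "project_id"] = space_id or project_id
    if start:
        params["start"] = start  # token taken from the previous page
    return urlencode(params)


def next_start_token(page: Dict[str, Any]) -> Optional[str]:
    """Extract the start token from the href in the page's next field, if any."""
    href = (page.get("next") or {}).get("href")
    if not href:
        return None
    values = parse_qs(urlsplit(href).query).get("start")
    return values[0] if values else None


print(list_query("2024-10-17", space_id="12ac4cf1-252f-424b-b52d-5cdd9814987f"))
```

A client would call the list method repeatedly, passing the token from `next_start_token` as `start`, until the function returns `None`.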

Response

Response Body

AIServiceResources

A paginated list of AI services.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service with the specified identifier. If the rev query parameter is provided, rev=latest fetches the latest revision, and rev={revision_number} fetches the record for the given revision number. Either space_id or project_id must be provided.

GET /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.read

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
rev
string
The revision number of the resource.

Example: 2

Retrieve an AI service

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Update the AI service with the provided patch data. The following fields can be patched:

/tags
/name
/description
/custom

PATCH /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.update

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Request Body

Required*

JsonPatchOperation[]

Input For Patch. This is the patch body which corresponds to the JavaScript Object Notation (JSON) Patch standard (RFC 6902).
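A patch body restricted to the patchable fields listed above could be assembled as follows. This is a sketch, not the service's own validation: `PATCHABLE_PATHS` and `replace_op` are hypothetical names, and accepting sub-paths such as `/custom/key` is an assumption.

```python
import json
from typing import Any, Dict, List

# Fields that this method allows to be patched, as listed for the operation.
PATCHABLE_PATHS = ("/tags", "/name", "/description", "/custom")


def replace_op(path: str, value: Any) -> Dict[str, Any]:
    """Build one RFC 6902 replace operation for a patchable field."""
    if not any(path == p or path.startswith(p + "/") for p in PATCHABLE_PATHS):
        raise ValueError(f"{path} is not a patchable field")
    return {"op": "replace", "path": path, "value": value}


patch: List[Dict[str, Any]] = [replace_op("/description", "New Description")]
print(json.dumps(patch))
```

Rejecting unsupported paths on the client side avoids a round trip that would end in a 400 response.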

Update AI Services

curl --request PATCH "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '[
  {
    "op": "replace",
    "path": "/description",
    "value": "New Description"
  }
]'

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

200
AI service has been patched successfully
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the AI service with the specified identifier. This will delete all revisions of this flow as well. For each revision all attachments will also be deleted.

DELETE /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.delete

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Delete the AI service

curl --request DELETE "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Status Code

204
AI service deleted
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Upload the flow code. AI services expect a gzip file that contains the code files that make up the flow.

PUT /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.add

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Request Body

Required*

application/gzip
binary

A gzip file containing code files.
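Before uploading, a client can check that the payload really is gzip data and set the matching content type. The sketch below builds the PUT request without sending it. The `code_upload_request` helper is a hypothetical name, the sample archive is a single compressed byte string, and how the code files should be packaged inside the gzip file is not specified here.

```python
import gzip
import urllib.request

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def code_upload_request(url: str, token: str, archive: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request that uploads gzip-compressed code."""
    if archive[:2] != GZIP_MAGIC:
        raise ValueError("payload does not look like gzip data")
    return urllib.request.Request(
        url,
        data=archive,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/gzip",
        },
    )


# Placeholder content, compressed in memory purely for illustration.
archive = gzip.compress(b"def placeholder():\n    return None\n")
req = code_upload_request(
    "https://us-south.ml.cloud.ibm.com/ml/v4/ai_services/"
    "64dc8921-345f-234b-462d-78e41246987f/code"
    "?space_id=63dc4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17",
    "{token}",
    archive,
)
print(req.get_method(), len(req.data), "bytes")
```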

Upload the flow code

curl --request PUT "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
-H "Content-Type: application/gzip" \
--data-binary "@{path_to_gzip_file}"

Response

Response Body

AIServiceContentMetadata

The metadata related to the attachment.

Status Code

201
AI service code uploaded
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Download the AI service code. You can download the code for a given revision of the flow. AI services expect a gzip file that contains the code files that make up the flow.

GET /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.read

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
rev
string
The revision number of the resource.

Example: 2

Download the AI service code

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&rev=1&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

binary

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new AI service revision. The current metadata and content for id are used to create the new revision. Either space_id or project_id must be provided.

POST /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.create

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

RevisionEntitySpaceProjectRequest

The details for the revision.

Examples:

Create new AI service revision

curl --request POST "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "commit_message": "New Code"
}'

Response

Response Body

AIServiceResource

The information for a flow.

Status Code

201
AI service revision created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service revisions.

GET /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

pm-20.ai_service.list

Request

Path Parameters

id
Required*
string
AI service identifier.

Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Retrieve AI service revisions

curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

AIServiceResources

A paginated list of AI services.

Status Code

200
AI service revisions
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new AutoAI RAG that will find the best RAG pattern from the data that is provided in the request.

POST /ml/v1/autoai/rags

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

AutoAIRAGRequest

The details of the AutoAI RAG run with the data used to find the best RAG patterns.

Create AutoAI RAG job

curl --request POST 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "name": "AutoAI RAG #1",
  "description": "My autorag experiment for 2023 financial documents.",
  "hardware_spec":{
    "id": "c076e82c-b2a7-4d20-9c0f-1f0c2fdf5a24",
    "name": "L"
  },
  "parameters":{
    "constraints":{
      "embedding_models": ["ibm/slate.30m.english.rtrvr", "ibm/slate.125m.english.rtrvr"],
      "foundation_models":["meta-llama/llama-2-13b", "mistralai/mixtral-8x7b-instruct-v0-1", "ibm/granite-13b-instruct-v2"],
      "max_number_of_rag_patterns": 8
    },
    "optimization":{
      "metrics":["answer_correctness"]
    },
    "output_logs": true
  },
  "input_data_references":[
    {
      "type": "connection_asset",
      "connection": {
        "id": "d118eb8c-b0da-44f4-abf4-c4ecba4a496a"
      },
      "location":{
        "bucket": "autorag-documents-datasets",
        "file_name": "docs/document_1.txt"
      }
    },
    {
      "type": "connection_asset",
      "connection": {
        "id": "d118eb8c-b0da-44f4-abf4-c4ecba4a496a"
      },
      "location":{
        "bucket": "autorag-documents-datasets",
        "file_name": "docs/document_2.txt"
      }
    }
  ],
  "test_data_references":[
    {
      "type": "connection_asset",
      "connection": {
        "id": "d118eb8c-b0da-44f4-abf4-c4ecba4a496a"
      },
      "location":{
        "bucket": "autorag-documents-datasets",
        "file_name": "benchmarks/q_and_a_data.json"
      }
    }
  ],
  "vector_store_references":[
    {
      "type": "connection_asset",
      "connection": {
        "id": "497956b8-626f-4800-901d-3bcba21c6770"
      }
    }
  ],
  "results_reference": {
    "type": "container",
    "connection": {
    },
    "location": {
      "path": "results_autoai"
    }
  },
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
}'
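Building the request body in code and serializing it with a JSON library avoids syntax slips such as trailing commas in hand-written payloads. The sketch below mirrors part of the example request. The helper names are hypothetical, and which fields the service treats as required is not stated here, so the selection is an assumption.

```python
import json
from typing import Any, Dict, List


def cos_reference(connection_id: str, bucket: str, file_name: str) -> Dict[str, Any]:
    """A connection_asset reference to one file in an object storage bucket."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": {"bucket": bucket, "file_name": file_name},
    }


def autoai_rag_payload(name: str, project_id: str,
                       documents: List[Dict[str, Any]],
                       benchmark: Dict[str, Any],
                       max_patterns: int = 8) -> Dict[str, Any]:
    """Assemble a request body with the same structure as the example above."""
    return {
        "name": name,
        "project_id": project_id,
        "parameters": {
            "constraints": {"max_number_of_rag_patterns": max_patterns},
            "optimization": {"metrics": ["answer_correctness"]},
        },
        "input_data_references": documents,
        "test_data_references": [benchmark],
        "results_reference": {"type": "container", "location": {"path": "results_autoai"}},
    }


conn = "d118eb8c-b0da-44f4-abf4-c4ecba4a496a"
bucket = "autorag-documents-datasets"
docs = [cos_reference(conn, bucket, f"docs/document_{i}.txt") for i in (1, 2)]
body = json.dumps(autoai_rag_payload(
    "AutoAI RAG #1",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    docs,
    cos_reference(conn, bucket, "benchmarks/q_and_a_data.json"),
))
print(body)
```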

Response

Response Body

AutoAIRAGResponse

The response of an AutoAI RAG run.

Status Code

201
Created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of AutoAI RAG requests for the specified space or project.

This operation does not save history; requests that were deleted or purged do not appear in this list.

GET /ml/v1/autoai/rags

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Retrieve the AutoAI RAG runs

curl --request GET 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Accept: application/json'

Response

Response Body

AutoRAGResultResources

A paginated list of training definitions.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Get the results of an AutoAI RAG run, or details if the job failed.

GET /ml/v1/autoai/rags/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Get an AutoAI RAG run

curl --request GET "https://{cluster_url}/ml/v1/autoai/rags/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
-H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
-H "Accept: application/json"

Response

Response Body

AutoAIRAGResponse

The response of an AutoAI RAG run.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "description": "My autoai rag experiment for 2023 financial documents",
    "name": "AutoAI RAG"
  },
  "entity": {
    "timestamp": "2023-09-22T02:52:03.324Z",
    "hardware_spec": {
      "id": "c076e82c-b2a7-4d20-9c0f-1f0c2fdf5a24",
      "name": "L"
    },
    "parameters": {
      "constraints": {
        "embedding_models": [
          "ibm/slate-125m-english-rtrvr"
        ],
        "foundation_models": [
          "meta-llama/llama-3-70b-instruct",
          "mistralai/mixtral-8x7b-instruct-v01",
          "ibm/granite-13b-chat-v2"
        ],
        "max_number_of_rag_patterns": 8
      },
      "optimization": {
        "metrics": [
          "answer_correctness"
        ]
      },
      "output_logs": true
    },
    "input_data_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "path": "files/document.pdf"
        }
      }
    ],
    "test_data_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "path": "files/qa_document.json"
        }
      }
    ],
    "vector_store_references": [
      {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        }
      }
    ],
    "results_reference": {
      "type": "container",
      "location": {
        "path": "results_autoai",
        "training": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5",
        "training_status": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training-status.json",
        "assets_path": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets",
        "training_log": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training.log"
      }
    },
    "results": [
      {
        "metrics": {
          "test_data": [
            {
              "metric_name": "answer_correctness",
              "mean": 0.51,
              "ci_high": 0.68,
              "ci_low": 0.43
            }
          ]
        },
        "context": {
          "rag_pattern": {
            "composition_steps": [
              "vector_store",
              "chunking",
              "embeddings",
              "retrieval",
              "generation"
            ],
            "location": {
              "evaluation_results": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/evaluation_results.json",
              "indexing_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/indexing_notebook.ipynb",
              "inference_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_notebook.ipynb"
            },
            "name": "Pattern 1",
            "settings": {
              "vector_store": {
                "datasource_type": "milvus",
                "index_name": "autoai_rag_1234_iteration_5_index",
                "distance_metric": "euclidean",
                "operation": "upsert",
                "schema": {
                  "id": "autoai_rag_1.0.0",
                  "name": "AutoAI RAG document schema",
                  "type": "struct",
                  "fields": [
                    {
                      "name": "text",
                      "description": "text field",
                      "type": "string",
                      "role": "text"
                    },
                    {
                      "name": "document_id",
                      "description": "document name field",
                      "type": "string",
                      "role": "document_name"
                    },
                    {
                      "name": "start_index",
                      "description": "chunk starting token position in the source document",
                      "type": "number",
                      "role": "start_index"
                    },
                    {
                      "name": "sequence_number",
                      "description": "chunk number per document",
                      "type": "number",
                      "role": "sequence_number"
                    },
                    {
                      "name": "vector",
                      "description": "vector embeddings",
                      "type": "array",
                      "role": "vector_embeddings"
                    }
                  ]
                }
              },
              "chunking": {
                "method": "recursive",
                "chunk_size": 256,
                "chunk_overlap": 64
              },
              "embeddings": {
                "truncate_strategy": "left",
                "truncate_input_tokens": 384,
                "model_id": "ibm/slate-125m-english-rtrvr"
              },
              "retrieval": {
                "method": "simple",
                "number_of_chunks": 5
              },
              "generation": {
                "model_id": "meta-llama/llama-3-1-70b-instruct",
                "prompt_template_text": "Answer the following questions based on provided context:\\n ...",
                "context_template_text": "[Document]\n{document}\n[End]",
                "word_to_token_ratio": 2.2
              }
            }
          },
          "iteration": 1,
          "max_combinations": 160
        }
      }
    ],
    "status": {
      "state": "running",
      "step": "vector_store",
      "message": {
        "level": "info",
        "text": "Pipeline 1 of 8 is completed."
      },
      "running_at": "2023-08-04T13:22:48.000Z"
    }
  }
}
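A response of this shape is typically read in two steps: check entity.status.state, then compare the patterns under entity.results. A minimal sketch over a trimmed copy of the sample response; `best_pattern` is a hypothetical helper, and treating a higher mean as better is an assumption that fits answer_correctness but may not fit every metric.

```python
def best_pattern(run: dict, metric: str = "answer_correctness"):
    """Return (pattern name, mean) with the highest test-data mean for `metric`, or None."""
    best = None
    for result in run.get("entity", {}).get("results", []):
        name = result.get("context", {}).get("rag_pattern", {}).get("name")
        for entry in result.get("metrics", {}).get("test_data", []):
            # Assumption: a higher mean indicates a better pattern.
            if entry.get("metric_name") == metric and (best is None or entry["mean"] > best[1]):
                best = (name, entry["mean"])
    return best


# Trimmed copy of the sample response, keeping only the fields read here.
run = {
    "entity": {
        "results": [
            {
                "metrics": {
                    "test_data": [
                        {"metric_name": "answer_correctness", "mean": 0.51,
                         "ci_high": 0.68, "ci_low": 0.43}
                    ]
                },
                "context": {"rag_pattern": {"name": "Pattern 1"}},
            }
        ],
        "status": {"state": "running", "step": "vector_store"},
    }
}
print(run["entity"]["status"]["state"], best_pattern(run))
```

While the state is still running, the results list may be incomplete, so the comparison is most useful once the run has finished.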

Cancel or delete the specified AutoAI RAG run. Once deleted, all trace of the run is gone.

DELETE /ml/v1/autoai/rags/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

Cancel or delete an AutoAI RAG run

curl --request DELETE "https://{cluster_url}/ml/v1/autoai/rags/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Status Code

204
Deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new deployment. Currently, the only supported type is online.

If this is a deployment for a prompt tune then the asset object must exist and the id must be the id of the model that was created after the prompt training.

If this is a deployment for a prompt template then the prompt_template object should exist and the id must be the id of the prompt template to be deployed.

POST /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentResourcePrototype

The deployment request entity.

The following important fields are described for each use case:

Prompt template:
- base_model_id: required
- prompt_template.id: required
- online: required
- hardware_spec: forbidden
- hardware_request: forbidden
- response deployed_asset_type: foundation_model
Prompt tune:
- asset.id: required
- online: required
- hardware_spec: forbidden
- hardware_request: forbidden
- base_model_id: forbidden
- response deployed_asset_type: prompt_tune
Custom foundation model:
- asset.id: required
- online: required
- online.parameters.foundation_model: optional
- hardware_spec: forbidden
- hardware_request: required
- base_model_id: forbidden
- base_deployment_id: forbidden
- response deployed_asset_type: custom_foundation_model
Deploy on Demand model:
- asset.id: required
- online: required
- online.parameters.foundation_model: forbidden
- hardware_spec: forbidden
- hardware_request: forbidden
- base_model_id: forbidden
- base_deployment_id: forbidden
- space_id: required
- project_id: forbidden
- response deployed_asset_type: curated_foundation_model
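The required and forbidden fields listed above can be turned into a local pre-flight check. A minimal sketch that transcribes the list into a table; the use-case keys (`prompt_template`, `prompt_tune`, `custom_foundation_model`, `deploy_on_demand`) and the helper names are hypothetical labels, and the service remains the authority on what it accepts.

```python
# Rules transcribed from the field list; keys are hypothetical labels for each use case.
RULES = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "prompt_tune": {
        "required": ["asset.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request", "base_model_id"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_request"],
        "forbidden": ["hardware_spec", "base_model_id", "base_deployment_id"],
    },
    "deploy_on_demand": {
        "required": ["asset.id", "online", "space_id"],
        "forbidden": ["online.parameters.foundation_model", "hardware_spec",
                      "hardware_request", "base_model_id", "base_deployment_id",
                      "project_id"],
    },
}


def has_path(body, dotted):
    """True if the dotted path (for example asset.id) exists in the nested dict."""
    node = body
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def violations(use_case, body):
    """List the rule violations of a request body for the given use case."""
    rules = RULES[use_case]
    problems = [f"missing {p}" for p in rules["required"] if not has_path(body, p)]
    problems += [f"forbidden {p}" for p in rules["forbidden"] if has_path(body, p)]
    return problems


tune_body = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "asset": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
    "online": {},
}
print(violations("prompt_tune", tune_body))
```

An empty list means the body passes this local check; any other result names the first fields to look at before the request is sent.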

Examples:

A prompt tune deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt tuned model deployment",
    "tags": ["classification"],
    "asset": {
        "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {}
}'

A prompt template deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": ["classification"],
    "prompt_template": {
        "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "base_model_id": "google/flan-t5-xl",
    "online": {}
}'

A custom foundation model deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan",
    "asset": {
        "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "online": {
        "parameters": {
            "serving_name": "myflan"
        }
    }
}'

A curated foundation model deployment

curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
    "name": "my_granite_13b_chat_v2",
    "asset": {
        "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
    },
    "base_model_id": "ibm/granite-13b-chat-v2-curated",
    "hardware_request": {
        "size": "gpu_s",
        "num_nodes": 1
    },
    "online": {
        "parameters": {
            "serving_name": "granite_13b_chat_v2"
        }
    }
}'

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

202
Deployment created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 202: A prompt tune deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt tuned model deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "asset": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "prompt_tune",
    "base_model_id": "google/flan-ul2",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 202: A prompt template deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "prompt_template": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 202: A custom foundation model deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan"
  },
  "entity": {
    "asset": {
      "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "online": {
      "parameters": {
        "serving_name": "myflan"
      }
    },
    "deployed_asset_type": "custom_foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
          "sse": true,
          "uses_serving_name": true
        }
      ]
    }
  }
}

Status 202: A curated foundation model.

{
  "metadata": {
    "id": "c9240431-8697-42ad-8ab3-1cced97fc6db",
    "created_at": "2024-12-12T10:42:52.298Z",
    "name": "my_granite_13b_chat_v2",
    "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03"
  },
  "entity": {
    "asset": {
      "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
    },
    "base_model_id": "ibm/granite-13b-chat-v2-curated",
    "deployed_asset_type": "curated_foundation_model",
    "hardware_request": {
      "num_nodes": 1,
      "size": "gpu_s"
    },
    "online": {
      "parameters": {
        "serving_name": "granite_13b_chat_v2"
      }
    },
    "status": {
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/curated_test_22/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/curated_test_22/text/generation_stream",
          "uses_serving_name": true,
          "sse": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/c9240431-8697-42ad-8ab3-1cced97fc6db/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/c9240431-8697-42ad-8ab3-1cced97fc6db/text/generation_stream",
          "sse": true
        }
      ],
      "state": "ready"
    }
  }
}
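Each entry in entity.status.inference is distinguished by two optional flags, sse and uses_serving_name, so a client usually has to select the URL it wants. A minimal sketch over a trimmed copy of the custom foundation model response; `inference_url` is a hypothetical helper, and treating a missing flag as false is an assumption drawn from the samples.

```python
def inference_url(deployment: dict, stream: bool = False, by_serving_name: bool = False):
    """Return the first inference URL whose flags match, or None if there is none."""
    entries = deployment.get("entity", {}).get("status", {}).get("inference", [])
    for entry in entries:
        # Assumption: an absent flag means false, as in the sample responses.
        if (bool(entry.get("sse")) == stream
                and bool(entry.get("uses_serving_name")) == by_serving_name):
            return entry["url"]
    return None


base = "https://us-south.ml.cloud.ibm.com/ml/v1/deployments"
deployment = {
    "entity": {
        "status": {
            "state": "ready",
            "inference": [
                {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
                {"url": f"{base}/myflan/text/generation", "uses_serving_name": True},
                {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
                 "sse": True},
                {"url": f"{base}/myflan/text/generation_stream", "sse": True,
                 "uses_serving_name": True},
            ],
        }
    }
}
print(inference_url(deployment))
print(inference_url(deployment, stream=True, by_serving_name=True))
```

Selecting by flags rather than by list position keeps the client independent of the order in which the service returns the entries.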

Retrieve the list of deployments for the specified space or project.

GET /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
serving_name
string
Retrieves the deployment, if any, that contains this serving_name.

Example: classification
tag.value
string
Retrieves only the resources with the given tag value.
asset_id
string
Retrieves only the resources with the given asset_id. The asset_id is the model id.
prompt_template_id
string
Retrieves only the resources with the given prompt_template_id.
name
string
Retrieves only the resources with the given name.
type
string
Retrieves the resources filtered by the given type. The value is one of the deployment types, optionally combined with the additional prompt_template flag, which selects deployments that include a prompt template.

The supported deployment types are (see the description for deployed_asset_type in the deployment entity):
1. prompt_tune - when a prompt tuned model is deployed.
2. foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
3. custom_foundation_model - when a custom foundation model is deployed.
These can be combined with the flag prompt_template like this:
1. type=prompt_tune - return all prompt tuned model deployments.
2. type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
3. type=foundation_model - return all prompt template deployments.
4. type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
5. type=prompt_template - return all deployments with a prompt template.
state
string
Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.
conflict
boolean
Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.

Default: false

Retrieve list of deployments

curl --request GET 'https://{cluster_url}/ml/v4/deployments?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&serving_name=ibm&asset_id=259efabd-7850-40fc-843d-6dddcfc286d1&state=ready&version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Response Body

DeploymentResourceCollection

The deployment resources.

Status Code

200
OK.
204
serving_name is available for use. Returned when serving_name and conflict query parameters are used.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.
409
Returned when serving_name and conflict query parameters are used. The response body will contain the reason.
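The 204 and 409 status codes together form a serving_name availability check when the conflict query parameter is set. A minimal sketch of building that request URL and interpreting the result; both helper names are hypothetical, and sending version alongside serving_name and conflict is an assumption based on version being marked as required.

```python
from urllib.parse import urlencode


def conflict_check_url(base_url: str, serving_name: str, version: str) -> str:
    """Build the list-deployments URL that only checks serving_name availability."""
    # Assumption: version is still sent, since it is required on every call.
    query = urlencode({"serving_name": serving_name, "conflict": "true", "version": version})
    return f"{base_url}/ml/v4/deployments?{query}"


def serving_name_available(status_code: int) -> bool:
    """Map the documented status codes to availability: 204 is free, 409 is taken."""
    if status_code == 204:
        return True
    if status_code == 409:
        return False
    raise ValueError(f"unexpected status {status_code} for a conflict check")


print(conflict_check_url("https://us-south.ml.cloud.ibm.com", "classification", "2023-05-02"))
print(serving_name_available(204))
```

Any other status, such as 401 or 403, says nothing about availability, which is why the sketch raises instead of returning a value.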

Example responses

Status 200: Get all prompt tune deployments.

{
  "limit": 10,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments"
  },
  "resources": [
    {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "deployed_asset_type": "prompt_tune",
        "online": {},
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  ]
}

Retrieve the deployment details with the specified identifier.

GET /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.read

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Retrieve deployment details

curl --request GET "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

200
Deployment details.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A prompt tune deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt tuned model deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "asset": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "prompt_tune",
    "base_model_id": "google/flan-ul2",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 200: A prompt template deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "description": "Classification prompt template deployment",
    "tags": [
      "classification"
    ]
  },
  "entity": {
    "prompt_template": {
      "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
    },
    "online": {},
    "deployed_asset_type": "foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
          "sse": true
        }
      ]
    }
  }
}

Status 200: A custom foundation model deployment.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "my_tuned_flan"
  },
  "entity": {
    "asset": {
      "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
    },
    "online": {
      "parameters": {
        "serving_name": "myflan"
      }
    },
    "deployed_asset_type": "custom_foundation_model",
    "base_model_id": "google/flan-t5-xl",
    "status": {
      "state": "ready",
      "message": {
        "level": "info",
        "text": "The deployment is successful"
      },
      "inference": [
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
          "uses_serving_name": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
          "sse": true
        },
        {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
          "sse": true,
          "uses_serving_name": true
        }
      ]
    }
  }
}

Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.

/name
/description
/tags
/custom
/online/parameters
/asset - replace only
/prompt_template - replace only
/hardware_spec
/hardware_request
/base_model_id - replace only (applicable only to prompt template deployments referring to IBM base foundation models)

The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
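The supported paths and their "replace only" restrictions can be enforced while the JSON Patch body is assembled. A minimal sketch; `patch_op` is a hypothetical helper, and accepting sub-paths under a listed root (for example /online/parameters/serving_name) is an assumption rather than documented behavior.

```python
import json

# Paths transcribed from the list of supported patch targets.
PATCHABLE = ("/name", "/description", "/tags", "/custom", "/online/parameters",
             "/asset", "/prompt_template", "/hardware_spec", "/hardware_request",
             "/base_model_id")
REPLACE_ONLY = ("/asset", "/prompt_template", "/base_model_id")


def _under(path: str, root: str) -> bool:
    """True if path is the root itself or a sub-path of it."""
    return path == root or path.startswith(root + "/")


def patch_op(op: str, path: str, value=None) -> dict:
    """Build one JSON Patch operation after checking it against the supported paths."""
    if not any(_under(path, root) for root in PATCHABLE):
        raise ValueError(f"{path} is not a supported patch path")
    if op != "replace" and any(_under(path, root) for root in REPLACE_ONLY):
        raise ValueError(f"{path} supports replace only")
    operation = {"op": op, "path": path}
    if op != "remove":
        operation["value"] = value
    return operation


body = json.dumps([patch_op("replace", "/description", "New Description")])
print(body)
```

Serializing with json.dumps also guarantees a well-formed array, which is easy to get wrong when the patch is typed by hand.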

PATCH /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.update

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Request Body

Required*

application/json-patch+json
JsonPatchOperation[]

The json patch.

Update the deployment metadata.

curl --request PATCH "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" \
  -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '[
  {
    "op": "replace",
    "path": "/description",
    "value": "New Description"
  }
]'

Response

Response Body

DeploymentResource

A deployment resource.

Status Code

202
Deployment accepted
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the deployment with the specified identifier.

DELETE /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

pm-20.deployment.delete

Request

Path Parameters

deployment_id
Required*
string
The deployment id.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Delete deployment

curl --request DELETE 'https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Status Code

204
Deployment deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

This API is legacy; consider using Deployment Text Chat instead.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextGenRequest

From a given prompt, infer the next tokens.

Examples:

prompt tune

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  }
}'

prompt template

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000,
    "prompt_variables": {
      "name": "joe",
      "count": 3
    }
  }
}'
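The two request shapes above (a direct input for a prompt tune, prompt_variables for a prompt template) can also be assembled programmatically. The following is a minimal Python sketch, not an official client: build_text_generation_request is a hypothetical helper, and the base URL and deployment name are placeholders. It only builds the URL and the JSON body; sending the request with a bearer token is left to the HTTP client of your choice.

```python
import json
from urllib.parse import quote, urlencode


def build_text_generation_request(base_url, id_or_name, version,
                                  input_text=None, prompt_variables=None,
                                  max_new_tokens=100, time_limit=1000):
    """Return (url, body) for POST /ml/v1/deployments/{id_or_name}/text/generation."""
    path = f"/ml/v1/deployments/{quote(id_or_name, safe='')}/text/generation"
    url = f"{base_url.rstrip('/')}{path}?{urlencode({'version': version})}"

    parameters = {"max_new_tokens": max_new_tokens, "time_limit": time_limit}
    if prompt_variables is not None:
        # Prompt template deployments take their variables inside "parameters".
        parameters["prompt_variables"] = dict(prompt_variables)

    body = {"parameters": parameters}
    if input_text is not None:
        # Prompt tune deployments take the prompt text in "input".
        body["input"] = input_text
    # json.dumps always emits valid JSON (no trailing commas).
    return url, json.dumps(body)


# Placeholder deployment name; replace with a real deployment_id or serving_name.
url, body = build_text_generation_request(
    "https://us-south.ml.cloud.ibm.com", "my_deployment", "2023-05-02",
    input_text="how far is paris from bangalore:")
print(url)
print(body)
```

Serializing the body with a JSON library avoids hand-editing mistakes in the payload, which the API would reject with a 400.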

Response

Response Body

TextGenResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A prompt tune response.

The generated text from the model along with other details for a prompt tune.

{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}

Status 200: A prompt tune response with moderations.

The generated text from the model along with other details for a prompt tune with moderations.

{
  "model_id": "google/flan-t5-xl",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
      "generated_token_count": 118,
      "input_token_count": 11,
      "stop_reason": "eos_token",
      "moderations": {
        "pii": [
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 74,
              "end": 88
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 200,
              "end": 212
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 244,
              "end": 259
            },
            "entity": "EmailAddress"
          }
        ]
      }
    }
  ]
}
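In the moderations sample above, each pii entry carries a position with start and end offsets into generated_text. The first entry (start 74, end 88) covers the 14 masked characters after "or call ", which suggests that end is exclusive. The following Python sketch extracts the flagged spans under that reading. The helper name pii_spans and the small sample result are made up for illustration, and treating "input": false as "the hit is in the generated text" is an assumption.

```python
def pii_spans(result):
    """Return (entity, matched_text, score) for each PII hit on the generated text."""
    text = result["generated_text"]
    spans = []
    for hit in result.get("moderations", {}).get("pii", []):
        if hit.get("input"):
            # Assumption: input=True means the hit refers to the prompt, not the output.
            continue
        start, end = hit["position"]["start"], hit["position"]["end"]
        # Offsets are treated as [start, end), matching the sample response.
        spans.append((hit["entity"], text[start:end], hit["score"]))
    return spans


# Made-up result in the same shape as one entry of "results".
sample = {
    "generated_text": "call **** or mail ****",
    "moderations": {"pii": [
        {"score": 0.8, "input": False,
         "position": {"start": 5, "end": 9}, "entity": "PhoneNumber"},
        {"score": 0.8, "input": False,
         "position": {"start": 18, "end": 22}, "entity": "EmailAddress"},
    ]},
}
for entity, matched, score in pii_spans(sample):
    print(entity, repr(matched), score)
```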

Status 200: A prompt template response.

The generated text from the model along with other details for a prompt template.

{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

This API is legacy; consider using Deployment Text Chat Stream instead.

Return options

Note that this operation currently has a limitation when using return_options. For input, only input_text is returned if requested. For output, input_tokens and generated_tokens are not returned, and rank and top_tokens are not returned either.

POST /ml/v1/deployments/{id_or_name}/text/generation_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextGenRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:

prompt tune

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  }
}'

prompt template

curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000,
    "prompt_variables": {
      "name": "joe",
      "count": 3
    }
  }
}'

Response

Response Body

TextGenResponse[]

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
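A client has to split the stream into data: lines and decode each one as JSON. The following Python sketch shows one way to do that over any iterable of lines, for example the line iterator of an HTTP client's streaming response. The helper name iter_sse_events and the sample stream are illustrative assumptions. The event payloads are assumed to follow the TextGenResponse shape, and events whose data spans several lines are not handled.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream content, shaped like the TextGenResponse sample.
stream = [
    'data: {"results": [{"generated_text": "4,000"}]}',
    "",
    'data: {"results": [{"generated_text": " km", "stop_reason": "eos_token"}]}',
    "",
]
# Concatenate the text fragments carried by the individual events.
text = "".join(event["results"][0]["generated_text"]
               for event in iter_sse_events(stream))
print(text)
```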

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer the next chat message for a given deployment. The deployment must reference a prompt template which has input_mode set to chat. The model for the chat request is taken from the deployment's base_model_id, and the parameters for the chat request are taken from the prompt template's model_parameters. Related guides: Deployment, Prompt template, Text chat.

If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

POST /ml/v1/deployments/{id_or_name}/text/chat

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference a prompt template with input_mode chat.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextChatRequest

From a given prompt, infer the next chat message.

Examples:
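As an illustration only, the following Python sketch assembles a chat request body. It assumes that DeploymentTextChatRequest carries a messages list of objects with role and content, plus an optional context string. These field names are assumptions and should be checked against the schema before use. The helper build_chat_body is hypothetical.

```python
import json


def build_chat_body(messages, context=None):
    """Build a JSON body for POST /ml/v1/deployments/{id_or_name}/text/chat.

    Assumption: the body has a `messages` list and an optional `context`
    string; verify both names against the DeploymentTextChatRequest schema.
    """
    for message in messages:
        if "role" not in message:
            raise ValueError("every message needs a 'role'")
    body = {"messages": list(messages)}
    if context is not None:
        body["context"] = context
    return json.dumps(body)


chat_body = build_chat_body(
    [{"role": "user", "content": "Where was the 2020 World Series played?"}])
print(chat_body)
```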

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A chat prompt template response.

The generated text from the model along with other details for a prompt template.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "ibm/granite-3-2b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at the Globe Life Field in Arlington, Texas.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 27,
    "prompt_tokens": 186,
    "total_tokens": 213
  }
}

Status 200: A chat prompt template with system_prompt and context response.

The generated text from the model along with other details for a prompt template.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "ibm/granite-3-2b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I am Granite Chat, created by IBM. I am here to assist you. Today is Wednesday.tomorrow is Thursday.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 32,
    "prompt_tokens": 154,
    "total_tokens": 186
  }
}

Infer the next chat message for a given deployment. This operation returns the output tokens as a stream of events. The deployment must reference a prompt template which has input_mode set to chat. The model for the chat request is taken from the deployment's base_model_id, and the parameters for the chat request are taken from the prompt template's model_parameters. Related guides: Deployment, Prompt template, Text chat.

If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

POST /ml/v1/deployments/{id_or_name}/text/chat_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference a prompt template with input_mode chat.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTextChatRequest

From a given prompt, infer the next chat message in a server-sent events (SSE) stream.

Examples:

Response

Response Body

TextChatStreamItem[]

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Generate forecasts, or predictions for future time points, given historical time series data.

POST /ml/v1/deployments/{id_or_name}/time_series/forecast

Auditing

Calling this method generates the following auditing event.

pm-20.time-series-forecast.send

Request

Path Parameters

id_or_name
Required*
string
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DeploymentTSForecastResource

The forecast request.

Examples:

Response

Response Body

TSForecastResponse

The time series forecast response.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "model_id": "bc35d16e-dd21-472e-9cde-c6c3ad88e3b5",
  "created_at": "2020-05-02T16:27:51Z",
  "results": [
    {
      "date": [
        "2020-01-05T02:00:00",
        "2020-01-05T03:00:00",
        "2020-01-06T00:00:00"
      ],
      "ID1": [
        "D1",
        "D1",
        "D1"
      ],
      "TARGET1": [
        1.86,
        3.24,
        6.78
      ]
    }
  ],
  "input_data_points": 512,
  "output_data_points": 1024
}
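Each entry in results is column-oriented: every key (date, ID1, TARGET1 in the sample) maps to a list, and the values at the same index belong together. The following Python sketch, using a hypothetical helper forecast_rows, turns one such entry into row dictionaries. It assumes that all columns in an entry have the same length, as they do in the sample.

```python
def forecast_rows(result):
    """Turn one column-oriented forecast result into a list of row dicts."""
    columns = list(result)
    lengths = {len(result[name]) for name in columns}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    # zip(*columns) walks the columns in parallel, one tuple per time point.
    return [dict(zip(columns, values))
            for values in zip(*(result[name] for name in columns))]


# The single entry of "results" from the sample response above.
result = {
    "date": ["2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.86, 3.24, 6.78],
}
rows = forecast_rows(result)
print(rows[0])
```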

Create a fine tuning job that will fine tune an LLM.

POST /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

FineTuningRequest

The details of the fine tuning job with the data used to tune the LLM.

InstructLab Fine Tuning

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
-d '{
  "name": "Instruct Lab Fine Tuning",
  "project_id": "dc178286-21d1-4262-9000-e543cf4c7742",
  "type": "ilab",
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "id": "4cc2f990-cd83-4e62-bd61-33b21605cf0e",
        "href": ""
      }
    }
  ],
  "results_reference": {
    "type": "container",
    "location": {
      "path": "."
    }
  }
}'

Response

Response Body

FineTuningResource

The response of a fine tuning job.

Status Code

201
Created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of fine tuning jobs for the specified space or project.

GET /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The token cannot be determined by the end user; it is generated by the service and set in the href of the next field.
limit
integer
How many resources should be returned.

Possible values: value ≤ 200
Default: 100
total_count
boolean
Compute the total count. May have performance impact.
tag.value
string
Return only the resources with the given tag value.
state
string
Filter based on the job state: queued, running, completed, failed, etc.
type
string
The type of Fine Tuning training. The type is set to ilab for InstructLab training.

Allowable values: [ilab]
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
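The constraints listed for these query parameters can be checked on the client before a request is sent. The following Python sketch builds the query string and enforces the documented rules: limit at most 200, at least one of space_id or project_id, and a 36-character ID matching the documented pattern. The function name fine_tunings_query is hypothetical, and reading "either" as "at least one" is an assumption.

```python
import re
from urllib.parse import urlencode

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")


def fine_tunings_query(version, space_id=None, project_id=None, limit=100, **extra):
    """Build the query string for GET /ml/v1/fine_tunings."""
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")
    if limit > 200:
        raise ValueError("limit must be at most 200")
    params = {"version": version, "limit": limit}
    for name, value in (("space_id", space_id), ("project_id", project_id)):
        if value is None:
            continue
        if len(value) != 36 or not ID_PATTERN.match(value):
            raise ValueError(f"{name} must be 36 characters of letters, digits or '-'")
        params[name] = value
    # Optional filters such as state="completed" or type="ilab" pass through unchanged.
    params.update({key: value for key, value in extra.items() if value is not None})
    return urlencode(params)


query = fine_tunings_query(
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    state="completed")
print(query)
```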

Response

Response Body

FineTuningResources

System details.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Get the results of a fine tuning job, or details if the job failed.

GET /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.get

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the response to the create request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Response Body

FineTuningResource

The response of a fine tuning job.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete a fine tuning job if it exists. Once deleted, all trace of the job is gone.

DELETE /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.fine-tuning.delete

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the response to the create request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

Response

Status Code

204
Deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of deployed foundation models.

GET /ml/v1/foundation_model_specs

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The token cannot be determined by the end user; it is generated by the service and set in the href of the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

filters
string
A set of filters to specify the list of models. Filters follow the pattern shown below.

 pattern: tfilter[,tfilter][:(or|and)]
 tfilter: filter | !filter
   filter: Requires existence of the filter.
   !filter: Requires absence of the filter.
 filter: one of
   modelid_*:     Filters by model id.
                  Namely, select a model with a specific model id.
   provider_*:    Filters by provider.
                  Namely, select all models with a specific provider.
   source_*:      Filters by source.
                  Namely, select all models with a specific source.
   input_tier_*:  Filters by input tier.
                  Namely, select all models with a specific input tier.
   output_tier_*: Filters by output tier.
                  Namely, select all models with a specific output tier.
   tier_*:        Filters by tier.
                  Namely, select all models with a specific input or output tier.
   task_*:        Filters by task id.
                  Namely, select all models that support a specific task id.
   lifecycle_*:   Filters by lifecycle state.
                  Namely, select all models that are currently in the specified lifecycle state.
   function_*:    Filters by function. 
                  Namely, select all models that support a specific function.

Possible values: 1 ≤ length ≤ 1000, Value must match regular expression ^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$

Example: modelid_ibm/granite-13b-instruct-v1,modelid_ibm/granite-13b-instruct-v2:or
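A filters value can be composed and validated before the request is sent. The following Python sketch joins filter terms into the tfilter[,tfilter][:(or|and)] form and checks the result against the documented length limit and regular expression. The helper build_filters is hypothetical. Because the documented expression also accepts other suffixes after a colon as part of the last term, the combiner is checked separately.

```python
import re

# Regular expression and length limits as documented for the filters parameter.
FILTERS_PATTERN = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")


def build_filters(include=(), exclude=(), combine=None):
    """Join filter terms into the `tfilter[,tfilter][:(or|and)]` form."""
    if combine not in (None, "or", "and"):
        raise ValueError("combine must be 'or', 'and' or None")
    # A leading "!" requires the absence of the filter.
    terms = list(include) + [f"!{term}" for term in exclude]
    value = ",".join(terms)
    if combine is not None:
        value += f":{combine}"
    if not 1 <= len(value) <= 1000 or not FILTERS_PATTERN.match(value):
        raise ValueError(f"invalid filters value: {value!r}")
    return value


filters = build_filters(
    ["modelid_ibm/granite-13b-instruct-v1", "modelid_ibm/granite-13b-instruct-v2"],
    combine="or")
print(filters)
```

The value still has to be URL-encoded when it is placed in the query string, for example with urllib.parse.urlencode.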

tech_preview
boolean
See all the Tech Preview models if entitled.

Default: false

get foundation models

curl --request GET 'https://{cluster_url}/ml/v1/foundation_model_specs?version=2019-10-25&filters=function_time_series_forecast' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Accept: application/json'

Response

Response Body

FoundationModels

System details.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
404
The specified resource was not found.

Example responses

Status 200: The list of models.

The models that are currently deployed in the cluster.

{
  "total_count": 1,
  "limit": 100,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02"
  },
  "resources": [
    {
      "model_id": "bigcode/starcoder",
      "label": "starcoder-15.5b",
      "provider": "BigCode",
      "source": "Hugging Face",
      "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions",
      "tasks": [
        {
          "id": "code",
          "ratings": {
            "quality": 3
          }
        }
      ],
      "min_shot_size": 0,
      "input_tier": "class_2",
      "output_tier": "class_2",
      "number_params": "15.5b"
    }
  ]
}
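The start query parameter is described as a token that the service places in the href of the next field. The following Python sketch reads that token from a page of results so it can be passed as start on the following request. The helper next_start_token and the example page are hypothetical. The sample response above shows only the first field, so the exact shape of next (an object with an href, like first) is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def next_start_token(page):
    """Return the `start` token from a page's `next.href`, or None on the last page."""
    href = (page.get("next") or {}).get("href")
    if not href:
        return None
    values = parse_qs(urlparse(href).query).get("start")
    return values[0] if values else None


# Hypothetical page; "abc123" stands in for a service-generated token.
page = {
    "limit": 50,
    "next": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs"
                "?version=2023-05-02&start=abc123&limit=50"
    },
    "resources": [],
}
print(next_start_token(page))
```

A caller would repeat the request with the returned token until the function returns None.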

Retrieve the list of tasks that are supported by the foundation models.

GET /ml/v1/foundation_model_tasks

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. The token cannot be determined by the end user; it is generated by the service and set in the href of the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50

Response

Response Body

FoundationModelTasks

System details.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
404
The specified resource was not found.

Example responses

Status 200: The list of tasks.

The tasks that are currently supported by models deployed in the cluster.

{
  "total_count": 1,
  "limit": 100,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02"
  },
  "resources": [
    {
      "task_id": "question_answering",
      "label": "Question answering",
      "rank": 1,
      "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance."
    }
  ]
}

Create a new notebook, either from scratch or by copying another notebook.

To create a notebook from scratch, you need to first upload the notebook content (ipynb format) to the project Cloud Object Storage (COS) and then reference it with the attribute file_reference. The other required attributes are name, project and runtime. The attribute runtime is used to specify the environment on which the notebook runs.

To copy a notebook, you only need to provide name and source_guid in the request body.

POST /v2/notebooks

Request

Request Body

Required*

One of


Specification of the notebook to be created.

Example:
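As an illustration of the two request shapes described above, the following Python sketch builds one body for each case. The attribute names name, project, runtime, file_reference and source_guid come from the description. The nested shape of runtime (an object with an environment key, mirroring the responses), the helper names, and all values are assumptions.

```python
import json


def notebook_from_scratch(name, project, environment, file_reference):
    """Body for creating a notebook from content already uploaded to COS."""
    return {
        "name": name,
        "project": project,
        # Assumption: runtime is an object naming the environment, as in the responses.
        "runtime": {"environment": environment},
        "file_reference": file_reference,
    }


def notebook_copy(name, source_guid):
    """Body for creating a notebook by copying an existing one."""
    return {"name": name, "source_guid": source_guid}


# Placeholder values; the file path is hypothetical.
scratch_body = notebook_from_scratch(
    "my notebook",
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "notebooks/my_notebook.ipynb")
copy_body = notebook_copy("my notebook copy", "ca3c0e27-46ca-83d4-a646-d49b11c14de9")
print(json.dumps(scratch_body, indent=2))
print(json.dumps(copy_body, indent=2))
```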

Response

Response Body

One of


Notebook information in a project as returned by a GET request.

Status Code

201
Success. Created and returned a new notebook asset. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.
429
The number of requests has exceeded the rate limit.

Example responses

Status 201: A notebook created in a project from scratch

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python3",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 201: A notebook created by copying another notebook

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python3",
        "language": "python3"
      },
      "originates_from": {
        "type": "notebook",
        "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Status 429: Rate limit error with status code 429

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "rate_limit",
      "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later."
    }
  ]
}
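The error samples above share one shape: a trace ID and an errors list whose entries have a code, a message and an optional target. The following Python sketch, using a hypothetical helper summarize_errors, flattens such a body into log-friendly lines. It relies only on the fields visible in the samples.

```python
def summarize_errors(body):
    """Return one readable line per entry in an error response body."""
    trace = body.get("trace")
    lines = []
    for error in body.get("errors", []):
        target = error.get("target")
        # target is optional; the 403 and 429 samples do not include it.
        where = f" ({target['type']} {target['name']})" if target else ""
        lines.append(f"{error['code']}: {error['message']}{where} [trace {trace}]")
    return lines


# The 400 sample from above.
bad_request = {
    "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
    "errors": [
        {
            "code": "invalid_type",
            "message": "The `project` field needs to be a uuid v4, but is 12345.",
            "target": {"type": "field", "name": "project"},
        }
    ],
}
for line in summarize_errors(bad_request):
    print(line)
```

Keeping the trace value in each line makes it easier to correlate a failure with a support request.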

Retrieve the details of a large number of notebooks inside a project.

POST /v2/notebooks/list

Request

Query Parameters

project_id
Required*
string
The guid of the project.
include
Required*
string
Additional information to include in the notebook details. Possible values are:
- runtime

Request Body

Required*

NotebookListBody

Payload for a notebook list request.

Examples:

Response

Response Body

NotebooksResourceList

A list of notebook info as returned by a list query.

Status Code

200
Success. Returned a list of notebook assets. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A list of notebooks

{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d"
      },
      "entity": {
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "asset": {
          "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
          "asset_type": "notebook",
          "created_at": "2021-07-01T12:37:01Z",
          "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
          "version": 2,
          "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
        }
      }
    }
  ]
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Delete a particular notebook, including the notebook asset.

DELETE /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Status Code

204
Successful request. Notebook is deleted.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Revert the main notebook to a version.

PUT /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the main notebook.

Request Body

Required*

NotebookRevertBody

Payload for a request to revert to a specific notebook version.

Examples:

Response

Response Body

NotebookInProject

Notebook information in a project as returned by a GET request.

Status Code

200
Success. Reverted the main notebook to a version. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A reverted notebook

{
  "metadata": {
    "name": "my notebook v4.2",
    "description": "this is my notebook v4.2",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python39",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
      "spark_monitoring_enabled": true
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Update a particular notebook.

PATCH /v2/notebooks/{notebook_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Request Body

Required*

NotebookUpdateBody

Payload for a notebook update request.

Examples:

Response

Response Body

Notebook

Notebook information as returned by a GET request.

Status Code

200
Success. Updated the notebook. Format follows v2/assets.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: An updated notebook

{
  "metadata": {
    "name": "my notebook",
    "description": "this is my notebook",
    "asset_type": "notebook",
    "created": 1540471021134,
    "created_at": "2021-07-01T12:37:01Z",
    "owner_id": "IBMid-310000SG2Y",
    "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
    "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
    "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  },
  "entity": {
    "notebook": {
      "kernel": {
        "display_name": "Python 3.9 with Spark",
        "name": "python39",
        "language": "python3"
      },
      "originates_from": {
        "type": "blank"
      }
    },
    "runtime": {
      "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
      "spark_monitoring_enabled": false
    },
    "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Create a version of a given notebook.

POST /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Response Body

NotebookVersionInProject

A notebook version in a project.

Status Code

200
Success. Returned the notebook version definition.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A notebook version in a project

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

List all versions of a particular notebook.

GET /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.

Response

Response Body

NotebookVersionsListInProject

A list of notebook versions in a project.

Status Code

200
Success. Returned a list of versions of the notebook.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A list of notebook versions in a project

{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  ]
}
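
To show how the list response could be consumed, here is a small sketch that extracts the version guid, `rev_id`, and creation time from a trimmed copy of the sample above. It assumes `created_at` is milliseconds since the Unix epoch, which the sample value suggests but this reference does not state; the helper names are hypothetical.

```python
import json
from datetime import datetime, timedelta, timezone

# Trimmed copy of the sample 200 response above.
SAMPLE = """
{
  "total_results": 1,
  "resources": [
    {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "rev_id": 1
      }
    }
  ]
}
"""


def ms_to_utc(ms: int) -> datetime:
    """Assumption: the timestamp is milliseconds since the Unix epoch."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


def list_versions(body: str) -> list:
    """Return the versions in the response, ordered by rev_id."""
    payload = json.loads(body)
    rows = []
    for res in payload.get("resources", []):
        meta, entity = res["metadata"], res["entity"]
        rows.append(
            {
                "guid": meta["guid"],
                "rev_id": entity["rev_id"],
                "created_at": ms_to_utc(meta["created_at"]),
            }
        )
    return sorted(rows, key=lambda row: row["rev_id"])


versions = list_versions(SAMPLE)
for v in versions:
    print(v["rev_id"], v["guid"], v["created_at"].isoformat())
```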

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Retrieve a particular version of a notebook.

GET /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.
version_guid
Required*
string
The guid of the version.

Response

Response Body

NotebookVersionInProject

A notebook version in a project.

Status Code

200
Success. Returned the version definition.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 200: A notebook version in a project

{
  "metadata": {
    "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
    "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
    "created_at": 1543681714106
  },
  "entity": {
    "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
    "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
    "created_by_iui": "IBMid-123456ABCD",
    "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
    "rev_id": 1
  }
}

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

Delete a particular version of a given notebook.

DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

notebook_guid
Required*
string
The guid of the notebook.
version_guid
Required*
string
The guid of the version.

Response

Status Code

204
Success. The version is deleted.
400
Bad request. One of the fields has invalid format/content.
401
Unauthorized. No/Malformed authentication provided.
403
Forbidden. User is not allowed to perform the target operation.

Example responses

Status 400: Bad request error with status code 400

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_type",
      "message": "The `project` field needs to be a uuid v4, but is 12345.",
      "target": {
        "type": "field",
        "name": "project"
      }
    }
  ]
}

Status 401: Authentication error with status code 401

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {
        "type": "header",
        "name": "Authentication"
      }
    }
  ]
}

Status 403: Authorization error with status code 403

{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "endpoint_access_forbidden",
      "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
    }
  ]
}

This creates a new prompt with the provided parameters.

POST /v1/prompts

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptPost

Response

Response Body

wxPromptResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
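
The prompt methods take their target as a query parameter. The sketch below builds such a URL, reading "One target must be supplied per request" as exactly one of `project_id` or `space_id`; that reading, the Dallas base URL, and the helper name `prompts_url` are assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

# Assumption: Dallas base URL for prompts, as listed under Endpoint URLs.
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"

# Documented pattern for the ID parameters (note that it also admits an empty string).
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def prompts_url(path, project_id=None, space_id=None, **extra):
    """Return BASE_URL + path with exactly one target in the query string."""
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    if project_id is not None:
        name, value = "project_id", project_id
    else:
        name, value = "space_id", space_id
    if not ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must match [a-zA-Z0-9-]*")
    query = {name: value, **extra}
    return f"{BASE_URL}{path}?{urlencode(query)}"


url = prompts_url("/v1/prompts", project_id="63dc4cf1-252f-424b-b52d-5cdd9814987f")
print(url)
```

Validating the ID locally is optional; it only avoids a round trip for values the service would likely reject with a 400.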

This retrieves a prompt / prompt template with the given id.

GET /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
restrict_model_parameters
string
Only return a set of model parameters compatible with inferencing

Default: true

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This updates a prompt / prompt template with the given id.

PATCH /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptPatch

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt / prompt template with the given id.

DELETE /v1/prompts/{prompt_id}

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Modifies the current locked state of a prompt.

PUT /v1/prompts/{prompt_id}/lock

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
force
boolean
Override a lock if it is currently taken.

Request Body

Required*

promptLock

Response

Response Body

promptLock

Status Code

200
Ok - Returned when lock change is successful
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
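
A lock change with the `force` flag could be prepared roughly as follows. The sketch only builds the request. It assumes the Dallas base URL and that the boolean is sent as lowercase `true`; the body `{"locked": True}` is a placeholder, because the `promptLock` schema is not reproduced in this section.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Assumption: Dallas base URL for prompts, as listed under Endpoint URLs.
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"


def build_lock_request(prompt_id, project_id, lock_body, token, force=False):
    """Build (without sending) a PUT /v1/prompts/{prompt_id}/lock request."""
    query = {"project_id": project_id}
    if force:
        # str(True) would produce "True"; lowercase is assumed to be expected.
        query["force"] = "true"
    url = f"{BASE_URL}/v1/prompts/{quote(prompt_id, safe='')}/lock?{urlencode(query)}"
    return urllib.request.Request(
        url,
        data=json.dumps(lock_body).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder body: the real promptLock fields are not shown in this section.
lock_req = build_lock_request(
    "prompt-123",
    "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    {"locked": True},
    "<token>",
    force=True,
)
print(lock_req.get_method(), lock_req.full_url)
```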

Retrieves the current locked state of a prompt.

GET /v1/prompts/{prompt_id}/lock

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

promptLock

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.

POST /v1/prompts/{prompt_id}/input

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

wxPromptInputRequest

Response

Response Body

object

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This adds new chat items to the given prompt.

POST /v1/prompts/{prompt_id}/chat_items

Request

Path Parameters

prompt_id
Required*
string
Prompt ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
space_id
string
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

chatItem[]

Response

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This creates a new prompt session.

POST /v1/prompt_sessions

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptSession

Response

Response Body

wxPromptResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This retrieves a prompt session with the given id.

GET /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
prefetch
boolean
Include the most recent entry

Response

Response Body

wxPromptSession

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This updates a prompt session with the given id.

PATCH /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

object

Response

Response Body

wxPromptSession

Status Code

200
OK - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt session with the given id.

DELETE /v1/prompt_sessions/{session_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This creates a new prompt associated with the given session.

POST /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxPromptSessionEntry

Response

Response Body

wxPromptSessionEntry

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

List entries from a given session.

GET /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
bookmark
string
Bookmark from a previously limited get request

Possible values: Value must match regular expression [a-zA-Z0-9-]*
limit
string
Limit for results to retrieve, default 20

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

wxPromptSessionEntryList

Status Code

200
Success - Returned when search completes
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
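
The `bookmark` and `limit` parameters suggest bookmark-style paging. The sketch below walks pages until no bookmark comes back. The response keys `results` and `bookmark` are hypothetical, since the `wxPromptSessionEntryList` schema is not shown here, so they are exposed as arguments. The HTTP call is injected as `fetch_page`, so the example runs without network access.

```python
from urllib.parse import urlencode


def entries_path(session_id, project_id, limit=20, bookmark=None):
    """Path and query for GET /v1/prompt_sessions/{session_id}/entries."""
    params = {"project_id": project_id, "limit": str(limit)}
    if bookmark:
        params["bookmark"] = bookmark
    return f"/v1/prompt_sessions/{session_id}/entries?{urlencode(params)}"


def iter_entries(fetch_page, session_id, project_id, limit=20,
                 items_key="results", bookmark_key="bookmark"):
    """Yield entries page by page; the key names are assumptions to adjust."""
    bookmark = None
    while True:
        page = fetch_page(entries_path(session_id, project_id, limit, bookmark))
        yield from page.get(items_key, [])
        bookmark = page.get(bookmark_key)
        if not bookmark:
            return


def fake_fetch(path):
    """Stand-in for an HTTP GET that returns two pages of made-up entries."""
    if "bookmark=" not in path:
        return {"results": [{"id": "e1"}, {"id": "e2"}], "bookmark": "abc123"}
    return {"results": [{"id": "e3"}]}


entry_ids = [entry["id"] for entry in iter_entries(fake_fetch, "s-1", "p-1", limit=2)]
print(entry_ids)
```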

This adds new chat items to the given entry.

POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

chatItem[]

Response

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Modifies the current locked state of a prompt session.

PUT /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*
force
boolean
Override a lock if it is currently taken.

Request Body

Required*

promptLock

Response

Response Body

promptLock

Status Code

200
Ok - Returned when lock change is successful
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Retrieves the current locked state of a prompt session.

GET /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

promptLock

Status Code

200
Ok - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This retrieves a prompt session entry with the given id.

GET /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

wxPromptResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

This deletes a prompt session entry with the given id.

DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

session_id
Required*
string
Prompt Session ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*
entry_id
Required*
string
Prompt Session Entry ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters.

POST /ml/v1/text/chat

Auditing

Calling this method generates the following auditing event.

pm-20.text-chat.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextChatRequest

From a given prompt, infer the next tokens.

Examples:

text chat

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    },
    {
      "role": "assistant",
      "content": "The Los Angeles Dodgers won the World Series in 2020."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Where was it played?"
        }
      ]
    }
  ],
  "max_tokens": 100,
  "temperature": 0,
  "time_limit": 1000
}'
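
For readers calling the endpoint from Python without the dedicated library, the same text chat request could be assembled roughly like this. The sketch builds the request object only and uses the Dallas endpoint as the cluster URL; the helper names `text_part` and `build_text_chat_request` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: Dallas endpoint from the Endpoint URLs section.
BASE_URL = "https://us-south.ml.cloud.ibm.com"


def text_part(text):
    """User content is a list of typed parts; this wraps a single text part."""
    return [{"type": "text", "text": text}]


def build_text_chat_request(token, model_id, project_id, messages,
                            version="2023-10-25", **params):
    """Build (without sending) a POST /ml/v1/text/chat request."""
    body = {"model_id": model_id, "project_id": project_id,
            "messages": messages, **params}
    url = f"{BASE_URL}/ml/v1/text/chat?{urlencode({'version': version})}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": text_part("Who won the world series in 2020?")},
    {"role": "assistant",
     "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": text_part("Where was it played?")},
]

chat_req = build_text_chat_request(
    "<token>",
    "meta-llama/llama-3-8b-instruct",
    "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    messages,
    max_tokens=100,
    temperature=0,
    time_limit=1000,
)
print(chat_req.full_url)
```

Serializing the body with `json.dumps` avoids hand-written JSON mistakes such as trailing commas.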

tool call

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the weather like in Boston today?"
        }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "description": "The city, e.g. San Francisco, CA",
              "type": "string"
            },
            "unit": {
              "enum": [
                "celsius",
                "fahrenheit"
              ],
              "type": "string"
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "get_current_weather"
  }
}'

json mode

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    }
  ]
}'
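The request bodies above share one shape: a model ID, a project ID, and a list of messages. As a minimal sketch, the following Python helpers assemble the endpoint URL and the JSON body without sending anything. The names `text_message` and `build_chat_request` are hypothetical, and the version date and IDs are simply the sample values from the curl examples.

```python
import json
from urllib.parse import urlencode


def text_message(role, text):
    """Return a message whose content is a list of typed parts, as in the curl examples."""
    return {"role": role, "content": [{"type": "text", "text": text}]}


def build_chat_request(cluster_url, model_id, project_id, messages,
                       version="2023-10-25", **params):
    """Return (url, body) for POST /ml/v1/text/chat; nothing is sent."""
    url = f"https://{cluster_url}/ml/v1/text/chat?{urlencode({'version': version})}"
    body = {"model_id": model_id, "project_id": project_id, "messages": messages}
    body.update(params)  # optional fields such as max_tokens, temperature, time_limit
    return url, json.dumps(body)


url, body = build_chat_request(
    "us-south.ml.cloud.ibm.com",
    "meta-llama/llama-3-8b-instruct",
    "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        text_message("user", "Who won the world series in 2020?"),
    ],
    max_tokens=100,
    temperature=0,
)
print(url)
```

Serializing with `json.dumps` avoids hand-written JSON mistakes such as trailing commas, which the service would reject as a bad request.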

Response

Response Body

TextChatResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: text_chat

A text chat example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 47,
    "prompt_tokens": 59,
    "total_tokens": 106
  }
}

Status 200: tool_call

A tool calling example.

{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 18,
    "prompt_tokens": 19,
    "total_tokens": 37
  }
}
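In the tool calling response, `function.arguments` is a JSON document encoded as a string, so it needs a second decoding step before use. The sketch below shows one way to do that in Python; `extract_tool_calls` is a hypothetical helper, and the sample dict is a trimmed copy of the response above.

```python
import json


def extract_tool_calls(response):
    """Return (function name, parsed arguments) pairs from a chat response dict."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls", []):
            function = call["function"]
            # "arguments" is a JSON document encoded as a string, so decode it.
            calls.append((function["name"], json.loads(function["arguments"])))
    return calls


sample = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "tool_calls": [{
                "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n",
                },
            }],
        },
        "finish_reason": "stop",
    }]
}
print(extract_tool_calls(sample))
```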

Status 200: json_mode

A text chat example with JSON output.

{
  "id": "cmpl-09945b25c805491fb49e15439b8e5d84",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "created": 1689958352,
  "created_at": "2023-07-21T16:52:32.190Z",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "completion_tokens": 35,
    "prompt_tokens": 20,
    "total_tokens": 55
  }
}

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

POST /ml/v1/text/chat_stream

Auditing

Calling this method generates the following auditing event.

pm-20.text-chat.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextChatRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Response

Response Body

TextChatStreamItem[]

A set of server-sent events, each containing a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
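Because no sample response is given for the stream, the following is only a rough sketch of how a client might consume it. It assumes each event carries exactly one JSON object on a single `data:` line, as the description of the response body suggests. The event contents in the sample stream, including the `delta` field, are invented for illustration and are not taken from the schema.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of every `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators, comments and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream content, for illustration only.
stream = [
    "id: 1\n",
    "event: message\n",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}\n',
    "\n",
    'data: {"choices": [{"index": 0, "delta": {"content": " world"}}]}\n',
    "\n",
]
events = list(iter_sse_json(stream))
print(len(events))
```

The function accepts any iterable of text lines, so the line iterator of a streaming HTTP client can be passed in place of the list.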

Generate embeddings from text input.

See the documentation for a description of text embeddings.

POST /ml/v1/text/embeddings

Auditing

Calling this method generates the following auditing event.

pm-20.text-embeddings.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

EmbeddingsRequest

The text input for a given model to be used to generate the embeddings.

Examples:

generate embeddings

curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "inputs": [
    "Youth craves thrills while adulthood cherishes wisdom.",
    "Youth seeks ambition while adulthood finds contentment.",
    "Dreams chased in youth while goals pursued in adulthood."
  ],
  "model_id": "slate",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

EmbeddingsResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

An array of embeddings for each input string.

{
  "model_id": "slate",
  "results": [
    {
      "embedding": [
        -0.006929283,
        -0.005336422,
        -0.024047505
      ]
    }
  ],
  "created_at": "2024-02-21T17:32:28Z",
  "input_token_count": 10
}
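A common next step is to compare the returned vectors. The sketch below extracts the embeddings from a response-shaped dict and computes cosine similarity in plain Python. Both helper names are hypothetical, and the second vector is a copy of the first, added only so the demo has two inputs.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embeddings_from(response):
    """Pull the raw vectors out of an EmbeddingsResponse-shaped dict."""
    return [item["embedding"] for item in response["results"]]


response = {
    "model_id": "slate",
    "results": [
        {"embedding": [-0.006929283, -0.005336422, -0.024047505]},
        {"embedding": [-0.006929283, -0.005336422, -0.024047505]},  # duplicated for the demo
    ],
}
first, second = embeddings_from(response)
print(cosine_similarity(first, second))
```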

Start a request to extract text and metadata from documents.

See the documentation for a description of text extraction.

POST /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextExtractionRequest

The input for the text extraction request.

Examples:

simple request

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "tables_processing": {
      "enabled": true
    }
  }
}'

ocr request

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "ocr": {
      "languages_list": [
        "en"
      ]
    },
    "tables_processing": {
      "enabled": false
    }
  }
}'

multiple outputs

curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "parameters": {
    "requested_outputs": [
      "assembly",
      "md"
    ],
    "mode": "high_quality",
    "ocr_mode": "enabled"
  }
}'

Response

Response Body

TextExtractionResponse

The text extraction response.

Status Code

201
Created. The Content-Location header will contain the URI reference to the created resource.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201: A simple response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "tables_processing": {
        "enabled": true
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 201: An OCR response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "ocr": {
        "languages_list": [
          "en",
          "fr"
        ]
      },
      "tables_processing": {
        "enabled": false
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 201: Multiple outputs.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly",
        "md"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled"
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Retrieve the list of text extraction requests for the specified space or project.

This operation does not save history; requests that were deleted or purged do not appear in this list.

GET /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
start
string
Token required for token-based pagination. This token cannot be determined by the end user; it is generated by the service and set in the href available in the next field.
limit
integer
How many resources should be returned. By default limit is 100. Max limit allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
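The constraints on these query parameters can be checked on the client before a request is made. The following sketch builds the list URL and applies the documented rules: at least one of space_id or project_id, a length of 36 and the stated regular expression for both, and a limit between 1 and 200. The function name `list_extractions_url` is hypothetical.

```python
import re
from urllib.parse import urlencode

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")


def list_extractions_url(cluster_url, version, space_id=None, project_id=None,
                         start=None, limit=None):
    """Build the GET /ml/v1/text/extractions URL, checking the documented constraints."""
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")
    query = {"version": version}
    for name, value in (("space_id", space_id), ("project_id", project_id)):
        if value is None:
            continue
        if len(value) != 36 or not ID_PATTERN.match(value):
            raise ValueError(f"{name} must be 36 characters matching {ID_PATTERN.pattern}")
        query[name] = value
    if start is not None:
        query["start"] = start  # opaque token issued by the service
    if limit is not None:
        if not 1 <= limit <= 200:
            raise ValueError("limit must be between 1 and 200")
        query["limit"] = limit
    return f"https://{cluster_url}/ml/v1/text/extractions?{urlencode(query)}"


url = list_extractions_url(
    "us-south.ml.cloud.ibm.com", "2023-10-25",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc", limit=50,
)
print(url)
```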

Response

Response Body

TextExtractionResources

A paginated list of resources.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: Get all text extraction requests.

{
  "limit": 10,
  "first": {
    "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text_extractions"
  },
  "resources": [
    {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "results": {
          "status": "completed",
          "number_pages_processed": 3,
          "running_at": "2023-05-02T16:28:03Z",
          "completed_at": "2023-05-02T16:29:31Z"
        }
      }
    }
  ]
}
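The sample response shows only the `first` link. According to the description of the `start` parameter, later pages are reached through a token embedded in the href of a `next` field. As a sketch, and assuming `next` has the same `{"href": ...}` shape as `first`, a client could walk the pages like this. The `fetch_page` callable stands in for the real HTTP call, and the two-page data set is invented.

```python
from urllib.parse import parse_qs, urlparse


def next_start_token(page):
    """Return the `start` token embedded in page["next"]["href"], or None on the last page."""
    href = page.get("next", {}).get("href")
    if not href:
        return None
    tokens = parse_qs(urlparse(href).query).get("start")
    return tokens[0] if tokens else None


def iter_resources(fetch_page):
    """Yield every resource, calling fetch_page(start) until no next link is returned."""
    start = None
    while True:
        page = fetch_page(start)
        yield from page.get("resources", [])
        start = next_start_token(page)
        if start is None:
            break


# Hypothetical two-page result set standing in for real HTTP calls.
PAGES = {
    None: {
        "resources": [{"metadata": {"id": "a"}}],
        "next": {"href": "https://us-south.ml.cloud.ibm.com/ml/v1/text/extractions?start=tok123&limit=1"},
    },
    "tok123": {"resources": [{"metadata": {"id": "b"}}]},
}
ids = [r["metadata"]["id"] for r in iter_resources(PAGES.get)]
print(ids)
```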

Retrieve the text extraction request with the specified identifier.

Note that there is a retention period of 2 days. Once this period is exceeded, the request is deleted and its results are no longer available; this operation then returns 404.

GET /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.get

Request

Path Parameters

id
Required*
string
The identifier of the extraction request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

get results

curl --request GET 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Accept: application/json'

Response

Response Body

TextExtractionResponse

The text extraction response.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "tables_processing": {
        "enabled": true
      }
    },
    "results": {
      "status": "running",
      "number_pages_processed": 2,
      "running_at": "2023-05-02T16:28:03Z"
    }
  }
}

Status 200: An OCR response.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "steps": {
      "ocr": {
        "languages_list": [
          "en",
          "fr"
        ]
      },
      "tables_processing": {
        "enabled": false
      }
    },
    "results": {
      "status": "submitted",
      "number_pages_processed": 0
    }
  }
}

Status 200: Multiple outputs.

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "created_at": "2023-05-02T16:27:51Z",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "extract"
  },
  "entity": {
    "document_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
      },
      "location": {
        "file_name": "files/document.pdf"
      }
    },
    "results_reference": {
      "type": "connection_asset",
      "connection": {
        "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
      },
      "location": {
        "file_name": "results"
      }
    },
    "parameters": {
      "requested_outputs": [
        "assembly",
        "md"
      ],
      "mode": "high_quality",
      "ocr_mode": "enabled"
    },
    "results": {
      "status": "running",
      "number_pages_processed": 2,
      "running_at": "2023-05-02T16:28:03Z"
    }
  }
}
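Since extraction runs asynchronously, a client typically polls this operation until `entity.results.status` stops changing. The sketch below shows one possible polling loop. The statuses `submitted`, `running`, and `completed` appear in the sample responses; treating `failed` as a further terminal state is an assumption. The `fetch` callable stands in for the real GET request.

```python
import time

# "completed" appears in the sample responses; "failed" is an assumed extra terminal state.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_extraction(fetch, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until entity.results.status is terminal, then return the response."""
    status = None
    for attempt in range(max_attempts):
        response = fetch()
        status = response["entity"]["results"]["status"]
        if status in TERMINAL_STATES:
            return response
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"extraction still '{status}' after {max_attempts} attempts")


# Simulated sequence of responses in place of real GET calls.
responses = iter([
    {"entity": {"results": {"status": "submitted", "number_pages_processed": 0}}},
    {"entity": {"results": {"status": "running", "number_pages_processed": 2}}},
    {"entity": {"results": {"status": "completed", "number_pages_processed": 3}}},
])
final = wait_for_extraction(lambda: next(responses), sleep=lambda seconds: None)
print(final["entity"]["results"]["status"])
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A production client would also need to handle the 404 returned once the retention period has passed.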

Cancel the specified text extraction request and delete any associated results.

DELETE /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

pm-20.text-extraction.delete

Request

Path Parameters

id
Required*
string
The identifier of the extraction request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

delete results

curl --request DELETE 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Status Code

204
Request deleted.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer the next tokens for a given deployed model with a set of parameters.

This API is legacy; consider using Text Chat.

POST /ml/v1/text/generation

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextGenRequest

From a given prompt, infer the next tokens.

Examples:

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
  "model_id": "google/flan-t5-xxl",
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextGenResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A response without moderations.

The generated text from the model along with other details.

{
  "model_id": "google/flan-ul2",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "4,000 km",
      "generated_token_count": 4,
      "input_token_count": 12,
      "stop_reason": "eos_token"
    }
  ]
}

Status 200: A response with moderations.

The generated text from the model along with other details.

{
  "model_id": "google/flan-t5-xl",
  "created_at": "2023-07-21T16:52:32.190Z",
  "results": [
    {
      "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
      "generated_token_count": 118,
      "input_token_count": 11,
      "stop_reason": "eos_token",
      "moderations": {
        "pii": [
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 74,
              "end": 88
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 200,
              "end": 212
            },
            "entity": "PhoneNumber"
          },
          {
            "score": 0.8,
            "input": false,
            "position": {
              "start": 244,
              "end": 259
            },
            "entity": "EmailAddress"
          }
        ]
      }
    }
  ]
}
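In the sample with moderations, each `position` appears to be a zero-based character range into `generated_text` with an exclusive `end`: the first hit, 74 to 88, covers the 14 masked characters after "or call ". Under that reading, and assuming `"input": false` marks a hit in the generated output rather than in the prompt, the flagged spans can be recovered as in the sketch below. `moderation_spans` is a hypothetical helper and the short sample text is invented.

```python
def moderation_spans(result, kind="pii"):
    """Return (entity, matched_text, score) for each output-side moderation hit."""
    text = result["generated_text"]
    spans = []
    for hit in result.get("moderations", {}).get(kind, []):
        if hit.get("input"):
            continue  # assumed to refer to the prompt, not generated_text
        start, end = hit["position"]["start"], hit["position"]["end"]
        spans.append((hit["entity"], text[start:end], hit["score"]))
    return spans


# Invented sample; real responses mask the flagged characters with asterisks.
result = {
    "generated_text": "call 555-0100 now",
    "moderations": {
        "pii": [{"score": 0.8, "input": False,
                 "position": {"start": 5, "end": 13}, "entity": "PhoneNumber"}]
    },
}
print(moderation_spans(result))
```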

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

This API is legacy; consider using Text Chat Stream.

POST /ml/v1/text/generation_stream

Auditing

Calling this method generates the following auditing event.

pm-20.foundation-model.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextGenRequest

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
  "model_id": "google/flan-t5-xxl",
  "input": "how far is paris from bangalore:",
  "parameters": {
    "max_new_tokens": 100,
    "time_limit": 1000
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextGenResponse[]

A set of server-sent events, each containing a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

200
Successful operation (Content-Type: text/event-stream).
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Rerank texts based on some queries.

POST /ml/v1/text/rerank

Auditing

Calling this method generates the following auditing event.

pm-20.text-rerank.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

RerankRequest

The input texts and the queries for reranking.

Examples:

sample request

curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    {
      "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
    },
    {
      "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
    }
  ],
  "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
  "parameters": {
    "return_options": {
      "top_n": 2
    }
  }
}'

Response

Response Body

RerankResponse

System details.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

An array of scored results that reference the input texts by index.

{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "results": [
    {
      "index": 1,
      "score": 0.7461
    },
    {
      "index": 0,
      "score": 0.8274
    }
  ],
  "created_at": "2024-02-21T17:32:28Z",
  "input_token_count": 20
}
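The results refer back to the request through `index`. Assuming that value is the zero-based position of the text in the request's `inputs` array, the following sketch pairs each score with its text and orders the pairs from highest to lowest score on the client side. `ranked_texts` is a hypothetical helper and the input texts are shortened placeholders.

```python
def ranked_texts(inputs, response):
    """Pair each result with its input text, highest score first."""
    ordered = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    return [(inputs[r["index"]]["text"], r["score"]) for r in ordered]


inputs = [{"text": "first passage"}, {"text": "second passage"}]
response = {"results": [{"index": 1, "score": 0.7461}, {"index": 0, "score": 0.8274}]}
print(ranked_texts(inputs, response))
```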

The text tokenize operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which are then converted to IDs through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.

POST /ml/v1/text/tokenization

Auditing

Calling this method generates the following auditing event.

pm-20.text-tokenization.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TextTokenizeRequest

The input string to tokenize.

Examples:

post request

curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
--data-raw '{
  "model_id": "google/flan-ul2",
  "input": "Write a tagline for an alumni association: Together we",
  "parameters": {
    "return_tokens": true
  },
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
}'

Response

Response Body

TextTokenizeResponse

The tokenization result.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: The response with the token count.

The response with the token count and the tokens, if requested.

{
  "model_id": "google/flan-ul2",
  "result": {
    "token_count": 11,
    "tokens": [
      "Write",
      "a",
      "tag",
      "line",
      "for",
      "an",
      "alumni",
      "associ",
      "ation:",
      "Together",
      "we"
    ]
  }
}

Generate forecasts, or predictions for future time points, given historical time series data.

POST /ml/v1/time_series/forecast

Auditing

Calling this method generates the following auditing event.

pm-20.time-series-forecast.send

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TSForecastRequest

The forecast request.

Examples:

sample request

curl --request POST 'https://{cluster_url}/ml/v1/time_series/forecast?version=2023-10-25' \
-H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "model_id": "ibm/ttm-1024-96-r2",
  "schema": {
    "timestamp_column": "date",
    "id_columns": [
      "ID1"
    ]
  },
  "data": {
    "date": [
      "2020-01-01T00:00:00",
      "2020-01-01T01:00:00",
      "2020-01-05T01:00:00"
    ],
    "ID1": [
      "D1",
      "D1",
      "D1"
    ],
    "TARGET1": [
      1.46,
      2.34,
      4.55
    ]
  }
}'
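The `data` object in this request is column-oriented: one array per column, all of the same length. Data often starts out as rows, so the sketch below shows one way to pivot row records into that layout. `rows_to_columns` is a hypothetical helper, and the remaining payload fields are the sample values from the request above.

```python
def rows_to_columns(rows, columns):
    """Convert row dicts into the column-oriented `data` object used by the request."""
    data = {name: [] for name in columns}
    for row in rows:
        for name in columns:
            data[name].append(row[name])
    return data


rows = [
    {"date": "2020-01-01T00:00:00", "ID1": "D1", "TARGET1": 1.46},
    {"date": "2020-01-01T01:00:00", "ID1": "D1", "TARGET1": 2.34},
]
payload = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "model_id": "ibm/ttm-1024-96-r2",
    "schema": {"timestamp_column": "date", "id_columns": ["ID1"]},
    "data": rows_to_columns(rows, ["date", "ID1", "TARGET1"]),
}
print(payload["data"])
```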

Response

Response Body

TSForecastResponse

The time series forecast response.

Status Code

200
Successful operation
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200: A sample response.

{
  "model_id": "ibm/ttm-1024-96-r2",
  "created_at": "2020-05-02T16:27:51Z",
  "results": [
    {
      "date": [
        "2020-01-05T02:00:00",
        "2020-01-05T03:00:00",
        "2020-01-06T00:00:00"
      ],
      "ID1": [
        "D1",
        "D1",
        "D1"
      ],
      "TARGET1": [
        1.86,
        3.24,
        6.78
      ]
    }
  ],
  "input_data_points": 512,
  "output_data_points": 1024
}
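
The sample request and response use a column-oriented layout: each key under data, and each entry in results, maps a column name to a list of values. The following Python sketch is not part of the official client library. It builds such a payload and turns one response entry back into per-timestamp rows. The helper names build_forecast_payload and result_rows are hypothetical, and the equal-length check is an assumption about what the service expects.

```python
import json


def build_forecast_payload(project_id, model_id, timestamp_column, id_columns, data):
    """Assemble a body shaped like the sample TSForecastRequest."""
    lengths = {name: len(values) for name, values in data.items()}
    # Assumption: every column holds the same number of observations.
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns differ in length: {lengths}")
    for column in [timestamp_column, *id_columns]:
        if column not in data:
            raise ValueError(f"schema column {column!r} is missing from data")
    return {
        "project_id": project_id,
        "model_id": model_id,
        "schema": {
            "timestamp_column": timestamp_column,
            "id_columns": list(id_columns),
        },
        "data": data,
    }


def result_rows(result):
    """Turn one column-oriented entry of `results` into a list of row dicts."""
    columns = list(result)
    return [dict(zip(columns, values)) for values in zip(*result.values())]


payload = build_forecast_payload(
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
    model_id="ibm/ttm-1024-96-r2",
    timestamp_column="date",
    id_columns=["ID1"],
    data={
        "date": ["2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-05T01:00:00"],
        "ID1": ["D1", "D1", "D1"],
        "TARGET1": [1.46, 2.34, 4.55],
    },
)
body = json.dumps(payload)  # string to send as the POST body

# One entry of `results`, copied from the sample response.
sample_result = {
    "date": ["2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.86, 3.24, 6.78],
}
rows = result_rows(sample_result)
```

Row-shaped output is usually easier to load into a table or plot than the parallel lists the API returns.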

Create a new watsonx.ai training in a project or a space.

The details of the base model and parameters for the training must be provided in the prompt_tuning object.

To deploy the tuned model, follow these steps:

Create a WML model asset, in a space or a project, by providing the request.json as shown below:

curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{
     "name": "replace_with_a_meaningful_name",
     "space_id": "replace_with_your_space_id",
     "type": "prompt_tune_1.0",
     "software_spec": {
       "name": "watsonx-textgen-fm-1.0"
     },
     "metrics": [ from the training job ],
     "training": {
       "id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
       "base_model": {
         "model_id": "google/flan-t5-xl"
       },
       "task_id": "generation",
       "verbalizer": "Input: {{input}} Output:"
     },
     "training_data_references": [
       {
         "connection": {
           "id": "20933468-7e8a-4706-bc90-f0a09332b263"
         },
         "id": "file_to_tune1.json",
         "location": {
           "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
           "path": "file_to_tune1.json"
         },
         "type": "connection_asset"
       }
     ]
   }'

Notes:

If you set auto_update_model: true in the training request, you can skip this step because the model is saved at the end of the training job.
Rather than creating the payload for the model yourself, you can use the generated request.json that was stored under the results_reference field. Look for its path in the field entity.results_reference.location.model_request_path.
The model type must be prompt_tune_1.0.
The software spec name must be watsonx-textgen-fm-1.0.

Create a tuned model deployment as described in the create deployment documentation.
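
The notes above describe two shortcuts: skipping model creation when auto_update_model was set, and reusing the generated request.json. A minimal Python sketch of reading both from a training resource might look like this. The function names are hypothetical, the sample dictionary is trimmed from the example training response, and treating a missing auto_update_model as false is an assumption.

```python
def model_request_path(training):
    """Return the stored request.json path from a training resource, or None."""
    entity = training.get("entity", {})
    location = entity.get("results_reference", {}).get("location", {})
    return location.get("model_request_path")


def needs_manual_model_creation(training):
    """True when the training job did not save the model automatically."""
    # Assumption: an absent auto_update_model field means false.
    return not training.get("entity", {}).get("auto_update_model", False)


# Trimmed copy of the fields that matter from a training resource.
training = {
    "entity": {
        "auto_update_model": True,
        "results_reference": {
            "location": {
                "path": "tune1/results",
                "model_request_path": (
                    "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/"
                    "c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
                ),
            },
            "type": "container",
        },
    }
}

skip_step_one = not needs_manual_model_creation(training)
request_json_path = model_request_path(training)
```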

POST /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

pm-20.training.create

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TrainingResourcePrototype

The training_data_references contain the training datasets and the results_reference the connection where results will be stored.

Examples:

Prompt tuning

curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "name": "my-prompt-tune-training",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "prompt_tuning": {
    "base_model": {
      "model_id": "google/flan-t5-xl"
    },
    "task_id": "classification",
    "tuning_type": "prompt_tuning",
    "num_epochs": 30,
    "learning_rate": 0.4,
    "accumulate_steps": 3,
    "batch_size": 10,
    "max_input_tokens": 100,
    "max_output_tokens": 100
  },
  "training_data_references": [
    {
      "id": "tune1_data.json",
      "location": {
        "path": "tune1_data.json"
      },
      "type": "container"
    }
  ],
  "auto_update_model": true,
  "results_reference": {
    "location": {
      "path": "tune1/results"
    },
    "type": "container"
  }
}'

Response

Response Body

TrainingResource

Training resource.

Status Code

201
The training job has been created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 201

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "name": "my-prompt-training",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "created_at": "2023-08-04T13:22:47.000Z"
  },
  "entity": {
    "prompt_tuning": {
      "base_model": {
        "model_id": "google/flan-t5-xl"
      },
      "task_id": "classification"
    },
    "training_data_references": [
      {
        "id": "tune1_data.json",
        "location": {
          "path": "tune1_data.json"
        },
        "type": "container"
      }
    ],
    "auto_update_model": true,
    "results_reference": {
      "location": {
        "path": "tune1/results",
        "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
        "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
        "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
        "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
      },
      "type": "container"
    },
    "status": {
      "state": "completed",
      "running_at": "2023-08-04T13:22:48.000Z",
      "completed_at": "2023-08-04T13:22:55.289Z",
      "metrics": [
        {
          "iteration": 0,
          "ml_metrics": {
            "loss": 4.49988
          },
          "timestamp": "2023-09-22T02:52:03.324Z"
        },
        {
          "iteration": 1,
          "ml_metrics": {
            "loss": 3.86884
          },
          "timestamp": "2023-09-22T02:52:03.689Z"
        },
        {
          "iteration": 2,
          "ml_metrics": {
            "loss": 4.05115
          },
          "timestamp": "2023-09-22T02:52:04.053Z"
        }
      ]
    }
  }
}
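
As an illustration only, the prompt tuning request body could be assembled in Python before it is posted. The sketch below mirrors the field names of the sample request. The function name build_prompt_tuning_request is hypothetical, and requiring exactly one of project_id or space_id is an interpretation of "in a project or a space" rather than a documented rule.

```python
def build_prompt_tuning_request(name, base_model_id, task_id, data_path, results_path,
                                *, project_id=None, space_id=None,
                                auto_update_model=True, **tuning_params):
    """Assemble a body shaped like the sample TrainingResourcePrototype."""
    # Interpretation: the training targets exactly one project or space.
    if (project_id is None) == (space_id is None):
        raise ValueError("provide exactly one of project_id or space_id")
    target = {"project_id": project_id} if project_id is not None else {"space_id": space_id}
    return {
        "name": name,
        **target,
        "prompt_tuning": {
            "base_model": {"model_id": base_model_id},
            "task_id": task_id,
            "tuning_type": "prompt_tuning",
            **tuning_params,  # e.g. num_epochs, learning_rate, batch_size
        },
        "training_data_references": [
            {"id": data_path, "location": {"path": data_path}, "type": "container"}
        ],
        "auto_update_model": auto_update_model,
        "results_reference": {"location": {"path": results_path}, "type": "container"},
    }


request_body = build_prompt_tuning_request(
    "my-prompt-tune-training",
    "google/flan-t5-xl",
    "classification",
    "tune1_data.json",
    "tune1/results",
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
    num_epochs=30,
    learning_rate=0.4,
    accumulate_steps=3,
    batch_size=10,
    max_input_tokens=100,
    max_output_tokens=100,
)
```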

Retrieve the list of trainings for the specified space or project.

GET /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

pm-20.training.list

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
start
string
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
limit
integer
The number of resources to return. The default limit is 100 and the maximum allowed is 200.

Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
total_count
boolean
Compute the total count. May have performance impact.
tag.value
string
Return only the resources with the given tag value.
state
string
Filter based on the training job state.

Allowable values: [queued,pending,running,storing,completed,failed,canceled]
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Response Body

TrainingResourceCollection

Information for paging when querying resources.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200

{
  "limit": 100,
  "first": {
    "href": "https://{cluster_url}/ml/v4/trainings"
  },
  "total_count": 1,
  "resources": [
    {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
  ]
}
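
The query parameter rules and the pagination token can be sketched in Python as follows. This is a minimal illustration and not the official client: the function names are hypothetical, the shape of the next field ({"href": "..."}) is assumed by analogy with the first field in the sample response, and sending booleans as lowercase true/false is also an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

TRAINING_STATES = {"queued", "pending", "running", "storing",
                   "completed", "failed", "canceled"}


def list_trainings_query(version, *, project_id=None, space_id=None,
                         limit=100, state=None, start=None, total_count=None):
    """Build the query string for GET /ml/v4/trainings with basic validation."""
    if (project_id is None) == (space_id is None):
        raise ValueError("provide exactly one of project_id or space_id")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if state is not None and state not in TRAINING_STATES:
        raise ValueError(f"unknown state: {state!r}")
    params = {"version": version, "limit": limit}
    params["project_id" if project_id is not None else "space_id"] = (
        project_id if project_id is not None else space_id
    )
    if state is not None:
        params["state"] = state
    if start is not None:
        params["start"] = start
    if total_count is not None:
        params["total_count"] = "true" if total_count else "false"
    return urlencode(params)


def next_start_token(page):
    """Pull the `start` token out of the page's next.href, if there is one."""
    href = page.get("next", {}).get("href")
    if not href:
        return None
    values = parse_qs(urlsplit(href).query).get("start")
    return values[0] if values else None


query = list_trainings_query(
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    limit=50,
    state="completed",
)
```

A caller would keep requesting pages, passing the value from next_start_token as start, until the function returns None.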

Retrieve the training with the specified identifier.

GET /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

pm-20.training.get

Request

Path Parameters

training_id
Required*
string
The training identifier.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Response Body

TrainingResource

Training resource.

Status Code

200
OK.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

Example responses

Status 200

{
  "metadata": {
    "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
    "name": "my-prompt-training",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "created_at": "2023-08-04T13:22:47.000Z"
  },
  "entity": {
    "prompt_tuning": {
      "base_model": {
        "model_id": "google/flan-t5-xl"
      },
      "task_id": "classification"
    },
    "training_data_references": [
      {
        "id": "tune1_data.json",
        "location": {
          "path": "tune1_data.json"
        },
        "type": "container"
      }
    ],
    "auto_update_model": true,
    "results_reference": {
      "location": {
        "path": "tune1/results",
        "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
        "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
        "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
        "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
      },
      "type": "container"
    },
    "status": {
      "state": "completed",
      "running_at": "2023-08-04T13:22:48.000Z",
      "completed_at": "2023-08-04T13:22:55.289Z",
      "metrics": [
        {
          "iteration": 0,
          "ml_metrics": {
            "loss": 4.49988
          },
          "timestamp": "2023-09-22T02:52:03.324Z"
        },
        {
          "iteration": 1,
          "ml_metrics": {
            "loss": 3.86884
          },
          "timestamp": "2023-09-22T02:52:03.689Z"
        },
        {
          "iteration": 2,
          "ml_metrics": {
            "loss": 4.05115
          },
          "timestamp": "2023-09-22T02:52:04.053Z"
        }
      ]
    }
  }
}
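
A client that polls this endpoint typically needs two things from the response: whether the job has stopped, and how the loss evolved. The Python sketch below reads both from a training resource. The helper names are hypothetical, and treating completed, failed, and canceled as the terminal states is an assumption based on the allowable state values, not a documented rule.

```python
# Assumption: these three states mean the job will not change any further.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def is_finished(training):
    """True when the training state is one of the assumed terminal states."""
    return training["entity"]["status"]["state"] in TERMINAL_STATES


def loss_by_iteration(training):
    """Map iteration number to loss for every metric entry that reports a loss."""
    metrics = training["entity"]["status"].get("metrics", [])
    return {
        entry["iteration"]: entry["ml_metrics"]["loss"]
        for entry in metrics
        if "loss" in entry.get("ml_metrics", {})
    }


# Status section trimmed from the sample response.
training = {
    "entity": {
        "status": {
            "state": "completed",
            "metrics": [
                {"iteration": 0, "ml_metrics": {"loss": 4.49988}},
                {"iteration": 1, "ml_metrics": {"loss": 3.86884}},
                {"iteration": 2, "ml_metrics": {"loss": 4.05115}},
            ],
        }
    }
}

finished = is_finished(training)
losses = loss_by_iteration(training)
best_iteration = min(losses, key=losses.get)
```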

Cancel or delete the specified training. Once deleted, all trace of the job is gone.

DELETE /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

pm-20.training.delete

Request

Path Parameters

training_id
Required*
string
The training identifier.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
hard_delete
boolean
Set to true in order to also delete the job or request metadata.

Response

Status Code

204
Training cancelled.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

This creates a new vector index with the provided parameters.

POST /v1/vector_indexes

Request

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPost

Response

Response Body

vectorIndexResponse

Status Code

201
Created - Returned when created
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 201: create_vector_index

Create a vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
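
The created_at and last_updated_at values in this response look like Unix timestamps in milliseconds. That reading is an assumption, since the unit is not stated here. Under that assumption, a short Python sketch converts them to timezone-aware datetimes. The helper name from_epoch_millis is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def from_epoch_millis(ms):
    """Convert integer milliseconds since the Unix epoch to a UTC datetime."""
    seconds, millis = divmod(ms, 1000)
    # Integer arithmetic avoids floating point rounding of the millisecond part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


created = from_epoch_millis(1739888788777)
updated = from_epoch_millis(1739888804362)
elapsed_seconds = (updated - created).total_seconds()
```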

This retrieves a vector index with the given id.

GET /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned from GET when it succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: get_vector_indexes

Get vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
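
Vector index methods use the /wx base URLs listed in the Endpoint URLs section rather than the ml.cloud.ibm.com hosts. A minimal Python sketch of composing such a URL, with the ID pattern check from the parameter tables, is shown below. The function name and the lowercase region keys are hypothetical, and rejecting empty IDs is an assumption (the documented pattern would accept them).

```python
import re
from urllib.parse import urlencode

# Base URLs copied from the Endpoint URLs section; the dictionary keys are arbitrary labels.
WX_BASE_URLS = {
    "dallas": "https://api.dataplatform.cloud.ibm.com/wx",
    "frankfurt": "https://api.eu-de.dataplatform.cloud.ibm.com/wx",
    "london": "https://api.eu-gb.dataplatform.cloud.ibm.com/wx",
    "tokyo": "https://api.jp-tok.dataplatform.cloud.ibm.com/wx",
    "sydney": "https://api.au-syd.dai.cloud.ibm.com/wx",
    "toronto": "https://api.ca-tor.dai.cloud.ibm.com/wx",
    "mumbai": "https://api.ap-south-1.aws.data.ibm.com/wx",
}

# Documented pattern is [a-zA-Z0-9-]*; "+" is used here to also reject empty IDs.
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]+")


def vector_index_url(region, project_id, index_id=None):
    """Build the URL for the collection, or for one index when index_id is given."""
    if not ID_PATTERN.fullmatch(project_id):
        raise ValueError(f"invalid project_id: {project_id!r}")
    if index_id is not None and not ID_PATTERN.fullmatch(index_id):
        raise ValueError(f"invalid index_id: {index_id!r}")
    path = "/v1/vector_indexes" + (f"/{index_id}" if index_id is not None else "")
    return f"{WX_BASE_URLS[region]}{path}?{urlencode({'project_id': project_id})}"


url = vector_index_url(
    "dallas",
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
    index_id="43499a2a-7656-43d6-8ce0-374d34449d4f",
)
```

The same URL serves GET, PATCH, and DELETE for a single index. Omitting index_id gives the collection URL used by POST.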

This updates a vector index with the given id.

PATCH /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPatch

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned when the update succeeds
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

Example responses

Status 200: patch_vector_indexes

Response with updated vector index.

{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-Patched",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [
    "9624a20d-ecd0-450e-b7d2-9941ce7d1c57"
  ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}

This deletes a vector index with the given id.

DELETE /v1/vector_indexes/{index_id}

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

204
No Content - Returned on success
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

TO BE USED ONLY WITH IN-MEMORY VECTOR STORE. This is to update the attachments/objects associated with the vector index.

PUT /v1/vector_indexes/{index_id}/attachment

Request

Path Parameters

index_id
Required*
string
Vector index ID

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

project_id
Required*
string
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

wxVectorIndexPut

Response

Response Body

vectorIndexResponse

Status Code

200
OK - Returned when the attachment is successful.
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Create a document extraction.

POST /ml/v1/tuning/documents

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

DocumentExtractionRequest

The properties of a request that supports spaces and projects. Either space_id or project_id must be provided.

Response

Response Body

DocumentExtractionResource

The response from getting a specified document extraction job.

Status Code

201
The document extraction job has been created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.

Get document extractions.

GET /ml/v1/tuning/documents

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

DocumentExtractionResources

The response of getting all document extraction jobs.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.

Get document extraction.

GET /ml/v1/tuning/documents/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

DocumentExtractionResource

The response from getting a specified document extraction job.

Status Code

200
OK. The document extraction job was retrieved.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Cancel the specified document extraction and remove it.

DELETE /ml/v1/tuning/documents/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
hard_delete
boolean
Set to true in order to also delete the job metadata information.

Response

Status Code

204
Document extraction cancelled.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
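
The document extraction methods share the same query parameters: version, one of project_id or space_id, and hard_delete on DELETE. One way to compose these URLs in Python is sketched below. The function names and the example job ID are hypothetical, and requiring exactly one of the two scope IDs is an interpretation of "Either space_id or project_id query parameter has to be given".

```python
import re
from urllib.parse import quote, urlencode

SCOPE_ID = re.compile(r"[a-zA-Z0-9-]*")  # pattern from the parameter tables


def scope_params(version, *, project_id=None, space_id=None):
    """Return the version plus exactly one validated scope parameter."""
    if (project_id is None) == (space_id is None):
        raise ValueError("provide exactly one of project_id or space_id")
    key, value = (("project_id", project_id) if project_id is not None
                  else ("space_id", space_id))
    # The tables state a fixed length of 36 characters for both IDs.
    if len(value) != 36 or not SCOPE_ID.fullmatch(value):
        raise ValueError(f"{key} must be 36 characters matching [a-zA-Z0-9-]")
    return {"version": version, key: value}


def document_extraction_url(base_url, job_id=None, *, hard_delete=None, **scope):
    """Build the URL for the job collection, or for one job when job_id is given."""
    params = scope_params(**scope)
    if hard_delete is not None:
        # Assumption: booleans are sent as lowercase true/false.
        params["hard_delete"] = "true" if hard_delete else "false"
    path = "/ml/v1/tuning/documents"
    if job_id is not None:
        path += "/" + quote(job_id, safe="")
    return f"{base_url}{path}?{urlencode(params)}"


delete_url = document_extraction_url(
    "https://us-south.ml.cloud.ibm.com",
    "example-job-id",  # hypothetical identifier
    hard_delete=True,
    version="2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
)
```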

Create a synthetic unstructured data generation job.

POST /v1/synthetic_data/generation/unstructured

Request

Request Body

Required*

SDGUnstructuredGenerationRequest

The details needed to create an unstructured synthetic data generation job.

The seed_data_reference.type must be container and the results_reference.type must also be container.

Response

Response Body

SDGUnstructuredGenerationResource

The response from getting a specified synthetic data generation job.

Status Code

201
The synthetic data generation job has been created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.
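
Because the request body must use container for both seed_data_reference.type and results_reference.type, a client may want to check this before sending. The Python sketch below shows one possible check. The function name, the rule table, and the location paths in the sample body are hypothetical, and only the two field names stated above are taken from this reference.

```python
def check_reference_types(request, expected):
    """Return a list of problems; an empty list means every type matches."""
    problems = []
    for field, required_type in expected.items():
        actual = request.get(field, {}).get("type")
        if actual != required_type:
            problems.append(f"{field}.type must be {required_type!r}, got {actual!r}")
    return problems


# Constraint stated for the unstructured generation request.
UNSTRUCTURED_RULES = {
    "seed_data_reference": "container",
    "results_reference": "container",
}

# Hypothetical request bodies; only the `type` fields are checked.
good_request = {
    "seed_data_reference": {"type": "container", "location": {"path": "seed.json"}},
    "results_reference": {"type": "container", "location": {"path": "sdg/results"}},
}
bad_request = {
    "seed_data_reference": {"type": "data_asset"},
    "results_reference": {"type": "container"},
}

good_problems = check_reference_types(good_request, UNSTRUCTURED_RULES)
bad_problems = check_reference_types(bad_request, UNSTRUCTURED_RULES)
```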

Create a synthetic data generation job.

POST /ml/v1/tuning/synthetic_data

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

SyntheticDataGenerationRequest

The details needed to create a synthetic data generation job.

The data_reference.type must be taxonomy_asset and the results_reference.type will normally be something like connection_asset or data_asset.

Response

Response Body

SyntheticDataGenerationResource

The response from getting a specified synthetic data generation job.

Status Code

201
The synthetic data generation job has been created
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.

GET /ml/v1/tuning/synthetic_data

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

SyntheticDataGenerationResources

The response of getting all synthetic data generation jobs.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.

GET /ml/v1/tuning/synthetic_data/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

SyntheticDataGenerationResource

The response from getting a specified synthetic data generation job.

Status Code

200
OK. The synthetic data generation job was retrieved.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Cancel the synthetic data generation and remove it.

DELETE /ml/v1/tuning/synthetic_data/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
hard_delete
boolean
Set to true in order to also delete the job metadata information.

Response

Status Code

204
Synthetic data generation cancelled.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a taxonomy job.

POST /ml/v1/tuning/taxonomies_imports

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07

Request Body

Required*

TaxonomyRequest

The request fields to create a Taxonomy job.

The data_reference.type must be set to github.

InstructLab Taxonomy

curl --request POST 'https://{cluster_url}/ml/v1/tuning/taxonomies_imports?version=2023-10-25' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "name": "taxonomyName",
  "description": "Taxonomy",
  "project_id": "bfdae754-f0ef-45c6-a982-50b222f82015",
  "data_reference": {
    "type": "github",
    "location": {
      "secret_manager_url": "https://5db94803-9c37-498b-b4bd-d601ac4a0518.eu-gb.secrets-manager.test.appdomain.cloud/api/v2/secrets",
      "secret_id": "539f678e-3436-5d70-5c62-e98250bf0427",
      "path": "."
    }
  }
}'

Response

Response Body

TaxonomyResource

The response fields from a Taxonomy request.

Status Code

201
The taxonomy job has been created.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.
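The curl request for creating a taxonomy job can also be assembled in Python. The following is a minimal sketch using only the standard library, not an official client. The helper name build_taxonomy_import_request and the BASE_URL constant are hypothetical; the Dallas URL is assumed to apply, so substitute the base URL for your region. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the Dallas base URL from the Endpoint URLs section.
BASE_URL = "https://us-south.ml.cloud.ibm.com"


def build_taxonomy_import_request(token, project_id, secret_manager_url,
                                  secret_id, name="taxonomyName",
                                  version="2023-10-25"):
    """Build (but do not send) the POST request that creates a taxonomy job."""
    body = {
        "name": name,
        "description": "Taxonomy",
        "project_id": project_id,
        "data_reference": {
            # The reference states that the type must be set to github.
            "type": "github",
            "location": {
                "secret_manager_url": secret_manager_url,
                "secret_id": secret_id,
                "path": ".",
            },
        },
    }
    return urllib.request.Request(
        f"{BASE_URL}/ml/v1/tuning/taxonomies_imports?version={version}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen; a successful call is documented to return status 201.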

Retrieve the taxonomy jobs in the specified project or space.

GET /ml/v1/tuning/taxonomies_imports

Request

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

TaxonomyResources

The list of taxonomy jobs in the specified project or space.

Status Code

200
OK
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.

No Sample Response

This method does not specify any sample responses.

Retrieve the details of a taxonomy job.

GET /ml/v1/tuning/taxonomies_imports/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

Response

Response Body

TaxonomyResource

The response fields from a Taxonomy request.

Status Code

200
The taxonomy job details were retrieved.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Cancel or delete the taxonomy job.

DELETE /ml/v1/tuning/taxonomies_imports/{id}

Request

Path Parameters

id
Required*
string
The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

version
Required*
date
The version date for the API of the form YYYY-MM-DD.

Example: 2023-07-07
project_id
string
The project that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
space_id
string
The space that contains the resource. Either space_id or project_id query parameter has to be given.

Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
hard_delete
boolean
Set to true in order to also delete the job metadata information.

Response

Status Code

204
Taxonomy cancelled.
400
Bad request, the response body should contain the reason.
401
Unauthorized.
403
Forbidden, an authentication error including trying to access an unauthorized space or project.
404
The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
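The query parameters for cancelling or deleting a taxonomy job can be combined as in the following sketch. The function name taxonomy_job_url is hypothetical, and the check only enforces what the reference states: at least one of project_id or space_id has to be given.

```python
from urllib.parse import quote, urlencode


def taxonomy_job_url(base_url, job_id, version, project_id=None,
                     space_id=None, hard_delete=None):
    """Build the URL for GET or DELETE on a single taxonomy job."""
    if project_id is None and space_id is None:
        raise ValueError("either project_id or space_id has to be given")
    params = {"version": version}
    if project_id is not None:
        params["project_id"] = project_id
    if space_id is not None:
        params["space_id"] = space_id
    if hard_delete is not None:
        # hard_delete only applies to DELETE; true also removes the job metadata.
        params["hard_delete"] = "true" if hard_delete else "false"
    path = f"/ml/v1/tuning/taxonomies_imports/{quote(job_id, safe='')}"
    return f"{base_url}{path}?{urlencode(params)}"
```

Omitting hard_delete leaves the parameter out of the URL, so the service default applies.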

Generate a chat completion based on the provided messages and parameters using the provided model.

POST /ml/gateway/v1/chat/completions

Request

Request Body

Required*

ChatsRequest

Chat Completion Request

Response

Response Body

ChatsResponse

A chat completion response generated by a model.

Status Code

201
Created Successfully
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "object": "chat.completion",
  "id": "chatcmpl-B9MHDbslfkBeAs8l4bebGdFOJ6PeG",
  "created": 1741570283,
  "model": "gpt-4o-2024-08-06",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "tool_calls": []
      },
      "content_filter_results": {
        "jailbreak": {
          "detected": true,
          "filtered": true
        }
      },
      "logprobs": {
        "content": [
          {
            "bytes": [
              123,
              67,
              80,
              102,
              23,
              68
            ],
            "logprob": 0.6773386835778668,
            "token": "The ",
            "top_logprobs": [
              {
                "bytes": [
                  123,
                  67,
                  80
                ],
                "logprob": 0.6773386835778668,
                "token": "capital "
              },
              {
                "bytes": [
                  123,
                  67,
                  80
                ],
                "logprob": 0.00022722178296200756,
                "token": "country "
              }
            ]
          }
        ],
        "refusal": []
      },
      "finish_reason": "stop"
    }
  ],
  "prompt_filter_results": [
    {
      "index": 0,
      "content_filter_results": {
        "profanity": {
          "detected": true,
          "filtered": true
        }
      }
    }
  ],
  "service_tier": "auto",
  "system_fingerprint": "fp_fc9f1d7035",
  "usage": {
    "completion_tokens": 281,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 91,
      "audio_tokens": 0,
      "reasoning_tokens": 74,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens": 66,
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 13
    },
    "total_tokens": 347
  },
  "cached": true
}
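A client usually needs only a few fields from a chat completion response. The sketch below reads them from a trimmed copy of the sample response; the function name summarize_chat_completion is hypothetical, and it assumes that every response carries at least one choice and a usage object, as the sample does.

```python
def summarize_chat_completion(response):
    """Return the first choice's text plus the names of filters that fired."""
    choice = response["choices"][0]
    filtered_by = sorted(
        name
        for name, result in choice.get("content_filter_results", {}).items()
        if result.get("filtered")
    )
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "filtered_by": filtered_by,
        "total_tokens": response["usage"]["total_tokens"],
    }


# Trimmed copy of the sample response.
sample = {
    "object": "chat.completion",
    "model": "gpt-4o-2024-08-06",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The capital of France is Paris.",
            },
            "content_filter_results": {
                "jailbreak": {"detected": True, "filtered": True}
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"completion_tokens": 281, "prompt_tokens": 66, "total_tokens": 347},
}

summary = summarize_chat_completion(sample)
```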

Generate a text completion based on the provided prompt and parameters using the provided model.

POST /ml/gateway/v1/completions

Request

Request Body

Required*

CompletionsRequest

Completion Request

Response

Response Body

CompletionsResponse

A legacy text completion response generated by a model.

Status Code

201
Created Successfully
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "object": "text_completion",
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "created": 1589478378,
  "model": "gpt-4-turbo",
  "choices": [
    {
      "index": 0,
      "text": "The capital of France is Paris.",
      "logprobs": {
        "text_offset": [
          0,
          4,
          12,
          15,
          22,
          27
        ],
        "token_logprobs": [
          0.6773386835778668,
          0.3199521581969428,
          0.0019025046958103897,
          0.00022722178296200756,
          0.0001769605025016923,
          0.0001769605025016923
        ],
        "tokens": [
          "The ",
          "capital ",
          "of ",
          "France ",
          "is ",
          "Paris."
        ],
        "top_logprobs": []
      },
      "finish_reason": "stop"
    },
    {
      "index": 1,
      "text": "France's capital city is Paris.",
      "logprobs": {
        "text_offset": [
          6,
          9,
          16,
          22,
          25
        ],
        "token_logprobs": [
          0.3199521581969428,
          0.0019025046958103897,
          0.00022722178296200756,
          0.0001769605025016923,
          0.0001769605025016923
        ],
        "tokens": [
          "France",
          "'s capital",
          " city ",
          "is ",
          "Paris."
        ],
        "top_logprobs": []
      },
      "finish_reason": "stop"
    }
  ],
  "system_fingerprint": "fp_fc9f1d7035",
  "usage": {
    "completion_tokens": 281,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 91,
      "audio_tokens": 0,
      "reasoning_tokens": 74,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens": 66,
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 13
    },
    "total_tokens": 347
  },
  "cached": true
}
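In the text completion response, the tokens and token_logprobs arrays are parallel. The following sketch pairs them for one choice; the function name token_logprob_pairs is hypothetical, and it assumes the two arrays have equal length, as they do in the sample.

```python
def token_logprob_pairs(choice):
    """Pair each generated token with its log probability for one choice."""
    logprobs = choice.get("logprobs") or {}
    tokens = logprobs.get("tokens", [])
    values = logprobs.get("token_logprobs", [])
    return list(zip(tokens, values))


# Second choice from the sample response, trimmed to the fields used here.
choice = {
    "index": 1,
    "text": "France's capital city is Paris.",
    "logprobs": {
        "token_logprobs": [
            0.3199521581969428,
            0.0019025046958103897,
            0.00022722178296200756,
            0.0001769605025016923,
            0.0001769605025016923,
        ],
        "tokens": ["France", "'s capital", " city ", "is ", "Paris."],
    },
    "finish_reason": "stop",
}

pairs = token_logprob_pairs(choice)
```

A choice without a logprobs object yields an empty list.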

Generate embeddings based on the provided input using the provided model.

POST /ml/gateway/v1/embeddings

Request

Request Body

Required*

EmbeddingsEmbeddingRequest

Embedding Request

Response

Response Body

EmbeddingsEmbeddingResponse

An embeddings response generated by a model.

Status Code

201
Created Successfully
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "object": "list",
  "model": "gpt-4o",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        0.0023064255,
        -0.009327292,
        -0.0028842222
      ]
    }
  ],
  "usage": {
    "completion_tokens": 281,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 91,
      "audio_tokens": 0,
      "reasoning_tokens": 74,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens": 66,
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 13
    },
    "total_tokens": 347
  }
}
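Each entry in the data array carries an index field. The sketch below returns the vectors ordered by that index, on the assumption (not stated in this reference) that index identifies the position of the corresponding input. The function name embeddings_in_order is hypothetical.

```python
def embeddings_in_order(response):
    """Return the embedding vectors sorted by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Illustrative response with two entries listed out of order.
response = {
    "object": "list",
    "model": "gpt-4o",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.5, 0.25]},
        {"object": "embedding", "index": 0,
         "embedding": [0.0023064255, -0.009327292, -0.0028842222]},
    ],
}

vectors = embeddings_in_order(response)
```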

Lists all configured model details aggregated across all configured providers.

GET /ml/gateway/v1/models

Request

No Request Parameters

This method does not accept any request parameters.

Response

Response Body

ModelCollection

A list of models.

Status Code

200
List of all configured models.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "object": "model",
      "id": "gpt-3.5-turbo-456723",
      "alias": "gpt-3.5-turbo",
      "created": 1677649963,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-3.5",
        "region": "us-east-1"
      },
      "owned_by": "azureOpenai:my-azure-provider",
      "uuid": "123e4567-e89b-12d3-a456-426614174000"
    },
    {
      "object": "model",
      "id": "gpt-4o-mini-2024-07-18",
      "alias": "gpt-4o-mini",
      "created": 1677193491,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-4o",
        "region": "us-east-1"
      },
      "owned_by": "openai",
      "uuid": "123e0987-d89c-45d6-a789-426614174000"
    }
  ]
}

Retrieves a specific model configuration by model UUID.

GET /ml/gateway/v1/models/{model_uuid}

Request

Path Parameters

model_uuid
Required*
string
Model UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Response Body

Model

Configuration for a model, Large Language Model (LLM) or otherwise, that's available through a configured model provider. For example, id could be set to "gpt-o", which is the official server-side name of the model. The alias field can be used by clients to refer to that model in a more convenient or custom manner. When a client provides the alias instead of the official name, the middleware will map the alias back to the underlying id (e.g., "gpt-o") and execute requests against the correct model.

Status Code

200
Model configuration details.
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

Example responses

Status 200

{
  "object": "model",
  "id": "gpt-3.5-turbo-456723",
  "alias": "gpt-3.5-turbo",
  "created": 1677649963,
  "metadata": {
    "cost": 0.02,
    "model_family": "gpt-3.5",
    "region": "us-east-1"
  },
  "owned_by": "openai:vllm4",
  "uuid": "123e4567-e89b-12d3-a456-426614174000"
}
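The relationship between alias and id can be illustrated on the client side. The gateway performs this mapping itself, so the following is only a sketch of the idea, not the gateway's implementation; the function name resolve_model_id is hypothetical.

```python
def resolve_model_id(models, name):
    """Return the official id for a name that is either an alias or an id."""
    for model in models:
        if name == model.get("alias") or name == model["id"]:
            return model["id"]
    raise KeyError(name)


# Entries trimmed from the sample model list.
models = [
    {"id": "gpt-3.5-turbo-456723", "alias": "gpt-3.5-turbo"},
    {"id": "gpt-4o-mini-2024-07-18", "alias": "gpt-4o-mini"},
]

model_id = resolve_model_id(models, "gpt-4o-mini")
```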

Removes a specific model configuration from the tenant by UUID.

DELETE /ml/gateway/v1/models/{model_uuid}

Request

Path Parameters

model_uuid
Required*
string
Model UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Status Code

204
No Content
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.

Retrieves details of the currently authenticated tenant.

GET /ml/gateway/v1/tenant

Request

No Request Parameters

This method does not accept any request parameters.

Response

Response Body

Tenant

Information about a tenancy.

Status Code

200
Tenant Details
401
Unauthorized
404
Not Found
500
Internal Server Error

Example responses

Status 200

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-tenant",
  "remote_credential_store": {
    "ibm_cloud_secret_manager": {
      "base_url": "https://xxxx.xxxx.secrets-manager.appdomain.cloud",
      "group": "AccessGroupId-56c5e703-80d4-4f06-a7e6-844618ec39b3"
    }
  }
}

Creates a new tenant.

POST /ml/gateway/v1/tenant

Request

Request Body

Required*

TenantCreateTenantRequest

Create Tenant Request

Response

Response Body

Tenant

Information about a tenancy.

Status Code

201
Created Tenant
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-tenant",
  "remote_credential_store": {
    "ibm_cloud_secret_manager": {
      "base_url": "https://xxxx.xxxx.secrets-manager.appdomain.cloud",
      "group": "AccessGroupId-56c5e703-80d4-4f06-a7e6-844618ec39b3"
    }
  }
}

Replaces details of the currently authenticated tenant's information.

PUT /ml/gateway/v1/tenant

Request

Request Body

Required*

ReplaceTenantRequest

Replacement tenant information details.

Response

Response Body

Tenant

Information about a tenancy.

Status Code

200
Replaced tenant information.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-tenant"
}

Updates details of the currently authenticated tenant's information.

PATCH /ml/gateway/v1/tenant

Request

Request Body

Required*

UpdateTenantRequest

Updated tenant information details.

Response

Response Body

Tenant

Information about a tenancy.

Status Code

200
Updated tenant information.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-tenant"
}
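All tenant operations share one path and differ only in the HTTP method: PUT replaces the tenant information and PATCH updates it. A minimal sketch of a request builder follows. The function name tenant_request is hypothetical, the Dallas base URL is assumed to serve the gateway paths, and the name field in the usage line is taken from the sample responses because the request schemas are not reproduced here.

```python
import json
import urllib.request

# Assumption: the gateway is served from the regional base URL (Dallas here).
BASE_URL = "https://us-south.ml.cloud.ibm.com"
TENANT_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE"}


def tenant_request(token, method, body=None):
    """Build (but do not send) a request against /ml/gateway/v1/tenant."""
    if method not in TENANT_METHODS:
        raise ValueError(f"unsupported method: {method}")
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}/ml/gateway/v1/tenant",
        data=data,
        headers=headers,
        method=method,
    )


# Hypothetical body: "name" appears in the sample responses.
patch_request = tenant_request("token-placeholder", "PATCH", {"name": "my-tenant"})
```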

Deletes an existing tenant.

DELETE /ml/gateway/v1/tenant

Request

No Request Parameters

This method does not accept any request parameters.

Response

Status Code

204
Tenant Deleted Successfully
401
Unauthorized
403
Forbidden
404
Not Found
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.

Lists all tenant policies.

GET /ml/gateway/v1/policies

Request

No Request Parameters

This method does not accept any request parameters.

Response

Response Body

TenantPolicyCollection

A list of tenant policies.

Status Code

200
List of Policies
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "action": "write",
      "effect": "allow",
      "resource": "model:62a04a11-07bf-5309-a78e-95323dbbc333",
      "subject": "AccessGroupId-56c5e703-80d4-4f06-a7e6-844618ec39b3"
    },
    {
      "action": "read",
      "effect": "deny",
      "resource": "model:6d9234a11-07bf-q309-a38e-95323dbbc333",
      "subject": "AccessGroupId-203dad03-123d4-f4e206-a7e6-844618e5321"
    }
  ]
}

Creates a new policy.

POST /ml/gateway/v1/policies

Request

Request Body

Required*

TenantPolicy

Policy configuration

Response

Status Code

204
No Content
400
Bad Request
401
Unauthorized
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.

Deletes the specified policy.

DELETE /ml/gateway/v1/policies

Request

Request Body

Required*

TenantPolicy

Policy configuration

Response

Status Code

204
No Content
400
Bad Request
401
Unauthorized
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.
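Unlike the other DELETE methods, deleting a policy takes the policy itself in the request body rather than an identifier in the path. The sketch below builds such a request. The function name delete_policy_request is hypothetical, the Dallas base URL is assumed, and the four fields are taken from the sample list response; treating all four as required is an assumption.

```python
import json
import urllib.request

# Assumption: the gateway is served from the regional base URL (Dallas here).
BASE_URL = "https://us-south.ml.cloud.ibm.com"
# Fields seen in the sample list response; assumed to be required.
POLICY_FIELDS = ("action", "effect", "resource", "subject")


def delete_policy_request(token, policy):
    """Build (but do not send) a DELETE request carrying the policy as JSON."""
    missing = [field for field in POLICY_FIELDS if field not in policy]
    if missing:
        raise ValueError(f"policy is missing fields: {missing}")
    return urllib.request.Request(
        f"{BASE_URL}/ml/gateway/v1/policies",
        data=json.dumps(policy).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="DELETE",
    )


policy = {
    "action": "write",
    "effect": "allow",
    "resource": "model:62a04a11-07bf-5309-a78e-95323dbbc333",
    "subject": "AccessGroupId-56c5e703-80d4-4f06-a7e6-844618ec39b3",
}
request = delete_policy_request("token-placeholder", policy)
```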

Lists all providers.

GET /ml/gateway/v1/providers

Request

No Request Parameters

This method does not accept any request parameters.

Response

Response Body

ProviderCollection

A list of model providers.

Status Code

200
List of Providers
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "uuid": "56c5e703-80d4-4f06-a7e6-844618ec39b3",
      "name": "my-openai-provider",
      "type": "openai",
      "data": {
        "base_url": "https://api.openai.com/v1",
        "apikey": "AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe"
      },
      "models": [
        {
          "object": "model",
          "id": "gpt-3.5-turbo-456723",
          "alias": "gpt-3.5-turbo",
          "created": 1677649963,
          "metadata": {
            "cost": 0.02,
            "model_family": "gpt-3.5",
            "region": "us-east-1"
          },
          "owned_by": "openai:vllm4",
          "uuid": "123e4567-e89b-12d3-a456-426614174000"
        },
        {
          "object": "model",
          "id": "gpt-4o-mini-2024-07-18",
          "alias": "gpt-4o-mini",
          "created": 1677193491,
          "metadata": {
            "cost": 0.02,
            "model_family": "gpt-4o",
            "region": "us-east-1"
          },
          "owned_by": "openai:vllm4",
          "uuid": "123e0987-d89c-45d6-a789-426614174000"
        }
      ]
    }
  ]
}

Creates a new Anthropic provider configuration with the supplied details.

POST /ml/gateway/v1/providers/anthropic

Request

Request Body

Required*

ProviderRequestAnthropicConfig

Anthropic provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created Anthropic provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-anthropic-provider",
  "type": "anthropic"
}

Creates a new Azure OpenAI provider configuration with the supplied details.

POST /ml/gateway/v1/providers/azure_openai

Request

Request Body

Required*

ProviderRequestAzureOpenaiConfig

Azure OpenAI provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created Azure OpenAI provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-azure-openai-provider",
  "type": "azureOpenai"
}

Creates a new AWS Bedrock provider configuration with the supplied details.

POST /ml/gateway/v1/providers/bedrock

Request

Request Body

Required*

ProviderRequestAwsBedrockConfig

AWS Bedrock provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created AWS Bedrock provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-aws-bedrock-provider",
  "type": "bedrock"
}

Creates a new Cerebras provider configuration with the supplied details.

POST /ml/gateway/v1/providers/cerebras

Request

Request Body

Required*

ProviderRequestCerebrasConfig

Cerebras provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created Cerebras provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-cerebras-provider",
  "type": "cerebras"
}

Creates a new Nvidia NIM provider configuration with the supplied details.

POST /ml/gateway/v1/providers/nim

Request

Request Body

Required*

ProviderRequestNvidiaNimConfig

Nvidia NIM provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created Nvidia NIM provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-nvidia-nim-provider",
  "type": "nvidiaNim"
}

Creates a new OpenAI provider configuration with the supplied details.

POST /ml/gateway/v1/providers/openai

Request

Request Body

Required*

ProviderRequestOpenaiConfig

OpenAI provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created OpenAI provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-openai-provider",
  "type": "openai"
}

Searches for providers by name.

GET /ml/gateway/v1/providers/search

Request

Query Parameters

name
string
Provider name to search for

Response

Response Body

ProviderCollection

A list of model providers.

Status Code

200
List of matching Providers
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "uuid": "56c5e703-80d4-4f06-a7e6-844618ec39b3",
      "name": "my-openai-provider",
      "type": "openai",
      "data": {
        "base_url": "https://api.openai.com/v1",
        "apikey": "sk-proj-2_IN3221...IWZkA"
      },
      "models": [
        {
          "object": "model",
          "id": "gpt-3.5-turbo-456723",
          "alias": "gpt-3.5-turbo",
          "created": 1677649963,
          "metadata": {
            "cost": 0.02,
            "model_family": "gpt-3.5",
            "region": "us-east-1"
          },
          "owned_by": "openai:vllm4",
          "uuid": "123e4567-e89b-12d3-a456-426614174000"
        },
        {
          "object": "model",
          "id": "gpt-4o-mini-2024-07-18",
          "alias": "gpt-4o-mini",
          "created": 1677193491,
          "metadata": {
            "cost": 0.02,
            "model_family": "gpt-4o",
            "region": "us-east-1"
          },
          "owned_by": "openai:vllm4",
          "uuid": "123e0987-d89c-45d6-a789-426614174000"
        }
      ]
    }
  ]
}
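Because provider names can contain characters that are not URL-safe, the name query parameter should be encoded. A minimal sketch follows; the function name provider_search_url is hypothetical, and name is treated as optional because the reference does not mark it as required.

```python
from urllib.parse import urlencode


def provider_search_url(base_url, name=None):
    """Build the provider search URL, encoding the optional name filter."""
    url = f"{base_url}/ml/gateway/v1/providers/search"
    if name:
        url += "?" + urlencode({"name": name})
    return url


search_url = provider_search_url(
    "https://us-south.ml.cloud.ibm.com", "my-openai-provider"
)
```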

Creates a new IBM watsonx.ai provider configuration with the supplied details.

POST /ml/gateway/v1/providers/watsonxai

Request

Request Body

Required*

ProviderRequestWatsonxaiConfig

IBM watsonx.ai provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

201
Created IBM watsonx.ai provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-watsonxai-provider",
  "type": "watsonxai"
}
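The provider creation methods differ only in the last path segment, and the type value in a sample response does not always match that segment (azure_openai and azureOpenai, nim and nvidiaNim). The lookup below collects the pairs shown in the sample responses; the names CREATE_PATH_SEGMENT and provider_create_path are hypothetical.

```python
# Maps the "type" value from the sample responses to the final path segment
# of the matching create method.
CREATE_PATH_SEGMENT = {
    "anthropic": "anthropic",
    "azureOpenai": "azure_openai",
    "bedrock": "bedrock",
    "cerebras": "cerebras",
    "nvidiaNim": "nim",
    "openai": "openai",
    "watsonxai": "watsonxai",
}


def provider_create_path(provider_type):
    """Return the POST path that creates a provider of the given type."""
    try:
        segment = CREATE_PATH_SEGMENT[provider_type]
    except KeyError:
        raise ValueError(f"unknown provider type: {provider_type}") from None
    return f"/ml/gateway/v1/providers/{segment}"


nim_path = provider_create_path("nvidiaNim")
```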

Retrieves the details of a specific provider.

GET /ml/gateway/v1/providers/{provider_uuid}

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Response Body

Provider

A model provider configured for a tenant.

Status Code

200
Provider Details
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "56c5e703-80d4-4f06-a7e6-844618ec39b3",
  "name": "vllm4",
  "type": "openai",
  "data": {
    "base_url": "https://api.openai.com/v1",
    "apikey": "AIzaSyDaGmWKa4JsXZ-HjGw7ISLn_3namBGewQe"
  },
  "models": [
    {
      "object": "model",
      "id": "gpt-3.5-turbo-456723",
      "alias": "gpt-3.5-turbo",
      "created": 1677649963,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-3.5",
        "region": "us-east-1"
      },
      "owned_by": "openai:vllm4",
      "uuid": "123e4567-e89b-12d3-a456-426614174000"
    },
    {
      "object": "model",
      "id": "gpt-4o-mini-2024-07-18",
      "alias": "gpt-4o-mini",
      "created": 1677193491,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-4o",
        "region": "us-east-1"
      },
      "owned_by": "openai:vllm4",
      "uuid": "123e0987-d89c-45d6-a789-426614174000"
    }
  ]
}

Deletes the specified provider.

DELETE /ml/gateway/v1/providers/{provider_uuid}

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Status Code

204
No Content
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.

Replaces an existing Anthropic provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/anthropic

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestAnthropicConfig

Replacement Anthropic provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced Anthropic provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "anthropic-prod2",
  "type": "anthropic"
}

Replaces an existing Azure OpenAI provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/azure_openai

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestAzureOpenaiConfig

Replacement Azure OpenAI provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced Azure OpenAI provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "azure-provider",
  "type": "azureOpenai"
}

Replaces an existing AWS Bedrock provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/bedrock

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestAwsBedrockConfig

Replacement AWS Bedrock provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced AWS Bedrock provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "bedrock-provider",
  "type": "bedrock"
}

Replaces an existing Cerebras provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/cerebras

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestCerebrasConfig

Replacement Cerebras provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced Cerebras provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "cerebrasProvider2",
  "type": "cerebras"
}

Lists all model configurations for the specified provider.

GET /ml/gateway/v1/providers/{provider_uuid}/models

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Response Body

ModelCollection

A list of models.

Status Code

200
List of Models
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "object": "model",
      "id": "gpt-3.5-turbo-456723",
      "alias": "gpt-3.5-turbo",
      "created": 1677649963,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-3.5",
        "region": "us-east-1"
      },
      "owned_by": "openai:vllm4",
      "uuid": "123e4567-e89b-12d3-a456-426614174000"
    },
    {
      "object": "model",
      "id": "gpt-4o-mini-2024-07-18",
      "alias": "gpt-4o-mini",
      "created": 1677193491,
      "metadata": {
        "cost": 0.02,
        "model_family": "gpt-4o",
        "region": "us-east-1"
      },
      "owned_by": "openai:vllm4",
      "uuid": "123e0987-d89c-45d6-a789-426614174000"
    }
  ]
}

Adds a new model configuration for the specified provider.

POST /ml/gateway/v1/providers/{provider_uuid}/models

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

CreateModelRequest

Model configuration

Response

Response Body

Model

Configuration for a model, Large Language Model (LLM) or otherwise, that's available through a configured model provider. For example, id could be set to "gpt-o", which is the official server-side name of the model. The alias field can be used by clients to refer to that model in a more convenient or custom manner. When a client provides the alias instead of the official name, the middleware will map the alias back to the underlying id (e.g., "gpt-o") and execute requests against the correct model.

Status Code

201
Model Created
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 201

{
  "object": "model",
  "id": "gpt-3.5-turbo-456723",
  "alias": "gpt-3.5-turbo",
  "created": 1677649963,
  "metadata": {
    "cost": 0.02,
    "model_family": "gpt-3.5",
    "region": "us-east-1"
  },
  "owned_by": "openai:vllm4",
  "uuid": "123e4567-e89b-12d3-a456-426614174000"
}

Replaces a specific model configuration by model UUID.

PUT /ml/gateway/v1/providers/{provider_uuid}/models/{model_uuid}

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000
model_uuid
Required*
string
Model UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

ReplaceModelRequest

Replacement model configuration

Response

Response Body

Model

Configuration for a model, Large Language Model (LLM) or otherwise, that's available through a configured model provider. For example, id could be set to "gpt-o", which is the official server-side name of the model. The alias field can be used by clients to refer to that model in a more convenient or custom manner. When a client provides the alias instead of the official name, the middleware will map the alias back to the underlying id (e.g., "gpt-o") and execute requests against the correct model.

Status Code

201
Replaced model configuration
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

Example responses

Status 201

{
  "object": "model",
  "id": "gpt-3.5-turbo-456723",
  "alias": "gpt-3.5-turbo",
  "created": 1677649963,
  "metadata": {
    "cost": 0.02,
    "model_family": "gpt-3.5",
    "region": "us-east-1"
  },
  "owned_by": "openai:vllm4",
  "uuid": "123e4567-e89b-12d3-a456-426614174000"
}

Updates a specific model configuration by model UUID.

PATCH /ml/gateway/v1/providers/{provider_uuid}/models/{model_uuid}

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000
model_uuid
Required*
string
Model UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

UpdateModelRequest

Updated model configuration

Response

Response Body

Model

Configuration for a model, Large Language Model (LLM) or otherwise, that's available through a configured model provider. For example, id could be set to "gpt-o", which is the official server-side name of the model. The alias field can be used by clients to refer to that model in a more convenient or custom manner. When a client provides the alias instead of the official name, the middleware will map the alias back to the underlying id (e.g., "gpt-o") and execute requests against the correct model.

Status Code

201
Updated model configuration
400
Bad Request
401
Unauthorized
404
Not Found
500
Internal Server Error

Example responses

Status 201

{
  "object": "model",
  "id": "gpt-3.5-turbo-456723",
  "alias": "gpt-3.5-turbo",
  "created": 1677649963,
  "metadata": {
    "cost": 0.02,
    "model_family": "gpt-3.5",
    "region": "us-east-1"
  },
  "owned_by": "openai:vllm4",
  "uuid": "123e4567-e89b-12d3-a456-426614174000"
}
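
As a rough sketch, the PATCH call might be assembled with Python's standard library as shown here. The helper name `build_update_model_request`, the use of the Dallas base URL for the gateway path, and the `alias` body field are assumptions for illustration; the UpdateModelRequest schema is not spelled out in this section, so confirm the accepted fields before relying on any of them.

```python
import json
import urllib.request

# Assumption: the gateway path is served from the regional watsonx.ai base URL.
BASE_URL = "https://us-south.ml.cloud.ibm.com"


def build_update_model_request(token, provider_uuid, model_uuid, changes):
    """Prepare (but do not send) a PATCH request for one model configuration."""
    url = f"{BASE_URL}/ml/gateway/v1/providers/{provider_uuid}/models/{model_uuid}"
    return urllib.request.Request(
        url,
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_update_model_request(
    "{token}",
    "550e8400-e29b-41d4-a716-446655440000",
    "550e8400-e29b-41d4-a716-446655440000",
    {"alias": "gpt-3.5-turbo"},  # hypothetical body; check UpdateModelRequest
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it once a real bearer token replaces the placeholder.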

Deletes the specified model configuration for the specified provider.

DELETE /ml/gateway/v1/providers/{provider_uuid}/models/{model_uuid}

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000
model_uuid
Required*
string
Model UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Status Code

204
No Content
400
Bad Request
401
Unauthorized
500
Internal Server Error

No Sample Response

This method does not specify any sample responses.
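
Because a successful delete returns 204 with no body, a client only has the status code to go on. The sketch below prepares the request and interprets the documented codes; the helper names and the Dallas base URL are assumptions made for illustration.

```python
import urllib.request

# Assumption: the gateway path is served from the regional watsonx.ai base URL.
BASE_URL = "https://us-south.ml.cloud.ibm.com"

# Status codes documented for this method.
DELETE_STATUS = {
    204: "No Content",
    400: "Bad Request",
    401: "Unauthorized",
    500: "Internal Server Error",
}


def build_delete_model_request(token, provider_uuid, model_uuid):
    """Prepare (but do not send) a DELETE request for one model configuration."""
    url = f"{BASE_URL}/ml/gateway/v1/providers/{provider_uuid}/models/{model_uuid}"
    return urllib.request.Request(
        url, method="DELETE", headers={"Authorization": f"Bearer {token}"}
    )


def describe_delete_status(status_code):
    """Map a status code to its documented meaning, or flag it as undocumented."""
    return DELETE_STATUS.get(status_code, "Undocumented status")


delete_req = build_delete_model_request(
    "{token}",
    "550e8400-e29b-41d4-a716-446655440000",
    "550e8400-e29b-41d4-a716-446655440000",
)
print(delete_req.get_method(), describe_delete_status(204))
```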

Lists all models available for the specified provider.

GET /ml/gateway/v1/providers/{provider_uuid}/models_available

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Response

Response Body

OpenAIModelResponseCollection

A list of model responses.

Status Code

200
List of Models
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "object": "list",
  "data": [
    {
      "object": "model",
      "id": "gpt-3.5-turbo-456723",
      "created": 1677649963,
      "owned_by": "openai:vllm4"
    },
    {
      "object": "model",
      "id": "gpt-4o",
      "created": 1677295812,
      "owned_by": "openai:vllm4"
    }
  ]
}
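
To illustrate how a client might consume this collection, the following sketch pulls the model ids out of a payload shaped like the sample above. The function name `available_model_ids` is hypothetical.

```python
import json

# Payload copied from the sample response above.
sample = """
{
  "object": "list",
  "data": [
    {"object": "model", "id": "gpt-3.5-turbo-456723",
     "created": 1677649963, "owned_by": "openai:vllm4"},
    {"object": "model", "id": "gpt-4o",
     "created": 1677295812, "owned_by": "openai:vllm4"}
  ]
}
"""


def available_model_ids(payload):
    """Return the id of every entry in 'data' whose object type is 'model'."""
    return [
        item["id"]
        for item in payload.get("data", [])
        if item.get("object") == "model"
    ]


model_ids = available_model_ids(json.loads(sample))
print(model_ids)  # → ['gpt-3.5-turbo-456723', 'gpt-4o']
```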

Replaces an existing NVIDIA NIM provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/nim

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestNvidiaNimConfig

Replacement NVIDIA NIM provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced NVIDIA NIM provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-nim-provider",
  "type": "nvidiaNim"
}

Replaces an existing OpenAI provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/openai

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestOpenaiConfig

Replacement OpenAI provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced OpenAI provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "new-openai-provider",
  "type": "openai"
}

Replaces an existing IBM watsonx.ai provider configuration with new details.

PUT /ml/gateway/v1/providers/{provider_uuid}/watsonxai

Request

Path Parameters

provider_uuid
Required*
string
Provider UUID

Example: 550e8400-e29b-41d4-a716-446655440000

Request Body

Required*

ProviderRequestWatsonxaiConfig

Replacement IBM watsonx.ai provider configuration details.

Response

Response Body

ProviderResponse

Create provider response.

Status Code

200
Replaced IBM watsonx.ai provider configuration.
400
Bad Request
401
Unauthorized
500
Internal Server Error

Example responses

Status 200

{
  "uuid": "123e4567-e89b-12d3-a456-426614174000",
  "name": "my-watsonxai-provider2",
  "type": "watsonxai"
}

Retrieves the complete list of supported utility agent tools, including the information required to run each tool.

GET /v1-beta/utility_agent_tools

Request

No Request Parameters

This method does not accept any request parameters.

Get tools

curl --request GET 'https://{cluster_url}/v1-beta/utility_agent_tools' \
  -H 'Accept: application/json'

Response

Response Body

wxUtilityAgentToolsResponse

Status Code

200
OK - Returned from GET when it succeeds

Example responses

Status 200: get_tools

Get all utility agent tools.

{
  "resources": [
    {
      "name": "GoogleSearch",
      "description": "Search for online trends, news, current events, real-time information, or research topics.",
      "agent_description": "Search for online trends, news, current events, real-time information, or research topics.",
      "config_schema": {
        "title": "config schema for GoogleSearch tool",
        "type": "object",
        "properties": {
          "maxResults": {
            "title": "Max number of results to return",
            "type": "integer",
            "minimum": 1,
            "maximum": 20,
            "wx_ui_name": "Max results",
            "wx_ui_field_type": "numberInput",
            "wx_ui_default": 10
          }
        }
      }
    },
    {
      "name": "WebCrawler",
      "description": "Useful for when you need to summarize a webpage. Do not use for Web search.",
      "agent_description": "Useful for when you need to summarize a webpage. Do not use for Web search.",
      "input_schema": {
        "type": "object",
        "properties": {
          "url": {
            "title": "url",
            "description": "URL for the webpage to be scraped",
            "type": "string",
            "pattern": "^(https?://)?([\\da-z\\.-]+)\\.([a-z\\.]{2,6})([/\\w \\.-]*)*/?$"
          }
        },
        "required": [
          "url"
        ]
      }
    }
  ]
}
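
One way a client might use this listing is to derive a starting configuration for each tool from its config_schema. The sketch below works on a trimmed copy of the sample response; treating `wx_ui_default` as the default value is an assumption (the field looks like a UI hint), and the helper name `default_config` is hypothetical.

```python
# Trimmed copy of the sample response above.
tools_response = {
    "resources": [
        {
            "name": "GoogleSearch",
            "config_schema": {
                "type": "object",
                "properties": {
                    "maxResults": {
                        "type": "integer",
                        "minimum": 1,
                        "maximum": 20,
                        "wx_ui_default": 10,
                    }
                },
            },
        },
        {
            "name": "WebCrawler",
            "input_schema": {
                "type": "object",
                "properties": {"url": {"type": "string"}},
                "required": ["url"],
            },
        },
    ]
}


def default_config(tool):
    """Collect wx_ui_default values from a tool's config_schema, if it has one."""
    properties = tool.get("config_schema", {}).get("properties", {})
    return {
        name: spec["wx_ui_default"]
        for name, spec in properties.items()
        if "wx_ui_default" in spec
    }


defaults = {tool["name"]: default_config(tool) for tool in tools_response["resources"]}
print(defaults)  # → {'GoogleSearch': {'maxResults': 10}, 'WebCrawler': {}}
```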

Retrieves the details of a utility agent tool, including the information required to run the tool. Providing authentication and configuration parameters may return additional details.

GET /v1-beta/utility_agent_tools/{tool_id}

Request

Path Parameters

tool_id
Required*
string
Tool name

Possible values: Value must match regular expression [a-zA-Z0-9-]*

RAG query

curl --request GET 'https://{cluster_url}/v1-beta/utility_agent_tools/RAGQuery' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'

Response

Response Body

utilityAgentTool

Status Code

200
OK - Returned from GET when it succeeds

Example responses

Status 200: get_rag_query

Get RAGQuery agent tool with dynamic agent_description.

{
  "name": "RAGQuery",
  "description": "Search the documents in a vector index.",
  "agent_description": "Search information in documents to provide context to a user query. Useful when asked to ground the answer in specific knowledge about watsonx documentation.",
  "config_schema": {
    "title": "config schema for RAGQuery tool",
    "type": "object",
    "properties": {
      "vectorIndexId": {
        "title": "Vector index identifier",
        "type": "string"
      },
      "projectId": {
        "title": "Project identifier",
        "type": "string"
      },
      "spaceId": {
        "title": "Space identifier",
        "type": "string"
      }
    },
    "required": [
      "vectorIndexId"
    ],
    "oneOf": [
      {
        "required": [
          "projectId"
        ]
      },
      {
        "required": [
          "spaceId"
        ]
      }
    ]
  }
}
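
The `required` and `oneOf` rules in this config_schema mean that vectorIndexId must be present and that exactly one of projectId or spaceId must be supplied. A minimal hand-written check of those two rules, without a JSON Schema library, could look like this; the function name `check_rag_config` is hypothetical and the check does not validate value types.

```python
def check_rag_config(config):
    """Return a list of problems found in a RAGQuery config (empty when valid)."""
    errors = []
    # "required": ["vectorIndexId"]
    if "vectorIndexId" not in config:
        errors.append("vectorIndexId is required")
    # "oneOf": exactly one of the two alternatives may hold.
    scopes = [key for key in ("projectId", "spaceId") if key in config]
    if len(scopes) != 1:
        errors.append("provide exactly one of projectId or spaceId")
    return errors


valid = {
    "vectorIndexId": "30964b43-f090-44a6-a379-4ab4c00498ca",
    "projectId": "d514c8ef-423f-429c-8947-fa900dee338a",
}
print(check_rag_config(valid))  # → []
print(check_rag_config({"projectId": "p", "spaceId": "s"}))
```

The second call reports both problems: the missing vectorIndexId and the fact that both scopes were given.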

This runs a utility agent tool given an input and optional configuration parameters.

Some tools can choose to tailor the response based on the access token identity.

POST /v1-beta/utility_agent_tools/run

Request

Request Body

Required*

One of

Example:

run google

curl --request POST 'https://{cluster_url}/v1-beta/utility_agent_tools/run' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "tool_name": "GoogleSearch",
  "input": "What was the weather in Toronto on January 13th 2025?",
  "config": {
    "maxResults": 3
  }
}'

run python interpreter

curl --request POST 'https://{cluster_url}/v1-beta/utility_agent_tools/run' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "tool_name": "PythonInterpreter",
  "input": "print(4*5)"
}'

run web crawler

curl --request POST 'https://{cluster_url}/v1-beta/utility_agent_tools/run' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  --data-raw '{
  "tool_name": "WebCrawler",
  "input": {
    "url": "https://www.ibm.com/us-en"
  }
}'

Response

Response Body

wxUtilityAgentToolsRunResponse

Status Code

200
OK - Returned when the tool ran successfully
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when the caller's authorization token is missing or invalid

Example responses

Status 200: run_google_tool_results

Results of the GoogleSearch tool.

{
  "output": "[{\"title\":\"Toronto, Ontario, Canada Monthly Weather | AccuWeather\",\"description\":\"January. January February March April May June July August September October November December. 2025 ... 13°. 29. 37°. 18°. 30. 34°. 16°. 31. 36°. 18°. 1. 18°. 11 ...\",\"url\":\"https://www.accuweather.com/en/ca/toronto/m5h/january-weather/55488\"},{\"title\":\"Anthony Slater on X: \\\"Draymond Green missed the Warriors ...\",\"description\":\"Draymond Green missed the Warriors shootaround in Toronto this morning. Under the weather. He is questionable tonight with an illness. 4:45 PM · Jan 13, ...\",\"url\":\"https://x.com/anthonyVslater/status/1878845945854730255\"},{\"title\":\"Canada weather forecast for Tuesday, 13 January 2026\",\"description\":\"Weather in Canada during the last few years on January 13 ; 2025 - January 13, 32 ° / 26 °, 0 in ; 2024 - January 13, 39 ° / 26 °, 0.46 in ; 2023 - January 13, 32 ...\",\"url\":\"https://www.weather25.com/north-america/canada?page=date&date=13-1\"}]"
}

Status 200: run_web_crawler_tool_results

Results of the WebCrawler tool.

{
  "output": "\"{\\\"url\\\":\\\"https://www.ibm.com/us-en\\\",\\\"contentType\\\":\\\"text/html;charset=utf-8\\\",\\\"content\\\":\\\"IBM - United States\\\\n\\\\nBoost developer productivity with AI\\\\n\\\\nAchieve 59% average time savings on code documentation¹ and reduce development costs by 30%²\\\\n\\\\nOvercome developer challenges\\\\n\\\\nExplore watsonx Code Assistant\\\\n\\\\nLatest news\\\\n\\\\nArvind Krishna Celebrates the Work of a Pioneer at the TIME100 AI Impact Awards\\\\n\\\\nIBM and Lenovo Expand Strategic Technology Partnership in the Kingdom of Saudi Arabia\\\\n\\\\nIBM Study: Gen AI Will Elevate Financial Performance of Banks in 2025\\\\n\\\\nTelefónica Tech and IBM Sign a Collaboration Agreement for Quantum-Safe Technology\\\\n\\\\nIBM RELEASES FOURTH-QUARTER RESULTS\\\\n\\\\nIBM BOARD APPROVES REGULAR QUARTERLY CASH DIVIDEND\\\\n\\\\nIBM and Palo Alto Networks Find Platformization is Key to Reduce Cybersecurity Complexity\\\\n\\\\ne& Collaborates with IBM to Launch Pioneering End-to-End AI Governance Platform\\\\n\\\\nRecommended for you\\\\n\\\\nRead why tailor-made AI delivers precision power\\\\n\\\\nLearn AI skills you’ll need for 2025\\\\n\\\\nListen to the episode: DeepSeek facts vs hype and more\\\\n\\\\nMeet Meta Llama 3.2 models on watsonx\\\\n\\\\nAI insights and tools\\\\n\\\\nFor developers\\\\n\\\\nGrow your skills and create something new with our AI tools and foundation models. 
Then connect, collaborate and innovate with your peers.\\\\n\\\\nStart building with IBM Granite models\\\\n\\\\nExplore AI courses, APIs, data sets and more\\\\n\\\\nAccelerate software development with watsonx Code Assistant\\\\n\\\\nCheck out the watsonx.ai Developer Toolkit\\\\n\\\\nFor business leaders\\\\n\\\\nTransform business and drive growth with AI tools, technology and insights that help you stay competitive—and responsibly map your organization's future.\\\\n\\\\nRead the CEO's guide to generative AI\\\\n\\\\nGet the AI in Action report\\\\n\\\\nExplore IBM's approach to AI ethics\\\\n\\\\nSubscribe to the Think newsletter\\\\n\\\\nThink 2025\\\\n\\\\nJoin 5,000+ senior business and technology leaders at Think 2025 on 5–8 May 2025 in Boston, Massachusetts\\\\n\\\\nRegister today\\\\n\\\\nTechnology & Consulting\\\\n\\\\nFrom next-generation AI to cutting edge hybrid cloud solutions to the deep expertise of IBM Consulting, IBM has what it takes to help you reinvent how your business works in the age of AI.\\\\n\\\\nGet the latest product offers and discounts\\\\n\\\\nAI solutions\\\\n\\\\nGo from AI pilots to production with AI technologies built for business\\\\n\\\\nAI models\\\\n\\\\nGet started with cost-efficient AI models, tailored for business and optimized for scale\\\\n\\\\nConsulting\\\\n\\\\nEngage with IBM Consulting to design, build and operate high-performing businesses\\\\n\\\\nAnalytics\\\\n\\\\nSupport data-driven decisions for your business\\\\n\\\\nIT automation\\\\n\\\\nDiscover how automation solutions increase productivity while managing costs\\\\n\\\\nCompute & servers\\\\n\\\\nHandle mission-critical workloads while maintaining security, reliability and control of your entire IT infrastructure\\\\n\\\\nDatabases\\\\n\\\\nRun your applications, analytics and generative AI with databases on any cloud\\\\n\\\\nSecurity & identity\\\\n\\\\nSecure hybrid cloud and AI with data and identity-centric cybersecurity 
solutions\\\\n\\\\nInside IBM\\\\n\\\\nOur company\\\\n\\\\nExplore IBM history and culture of putting technology to work in the real world\\\\n\\\\nAbout IBM\\\\n\\\\nOur history\\\\n\\\\nOur impact\\\\n\\\\nLearn about IBM's commitment to environmental, equitable and ethical pillars\\\\n\\\\nCorporate social responsibility\\\\n\\\\nDiversity and inclusion\\\\n\\\\nOur innovations\\\\n\\\\nVisit the IBM lab, and see what's in store for the future of computing\\\\n\\\\nIBM Research\\\\n\\\\nQuantum computing\\\\n\\\\nTake the next step\\\\n\\\\nSolving the world’s problems through technology wouldn’t be possible without people with the right skills. See what it takes to become an IBMer, or build your skills with our educational courses.\\\\n\\\\nBecome an IBMer\\\\n\\\\nExplore jobs\\\\n\\\\nExplore learning opportunities\\\\n\\\\nStart learning\\\\n\\\\nFootnotes\\\\n\\\\n¹ Keep the data flowing. Keep the water flowing. IBM case study on Water Corporation, January 2024.\\\\n2 Accelerating software development with gen AI, IBM, 2024.\\\"}\""
}
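
In the samples above, the `output` field is a string that itself contains JSON: the GoogleSearch result is a JSON-encoded array, and the WebCrawler result appears to be encoded twice. A minimal sketch of unwrapping such values follows; the helper name `decode_tool_output` is hypothetical, and keeping plain text and bare numbers unchanged is a design choice of this sketch rather than documented behaviour.

```python
import json


def decode_tool_output(output):
    """Unwrap JSON-encoded layers of a tool's 'output' string where possible."""
    value = output
    while isinstance(value, str):
        try:
            decoded = json.loads(value)
        except json.JSONDecodeError:
            break  # plain text: keep the string as it is
        if not isinstance(decoded, (str, list, dict)):
            break  # bare number, boolean or null: keep the original text
        value = decoded
    return value


# Small stand-ins shaped like the sample responses above.
google_like = {"output": json.dumps([{"title": "t", "url": "https://example.com"}])}
crawler_like = {"output": json.dumps(json.dumps({"url": "https://www.ibm.com/us-en"}))}

search_results = decode_tool_output(google_like["output"])
page = decode_tool_output(crawler_like["output"])
print(search_results[0]["url"], page["url"])
```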

This runs the utility agent tool named by tool_id, given an input and optional configuration parameters.

Some tools can choose to tailor the response based on the access token identity.

POST /v1-beta/utility_agent_tools/run/{tool_id}

Request

Path Parameters

tool_id
Required*
string
Tool name

Possible values: Value must match regular expression [a-zA-Z0-9-]*

Request Body

Required*

One of

Example:

RAG query

curl --request POST 'https://{cluster_url}/v1-beta/utility_agent_tools/run/RAGQuery' \
  -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "input": "What is a project?",
  "config": {
    "projectId": "d514c8ef-423f-429c-8947-fa900dee338a",
    "vectorIndexId": "30964b43-f090-44a6-a379-4ab4c00498ca"
  }
}'

Response

Response Body

wxUtilityAgentToolsRunResponse

Status Code

200
OK - Returned when the tool ran successfully
400
Bad Request - Returned when the request parameters are invalid
401
Unauthorized - Returned when the caller's authorization token is missing or invalid

Example responses

Status 200: run_rag_query_tool

Results of the RAGQuery tool.

{
  "output": "Working in projects  A project is a collaborative workspace where you work with data and other assets to accomplish a particular goal.  By default, your sandbox project is created automatically when you sign up for watsonx.ai.  Your project can include these types of resources:   Collaborators are the people who you work with in your project.  Data assets are what you work with. Data assets often consist of raw data that you work with to refine.  Tools and their associated assets are how you work with\n\ndeployment spaces.  Projects and tools  Projects are where your data science and model builder teams work with data to create assets, such as, saved prompts, notebooks, models, or pipelines. Your first project, which is known as your sandbox project, is created automatically when you\n\nTask 2 . If you don't see any projects, then follow these steps to create a project. 1. Click Create a sandbox project . When the project is created, you will see the sandbox project in the Projects section. For more information or to watch a video, see Creating a project {: new_window}.\n\nTask 2 . If you don't see any projects, then follow these steps to create a project. 1. Click Create a sandbox project . When the project is created, you will see the sandbox project in the Projects section. For more information or to watch a video, see Creating a project {: new_window}.\n\ncharacters plus a unique identifier.  Watch this video to see how to create both an empty project, imported project, and a project from a sample.  This video provides a visual method to learn the concepts and tasks in this documentation.         Next steps   Add collaborators  Add data   Learn more   Object storage  Importing a project  Troubleshooting Cloud Object Storage for projects   Parent topic:  Projects\n\nis saved in the project. Many tasks include samples that you can use. You can find sample prompts, notebooks, data sets, and other assets in the Samples from the home page. 
You can share your work by adding collaborators to your project. If you need to work with data, you can add data assets to your project.  If your sandbox project is your only project, then any task that you select occurs in the context of your sandbox project. When you have multiple projects, you can change the default project\n\na project, you can add a short description to document the purpose or goal of the project. You can edit the description later, on the project's Settings page.  You can mark the project as sensitive. When users open a project that is marked as sensitive, a notification is displayed stating that no data assets can be downloaded or exported from the project.  The Overview page of a project contains a readme file where you can document the status or results of the project. The readme file uses standard\n\nproject.  Asset storage is where project information and files are stored.  Integrations are how you incorporate external tools.   You can customize projects to suit your goals. You can change the contents of your project and almost all of its properties at any time. However, you must make these choices when you create the project because you can't change them later:   The instance of IBM Cloud Object Storage to use for project storage.   You can view projects that you create and collaborate in by"
}
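
The path-parameter variant of the run call can be sketched in Python as follows. The helper name `build_run_tool_request` is hypothetical, and using the Dallas agent-tools base URL as `{cluster_url}` is an assumption. The documented pattern for tool_id also matches the empty string; this sketch rejects an empty id as an extra precaution.

```python
import json
import re
import urllib.request

# Pattern documented for the tool_id path parameter.
TOOL_ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def build_run_tool_request(cluster_url, token, tool_id, tool_input, config=None):
    """Prepare (but do not send) a POST to /v1-beta/utility_agent_tools/run/{tool_id}."""
    if not tool_id or not TOOL_ID_PATTERN.fullmatch(tool_id):
        raise ValueError(f"invalid tool_id: {tool_id!r}")
    body = {"input": tool_input}
    if config is not None:
        body["config"] = config
    return urllib.request.Request(
        f"https://{cluster_url}/v1-beta/utility_agent_tools/run/{tool_id}",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


run_req = build_run_tool_request(
    "api.dataplatform.cloud.ibm.com/wx",  # assumed value for {cluster_url}
    "{token}",
    "RAGQuery",
    "What is a project?",
    {
        "projectId": "d514c8ef-423f-429c-8947-fa900dee338a",
        "vectorIndexId": "30964b43-f090-44a6-a379-4ab4c00498ca",
    },
)
print(run_req.get_method(), run_req.full_url)
```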
