IBM Cloud API Docs

Introduction

Using IBM watsonx.ai, you can run text inference, prompt tuning, and more on large language models (LLMs).

Step-by-step instructions on how to use IBM watsonx.ai can be found here.

A specialized Python library is available to access this REST API.

Endpoint URLs

The following URLs are the base URLs for the watsonx.ai API endpoints. When you call the API, take the base URL for your region and append the path for each method to form the complete API endpoint for your request.

  • Dallas: https://us-south.ml.cloud.ibm.com
  • Frankfurt: https://eu-de.ml.cloud.ibm.com

Example request to a Dallas endpoint:

curl -H "Authorization: Bearer {token}" -X {request_method} "https://us-south.ml.cloud.ibm.com/{method_endpoint}"

Replace {request_method} and {method_endpoint} in this example with the values for your particular API call. See the Authentication section below for more details about the bearer {token}.
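
As a rough illustration of how a complete endpoint is formed, the sketch below joins a regional base URL with a method path. The BASE_URLS mapping and the endpoint_url helper are hypothetical names introduced for this example; only the two base URLs come from the list above.

```python
# Base URLs copied from the list above; the dictionary keys are illustrative.
BASE_URLS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
}


def endpoint_url(region: str, method_path: str) -> str:
    """Join a regional base URL and a method path into one endpoint URL."""
    base = BASE_URLS[region.lower()]
    # Normalize slashes so exactly one separates the base URL from the path.
    return base.rstrip("/") + "/" + method_path.lstrip("/")


print(endpoint_url("Dallas", "/ml/v4/deployments"))
```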

Authentication

This API uses IBM Cloud Identity and Access Management (IAM) to authenticate requests.

To work with the API, authenticate your application or service by including your IBM Cloud IAM access token in API requests.

IAM authentication. Replace {token} and {url}/{method} with your service credentials.

curl -H "Authorization: Bearer {token}" "{url}/{method}"

Authorization: Bearer {token}

For example, if the token is tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5 in the service credentials, include the credentials in your call like this:

curl -H "Authorization: Bearer tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5" "https://us-south.ml.cloud.ibm.com/ml/v4/models"

Error handling

This API uses standard HTTP response codes to indicate whether a method completed successfully. A 2xx response indicates success.

HTTP Code | Description | Recovery
200 | Success | The request was successful.
400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body.
401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. See Authenticating with IAM tokens for instructions on logging in. If this error persists, contact the account owner to check your permissions.
403 | Forbidden | The supplied authentication is not authorized.
404 | Not Found | The requested resource could not be found.

Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.

Error response

Name | Description
trace | An identifier that can be used to trace the request. This can be set using X-Global-Transaction-Id.
errors | The list of errors.

Errors

Name | Description
code | A simple string code that should convey the general sense of the error.
message | The message that describes the error.
more_info | A reference to a more detailed explanation when available.
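
To make the error structure concrete, here is a minimal sketch that condenses an error response into one line. It assumes only the field names listed above (trace, errors, code, message, more_info); the summarize_error helper and the sample body are made up for illustration and are not output captured from the service.

```python
import json


def summarize_error(body: str) -> str:
    """Turn an error response body into a single readable line."""
    payload = json.loads(body)
    trace = payload.get("trace", "unknown")
    parts = []
    for err in payload.get("errors", []):
        text = f"{err.get('code')}: {err.get('message')}"
        # more_info is only present when a detailed explanation is available.
        if err.get("more_info"):
            text += f" (see {err['more_info']})"
        parts.append(text)
    return f"trace={trace}; " + "; ".join(parts)


# Made-up body that follows the field names in the tables above.
sample = json.dumps({
    "trace": "example-trace-id",
    "errors": [{"code": "invalid_request", "message": "Missing field 'input'."}],
})
print(summarize_error(sample))
```

Logging the trace value alongside the error code makes it easier to correlate a failed call with a support request.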

Additional headers

Some additional headers might be required to make successful requests to the API. Those additional headers are described below.

An optional transaction ID can be passed to your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.

If no transaction ID is passed in, one is generated randomly.

API change log

In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API. The change log lists changes that have been made, ordered by the date they were released. Changes to existing API versions are designed to be compatible with existing client applications. If a change is not compatible, a new version date is created.

14 March 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically.

18 April 2024

The /ml/v1/text/embeddings API was added to watsonx.ai. This is a non-breaking change that adds a single API operation.

Versioning

API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.

When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.
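
As an illustration of sending the version parameter with every request, the sketch below sets or replaces it on an existing URL. The with_version helper is a hypothetical name, not part of any IBM library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str) -> str:
    """Return the URL with its version query parameter set to the given date."""
    parts = urlsplit(url)
    # Drop any existing version value so the parameter appears exactly once.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version(
    "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs",
    "2024-03-14",
))
```

Keeping the version date in one place, such as a constant passed to this helper, makes a later upgrade a one-line change.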

Active Version Dates

Version date | Summary of changes
2024-03-14 | Publication of the /ml/v1 APIs.

Data References

Accessing data in a remote location (such as a Cloud Object Storage bucket or an SQL/NoSQL database) requires the use of the connection_asset or data_asset reference types. These reference types are created within a space or a project and are referenced in Watson Machine Learning (WML) requests to represent input data and results locations. These types contain two parameter objects, connection and location, which require different values depending on the reference type. A data_asset requires an href in the location object, whereas a connection_asset requires the connection ID in the connection object and different location fields depending on the data source type.

Example connection_asset payload:

"training_data_references": [{
  "type": "connection_asset",
  "connection": {
    "id": "<connection_guid>"
  },
  "location": {
    "<wdp-properties depending on the type>"
  }
}]

Example data_asset payload:

"training_data_references": [{
  "type": "data_asset",
  "location": {
    "href":"/v2/assets/<asset_id>?space_id=<space_id>"
  }
}]
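
The two payload shapes above can be sketched as small builder functions. The function names are hypothetical, and the location keys used for the connection asset (bucket, file_name) are placeholders only, because the real keys depend on the data source type.

```python
def data_asset_reference(asset_id: str, space_id: str) -> dict:
    """Build a data_asset reference that points at an asset by href."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def connection_asset_reference(connection_id: str, location: dict) -> dict:
    """Build a connection_asset reference with source-specific location fields."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        # Copy the caller's mapping so later changes do not alter the payload.
        "location": dict(location),
    }


payload = {
    "training_data_references": [
        # Placeholder location keys; substitute the ones for your data source.
        connection_asset_reference("<connection_guid>", {"bucket": "b", "file_name": "f.csv"}),
    ]
}
print(payload["training_data_references"][0]["type"])
```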

Activity Tracker events

You can monitor API activity within your account by using the IBM Cloud Activity Tracker service. Whenever an API method is called, an event is generated that you can then track and audit from within Activity Tracker. The specific event type is listed for each individual method.

Migrating from the beta API

The watsonx.ai API changed between the beta and the GA release. This section describes the changes required to migrate from the beta API to the GA API.

Migrating /ml/v1-beta/generation/text

  1. Change the path from /ml/v1-beta/generation/text to /ml/v1/text/generation.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/generation/text_stream

  1. Change the path from /ml/v1-beta/generation/text_stream to /ml/v1/text/generation_stream.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/deployments/{id_or_name}/generation/text

  1. Change the path from /ml/v1-beta/deployments/{id_or_name}/generation/text to /ml/v1/deployments/{id_or_name}/text/generation.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/deployments/{id_or_name}/generation/text_stream

  1. Change the path from /ml/v1-beta/deployments/{id_or_name}/generation/text_stream to /ml/v1/deployments/{id_or_name}/text/generation_stream.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating the moderations

  1. The input and output entries of the moderations request are now objects that contain the enabled and threshold properties, as well as additional properties specific to a moderation. The following example shows how to migrate the moderations request section.

    {
      "moderations": {
        "hap": {
          "input": false,
          "output": true,
          "threshold": 0.5
        },
        "pii": {
          "input": true,
          "output": true,
          "mask": {
            "remove_entity_value": true
          }
        }
      }
    }
    

    becomes

    {
      "moderations": {
        "hap": {
          "output": {
            "enabled": true,
            "threshold": 0.5
          }
        },
        "pii": {
          "input": {
            "enabled": true
          },
          "output": {
            "enabled": true
          },
          "mask": {
            "remove_entity_value": true
          }
        }
      }
    }
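
The conversion shown above can be sketched as a small function. The rules it applies are inferred from this single example and are assumptions: a disabled direction is dropped, threshold moves into each enabled direction, and any other property (such as mask) is carried over unchanged. The migrate_moderations name is hypothetical.

```python
def migrate_moderations(beta: dict) -> dict:
    """Convert the contents of a beta "moderations" object to the GA shape."""
    ga = {}
    for name, settings in beta.items():
        migrated = {}
        for direction in ("input", "output"):
            # Only directions that were switched on appear in the GA request.
            if settings.get(direction):
                entry = {"enabled": True}
                if "threshold" in settings:
                    entry["threshold"] = settings["threshold"]
                migrated[direction] = entry
        # Carry over moderation-specific properties such as mask.
        for key, value in settings.items():
            if key not in ("input", "output", "threshold"):
                migrated[key] = value
        ga[name] = migrated
    return ga


beta = {
    "hap": {"input": False, "output": True, "threshold": 0.5},
    "pii": {"input": True, "output": True, "mask": {"remove_entity_value": True}},
}
print(migrate_moderations(beta))
```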
    

Migrating /ml/v1-beta/text/tokenization

  1. Change the path from /ml/v1-beta/text/tokenization to /ml/v1/text/tokenization.

Migrating /ml/v1-beta/foundation_model_specs

  1. Change the path from /ml/v1-beta/foundation_model_specs to /ml/v1/foundation_model_specs.

Migrating /ml/v1-beta/foundation_model_tasks

  1. Change the path from /ml/v1-beta/foundation_model_tasks to /ml/v1/foundation_model_tasks.
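
The path changes in this section follow one pattern, which the sketch below captures: the prefix changes from /ml/v1-beta/ to /ml/v1/, and a trailing generation/text or generation/text_stream segment pair is reordered. The migrate_path helper is a hypothetical name and covers only the paths listed in this section.

```python
import re

BETA_PREFIX = "/ml/v1-beta/"
GA_PREFIX = "/ml/v1/"
# Matches a trailing "generation/text" or "generation/text_stream" segment pair.
_GENERATION = re.compile(r"(^|/)generation/text(_stream)?$")


def migrate_path(path: str) -> str:
    """Map a beta API path to its GA equivalent; other paths pass through."""
    if not path.startswith(BETA_PREFIX):
        return path
    rest = path[len(BETA_PREFIX):]
    # generation/text becomes text/generation, keeping any _stream suffix.
    rest = _GENERATION.sub(
        lambda m: f"{m.group(1)}text/generation{m.group(2) or ''}", rest
    )
    return GA_PREFIX + rest


print(migrate_path("/ml/v1-beta/deployments/classification/generation/text_stream"))
```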

Methods

Create a new watsonx.ai deployment

Create a new deployment. Currently, the only supported type is online. If this is a deployment for a prompt tune, the asset object must exist and its id must be the id of the model that was created by the prompt training. If this is a deployment for a prompt template, the prompt_template object must exist and its id must be the id of the prompt template to be deployed.

POST /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The deployment request entity.

The following important fields are described for each use case:

  1. Prompt template: (deployed_asset_type is foundation_model)
    • base_model_id: required
    • prompt_template.id: required
    • online: required
    • hardware_spec: forbidden
  2. Prompt tune: (deployed_asset_type is prompt_tune)
    • asset.id: required
    • online: required
    • hardware_spec: forbidden
    • base_model_id: forbidden
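
The two use cases can be sketched as request-body builders. Only the fields listed above come from this document. The name field is an assumption mirrored from the example responses, the function names are hypothetical, and a real request probably also needs to identify a space or project, which this sketch leaves out.

```python
def prompt_template_deployment(name: str, base_model_id: str, prompt_template_id: str) -> dict:
    """Request body for deploying a prompt template (foundation_model)."""
    return {
        "name": name,  # assumed field, mirrored from the example responses
        "base_model_id": base_model_id,
        "prompt_template": {"id": prompt_template_id},
        "online": {},
        # hardware_spec is forbidden for this use case, so it is not set.
    }


def prompt_tune_deployment(name: str, asset_id: str) -> dict:
    """Request body for deploying a prompt tuned model (prompt_tune)."""
    return {
        "name": name,  # assumed field, mirrored from the example responses
        "asset": {"id": asset_id},
        "online": {},
        # hardware_spec and base_model_id are forbidden here, so neither is set.
    }


print(prompt_tune_deployment("text_classification", "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"))
```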

Response

A deployment resource.

Status Code

  • Deployment created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }

Retrieve the deployments

Retrieve the list of deployments for the specified space or project.

GET /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Retrieves the deployment, if any, that contains this serving_name.

    Example: classification

  • Retrieves only the resources with the given tag value.

  • Retrieves only the resources with the given asset_id. The asset_id is the model id.

  • Retrieves only the resources with the given prompt_template_id.

  • Retrieves only the resources with the given name.

  • Retrieves the resources filtered by the given type. The value can be a deployment type, optionally combined with prompt_template when the deployment includes a prompt template.

    The supported deployment types are (see the description for deployed_asset_type in the deployment entity):

    1. prompt_tune - when a prompt tuned model is deployed.
    2. foundation_model - when a prompt template is used on a pre-deployed IBM provided model.

    These can be combined with the flag prompt_template like this:

    1. type=prompt_tune - return all prompt tuned model deployments.
    2. type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
    3. type=foundation_model - return all prompt template deployments.
    4. type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
    5. type=prompt_template - return all deployments with a prompt template.
  • Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.

  • Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.

    Default: false
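
A query string for this method could be assembled as in the sketch below. The parameter names version, space_id, project_id, and type appear in the text above; any other filter name passed through the keyword arguments (for example state) is an assumption, because this page does not show the parameter names explicitly. The list_deployments_query helper is hypothetical.

```python
from urllib.parse import urlencode


def list_deployments_query(version, space_id=None, project_id=None, **filters):
    """Build the query string for listing deployments."""
    if space_id is None and project_id is None:
        raise ValueError("Either space_id or project_id has to be given.")
    params = {"version": version}
    if space_id is not None:
        params["space_id"] = space_id
    if project_id is not None:
        params["project_id"] = project_id
    # Optional filters; skip any that were left unset.
    params.update({k: v for k, v in filters.items() if v is not None})
    return urlencode(params)


print(list_deployments_query(
    "2023-07-07",
    space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f",
    type="prompt_tune",
))
```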

Response

The deployment resources.

Status Code

  • OK.

  • serving_name is available for use. Returned when serving_name and conflict query parameters are used.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

  • Returned when serving_name and conflict query parameters are used. The response body will contain the reason.

Example responses
  • {
      "limit": 10,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments"
      },
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "created_at": "2023-05-02T16:27:51Z",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "name": "text_classification",
            "description": "Classification prompt tuned model deployment",
            "tags": [
              "classification"
            ]
          },
          "entity": {
            "asset": {
              "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
            },
            "deployed_asset_type": "prompt_tune",
            "online": {},
            "base_model_id": "google/flan-t5-xl",
            "status": {
              "state": "ready",
              "message": {
                "level": "info",
                "text": "The deployment is successful"
              },
              "inference": [
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
                },
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
                  "sse": true
                }
              ]
            }
          }
        }
      ]
    }

Retrieve the deployment details

Retrieve the deployment details with the specified identifier.

GET /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.read

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

A deployment resource.

Status Code

  • Deployment details.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }

Update the deployment metadata

Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.

  • /name
  • /description
  • /tags
  • /custom
  • /online/parameters
  • /asset
  • /prompt_template
  • /hardware_spec

The PATCH operation with path specified as /online/parameters can be used to update the serving_name.

Patching /asset or /prompt_template should normally be used in the case when these fields already exist.
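
A patch body for this method might be built as follows. This sketch assumes the request body uses the standard JSON Patch layout (a list of objects with op, path, and value); the patch_operation helper and the new name value are made up for illustration.

```python
import json

# Paths listed above as supported for the patch operation.
SUPPORTED_PATHS = (
    "/name", "/description", "/tags", "/custom",
    "/online/parameters", "/asset", "/prompt_template", "/hardware_spec",
)


def patch_operation(path: str, value, op: str = "replace") -> dict:
    """Build one JSON Patch operation, rejecting unsupported paths."""
    if path not in SUPPORTED_PATHS:
        raise ValueError(f"Unsupported patch path: {path}")
    return {"op": op, "path": path, "value": value}


patch = [
    patch_operation("/name", "text_classification_v2"),
    # Updating /online/parameters is how the serving_name is changed.
    patch_operation("/online/parameters", {"serving_name": "classification"}),
]
print(json.dumps(patch, indent=2))
```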

PATCH /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.update

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

The JSON patch.

Response

A deployment resource.

Status Code

  • Deployment accepted

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {
          "parameters": {
            "serving_name": "classification"
          }
        },
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/classification/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/classification/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }

Delete the deployment

Delete the deployment with the specified identifier.

DELETE /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.delete

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Status Code

  • Deployment deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The project or space for the deployment must have a WML instance that will be used for limits and billing (if a paid plan).

    Example: classification

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:
prompt_tune
prompt_tune_with_moderations
prompt_template
  • curl --location --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --location --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'
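
The same request can be prepared from Python with the standard library, as in this sketch. It builds the request object without sending it; build_generation_request is a hypothetical helper, and the token and deployment name are placeholders.

```python
import json
from urllib.request import Request


def build_generation_request(base_url, id_or_name, token, body, version="2023-05-02"):
    """Prepare (but do not send) a POST request for deployment text generation."""
    url = f"{base_url}/ml/v1/deployments/{id_or_name}/text/generation?version={version}"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


req = build_generation_request(
    "https://us-south.ml.cloud.ibm.com",
    "classification",
    "{token}",
    {
        "input": "how far is paris from bangalore:",
        "parameters": {"max_new_tokens": 100, "time_limit": 1000},
    },
)
print(req.get_method(), req.full_url)
```

Passing the prepared object to urllib.request.urlopen would send it; that step is left out so the sketch runs without credentials.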

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details for a prompt tune.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details for a prompt tune with moderations.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
  • The generated text from the model along with other details for a prompt template.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
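
Reading these responses can be sketched as below. The helpers are hypothetical and rely only on the fields visible in the examples above (results, generated_text, moderations, entity, position); the sample response is a shortened, made-up variant of those examples.

```python
def generated_texts(response: dict) -> list:
    """Return the generated_text of every result in the response."""
    return [result["generated_text"] for result in response.get("results", [])]


def flagged_entities(response: dict) -> list:
    """List (moderation, entity, start, end) for every moderation hit."""
    found = []
    for result in response.get("results", []):
        for kind, hits in result.get("moderations", {}).items():
            for hit in hits:
                position = hit["position"]
                found.append((kind, hit.get("entity"), position["start"], position["end"]))
    return found


# Shortened, made-up response that follows the shape of the examples above.
response = {
    "model_id": "google/flan-t5-xl",
    "results": [{
        "generated_text": "call **************",
        "stop_reason": "eos_token",
        "moderations": {"pii": [{
            "score": 0.8, "input": False,
            "position": {"start": 5, "end": 19},
            "entity": "PhoneNumber",
        }]},
    }],
}
print(generated_texts(response), flagged_entities(response))
```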

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned, and rank and top_tokens are also not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Custom Headers

  • Allowable values: [application/json,text/event-stream]

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The project or space for the deployment must have a WML instance that will be used for limits and billing (if a paid plan).

    Example: classification

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:
  • curl --location --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --location --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'

Response

A set of server-sent events. Each event contains a response for one or more tokens.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details.

    [
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.673Z",
        "results": [
          {
            "generated_text": "",
            "generated_token_count": 4,
            "input_token_count": 0,
            "stop_reason": "eos_token"
          }
        ]
      },
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.647Z",
        "results": [
          {
            "generated_text": " km",
            "generated_token_count": 3,
            "input_token_count": 0,
            "stop_reason": "not_finished"
          }
        ]
      },
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.647Z",
        "results": [
          {
            "generated_text": "4,000",
            "generated_token_count": 2,
            "input_token_count": 0,
            "stop_reason": "not_finished"
          }
        ]
      }
    ]
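
A client typically reads the stream event by event and joins the generated_text fragments. The following Python sketch illustrates one way to do that. It assumes the generic SSE framing (data: lines terminated by a blank line), that each data field carries one JSON object shaped like the entries above, and that events arrive in generation order; iter_sse_data and collect_text are hypothetical helper names, and the sample stream is hand-written for illustration.

```python
import json


def iter_sse_data(lines):
    """Yield the decoded JSON payload of each server-sent event.

    `data:` lines accumulate until a blank line ends the event.
    Other SSE fields (such as id: or event:) are ignored.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            if value.startswith(" "):  # SSE strips one optional leading space
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            yield json.loads("\n".join(buffer))
            buffer = []
    if buffer:  # stream ended without a trailing blank line
        yield json.loads("\n".join(buffer))


def collect_text(events):
    """Concatenate generated_text chunks and return (text, last stop_reason)."""
    parts, stop_reason = [], None
    for event in events:
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
            stop_reason = result.get("stop_reason", stop_reason)
    return "".join(parts), stop_reason


# Hand-written sample stream; a real stream would come from the HTTP response.
sample_stream = [
    'data: {"results": [{"generated_text": "4,000", "stop_reason": "not_finished"}]}',
    "",
    'data: {"results": [{"generated_text": " km", "stop_reason": "not_finished"}]}',
    "",
    'data: {"results": [{"generated_text": "", "stop_reason": "eos_token"}]}',
    "",
]
text, stop_reason = collect_text(iter_sse_data(sample_stream))
print(text, stop_reason)
```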

List the available foundation models

Retrieve the list of deployed foundation models.

GET /ml/v1/foundation_model_specs

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user; it is generated by the service and set in the href of the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The models that are currently deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02"
      },
      "resources": [
        {
          "model_id": "bigcode/starcoder",
          "label": "starcoder-15.5b",
          "provider": "BigCode",
          "source": "Hugging Face",
          "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions",
          "tasks": [
            {
              "id": "code",
              "ratings": {
                "quality": 3
              }
            }
          ],
          "min_shot_size": 0,
          "tier": "class_2",
          "number_params": "15.5b"
        }
      ]
    }
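
Because the pagination token is generated by the service, a client would normally follow the href of the next field rather than build the token itself. The sketch below shows that loop in Python. It assumes the last page carries no next object; iter_resources is a hypothetical helper, fetch_json stands for whatever HTTP client is in use, and the example.test URLs and the start token are made up for the demonstration.

```python
def iter_resources(first_url, fetch_json):
    """Yield every resource across pages by following each page's next.href.

    fetch_json is any callable that takes a URL and returns the decoded
    JSON page; it is injected so the sketch does not assume an HTTP client.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        for resource in page.get("resources", []):
            yield resource
        # Assumption: the final page has no "next" object.
        url = page.get("next", {}).get("href")


# Stand-in pages; the URLs and the `start` token are invented for illustration.
fake_pages = {
    "https://example.test/specs?version=2023-05-02&limit=1": {
        "limit": 1,
        "resources": [{"model_id": "bigcode/starcoder"}],
        "next": {"href": "https://example.test/specs?version=2023-05-02&limit=1&start=abc"},
    },
    "https://example.test/specs?version=2023-05-02&limit=1&start=abc": {
        "limit": 1,
        "resources": [{"model_id": "google/flan-ul2"}],
    },
}
model_ids = [
    resource["model_id"]
    for resource in iter_resources(
        "https://example.test/specs?version=2023-05-02&limit=1",
        fake_pages.__getitem__)
]
print(model_ids)
```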

List the supported tasks

Retrieve the list of tasks that are supported by the foundation models.

GET /ml/v1/foundation_model_tasks

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user; it is generated by the service and set in the href of the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50
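
The constraints on these query parameters can be checked on the client before a request is made. The following Python sketch is one way to do so; the query parameter names version and limit are assumptions inferred from the sample href and the response fields, and tasks_url is a hypothetical helper.

```python
import re
import urllib.parse

VERSION_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def tasks_url(base_url, version, limit=100):
    """Return the foundation_model_tasks URL after validating the query values."""
    if not VERSION_PATTERN.match(version):
        raise ValueError("version must have the form YYYY-MM-DD")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    # Assumption: the parameters are named "version" and "limit".
    query = urllib.parse.urlencode({"version": version, "limit": limit})
    return "{}/ml/v1/foundation_model_tasks?{}".format(base_url.rstrip("/"), query)


url = tasks_url("https://us-south.ml.cloud.ibm.com", "2023-07-07", limit=50)
print(url)
```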

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The tasks that are currently supported by models deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02"
      },
      "resources": [
        {
          "task_id": "question_answering",
          "label": "Question answering",
          "rank": 1,
          "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance."
        }
      ]
    }

Create a new notebook.

Create a new notebook, either from scratch or by copying another notebook.

To create a notebook from scratch, first upload the notebook content (ipynb format) to the project's Cloud Object Storage (COS), then reference it with the attribute file_reference. The other required attributes are name, project, and runtime. The attribute runtime specifies the environment on which the notebook runs.

To copy a notebook, you only need to provide name and source_guid in the request body.
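
The two request bodies might be assembled as in the following Python sketch. The attribute names come from the description above; the nested shape of runtime mirrors the example responses and is an assumption for the request, and the helper names and the sample file_reference path are hypothetical.

```python
import json


def notebook_from_scratch(name, project, environment, file_reference):
    """Body for creating a notebook from content already uploaded to COS."""
    return {
        "name": name,
        "project": project,
        # Assumption: runtime uses the {"environment": ...} shape seen in responses.
        "runtime": {"environment": environment},
        "file_reference": file_reference,
    }


def notebook_copy(name, source_guid):
    """Body for copying an existing notebook; only two attributes are needed."""
    return {"name": name, "source_guid": source_guid}


scratch_body = notebook_from_scratch(
    "my notebook",
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "notebooks/my_notebook.ipynb",  # hypothetical path inside the project COS bucket
)
copy_body = notebook_copy("my notebook copy", "ca3c0e27-46ca-83d4-a646-d49b11c14de9")
print(json.dumps(copy_body))
```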

POST /v2/notebooks

Request

Specification of the notebook to be created.

Example:
createNewNotebook

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Created and returned a new notebook asset. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

  • The number of requests has exceeded the rate limit.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "notebook",
            "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "rate_limit",
          "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later."
        }
      ]
    }
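
The error bodies above share one envelope: a trace identifier plus a list of errors, each with a code, a message, and sometimes a target. A client could flatten that envelope for logging as in this Python sketch; describe_errors is a hypothetical helper, and the sketch assumes only the fields visible in the samples.

```python
import json


def describe_errors(body):
    """Turn an error response body into one readable line per error."""
    payload = json.loads(body)
    trace = payload.get("trace", "unknown")
    lines = []
    for error in payload.get("errors", []):
        # target is optional: the forbidden and rate-limit samples omit it.
        target = error.get("target")
        where = " ({} {})".format(target["type"], target["name"]) if target else ""
        lines.append("[{}] {}{}: {}".format(
            trace, error["code"], where, error["message"]))
    return lines


sample_body = json.dumps({
    "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
    "errors": [{
        "code": "invalid_auth_token",
        "message": "The IAM bearer token is not valid.",
        "target": {"type": "header", "name": "Authentication"},
    }],
})
error_lines = describe_errors(sample_body)
print("\n".join(error_lines))
```

Keeping the trace value in every line makes it easier to quote when reporting a problem.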

Retrieve the details of a large number of notebooks inside a project.

POST /v2/notebooks/list

Request

Query Parameters

  • The guid of the project.

  • Additional info that will be included into the notebook details. Possible values are:

    • runtime

Payload for a notebook list request.

Examples:
listNotebooks

Response

A list of notebook info as returned by a list query.

Status Code

  • Success. Returned a list of notebook assets. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "41d09a9a-f771-48a2-9534-50c0c622356d",
            "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d"
          },
          "entity": {
            "runtime": {
              "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "spark_monitoring_enabled": true
            },
            "asset": {
              "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
              "asset_type": "notebook",
              "created_at": "2021-07-01T12:37:01Z",
              "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
              "version": 2,
              "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
            }
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Delete a particular notebook, including the notebook asset.

DELETE /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Response

Status Code

  • Successful request. Notebook is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Revert the main notebook to a version.

PUT /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the main notebook.

Payload for a request to revert to a specific notebook version.

Examples:
revertNotebooks

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Reverted the main notebook to a version. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook v4.2",
        "description": "this is my notebook v4.2",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Update a particular notebook.

PATCH /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Payload for a notebook update request.

Examples:
updateNotebook

Response

Notebook information as returned by a GET request.

Status Code

  • Success. Updated the notebook. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
          "spark_monitoring_enabled": false
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Create a new version.

Create a version of a given notebook.

POST /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A notebook version in a project.

Status Code

  • Success. Returned the notebook version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

List the versions of a notebook.

List all versions of a particular notebook.

GET /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A list of notebook versions in a project.

Status Code

  • Success. Returned a list of versions of the notebook.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
            "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
            "created_at": 1543681714106
          },
          "entity": {
            "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
            "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
            "created_by_iui": "IBMid-123456ABCD",
            "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
            "rev_id": 1
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
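
A version listing like the one above can be post-processed on the client. The Python sketch below picks the most recent version and converts its timestamp. It assumes that created_at is expressed in epoch milliseconds and that a higher rev_id means a newer version; neither is stated explicitly here, and both helper names are hypothetical.

```python
from datetime import datetime, timezone


def latest_version(listing):
    """Return the version entry with the highest rev_id, or None if empty."""
    resources = listing.get("resources", [])
    return max(resources, key=lambda version: version["entity"]["rev_id"], default=None)


def created_at_utc(version):
    """Convert metadata.created_at to an aware datetime (assumes epoch milliseconds)."""
    return datetime.fromtimestamp(
        version["metadata"]["created_at"] / 1000, tz=timezone.utc)


listing = {
    "total_results": 1,
    "resources": [{
        "metadata": {
            "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
            "created_at": 1543681714106,
        },
        "entity": {"rev_id": 1},
    }],
}
newest = latest_version(listing)
print(newest["metadata"]["guid"], created_at_utc(newest).isoformat())
```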

Retrieve a notebook version.

Retrieve a particular version of a notebook.

GET /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

A notebook version in a project.

Status Code

  • Success. Returned the version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Delete a notebook version.

Delete a particular version of a given notebook.

DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

Status Code

  • Success. The version is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Create a new prompt / prompt template

This creates a new prompt with the provided parameters.

POST /v1/prompts

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*
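
Both targets are marked as required, yet one target must be supplied per request. The Python sketch below reads that as "exactly one of the two" and checks the value against the documented regular expression. The query parameter names project_id and space_id are assumptions, as is the exactly-one interpretation; target_query is a hypothetical helper.

```python
import re

# Pattern copied from the parameter description; note that it also matches "".
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def target_query(project_id=None, space_id=None):
    """Return the query parameters for exactly one target (project or space)."""
    supplied = {
        name: value
        for name, value in (("project_id", project_id), ("space_id", space_id))
        if value is not None
    }
    if len(supplied) != 1:
        raise ValueError("supply exactly one of project_id or space_id")
    for name, value in supplied.items():
        if not ID_PATTERN.fullmatch(value):
            raise ValueError("{} must match [a-zA-Z0-9-]*".format(name))
    return supplied


params = target_query(project_id="b275be5f-10ff-47ee-bfc9-63f1ce5addbf")
print(params)
```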

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt

This retrieves a prompt / prompt template with the given id.

GET /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Only return a set of model parameters compatible with inferencing

    Default: true

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt

This updates a prompt / prompt template with the given id.

PATCH /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt

This deletes a prompt / prompt template with the given id.

DELETE /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt lock modifications

Modifies the current locked state of a prompt.

PUT /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • OK - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt lock status

Retrieves the current locked state of a prompt.

GET /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get the inference input string for a given prompt

Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.

POST /v1/prompts/{prompt_id}/input

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new chat item to a prompt

This adds new chat items to the given prompt.

POST /v1/prompts/{prompt_id}/chat_items

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Create a new prompt session

This creates a new prompt session.

POST /v1/prompt_sessions

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt session

This retrieves a prompt session with the given id.

GET /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Include the most recent entry

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt session

This updates a prompt session with the given id.

PATCH /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned when the update succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt session

This deletes a prompt session with the given id.

DELETE /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new prompt to a prompt session

This creates a new prompt associated with the given session.

POST /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get entries for a prompt session

Lists the entries of a given session.

GET /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Bookmark from a previously limited get request

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Limit for results to retrieve, default 20

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Success - Returned when search completes

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
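The bookmark and limit parameters suggest a page-by-page retrieval loop. The sketch below shows one way such a loop could look. Since this method documents no sample response, the `results` and `bookmark` response keys are assumptions, and `fetch_page` is a hypothetical caller-supplied function that performs the actual GET request.

```python
def iter_session_entries(fetch_page, limit=20):
    """Yield entries page by page until no further bookmark is returned.

    fetch_page(bookmark, limit) is a hypothetical callable that issues
    GET /v1/prompt_sessions/{session_id}/entries and returns decoded JSON.
    The "results" and "bookmark" keys are assumed, not documented here.
    """
    bookmark = None
    while True:
        page = fetch_page(bookmark, limit)
        yield from page.get("results", [])
        bookmark = page.get("bookmark")
        if not bookmark:
            break
```

Keeping the HTTP call behind `fetch_page` lets the loop be exercised with canned pages before it is pointed at a real service.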

Add a new chat item to a prompt session entry

This adds new chat items to the given entry.

POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt session lock modifications

Modifies the current locked state of a prompt session.

PUT /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • Ok - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt session lock status

Retrieves the current locked state of a prompt session.

GET /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt session entry

This retrieves a prompt session entry with the given id.

GET /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt session entry

This deletes a prompt session entry with the given id.

DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Generate embeddings

Generate embeddings from text input.

POST /ml/v1/text/embeddings

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-embeddings.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The text input for a given model to be used to generate the embeddings.

Examples:
request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "inputs": [
        "Youth craves thrills while adulthood cherishes wisdom.",
        "Youth seeks ambition while adulthood finds contentment.",
        "Dreams chased in youth while goals pursued in adulthood."
      ],
      "model_id": "slate",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • An array of embeddings for each input string.

    {
      "model_id": "slate",
      "results": [
        {
          "embedding": [
            -0.006929283,
            -0.005336422,
            -0.024047505
          ]
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 10
    }
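For readers calling this method from Python rather than curl, the following is a minimal sketch using only the standard library. The helper names `build_embeddings_request` and `extract_embeddings` are illustrative, not part of any IBM library, and the default `model_id` and `version` simply repeat the values from the curl example.

```python
import json
from urllib.request import Request


def build_embeddings_request(cluster_url, token, project_id, inputs,
                             model_id="slate", version="2023-10-25"):
    """Prepare (but do not send) a POST /ml/v1/text/embeddings request."""
    body = {"inputs": list(inputs), "model_id": model_id, "project_id": project_id}
    return Request(
        url=f"https://{cluster_url}/ml/v1/text/embeddings?version={version}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def extract_embeddings(response_body):
    """Return the embedding vectors from a decoded response shaped like the example."""
    return [item["embedding"] for item in response_body["results"]]
```

The prepared request could be sent with `urllib.request.urlopen(request)` and the body decoded with `json.load` before being passed to `extract_embeddings`.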

Detect text similarities

Detect similarities between source strings and target strings.

POST /ml/v1/text/similarities

Auditing

Calling this method generates the following auditing event.

  • pm-20.tbd.tbd

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The strings to be used to detect text similarities.

Examples:
request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/similarities?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "sources": [
        "Youth is not about wisdom, contentment, and pursuit of goals.",
        "Youth is absolutely boring and unambitious."
      ],
      "targets": [
        "Youth craves thrills while adulthood cherishes wisdom.",
        "Youth seeks ambition while adulthood finds contentment.",
        "Dreams chased in youth while goals pursued in adulthood."
      ],
      ],
      "model_id": "slate",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • An array of similarity ratings for each source string.

    {
      "model_id": "slate",
      "results": [
        {
          "score": 0.7527929544448853
        },
        {
          "score": 0.7988557815551758
        },
        {
          "score": 0.177557945251
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 14
    }
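A response of this shape can be reduced to its strongest match with a few lines of Python. This is only a sketch: `best_match` is an illustrative helper name, and how each position in `results` maps back to the source and target strings is not stated in this section, so the returned index refers only to the position within `results`.

```python
def best_match(response_body):
    """Return (index, score) of the highest-scoring entry in "results".

    Raises ValueError if "results" is empty.
    """
    scores = [item["score"] for item in response_body["results"]]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]
```

Applied to the example response above, the helper selects the second entry, whose score of 0.7988557815551758 is the largest of the three.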

Generate reranks

Rerank texts based on some queries.

POST /ml/v1/text/reranks

Auditing

Calling this method generates the following auditing event.

  • pm-20.tbd.tbd

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input texts and the queries for reranking.

Examples:
request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/reranks?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "inputs": [
        {
           "title": "Youthful Adventures vs. Grownup Routine",
           "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I have come to appreciate the comforting stability of a well-established routine."
        },
        {
           "title": "Thrill-Seeking Youth and Wisdom of Adulthood",
           "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of lifes novelties, while as a responsible adult, I have come to understand the profound value of accumulated wisdom and life experience."
        }
      ],
      "queries": [
        "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit."
      ],
      "parameters": {
        "top_n": 2
      },
      "model_id": "sentence-transformers/all-MiniLM",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • An array of rerank scores for the input texts.

    {
      "model_id": "sentence-transformers/all-MiniLM",
      "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
      "results": [
        {
          "score": 0.8274
        },
        {
          "score": 0.7461
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 20
    }
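If the scores need to be ordered on the client side, a sketch like the one below would do it. It assumes nothing about whether the service already returns `results` sorted. `top_reranked` is an illustrative helper name, and each index it returns is the position of a score within `results`, which this section does not explicitly tie back to the `inputs` array.

```python
def top_reranked(response_body, top_n=None):
    """Return (position, score) pairs sorted by descending score.

    When top_n is given, only the first top_n pairs are kept.
    """
    ranked = sorted(
        enumerate(item["score"] for item in response_body["results"]),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked if top_n is None else ranked[:top_n]
```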

Infer text

Infer the next tokens for a given deployed model with a set of parameters.

POST /ml/v1/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:
request
moderations_request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
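The `position` offsets in the moderations example line up with the masked spans in `generated_text`: characters 74 to 87 are the first run of asterisks, so `end` appears to be exclusive. Under that reading, a client could apply the same masking itself. The sketch below is illustrative only; `redact_pii` is a hypothetical helper, and treating `"input": false` as "the hit refers to the generated output" is an assumption.

```python
def redact_pii(result):
    """Mask each flagged span of generated_text with asterisks.

    Assumes character offsets with an exclusive end, and that hits with
    "input": false refer to the generated text rather than the prompt.
    """
    text = result["generated_text"]
    spans = [
        (hit["position"]["start"], hit["position"]["end"])
        for hit in result.get("moderations", {}).get("pii", [])
        if not hit["input"]
    ]
    for start, end in spans:
        # Masking preserves length, so later offsets remain valid.
        text = text[:start] + "*" * (end - start) + text[end:]
    return text
```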

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

POST /ml/v1/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Custom Headers

  • Allowable values: [application/json,text/event-stream]

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:
request
moderations_request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

A set of server-sent events. Each event contains a response for one or more tokens.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details.

    [
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.673Z",
        "results": [
          {
            "generated_text": "",
            "generated_token_count": 4,
            "input_token_count": 0,
            "stop_reason": "eos_token"
          }
        ]
      },
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.647Z",
        "results": [
          {
            "generated_text": " km",
            "generated_token_count": 3,
            "input_token_count": 0,
            "stop_reason": "not_finished"
          }
        ]
      },
      {
        "model_id": "google/flan-ul2",
        "created_at": "2023-07-21T19:17:36.647Z",
        "results": [
          {
            "generated_text": "4,000",
            "generated_token_count": 2,
            "input_token_count": 0,
            "stop_reason": "not_finished"
          }
        ]
      }
    ]
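On the wire, a server-sent events stream arrives as text lines, with each event's payload carried on `data:` lines and events separated by blank lines. The sketch below shows one way to decode such a stream and stitch the text together. It assumes each event's `data:` payload is a JSON object shaped like the array elements above and that events arrive in generation order; both helper names are illustrative.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each server-sent event."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format allows one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            yield json.loads("\n".join(data_lines))
            data_lines = []
    if data_lines:  # stream ended without a trailing blank line
        yield json.loads("\n".join(data_lines))


def collect_text(events):
    """Concatenate generated_text across all events, in arrival order."""
    return "".join(r["generated_text"] for e in events for r in e["results"])
```

Other SSE fields, such as `id:` or `event:` lines, are ignored by this sketch.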

Text tokenization

The text tokenization operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which are then converted to IDs through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.

POST /ml/v1/text/tokenization

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-tokenization.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input string to tokenize.

Examples:
request
  • curl --location --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-ul2",
      "input": "Write a tagline for an alumni association: Together we",
      "parameters": {
        "return_tokens": true
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

The tokenization result.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The response with the token count and the tokens, if requested.

    {
      "model_id": "google/flan-ul2",
      "result": {
        "token_count": 11,
        "tokens": [
          "Write",
          "a",
          "tag",
          "line",
          "for",
          "an",
          "alumni",
          "associ",
          "ation:",
          "Together",
          "we"
        ]
      }
    }
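One practical use of the token count is checking an input against a token limit before sending it for inference or tuning. The following sketch assumes a response shaped like the example above; `fits_token_budget` is an illustrative helper name and the limit is supplied by the caller.

```python
def fits_token_budget(tokenization_response, max_input_tokens):
    """Return True if the tokenized input fits within max_input_tokens."""
    result = tokenization_response["result"]
    tokens = result.get("tokens")  # present only when return_tokens was requested
    if tokens is not None and len(tokens) != result["token_count"]:
        raise ValueError("token_count does not match the number of tokens returned")
    return result["token_count"] <= max_input_tokens
```

For the example response, which reports 11 tokens, a limit of 100 would pass and a limit of 10 would not.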

Create a new watsonx.ai training

Create a new watsonx.ai training in a project or a space.

The details of the base model and parameters for the training must be provided in the prompt_tuning object.

To deploy the tuned model, follow these steps:

  1. Create a WML model asset, in a space or a project, by providing the request.json as shown below:

    curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
      -H "Authorization: Bearer <replace with your token>" \
      -H "content-type: application/json" \
      --data '{
         "name": "replace_with_a_meaningful_name",
         "space_id": "replace_with_your_space_id",
         "type": "prompt_tune_1.0",
         "software_spec": {
           "name": "watsonx-textgen-fm-1.0"
         },
         "metrics": [ from the training job ],
         "training": {
           "id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
           "base_model": {
             "model_id": "google/flan-t5-xl"
           },
           "task_id": "generation",
           "verbalizer": "Input: {{input}} Output:"
         },
         "training_data_references": [
           {
             "connection": {
               "id": "20933468-7e8a-4706-bc90-f0a09332b263"
             },
             "id": "file_to_tune1.json",
             "location": {
               "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
               "path": "file_to_tune1.json"
             },
             "type": "connection_asset"
           }
         ]
       }'
    

    Notes:

    1. If you set the training request field auto_update_model: true, you can skip this step because the model is saved at the end of the training job.
    2. Rather than creating the payload for the model, you can use the generated request.json that was stored in the results_reference field. Look for the path in the field entity.results_reference.location.model_request_path.
    3. The model type must be prompt_tune_1.0.
    4. The software spec name must be watsonx-textgen-fm-1.0.
  2. Create a tuned model deployment as described in the create deployment documentation.
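The two decisions described in the notes above (whether step 1 is needed, and where the generated request.json is stored) can be read directly from a training resource. The sketch below assumes a decoded training response with the `entity` layout shown in this section's example responses; both helper names are illustrative.

```python
def needs_manual_model_asset(training):
    """Step 1 can be skipped when the training used auto_update_model: true."""
    return not training.get("entity", {}).get("auto_update_model", False)


def model_request_path(training):
    """Return the stored request.json path, or None if it is not present."""
    location = (
        training.get("entity", {})
        .get("results_reference", {})
        .get("location", {})
    )
    return location.get("model_request_path")
```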

POST /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The training_data_references field contains the training datasets, and the results_reference field identifies the connection where the results will be stored.

Examples:
TrainingPromptTuningRequest
  • curl --location --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "name": "my-prompt-tune-training",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "prompt_tuning": {
        "base_model": {
          "model_id": "google/flan-t5-xl"
        },
        "task_id": "classification",
        "tuning_type": "prompt_tuning",
        "num_epochs": 30,
        "learning_rate": 0.4,
        "accumulate_steps": 3,
        "batch_size": 10,
        "max_input_tokens": 100,
        "max_output_tokens": 100
      },
      "training_data_references": [
        {
          "id": "tune1_data.json",
          "location": {
            "path": "tune1_data.json"
          },
          "type": "container"
        }
      ],
      "auto_update_model": true,
      "results_reference": {
        "location": {
          "path": "tune1/results"
        },
        "type": "container"
      }
    }'

Response

Training resource.

Status Code

  • The training job has been created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
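A client that polls a training job typically needs two things from a resource like this: whether the job has finished, and how the loss evolved. The sketch below reads both from the `entity.status` object. The state names come from the allowable values listed for the training list filter, but treating `completed`, `failed`, and `canceled` as the terminal states is an assumption, and the helper names are illustrative.

```python
# Assumed terminal states; the full set of states is listed under the
# training job state filter of the list method.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def is_finished(training):
    """Return True once the training job has reached an assumed terminal state."""
    return training["entity"]["status"]["state"] in TERMINAL_STATES


def loss_by_iteration(training):
    """Map each reported iteration number to its loss value."""
    metrics = training["entity"]["status"].get("metrics", [])
    return {m["iteration"]: m["ml_metrics"]["loss"] for m in metrics}
```

For the example response, `loss_by_iteration` yields losses of 4.49988, 3.86884, and 4.05115 for iterations 0, 1, and 2.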

Retrieve the list of trainings

Retrieve the list of trainings for the specified space or project.

GET /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned. The default limit is 100. The maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • Compute the total count. May have performance impact.

  • Return only the resources with the given tag value.

  • Filter based on the training job state.

    Allowable values: [queued,pending,running,storing,completed,failed,canceled]

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Information for paging when querying resources.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "limit": 100,
      "first": {
        "href": "https://{cluster_url}/ml/v4/trainings"
      },
      "total_count": 1,
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "name": "my-prompt-training",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "created_at": "2023-08-04T13:22:47.000Z"
          },
          "entity": {
            "prompt_tuning": {
              "base_model": {
                "model_id": "google/flan-t5-xl"
              },
              "task_id": "classification"
            },
            "training_data_references": [
              {
                "id": "tune1_data.json",
                "location": {
                  "path": "tune1_data.json"
                },
                "type": "container"
              }
            ],
            "auto_update_model": true,
            "results_reference": {
              "location": {
                "path": "tune1/results",
                "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
                "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
                "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
                "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
              },
              "type": "container"
            },
            "status": {
              "state": "completed",
              "running_at": "2023-08-04T13:22:48.000Z",
              "completed_at": "2023-08-04T13:22:55.289Z",
              "metrics": [
                {
                  "iteration": 0,
                  "ml_metrics": {
                    "loss": 4.49988
                  },
                  "timestamp": "2023-09-22T02:52:03.324Z"
                },
                {
                  "iteration": 1,
                  "ml_metrics": {
                    "loss": 3.86884
                  },
                  "timestamp": "2023-09-22T02:52:03.689Z"
                },
                {
                  "iteration": 2,
                  "ml_metrics": {
                    "loss": 4.05115
                  },
                  "timestamp": "2023-09-22T02:52:04.053Z"
                }
              ]
            }
          }
        }
      ]
    }
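
A list response like the one above can be reduced to one row per training. The following is a minimal sketch, not an official client: the helper name `summarize_trainings` is hypothetical, and the sample is a trimmed copy of the example response that keeps only the fields the helper reads.

```python
def summarize_trainings(payload: dict) -> list:
    """Return (id, name, state) for each training in a list response."""
    rows = []
    for resource in payload.get("resources", []):
        metadata = resource.get("metadata", {})
        status = resource.get("entity", {}).get("status", {})
        rows.append((
            metadata.get("id", ""),
            metadata.get("name", ""),
            status.get("state", "unknown"),  # fall back if no status is present
        ))
    return rows


# Trimmed copy of the example response above.
sample_list = {
    "limit": 100,
    "total_count": 1,
    "resources": [
        {
            "metadata": {
                "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
                "name": "my-prompt-training",
            },
            "entity": {"status": {"state": "completed"}},
        }
    ],
}

for row in summarize_trainings(sample_list):
    print(row)
```

Using `.get` with defaults keeps the helper from failing on resources that omit a status block; whether the service can return such resources is not stated here.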

Retrieve the training

Retrieve the training with the specified identifier.

GET /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.get

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either the space_id or the project_id query parameter must be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either the space_id or the project_id query parameter must be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
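
The parameter rules above can be checked on the client before a request is sent. The sketch below builds the URL for this method under several assumptions: the query parameter for the version date is taken to be `version`, "either" is read as "exactly one of" `space_id` and `project_id`, and the helper name `training_url` is hypothetical. No request is made.

```python
import re
from typing import Optional
from urllib.parse import quote, urlencode

# Length 36, characters from [a-zA-Z0-9-], as listed under "Possible values".
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]{36}")


def training_url(base_url: str, training_id: str, version: str,
                 space_id: Optional[str] = None,
                 project_id: Optional[str] = None) -> str:
    """Build the GET /ml/v4/trainings/{training_id} URL with its query string."""
    # Assumption: exactly one of the two scoping parameters should be supplied.
    if (space_id is None) == (project_id is None):
        raise ValueError("give exactly one of space_id or project_id")
    if space_id is not None:
        scope_name, scope_value = "space_id", space_id
    else:
        scope_name, scope_value = "project_id", project_id
    if not ID_PATTERN.fullmatch(scope_value):
        raise ValueError(f"{scope_name} must be 36 characters from [a-zA-Z0-9-]")
    # "version" is an assumed parameter name for the version date.
    query = urlencode({"version": version, scope_name: scope_value})
    return f"{base_url}/ml/v4/trainings/{quote(training_id, safe='')}?{query}"


url = training_url(
    "https://us-south.ml.cloud.ibm.com",
    "6213cf1-252f-424b-b52d-5cdd9814956c",
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
)
print(url)
```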

Response

Training resource.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
    {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
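
The `status.metrics` array in a training resource can be turned into a loss curve. This is a minimal sketch with a hypothetical helper name, `loss_by_iteration`; the sample keeps only the `status` block from the example response above.

```python
def loss_by_iteration(training: dict) -> dict:
    """Map each iteration number to its reported loss."""
    metrics = training.get("entity", {}).get("status", {}).get("metrics", [])
    return {
        entry["iteration"]: entry["ml_metrics"]["loss"]
        for entry in metrics
        if "loss" in entry.get("ml_metrics", {})  # skip entries without a loss
    }


# Status block copied from the example response above (timestamps omitted).
sample_training = {
    "entity": {
        "status": {
            "state": "completed",
            "metrics": [
                {"iteration": 0, "ml_metrics": {"loss": 4.49988}},
                {"iteration": 1, "ml_metrics": {"loss": 3.86884}},
                {"iteration": 2, "ml_metrics": {"loss": 4.05115}},
            ],
        }
    }
}

losses = loss_by_iteration(sample_training)
best_iteration = min(losses, key=losses.get)
print(losses)
print("lowest loss at iteration", best_iteration)
```

In the example data the loss is not monotonic: it is lowest at iteration 1 and rises again at iteration 2, so the last entry is not necessarily the best one.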

Cancel the training

Cancel the specified training and remove it.

DELETE /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.delete

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either the space_id or the project_id query parameter must be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either the space_id or the project_id query parameter must be given.

    Possible values: length = 36, Value must match regular expression [a-zA-Z0-9-]*

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true to also delete the job metadata.

Response

Status Code

  • Training cancelled.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
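
A cancel request for this method might be prepared as follows. This is a sketch only: the query parameter names `version` and `hard_delete` are assumptions (the listing above describes the parameters without naming them), the helper name `cancel_training_request` is hypothetical, and the request object is built but never sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def cancel_training_request(base_url: str, training_id: str, token: str,
                            version: str, project_id: str,
                            hard_delete: bool = False) -> Request:
    """Prepare (but do not send) DELETE /ml/v4/trainings/{training_id}."""
    params = {"version": version, "project_id": project_id}
    if hard_delete:
        # Assumed parameter name for "also delete the job metadata".
        params["hard_delete"] = "true"
    url = (f"{base_url}/ml/v4/trainings/"
           f"{quote(training_id, safe='')}?{urlencode(params)}")
    return Request(url, method="DELETE",
                   headers={"Authorization": f"Bearer {token}"})


cancel_req = cancel_training_request(
    "https://us-south.ml.cloud.ibm.com",
    "6213cf1-252f-424b-b52d-5cdd9814956c",
    "{token}",  # placeholder, replace with an IAM access token
    "2023-07-07",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    hard_delete=True,
)
print(cancel_req.get_method(), cancel_req.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would send it; that step is left out so the sketch has no side effects.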

Create an AutoRAG job

Create an AutoRAG job that finds the best retrieval-augmented generation (RAG) pattern for the data provided in the request.

POST /ml/v1/autorag

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The details of the AutoRAG job with the data used to find the best RAG patterns.

Response

The response of an AutoRAG job.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
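
The request body schema is not reproduced in this listing, so the sketch below treats the job details as an opaque dictionary supplied by the caller. It shows only how a JSON body and the bearer token could be attached to the POST; the `version` parameter name, the helper name `create_autorag_request`, and the placeholder payload are all assumptions. Nothing is sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def create_autorag_request(base_url: str, token: str, version: str,
                           job_details: dict) -> Request:
    """Prepare (but do not send) POST /ml/v1/autorag with a JSON body."""
    url = f"{base_url}/ml/v1/autorag?{urlencode({'version': version})}"
    body = json.dumps(job_details).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder only: the real fields come from the request body schema.
placeholder_details = {"placeholder": "replace with the AutoRAG job details"}

autorag_req = create_autorag_request(
    "https://us-south.ml.cloud.ibm.com", "{token}", "2023-07-07",
    placeholder_details,
)
print(autorag_req.get_method(), autorag_req.full_url)
```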

Get an AutoRAG job

Get the results of an AutoRAG job, or details if the job failed.

GET /ml/v1/autorag/{id}

Request

Path Parameters

  • The identifier that was returned in the metadata.id field of the response when the AutoRAG job was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

Response

The response of an AutoRAG job.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
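
Because the job identifier comes from the `metadata.id` field, the path for later calls can be derived from the stored creation result. A minimal sketch, assuming that result is available as a dictionary; the helper name `autorag_job_path` and the sample identifier are hypothetical.

```python
from urllib.parse import quote


def autorag_job_path(created_job: dict) -> str:
    """Build the /ml/v1/autorag/{id} path from a job's metadata.id field."""
    job_id = created_job.get("metadata", {}).get("id")
    if not job_id:
        raise ValueError("metadata.id is missing")
    # Percent-encode the identifier so it is safe as a single path segment.
    return f"/ml/v1/autorag/{quote(str(job_id), safe='')}"


# Hypothetical identifier, for illustration only.
job_path = autorag_job_path({"metadata": {"id": "hypothetical-job-id"}})
print(job_path)
```

The same path serves both the GET and the DELETE method for an AutoRAG job.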

Delete an AutoRAG job

Delete an AutoRAG job if it exists. Once deleted, all trace of the job is gone.

DELETE /ml/v1/autorag/{id}

Request

Path Parameters

  • The identifier that was returned in the metadata.id field of the response when the AutoRAG job was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

Response

Status Code

  • Deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a geospatial transformation

Create a geospatial transformation from inputs using a given model.

POST /ml/v1/geospatial/transformations_async

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The details of the inputs to transform with the given model.

Response

The response from a geospatial transformation request.

Status Code

  • Accepted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
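
An "Accepted" response means the transformation runs asynchronously, so a client typically polls the GET method until the work finishes. The sketch below is a generic polling loop, not part of the API: the caller supplies the function that fetches the resource and the test for completion, because the response schema is not shown here. The `state` field and its values in the demonstration are assumptions borrowed from the training examples.

```python
import time
from typing import Callable


def poll_until(fetch: Callable[[], dict],
               is_finished: Callable[[dict], bool],
               interval_s: float = 5.0,
               max_attempts: int = 60,
               sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until is_finished(result) is true, then return the result."""
    for attempt in range(max_attempts):
        result = fetch()
        if is_finished(result):
            return result
        if attempt < max_attempts - 1:
            sleep(interval_s)  # wait before the next attempt
    raise TimeoutError(f"not finished after {max_attempts} attempts")


# Demonstration with canned responses instead of real HTTP calls.
canned = iter([{"state": "running"}, {"state": "running"}, {"state": "completed"}])
final = poll_until(
    fetch=lambda: next(canned),
    is_finished=lambda r: r.get("state") in {"completed", "failed"},
    sleep=lambda _seconds: None,  # skip real waiting in the demonstration
)
print(final)
```

Injecting `fetch` and `sleep` keeps the loop independent of any HTTP library and makes it testable without network access.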

Get a geospatial transformation

Get a geospatial transformation.

GET /ml/v1/geospatial/transformations/{id}

Request

Path Parameters

  • The identifier that was returned in the metadata.id field of the response when the geospatial transformation was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

Response

The response from a geospatial transformation request.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete a geospatial transformation

Delete a geospatial transformation.

DELETE /ml/v1/geospatial/transformations/{id}

Request

Path Parameters

  • The identifier that was returned in the metadata.id field of the response when the geospatial transformation was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

Response

Status Code

  • Deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
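
The methods in this section share the same set of error descriptions, so a client can handle them in one place. The sketch below assumes the standard HTTP codes that match those descriptions (400, 401, 403, 404) and treats any 2xx code as success; the numeric codes do not appear in the listings above, and the recovery hints and the helper name `describe_status` are illustrative.

```python
# Assumed mapping from standard HTTP codes to the descriptions in this section.
RECOVERY_HINTS = {
    400: "Bad request: read the reason in the response body and fix the request.",
    401: "Unauthorized: obtain a fresh IAM access token and retry.",
    403: "Forbidden: check access to the space or project.",
    404: "Not found: check the identifier in the request path.",
}


def describe_status(code: int) -> str:
    """Return a short description for an HTTP status code."""
    if 200 <= code < 300:
        return "Success."
    return RECOVERY_HINTS.get(code, f"Unexpected status {code}.")


for status_code in (200, 400, 401, 403, 404, 500):
    print(status_code, describe_status(status_code))
```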