IBM Cloud API Docs

Introduction to IBM watsonx.ai as a Service

With the IBM watsonx.ai as a Service APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).

If you are looking for the IBM watsonx.ai software APIs, see here.

Step-by-step instructions on how to use IBM watsonx.ai as a Service can be found here.

A specialized Python library is available for accessing this REST API.

Endpoint URLs

The following URLs are the base URLs for the watsonx.ai API endpoints. When you call the API, append the path for each method to the base URL to form the complete API endpoint for your request.

  • Dallas - https://us-south.ml.cloud.ibm.com
  • Frankfurt - https://eu-de.ml.cloud.ibm.com
  • London - https://eu-gb.ml.cloud.ibm.com
  • Tokyo - https://jp-tok.ml.cloud.ibm.com

Note that for prompts and notebooks the base URLs are the following:

  • Dallas - https://api.dataplatform.cloud.ibm.com/wx
  • Frankfurt - https://api.eu-de.dataplatform.cloud.ibm.com/wx
  • London - https://api.eu-gb.dataplatform.cloud.ibm.com/wx
  • Tokyo - https://api.jp-tok.dataplatform.cloud.ibm.com/wx

Example request to a Dallas endpoint:

curl -H "Authorization: Bearer {token}" -X {request_method} "https://us-south.ml.cloud.ibm.com/{method_endpoint}"

Replace {request_method} and {method_endpoint} in this example with the values for your particular API call. See the Authentication section below for more details about the bearer {token}.
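As a rough illustration of how a base URL and a method path combine, the following Python sketch joins the two. The region keys and the build_endpoint helper are hypothetical names chosen for this example; only the URLs come from the list above.

```python
# Base URLs copied from the regional endpoint list above.
BASE_URLS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
    "london": "https://eu-gb.ml.cloud.ibm.com",
    "tokyo": "https://jp-tok.ml.cloud.ibm.com",
}


def build_endpoint(region: str, method_path: str) -> str:
    """Join a regional base URL and a method path into a full endpoint URL."""
    base = BASE_URLS[region.lower()]
    # Normalise slashes so exactly one separates the base URL and the path.
    return base.rstrip("/") + "/" + method_path.lstrip("/")


print(build_endpoint("Dallas", "/ml/v1/text/generation"))
```

The helper only builds a string; it does not contact the service.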

Authentication

This API uses IBM Cloud Identity and Access Management (IAM) to authenticate requests.

To work with the API, authenticate your application or service by including your IBM Cloud IAM access token in API requests.

IAM authentication. Replace {token} with your IAM access token, {request_method} with the HTTP method, and {url}/{method} with the endpoint for your call.

curl -H "Authorization: Bearer {token}" -X {request_method} "{url}/{method}"

Authorization: Bearer {token}

For example, if the token is tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5 in the service credentials, include the credentials in your call like this:

curl -H "Authorization: Bearer tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5" -X GET "https://us-south.ml.cloud.ibm.com/ml/v4/models"
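The same header can be attached from Python. The sketch below uses only the standard library and builds the request without sending it; authorized_request is a hypothetical helper name, and the token value is a placeholder.

```python
import urllib.request


def authorized_request(url: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request carrying the IAM bearer token."""
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# Placeholder token; a real IAM access token would go here.
req = authorized_request("https://us-south.ml.cloud.ibm.com/ml/v4/models", "{token}")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Passing the returned object to urllib.request.urlopen would send the call.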

Error handling

This API uses standard HTTP response codes to indicate whether a method completed successfully. A 2xx response indicates success.

HTTP Code | Description | Recovery
200 | Success | The request was successful.
400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body.
401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. See Authenticating with IAM tokens for instructions on logging in. If this error persists, contact the account owner to check your permissions.
403 | Forbidden | The supplied authentication is not authorized.
404 | Not Found | The requested resource could not be found.

Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.
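One way a client might react to these two codes is to retry after a pause. The sketch below is an assumption about client behavior, not something this API prescribes; should_retry and backoff_delays are hypothetical helpers.

```python
# Status codes the note above associates with an overloaded or unavailable model.
RETRYABLE_STATUS = {429, 503}


def should_retry(status_code: int) -> bool:
    """Return True when the status suggests a temporary condition."""
    return status_code in RETRYABLE_STATUS


def backoff_delays(attempts: int, base_seconds: float = 1.0) -> list:
    """Exponential backoff schedule: base, 2 * base, 4 * base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


print(should_retry(503), backoff_delays(3))
```

Other errors, such as 400 or 401, are not retried here because repeating the same request would not change the outcome.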

Error response

Name | Description
trace | An identifier that can be used to trace the request. This can be set using X-Global-Transaction-Id.
errors | The list of errors.

Errors

Name | Description
code | A simple string code that should convey the general sense of the error.
message | The message that describes the error.
more_info | A reference to a more detailed explanation when available.
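To show how the trace and errors fields might be consumed, the following sketch formats an error body into readable lines. The sample body is invented for illustration; its values (trace ID, error code, message, and link) are hypothetical and only the field names come from the tables above.

```python
import json

# Hypothetical error body shaped like the fields described above.
SAMPLE_ERROR = """
{
  "trace": "example-trace-id",
  "errors": [
    {
      "code": "invalid_request",
      "message": "Missing required parameter.",
      "more_info": "https://example.com/docs"
    }
  ]
}
"""


def summarize_error(body: str) -> list:
    """Turn an error response body into one readable line per error."""
    payload = json.loads(body)
    trace = payload.get("trace", "unknown")
    lines = []
    for error in payload.get("errors", []):
        line = f"[{trace}] {error.get('code')}: {error.get('message')}"
        # more_info is only present when a detailed explanation is available.
        if error.get("more_info"):
            line += f" (see {error['more_info']})"
        lines.append(line)
    return lines


for line in summarize_error(SAMPLE_ERROR):
    print(line)
```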

Additional headers

Some additional headers might be required to make successful requests to the API. Those additional headers are described below.

An optional transaction ID can be passed to your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.

If no transaction ID is passed in, one is generated randomly.
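A client that wants to correlate calls could set the header itself. In this sketch, with_transaction_id is a hypothetical helper, and using a UUID as the value is just one convenient choice since any value is accepted.

```python
import uuid
from typing import Optional


def with_transaction_id(headers: dict, transaction_id: Optional[str] = None) -> dict:
    """Return a copy of headers with X-Global-Transaction-Id set."""
    result = dict(headers)
    # Reuse the caller's ID when given, otherwise generate a random one.
    result["X-Global-Transaction-Id"] = transaction_id or uuid.uuid4().hex
    return result


print(with_transaction_id({"Accept": "application/json"}, "my-batch-job-42"))
```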

API change log

This change log describes the latest changes, improvements, and updates for the watsonx.ai API, ordered by release date. Changes to existing API versions are designed to be compatible with existing client applications. When a change is not compatible, a new version date is created.

14 March 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically.

18 April 2024

The /ml/v1/text/embeddings API was added to watsonx.ai. This is a non-breaking change that adds a single API operation.

Versioning

API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.

When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.

Active Version Dates

Version date | Summary of changes
2024-03-14 | Publication of the /ml/v1 APIs.
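Because every request needs the version parameter, a small helper can add it to any URL. The following is a minimal sketch using the standard library; with_version is a hypothetical name, and the default date is the active version date listed above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str = "2024-03-14") -> str:
    """Return url with the version query parameter set, replacing any existing one."""
    parts = urlsplit(url)
    # Keep every other query parameter and drop a previous version value.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("https://us-south.ml.cloud.ibm.com/ml/v1/text/generation"))
```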

Data References

Accessing data in a remote location (such as a Cloud Object Storage bucket or an SQL or NoSQL database) requires the connection_asset or data_asset reference type. These references are created within a space or a project and are used in Watson Machine Learning (WML) requests to represent input data and results locations. These types contain two parameter objects, connection and location, whose required values depend on the reference type. A data_asset requires an href in the location object, whereas a connection_asset requires the connection ID in the connection object and location fields that depend on the data source type.

Example connection_asset payload:

{
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": {
        "id": "<connection_guid>"
      },
      "location": {
        "<wdp-properties depending on the type>": "<value depending on the type>"
      }
    }
  ]
}

Example data_asset payload:

{
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  ]
}
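The two payload shapes above can be produced by small builder functions. This is a sketch only: the function names are hypothetical, and the location keys passed in the usage line are invented placeholders, since the real keys depend on the data source type.

```python
def connection_asset_reference(connection_id: str, location: dict) -> dict:
    """Reference data through a connection; location fields depend on the data source type."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def data_asset_reference(asset_id: str, space_id: str) -> dict:
    """Reference a data asset by its href inside a space."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


payload = {
    "training_data_references": [
        # The location keys here are hypothetical placeholders.
        connection_asset_reference("<connection_guid>", {"<property>": "<value>"}),
        data_asset_reference("<asset_id>", "<space_id>"),
    ]
}
print(payload)
```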

Activity Tracker events

You can monitor API activity within your account by using the IBM Cloud Activity Tracker service. Whenever an API method is called, an event is generated that you can then track and audit from within Activity Tracker. The specific event type is listed for each individual method.

Migrating APIs

The watsonx.ai API changed between the beta and GA releases. This section describes the changes required to migrate from the beta API to the GA API.

Migrating /ml/v1-beta/generation/text

  1. Change the path from /ml/v1-beta/generation/text to /ml/v1/text/generation.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/generation/text_stream

  1. Change the path from /ml/v1-beta/generation/text_stream to /ml/v1/text/generation_stream.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/deployments/{id_or_name}/generation/text

  1. Change the path from /ml/v1-beta/deployments/{id_or_name}/generation/text to /ml/v1/deployments/{id_or_name}/text/generation.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating /ml/v1-beta/deployments/{id_or_name}/generation/text_stream

  1. Change the path from /ml/v1-beta/deployments/{id_or_name}/generation/text_stream to /ml/v1/deployments/{id_or_name}/text/generation_stream.
  2. Change the following fields in the response:
    1. moderation -> moderations.
  3. Change the moderations request section as described in the section below.

Migrating the moderations

  1. The input and output fields of a moderations request are now objects that contain the enabled and threshold properties, as well as additional properties specific to a moderation. The following example shows how to migrate the moderations request section.

    {
      "moderations": {
        "hap": {
          "input": false,
          "output": true,
          "threshold": 0.5
        },
        "pii": {
          "input": true,
          "output": true,
          "mask": {
            "remove_entity_value": true
          }
        }
      }
    }
    

    becomes

    {
      "moderations": {
        "hap": {
          "output": {
            "enabled": true,
            "threshold": 0.5
          }
        },
        "pii": {
          "input": {
            "enabled": true
          },
          "output": {
            "enabled": true
          },
          "mask": {
            "remove_entity_value": true
          }
        }
      }
    }
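The conversion shown above can be sketched as a function. The rules it applies are inferred from that single example and are assumptions: a direction set to false is omitted, a threshold is copied into each enabled direction, and other properties such as mask are carried over unchanged. migrate_moderations is a hypothetical helper name.

```python
def migrate_moderations(beta: dict) -> dict:
    """Convert a beta moderations section to the GA layout."""
    ga = {}
    for name, settings in beta.items():
        converted = {}
        for direction in ("input", "output"):
            # Directions that were false or absent are omitted.
            if settings.get(direction):
                entry = {"enabled": True}
                if "threshold" in settings:
                    entry["threshold"] = settings["threshold"]
                converted[direction] = entry
        # Remaining properties (such as mask) are carried over unchanged.
        for key, value in settings.items():
            if key not in ("input", "output", "threshold"):
                converted[key] = value
        ga[name] = converted
    return ga


beta_section = {
    "hap": {"input": False, "output": True, "threshold": 0.5},
    "pii": {"input": True, "output": True, "mask": {"remove_entity_value": True}},
}
print(migrate_moderations(beta_section))
```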
    

Migrating /ml/v1-beta/text/tokenization

  1. Change the path from /ml/v1-beta/text/tokenization to /ml/v1/text/tokenization.

Migrating /ml/v1-beta/foundation_model_specs

  1. Change the path from /ml/v1-beta/foundation_model_specs to /ml/v1/foundation_model_specs.

Migrating /ml/v1-beta/foundation_model_tasks

  1. Change the path from /ml/v1-beta/foundation_model_tasks to /ml/v1/foundation_model_tasks.
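The path changes in the migration steps above can be collected into one lookup. The sketch below is illustrative; migrate_path is a hypothetical helper, and paths it does not recognize are returned unchanged.

```python
import re

# Beta paths without a deployment identifier, mapped to their GA equivalents.
SIMPLE_PATHS = {
    "/ml/v1-beta/generation/text": "/ml/v1/text/generation",
    "/ml/v1-beta/generation/text_stream": "/ml/v1/text/generation_stream",
    "/ml/v1-beta/text/tokenization": "/ml/v1/text/tokenization",
    "/ml/v1-beta/foundation_model_specs": "/ml/v1/foundation_model_specs",
    "/ml/v1-beta/foundation_model_tasks": "/ml/v1/foundation_model_tasks",
}

# Deployment paths carry an {id_or_name} segment, so they need a pattern.
DEPLOYMENT_PATH = re.compile(
    r"^/ml/v1-beta/deployments/([^/]+)/generation/(text|text_stream)$"
)


def migrate_path(path: str) -> str:
    """Map a beta path to its GA path; unknown paths are returned as-is."""
    match = DEPLOYMENT_PATH.match(path)
    if match:
        deployment, kind = match.groups()
        suffix = "generation" if kind == "text" else "generation_stream"
        return f"/ml/v1/deployments/{deployment}/text/{suffix}"
    return SIMPLE_PATHS.get(path, path)


print(migrate_path("/ml/v1-beta/deployments/my-deployment/generation/text"))
```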

Methods

Create a new AI service

Create a new AI service with the given payload. An AI service is code that can be deployed as a deployment.

POST /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

Payload for creating the AI service. Either space_id or project_id must be provided.

Examples:

A sample request.

{
  "name": "ai-app-1",
  "software_spec": {
    "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
  },
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "documentation": {
    "request": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "parameters": {
            "properties": {
              "max_new_tokens": {
                "type": "integer"
              },
              "top_p": {
                "type": "number"
              }
            },
            "required": [
              "max_new_tokens",
              "top_p"
            ]
          }
        },
        "required": [
          "query"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "image": {
            "type": "string",
            "format": "binary"
          }
        },
        "required": [
          "image"
        ]
      }
    },
    "response": {
      "application/json": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "object",
        "properties": {
          "query": {
            "type": "string"
          },
          "result": {
            "type": "string"
          }
        },
        "required": [
          "query",
          "result"
        ]
      },
      "application/png": {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "type": "string",
        "format": "binary"
      }
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v4/ai_services?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "name": "ai-service-1",
      "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "software_spec": {
        "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
      },
      "documentation": {
        "request": {
          "application/json": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
              "query": {
                "type": "string"
              },
              "parameters": {
                "properties": {
                  "max_new_tokens": {
                    "type": "integer"
                  },
                  "top_p": {
                    "type": "number"
                  }
                },
                "required": [
                  "max_new_tokens",
                  "top_p"
                ]
              }
            },
            "required": [
              "query"
            ]
          },
          "application/png": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
              "image": {
                "type": "string",
                "format": "binary"
              }
            },
            "required": [
              "image"
            ]
          }
        },
        "response": {
          "application/json": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "object",
            "properties": {
              "query": {
                "type": "string"
              },
              "result": {
                "type": "string"
              }
            },
            "required": [
              "query",
              "result"
            ]
          },
          "application/png": {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "type": "string",
            "format": "binary"
          }
        }
      }
    }'
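A request body like the one above could also be assembled in Python before it is sent. In this sketch, ai_service_payload is a hypothetical helper, and treating space_id and project_id as mutually exclusive is an assumption; the text only states that one of them must be provided.

```python
from typing import Optional


def ai_service_payload(
    name: str,
    software_spec_id: str,
    space_id: Optional[str] = None,
    project_id: Optional[str] = None,
    documentation: Optional[dict] = None,
) -> dict:
    """Assemble a request body for POST /ml/v4/ai_services."""
    # Assumption: exactly one of space_id or project_id is accepted.
    if (space_id is None) == (project_id is None):
        raise ValueError("provide exactly one of space_id or project_id")
    payload = {"name": name, "software_spec": {"id": software_spec_id}}
    if space_id is not None:
        payload["space_id"] = space_id
    else:
        payload["project_id"] = project_id
    if documentation is not None:
        payload["documentation"] = documentation
    return payload


body = ai_service_payload(
    "ai-service-1",
    "45f12dfe-aa78-5b8d-9f38-0ee223c47309",
    space_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(body)
```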

Response

The information for an AI service.

Status Code

  • AI service created

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The response with the result.

    {
      "metadata": {
        "id": "b53c5118-b1ca-43ef-a597-ef839ff7129f",
        "name": "ai-app-1",
        "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-05-02T16:27:51Z"
      },
      "entity": {
        "software_spec": {
          "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
        },
        "documentation": {
          "request": {
            "application/json": {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "object",
              "properties": {
                "query": {
                  "type": "string"
                },
                "parameters": {
                  "properties": {
                    "max_new_tokens": {
                      "type": "integer"
                    },
                    "top_p": {
                      "type": "number"
                    }
                  },
                  "required": [
                    "max_new_tokens",
                    "top_p"
                  ]
                }
              },
              "required": [
                "query"
              ]
            },
            "application/png": {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "object",
              "properties": {
                "image": {
                  "type": "string",
                  "format": "binary"
                }
              },
              "required": [
                "image"
              ]
            }
          },
          "response": {
            "application/json": {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "object",
              "properties": {
                "query": {
                  "type": "string"
                },
                "result": {
                  "type": "string"
                }
              },
              "required": [
                "query",
                "result"
              ]
            },
            "application/png": {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "string",
              "format": "binary"
            }
          }
        }
      }
    }

Retrieve the AI services

Retrieve the AI services for the specified space or project.

GET /ml/v4/ai_services

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • Return only the resources with the given tag values. To match multiple tags, separate the values with or or and.

    Example: tf2.0 or tf2.1

  • Returns only resources that match this search string. The path to the field must be the complete path to the field, and this field must be one of the indexed fields for this resource type. Note that the search string must be URL encoded.

    Possible values: length ≥ 1
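The query parameters above can be checked and encoded on the client before the call. The sketch below covers the version, space or project, and limit parameters; list_query and next_page_href are hypothetical helpers, the parameter names are inferred from the descriptions, and the response shape used for pagination (a next object with an href) is taken from the pagination token description.

```python
from typing import Optional
from urllib.parse import urlencode


def list_query(
    version: str,
    space_id: Optional[str] = None,
    project_id: Optional[str] = None,
    limit: int = 100,
) -> str:
    """Encode list parameters, enforcing the documented limit range."""
    if not (space_id or project_id):
        raise ValueError("either space_id or project_id has to be given")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"version": version}
    if space_id:
        params["space_id"] = space_id
    if project_id:
        params["project_id"] = project_id
    params["limit"] = limit
    return urlencode(params)


def next_page_href(page: dict) -> Optional[str]:
    """Return the href from the next field, or None on the last page."""
    return (page.get("next") or {}).get("href")


print(list_query("2023-07-07", space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f", limit=50))
```

Following the href returned by next_page_href, rather than building the pagination token by hand, matches the note that the token is generated by the service.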

Response

A paginated list of AI services.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service

Retrieve the AI service with the specified identifier. If the rev query parameter is provided, rev=latest fetches the latest revision and rev={revision_number} fetches the record for the given revision number. Either space_id or project_id must be provided.

GET /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.read

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • The revision number of the resource.

    Example: 2

Response

The information for an AI service.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Update the AI service

Update the AI service with the provided patch data. The following fields can be patched:

  • /tags
  • /name
  • /description
  • /custom

PATCH /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.update

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Input For Patch. This is the patch body which corresponds to the JavaScript Object Notation (JSON) Patch standard (RFC 6902).
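A JSON Patch body is a list of operations, each with an op, a path, and usually a value. The sketch below builds such a list and restricts paths to the patchable fields of this method. patch_operation is a hypothetical helper, the values are invented, and which RFC 6902 operations the service accepts is not stated here, so the choice of replace and add is an assumption.

```python
import json

# Fields that this method allows to be patched.
PATCHABLE_PATHS = ("/tags", "/name", "/description", "/custom")


def patch_operation(op: str, path: str, value=None) -> dict:
    """Build one RFC 6902 operation limited to the patchable fields."""
    allowed = any(path == p or path.startswith(p + "/") for p in PATCHABLE_PATHS)
    if not allowed:
        raise ValueError(f"{path} cannot be patched")
    operation = {"op": op, "path": path}
    # A remove operation carries no value in RFC 6902.
    if op != "remove":
        operation["value"] = value
    return operation


patch_body = [
    patch_operation("replace", "/name", "ai-service-renamed"),
    patch_operation("add", "/tags", ["demo"]),
]
print(json.dumps(patch_body, indent=2))
```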

Response

The information for an AI service.

Status Code

  • AI service has been patched successfully

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the AI service

Delete the AI service with the specified identifier. This also deletes all revisions of the AI service. For each revision, all attachments are also deleted.

DELETE /ml/v4/ai_services/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.delete

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Status Code

  • AI service deleted

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Upload the AI service code

Upload the AI service code. AI services expect a zip file that contains the code files that make up the AI service.

PUT /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.add

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

A gzip file containing code files.

Response

The metadata related to the attachment.

Status Code

  • AI service code uploaded

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Download the AI service code

Download the AI service code. It is possible to download the code for a given revision of the AI service. AI services expect a zip file that contains the code files that make up the AI service.

GET /ml/v4/ai_services/{id}/code

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.read

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • The revision number of the resource.

    Example: 2

Response

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new AI service revision

Create a new AI service revision. The current metadata and content for id are taken and a new revision is created. Either space_id or project_id must be provided.

POST /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.create

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The details for the revision.

Examples:
{
  "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f",
  "commit_message": "Updated for TF 2.0"
}

Response

The information for an AI service.

Status Code

  • AI service revision created

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AI service revisions

Retrieve the AI service revisions.

GET /ml/v4/ai_services/{id}/revisions

Auditing

Calling this method generates the following auditing event.

  • pm-20.ai_service.list

Request

Path Parameters

  • AI service identifier.

    Example: 64dc8921-345f-234b-462d-78e41246987f

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

A paginated list of AI services.

Status Code

  • AI service revisions

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a new watsonx.ai deployment

Create a new deployment. Currently the only supported type is online.

If this is a deployment for a prompt tune, the asset object must exist and the id must be the id of the model that was created after the prompt training.

If this is a deployment for a prompt template, the prompt_template object should exist and the id must be the id of the prompt template to be deployed.

POST /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The deployment request entity.

The following important fields are described for each use case:

  1. Prompt template:
    • base_model_id: required
    • prompt_template.id: required
    • online: required
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • response deployed_asset_type: foundation_model
  2. Prompt tune:
    • asset.id: required
    • online: required
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • base_model_id: forbidden
    • response deployed_asset_type: prompt_tune
  3. Custom foundation model:
    • asset.id: required
    • online: required
    • online.parameters.foundation_model: optional
    • hardware_spec: forbidden
    • hardware_request: required
    • base_model_id: forbidden
    • base_deployment_id: forbidden
    • response deployed_asset_type: custom_foundation_model

Examples:

Create a prompt tune deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "asset": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}

Create a prompt template deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "base_model_id": "google/flan-ul2",
  "prompt_template": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}

Create a custom foundation model deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "my_tuned_flan",
  "asset": {
    "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
  },
  "online": {
    "parameters": {
      "serving_name": "myflan"
    }
  }
}
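The required and forbidden fields listed for each use case can be checked on the client before a deployment request is sent. The following sketch transcribes those rules into a table; check_deployment and the use case keys are hypothetical names, and the check only tests whether a field is present, not whether its value is valid.

```python
# Required and forbidden request fields per use case, transcribed from the list above.
RULES = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "prompt_tune": {
        "required": ["asset.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request", "base_model_id"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_request"],
        "forbidden": ["hardware_spec", "base_model_id", "base_deployment_id"],
    },
}


def has_field(payload: dict, dotted: str) -> bool:
    """Check that a dotted path such as asset.id exists in the payload."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_deployment(payload: dict, use_case: str) -> list:
    """Return a list of rule violations; an empty list means none were found."""
    rules = RULES[use_case]
    problems = [f"missing required field: {f}" for f in rules["required"] if not has_field(payload, f)]
    problems += [f"forbidden field present: {f}" for f in rules["forbidden"] if has_field(payload, f)]
    return problems


prompt_tune = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "asset": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
    "online": {},
}
print(check_deployment(prompt_tune, "prompt_tune"))
```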

Response

A deployment resource.

Status Code

  • Deployment created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "my_tuned_flan"
      },
      "entity": {
        "asset": {
          "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
        },
        "online": {
          "parameters": {
            "serving_name": "myflan"
          }
        },
        "deployed_asset_type": "custom_foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }
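The inference array in the example responses above lists one URL for each combination of streaming (sse) and serving name (uses_serving_name). The following is a minimal Python sketch of choosing among them, assuming the response shape shown in those examples. The helper name pick_inference_url is hypothetical and is not part of any IBM library.

```python
def pick_inference_url(deployment, stream=False, use_serving_name=False):
    """Return the first inference URL that matches the requested flags, or None."""
    entries = deployment["entity"]["status"].get("inference", [])
    for entry in entries:
        # A missing flag is treated as false, as in the example responses.
        is_stream = bool(entry.get("sse", False))
        by_name = bool(entry.get("uses_serving_name", False))
        if is_stream == stream and by_name == use_serving_name:
            return entry["url"]
    return None


# Trimmed copy of the custom foundation model example response.
base = "https://us-south.ml.cloud.ibm.com/ml/v1/deployments"
example = {
    "entity": {
        "status": {
            "state": "ready",
            "inference": [
                {"url": base + "/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
                {"url": base + "/myflan/text/generation", "uses_serving_name": True},
                {"url": base + "/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": True},
                {"url": base + "/myflan/text/generation_stream", "sse": True, "uses_serving_name": True},
            ],
        }
    }
}

stream_by_name = pick_inference_url(example, stream=True, use_serving_name=True)
```

Treating a missing flag as false is a choice made for this sketch; the source does not state the default explicitly.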

Retrieve the deployments

Retrieve the list of deployments for the specified space or project.

GET /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Retrieves the deployment, if any, that contains this serving_name.

    Example: classification

  • Retrieves only the resources with the given tag value.

  • Retrieves only the resources with the given asset_id. The asset_id is the model id.

  • Retrieves only the resources with the given prompt_template_id.

  • Retrieves only the resources with the given name.

  • Retrieves the resources filtered by the given type. Valid values are the deployment types, plus an additional prompt_template value that matches deployments that include a prompt template.

    The supported deployment types are (see the description for deployed_asset_type in the deployment entity):

    1. prompt_tune - when a prompt tuned model is deployed.
    2. foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
    3. custom_foundation_model - when a custom foundation model is deployed.

    These can be combined with the flag prompt_template like this:

    1. type=prompt_tune - return all prompt tuned model deployments.
    2. type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
    3. type=foundation_model - return all prompt template deployments.
    4. type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
    5. type=prompt_template - return all deployments with a prompt template.

  • Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.

  • Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.

    Default: false
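The query parameters above can be assembled into a request URL as in the following Python sketch. It assumes the parameter names version, space_id, project_id, type and state (only some of which are named explicitly in this section) and uses the Dallas base URL. The function list_deployments_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://us-south.ml.cloud.ibm.com"  # Dallas endpoint


def list_deployments_url(version, space_id=None, project_id=None, **filters):
    """Build the GET /ml/v4/deployments URL with its query string."""
    # Either space_id or project_id has to be given.
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")
    params = {"version": version}
    if space_id is not None:
        params["space_id"] = space_id
    if project_id is not None:
        params["project_id"] = project_id
    # Extra filters such as type or state are passed through unchanged;
    # their parameter names are assumptions in this sketch.
    params.update({key: value for key, value in filters.items() if value is not None})
    return f"{BASE_URL}/ml/v4/deployments?{urlencode(params)}"


url = list_deployments_url(
    "2023-07-07",
    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    type="prompt_tune",
    state="ready",
)
```

The sketch only builds the URL; sending the request still requires the Authorization header described in the Authentication section.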

Response

The deployment resources.

Status Code

  • OK.

  • serving_name is available for use. Returned when serving_name and conflict query parameters are used.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

  • Returned when serving_name and conflict query parameters are used. The response body will contain the reason.

Example responses
  • {
      "limit": 10,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments"
      },
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "created_at": "2023-05-02T16:27:51Z",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "name": "text_classification",
            "description": "Classification prompt tuned model deployment",
            "tags": [
              "classification"
            ]
          },
          "entity": {
            "asset": {
              "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
            },
            "deployed_asset_type": "prompt_tune",
            "online": {},
            "base_model_id": "google/flan-t5-xl",
            "status": {
              "state": "ready",
              "message": {
                "level": "info",
                "text": "The deployment is successful"
              },
              "inference": [
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
                },
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
                  "sse": true
                }
              ]
            }
          }
        }
      ]
    }

Retrieve the deployment details

Retrieve the deployment details with the specified identifier.

GET /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.read

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

A deployment resource.

Status Code

  • Deployment details.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "my_tuned_flan"
      },
      "entity": {
        "asset": {
          "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
        },
        "online": {
          "parameters": {
            "serving_name": "myflan"
          }
        },
        "deployed_asset_type": "custom_foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }
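Because status.state can be initializing, updating, ready or failed, a client will typically poll this endpoint until the deployment settles. The following Python sketch shows one way to do that. The fetch_deployment argument is a hypothetical callable that performs the GET request and returns the parsed JSON body; the polling interval and retry count are arbitrary assumptions.

```python
import time


def wait_until_ready(fetch_deployment, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until status.state is 'ready'; raise if it fails or never settles."""
    for _ in range(max_polls):
        deployment = fetch_deployment()
        state = deployment["entity"]["status"]["state"]
        if state == "ready":
            return deployment
        if state == "failed":
            raise RuntimeError("deployment failed")
        # Any other state (initializing, updating) means: wait and try again.
        sleep(poll_seconds)
    raise TimeoutError("deployment did not become ready in time")


# Usage with a stand-in for the real HTTP call.
_states = iter(["initializing", "updating", "ready"])


def _fake_fetch():
    return {"entity": {"status": {"state": next(_states)}}}


ready = wait_until_ready(_fake_fetch, sleep=lambda seconds: None)
```

Passing sleep as a parameter keeps the loop easy to exercise without real delays.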

Update the deployment metadata

Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.

  • /name
  • /description
  • /tags
  • /custom
  • /online/parameters
  • /asset - replace only
  • /prompt_template - replace only
  • /hardware_spec
  • /hardware_request
  • /base_model_id - replace only (applicable only to prompt template deployments referring to IBM base foundation models)

The PATCH operation with path specified as /online/parameters can be used to update the serving_name.

PATCH /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.update

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

The JSON Patch document.
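As an illustration of the request body, the following Python sketch builds a JSON Patch document (in the RFC 6902 style of op, path and value) that updates the serving_name through the /online/parameters path. The use of the replace operation and the helper name serving_name_patch are assumptions; the source only states that this path is supported.

```python
import json


def serving_name_patch(serving_name):
    """Build a JSON Patch document that replaces /online/parameters."""
    return [
        {
            "op": "replace",  # assumed operation; "add" may also be accepted
            "path": "/online/parameters",
            "value": {"serving_name": serving_name},
        }
    ]


# Serialized body for PATCH /ml/v4/deployments/{deployment_id}
body = json.dumps(serving_name_patch("myflan"))
```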

Response

A deployment resource.

Status Code

  • Deployment accepted

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the deployment

Delete the deployment with the specified identifier.

DELETE /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.delete

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Status Code

  • Deployment deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

A prompt tune request.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "how far is paris from bangalore:\n",
  "parameters": {
    "max_new_tokens": 100
  }
}

A prompt tune request with moderations.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}

A prompt template request.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "how far is paris from bangalore:\n",
  "parameters": {
    "max_new_tokens": 100
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'
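The same request can be prepared from Python with only the standard library. This is a minimal sketch under stated assumptions: the function build_generation_request is hypothetical, the token is a placeholder, and the request is built but not sent.

```python
import json
import urllib.request


def build_generation_request(base_url, id_or_name, token, payload, version="2023-05-02"):
    """Prepare (but do not send) a POST to the text generation endpoint."""
    url = f"{base_url}/ml/v1/deployments/{id_or_name}/text/generation?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


request = build_generation_request(
    "https://us-south.ml.cloud.ibm.com",
    "myflan",  # a deployment id or a serving_name
    "{token}",  # placeholder: replace with a real IAM access token
    {
        "input": "how far is paris from bangalore:",
        "parameters": {"max_new_tokens": 100, "time_limit": 1000},
    },
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Serializing the payload with json.dumps avoids hand-written JSON mistakes such as trailing commas.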

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details for a prompt tune.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details for a prompt tune with moderations.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
  • The generated text from the model along with other details for a prompt template.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
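When moderations are enabled, each result can carry a moderations.pii list like the one in the second example response. The following Python sketch collects those findings, assuming that response shape. The helper name pii_findings is hypothetical.

```python
def pii_findings(response):
    """Collect (entity, start, end) tuples from moderations.pii of every result."""
    findings = []
    for result in response.get("results", []):
        for item in result.get("moderations", {}).get("pii", []):
            position = item["position"]
            findings.append((item["entity"], position["start"], position["end"]))
    return findings


# Trimmed copy of the moderated example response.
moderated = {
    "model_id": "google/flan-t5-xl",
    "results": [
        {
            "generated_text": "...",
            "moderations": {
                "pii": [
                    {"score": 0.8, "input": False, "position": {"start": 74, "end": 88}, "entity": "PhoneNumber"},
                    {"score": 0.8, "input": False, "position": {"start": 244, "end": 259}, "entity": "EmailAddress"},
                ]
            },
        }
    ],
}

found = pii_findings(moderated)
```

Results without a moderations field, as in the unmoderated examples, contribute no findings.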

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned, and rank and top_tokens are not returned either.

POST /ml/v1/deployments/{id_or_name}/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:
{
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "decoding_method": "sample",
    "temperature": 0.8,
    "max_new_tokens": 200
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'

Response

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
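A client can read such a stream line by line and decode each data: payload. The following Python sketch assumes one complete JSON object per data: line and assumes that each event carries a results array with generated_text, mirroring the non-streaming response; neither assumption is confirmed by this section. The helper name iter_sse_events is hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield the parsed JSON payload of each 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        yield json.loads(line[len("data:"):].strip())


# Illustrative stream content; real events may contain more fields.
sample_stream = [
    'data: {"results": [{"generated_text": "4,000"}]}',
    "",
    b'data: {"results": [{"generated_text": " km"}]}',
]

text = "".join(
    event["results"][0]["generated_text"] for event in iter_sse_events(sample_stream)
)
```

In practice the lines would come from iterating over the HTTP response object instead of a list.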

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

List the available foundation models

Retrieve the list of deployed foundation models.

GET /ml/v1/foundation_model_specs

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user; it is generated by the service and set in the href available in the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • A set of filters to specify the list of models. Filters are described by the pattern shown below.

     pattern: tfilter[,tfilter][:(or|and)]
     tfilter: filter | !filter
       filter: Requires existence of the filter.
       !filter: Requires absence of the filter.
     filter: one of
       modelid_*:     Filters by model id.
                      Namely, select a model with a specific model id.
       provider_*:    Filters by provider.
                      Namely, select all models with a specific provider.
       source_*:      Filters by source.
                      Namely, select all models with a specific source.
       input_tier_*:  Filters by input tier.
                      Namely, select all models with a specific input tier.
       output_tier_*: Filters by output tier.
                      Namely, select all models with a specific output tier.
       tier_*:        Filters by tier.
                      Namely, select all models with a specific input or output tier.
       task_*:        Filters by task id.
                      Namely, select all models that support a specific task id.
       lifecycle_*:   Filters by lifecycle state.
                      Namely, select all models that are currently in the specified lifecycle state.
       function_*:    Filters by function. 
                      Namely, select all models that support a specific function.
    

    Possible values: 1 ≤ length ≤ 1000, Value must match regular expression ^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$

    Example: modelid_ibm/granite-13b-instruct-v2

  • See all the Tech Preview models if entitled.

    Default: false
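The filters pattern described above can be assembled and checked locally before calling the API. The following Python sketch joins individual filters, optionally appends :or or :and, and validates the result against the documented regular expression and length limit. The helper name build_filters and the filter values task_code and provider_BigCode are illustrative assumptions.

```python
import re

# Regular expression and length limit as documented for the filters parameter.
FILTER_RE = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")


def build_filters(filters, combine=None):
    """Join filters with commas, optionally append :or or :and, then validate."""
    if combine not in (None, "or", "and"):
        raise ValueError("combine must be 'or', 'and' or None")
    value = ",".join(filters)
    if combine is not None:
        value += f":{combine}"
    if not 1 <= len(value) <= 1000 or not FILTER_RE.match(value):
        raise ValueError(f"invalid filters value: {value!r}")
    return value


single = build_filters(["modelid_ibm/granite-13b-instruct-v2"])
combined = build_filters(["task_code", "!provider_BigCode"], combine="and")
```

A leading ! requires the absence of a filter, so the second call asks for models that support the code task and are not provided by BigCode, if those filter values exist on the service.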

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The models that are currently deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02"
      },
      "resources": [
        {
          "model_id": "bigcode/starcoder",
          "label": "starcoder-15.5b",
          "provider": "BigCode",
          "source": "Hugging Face",
          "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions",
          "tasks": [
            {
              "id": "code",
              "ratings": {
                "quality": 3
              }
            }
          ],
          "min_shot_size": 0,
          "input_tier": "class_2",
          "output_tier": "class_2",
          "number_params": "15.5b"
        }
      ]
    }
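List responses are paginated: the pagination token is carried in the href of the next field. The following Python sketch walks all pages, assuming that a next object with an href is present whenever more results exist and absent on the last page. The fetch_page argument is a hypothetical callable that performs the GET request and returns the parsed JSON body.

```python
def iter_resources(fetch_page, first_url):
    """Yield every resource, following next.href until no next link remains."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # Assumed shape: {"next": {"href": "..."}} when another page exists.
        url = page.get("next", {}).get("href")


# Usage with canned pages standing in for real HTTP responses.
fake_pages = {
    "page-1": {"resources": [{"model_id": "bigcode/starcoder"}], "next": {"href": "page-2"}},
    "page-2": {"resources": [{"model_id": "google/flan-ul2"}]},
}

model_ids = [item["model_id"] for item in iter_resources(fake_pages.__getitem__, "page-1")]
```

Following the href as returned avoids constructing the pagination token by hand, which the service does not allow.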

List the supported tasks

Retrieve the list of tasks that are supported by the foundation models.

GET /ml/v1/foundation_model_tasks

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user; it is generated by the service and set in the href available in the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The tasks that are currently supported by models deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02"
      },
      "resources": [
        {
          "task_id": "question_answering",
          "label": "Question answering",
          "rank": 1,
          "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance."
        }
      ]
    }

Create a new notebook.

Create a new notebook, either from scratch or by copying another notebook.

To create a notebook from scratch, first upload the notebook content (ipynb format) to the project Cloud Object Storage (COS), and then reference it with the attribute file_reference. The other required attributes are name, project and runtime. The attribute runtime specifies the environment on which the notebook runs.

To copy a notebook, you only need to provide name and source_guid in the request body.

POST /v2/notebooks

Request

Specification of the notebook to be created.

Example:

Create a notebook from scratch in a project

{
  "name": "my notebook",
  "description": "this is my notebook",
  "project": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
  "file_reference": "notebook/my_notebook.ipynb",
  "runtime": {
    "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "spark_monitoring_enabled": true
  }
}
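The two creation modes differ only in the request body. The following Python sketch builds both payloads from the attributes named above; the helper names notebook_from_scratch and notebook_copy are hypothetical, and the new notebook name used for the copy is illustrative.

```python
def notebook_from_scratch(name, project, file_reference, environment, description=None):
    """Payload for a new notebook whose ipynb content is already uploaded to COS."""
    payload = {
        "name": name,
        "project": project,
        "file_reference": file_reference,
        "runtime": {"environment": environment},
    }
    if description is not None:
        payload["description"] = description
    return payload


def notebook_copy(name, source_guid):
    """Payload for copying an existing notebook; only name and source_guid are needed."""
    return {"name": name, "source_guid": source_guid}


scratch = notebook_from_scratch(
    "my notebook",
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "notebook/my_notebook.ipynb",
    "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    description="this is my notebook",
)
copy = notebook_copy("my notebook copy", "ca3c0e27-46ca-83d4-a646-d49b11c14de9")
```

Either dictionary would be serialized as JSON and sent as the body of POST /v2/notebooks.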

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Created and returned a new notebook asset. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

  • The number of requests has exceeded the rate limit.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "notebook",
            "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "rate_limit",
          "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later."
        }
      ]
    }
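
The `invalid_type` example above rejects `12345` because the project field must be a version 4 UUID. A client can catch this kind of mistake before sending a request. The following is a minimal sketch using Python's standard `uuid` module; the helper name `is_uuid4` is illustrative and is not part of the API.

```python
import uuid


def is_uuid4(value):
    """Return True if value parses as an RFC 4122 version 4 UUID."""
    try:
        return uuid.UUID(value).version == 4
    except (ValueError, AttributeError, TypeError):
        # Not a string, or not a well-formed UUID at all.
        return False


print(is_uuid4("b275be5f-10ff-47ee-bfc9-63f1ce5addbf"))
print(is_uuid4("12345"))
```

Note that `uuid.UUID` also accepts UUIDs written without hyphens or wrapped in braces, so this check is more lenient than the service may be.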

Retrieve the details of a large number of notebooks inside a project.

POST /v2/notebooks/list

Request

Query Parameters

  • The guid of the project.

  • Additional information to include in the notebook details. Possible values are:

    • runtime

Payload for a notebook list request.

Examples:

List notebooks

{
  "notebooks": [
    "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
  ]
}

Response

A list of notebook info as returned by a list query.

Status Code

  • Success. Returned a list of notebook assets. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "41d09a9a-f771-48a2-9534-50c0c622356d",
            "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d"
          },
          "entity": {
            "runtime": {
              "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "spark_monitoring_enabled": true
            },
            "asset": {
              "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
              "asset_type": "notebook",
              "created_at": "2021-07-01T12:37:01Z",
              "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
              "version": 2,
              "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
            }
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
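
As a rough illustration, the list request above could be assembled in Python with the standard library only. The base URL is the Dallas URL given for notebooks in the Endpoint URLs section. The query parameter names `project_id` and `include` are assumptions, because this reference does not show the parameter names; the function name `build_list_request` is also illustrative.

```python
import json
import urllib.parse
import urllib.request

# Dallas base URL for notebooks, from the "Endpoint URLs" section.
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"


def build_list_request(token, project_guid, notebook_guids, include_runtime=False):
    """Build (but do not send) a POST /v2/notebooks/list request."""
    # ASSUMPTION: "project_id" and "include" are hypothetical parameter names.
    query = {"project_id": project_guid}
    if include_runtime:
        query["include"] = "runtime"
    url = f"{BASE_URL}/v2/notebooks/list?{urllib.parse.urlencode(query)}"
    body = json.dumps({"notebooks": list(notebook_guids)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_list_request(
    token="{token}",
    project_guid="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    notebook_guids=["ca3c0e27-46ca-83d4-a646-d49b11c14de9"],
    include_runtime=True,
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without credentials.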

Delete a particular notebook, including the notebook asset.

DELETE /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Response

Status Code

  • Successful request. Notebook is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
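
The error responses above share one shape: a `trace` identifier and an `errors` list whose items carry a `code`, a `message`, and sometimes a `target`. The sketch below shows one way a client might flatten that shape for logging. The function name `summarize_errors` is illustrative; the field names come from the examples above.

```python
import json


def summarize_errors(response_text):
    """Return (trace, [(code, message, target_name_or_None), ...])."""
    payload = json.loads(response_text)
    summary = []
    for error in payload.get("errors", []):
        # "target" is optional: the endpoint_access_forbidden example omits it.
        target = error.get("target") or {}
        summary.append((error.get("code"), error.get("message"), target.get("name")))
    return payload.get("trace"), summary


sample = """
{
  "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [
    {
      "code": "invalid_auth_token",
      "message": "The IAM bearer token is not valid.",
      "target": {"type": "header", "name": "Authentication"}
    }
  ]
}
"""
trace, errors = summarize_errors(sample)
for code, message, target_name in errors:
    print(f"[{trace}] {code}: {message} (target: {target_name})")
```

Keeping the `trace` value with each logged error makes it easier to quote when reporting a problem.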

Revert the main notebook to a version.

PUT /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the main notebook.

Payload for a request to revert to a specific notebook version.

Examples:

Revert the notebook to a version

{
  "source": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
}

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Reverted the main notebook to a version. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook v4.2",
        "description": "this is my notebook v4.2",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Update a particular notebook.

PATCH /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Payload for a notebook update request.

Examples:

Update a notebook

{
  "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
  "spark_monitoring_enabled": false,
  "kernel": {
    "display_name": "Python 3.9 with Spark",
    "name": "python39",
    "language": "python3"
  }
}

Response

Notebook information as returned by a GET request.

Status Code

  • Success. Updated the notebook. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
          "spark_monitoring_enabled": false
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
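
The update call above can be sketched as a small request builder. This version sends only the fields the caller passes, on the assumption that the service leaves omitted fields unchanged; the reference does not confirm that. The set of accepted field names is taken from the example payload, and the service may accept others. The function name `build_update_request` is illustrative.

```python
import json
import urllib.parse
import urllib.request

# Dallas base URL for notebooks, from the "Endpoint URLs" section.
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"

# Fields that appear in the example payload above.
KNOWN_FIELDS = {"environment", "spark_monitoring_enabled", "kernel"}


def build_update_request(token, notebook_guid, **changes):
    """Build (but do not send) a PATCH /v2/notebooks/{notebook_guid} request."""
    unknown = set(changes) - KNOWN_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    # ASSUMPTION: omitted fields keep their current values on the server.
    guid = urllib.parse.quote(notebook_guid, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v2/notebooks/{guid}",
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_update_request(
    "{token}",
    "41d09a9a-f771-48a2-9534-50c0c622356d",
    spark_monitoring_enabled=False,
)
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```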

Create a new version.

Create a version of a given notebook.

POST /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A notebook version in a project.

Status Code

  • Success. Returned the notebook version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

List the versions of a notebook.

List all versions of a particular notebook.

GET /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A list of notebook versions in a project.

Status Code

  • Success. Returned a list of versions of the notebook.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
            "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
            "created_at": 1543681714106
          },
          "entity": {
            "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
            "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
            "created_by_iui": "IBMid-123456ABCD",
            "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
            "rev_id": 1
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
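
A client that needs the most recent version can pick it out of the list response. The sketch below assumes that `created_at` in the version metadata is a count of milliseconds since the Unix epoch, which matches the magnitude of the sample value but is not stated in this reference. The function name `newest_version` is illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def newest_version(listing):
    """Return (guid, created_at) of the most recently created version, or None."""
    versions = [
        (
            item["metadata"]["guid"],
            # ASSUMPTION: created_at is milliseconds since the Unix epoch.
            EPOCH + timedelta(milliseconds=item["metadata"]["created_at"]),
        )
        for item in listing.get("resources", [])
    ]
    return max(versions, key=lambda pair: pair[1]) if versions else None


# Trimmed copy of the success example above.
listing = {
    "total_results": 1,
    "resources": [
        {
            "metadata": {
                "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
                "created_at": 1543681714106,
            },
            "entity": {"rev_id": 1},
        }
    ],
}
guid, created = newest_version(listing)
print(guid, created.isoformat())
```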

Retrieve a notebook version.

Retrieve a particular version of a notebook.

GET /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

A notebook version in a project.

Status Code

  • Success. Returned the version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Delete a notebook version.

Delete a particular version of a given notebook.

DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

Status Code

  • Success. The version is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Create a new prompt / prompt template

This creates a new prompt with the provided parameters.

POST /v1/prompts

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
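
The prompt endpoints take a project or a space as the target, and each ID must match `[a-zA-Z0-9-]*`. The sketch below builds the query string and checks both rules on the client side. It reads "one target must be supplied per request" as exactly one of the two, and it assumes the query parameters are named `project_id` and `space_id`; neither point is confirmed by this reference. The function name `target_query` is illustrative.

```python
import re
import urllib.parse

# Dallas base URL for prompts, from the "Endpoint URLs" section.
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def target_query(project_id=None, space_id=None):
    """Return the query string for exactly one target."""
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    # ASSUMPTION: "project_id" and "space_id" are hypothetical parameter names.
    if project_id is not None:
        name, value = "project_id", project_id
    else:
        name, value = "space_id", space_id
    if not ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must match [a-zA-Z0-9-]*")
    return urllib.parse.urlencode({name: value})


url = f"{BASE_URL}/v1/prompts?{target_query(project_id='b275be5f-10ff-47ee-bfc9-63f1ce5addbf')}"
print(url)
```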

Get a prompt

This retrieves a prompt / prompt template with the given id.

GET /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Only return a set of model parameters compatible with inferencing

    Default: true

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt

This updates a prompt / prompt template with the given id.

PATCH /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt

This deletes a prompt / prompt template with the given id.

DELETE /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt lock modifications

Modifies the current locked state of a prompt.

PUT /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • OK - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt lock status

Retrieves the current locked state of a prompt.

GET /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get the inference input string for a given prompt

Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.

POST /v1/prompts/{prompt_id}/input

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new chat item to a prompt

This adds new chat items to the given prompt.

POST /v1/prompts/{prompt_id}/chat_items

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Create a new prompt session

This creates a new prompt session.

POST /v1/prompt_sessions

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt session

This retrieves a prompt session with the given id.

GET /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Include the most recent entry

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt session

This updates a prompt session with the given id.

PATCH /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt session

This deletes a prompt session with the given id.

DELETE /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new prompt to a prompt session

This creates a new prompt associated with the given session.

POST /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get entries for a prompt session

List entries from a given session.

GET /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Bookmark from a previously limited get request

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Limit for results to retrieve, default 20

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Success - Returned when search completes

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
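
Because entries are returned in limited pages with a bookmark for the next page, a client usually loops until no bookmark comes back. The sketch below shows that loop in isolation. This reference does not document the response body, so the response keys `results` and `bookmark` are hypothetical, as are the keyword names passed to `fetch_page`. A stand-in function replaces the real HTTP call so the example runs offline.

```python
def iter_entries(fetch_page, limit=20):
    """Yield every entry by following bookmarks until none is returned."""
    bookmark = None
    while True:
        page = fetch_page(bookmark=bookmark, limit=limit)
        # ASSUMPTION: "results" and "bookmark" are hypothetical response keys.
        yield from page.get("results", [])
        bookmark = page.get("bookmark")
        if not bookmark:
            return


# Stand-in for GET /v1/prompt_sessions/{session_id}/entries.
FAKE_PAGES = {
    None: {"results": ["entry-1", "entry-2"], "bookmark": "page-2"},
    "page-2": {"results": ["entry-3"]},
}


def fake_fetch_page(bookmark, limit):
    return FAKE_PAGES[bookmark]


entries = list(iter_entries(fake_fetch_page, limit=2))
print(entries)
```

In real use, `fetch_page` would issue the GET request with the bookmark and limit as query parameters and return the decoded JSON body.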

Add a new chat item to a prompt session entry

This adds new chat items to the given entry.

POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt session lock modifications

Modifies the current locked state of a prompt session.

PUT /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • OK - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt session lock status

Retrieves the current locked state of a prompt session.

GET /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when the authorization token is missing or invalid

No Sample Response

This method does not specify any sample responses.

Get a prompt session entry

This retrieves a prompt session entry with the given id.

GET /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when the authorization token is missing or invalid

No Sample Response

This method does not specify any sample responses.

Delete a prompt session entry

This deletes a prompt session entry with the given id.

DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when the authorization token is missing or invalid

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters.

POST /ml/v1/text/chat

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-chat.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

text_chat

A text chat example.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Who won the world series in 2020?"
    },
    {
      "role": "assistant",
      "content": "The Los Angeles Dodgers won the World Series in 2020."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Where was it played?"
        }
      ]
    }
  ],
  "max_tokens": 100,
  "temperature": 0,
  "time_limit": 1000
}

tool_call

A tool calling example.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the weather like in Boston today?"
        }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "description": "The city, e.g. San Francisco, CA",
              "type": "string"
            },
            "unit": {
              "enum": [
                "celsius",
                "fahrenheit"
              ],
              "type": "string"
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
    }
  }
}

json_mode

A text chat example with json output.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    }
  ]
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Who won the world series in 2020?"
            }
          ]
        },
        {
          "role": "assistant",
          "content": "The Los Angeles Dodgers won the World Series in 2020."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Where was it played?"
            }
          ]
        }
      ],
      "max_tokens": 100,
      "temperature": 0,
      "time_limit": 1000
    }'
    
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "What is the weather like in Boston today?"
            }
          ]
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "description": "The city, e.g. San Francisco, CA",
                  "type": "string"
                },
                "unit": {
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ],
                  "type": "string"
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "tool_choice": {
        "type": "function",
        "function": {
          "name": "get_current_weather",
          "description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks What is the weather like in New York"
        }
      }
    }'
    
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "response_format": {
        "type": "json_object"
      },
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant designed to output JSON."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Who won the world series in 2020?"
            }
          ]
        }
      ]
    }'
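
For readers who prefer Python over curl, the request above can be approximated with the standard library alone. This is a minimal sketch, not the official client: build_chat_request is a hypothetical helper, the token value is a placeholder, and the request is only prepared, not sent.

```python
import json
import urllib.request


def build_chat_request(cluster_url: str, token: str, body: dict,
                       version: str = "2023-10-25") -> urllib.request.Request:
    """Prepare (but do not send) a POST request to /ml/v1/text/chat."""
    url = f"https://{cluster_url}/ml/v1/text/chat?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


body = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",
         "content": [{"type": "text",
                      "text": "Who won the world series in 2020?"}]},
    ],
    "max_tokens": 100,
}
request = build_chat_request("us-south.ml.cloud.ibm.com", "<your IAM token>", body)
# Sending is left to the caller, for example:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```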
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • A text chat example.

    {
      "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 47,
        "prompt_tokens": 59,
        "total_tokens": 106
      }
    }
  • A tool calling example.

    {
      "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "tool_calls": [
              {
                "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                "type": "function",
                "function": {
                  "name": "get_current_weather",
                  "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n"
                }
              }
            ]
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 18,
        "prompt_tokens": 19,
        "total_tokens": 37
      }
    }
  • A text chat example with json output.

    {
      "id": "cmpl-09945b25c805491fb49e15439b8e5d84",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 35,
        "prompt_tokens": 20,
        "total_tokens": 55
      }
    }
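
In the tool calling response above, function.arguments is a JSON-encoded string rather than a nested object, so it needs a second decoding step. The sketch below shows one way to read either kind of response; summarize_choice is a hypothetical helper, and the sample dictionary reproduces only the fields it uses.

```python
import json


def summarize_choice(response: dict, index: int = 0) -> dict:
    """Return the text content and decoded tool calls of one choice."""
    message = response["choices"][index]["message"]
    tool_calls = []
    for call in message.get("tool_calls", []):
        function = call["function"]
        tool_calls.append({
            "name": function["name"],
            # "arguments" arrives as a JSON-encoded string, so decode it.
            "arguments": json.loads(function["arguments"]),
        })
    return {"content": message.get("content"), "tool_calls": tool_calls}


sample = {
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "tool_calls": [{
                "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                "type": "function",
                "function": {
                    "name": "get_current_weather",
                    "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n",
                },
            }],
        },
        "finish_reason": "stop",
    }],
}
summary = summarize_choice(sample)
print(summary["tool_calls"][0]["arguments"]["location"])  # Boston, MA
```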

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

POST /ml/v1/text/chat_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-chat.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Response

A stream of server-sent events, where each event contains a response for one or more tokens. The results are a sequence of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
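
Because no sample response is given, the following sketch illustrates how a client might decode the data: {<json event>} lines described above. It assumes each event carries its JSON payload on a single data: line, which is a simplification of the general server-sent events format, and iter_sse_events is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each data: line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)
```

The helper accepts any iterable of text lines, so it can be fed from a decoded HTTP response body or from a list of strings when testing.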

Generate embeddings

Generate embeddings from text input.

See the documentation for a description of text embeddings.

POST /ml/v1/text/embeddings

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-embeddings.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The text input for a given model to be used to generate the embeddings.

Examples:

A sample request.

A simple request.

{
  "model_id": "slate",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    "Youth craves thrills while adulthood cherishes wisdom."
  ]
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "inputs": [
        "Youth craves thrills while adulthood cherishes wisdom.",
        "Youth seeks ambition while adulthood finds contentment.",
        "Dreams chased in youth while goals pursued in adulthood."
      ],
      "model_id": "slate",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • An array of embeddings for each input string.

    {
      "model_id": "slate",
      "results": [
        {
          "embedding": [
            -0.006929283,
            -0.005336422,
            -0.024047505
          ]
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 10
    }
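
A common use of the returned vectors is comparing inputs by cosine similarity. The sketch below assumes the response shape shown above (one embedding per entry in results); the helper names are hypothetical, and the short vectors in the usage lines are made up for illustration.

```python
import math
from typing import List, Sequence


def embeddings_from(response: dict) -> List[List[float]]:
    """Collect the embedding vectors from an embeddings response."""
    return [item["embedding"] for item in response["results"]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


# Made-up vectors standing in for two returned embeddings.
response = {"results": [{"embedding": [1.0, 0.0]}, {"embedding": [0.0, 1.0]}]}
first, second = embeddings_from(response)
score = cosine_similarity(first, second)
```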

Start a text extraction request

Start a request to extract text and metadata from documents.

See the documentation for a description of text extraction.

POST /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input for the text extraction request.

Examples:

simple_request

A simple request.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "tables_processing": {
      "enabled": true
    }
  }
}

ocr_request

A request that sets the OCR languages and disables table processing.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "ocr": {
      "languages_list": [
        "en",
        "fr"
      ]
    },
    "tables_processing": {
      "enabled": false
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "document_reference": {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "file_name": "files/document.pdf"
        }
      },
      "results_reference": {
        "type": "connection_asset",
        "connection": {
          "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
        },
        "location": {
          "file_name": "results"
        }
      },
      "steps": {
        "tables_processing": {
          "enabled": true
        }
      }
    }'
    
  • curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "document_reference": {
        "type": "connection_asset",
        "connection": {
          "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
        },
        "location": {
          "file_name": "files/document.pdf"
        }
      },
      "results_reference": {
        "type": "connection_asset",
        "connection": {
          "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
        },
        "location": {
          "file_name": "results"
        }
      },
      "steps": {
        "ocr": {
          "languages_list": [
            "en"
          ]
        },
        "tables_processing": {
          "enabled": false
        }
      }
    }'
    

Response

The text extraction response.

Status Code

  • Created. The Content-Location header will contain the URI reference to the created resource.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "tables_processing": {
            "enabled": true
          }
        },
        "results": {
          "status": "submitted",
          "number_pages_processed": 0
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "ocr": {
            "languages_list": [
              "en",
              "fr"
            ]
          },
          "tables_processing": {
            "enabled": false
          }
        },
        "results": {
          "status": "submitted",
          "number_pages_processed": 0
        }
      }
    }

Retrieve the text extraction requests

Retrieve the list of text extraction requests for the specified space or project.

This operation does not save the history, so any requests that were deleted or purged will not appear in this list.

GET /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Token required for token-based pagination. The end user cannot construct this token; the service generates it and sets it in the href of the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50
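
The constraints above can be checked on the client before the request is sent. The following is a sketch under stated assumptions: the parameter names version and limit are inferred from the curl examples and the descriptions in this section, reading "either" as "exactly one of" is an interpretation, and extraction_list_query is a hypothetical helper.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Pattern documented for space_id and project_id.
ID_RE = re.compile(r"[a-zA-Z0-9-]*")


def extraction_list_query(version: str, *, space_id: Optional[str] = None,
                          project_id: Optional[str] = None,
                          limit: int = 100) -> str:
    """Build the query string for GET /ml/v1/text/extractions."""
    # Assumption: exactly one of the two targets is expected.
    if (space_id is None) == (project_id is None):
        raise ValueError("supply exactly one of space_id or project_id")
    if space_id is not None:
        name, value = "space_id", space_id
    else:
        name, value = "project_id", project_id
    if len(value) != 36 or not ID_RE.fullmatch(value):
        raise ValueError(f"{name} must be 36 characters matching [a-zA-Z0-9-]")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    return urlencode({"version": version, name: value, "limit": limit})
```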

Response

A paginated list of resources.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "limit": 10,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text/extractions"
      },
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "created_at": "2023-05-02T16:27:51Z",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "name": "extract"
          },
          "entity": {
            "document_reference": {
              "type": "connection_asset",
              "connection": {
                "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
              },
              "location": {
                "file_name": "files/document.pdf"
              }
            },
            "results_reference": {
              "type": "connection_asset",
              "connection": {
                "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
              },
              "location": {
                "file_name": "results"
              }
            },
            "results": {
              "status": "completed",
              "number_pages_processed": 3,
              "running_at": "2023-05-02T16:28:03Z",
              "completed_at": "2023-05-02T16:29:31Z"
            }
          }
        }
      ]
    }
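
Since the pagination token is delivered inside the href of the next field, a client can walk the list by following that link until it is absent. The sketch below assumes that behavior; iter_resources is a hypothetical helper, and fetch_page stands for any caller-supplied function that performs the authenticated GET and returns the decoded JSON page.

```python
from typing import Callable, Iterator, Optional


def iter_resources(first_url: str,
                   fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every resource, following next.href from page to page."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The service places the continuation token inside next.href.
        url = (page.get("next") or {}).get("href")
```

Passing the fetch function in keeps authentication and error handling with the caller and makes the loop easy to exercise with canned pages.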

Get the results of the request

Retrieve the text extraction request with the specified identifier.

Note that there is a retention period of 2 days. Once this retention period is exceeded, the request is deleted and the results are no longer available; in that case, this operation returns 404.

GET /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.get

Request

Path Parameters

  • The identifier of the extraction request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • curl --request GET 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Accept: application/json'
    

Response

The text extraction response.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "tables_processing": {
            "enabled": true
          }
        },
        "results": {
          "status": "running",
          "number_pages_processed": 2,
          "running_at": "2023-05-02T16:28:03Z"
        }
      }
    }
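
Text extraction is asynchronous, so a client typically polls this method until the request leaves the in-progress states. The sketch below treats submitted and running, the two in-progress values that appear in the examples of this reference, as pending and returns on any other status. The helper name, polling interval, and poll limit are assumptions, and fetch stands for a caller-supplied function that performs the GET and returns the decoded JSON.

```python
import time
from typing import Callable

# Statuses shown in this reference that mean "not finished yet".
PENDING_STATUSES = {"submitted", "running"}


def wait_for_extraction(fetch: Callable[[], dict], interval: float = 5.0,
                        max_polls: int = 60,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the extraction status is no longer pending."""
    for attempt in range(max_polls):
        details = fetch()
        status = details["entity"]["results"]["status"]
        if status not in PENDING_STATUSES:
            return details
        if attempt < max_polls - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"extraction still pending after {max_polls} polls")
```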

Delete the request

Cancel the specified text extraction request and delete any associated results.

DELETE /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.delete

Request

Path Parameters

  • The identifier of the extraction request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

  • curl --request DELETE 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
    

Response

Status Code

  • Request deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters.

POST /ml/v1/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

A request without moderations.

A simple request.

{
  "model_id": "google/flan-ul2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "temperature": 0.8,
    "max_new_tokens": 30
  }
}

A request with moderations.

A simple request with moderations.

{
  "model_id": "google/flan-t5-xl",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
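
The position values in the moderations block can be used to locate the flagged spans in generated_text. The sketch below assumes start is inclusive and end is exclusive, which appears consistent with the sample above (offsets 74 to 88 cover the 14 masked characters after "or call "). flagged_spans is a hypothetical helper, and the short result in the usage lines is made up for illustration.

```python
from typing import List, Tuple


def flagged_spans(result: dict) -> List[Tuple[str, str, str]]:
    """Return (moderation, entity, text) for each flagged span of a result."""
    text = result["generated_text"]
    spans = []
    for kind, findings in result.get("moderations", {}).items():
        for finding in findings:
            position = finding["position"]
            # Assumption: start is inclusive and end is exclusive.
            snippet = text[position["start"]:position["end"]]
            spans.append((kind, finding.get("entity", ""), snippet))
    return spans


result = {
    "generated_text": "call ******* now",
    "moderations": {
        "pii": [{"score": 0.8, "input": False,
                 "position": {"start": 5, "end": 12},
                 "entity": "PhoneNumber"}],
    },
}
spans = flagged_spans(result)
```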

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

POST /ml/v1/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:

A request without moderations.

A simple request.

{
  "model_id": "google/flan-ul2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "temperature": 0.8,
    "max_new_tokens": 30
  }
}

A request with moderations.

A simple request with moderations.

{
  "model_id": "google/flan-t5-xl",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'
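
The same request can be prepared from Python with only the standard library. This is a sketch, not the official client: build_stream_request is a hypothetical helper, the token is a placeholder, and the headers simply mirror the curl example. The request object is built but not sent.

```python
import json
import urllib.request


def build_stream_request(base_url: str, token: str, payload: dict,
                         version: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the generation_stream endpoint."""
    url = f"{base_url}/ml/v1/text/generation_stream?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",  # mirrors the curl example
        },
        method="POST",
    )


payload = {
    "model_id": "google/flan-t5-xxl",
    "input": "how far is paris from bangalore:",
    "parameters": {"max_new_tokens": 100, "time_limit": 1000},
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
}
req = build_stream_request(
    "https://us-south.ml.cloud.ibm.com",  # Dallas base URL
    "YOUR_IAM_TOKEN",                     # placeholder, not a real token
    payload,
    "2023-05-02",
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; the response body can then be read line by line as an event stream.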

Response

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of each JSON event is described below.
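
A minimal sketch of decoding such a stream, assuming each event arrives as a single line of the form data: {...} and that other lines (blank separators, id: or event: fields) can be ignored. The function name iter_sse_json and the keys in the sample events are made up for illustration; they are not the documented event schema.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        yield json.loads(line[len("data:"):].strip())


# Made-up stream content; the real event schema is defined by the service.
sample_lines = [
    "id: 1",
    'data: {"text": "Hello"}',
    "",
    "id: 2",
    'data: {"text": " world"}',
    "",
]
events = list(iter_sse_json(sample_lines))
print("".join(e["text"] for e in events))
```

The function accepts any iterable of text lines, so it can be fed from a decoded HTTP response as well as from a list in a test.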

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Generate rerank

Rerank texts based on some queries.

POST /ml/v1/text/rerank

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-rerank.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input texts and the queries for reranking.

Examples:

A sample request.

{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    {
      "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
    },
    {
      "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
    }
  ],
  "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
  "parameters": {
    "return_options": {
      "top_n": 2
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "inputs": [
        {
          "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I'\''ve come to appreciate the comforting stability of a well-established routine."
        },
        {
          "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life'\''s novelties, while as a responsible adult, I'\''ve come to understand the profound value of accumulated wisdom and life experience."
        }
      ],
      "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
      "parameters": {
        "return_options": {
          "top_n": 2
        }
      }
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The rerank results, with a score for each input text.

    {
      "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
      "results": [
        {
          "index": 1,
          "score": 0.7461
        },
        {
          "index": 0,
          "score": 0.8274
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 20
    }
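
A sketch of mapping the scores back to the submitted texts, assuming that index is the zero-based position of the text in the request's inputs array. The helper name rank_inputs is hypothetical and the input texts are shortened placeholders; the sort is done client-side rather than relying on the order of results.

```python
def rank_inputs(inputs: list[dict], response: dict) -> list[tuple[float, str]]:
    """Return (score, text) pairs ordered from highest to lowest score."""
    ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    # Assumption: "index" is the zero-based position in the request's "inputs".
    return [(r["score"], inputs[r["index"]]["text"]) for r in ranked]


inputs = [
    {"text": "In my younger years, ..."},  # shortened placeholder text
    {"text": "As a young man, ..."},       # shortened placeholder text
]
response = {
    "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
    "results": [
        {"index": 1, "score": 0.7461},
        {"index": 0, "score": 0.8274},
    ],
}
for score, text in rank_inputs(inputs, response):
    print(score, text)
```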

Text tokenization

The text tokenization operation lets you check how the provided input is converted to tokens for a given model. It splits text into words or sub-words, which are then converted to IDs through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.

POST /ml/v1/text/tokenization

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-tokenization.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input string to tokenize.

Examples:

A sample request.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "model_id": "google/flan-ul2",
  "input": "Write a tagline for an alumni association: Together we",
  "parameters": {
    "return_tokens": true
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-ul2",
      "input": "Write a tagline for an alumni association: Together we",
      "parameters": {
        "return_tokens": true
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

The tokenization result.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The response with the token count and the tokens, if requested.

    {
      "model_id": "google/flan-ul2",
      "result": {
        "token_count": 11,
        "tokens": [
          "Write",
          "a",
          "tag",
          "line",
          "for",
          "an",
          "alumni",
          "associ",
          "ation:",
          "Together",
          "we"
        ]
      }
    }
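
One use of this response is to check a prompt against a token budget before sending it for inference. A minimal sketch using the sample response above; the helper name fits_budget and the budget values are made up for illustration.

```python
def fits_budget(tokenization_response: dict, max_input_tokens: int) -> bool:
    """Return True if the tokenized input is within the given token budget."""
    return tokenization_response["result"]["token_count"] <= max_input_tokens


tokenization_response = {
    "model_id": "google/flan-ul2",
    "result": {
        "token_count": 11,
        "tokens": ["Write", "a", "tag", "line", "for", "an", "alumni",
                   "associ", "ation:", "Together", "we"],
    },
}
result = tokenization_response["result"]
# When return_tokens is true, the token list length matches token_count here.
print(len(result["tokens"]) == result["token_count"])
print(fits_budget(tokenization_response, max_input_tokens=100))
```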

Create a new watsonx.ai training

Create a new watsonx.ai training in a project or a space.

The details of the base model and parameters for the training must be provided in the prompt_tuning object.

To deploy the tuned model, follow these steps:

  1. Create a WML model asset, in a space or a project, by providing the request.json as shown below:

    curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
      -H "Authorization: Bearer <replace with your token>" \
      -H "content-type: application/json" \
      --data '{
         "name": "replace_with_a_meaningful_name",
         "space_id": "replace_with_your_space_id",
         "type": "prompt_tune_1.0",
         "software_spec": {
           "name": "watsonx-textgen-fm-1.0"
         },
         "metrics": [ from the training job ],
         "training": {
           "id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
           "base_model": {
             "model_id": "google/flan-t5-xl"
           },
           "task_id": "generation",
           "verbalizer": "Input: {{input}} Output:"
         },
         "training_data_references": [
           {
             "connection": {
               "id": "20933468-7e8a-4706-bc90-f0a09332b263"
             },
             "id": "file_to_tune1.json",
             "location": {
               "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
               "path": "file_to_tune1.json"
             },
             "type": "connection_asset"
           }
         ]
       }'
    

    Notes:

    1. If you set auto_update_model: true in the training request, you can skip this step because the model is saved at the end of the training job.
    2. Rather than creating the model payload yourself, you can use the generated request.json that was stored under results_reference. Its path is in the field entity.results_reference.location.model_request_path.
    3. The model type must be prompt_tune_1.0.
    4. The software spec name must be watsonx-textgen-fm-1.0.
  2. Create a tuned model deployment as described in the create deployment documentation.
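
The first two notes can be sketched as a small check on a training resource: whether step 1 is still needed, and where the generated request.json lives. The helper names are hypothetical, and the sample object is a trimmed copy of the example training response in this reference.

```python
from typing import Optional


def needs_manual_model_asset(training: dict) -> bool:
    """Step 1 is only needed when auto_update_model was not set to true."""
    return not training.get("entity", {}).get("auto_update_model", False)


def model_request_path(training: dict) -> Optional[str]:
    """Return entity.results_reference.location.model_request_path, if present."""
    location = (training.get("entity", {})
                        .get("results_reference", {})
                        .get("location", {}))
    return location.get("model_request_path")


training = {
    "entity": {
        "auto_update_model": True,
        "results_reference": {
            "location": {
                "path": "tune1/results",
                "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json",
            },
            "type": "container",
        },
    }
}
print(needs_manual_model_asset(training))
print(model_request_path(training))
```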

POST /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The training_data_references field contains the training datasets, and the results_reference field identifies the connection where the results will be stored.

Examples:

Start a prompt tune training job.

{
  "name": "my-prompt-tune-training",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "prompt_tuning": {
    "base_model": {
      "model_id": "google/flan-t5-xl"
    },
    "tuning_type": "prompt_tuning",
    "task_id": "classification",
    "num_epochs": 30,
    "learning_rate": 0.4,
    "accumulate_steps": 3,
    "batch_size": 10,
    "max_input_tokens": 100,
    "max_output_tokens": 100
  },
  "training_data_references": [
    {
      "id": "tune1_data.json",
      "location": {
        "path": "tune1_data.json"
      },
      "type": "container"
    }
  ],
  "auto_update_model": true,
  "results_reference": {
    "location": {
      "path": "tune1/results"
    },
    "type": "container"
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "name": "my-prompt-tune-training",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "prompt_tuning": {
        "base_model": {
          "model_id": "google/flan-t5-xl"
        },
        "task_id": "classification",
        "tuning_type": "prompt_tuning",
        "num_epochs": 30,
        "learning_rate": 0.4,
        "accumulate_steps": 3,
        "batch_size": 10,
        "max_input_tokens": 100,
        "max_output_tokens": 100
      },
      "training_data_references": [
        {
          "id": "tune1_data.json",
          "location": {
            "path": "tune1_data.json"
          },
          "type": "container"
        }
      ],
      "auto_update_model": true,
      "results_reference": {
        "location": {
          "path": "tune1/results"
        },
        "type": "container"
      }
    }'

Response

Training resource.

Status Code

  • The training job has been created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
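
The metrics array under entity.status can be reduced to a loss curve. A minimal sketch using the three metric entries from the example above; the helper names loss_curve and best_iteration are hypothetical.

```python
def loss_curve(training: dict) -> list[tuple[int, float]]:
    """Return (iteration, loss) pairs from entity.status.metrics."""
    metrics = training["entity"]["status"].get("metrics", [])
    return [(m["iteration"], m["ml_metrics"]["loss"]) for m in metrics]


def best_iteration(training: dict) -> int:
    """Return the iteration number with the lowest reported loss."""
    return min(loss_curve(training), key=lambda pair: pair[1])[0]


training = {
    "entity": {
        "status": {
            "state": "completed",
            "metrics": [
                {"iteration": 0, "ml_metrics": {"loss": 4.49988}},
                {"iteration": 1, "ml_metrics": {"loss": 3.86884}},
                {"iteration": 2, "ml_metrics": {"loss": 4.05115}},
            ],
        }
    }
}
print(loss_curve(training))
print(best_iteration(training))
```

In this sample the loss is lowest at iteration 1 and rises again at iteration 2, so the last iteration is not necessarily the best one.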

Retrieve the list of trainings

Retrieve the list of trainings for the specified space or project.

GET /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • The number of resources to return. The default limit is 100, and the maximum limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • Compute the total count. May have performance impact.

  • Return only the resources with the given tag value.

  • Filter based on the training job state.

    Allowable values: [queued,pending,running,storing,completed,failed,canceled]

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
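
The constraints above can be checked client-side before calling the endpoint. This is a sketch under assumptions: the parameter names limit and state are assumed because the names are not shown in this listing (version, space_id and project_id appear in the text and examples), and "either space_id or project_id" is interpreted here as exactly one of the two. The function name trainings_query is hypothetical.

```python
import re
from typing import Optional
from urllib.parse import urlencode

ALLOWED_STATES = {"queued", "pending", "running", "storing",
                  "completed", "failed", "canceled"}
ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]{36}$")  # length 36, documented charset


def trainings_query(version: str, *, space_id: Optional[str] = None,
                    project_id: Optional[str] = None, limit: int = 100,
                    state: Optional[str] = None) -> str:
    """Validate the documented constraints and return an encoded query string."""
    # Assumption: exactly one of space_id / project_id should be supplied.
    if (space_id is None) == (project_id is None):
        raise ValueError("give exactly one of space_id or project_id")
    scope_name = "space_id" if space_id is not None else "project_id"
    scope_id = space_id if space_id is not None else project_id
    if not ID_PATTERN.match(scope_id):
        raise ValueError(f"{scope_name} must be 36 characters of [a-zA-Z0-9-]")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if state is not None and state not in ALLOWED_STATES:
        raise ValueError(f"unknown state: {state}")
    params = {"version": version, "limit": limit, scope_name: scope_id}
    if state is not None:
        params["state"] = state  # assumed parameter name
    return urlencode(params)


print(trainings_query("2023-07-07",
                      project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
                      limit=50, state="completed"))
```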

Response

Information for paging when querying resources.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "limit": 100,
      "first": {
        "href": "https://{cluster_url}/ml/v4/trainings"
      },
      "total_count": 1,
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "name": "my-prompt-training",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "created_at": "2023-08-04T13:22:47.000Z"
          },
          "entity": {
            "prompt_tuning": {
              "base_model": {
                "model_id": "google/flan-t5-xl"
              },
              "task_id": "classification"
            },
            "training_data_references": [
              {
                "id": "tune1_data.json",
                "location": {
                  "path": "tune1_data.json"
                },
                "type": "container"
              }
            ],
            "auto_update_model": true,
            "results_reference": {
              "location": {
                "path": "tune1/results",
                "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
                "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
                "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
                "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
              },
              "type": "container"
            },
            "status": {
              "state": "completed",
              "running_at": "2023-08-04T13:22:48.000Z",
              "completed_at": "2023-08-04T13:22:55.289Z",
              "metrics": [
                {
                  "iteration": 0,
                  "ml_metrics": {
                    "loss": 4.49988
                  },
                  "timestamp": "2023-09-22T02:52:03.324Z"
                },
                {
                  "iteration": 1,
                  "ml_metrics": {
                    "loss": 3.86884
                  },
                  "timestamp": "2023-09-22T02:52:03.689Z"
                },
                {
                  "iteration": 2,
                  "ml_metrics": {
                    "loss": 4.05115
                  },
                  "timestamp": "2023-09-22T02:52:04.053Z"
                }
              ]
            }
          }
        }
      ]
    }
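
A sketch of walking all pages of this listing. It assumes that a page with more results carries a next object with an href, shaped like the first object in the sample, and that the last page has no next field. The HTTP call is left to a caller-supplied function; fetch_page and iter_trainings are hypothetical names, and the two pages below are made-up data.

```python
from typing import Callable, Iterator


def iter_trainings(first_url: str,
                   fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every training resource, following next.href until it is absent."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # Assumption: "next" is an object with an "href", like "first".
        url = (page.get("next") or {}).get("href")


# Made-up pages keyed by URL, standing in for real HTTP responses.
pages = {
    "page1": {"limit": 1, "resources": [{"metadata": {"id": "a"}}],
              "next": {"href": "page2"}},
    "page2": {"limit": 1, "resources": [{"metadata": {"id": "b"}}]},
}
ids = [t["metadata"]["id"] for t in iter_trainings("page1", pages.__getitem__)]
print(ids)
```

Injecting fetch_page keeps the paging logic independent of the HTTP client and of how the bearer token is attached.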

Retrieve the training

Retrieve the training with the specified identifier.

GET /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.get

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Training resource.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
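
A training job is usually polled with this endpoint until it finishes. The sketch below assumes that completed, failed and canceled are the terminal states among the documented state values; that split is an interpretation, not something this reference states. The helper wait_for_training is hypothetical, and get_training stands in for a caller-supplied function that performs the GET and returns the decoded JSON.

```python
import time
from typing import Callable

# Assumption: these three documented states are terminal.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(get_training: Callable[[], dict],
                      interval_s: float = 5.0, max_polls: int = 120,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until entity.status.state is terminal and return that state."""
    for _ in range(max_polls):
        state = get_training()["entity"]["status"]["state"]
        if state in TERMINAL_STATES:
            return state
        sleep(interval_s)
    raise TimeoutError("training did not reach a terminal state in time")


# Made-up sequence of responses standing in for repeated GET calls.
fake = iter([
    {"entity": {"status": {"state": "queued"}}},
    {"entity": {"status": {"state": "running"}}},
    {"entity": {"status": {"state": "completed"}}},
])
print(wait_for_training(lambda: next(fake), sleep=lambda s: None))
```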

Cancel or delete the training

Cancel or delete the specified training. Once deleted, all trace of the job is gone.

DELETE /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.delete

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

Response

Status Code

  • Training cancelled.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
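
A sketch of preparing this call from Python with the standard library. The query parameter name hard_delete for the "also delete the job or request metadata" flag is an assumption, since parameter names are not shown in this listing. The helper build_delete_request is hypothetical, the token is a placeholder, and the training identifier below is made up. The request is built but not sent.

```python
import urllib.request
from urllib.parse import urlencode


def build_delete_request(base_url: str, token: str, training_id: str,
                         version: str, project_id: str,
                         hard_delete: bool = False) -> urllib.request.Request:
    """Prepare (but do not send) a DELETE for one training."""
    query = {"version": version, "project_id": project_id}
    if hard_delete:
        query["hard_delete"] = "true"  # assumed parameter name
    url = f"{base_url}/ml/v4/trainings/{training_id}?{urlencode(query)}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )


req = build_delete_request(
    "https://us-south.ml.cloud.ibm.com",     # Dallas base URL
    "YOUR_IAM_TOKEN",                        # placeholder, not a real token
    "00000000-0000-0000-0000-000000000000",  # made-up training id
    "2023-07-07",
    "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
    hard_delete=True,
)
print(req.get_method(), req.full_url)
```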
