IBM Cloud API Docs

Introduction to IBM watsonx.ai software

Using the IBM watsonx.ai software APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).

If you are looking for the IBM watsonx.ai as a Service APIs, see here.

Step-by-step instructions on how to use IBM watsonx.ai software can be found here.

A specialized Python library is available to access this REST API.

Endpoint URLs

The base URLs for API endpoints come from the cluster and add-on service instance. The URL follows this pattern:

https://{cluster_url}/ml/v1
  • {cluster_url} represents the name or IP address of your deployed cluster. Use a hostname that resolves to an IP address in the cluster.

To find the base URL, view the details for the service instance from the Cloud Pak for Data web client.

Note that for prompts and notebooks the base URLs are /wx.

Use that URL in your requests to the API.

Endpoint example

curl -k -X {request_method} -H "Authorization: Bearer {token}" "https://{cluster_url}/ml/v1/text/generation"
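
The same request pattern can be sketched in Python with only the standard library. This is an illustration rather than the official client: the helper name build_request and the host cpd.example.com are hypothetical, while the /ml/v1 base path and the Authorization header format come from the example above. The request is built but not sent.

```python
import urllib.request


def build_request(cluster_url: str, path: str, token: str,
                  method: str = "POST") -> urllib.request.Request:
    """Build (but do not send) a request against the /ml/v1 base URL."""
    base_url = f"https://{cluster_url}/ml/v1"
    url = f"{base_url}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


# cpd.example.com and my-token are placeholders, not real values.
req = build_request("cpd.example.com", "text/generation", "my-token")
print(req.get_method(), req.full_url)
```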

Disabling SSL verification

Watson Machine Learning uses Secure Sockets Layer (SSL) (or Transport Layer Security (TLS)) for secure connections between the client and server. The connection is verified against the local certificate store to ensure authentication, integrity, and confidentiality.

If you use a self-signed certificate, you need to disable SSL verification to make a successful connection.

Enabling SSL verification is highly recommended. Disabling SSL verification jeopardizes the security of the connection and data. Disable it only if necessary, and take steps to enable it again as soon as possible.

To disable SSL verification for a curl request, use the --insecure (-k) option with the request.
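
For Python clients, a rough equivalent of the -k option can be sketched with the standard ssl module. The helper name make_ssl_context is hypothetical. The ca_file argument shows the safer alternative of trusting a self-signed certificate explicitly instead of turning verification off.

```python
import ssl
from typing import Optional


def make_ssl_context(verify: bool = True,
                     ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Return an SSL context; verification stays on unless verify=False."""
    ctx = ssl.create_default_context(cafile=ca_file)
    if not verify:
        # check_hostname must be switched off before verify_mode
        # can be set to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


insecure = make_ssl_context(verify=False)   # comparable to curl -k
secure = make_ssl_context()                 # default: certificates are verified
```

The resulting context can be passed to urllib.request.urlopen through its context argument.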

Authentication

A bearer token is required to use any of the watsonx.ai APIs.

For more information, see the Authorization section of the Platform API reference.

Use the value of the access_token property from the response to the example request. Set the access_token value as the authorization header parameter for requests to the APIs. The format is Authorization: Bearer {access_token_value}:

Authorization: Bearer eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6IlJTMjU2In0...

Example request that uses a username and password to retrieve the token

curl -k -X POST "https://cluster_url_host/icp4d-api/v1/authorize"   -H "cache-control: no-cache"   -H "content-type: application/json"   -d "{\"username\":\"admin\",\"password\":\"password\"}"

Response

{
  "username": "admin",
  "role": "Admin",
  "permissions": [
    "administrator"
  ],
  "sub": "admin",
  "iss": "KNOXSSO",
  "aud": "DSX",
  "uid": "999",
  "authenticator": "default",
  "access_token": "eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6...",
  "_messageCode_": "success"
}
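
The token exchange above might look as follows in Python. This is a sketch under stated assumptions: the function names are hypothetical, the host is a placeholder, and the sample response is a shortened copy of the one shown above with a dummy token. The request is only constructed, so nothing is sent over the network.

```python
import json
import urllib.request


def build_authorize_request(cluster_url: str, username: str,
                            password: str) -> urllib.request.Request:
    """Build the POST request for /icp4d-api/v1/authorize (not sent here)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"https://{cluster_url}/icp4d-api/v1/authorize",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Cache-Control": "no-cache"},
    )


def authorization_header(response_body: str) -> dict:
    """Turn the authorize response into an Authorization header."""
    token = json.loads(response_body).get("access_token")
    if not token:
        raise ValueError("response does not contain an access_token")
    return {"Authorization": f"Bearer {token}"}


# Shortened sample response with a dummy token value.
sample_response = '{"username": "admin", "access_token": "abc123", "_messageCode_": "success"}'
auth_request = build_authorize_request("cpd.example.com", "admin", "password")
headers = authorization_header(sample_response)
print(headers)
```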

Error handling

This API uses standard HTTP response codes to indicate whether a method completed successfully. A response code in the 200 range indicates success.

HTTP Code | Description | Recovery
200 | Success | The request was successful.
400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body.
401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. For more information about logging in, see the Authentication section. If this error persists, contact the account owner to check your permissions.
403 | Forbidden | The supplied authentication is not authorized.
404 | Not Found | The requested resource could not be found.

Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.

Error response

Name | Description
trace | An identifier that can be used to trace the request. This can be set using X-Global-Transaction-Id.
errors | The list of errors.

Errors

Name | Description
code | A simple string code that should convey the general sense of the error.
message | The message that describes the error.
more_info | A reference to a more detailed explanation when available.
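
A client might unpack this error structure roughly as follows. The function names and the sample body are illustrative assumptions (the code value invalid_request is made up); only the trace, errors, code, and message field names come from the tables above. Treating 429 and 503 as retryable is a design choice based on the note about overloaded or unavailable models, not documented behavior.

```python
import json


def summarize_error(status_code: int, body: str) -> str:
    """Flatten an error response into a single log-friendly line."""
    payload = json.loads(body)
    trace = payload.get("trace", "unknown")
    parts = [
        f"{err.get('code', 'unknown')}: {err.get('message', '')}"
        for err in payload.get("errors", [])
    ]
    return f"HTTP {status_code} (trace {trace}): " + "; ".join(parts)


def is_retryable(status_code: int) -> bool:
    """429 and 503 may mean the model is overloaded or unavailable."""
    return status_code in (429, 503)


# Hypothetical error body that follows the documented shape.
sample_error = json.dumps({
    "trace": "abc-123",
    "errors": [{"code": "invalid_request", "message": "Missing version"}],
})
print(summarize_error(400, sample_error))
```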

Additional headers

Some additional headers might be required to make successful requests to the API. Those additional headers are described below.

You can pass an optional transaction ID with your request, which is useful for tracking calls through multiple services with one identifier. Set the header key to X-Global-Transaction-Id; the value can be anything that you choose.

If no transaction ID is passed in, one is generated randomly.
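
As an illustration, the header can be attached on the client side like this. The helper name with_transaction_id is hypothetical, and falling back to a random UUID on the client is an assumption of this sketch; the service generates its own ID when the header is absent.

```python
import uuid
from typing import Optional


def with_transaction_id(headers: dict, transaction_id: Optional[str] = None) -> dict:
    """Return a copy of headers with X-Global-Transaction-Id set."""
    merged = dict(headers)
    # Any value is accepted; a random UUID is just a convenient default.
    merged["X-Global-Transaction-Id"] = transaction_id or uuid.uuid4().hex
    return merged


tagged = with_transaction_id({"Authorization": "Bearer my-token"}, "order-42")
print(tagged["X-Global-Transaction-Id"])
```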

API change log

In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API. The change log lists changes that have been made, ordered by the date they were released. Changes to existing API versions are designed to be compatible with existing client applications; when a change is not compatible, a new version date is created.

14 March 2024

The watsonx.ai API is generally available. Use the watsonx.ai API to work with foundation models programmatically.

Versioning

API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.

When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.

Active Version Dates

Version date Summary of changes
2024-03-14 Publication of the /ml/v1 APIs.
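
One way to make sure every request carries a well-formed version parameter is to add it in a single helper. The name with_version and the example URL are hypothetical; the YYYY-MM-DD format is the one required above.

```python
import re
from datetime import date
from urllib.parse import urlencode


def with_version(url: str, version: str, **params: str) -> str:
    """Append version=YYYY-MM-DD (plus any extra query parameters) to url."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError(f"version must be YYYY-MM-DD, got {version!r}")
    date.fromisoformat(version)  # also rejects impossible dates such as 2024-13-40
    return f"{url}?{urlencode({'version': version, **params})}"


example_url = with_version("https://cpd.example.com/ml/v1/text/generation", "2024-03-14")
print(example_url)
```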

Data References

Accessing data in a remote location (such as a Cloud Object Storage bucket, or an SQL or NoSQL database) requires the use of connection_asset or data_asset reference types. These reference types are created within a space or a project and are referenced in requests to represent input data and results locations. These types contain two parameter objects, connection and location, which require different values depending on the reference type. A data_asset requires an href in the location object, whereas a connection_asset requires the connection_id for the connection object and different location fields depending on the data source type.

Example connection_asset payload:

{
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": {
        "id": "<connection_guid>"
      },
      "location": {
        "<wdp-properties depending on the type>": "<value depending on the type>"
      }
    }
  ]
}

Example data_asset payload:

{
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  ]
}

Example fs payload:

  • project_id
{
  "training_data_references": [
    {
      "type":"fs",
      "location":{
        "path":"/projects/<project_id>/assets/<fs_path>"
      }
    }
  ]
}
  • space_id
{
  "training_data_references": [
    {
      "type":"fs",
      "location":{
        "path":"/spaces/<space_id>/assets/<fs_path>"
      }
    }
  ]
}
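
The three reference shapes above can be produced by small builder functions, sketched here in Python. The function names are hypothetical and the identifiers in the usage lines are placeholders; the dictionary layouts mirror the example payloads.

```python
def connection_asset_ref(connection_id: str, location: dict) -> dict:
    """Reference to data behind a connection; location fields depend on the source type."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def data_asset_ref(asset_id: str, space_id: str) -> dict:
    """Reference to a data asset in a space, addressed by href."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def fs_ref(scope: str, scope_id: str, fs_path: str) -> dict:
    """File-system reference; scope is 'projects' or 'spaces'."""
    if scope not in ("projects", "spaces"):
        raise ValueError("scope must be 'projects' or 'spaces'")
    return {"type": "fs", "location": {"path": f"/{scope}/{scope_id}/assets/{fs_path}"}}


payload = {"training_data_references": [fs_ref("spaces", "my-space-id", "train.csv")]}
print(payload)
```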

Methods

Create a new AutoAI RAG run

Create a new AutoAI RAG run that finds the best RAG pattern from the data that is provided in the request.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/autoai/rags

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The details of the AutoAI RAG run with the data used to find the best RAG patterns.

Response

The response of an AutoAI RAG run.

Status Code

  • Created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the AutoAI RAG runs

Retrieve the list of AutoAI RAG requests for the specified space or project.

This operation does not save the history. Any requests that were deleted or purged do not appear in this list.

Since Cloud Pak for Data 5.1.0.

GET /ml/v1/autoai/rags

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

The response of an AutoAI RAG run.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
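
Because the pagination token is only available through the href in the next field, a client can simply follow that href until it is absent. The sketch below assumes that list responses carry a resources array and an optional next object with an href, as in the list response shown for custom foundation models. The fetch_page callable, the URLs, and the fake pages are hypothetical stand-ins for real HTTP calls.

```python
from typing import Callable, Iterator


def iter_resources(fetch_page: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every resource, following next.href until no next page is reported."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The service sets the pagination token inside next.href.
        url = (page.get("next") or {}).get("href")


# Two fake pages keyed by URL, standing in for real responses.
first = "https://cpd.example.com/ml/v1/autoai/rags?version=2024-03-14&limit=2"
pages = {
    first: {
        "resources": [{"id": "a"}, {"id": "b"}],
        "next": {"href": "https://cpd.example.com/opaque-next-href"},
    },
    "https://cpd.example.com/opaque-next-href": {"resources": [{"id": "c"}]},
}
ids = [item["id"] for item in iter_resources(pages.__getitem__, first)]
print(ids)
```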

Get an AutoAI RAG run

Get the results of an AutoAI RAG run, or details if the job failed.

Since Cloud Pak for Data 5.1.0.

GET /ml/v1/autoai/rags/{id}

Request

Path Parameters

  • The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
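
The constraints on space_id and project_id can be checked before a request is sent. This sketch assumes that exactly one of the two parameters should be supplied, which is one reading of "either space_id or project_id"; the helper name scope_params is hypothetical.

```python
import re
from typing import Optional

ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def scope_params(space_id: Optional[str] = None,
                 project_id: Optional[str] = None) -> dict:
    """Validate the scope and return it as a query-parameter dictionary."""
    if (space_id is None) == (project_id is None):
        raise ValueError("provide exactly one of space_id or project_id")
    name, value = (("space_id", space_id) if space_id is not None
                   else ("project_id", project_id))
    # Documented constraints: length = 36 and characters from [a-zA-Z0-9-].
    if len(value) != 36 or not ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must be 36 characters matching ^[a-zA-Z0-9-]*$")
    return {name: value}


scope = scope_params(space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f")
print(scope)
```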

Response

The response of an AutoAI RAG run.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "description": "My autoai rag experiment for 2023 financial documents",
        "name": "AutoAI RAG"
      },
      "entity": {
        "timestamp": "2023-09-22T02:52:03.324Z",
        "hardware_spec": {
          "id": "c076e82c-b2a7-4d20-9c0f-1f0c2fdf5a24",
          "name": "L"
        },
        "parameters": {
          "constraints": {
            "embedding_models": [
              "ibm/slate-125m-english-rtrvr"
            ],
            "foundation_models": [
              "meta-llama/llama-3-70b-instruct",
              "mistralai/mixtral-8x7b-instruct-v01",
              "ibm/granite-13b-chat-v2"
            ],
            "max_number_of_rag_patterns": 8
          },
          "optimization": {
            "metrics": [
              "answer_correctness"
            ]
          },
          "output_logs": true
        },
        "input_data_references": [
          {
            "type": "connection_asset",
            "connection": {
              "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
            },
            "location": {
              "path": "files/document.pdf"
            }
          }
        ],
        "test_data_references": [
          {
            "type": "connection_asset",
            "connection": {
              "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
            },
            "location": {
              "path": "files/qa_document.json"
            }
          }
        ],
        "vector_store_references": [
          {
            "type": "connection_asset",
            "connection": {
              "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
            }
          }
        ],
        "results_reference": {
          "type": "container",
          "location": {
            "path": "results_autoai",
            "training": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5",
            "training_status": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training-status.json",
            "assets_path": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets",
            "training_log": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training.log"
          }
        },
        "results": [
          {
            "metrics": {
              "test_data": [
                {
                  "metric_name": "answer_correctness",
                  "mean": 0.51,
                  "ci_high": 0.68,
                  "ci_low": 0.43
                }
              ]
            },
            "context": {
              "rag_pattern": {
                "composition_steps": [
                  "vector_store",
                  "chunking",
                  "embeddings",
                  "retrieval",
                  "generation"
                ],
                "location": {
                  "evaluation_results": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/evaluation_results.json",
                  "indexing_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/indexing_notebook.ipynb",
                  "inference_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_notebook.ipynb"
                },
                "name": "Pattern 1",
                "settings": {
                  "vector_store": {
                    "datasource_type": "milvus",
                    "index_name": "autoai_rag_1234_iteration_5_index",
                    "distance_metric": "euclidean",
                    "operation": "upsert",
                    "schema": {
                      "id": "autoai_rag_1.0.0",
                      "name": "AutoAI RAG document schema",
                      "type": "struct",
                      "fields": [
                        {
                          "name": "text",
                          "description": "text field",
                          "type": "string",
                          "role": "text"
                        },
                        {
                          "name": "document_id",
                          "description": "document name field",
                          "type": "string",
                          "role": "document_name"
                        },
                        {
                          "name": "start_index",
                          "description": "chunk starting token position in the source document",
                          "type": "number",
                          "role": "start_index"
                        },
                        {
                          "name": "sequence_number",
                          "description": "chunk number per document",
                          "type": "number",
                          "role": "sequence_number"
                        },
                        {
                          "name": "vector",
                          "description": "vector embeddings",
                          "type": "array",
                          "role": "vector_embeddings"
                        }
                      ]
                    }
                  },
                  "chunking": {
                    "method": "recursive",
                    "chunk_size": 256,
                    "chunk_overlap": 64
                  },
                  "embeddings": {
                    "truncate_strategy": "left",
                    "truncate_input_tokens": 384,
                    "model_id": "ibm/slate-125m-english-rtrvr"
                  },
                  "retrieval": {
                    "method": "simple",
                    "number_of_chunks": 5
                  },
                  "generation": {
                    "model_id": "mistralai/mixtral-8x7b-instruct-v01",
                    "parameters": {
                      "max_new_tokens": 256
                    },
                    "prompt_template_text": "Answer the following questions based on provided context:\\n ..."
                  }
                }
              },
              "iteration": 1,
              "max_combinations": 160
            }
          }
        ],
        "status": {
          "state": "running",
          "step": "vector_store",
          "message": {
            "level": "info",
            "text": "Pipeline 1 of 8 is completed."
          },
          "running_at": "2023-08-04T13:22:48.000Z"
        }
      }
    }
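
A response like the one above can be reduced to the best-scoring pattern with a few lines of Python. The function name best_pattern is hypothetical, the sample run is a heavily trimmed stand-in with made-up scores, and the sketch assumes that a higher mean is better for the chosen metric.

```python
from typing import Optional, Tuple


def best_pattern(run: dict, metric_name: str = "answer_correctness") -> Optional[Tuple[str, float]]:
    """Return (pattern name, mean) for the highest mean of metric_name, or None."""
    best = None
    for result in run.get("entity", {}).get("results", []):
        name = result["context"]["rag_pattern"]["name"]
        for metric in result.get("metrics", {}).get("test_data", []):
            if metric.get("metric_name") != metric_name:
                continue
            if best is None or metric["mean"] > best[1]:
                best = (name, metric["mean"])
    return best


# Trimmed, made-up run that keeps only the fields the function reads.
sample_run = {"entity": {"results": [
    {"metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.51}]},
     "context": {"rag_pattern": {"name": "Pattern 1"}}},
    {"metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.62}]},
     "context": {"rag_pattern": {"name": "Pattern 2"}}},
]}}
print(best_pattern(sample_run))
```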

Cancel or delete an AutoAI RAG run

Cancel or delete the specified AutoAI RAG run. Once deleted, all trace of the run job is gone.

Since Cloud Pak for Data 5.1.0.

DELETE /ml/v1/autoai/rags/{id}

Request

Path Parameters

  • The id is the identifier that was returned in the metadata.id field of the request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

Response

Status Code

  • Deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the custom foundation models

Retrieve the custom foundation models.

To deploy a custom foundation model using one of the models in this list, follow these steps:

  1. Create a model asset, in a space or a project, by providing the custom foundation model details as shown below:

    curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" \
      -H "Authorization: Bearer <replace with your token>" \
      -H "content-type: application/json" \
      --data '{
                "name": "replace_with_a_meaningful_name",
                "space_id": "replace_with_your_space_id",
                "foundation_model": {
                  "model_id": "replace_with_your_model_id"
                },
                "type": "custom_foundation_model_1.0",
                "software_spec": {
                  "name": "watsonx-cfm-caikit-1.0"
                }
              }'
    

    Notes:

    1. The model type must be custom_foundation_model_1.0.
    2. The software spec name must be watsonx-cfm-caikit-1.0.
  2. Create a custom foundation model deployment as described in the create deployment documentation.

Since watsonx.ai 1.1.x.

GET /ml/v4/custom_foundation_models

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • The number of resources to return. The default limit is 100 and the maximum allowed limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • curl --request GET 'https://{cpd_cluster}/ml/v4/custom_foundation_models?version=2023-05-02&limit=10' \
      -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
      -H 'Accept: application/json'
    

Response

Pagination information and list of models and common parameters.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The list of custom foundation models that were created and registered.

    {
      "total_count": 1,
      "limit": 10,
      "first": {
        "href": "https://{cpd_cluster}/ml/v4/custom_foundation_models"
      },
      "resources": [
        {
          "model_id": "my_flan_t5_xl",
          "description": "A tuned version of flan_t5_xl",
          "tags": [
            "flan_t5_xl"
          ],
          "parameters": [
            {
              "name": "max_batch_weight",
              "display_name": "Maximum batch weight",
              "default": 10000,
              "description": "The maximum batch weight that is allowed for this model.",
              "type": "number",
              "min": 0,
              "max": 100000
            }
          ]
        }
      ],
      "parameters": [
        {
          "name": "max_batch_weight",
          "display_name": "Maximum batch weight",
          "default": 1000,
          "description": "The maximum batch weight that is allowed for all models.",
          "type": "number",
          "min": 0,
          "max": 10000
        }
      ]
    }

Create a new watsonx.ai deployment

Create a new deployment. Currently the only supported type is online. The required objects depend on what is being deployed:

  • Prompt tune: the asset object must exist and the id must be the id of the model that was created after the prompt training.
  • Prompt template: the prompt_template object must exist and the id must be the id of the prompt template to be deployed.
  • Custom foundation model: the online object must exist, the asset object must exist and point to the model object that describes the custom foundation model, and the hardware_spec is mandatory. Note that the base_model_id is returned and is the base model id that is defined in the model asset (asset.id).
  • Fine tuned model: the asset.id must point to the model that was created after the fine tuning. For a fine tuned model with a template, the base_deployment_id field is the tuned model deployment.

Pre-defined hardware specifications are provided for custom foundation model deployments:

  • WX-S: 1 GPU, Request 1 CPU, Limit 2 CPU and 60 GB (Request and Limit) - 1B to 20B parameters
  • WX-M: 2 GPU, Request 2 CPU, Limit 3 CPU and 120 GB (Request and Limit) - 21B to 40B parameters
  • WX-L: 4 GPU, Request 4 CPU, Limit 5 CPU and 240 GB (Request and Limit) - 41B to 80B parameters
  • WX-XL: 8 GPU, Request 8 CPU, Limit 9 CPU and 600 GB (Request and Limit) - 81B to 200B parameters
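
The size ranges in this list can be turned into a small lookup, sketched below. The function name choose_hardware_spec is hypothetical, and the sketch assumes that a model whose size falls between two listed ranges (for example 20.5B) takes the next larger specification; the list itself does not say how such sizes are handled.

```python
# (specification id, largest supported model size in billions of parameters)
HARDWARE_SPECS = [
    ("WX-S", 20),
    ("WX-M", 40),
    ("WX-L", 80),
    ("WX-XL", 200),
]


def choose_hardware_spec(billions_of_parameters: float) -> str:
    """Pick the smallest pre-defined specification that covers the model size."""
    if billions_of_parameters < 1:
        raise ValueError("the listed specifications start at 1B parameters")
    for spec_id, upper_bound in HARDWARE_SPECS:
        if billions_of_parameters <= upper_bound:
            return spec_id
    raise ValueError("no pre-defined specification covers more than 200B parameters")


print(choose_hardware_spec(13))
```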

A prompt template can be used in conjunction with a custom foundation model by specifying the prompt_template object with its id pointing to the prompt template.

POST /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The deployment request entity.

The following important fields are described for each use case:

  1. Prompt template:
    • base_model_id: required
    • prompt_template.id: required
    • online: required
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • response deployed_asset_type: foundation_model
  2. Prompt tune:
    • asset.id: required
    • online: required
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • base_model_id: forbidden
    • response deployed_asset_type: prompt_tune
  3. Custom foundation model:
    • asset.id: required
    • online: required
    • online.parameters.foundation_model: optional
    • hardware_spec: required
    • hardware_request: forbidden
    • base_model_id: forbidden
    • base_deployment_id: forbidden
    • response deployed_asset_type: custom_foundation_model
  4. Custom foundation model with template:
    • base_deployment_id: required
    • prompt_template.id: required
    • online: required
    • online.parameters.foundation_model: forbidden
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • asset.id: forbidden
    • base_model_id: forbidden
    • response deployed_asset_type: custom_foundation_model
  5. Fine tuned model:
    • asset.id: required
    • online: required
    • online.parameters.foundation_model: optional
    • hardware_spec: required
    • hardware_request: forbidden
    • base_model_id: forbidden
    • base_deployment_id: forbidden
    • response deployed_asset_type: fine_tune
  6. Fine tuned model with template:
    • base_deployment_id: required
    • prompt_template.id: required
    • online: required
    • online.parameters.foundation_model: forbidden
    • hardware_spec: forbidden
    • hardware_request: forbidden
    • asset.id: forbidden
    • base_model_id: forbidden
    • response deployed_asset_type: fine_tune
Examples:

Create a prompt tune deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "asset": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}

Create a prompt template deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "base_model_id": "google/flan-ul2",
  "prompt_template": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}

Create a custom foundation model deployment.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "my_tuned_flan",
  "asset": {
    "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
  },
  "online": {
    "parameters": {
      "serving_name": "myflan",
      "foundation_model": {
        "max_batch_weight": 10000,
        "max_sequence_length": 8192
      }
    }
  },
  "hardware_spec": {
    "id": "WX-S",
    "num_nodes": 1
  }
}

Create a prompt template deployment with a custom foundation model.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "my_tuned_flan_template",
  "base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
  "prompt_template": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {
    "parameters": {
      "serving_name": "myflan_template"
    }
  }
}

Deploy a curated model.

{
  "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
  "name": "my_granite_13b_chat_v2",
  "asset": {
    "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
  },
  "online": {
    "parameters": {
      "serving_name": "granite_13b_chat_v2"
    }
  }
}
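
The required and forbidden fields listed for each use case lend themselves to a client-side check before the request is sent. The sketch below encodes three of the six use cases; the names RULES, has_field, and check_payload are hypothetical, and the server remains the authority on what is accepted.

```python
RULES = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "prompt_tune": {
        "required": ["asset.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request", "base_model_id"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_spec"],
        "forbidden": ["hardware_request", "base_model_id", "base_deployment_id"],
    },
}


def has_field(payload: dict, dotted: str) -> bool:
    """Return True if a dotted path such as 'asset.id' exists in payload."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_payload(use_case: str, payload: dict) -> list:
    """List rule violations for the given use case; an empty list means none found."""
    rules = RULES[use_case]
    problems = [f"missing {f}" for f in rules["required"] if not has_field(payload, f)]
    problems += [f"forbidden {f}" for f in rules["forbidden"] if has_field(payload, f)]
    return problems


prompt_tune = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "asset": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
    "online": {},
}
print(check_payload("prompt_tune", prompt_tune))
```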

Response

A deployment resource.

Status Code

  • Deployment created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "my_tuned_flan"
      },
      "entity": {
        "asset": {
          "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
        },
        "online": {
          "parameters": {
            "serving_name": "myflan",
            "foundation_model": {
              "max_batch_weight": 10000,
              "max_sequence_length": 8192
            }
          }
        },
        "deployed_asset_type": "custom_foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "my_tuned_flan_template"
      },
      "entity": {
        "base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {
          "parameters": {
            "serving_name": "myflan_template"
          }
        },
        "deployed_asset_type": "custom_foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan_template/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan_template/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }

Retrieve the deployments

Retrieve the list of deployments for the specified space or project.

GET /ml/v4/deployments

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Retrieves the deployment, if any, that contains this serving_name.

    Example: classification

  • Retrieves only the resources with the given tag value.

  • Retrieves only the resources with the given asset_id. The asset_id is the model id.

  • Retrieves only the resources with the given prompt_template_id.

  • Retrieves only the resources with the given name.

  • Retrieves the resources filtered by the given type. The value can be a deployment type, optionally combined with prompt_template to select only deployments that include a prompt template.

    The supported deployment types are (see the description for deployed_asset_type in the deployment entity):

    1. prompt_tune - when a prompt tuned model is deployed.
    2. foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
    3. custom_foundation_model - when a custom foundation model is deployed.

    These can be combined with the flag prompt_template like this:

    1. type=prompt_tune - return all prompt tuned model deployments.
    2. type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
    3. type=foundation_model - return all prompt template deployments.
    4. type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
    5. type=custom_foundation_model - return all custom model deployments.
    6. type=custom_foundation_model and prompt_template - return all custom model deployments with a prompt template.
    7. type=prompt_template - return all deployments with a prompt template.

  • Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.

  • Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.

    Default: false
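The query parameters above can be assembled with Python's standard library. The following is a minimal sketch, not an official client: the parameter names version, space_id, project_id, type, and state are assumed from the descriptions above, and cpd.example.com is a hypothetical cluster host.

```python
from urllib.parse import urlencode


def build_list_deployments_url(cluster_url, version, space_id=None,
                               project_id=None, **filters):
    """Build a GET /ml/v4/deployments URL with optional filter parameters."""
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")
    params = {"version": version}
    if space_id is not None:
        params["space_id"] = space_id
    if project_id is not None:
        params["project_id"] = project_id
    # Extra filters (for example type or state) are passed through as given;
    # their names are assumptions taken from the parameter descriptions.
    params.update({key: value for key, value in filters.items()
                   if value is not None})
    return f"https://{cluster_url}/ml/v4/deployments?{urlencode(params)}"


url = build_list_deployments_url(
    "cpd.example.com",  # hypothetical cluster host
    "2023-07-07",
    space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f",
    type="prompt_tune",
    state="ready",
)
print(url)
```

The function only builds the URL, so it can be checked without a running cluster. Send the request with any HTTP client and the bearer token described in the Authentication section.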

Response

The deployment resources.

Status Code

  • OK.

  • serving_name is available for use. Returned when serving_name and conflict query parameters are used.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

  • Returned when serving_name and conflict query parameters are used. The response body will contain the reason.

Example responses
  • {
      "limit": 10,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments"
      },
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "created_at": "2023-05-02T16:27:51Z",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "name": "text_classification",
            "description": "Classification prompt tuned model deployment",
            "tags": [
              "classification"
            ]
          },
          "entity": {
            "asset": {
              "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
            },
            "deployed_asset_type": "prompt_tune",
            "online": {},
            "base_model_id": "google/flan-t5-xl",
            "status": {
              "state": "ready",
              "message": {
                "level": "info",
                "text": "The deployment is successful"
              },
              "inference": [
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
                },
                {
                  "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
                  "sse": true
                }
              ]
            }
          }
        }
      ]
    }
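As an illustration of how a client might read this response, the sketch below collects the inference URLs for each deployment in the resources array. It assumes only the field names visible in the example response; the helper name inference_urls is hypothetical.

```python
def inference_urls(list_response):
    """Map each deployment name to the list of its inference URLs."""
    urls_by_name = {}
    for resource in list_response.get("resources", []):
        name = resource["metadata"]["name"]
        # status.inference may be absent while a deployment is not ready.
        endpoints = resource["entity"].get("status", {}).get("inference", [])
        urls_by_name[name] = [endpoint["url"] for endpoint in endpoints]
    return urls_by_name


# A trimmed copy of the example response above.
sample = {
    "limit": 10,
    "resources": [
        {
            "metadata": {"name": "text_classification"},
            "entity": {
                "status": {
                    "state": "ready",
                    "inference": [
                        {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"},
                        {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": True},
                    ],
                }
            },
        }
    ],
}
print(inference_urls(sample))
```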

Retrieve the deployment details

Retrieve the deployment details with the specified identifier.

GET /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.read

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

A deployment resource.

Status Code

  • Deployment details.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt tuned model deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "asset": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "prompt_tune",
        "base_model_id": "google/flan-ul2",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "description": "Classification prompt template deployment",
        "tags": [
          "classification"
        ]
      },
      "entity": {
        "prompt_template": {
          "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
        },
        "online": {},
        "deployed_asset_type": "foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream",
              "sse": true
            }
          ]
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "my_tuned_flan"
      },
      "entity": {
        "asset": {
          "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
        },
        "hardware_spec": {
          "id": "WX-S",
          "num_nodes": 1
        },
        "online": {
          "parameters": {
            "serving_name": "myflan",
            "foundation_model": {
              "max_batch_weight": 10000,
              "max_sequence_length": 8192
            }
          }
        },
        "deployed_asset_type": "custom_foundation_model",
        "base_model_id": "google/flan-t5-xl",
        "status": {
          "state": "ready",
          "message": {
            "level": "info",
            "text": "The deployment is successful"
          },
          "inference": [
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation",
              "uses_serving_name": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
              "sse": true
            },
            {
              "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream",
              "sse": true,
              "uses_serving_name": true
            }
          ]
        }
      }
    }
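The inference array can hold up to four endpoints per deployment, distinguished by the sse and uses_serving_name flags. The sketch below shows one way to select the right one. The helper name pick_inference_url is hypothetical, and a missing flag is treated as false, which matches the example responses but is otherwise an assumption.

```python
def pick_inference_url(deployment, stream=False, by_serving_name=False):
    """Return the inference URL matching the requested flags, or None."""
    for endpoint in deployment["entity"]["status"]["inference"]:
        is_stream = bool(endpoint.get("sse", False))
        uses_name = bool(endpoint.get("uses_serving_name", False))
        if is_stream == stream and uses_name == by_serving_name:
            return endpoint["url"]
    return None


# The inference section of the custom foundation model example above.
deployment = {
    "entity": {
        "status": {
            "inference": [
                {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
                {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": True},
                {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": True},
                {"url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": True, "uses_serving_name": True},
            ]
        }
    }
}
print(pick_inference_url(deployment, stream=True, by_serving_name=True))
```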

Update the deployment metadata

Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.

  • /name
  • /description
  • /tags
  • /custom
  • /online/parameters
  • /asset - replace only
  • /prompt_template - replace only
  • /hardware_spec
  • /hardware_request
  • /base_model_id - replace only (applicable only to prompt template deployments that refer to IBM base foundation models). Since Cloud Pak for Data 5.0.3.

The PATCH operation with path specified as /online/parameters can be used to update the serving_name.

PATCH /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.update

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

The JSON patch.
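This reference does not show a sample patch body. The sketch below assumes that the body follows the standard JSON Patch form (RFC 6902), that is, an array of operations with op, path, and value. The helper name patch_op and the example values are hypothetical. The replace-only check mirrors the list of supported paths above.

```python
import json

# Paths that the list above marks as "replace only".
REPLACE_ONLY = {"/asset", "/prompt_template", "/base_model_id"}


def patch_op(op, path, value=None):
    """Build one JSON Patch operation, rejecting unsupported combinations."""
    if path in REPLACE_ONLY and op != "replace":
        raise ValueError(f"{path} supports only the replace operation")
    operation = {"op": op, "path": path}
    if op != "remove":  # a remove operation carries no value
        operation["value"] = value
    return operation


patch_body = [
    patch_op("replace", "/name", "my_tuned_flan_v2"),
    # Updating /online/parameters is how the serving_name is changed.
    patch_op("replace", "/online/parameters", {"serving_name": "myflan_v2"}),
]
print(json.dumps(patch_body, indent=2))
```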

Response

A deployment resource.

Status Code

  • Deployment accepted

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Delete the deployment

Delete the deployment with the specified identifier.

DELETE /ml/v4/deployments/{deployment_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.deployment.delete

Request

Path Parameters

  • The deployment id.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Status Code

  • Deployment deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned.

POST /ml/v1/deployments/{id_or_name}/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

A prompt tune request.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "how far is paris from bangalore:\n",
  "parameters": {
    "max_new_tokens": 100
  }
}

A prompt tune request with moderations.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}

A prompt template request.

{
  "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "how far is paris from bangalore:\n",
  "parameters": {
    "max_new_tokens": 100
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'
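The same request can be prepared from Python. This is a minimal sketch using only the standard library, under the assumption that the headers and body match the curl examples above. The cluster host, deployment name, and token are placeholders, and the helper name build_generation_request is hypothetical.

```python
import json
from urllib.request import Request


def build_generation_request(cluster_url, id_or_name, token, body,
                             version="2023-05-02", stream=False):
    """Prepare (but do not send) a POST to the deployment inference endpoint."""
    operation = "text/generation_stream" if stream else "text/generation"
    url = (f"https://{cluster_url}/ml/v1/deployments/{id_or_name}/"
           f"{operation}?version={version}")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    # json.dumps always emits valid JSON, so no trailing commas can slip in.
    data = json.dumps(body).encode("utf-8")
    return Request(url, data=data, headers=headers, method="POST")


request = build_generation_request(
    "cpd.example.com",   # hypothetical cluster host
    "myflan",            # deployment id or serving_name
    "ACCESS_TOKEN",      # placeholder for a real bearer token
    {
        "input": "how far is paris from bangalore:",
        "parameters": {"max_new_tokens": 100, "time_limit": 1000},
    },
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and decode the response body with json.loads.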

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details for a prompt tune.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details for a prompt tune with moderations.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
  • The generated text from the model along with other details for a prompt template.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
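When moderations are enabled, each result can carry a moderations object as in the second example. The sketch below lists the PII findings reported on the generated output. It relies only on the fields shown in that example; the helper name output_pii_findings is hypothetical.

```python
def output_pii_findings(result):
    """Return (entity, start, end, score) for PII found in generated output."""
    findings = result.get("moderations", {}).get("pii", [])
    return [
        (item["entity"], item["position"]["start"],
         item["position"]["end"], item["score"])
        for item in findings
        if not item.get("input", False)  # skip findings on the input text
    ]


# The moderations section of the example response above.
result = {
    "stop_reason": "eos_token",
    "moderations": {
        "pii": [
            {"score": 0.8, "input": False,
             "position": {"start": 74, "end": 88}, "entity": "PhoneNumber"},
            {"score": 0.8, "input": False,
             "position": {"start": 200, "end": 212}, "entity": "PhoneNumber"},
            {"score": 0.8, "input": False,
             "position": {"start": 244, "end": 259}, "entity": "EmailAddress"},
        ]
    },
}
for entity, start, end, score in output_pii_findings(result):
    print(entity, start, end, score)
```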

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events. If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.

Return options

Note that this operation currently has a limitation when using return_options: for input, only input_text is returned if requested; for output, input_tokens and generated_tokens are not returned, and neither are rank and top_tokens.

POST /ml/v1/deployments/{id_or_name}/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Path Parameters

  • The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.

    The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:
{
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "decoding_method": "sample",
    "temperature": 0.8,
    "max_new_tokens": 200
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      }
    }'
  • curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000,
        "prompt_variables": {
          "name": "joe",
          "count": 3
        }
      }
    }'

Response

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
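A client has to split the stream into data: lines and decode each one. The sketch below shows one way to do that. It assumes that each event payload has the same results and generated_text fields as the non-streaming response, which this section does not confirm, and the helper names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of every 'data:' line in the stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(lines):
    """Concatenate generated_text across all events (assumed field names)."""
    parts = []
    for event in iter_sse_events(lines):
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
    return "".join(parts)


# Hand-written stand-in for a streamed response body.
stream = [
    b'data: {"results": [{"generated_text": "Hello"}]}',
    b"",
    b'data: {"results": [{"generated_text": " world"}]}',
]
print(collect_text(stream))
```

Because an HTTP response object from urllib.request.urlopen can be iterated line by line, it can be passed to iter_sse_events in place of the list used here.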

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Create a fine tuning job

Create a fine tuning job that will fine tune an LLM.

Since Cloud Pak for Data 5.0.3.

POST /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

  • pm-20.fine-tuning.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The details of the fine tuning job with the data used to tune the LLM.

Response

The response of a fine tuning job.

Status Code

  • Created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Retrieve the list of fine tuning jobs

Retrieve the list of fine tuning jobs for the specified space or project.

Since Cloud Pak for Data 5.0.3.

GET /ml/v1/fine_tunings

Auditing

Calling this method generates the following auditing event.

  • pm-20.fine-tuning.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned.

    Possible values: value ≤ 200

    Default: 100

  • Compute the total count. May have performance impact.

  • Return only the resources with the given tag value.

  • Filter based on the job state: queued, running, completed, failed, etc.

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
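Because the pagination token is generated by the service, a client normally follows the href in the next field instead of building the token itself. The sketch below illustrates that loop. It assumes that next is an object with an href property, like the first field in the example responses elsewhere in this reference, and that the items are in a resources array. The fetch_page callable is a hypothetical stand-in for an authenticated GET that returns decoded JSON.

```python
def iter_resources(fetch_page, first_url):
    """Yield every resource, following next.href until no next page exists."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The service sets the pagination token inside next.href.
        url = (page.get("next") or {}).get("href")


# Fake pages keyed by URL, standing in for real HTTP responses.
pages = {
    "https://cpd.example.com/ml/v1/fine_tunings?page=1": {
        "resources": [{"id": "job-1"}, {"id": "job-2"}],
        "next": {"href": "https://cpd.example.com/ml/v1/fine_tunings?page=2"},
    },
    "https://cpd.example.com/ml/v1/fine_tunings?page=2": {
        "resources": [{"id": "job-3"}],
    },
}
jobs = list(iter_resources(pages.__getitem__,
                           "https://cpd.example.com/ml/v1/fine_tunings?page=1"))
print([job["id"] for job in jobs])
```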

Response

System details.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Get a fine tuning job

Get the results of a fine tuning job, or details if the job failed.

Since Cloud Pak for Data 5.0.3.

GET /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.fine-tuning.get

Request

Path Parameters

  • The id is the identifier that was returned in the metadata.id field when the job was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

The response of a fine tuning job.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Cancel or delete a fine tuning job

Delete a fine tuning job if it exists. Once deleted, all trace of the job is gone.

Since Cloud Pak for Data 5.0.3.

DELETE /ml/v1/fine_tunings/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.fine-tuning.delete

Request

Path Parameters

  • The id is the identifier that was returned in the metadata.id field when the job was created.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

Response

Status Code

  • Deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

List the available foundation models

Retrieve the list of deployed foundation models.

GET /ml/v1/foundation_model_specs

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • A set of filters to specify the list of models. Filters are described by the pattern shown below.

     pattern: tfilter[,tfilter][:(or|and)]
     tfilter: filter | !filter
       filter: Requires existence of the filter.
       !filter: Requires absence of the filter.
     filter: one of
       modelid_*:     Filters by model id.
                      Namely, select a model with a specific model id.
       provider_*:    Filters by provider.
                      Namely, select all models with a specific provider.
       source_*:      Filters by source.
                      Namely, select all models with a specific source.
       input_tier_*:  Filters by input tier.
                      Namely, select all models with a specific input tier.
       output_tier_*: Filters by output tier.
                      Namely, select all models with a specific output tier.
       tier_*:        Filters by tier.
                      Namely, select all models with a specific input or output tier.
       task_*:        Filters by task id.
                      Namely, select all models that support a specific task id.
       lifecycle_*:   Filters by lifecycle state.
                      Namely, select all models that are currently in the specified lifecycle state.
       function_*:    Filters by function. Since Cloud Pak for Data `5.0.0`.
                      Namely, select all models that support a specific function.
    

    Possible values: 1 ≤ length ≤ 1000, Value must match regular expression ^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$

    Example: modelid_ibm/granite-13b-instruct-v2

  • See all the Tech Preview models if entitled.

    Default: false
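The filter pattern above can be composed and checked on the client before a request is sent. The following sketch joins the filters with commas, prefixes excluded ones with !, and validates the result against the regular expression and length limits quoted above. The helper name build_filters is hypothetical, and the filter values task_code and provider_BigCode are illustrative, taken from the example response in this section.

```python
import re

# Regular expression and length limits quoted in the parameter description.
FILTER_PATTERN = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")


def build_filters(include=(), exclude=(), combine=None):
    """Compose a filters value such as 'task_code,!provider_BigCode:and'."""
    terms = list(include) + [f"!{name}" for name in exclude]
    value = ",".join(terms)
    if combine is not None:
        value += f":{combine}"
    if not 1 <= len(value) <= 1000 or not FILTER_PATTERN.match(value):
        raise ValueError(f"invalid filters value: {value!r}")
    return value


print(build_filters(["modelid_ibm/granite-13b-instruct-v2"]))
print(build_filters(["task_code"], exclude=["provider_BigCode"], combine="and"))
```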

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The models that are currently deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02"
      },
      "resources": [
        {
          "model_id": "bigcode/starcoder",
          "label": "starcoder-15.5b",
          "provider": "BigCode",
          "source": "Hugging Face",
          "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions",
          "tasks": [
            {
              "id": "code",
              "ratings": {
                "quality": 3
              }
            }
          ],
          "min_shot_size": 0,
          "input_tier": "class_2",
          "output_tier": "class_2",
          "number_params": "15.5b"
        }
      ]
    }

List the supported tasks

Retrieve the list of tasks that are supported by the foundation models.

GET /ml/v1/foundation_model_tasks

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

System details.

Status Code

  • OK

  • Bad request, the response body should contain the reason.

  • The specified resource was not found.

Example responses
  • The tasks that are currently supported by models deployed in the cluster.

    {
      "total_count": 1,
      "limit": 100,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02"
      },
      "resources": [
        {
          "task_id": "question_answering",
          "label": "Question answering",
          "rank": 1,
          "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance."
        }
      ]
    }

Create a new notebook.

Create a new notebook, either from scratch or by copying another notebook.

To create a notebook from scratch, first upload the notebook content (ipynb format) to your project or space storage by using the Assets-files API, and then reference it with the attribute file_reference. The other required attributes are name, runtime, and either project or space. The attribute runtime specifies the environment on which the notebook runs.

To copy a notebook, you only need to provide name and source_guid in the request body.

POST /v2/notebooks

Request

Specification of the notebook to be created.

Example:

Create a notebook from scratch in a project

{
  "name": "my notebook",
  "description": "this is my notebook",
  "project": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
  "file_reference": "notebook/my_notebook.ipynb",
  "runtime": {
    "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    "spark_monitoring_enabled": true
  }
}
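
As a rough sketch, the two request bodies described above (from scratch and by copying) could be built as follows. The function names are hypothetical, and treating project and space as mutually exclusive is this sketch's reading of "either project or space".

```python
def notebook_from_scratch(name, file_reference, environment,
                          project=None, space=None, description=None):
    """Body for creating a notebook from an uploaded ipynb file."""
    # Assumption: exactly one of project or space is supplied.
    if (project is None) == (space is None):
        raise ValueError("specify exactly one of project or space")
    body = {
        "name": name,
        "file_reference": file_reference,
        "runtime": {"environment": environment},
    }
    if description is not None:
        body["description"] = description
    if project is not None:
        body["project"] = project
    else:
        body["space"] = space
    return body


def notebook_copy(name, source_guid):
    """Body for copying an existing notebook: only name and source_guid."""
    return {"name": name, "source_guid": source_guid}
```

Serializing either dictionary with json.dumps gives a body that can be sent with POST /v2/notebooks.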

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Created and returned a new notebook asset. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

  • The number of requests has exceeded the rate limit.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "space_id": "92ae0e27-9b11-4de9-a646-d46ca3c183d4"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-92ae0e27-9b11-4de9-a646-d46ca3c183d4",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?space_id=92ae0e27-9b11-4de9-a646-d46ca3c183d4"
      }
    }
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python3",
            "language": "python3"
          },
          "originates_from": {
            "type": "notebook",
            "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "rate_limit",
          "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later."
        }
      ]
    }
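
The error responses above share one envelope: a trace ID plus a list of errors, each with a code, a message, and an optional target. A minimal sketch of turning that envelope into log lines, assuming the body is JSON text (the helper name is hypothetical):

```python
import json


def summarize_errors(body):
    """Return one readable line per entry in the `errors` array."""
    payload = json.loads(body)
    trace = payload.get("trace", "unknown")
    lines = []
    for err in payload.get("errors", []):
        target = err.get("target")
        # `target` is absent in some of the example responses.
        where = f" ({target.get('type')} {target.get('name')})" if target else ""
        lines.append(f"[{trace}] {err.get('code')}: {err.get('message')}{where}")
    return lines
```

Keeping the trace value in every line makes it easier to correlate a client-side failure with a specific request.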

Retrieve the details of a large number of notebooks inside a project.

POST /v2/notebooks/list

Request

Query Parameters

  • The guid of the project.

  • Additional info that will be included into the notebook details. Possible values are:

    • runtime

Payload for a notebook list request.

Examples:

List notebooks

{
  "notebooks": [
    "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
  ]
}
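
A sketch of preparing this request with the Python standard library follows. The query parameter names project_id and include are assumptions, because this reference describes the parameters without naming them; the base URL and token are placeholders. The request object is only built here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_list_request(base_url, token, project_id, notebook_guids,
                       include_runtime=False):
    """Prepare a POST /v2/notebooks/list request without sending it."""
    # Assumption: the project guid is passed as `project_id`.
    query = {"project_id": project_id}
    if include_runtime:
        # Assumption: additional info is requested with `include=runtime`.
        query["include"] = "runtime"
    url = f"{base_url.rstrip('/')}/v2/notebooks/list?{urlencode(query)}"
    body = json.dumps({"notebooks": list(notebook_guids)}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out so the sketch can be inspected without a cluster.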

Response

A list of notebook info as returned by a list query.

Status Code

  • Success. Returned a list of notebook assets. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "41d09a9a-f771-48a2-9534-50c0c622356d",
            "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d"
          },
          "entity": {
            "runtime": {
              "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "spark_monitoring_enabled": true
            },
            "asset": {
              "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
              "asset_type": "notebook",
              "created_at": "2021-07-01T12:37:01Z",
              "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
              "version": 2,
              "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
              "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
            }
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Delete a particular notebook, including the notebook asset.

DELETE /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Response

Status Code

  • Successful request. Notebook is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Revert the main notebook to a version.

PUT /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the main notebook.

Payload for a request to revert to a specific notebook version.

Examples:

Revert the notebook to a version

{
  "source": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
}

Response

Notebook information in a project as returned by a GET request.

Status Code

  • Success. Reverted the main notebook to a version. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook v4.2",
        "description": "this is my notebook v4.2",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
          "spark_monitoring_enabled": true
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Update a particular notebook.

PATCH /v2/notebooks/{notebook_guid}

Request

Path Parameters

  • The guid of the notebook.

Payload for a notebook update request.

Examples:

Update a notebook

{
  "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
  "spark_monitoring_enabled": false,
  "kernel": {
    "display_name": "Python 3.9 with Spark",
    "name": "python39",
    "language": "python3"
  }
}
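
The update payload above could be assembled as in this sketch, which includes only the fields the caller sets. Whether the service accepts a partial body is an assumption, and the function name is hypothetical.

```python
def build_update_body(environment=None, spark_monitoring_enabled=None, kernel=None):
    """Collect the fields to change on a notebook into a PATCH body."""
    body = {}
    if environment is not None:
        body["environment"] = environment
    if spark_monitoring_enabled is not None:
        body["spark_monitoring_enabled"] = bool(spark_monitoring_enabled)
    if kernel is not None:
        # The example kernel object has display_name, name, and language.
        body["kernel"] = dict(kernel)
    if not body:
        raise ValueError("nothing to update")
    return body
```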

Response

Notebook information as returned by a GET request.

Status Code

  • Success. Updated the notebook. Format follows v2/assets.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "name": "my notebook",
        "description": "this is my notebook",
        "asset_type": "notebook",
        "created": 1540471021134,
        "created_at": "2021-07-01T12:37:01Z",
        "owner_id": "IBMid-310000SG2Y",
        "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9",
        "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d",
        "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      },
      "entity": {
        "notebook": {
          "kernel": {
            "display_name": "Python 3.9 with Spark",
            "name": "python39",
            "language": "python3"
          },
          "originates_from": {
            "type": "blank"
          }
        },
        "runtime": {
          "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
          "spark_monitoring_enabled": false
        },
        "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf"
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Promote a notebook from project to space.

POST /v2/notebooks/{notebook_guid}/promote

Request

Path Parameters

  • The guid of the notebook.

Query Parameters

  • The guid of the notebook version.

  • The id of the project from which a notebook will be promoted.

Body parameters for promoting a notebook. space_id is required; name and description are optional. If they are not specified, the name and description of the source notebook in the project are used.
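
A minimal sketch of the promote body and path under the rules above: space_id is always present, and name and description are included only when supplied. The helper names are hypothetical.

```python
from urllib.parse import quote


def promote_path(notebook_guid):
    """Path of the promote endpoint for one notebook."""
    return f"/v2/notebooks/{quote(notebook_guid, safe='')}/promote"


def build_promote_body(space_id, name=None, description=None):
    """Body for promoting a notebook from a project to a space."""
    if not space_id:
        raise ValueError("space_id is required")
    body = {"space_id": space_id}
    # Omitted fields are expected to fall back to the source notebook's values.
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    return body
```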

Response

Notebook information in a space as returned by promoting a notebook from project to space.

Status Code

  • Success. Returned the notebook asset in the space.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Create a new version.

Create a version of a given notebook.

POST /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A notebook version in a project.

Status Code

  • Success. Returned the notebook version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

List the versions of a notebook.

List all versions of a particular notebook.

GET /v2/notebooks/{notebook_guid}/versions

Request

Path Parameters

  • The guid of the notebook.

Response

A list of notebook versions in a project.

Status Code

  • Success. Returned a list of versions of the notebook.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
            "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
            "created_at": 1543681714106
          },
          "entity": {
            "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
            "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
            "created_by_iui": "IBMid-123456ABCD",
            "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
            "rev_id": 1
          }
        }
      ]
    }
  • {
      "total_results": 1,
      "resources": [
        {
          "metadata": {
            "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
            "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
            "created_at": 1543681714106
          },
          "entity": {
            "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
            "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
            "created_by_iui": "IBMid-123456ABCD",
            "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
            "rev_id": 1
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }
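
The version listings above can be flattened for display as in the following sketch. It assumes that created_at in the version metadata is a Unix timestamp in milliseconds, which the magnitude of the example value suggests but this reference does not state; the function name is hypothetical.

```python
from datetime import datetime, timezone


def version_summaries(listing):
    """Return (guid, rev_id, scope, created) tuples for each version."""
    rows = []
    for item in listing.get("resources", []):
        meta = item["metadata"]
        entity = item["entity"]
        # Assumption: created_at is epoch milliseconds.
        created = datetime.fromtimestamp(meta["created_at"] / 1000, tz=timezone.utc)
        # A version belongs to a project or a space, as in the two examples.
        scope = "project" if "project_id" in entity else "space"
        rows.append((meta["guid"], entity["rev_id"], scope, created))
    return rows
```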

Retrieve a notebook version.

Retrieve a particular version of a notebook.

GET /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

A notebook version in a project.

Status Code

  • Success. Returned the version definition.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "metadata": {
        "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
        "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142",
        "created_at": 1543681714106
      },
      "entity": {
        "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
        "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964",
        "created_by_iui": "IBMid-123456ABCD",
        "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb",
        "rev_id": 1
      }
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Delete a notebook version.

Delete a particular version of a given notebook.

DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}

Request

Path Parameters

  • The guid of the notebook.

  • The guid of the version.

Response

Status Code

  • Success. The version is deleted.

  • Bad request. One of the fields has invalid format/content.

  • Unauthorized. No/Malformed authentication provided.

  • Forbidden. User is not allowed to perform the target operation.

Example responses
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_type",
          "message": "The `project` field needs to be a uuid v4, but is 12345.",
          "target": {
            "type": "field",
            "name": "project"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "invalid_auth_token",
          "message": "The IAM bearer token is not valid.",
          "target": {
            "type": "header",
            "name": "Authentication"
          }
        }
      ]
    }
  • {
      "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
      "errors": [
        {
          "code": "endpoint_access_forbidden",
          "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID."
        }
      ]
    }

Create a new prompt / prompt template

This creates a new prompt with the provided parameters.

POST /v1/prompts

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*
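
The target rule above (one target per request, each ID matching [a-zA-Z0-9-]*) can be checked on the client before a call is made. This is a sketch only: the query parameter names project_id and space_id are assumptions, and rejecting a request that supplies both targets is this sketch's reading of "one target".

```python
import re
from urllib.parse import urlencode

# Pattern quoted from the parameter description.
_ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def prompt_target_query(project_id=None, space_id=None):
    """Return the query string that selects a project or a space."""
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    if project_id is not None:
        key, value = "project_id", project_id
    else:
        key, value = "space_id", space_id
    if not _ID_PATTERN.fullmatch(value):
        raise ValueError(f"{key} contains characters outside [a-zA-Z0-9-]")
    return urlencode({key: value})
```

The same query string applies to the other prompt endpoints in this section, which take the same pair of target parameters.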

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt

This retrieves a prompt / prompt template with the given id.

GET /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Only return a set of model parameters compatible with inferencing.

    Default: true

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt

This updates a prompt / prompt template with the given id.

PATCH /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt

This deletes a prompt / prompt template with the given id.

DELETE /v1/prompts/{prompt_id}

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt lock modifications

Modifies the current locked state of a prompt.

PUT /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • OK - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt lock status

Retrieves the current locked state of a prompt.

GET /v1/prompts/{prompt_id}/lock

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get the inference input string for a given prompt

Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.

POST /v1/prompts/{prompt_id}/input

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new chat item to a prompt

This adds new chat items to the given prompt.

POST /v1/prompts/{prompt_id}/chat_items

Request

Path Parameters

  • Prompt ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the space ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Create a new prompt session

This creates a new prompt session.

POST /v1/prompt_sessions

Request

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt session

This retrieves a prompt session with the given id.

GET /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Include the most recent entry

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Update a prompt session

This updates a prompt session with the given id.

PATCH /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned when the update succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt session

This deletes a prompt session with the given id.

DELETE /v1/prompt_sessions/{session_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Add a new prompt to a prompt session

This creates a new prompt associated with the given session.

POST /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get entries for a prompt session

List entries from a given session.

GET /v1/prompt_sessions/{session_id}/entries

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Bookmark from a previously limited get request

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Limit for results to retrieve, default 20

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Success - Returned when search completes

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
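
Listing entries is paged: a request can carry a limit (default 20) and a bookmark from a previous, limited response. The sketch below shows one way to build the URL for a page. The query parameter names project_id, limit, and bookmark, the /wx base path, and the function name are assumptions; this section describes the parameters but does not name them.

```python
from urllib.parse import urlencode


def entries_url(cluster_url, session_id, project_id, limit=20, bookmark=None):
    """Build the list-entries URL for one page of results."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # Assumed parameter names; only their meaning is documented here.
    query = {"project_id": project_id, "limit": limit}
    if bookmark is not None:
        # Pass back the bookmark returned by the previous, limited request.
        query["bookmark"] = bookmark
    return (
        f"https://{cluster_url}/wx/v1/prompt_sessions/{session_id}/entries"
        f"?{urlencode(query)}"
    )
```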

Add a new chat item to a prompt session entry

This adds new chat items to the given entry.

POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Created - Returned when created

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Prompt session lock modifications

Modifies the current locked state of a prompt session.

PUT /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Override a lock if it is currently taken.

Response

Status Code

  • Ok - Returned when lock change is successful

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get current prompt session lock status

Retrieves the current locked state of a prompt session.

GET /v1/prompt_sessions/{session_id}/lock

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • Ok - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Get a prompt session entry

This retrieves a prompt session entry with the given id.

GET /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • OK - Returned from GET when it succeeds

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.

Delete a prompt session entry

This deletes a prompt session entry with the given id.

DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}

Request

Path Parameters

  • Prompt Session ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

  • Prompt Session Entry ID

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Query Parameters

  • [REQUIRED] Specifies the project ID as the target. One target must be supplied per request.

    Possible values: Value must match regular expression [a-zA-Z0-9-]*

Response

Status Code

  • No Content - Returned on success

  • Bad Request - Returned when the request parameters are invalid

  • Unauthorized - Returned when caller does not have a valid authorization token, or it is missing

No Sample Response

This method does not specify any sample responses.
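
The prompt session operations in this section share a small set of path templates. As an illustrative sketch, they can be collected in one table and resolved from IDs. The operation names used as dictionary keys and the resolve helper are hypothetical; the methods and paths are copied from the headings in this section.

```python
# Method and path for each prompt session operation documented in this section.
SESSION_ROUTES = {
    "create_session": ("POST", "/v1/prompt_sessions"),
    "get_session": ("GET", "/v1/prompt_sessions/{session_id}"),
    "update_session": ("PATCH", "/v1/prompt_sessions/{session_id}"),
    "delete_session": ("DELETE", "/v1/prompt_sessions/{session_id}"),
    "add_entry": ("POST", "/v1/prompt_sessions/{session_id}/entries"),
    "list_entries": ("GET", "/v1/prompt_sessions/{session_id}/entries"),
    "add_chat_item": (
        "POST",
        "/v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items",
    ),
    "set_lock": ("PUT", "/v1/prompt_sessions/{session_id}/lock"),
    "get_lock": ("GET", "/v1/prompt_sessions/{session_id}/lock"),
    "get_entry": ("GET", "/v1/prompt_sessions/{session_id}/entries/{entry_id}"),
    "delete_entry": ("DELETE", "/v1/prompt_sessions/{session_id}/entries/{entry_id}"),
}


def resolve(operation, **ids):
    """Return (HTTP method, path) for an operation, filling in the path IDs."""
    method, template = SESSION_ROUTES[operation]
    # str.format raises KeyError if a required ID such as entry_id is missing.
    return method, template.format(**ids)
```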

Infer text

Infer the next tokens for a given deployed model with a set of parameters.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/text/chat

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-chat.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

text_chat

A text chat example.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Who won the world series in 2020?"
    },
    {
      "role": "assistant",
      "content": "The Los Angeles Dodgers won the World Series in 2020."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Where was it played?"
        }
      ]
    }
  ],
  "max_tokens": 100,
  "temperature": 0,
  "time_limit": 1000
}

tool_call

A tool calling example.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What is the weather like in Boston today?"
        }
      ]
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "description": "The city, e.g. San Francisco, CA",
              "type": "string"
            },
            "unit": {
              "enum": [
                "celsius",
                "fahrenheit"
              ],
              "type": "string"
            }
          },
          "required": [
            "location"
          ]
        }
      }
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "get_current_weather",
      "description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
    }
  }
}

json_mode

A text chat example with json output.

{
  "model_id": "meta-llama/llama-3-8b-instruct",
  "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
  "response_format": {
    "type": "json_object"
  },
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant designed to output JSON."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Who won the world series in 2020?"
        }
      ]
    }
  ]
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Who won the world series in 2020?"
            }
          ]
        },
        {
          "role": "assistant",
          "content": "The Los Angeles Dodgers won the World Series in 2020."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Where was it played?"
            }
          ]
        }
      ],
      "max_tokens": 100,
      "temperature": 0,
      "time_limit": 1000
    }'
    
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "What is the weather like in Boston today?"
            }
          ]
        }
      ],
      "tools": [
        {
          "type": "function",
          "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "description": "The city, e.g. San Francisco, CA",
                  "type": "string"
                },
                "unit": {
                  "enum": [
                    "celsius",
                    "fahrenheit"
                  ],
                  "type": "string"
                }
              },
              "required": [
                "location"
              ]
            }
          }
        }
      ],
      "tool_choice": {
        "type": "function",
        "function": {
          "name": "get_current_weather"
        }
      }
    }'
    
  • curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "meta-llama/llama-3-8b-instruct",
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
      "response_format": {
        "type": "json_object"
      },
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant designed to output JSON."
        },
        {
          "role": "user",
          "content": [
            {
              "type": "text",
              "text": "Who won the world series in 2020?"
            }
          ]
        }
      ]
    }'
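
For callers who prefer Python to curl, the same kind of request can be assembled with the standard library. This is a sketch only: the host cpd.example.com, the placeholder token, and the function name build_chat_request are assumptions, and the request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(cluster_url, token, payload, version="2023-10-25"):
    """Return an unsent POST request for /ml/v1/text/chat."""
    return urllib.request.Request(
        url=f"https://{cluster_url}/ml/v1/text/chat?version={version}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


payload = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [{"type": "text", "text": "Who won the world series in 2020?"}],
        },
    ],
    "max_tokens": 100,
}

# Placeholder host and token; replace both before sending.
request = build_chat_request("cpd.example.com", "<token>", payload)
# To send the request: urllib.request.urlopen(request)
```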
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • A text chat example.

    {
      "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 47,
        "prompt_tokens": 59,
        "total_tokens": 106
      }
    }
  • A tool calling example.

    {
      "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "tool_calls": [
              {
                "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                "type": "function",
                "function": {
                  "name": "get_current_weather",
                  "arguments": "{\n  \"location\": \"Boston, MA\",\n  \"unit\": \"fahrenheit\"\n}\n"
                }
              }
            ]
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 18,
        "prompt_tokens": 19,
        "total_tokens": 37
      }
    }
  • A text chat example with json output.

    {
      "id": "cmpl-09945b25c805491fb49e15439b8e5d84",
      "model_id": "meta-llama/llama-3-8b-instruct",
      "created": 1689958352,
      "created_at": "2023-07-21T16:52:32.190Z",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "completion_tokens": 35,
        "prompt_tokens": 20,
        "total_tokens": 55
      }
    }
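
In the tool calling response, the function arguments arrive as a JSON document encoded inside a string, so they need a second decoding step. A minimal sketch of that step follows; the function name extract_tool_calls is an assumption, and the sample is trimmed from the tool calling example response.

```python
import json


def extract_tool_calls(response):
    """Yield (function name, decoded arguments) for each tool call in a chat response."""
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls", []):
            function = call["function"]
            # "arguments" is itself JSON text, so decode it separately.
            yield function["name"], json.loads(function["arguments"])


# Trimmed from the tool calling example response.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": '{\n  "location": "Boston, MA",\n  "unit": "fahrenheit"\n}\n',
                        },
                    }
                ],
            },
            "finish_reason": "stop",
        }
    ]
}

calls = list(extract_tool_calls(sample))
```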

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/text/chat_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-chat.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Response

A set of server-sent events, where each event contains a response for one or more tokens. The results are a sequence of events of the form data: {<json event>}, where the schema of each individual JSON event is described below.

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
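
Because the stream consists of events of the form data: {<json event>}, a client can decode it line by line. The sketch below assumes each data line carries one complete JSON document; the function name iter_sse_json is hypothetical.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of each `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            # Skip blank separators, comments, and other SSE fields.
            continue
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)
```

The function accepts any iterable of text or byte lines, such as the response object returned by urllib.request.urlopen.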

Generate embeddings

Generate embeddings from text input.

See the documentation for a description of text embeddings.

Since watsonx.ai 2.0.0.

POST /ml/v1/text/embeddings

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-embeddings.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The text input for a given model to be used to generate the embeddings.

Examples:

A sample request.

A simple request.

{
  "model_id": "slate",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    "Youth craves thrills while adulthood cherishes wisdom."
  ]
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "inputs": [
        "Youth craves thrills while adulthood cherishes wisdom.",
        "Youth seeks ambition while adulthood finds contentment.",
        "Dreams chased in youth while goals pursued in adulthood."
      ],
      "model_id": "slate",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f"
    }'
    

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • An array of embeddings for each input string.

    {
      "model_id": "slate",
      "results": [
        {
          "embedding": [
            -0.006929283,
            -0.005336422,
            -0.024047505
          ]
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 10
    }
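
A common next step with an embeddings response is to compare the returned vectors. The following sketch pulls the vectors out of a response shaped like the example and computes cosine similarity; both helper names are assumptions, and cosine similarity is a generic technique rather than something this API prescribes.

```python
import math


def embeddings_from(response):
    """Return the embedding vectors from a response, in input order."""
    return [item["embedding"] for item in response["results"]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```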

Start a text extraction request

Start a request to extract text and metadata from documents.

See the documentation for a description of text extraction.

Since Cloud Pak for Data 5.1.0.

POST /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input for the text extraction request.

Examples:

simple_request

A simple request.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "tables_processing": {
      "enabled": true
    }
  }
}

ocr_request

A request that specifies OCR languages.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "document_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
    },
    "location": {
      "file_name": "files/document.pdf"
    }
  },
  "results_reference": {
    "type": "connection_asset",
    "connection": {
      "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
    },
    "location": {
      "file_name": "results"
    }
  },
  "steps": {
    "ocr": {
      "languages_list": [
        "en",
        "fr"
      ]
    },
    "tables_processing": {
      "enabled": false
    }
  }
}

Response

Status Code

  • Created. The Content-Location header will contain the URI reference to the created resource.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "tables_processing": {
            "enabled": true
          }
        },
        "results": {
          "status": "submitted",
          "number_pages_processed": 0
        }
      }
    }
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "ocr": {
            "languages_list": [
              "en",
              "fr"
            ]
          },
          "tables_processing": {
            "enabled": false
          }
        },
        "results": {
          "status": "submitted",
          "number_pages_processed": 0
        }
      }
    }
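
The two request examples differ only in their steps object, so the body can be generated by a small helper. This is a sketch under assumptions: the function and argument names are hypothetical, while the field names are copied from the examples.

```python
def extraction_payload(project_id, source_connection_id, source_file,
                       results_connection_id, results_location,
                       tables=True, ocr_languages=None):
    """Build a text extraction request body shaped like the examples."""

    def reference(connection_id, file_name):
        # Both the document and the results references use this structure.
        return {
            "type": "connection_asset",
            "connection": {"id": connection_id},
            "location": {"file_name": file_name},
        }

    steps = {"tables_processing": {"enabled": tables}}
    if ocr_languages:
        steps["ocr"] = {"languages_list": list(ocr_languages)}
    return {
        "project_id": project_id,
        "document_reference": reference(source_connection_id, source_file),
        "results_reference": reference(results_connection_id, results_location),
        "steps": steps,
    }
```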

Retrieve the text extraction requests

Retrieve the list of text extraction requests for the specified space or project.

This operation does not save the history; any requests that were deleted or purged do not appear in this list.

Since Cloud Pak for Data 5.1.0.

GET /ml/v1/text/extractions

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.

  • How many resources should be returned. By default limit is 100. Max limit allowed is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

Response

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "limit": 10,
      "first": {
        "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text_extractions"
      },
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "created_at": "2023-05-02T16:27:51Z",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "name": "extract"
          },
          "entity": {
            "document_reference": {
              "type": "connection_asset",
              "connection": {
                "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
              },
              "location": {
                "file_name": "files/document.pdf"
              }
            },
            "results_reference": {
              "type": "connection_asset",
              "connection": {
                "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
              },
              "location": {
                "file_name": "results"
              }
            },
            "results": {
              "status": "completed",
              "number_pages_processed": 3,
              "running_at": "2023-05-02T16:28:03Z",
              "completed_at": "2023-05-02T16:29:31Z"
            }
          }
        }
      ]
    }
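
Because the pagination token is generated by the service and delivered in the href of the next field, a client can page through results by following that link. The sketch below assumes the response carries next.href while more pages exist and omits it on the last page; fetch_page is a hypothetical caller-supplied function that performs the authenticated GET and returns the decoded JSON.

```python
def iter_resources(fetch_page, first_url):
    """Yield every resource, following the `next.href` link until it is absent."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # Assumption: the last page has no "next" field (or an empty one).
        url = (page.get("next") or {}).get("href")
```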

Get the results of the request

Retrieve the text extraction request with the specified identifier.

Note that there is a retention period of 2 days. If this retention period is exceeded, the request is deleted and the results are no longer available. In this case, this operation returns 404.

Since Cloud Pak for Data 5.1.0.

GET /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.get

Request

Path Parameters

  • The identifier of the extraction request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "created_at": "2023-05-02T16:27:51Z",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "extract"
      },
      "entity": {
        "document_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
          },
          "location": {
            "file_name": "files/document.pdf"
          }
        },
        "results_reference": {
          "type": "connection_asset",
          "connection": {
            "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
          },
          "location": {
            "file_name": "results"
          }
        },
        "steps": {
          "tables_processing": {
            "enabled": true
          }
        },
        "results": {
          "status": "running",
          "number_pages_processed": 2,
          "running_at": "2023-05-02T16:28:03Z"
        }
      }
    }
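
The following is a minimal sketch of how a client might interpret the outcome of this call, assuming the response body has the shape of the example above. The helper name `describe_extraction` and the strings it returns are hypothetical and are not part of the API.

```python
from typing import Optional


def describe_extraction(status_code: int, body: Optional[dict]) -> str:
    """Summarize the outcome of GET /ml/v1/text/extractions/{id} (hypothetical helper)."""
    if status_code == 404:
        # After the 2-day retention period the request is deleted, so a 404
        # can mean "expired" as well as "never existed".
        return "not found (unknown id or retention period exceeded)"
    if status_code != 200 or body is None:
        return f"error (HTTP {status_code})"
    results = body["entity"]["results"]
    pages = results.get("number_pages_processed", 0)
    return f"{results['status']} ({pages} pages processed)"


# Trimmed copy of the example response above.
example = {"entity": {"results": {"status": "running", "number_pages_processed": 2}}}
print(describe_extraction(200, example))
print(describe_extraction(404, None))
```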

Delete the request

Cancel the specified text extraction request and delete any associated results.

Since Cloud Pak for Data 5.1.0.

DELETE /ml/v1/text/extractions/{id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-extraction.delete

Request

Path Parameters

  • The identifier of the extraction request.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

Response

Status Code

  • Request deleted.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.

Infer text

Infer the next tokens for a given deployed model with a set of parameters.

POST /ml/v1/text/generation

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens.

Examples:

A request without moderations.

A simple request.

{
  "model_id": "google/flan-ul2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "temperature": 0.8,
    "max_new_tokens": 30
  }
}

A request with moderations.

A simple request with moderations.

{
  "model_id": "google/flan-t5-xl",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'
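
For readers who prefer Python, the same request could be assembled roughly as follows. This is a sketch that uses only the standard library and builds the request without sending it. The helper name `build_generation_request`, the host `cluster.example.com`, and the token `my-token` are placeholders, not values from this reference.

```python
import json
import urllib.parse
import urllib.request


def build_generation_request(cluster_url: str, token: str, payload: dict,
                             version: str = "2023-05-02") -> urllib.request.Request:
    """Build (but do not send) a POST request for /ml/v1/text/generation."""
    query = urllib.parse.urlencode({"version": version})
    url = f"https://{cluster_url}/ml/v1/text/generation?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


payload = {
    "model_id": "google/flan-t5-xxl",
    "input": "how far is paris from bangalore:",
    "parameters": {"max_new_tokens": 100, "time_limit": 1000},
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
}
# Placeholder host and token; substitute your own cluster and bearer token.
request = build_generation_request("cluster.example.com", "my-token", payload)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes the URL, headers, and body easy to inspect first.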

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-ul2",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "4,000 km",
          "generated_token_count": 4,
          "input_token_count": 12,
          "stop_reason": "eos_token"
        }
      ]
    }
  • The generated text from the model along with other details.

    {
      "model_id": "google/flan-t5-xl",
      "created_at": "2023-07-21T16:52:32.190Z",
      "results": [
        {
          "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.",
          "generated_token_count": 118,
          "input_token_count": 11,
          "stop_reason": "eos_token",
          "moderations": {
            "pii": [
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 74,
                  "end": 88
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 200,
                  "end": 212
                },
                "entity": "PhoneNumber"
              },
              {
                "score": 0.8,
                "input": false,
                "position": {
                  "start": 244,
                  "end": 259
                },
                "entity": "EmailAddress"
              }
            ]
          }
        }
      ]
    }
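
A short sketch of how the `moderations` entries might be mapped back onto the generated text. It assumes that `position.start` and `position.end` are character offsets into `generated_text` with an exclusive end, which appears consistent with the example above, where the first span begins right after "or call ". The helper name `pii_spans` and the sample data are hypothetical.

```python
def pii_spans(result: dict) -> list:
    """Return (entity, covered_text) pairs for each PII moderation in one result."""
    text = result["generated_text"]
    spans = []
    for item in result.get("moderations", {}).get("pii", []):
        start = item["position"]["start"]
        end = item["position"]["end"]
        # Assumption: "end" is exclusive, as in a Python slice.
        spans.append((item["entity"], text[start:end]))
    return spans


# Hypothetical result, much shorter than the example response above.
result = {
    "generated_text": "call ********** today",
    "moderations": {
        "pii": [
            {
                "score": 0.8,
                "input": False,
                "position": {"start": 5, "end": 15},
                "entity": "PhoneNumber",
            }
        ]
    },
}
print(pii_spans(result))
```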

Infer text event stream

Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.

POST /ml/v1/text/generation_stream

Auditing

Calling this method generates the following auditing event.

  • pm-20.foundation-model.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

From a given prompt, infer the next tokens in a server-sent events (SSE) stream.

Examples:

A request without moderations.

A simple request.

{
  "model_id": "google/flan-ul2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
  "parameters": {
    "temperature": 0.8,
    "max_new_tokens": 30
  }
}

A request with moderations.

A simple request with moderations.

{
  "model_id": "google/flan-t5-xl",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "input": "Tell me how to reach the US Postal service",
  "parameters": {
    "max_new_tokens": 120,
    "min_new_tokens": 100,
    "repetition_penalty": 2
  },
  "moderations": {
    "hap": {
      "output": {
        "enabled": true,
        "threshold": 0.5
      }
    },
    "pii": {
      "output": {
        "enabled": true
      },
      "mask": {
        "remove_entity_value": true
      }
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: text/event-stream' \
    --data-raw '{
      "model_id": "google/flan-t5-xxl",
      "input": "how far is paris from bangalore:",
      "parameters": {
        "max_new_tokens": 100,
        "time_limit": 1000
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

A set of server-sent events. Each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.

Status Code

  • Successful operation (Content-Type: text/event-stream).

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
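
Because no sample response is given, the following sketch shows one way a client might consume the `data: {<json event>}` lines described above. It assumes that each event carries a `results` list whose items have a `generated_text` field, as in the non-streaming response. That is an assumption rather than something this section confirms, and the sample lines are invented for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line in an event stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other fields such as id: or event:
        yield json.loads(line[len("data:"):].strip())


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the generated text carried by all events."""
    parts = []
    for event in iter_sse_events(lines):
        # Assumption: events mirror the non-streaming "results" schema.
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
    return "".join(parts)


# Invented stream content, for illustration only.
sample_stream = [
    'data: {"results": [{"generated_text": "4,000"}]}',
    "",
    'data: {"results": [{"generated_text": " km"}]}',
]
print(collect_text(sample_stream))
```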

Generate rerank

Rerank texts based on a query.

POST /ml/v1/text/rerank

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-rerank.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input texts and the query for reranking.

Examples:

A sample request.

{
  "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "inputs": [
    {
      "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
    },
    {
      "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
    }
  ],
  "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
  "parameters": {
    "return_options": {
      "top_n": 2
    }
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    -d '{
      "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "inputs": [
        {
          "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I'\''ve come to appreciate the comforting stability of a well-established routine."
        },
        {
          "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life'\''s novelties, while as a responsible adult, I'\''ve come to understand the profound value of accumulated wisdom and life experience."
        }
      ],
      "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
      "parameters": {
        "return_options": {
          "top_n": 2
        }
      }
    }'

Response

System details.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The rerank results, each with the index of an input text and its score.

    {
      "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
      "results": [
        {
          "index": 1,
          "score": 0.7461
        },
        {
          "index": 0,
          "score": 0.8274
        }
      ],
      "created_at": "2024-02-21T17:32:28Z",
      "input_token_count": 20
    }
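
A minimal sketch of how the results might be joined back to the input texts. It assumes that `index` is the zero-based position of a text in the request's `inputs` array and that a higher `score` means a better match. It sorts explicitly by score rather than relying on the order of `results`. The helper name `rank_inputs` and the shortened texts are hypothetical.

```python
def rank_inputs(inputs: list, response: dict) -> list:
    """Pair each input text with its score, highest score first."""
    ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    # Assumption: "index" refers to the position in the request's "inputs" list.
    return [(r["score"], inputs[r["index"]]["text"]) for r in ranked]


# Shortened stand-ins for the two passages in the sample request.
inputs = [{"text": "first passage"}, {"text": "second passage"}]
response = {
    "results": [
        {"index": 1, "score": 0.7461},
        {"index": 0, "score": 0.8274},
    ]
}
for score, text in rank_inputs(inputs, response):
    print(f"{score:.4f}  {text}")
```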

Text tokenization

The text tokenization operation lets you check how the provided input is converted to tokens for a given model. It splits text into words or sub-words, which are then converted to IDs through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.

POST /ml/v1/text/tokenization

Auditing

Calling this method generates the following auditing event.

  • pm-20.text-tokenization.send

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The input string to tokenize.

Examples:

A sample request.

{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "model_id": "google/flan-ul2",
  "input": "Write a tagline for an alumni association: Together we",
  "parameters": {
    "return_tokens": true
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "model_id": "google/flan-ul2",
      "input": "Write a tagline for an alumni association: Together we",
      "parameters": {
        "return_tokens": true
      },
      "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f"
    }'

Response

The tokenization result.

Status Code

  • Successful operation

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • The response with the token count and the tokens, if requested.

    {
      "model_id": "google/flan-ul2",
      "result": {
        "token_count": 11,
        "tokens": [
          "Write",
          "a",
          "tag",
          "line",
          "for",
          "an",
          "alumni",
          "associ",
          "ation:",
          "Together",
          "we"
        ]
      }
    }
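
One possible use of this response is to check a prompt against a token limit before sending it for inference. The sketch below assumes the response shape shown above. The helper name `fits_budget` and the limits used are arbitrary illustrations.

```python
def fits_budget(tokenization_response: dict, max_input_tokens: int) -> bool:
    """Return True if the tokenized input is within the given token limit."""
    return tokenization_response["result"]["token_count"] <= max_input_tokens


# Trimmed copy of the example response above.
response = {
    "model_id": "google/flan-ul2",
    "result": {"token_count": 11},
}
print(fits_budget(response, 100))
print(fits_budget(response, 10))
```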

Create a new watsonx.ai training

Create a new watsonx.ai training in a project or a space.

The details of the base model and parameters for the training must be provided in the prompt_tuning object.

To deploy the tuned model, follow these steps:

  1. Create a WML model asset, in a space or a project, by providing the request.json as shown below:

    curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" \
      -H "Authorization: Bearer <replace with your token>" \
      -H "content-type: application/json" \
      --data '{
         "name": "replace_with_a_meaningful_name",
         "space_id": "replace_with_your_space_id",
         "type": "prompt_tune_1.0",
         "software_spec": {
           "name": "watsonx-textgen-fm-1.0"
         },
         "metrics": [ from the training job ],
         "training": {
           "id": "05859469-b25b-420e-aefe-4a5cb6b595eb",
           "base_model": {
             "model_id": "google/flan-t5-xl"
           },
           "task_id": "generation",
           "verbalizer": "Input: {{input}} Output:"
         },
         "training_data_references": [
           {
             "connection": {
               "id": "20933468-7e8a-4706-bc90-f0a09332b263"
             },
             "id": "file_to_tune1.json",
             "location": {
               "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl",
               "path": "file_to_tune1.json"
             },
             "type": "connection_asset"
           }
         ]
       }'
    

    Notes:

    1. If you set auto_update_model: true in the training request, you can skip this step because the model will have been saved at the end of the training job.
    2. Rather than creating the payload for the model, you can use the generated request.json that was stored in the location given by the results_reference field. Look for its path in the field entity.results_reference.location.model_request_path.
    3. The model type must be prompt_tune_1.0.
    4. The software spec name must be watsonx-textgen-fm-1.0.
  2. Create a tuned model deployment as described in the create deployment documentation.

POST /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.create

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

The training_data_references field contains the training datasets, and the results_reference field contains the connection where the results will be stored.

Examples:

Start a prompt tune training job.

{
  "name": "my-prompt-tune-training",
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "prompt_tuning": {
    "base_model": {
      "model_id": "google/flan-t5-xl"
    },
    "tuning_type": "prompt_tuning",
    "task_id": "classification",
    "num_epochs": 30,
    "learning_rate": 0.4,
    "accumulate_steps": 3,
    "batch_size": 10,
    "max_input_tokens": 100,
    "max_output_tokens": 100
  },
  "training_data_references": [
    {
      "id": "tune1_data.json",
      "location": {
        "path": "tune1_data.json"
      },
      "type": "container"
    }
  ],
  "auto_update_model": true,
  "results_reference": {
    "location": {
      "path": "tune1/results"
    },
    "type": "container"
  }
}
  • curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' \
    -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' \
    -H 'Content-Type: application/json' \
    -H 'Accept: application/json' \
    --data-raw '{
      "name": "my-prompt-tune-training",
      "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
      "prompt_tuning": {
        "base_model": {
          "model_id": "google/flan-t5-xl"
        },
        "task_id": "classification",
        "tuning_type": "prompt_tuning",
        "num_epochs": 30,
        "learning_rate": 0.4,
        "accumulate_steps": 3,
        "batch_size": 10,
        "max_input_tokens": 100,
        "max_output_tokens": 100
      },
      "training_data_references": [
        {
          "id": "tune1_data.json",
          "location": {
            "path": "tune1_data.json"
          },
          "type": "container"
        }
      ],
      "auto_update_model": true,
      "results_reference": {
        "location": {
          "path": "tune1/results"
        },
        "type": "container"
      }
    }'

Response

Training resource.

Status Code

  • The training job has been created.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
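
The deployment steps described earlier depend on a few fields of this response. The following sketch pulls them out, assuming the response shape shown above. The helper name `summarize_training` and the keys of the dictionary it returns are hypothetical.

```python
def summarize_training(training: dict) -> dict:
    """Extract the fields that matter when deploying a tuned model."""
    entity = training["entity"]
    location = entity["results_reference"]["location"]
    status = entity.get("status", {})
    metrics = status.get("metrics", [])
    return {
        "state": status.get("state"),
        # If true, the model was saved at the end of the training job.
        "model_saved_automatically": entity.get("auto_update_model", False),
        # Path of the generated request.json for creating the model asset.
        "model_request_path": location.get("model_request_path"),
        "last_loss": metrics[-1]["ml_metrics"]["loss"] if metrics else None,
    }


# Trimmed stand-in for the example response above.
training = {
    "entity": {
        "auto_update_model": True,
        "results_reference": {
            "location": {"model_request_path": "tune1/results/request.json"},
            "type": "container",
        },
        "status": {
            "state": "completed",
            "metrics": [
                {"iteration": 0, "ml_metrics": {"loss": 4.49988}},
                {"iteration": 2, "ml_metrics": {"loss": 4.05115}},
            ],
        },
    }
}
print(summarize_training(training))
```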

Retrieve the list of trainings

Retrieve the list of trainings for the specified space or project.

GET /ml/v4/trainings

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.list

Request

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • Token required for token-based pagination. The end user cannot determine this token. The service generates it and sets it in the href available in the next field.

  • The number of resources to return. The default limit is 100 and the maximum limit is 200.

    Possible values: 1 ≤ value ≤ 200

    Default: 100

    Example: 50

  • Compute the total count. May have performance impact.

  • Return only the resources with the given tag value.

  • Filter based on the training job state.

    Allowable values: [queued,pending,running,storing,completed,failed,canceled]

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Information for paging when querying resources.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "limit": 100,
      "first": {
        "href": "https://{cluster_url}/ml/v4/trainings"
      },
      "total_count": 1,
      "resources": [
        {
          "metadata": {
            "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
            "name": "my-prompt-training",
            "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
            "created_at": "2023-08-04T13:22:47.000Z"
          },
          "entity": {
            "prompt_tuning": {
              "base_model": {
                "model_id": "google/flan-t5-xl"
              },
              "task_id": "classification"
            },
            "training_data_references": [
              {
                "id": "tune1_data.json",
                "location": {
                  "path": "tune1_data.json"
                },
                "type": "container"
              }
            ],
            "auto_update_model": true,
            "results_reference": {
              "location": {
                "path": "tune1/results",
                "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
                "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
                "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
                "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
              },
              "type": "container"
            },
            "status": {
              "state": "completed",
              "running_at": "2023-08-04T13:22:48.000Z",
              "completed_at": "2023-08-04T13:22:55.289Z",
              "metrics": [
                {
                  "iteration": 0,
                  "ml_metrics": {
                    "loss": 4.49988
                  },
                  "timestamp": "2023-09-22T02:52:03.324Z"
                },
                {
                  "iteration": 1,
                  "ml_metrics": {
                    "loss": 3.86884
                  },
                  "timestamp": "2023-09-22T02:52:03.689Z"
                },
                {
                  "iteration": 2,
                  "ml_metrics": {
                    "loss": 4.05115
                  },
                  "timestamp": "2023-09-22T02:52:04.053Z"
                }
              ]
            }
          }
        }
      ]
    }
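
Token-based pagination can be sketched as a loop that follows the `href` of the `next` field until it is absent. The example below assumes that a page without a `next` field is the last page. The function `fetch_page` is a caller-supplied, hypothetical callable that performs the authenticated GET and returns the parsed JSON, and the two fake pages are invented for illustration.

```python
from typing import Callable, Iterator, Optional


def iter_trainings(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every training resource across all pages.

    fetch_page(href) must return the parsed JSON of the page at href,
    or of the first page when href is None.
    """
    href = None
    while True:
        page = fetch_page(href)
        yield from page.get("resources", [])
        next_link = page.get("next")
        if not next_link:
            break  # assumption: no "next" field means this was the last page
        href = next_link["href"]


# Fake pages keyed by href, standing in for real HTTP responses.
pages = {
    None: {"resources": [{"metadata": {"id": "a"}}], "next": {"href": "page-2"}},
    "page-2": {"resources": [{"metadata": {"id": "b"}}]},
}
ids = [t["metadata"]["id"] for t in iter_trainings(pages.get)]
print(ids)
```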

Retrieve the training

Retrieve the training with the specified identifier.

GET /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.get

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

Response

Training resource.

Status Code

  • OK.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

Example responses
  • {
      "metadata": {
        "id": "6213cf1-252f-424b-b52d-5cdd9814956c",
        "name": "my-prompt-training",
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "created_at": "2023-08-04T13:22:47.000Z"
      },
      "entity": {
        "prompt_tuning": {
          "base_model": {
            "model_id": "google/flan-t5-xl"
          },
          "task_id": "classification"
        },
        "training_data_references": [
          {
            "id": "tune1_data.json",
            "location": {
              "path": "tune1_data.json"
            },
            "type": "container"
          }
        ],
        "auto_update_model": true,
        "results_reference": {
          "location": {
            "path": "tune1/results",
            "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82",
            "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json",
            "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets",
            "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json"
          },
          "type": "container"
        },
        "status": {
          "state": "completed",
          "running_at": "2023-08-04T13:22:48.000Z",
          "completed_at": "2023-08-04T13:22:55.289Z",
          "metrics": [
            {
              "iteration": 0,
              "ml_metrics": {
                "loss": 4.49988
              },
              "timestamp": "2023-09-22T02:52:03.324Z"
            },
            {
              "iteration": 1,
              "ml_metrics": {
                "loss": 3.86884
              },
              "timestamp": "2023-09-22T02:52:03.689Z"
            },
            {
              "iteration": 2,
              "ml_metrics": {
                "loss": 4.05115
              },
              "timestamp": "2023-09-22T02:52:04.053Z"
            }
          ]
        }
      }
    }
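
Because training runs asynchronously, a client would typically poll this endpoint until the job finishes. The sketch below assumes that `completed`, `failed`, and `canceled` are the terminal values of `entity.status.state`. The callable `get_training`, the polling interval, and the poll limit are hypothetical choices.

```python
import time
from typing import Callable

# Assumption: these states are final; queued, pending, running, storing are not.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(get_training: Callable[[], dict],
                      poll_seconds: float = 10.0,
                      max_polls: int = 60,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the training reaches a terminal state and return that state."""
    for _ in range(max_polls):
        state = get_training()["entity"]["status"]["state"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError("training did not reach a terminal state")


# Canned responses standing in for successive GET calls.
responses = iter([
    {"entity": {"status": {"state": "pending"}}},
    {"entity": {"status": {"state": "running"}}},
    {"entity": {"status": {"state": "completed"}}},
])
final_state = wait_for_training(lambda: next(responses), sleep=lambda seconds: None)
print(final_state)
```

Injecting `sleep` keeps the helper testable without real delays.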

Cancel or delete the training

Cancel or delete the specified training. Once deleted, all trace of the job is gone.

DELETE /ml/v4/trainings/{training_id}

Auditing

Calling this method generates the following auditing event.

  • pm-20.training.delete

Request

Path Parameters

  • The training identifier.

Query Parameters

  • The version date for the API of the form YYYY-MM-DD.

    Example: 2023-07-07

  • The space that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f

  • The project that contains the resource. Either space_id or project_id query parameter has to be given.

    Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$

    Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc

  • Set to true in order to also delete the job or request metadata.

Response

Status Code

  • Training cancelled.

  • Bad request, the response body should contain the reason.

  • Unauthorized.

  • Forbidden, an authentication error including trying to access an unauthorized space or project.

  • The specified resource was not found.

No Sample Response

This method does not specify any sample responses.
