Introduction to IBM watsonx.ai software
Using the IBM watsonx.ai software APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).
If you are looking for the IBM watsonx.ai as a Service APIs, see here.
Step-by-step instructions on how to use IBM watsonx.ai software can be found here.
A specialized Python library is available to access this REST API.
Endpoint URLs
The base URLs for API endpoints come from the cluster and add-on service instance. The URL follows this pattern:
https://{cluster_url}/ml/v1
{cluster_url} represents the name or IP address of your deployed cluster. Use a hostname that resolves to an IP address in the cluster.
To find the base URL, view the details for the service instance from the Cloud Pak for Data web client.
Note that for prompts and notebooks the base URLs are /wx.
Use that URL in your requests to the API.
Endpoint example
curl -k -X {request_method} -H "Authorization: Bearer {token}" "https://{cluster_url}/ml/v1/text/generation"
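The URL pattern above can be assembled programmatically. The following is a minimal sketch, not part of the official client library: the helper name build_endpoint and the host cpd.example.com are hypothetical, and only the /ml/v1 base path and the text/generation resource come from this page.

```python
def build_endpoint(cluster_url: str, path: str, base: str = "/ml/v1") -> str:
    """Join the cluster host, the base path and a resource path into one URL."""
    host = cluster_url.strip().rstrip("/")
    # Accept either a bare hostname/IP or a value that already carries a scheme.
    if not host.startswith(("https://", "http://")):
        host = "https://" + host
    return f"{host}{base}/{path.lstrip('/')}"


# Hypothetical cluster host, used for illustration only.
print(build_endpoint("cpd.example.com", "text/generation"))
```

Keeping the base path as a parameter makes it easy to switch to /ml/v4 for the deployment methods described later.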
Disabling SSL verification
Watson Machine Learning uses Secure Sockets Layer (SSL) (or Transport Layer Security (TLS)) for secure connections between the client and server. The connection is verified against the local certificate store to ensure authentication, integrity, and confidentiality.
If you use a self-signed certificate, you need to disable SSL verification to make a successful connection.
Enabling SSL verification is highly recommended. Disabling SSL jeopardizes the security of the connection and data. Disable SSL only if necessary, and take steps to enable SSL as soon as possible.
To disable SSL verification for a curl request, use the --insecure (-k) option with the request.
Authentication
A bearer token is required to use any of the watsonx.ai APIs.
For more information, see the Authorization section of the Platform API reference.
Use the value of the access_token property from the response to the example request. Set the access_token value as the authorization header parameter for requests to the APIs. The format is Authorization: Bearer {access_token_value}:
Authorization: Bearer eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6IlJTMjU2In0...
Example request that uses a username and password to retrieve the token
curl -k -X POST "https://cluster_url_host/icp4d-api/v1/authorize" -H "cache-control: no-cache" -H "content-type: application/json" -d "{\"username\":\"admin\",\"password\":\"password\"}"
Response
{
"username": "admin",
"role": "Admin",
"permissions": [
"administrator"
],
"sub": "admin",
"iss": "KNOXSSO",
"aud": "DSX",
"uid": "999",
"authenticator": "default",
"access_token": "eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6...",
"_messageCode_": "success"
}
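As an illustration of how the token in this response becomes a request header, here is a small sketch. The function name auth_header is hypothetical; only the access_token property and the Authorization: Bearer format are taken from this page.

```python
import json


def auth_header(authorize_response: str) -> dict:
    """Build the Authorization header from the JSON body of the authorize call."""
    body = json.loads(authorize_response)
    token = body.get("access_token")
    if not token:
        # The response did not carry a token, for example after a failed login.
        raise ValueError("response has no access_token property")
    return {"Authorization": f"Bearer {token}"}


# Shortened sample body; the token value is a placeholder.
sample = '{"username": "admin", "access_token": "abc123", "_messageCode_": "success"}'
print(auth_header(sample))
```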
Error handling
This API uses standard HTTP response codes to indicate whether a method completed successfully.
A 200-type response indicates success.
HTTP Code | Description | Recovery |
---|---|---|
200 | Success | The request was successful. |
400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body. |
401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. For more information about logging in, see the Authentication section. If this error persists, contact the account owner to check your permissions. |
403 | Forbidden | The supplied authentication is not authorized. |
404 | Not Found | The requested resource could not be found. |
Note that 429 and 503 errors may mean that the model is overloaded or unavailable; check the error description for more details.
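A client might map these codes to a next action along the following lines. This is a sketch under assumptions: the category names and the decision to treat 429 and 503 as "retry later" are illustrative choices, not behavior defined by the API.

```python
# Codes that this page says may indicate an overloaded or unavailable model.
OVERLOAD_CODES = frozenset({429, 503})


def classify(status_code: int) -> str:
    """Map an HTTP status code to a coarse, illustrative client action."""
    if 200 <= status_code < 300:
        return "success"
    if status_code in OVERLOAD_CODES:
        return "retry_later"        # assumption: back off, then try again
    if status_code in (401, 403):
        return "check_credentials"  # token missing, expired or not permitted
    if 400 <= status_code < 500:
        return "fix_request"        # for example 400 or 404
    return "server_error"


for code in (200, 400, 401, 404, 429, 503):
    print(code, classify(code))
```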
Additional headers
Some additional headers might be required to make successful requests to the API. Those additional headers are described below.
An optional transaction ID can be passed with your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.
If no transaction ID is passed in, one is generated randomly.
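The sketch below shows one way to attach the transaction ID header. The helper name request_headers is hypothetical, and using a UUID as the default value is an assumption; the page only states that the value can be anything you choose.

```python
import uuid
from typing import Optional


def request_headers(token: str, transaction_id: Optional[str] = None) -> dict:
    """Return request headers including X-Global-Transaction-Id."""
    return {
        "Authorization": f"Bearer {token}",
        # Assumption: fall back to a random UUID so every call can be traced.
        "X-Global-Transaction-Id": transaction_id or uuid.uuid4().hex,
    }


print(request_headers("abc123", transaction_id="order-42"))
```

Reusing the same ID across calls to several services lets you correlate their logs for a single user action.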
API change log
In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API.
The change log lists changes that have been made, ordered by the date they were released.
Changes to existing API versions are designed to be compatible with existing client applications; if this is not the case, a new version date is created.
Versioning
API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.
When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.
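To make sure every request carries a well-formed version date, a client could validate and append it as sketched here. The helper name with_version and the host in the usage line are hypothetical; only the version=YYYY-MM-DD format comes from this page.

```python
import re
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_version(url: str, version: str) -> str:
    """Return url with its version query parameter set to a YYYY-MM-DD date."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError("version must have the form YYYY-MM-DD")
    date.fromisoformat(version)  # raises ValueError for impossible dates
    parts = urlsplit(url)
    # Drop any existing version parameter, keep everything else in order.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_version("https://cpd.example.com/ml/v4/deployments?limit=10", "2023-07-07"))
```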
Data References
Accessing data in a remote location (such as a Cloud Object Storage bucket, or an SQL/no-SQL database) requires the use of connection_asset or data_asset reference types.
These reference types are created within a space or a project and are referenced in requests to represent input data and results locations. These types contain two parameter objects, connection and location, which require different values to be supplied based on the reference type. Using a data_asset requires an href to be supplied to the location object, whereas using a connection_asset requires the connection_id for the connection object and different location fields depending on the data source type.
Example connection_asset payload:
{
"training_data_references": [
{
"type": "connection_asset",
"connection": {
"id": "<connection_guid>"
},
"location": {
"<wdp-properties depending on the type>": "<value depending on the type>"
}
}
]
}
Example data_asset payload:
{
"training_data_references": [
{
"type": "data_asset",
"location": {
"href": "/v2/assets/<asset_id>?space_id=<space_id>"
}
}
]
}
Example fs payload:
- project_id
{
"training_data_references": [
{
"type":"fs",
"location":{
"path":"/projects/<project_id>/assets/<fs_path>"
}
}
]
}
- space_id
{
"training_data_references": [
{
"type":"fs",
"location":{
"path":"/spaces/<space_id>/assets/<fs_path>"
}
}
]
}
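The three reference shapes shown above can be produced by small builder functions. This is a sketch only: the function names are hypothetical, and the location keys for a connection_asset depend on the data source type, so they are passed through unchanged.

```python
def connection_asset_ref(connection_id: str, location: dict) -> dict:
    """Reference to data reached through a connection; location keys vary by source."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def data_asset_ref(asset_id: str, space_id: str) -> dict:
    """Reference to a data asset stored in a space."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def fs_ref(scope: str, scope_id: str, fs_path: str) -> dict:
    """Reference to a file system path under a project or a space."""
    if scope not in ("projects", "spaces"):
        raise ValueError("scope must be 'projects' or 'spaces'")
    return {
        "type": "fs",
        "location": {"path": f"/{scope}/{scope_id}/assets/{fs_path.lstrip('/')}"},
    }


payload = {"training_data_references": [fs_ref("spaces", "<space_id>", "<fs_path>")]}
print(payload)
```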
Methods
Retrieve the custom foundation models
Retrieve the custom foundation models.
To deploy a custom foundation model using one of the models in this list, follow these steps:
- Create a model asset, in a space or a project, by providing the custom foundation model details as shown below:
curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{
    "name": "replace_with_a_meaningful_name",
    "space_id": "replace_with_your_space_id",
    "foundation_model": { "model_id": "replace_with_your_model_id" },
    "type": "custom_foundation_model_1.0",
    "software_spec": { "name": "watsonx-cfm-caikit-1.0" }
  }'
Notes:
- The model type must be custom_foundation_model_1.0.
- The software spec name must be watsonx-cfm-caikit-1.0.
- Create a custom foundation model deployment as described in the create deployment documentation.
Since watsonx.ai 1.1.x.
GET /ml/v4/custom_foundation_models
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. The default limit is 100. The maximum limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
curl --request GET 'https://{cpd_cluster}/ml/v4/custom_foundation_models?version=2023-05-02&limit=10' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
Response
Pagination information and list of models and common parameters.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources.
Example:
1
A reference to the first item of the next page, if any.
A list of models.
A list of common parameters that apply to all models, but can be overridden in each model description.
Status Code
OK
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The list of custom foundation models that were created and registered.
{ "total_count": 1, "limit": 10, "first": { "href": "https://{cpd_cluster}/ml/v4/custom_foundation_models" }, "resources": [ { "model_id": "my_flan_t5_xl", "description": "A tuned version of flan_t5_xl", "tags": [ "flan_t5_xl" ], "parameters": [ { "name": "max_batch_weight", "display_name": "Maximum batch weight", "default": 10000, "description": "The maximum batch weight that is allowed for this model.", "type": "number", "min": 0, "max": 100000 } ] } ], "parameters": [ { "name": "max_batch_weight", "display_name": "Maximum batch weight", "default": 1000, "description": "The maximum batch weight that is allowed for all models.", "type": "number", "min": 0, "max": 10000 } ] }
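Because the pagination token is only available through the href of the next field, a client typically follows that link until it is absent. The sketch below assumes a caller-supplied fetch function that performs the authenticated GET and returns the parsed JSON body; the function name iter_resources and the fake pages are hypothetical.

```python
from typing import Callable, Iterator


def iter_resources(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item in 'resources', following 'next.href' page by page."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("resources", [])
        # The service sets the pagination token inside next.href; stop when absent.
        url = (page.get("next") or {}).get("href")


# Fake two-page listing standing in for real HTTP responses.
pages = {
    "page1": {"resources": [{"model_id": "a"}], "next": {"href": "page2"}},
    "page2": {"resources": [{"model_id": "b"}]},
}
print([r["model_id"] for r in iter_resources("page1", pages.__getitem__)])
```

Injecting fetch keeps the paging logic independent of the HTTP library and of how SSL verification and tokens are handled.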
Create a new watsonx.ai deployment
Create a new deployment. Currently the only supported type is online.
If this is a deployment for a prompt tune, then the asset object must exist and the id must be the id of the model that was created after the prompt training.
If this is a deployment for a prompt template, then the prompt_template object should exist and the id must be the id of the prompt template to be deployed.
If this is a deployment for a custom foundation model, then the online object must exist, the asset object must exist and point to the model object that describes the custom foundation model, and the hardware_spec is mandatory. Note that the base_model_id will be returned and will be the base model id that is defined in the model asset (asset.id).
If this is a deployment for a fine tuned model, then the asset.id must point to the model that was created after the fine tuning. In case of a fine tuned model with a template, the field base_deployment_id will be the tuned model deployment.
Pre-defined hardware specifications are provided for custom foundation model deployments:
- WX-S: 1 GPU, Request 1 CPU, Limit 2 CPU and 60 GB (Request and Limit) - 1B to 20B parameters
- WX-M: 2 GPU, Request 2 CPU, Limit 3 CPU and 120 GB (Request and Limit) - 21B to 40B parameters
- WX-L: 4 GPU, Request 4 CPU, Limit 5 CPU and 240 GB (Request and Limit) - 41B to 80B parameters
- WX-XL: 8 GPU, Request 8 CPU, Limit 9 CPU and 600 GB (Request and Limit) - 81B to 200B parameters
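The parameter ranges listed for each specification suggest a simple lookup. The sketch below is illustrative only: the function name pick_hardware_spec is hypothetical, and choosing the smallest specification whose upper bound covers the model size (including sizes that fall between the listed ranges) is an assumption rather than documented behavior.

```python
# (spec id, upper bound in billions of parameters), taken from the list above.
HARDWARE_SPECS = [("WX-S", 20), ("WX-M", 40), ("WX-L", 80), ("WX-XL", 200)]


def pick_hardware_spec(billions_of_parameters: float) -> str:
    """Return the smallest pre-defined spec whose range covers the model size."""
    if billions_of_parameters <= 0:
        raise ValueError("model size must be positive")
    for spec_id, upper_bound in HARDWARE_SPECS:
        if billions_of_parameters <= upper_bound:
            return spec_id
    raise ValueError("no pre-defined spec covers models above 200B parameters")


print(pick_hardware_spec(13))
```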
A prompt template can be used in conjunction with a custom foundation model by specifying the prompt_template object with the id pointing to the prompt template.
POST /ml/v4/deployments
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
The deployment request entity.
The following important fields are described for each use case:
- Prompt template:
  - base_model_id: required
  - prompt_template.id: required
  - online: required
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - response deployed_asset_type: foundation_model
- Prompt tune:
  - asset.id: required
  - online: required
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: prompt_tune
- Custom foundation model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: required
  - hardware_request: forbidden
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: custom_foundation_model
- Custom foundation model with template:
  - base_deployment_id: required
  - prompt_template.id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - asset.id: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: custom_foundation_model
- Fine tuned model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: required
  - hardware_request: forbidden
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: fine_tune
- Fine tuned model with template:
  - base_deployment_id: required
  - prompt_template.id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - asset.id: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: fine_tune
Create a prompt tune deployment.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "text_classification",
"asset": {
"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
},
"online": {}
}
Create a prompt template deployment.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "text_classification",
"base_model_id": "google/flan-ul2",
"prompt_template": {
"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
},
"online": {}
}
Create a custom foundation model deployment.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "my_tuned_flan",
"asset": {
"id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
},
"online": {
"parameters": {
"serving_name": "myflan",
"foundation_model": {
"max_batch_weight": 10000,
"max_sequence_length": 8192
}
}
},
"hardware_spec": {
"id": "WX-S",
"num_nodes": 1
}
}
Create a prompt template deployment with a custom foundation model.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "my_tuned_flan_template",
"base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
"prompt_template": {
"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
},
"online": {
"parameters": {
"serving_name": "myflan_template"
}
}
}
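The required and forbidden fields listed for each use case can be checked on the client before a request is sent. The following sketch covers four of the use cases; the names RULES, has_field and check_deployment are hypothetical, and the service remains the authority on validation.

```python
from typing import Dict, List

# Required and forbidden request fields per use case, copied from the list above.
RULES: Dict[str, Dict[str, List[str]]] = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "prompt_tune": {
        "required": ["asset.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request", "base_model_id"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_spec"],
        "forbidden": ["hardware_request", "base_model_id", "base_deployment_id"],
    },
    "custom_foundation_model_with_template": {
        "required": ["base_deployment_id", "prompt_template.id", "online"],
        "forbidden": ["online.parameters.foundation_model", "hardware_spec",
                      "hardware_request", "asset.id", "base_model_id"],
    },
}


def has_field(payload: dict, dotted: str) -> bool:
    """True if the dotted path exists in payload (an empty object still counts)."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_deployment(use_case: str, payload: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks consistent."""
    rules = RULES[use_case]
    problems = [f"missing required field: {f}"
                for f in rules["required"] if not has_field(payload, f)]
    problems += [f"forbidden field present: {f}"
                 for f in rules["forbidden"] if has_field(payload, f)]
    return problems


prompt_tune = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "asset": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
    "online": {},
}
print(check_deployment("prompt_tune", prompt_tune))
```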
The name of the resource. Example: my-resource
Indicates that this is an online deployment. An object has to be specified but can be empty. The serving_name can be provided in the online.parameters.
The project that contains the resource. Either space_id or project_id has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
A description of the resource. Example: This is my first resource.
A list of tags for this resource. Examples: [ "t1", "t2" ]
User defined properties specified as key-value pairs. Examples: { "name": "model", "size": 2 }
- custom
A reference to a resource.
A hardware specification.
The requested hardware for deployment.
A reference to a resource.
The base model that is required for this deployment if this is for a prompt template or a prompt tune for an IBM foundation model (so this does not apply for custom foundation models). Example: google/flan-t5-xl
The base deployment when this is a custom foundation model with a prompt template. The id must be the id of the custom foundation model deployment. Possible values: length ≤ 128, Value must match regular expression ^[-0-9a-z]+$
Example: a12b278b-e40c-4ca4-bfa0-a4e8583b58e1
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "prompt_tune", "base_model_id": "google/flan-ul2", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "online": { "parameters": { "serving_name": "myflan", "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192 } } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan_template" }, "entity": { "base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc", "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": { "parameters": { "serving_name": "myflan_template" } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan_template/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan_template/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
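The inference list in these responses can contain up to four URLs, distinguished by the sse and uses_serving_name flags. A client could select the one it needs as sketched below; the function name pick_inference_url and the shortened example URLs are hypothetical.

```python
from typing import Optional


def pick_inference_url(deployment: dict, stream: bool = False,
                       by_serving_name: bool = False) -> Optional[str]:
    """Return the inference URL matching the requested flags, or None if absent."""
    entries = deployment.get("entity", {}).get("status", {}).get("inference", [])
    for entry in entries:
        # Missing flags are treated as false, as in the sample responses.
        if (bool(entry.get("sse")) == stream
                and bool(entry.get("uses_serving_name")) == by_serving_name):
            return entry["url"]
    return None


# Trimmed-down response with placeholder URLs.
deployment = {"entity": {"status": {"inference": [
    {"url": "https://host/ml/v1/deployments/<id>/text/generation"},
    {"url": "https://host/ml/v1/deployments/myflan/text/generation",
     "uses_serving_name": True},
    {"url": "https://host/ml/v1/deployments/<id>/text/generation_stream",
     "sse": True},
]}}}
print(pick_inference_url(deployment, by_serving_name=True))
```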
Retrieve the deployments
Retrieve the list of deployments for the specified space or project.
GET /ml/v4/deployments
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Retrieves the deployment, if any, that contains this serving_name. Example: classification
Retrieves only the resources with the given tag value.
Retrieves only the resources with the given asset_id. The asset_id is the model id.
Retrieves only the resources with the given prompt_template_id.
Retrieves only the resources with the given name.
Retrieves the resources filtered with the given type. There are the deployment types as well as an additional prompt_template if the deployment type includes a prompt template.
The supported deployment types are (see the description for deployed_asset_type in the deployment entity):
- prompt_tune - when a prompt tuned model is deployed.
- foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
- custom_foundation_model - when a custom foundation model is deployed.
These can be combined with the flag prompt_template like this:
- type=prompt_tune - return all prompt tuned model deployments.
- type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
- type=foundation_model - return all prompt template deployments.
- type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
- type=custom_foundation_model - return all custom model deployments.
- type=custom_foundation_model and prompt_template - return all custom model deployments with a prompt template.
- type=prompt_template - return all deployments with a prompt template.
Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.
Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name. Default: false
Response
The deployment resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
A list of deployment resources.
System details including warnings.
Status Code
OK.
serving_name is available for use. Returned when serving_name and conflict query parameters are used.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
Returned when serving_name and conflict query parameters are used. The response body will contain the reason.
{ "limit": 10, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments" }, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "deployed_asset_type": "prompt_tune", "online": {}, "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } } ] }
Retrieve the deployment details
Retrieve the deployment details with the specified identifier.
GET /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment details.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "prompt_tune", "base_model_id": "google/flan-ul2", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "hardware_spec": { "id": "WX-S", "num_nodes": 1 }, "online": { "parameters": { "serving_name": "myflan", "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192 } } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
Update the deployment metadata
Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.
- /name
- /description
- /tags
- /custom
- /online/parameters
- /asset - replace only
- /prompt_template - replace only
- /hardware_spec
- /hardware_request
- /base_model_id - replace only (applicable only to prompt template deployments referring to IBM base foundation models). Since Cloud Pak for Data 5.0.3.
The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
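A patch body for this method could be assembled and checked against the supported paths as follows. This is a sketch under assumptions: the field names op, path and value follow the JSON Patch convention (RFC 6902) and are not spelled out on this page, the helper name patch_op is hypothetical, and matching paths exactly (rather than also allowing sub-paths) is a simplification.

```python
import json

# Paths listed above as supported for the patch operation.
PATCHABLE = {"/name", "/description", "/tags", "/custom", "/online/parameters",
             "/asset", "/prompt_template", "/hardware_spec", "/hardware_request",
             "/base_model_id"}
# Paths marked "replace only" in the list above.
REPLACE_ONLY = {"/asset", "/prompt_template", "/base_model_id"}


def patch_op(op: str, path: str, value=None) -> dict:
    """Build one patch operation, rejecting unsupported combinations early."""
    if op not in ("add", "remove", "replace"):
        raise ValueError(f"unsupported operation: {op}")
    if path not in PATCHABLE:
        raise ValueError(f"path cannot be patched: {path}")
    if path in REPLACE_ONLY and op != "replace":
        raise ValueError(f"{path} supports replace only")
    operation = {"op": op, "path": path}
    if op != "remove":
        operation["value"] = value
    return operation


# Update the serving_name through /online/parameters, as described above.
body = [patch_op("replace", "/online/parameters", {"serving_name": "myflan_v2"})]
print(json.dumps(body))
```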
PATCH /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The json patch.
The operation to be performed. Allowable values: [add, remove, replace]
The pointer that identifies the field that is the target of the operation.
The value to be used within the operation.
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment accepted
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Delete the deployment
Delete the deployment with the specified identifier.
DELETE /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD. Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given. Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Return options
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned.
POST /ml/v1/deployments/{id_or_name}/text/generation
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens.
A prompt tune request.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "how far is paris from bangalore:\n",
"parameters": {
"max_new_tokens": 100
}
}
A prompt tune request with moderations.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
A prompt template request.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "how far is paris from bangalore:\n",
"parameters": {
"max_new_tokens": 100
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples:{ "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
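The same request can be prepared from Python with the standard library only. This is a sketch rather than the official client: the function name build_generation_request, the host cpd.example.com, the serving name and the token are placeholders chosen for illustration. Serializing the body with json.dumps guarantees valid JSON, which is easy to get wrong when writing the payload by hand.

```python
import json
import urllib.parse
import urllib.request


def build_generation_request(cluster_url: str, id_or_name: str, token: str,
                             payload: dict, version: str = "2023-05-02") -> urllib.request.Request:
    """Prepare (but do not send) a POST to the text generation endpoint."""
    path = urllib.parse.quote(id_or_name, safe="")
    query = urllib.parse.urlencode({"version": version})
    url = f"https://{cluster_url}/ml/v1/deployments/{path}/text/generation?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder values; replace them with the values for your cluster.
request = build_generation_request(
    "cpd.example.com",
    "my-serving-name",
    "<token>",
    {"input": "how far is paris from bangalore:",
     "parameters": {"max_new_tokens": 100, "time_limit": 1000}},
)
# Sending is left to the caller, for example: urllib.request.urlopen(request)
```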
Response
System details.
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values are lower-cased, so compare them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
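Because the stop reason should be compared case-insensitively, a client can normalize the value once and then test it against the documented set. The sketch below assumes nothing beyond the values listed above; the helper names normalize_stop_reason and is_terminal are hypothetical.

```python
# The documented stop reasons, in lower case.
STOP_REASONS = frozenset({
    "not_finished", "max_tokens", "eos_token", "cancelled",
    "time_limit", "stop_sequence", "token_limit", "error",
})


def normalize_stop_reason(raw: str) -> str:
    """Lower-case the value and reject anything outside the documented set."""
    value = raw.strip().lower()
    if value not in STOP_REASONS:
        raise ValueError(f"unknown stop_reason: {raw!r}")
    return value


def is_terminal(raw: str) -> bool:
    """True for every stop reason except not_finished (more tokens may follow)."""
    return normalize_stop_reason(raw) != "not_finished"
```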
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples:[ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples:[ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details for a prompt tune.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
The generated text from the model along with other details for a prompt tune with moderations.
{ "model_id": "google/flan-t5-xl", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.", "generated_token_count": 118, "input_token_count": 11, "stop_reason": "eos_token", "moderations": { "pii": [ { "score": 0.8, "input": false, "position": { "start": 74, "end": 88 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 200, "end": 212 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 244, "end": 259 }, "entity": "EmailAddress" } ] } } ] }
The generated text from the model along with other details for a prompt template.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
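The sample responses above can be reduced to the fields a caller usually needs. The following sketch reads the generated text, the token counts, the stop reason and any PII entities reported under moderations. The function name summarize_results is hypothetical, and moderated_sample is an abbreviated stand-in for the moderations example, not a real response.

```python
import json


def summarize_results(response: dict) -> list:
    """Extract one summary dictionary per entry in the results array."""
    summaries = []
    for result in response.get("results", []):
        pii_hits = result.get("moderations", {}).get("pii", [])
        summaries.append({
            "text": result["generated_text"],
            "stop_reason": result["stop_reason"].lower(),
            "generated_tokens": result["generated_token_count"],
            "input_tokens": result["input_token_count"],
            "pii_entities": [hit["entity"] for hit in pii_hits],
        })
    return summaries


# The first sample response shown above.
sample = json.loads(
    '{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", '
    '"results": [ { "generated_text": "4,000 km", "generated_token_count": 4, '
    '"input_token_count": 12, "stop_reason": "eos_token" } ] }'
)

# Abbreviated, illustrative response with one moderation hit.
moderated_sample = {
    "results": [{
        "generated_text": "call **************",
        "generated_token_count": 3,
        "input_token_count": 11,
        "stop_reason": "eos_token",
        "moderations": {"pii": [{"score": 0.8, "input": False,
                                 "position": {"start": 5, "end": 19},
                                 "entity": "PhoneNumber"}]},
    }]
}

summary = summarize_results(sample)
```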
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters.
This operation will return the output tokens as a stream of events.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Return options
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned, and the rank and top_tokens will not be returned either.
POST /ml/v1/deployments/{id_or_name}/text/generation_stream
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
{
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"decoding_method": "sample",
"temperature": 0.8,
"max_new_tokens": 200
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples:{ "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
Response
A set of server-sent events; each event contains a response for one or more tokens. The results will be an array of events of the form data: {<json event>}, where the schema of the individual json event is described below.
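A stream of data: {<json event>} lines can be decoded line by line. The sketch below is one possible reading of the format described above: it ignores lines that do not start with data:, and it assumes that each event carries the newly generated text in results[].generated_text so that concatenating the pieces yields the full output. The sample_stream content is invented for illustration, and the helper names iter_events and collect_text are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of every 'data:' line in the stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators and other SSE fields are skipped
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate generated_text across events (assumes incremental chunks)."""
    parts = []
    for event in iter_events(lines):
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
    return "".join(parts)


# Invented example stream; a real stream comes from the HTTP response body.
sample_stream = [
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": "4,000", "stop_reason": "not_finished"}]}',
    "",
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": " km", "stop_reason": "eos_token"}]}',
    "",
]

full_text = collect_text(sample_stream)
```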
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values are lower-cased, so compare them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples:[ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples:[ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Create a fine tuning job
Create a fine tuning job that will fine tune an LLM.
Since Cloud Pak for Data 5.0.3.
POST /ml/v1/fine_tunings
Request
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
The details of the fine tuning job with the data used to tune the LLM.
The name of the job.
The type of fine tuning training. The type is set to ilab for InstructLab training.
Allowable values: [ilab]
The training datasets.
Possible values: 1 ≤ number of items ≤ 20
The training results. Normally this is specified as type=container, which means that it is stored in the space or project.
Examples: { "location": { "path": "results" }, "type": "container" }
- results_reference
The data source type, like connection_asset or data_asset.
Allowable values: [connection_asset, data_asset, container, url, fs]
Example:
connection_asset
Contains a set of fields that describe the location of the data with respect to the connection.
Examples: { "bucket": "wml-v4-fvt-remote-tests", "file_name": "heart_testpy379.csv" }
- location
Item identification inside a collection.
Contains a set of fields specific to each connection. See here for details about specifying connections.
Examples:{ "id": "2d07a6b4-8fa9-43ab-91c8-befcd9dab8d2" }
The description of the job.
A list of tags for this resource.
Examples:[ "t1", "t2" ]
The project that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
If set to true then the result of the training, if successful, will be uploaded to the repository as a model.
Default:
false
The parameters for the job. Note that if verbalizer is provided then response_template must also be provided (and vice versa).
The holdout/test datasets.
Possible values: 1 ≤ number of items ≤ 20
User defined properties specified as key-value pairs.
Examples:{ "name": "model", "size": 2 }
- custom
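A request body for this operation can be assembled and checked against the documented limits before it is posted. The sketch below is illustrative only: this section does not show the property name for the training datasets, so training_data_references is an assumption, as is the shape of each data reference (type, connection, location), which is pieced together from the examples above. The function name build_fine_tuning_job is hypothetical.

```python
from typing import List, Optional


def build_fine_tuning_job(name: str,
                          training_data: List[dict],
                          project_id: Optional[str] = None,
                          space_id: Optional[str] = None,
                          results_path: str = "results") -> dict:
    """Assemble a fine tuning job body (property names partly assumed, see above)."""
    if not 1 <= len(training_data) <= 20:
        raise ValueError("between 1 and 20 training datasets are required")
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    body = {
        "name": name,
        "type": "ilab",  # the only allowable value listed above
        "training_data_references": list(training_data),  # assumed property name
        "results_reference": {"location": {"path": results_path}, "type": "container"},
    }
    if project_id is not None:
        body["project_id"] = project_id
    if space_id is not None:
        body["space_id"] = space_id
    return body


# Data reference assembled from the example values shown above (assumed shape).
dataset = {
    "type": "connection_asset",
    "connection": {"id": "2d07a6b4-8fa9-43ab-91c8-befcd9dab8d2"},
    "location": {"bucket": "wml-v4-fvt-remote-tests", "file_name": "heart_testpy379.csv"},
}
job = build_fine_tuning_job("my-fine-tuning-job", [dataset],
                            project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f")
```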
Response
The response of a fine tuning job.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-fine-tuning-job", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "owner": "guy", "created_at": "2023-08-04T13:22:55.289Z", "rev": "2", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
- metadata
The id of the resource.
The time when the resource was created.
The revision of the resource.
The user id which created this resource.
The time when the resource was last modified.
The id of the parent resource where applicable.
The name of the resource.
A description of the resource.
A list of tags for this resource.
Examples:[ "t1", "t2" ]
Information related to the revision.
The space that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Status of the training job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Retrieve the list of fine tuning jobs
Retrieve the list of fine tuning jobs for the specified space or project.
Since Cloud Pak for Data 5.0.3.
GET /ml/v1/fine_tunings
Request
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned.
Possible values: value ≤ 200
Default:
100
Compute the total count. May have performance impact.
Return only the resources with the given tag value.
Filter based on the job state: queued, running, completed, failed, etc.
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
The response of a fine tuning job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
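Because the pagination token is generated by the service and delivered in the href of the next field, a client walks the list by following that href until it is absent. The sketch below separates the paging logic from the HTTP call: fetch is any callable that returns the decoded JSON page for a URL. The in-memory fake_pages data and the names iter_resources and fetch are hypothetical; the resources array name is taken from the sample list responses elsewhere in this reference.

```python
from typing import Callable, Dict, Iterator


def iter_resources(fetch: Callable[[str], Dict], first_url: str) -> Iterator[dict]:
    """Yield every resource, following next.href until no next page is reported."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page.get("resources", [])
        url = (page.get("next") or {}).get("href")


# Invented pages standing in for service responses.
fake_pages = {
    "page-1": {"limit": 1, "resources": [{"metadata": {"id": "job-1"}}],
               "next": {"href": "page-2"}},
    "page-2": {"limit": 1, "resources": [{"metadata": {"id": "job-2"}}]},
}

job_ids = [item["metadata"]["id"] for item in iter_resources(fake_pages.__getitem__, "page-1")]
```

In real use, fetch would issue an authenticated GET against the href and decode the JSON body.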
Get a fine tuning job
Get the results of a fine tuning job, or details if the job failed.
Since Cloud Pak for Data 5.0.3.
GET /ml/v1/fine_tunings/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
The response of a fine tuning job.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-fine-tuning-job", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "owner": "guy", "created_at": "2023-08-04T13:22:55.289Z", "rev": "2", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
- metadata
The id of the resource.
The time when the resource was created.
The revision of the resource.
The user id which created this resource.
The time when the resource was last modified.
The id of the parent resource where applicable.
The name of the resource.
A description of the resource.
A list of tags for this resource.
Examples:[ "t1", "t2" ]
Information related to the revision.
The space that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Status of the training job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Cancel or delete a fine tuning job
Delete a fine tuning job if it exists. Once deleted, all trace of the job is gone.
Since Cloud Pak for Data 5.0.3.
DELETE /ml/v1/fine_tunings/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression
^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
List the available foundation models
Retrieve the list of deployed foundation models.
GET /ml/v1/foundation_model_specs
Request
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
A set of filters to specify the list of models. Filters are described as the pattern shown below.
pattern: tfilter[,tfilter][:(or|and)]
tfilter: filter | !filter
  filter: Requires existence of the filter.
  !filter: Requires absence of the filter.
filter: one of
  modelid_*: Filters by model id. Namely, select a model with a specific model id.
  provider_*: Filters by provider. Namely, select all models with a specific provider.
  source_*: Filters by source. Namely, select all models with a specific source.
  input_tier_*: Filters by input tier. Namely, select all models with a specific input tier.
  output_tier_*: Filters by output tier. Namely, select all models with a specific output tier.
  tier_*: Filters by tier. Namely, select all models with a specific input or output tier.
  task_*: Filters by task id. Namely, select all models that support a specific task id.
  lifecycle_*: Filters by lifecycle state. Namely, select all models that are currently in the specified lifecycle state.
  function_*: Filters by function. Since Cloud Pak for Data `5.0.0`. Namely, select all models that support a specific function.
Possible values: 1 ≤ length ≤ 1000, Value must match regular expression
^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$
Example:
modelid_ibm/granite-13b-instruct-v2
See all the Tech Preview models if entitled.
Default:
false
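The filters value follows the pattern tfilter[,tfilter][:(or|and)] and must match the regular expression given above, so it can be composed and validated locally. The sketch below is a hypothetical helper (build_filters is not part of any IBM library); the filter terms used in the usage lines are illustrative values built from the documented prefixes.

```python
import re
from typing import Iterable, Optional

# Regular expression and length limits as documented for the filters parameter.
FILTER_PATTERN = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")
MAX_LENGTH = 1000


def build_filters(include: Iterable[str],
                  exclude: Iterable[str] = (),
                  combine: Optional[str] = None) -> str:
    """Join required and negated (!) filters, optionally ending in :or or :and."""
    terms = list(include) + ["!" + term for term in exclude]
    value = ",".join(terms)
    if combine is not None:
        if combine not in ("or", "and"):
            raise ValueError("combine must be 'or' or 'and'")
        value += ":" + combine
    if not 1 <= len(value) <= MAX_LENGTH or FILTER_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid filters value: {value!r}")
    return value


single = build_filters(["modelid_ibm/granite-13b-instruct-v2"])
combined = build_filters(["task_code"], exclude=["provider_BigCode"], combine="and")
```

Remember to URL-encode the value when adding it to the query string, because it can contain characters such as ! and /.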
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources.
Example:
1
A reference to the first item of the next page, if any.
The supported foundation models.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The models that are currently deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02" }, "resources": [ { "model_id": "bigcode/starcoder", "label": "starcoder-15.5b", "provider": "BigCode", "source": "Hugging Face", "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions", "tasks": [ { "id": "code", "ratings": { "quality": 3 } } ], "min_shot_size": 0, "input_tier": "class_2", "output_tier": "class_2", "number_params": "15.5b" } ] }
List the supported tasks
Retrieve the list of tasks that are supported by the foundation models.
GET /ml/v1/foundation_model_tasks
Request
Query Parameters
The version date for the API, of the form YYYY-MM-DD.
Example:
2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources.
Example:
1
A reference to the first item of the next page, if any.
The supported foundation model tasks.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The tasks that are currently supported by models deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02" }, "resources": [ { "task_id": "question_answering", "label": "Question answering", "rank": 1, "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance." } ] }
Create a new notebook.
Create a new notebook
- either from scratch
- or by copying another notebook.
To create a notebook from scratch, you need to first upload the notebook content (ipynb format) to your project or space storage using the Assets-files API and then reference it with the attribute file_reference.
The other required attributes are name, project/space and runtime.
The attribute runtime is used to specify the environment on which the notebook runs.
Either project or space must be specified in the request body.
To copy a notebook, you only need to provide name and source_guid in the request body.
POST /v2/notebooks
Request
Specification of the notebook to be created.
Create a notebook from scratch in a project
{
"name": "my notebook",
"description": "this is my notebook",
"project": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"file_reference": "notebook/my_notebook.ipynb",
"runtime": {
"environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"spark_monitoring_enabled": true
}
}
The name of the new notebook.
Example:
my notebook
The reference to the file in the object storage.
Example:
notebook/my_notebook.ipynb
A notebook runtime.
The guid of the project in which to create the notebook.
Example:
92ae0e27-9b11-4de9-a646-d46ca3c183d4
A more verbose description of the notebook.
Example:
this is my notebook
The notebook origin.
A notebook kernel.
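The two ways of creating a notebook need different request bodies. The sketch below builds both using only the attributes named above. Treating project and space as mutually exclusive is one reading of "either project or space must be specified", and the helper names notebook_from_scratch and notebook_copy are hypothetical.

```python
from typing import Optional


def notebook_from_scratch(name: str, file_reference: str, environment: str,
                          project: Optional[str] = None,
                          space: Optional[str] = None,
                          description: Optional[str] = None) -> dict:
    """Body for a new notebook whose ipynb content was already uploaded."""
    if (project is None) == (space is None):
        raise ValueError("specify exactly one of project or space")
    body = {
        "name": name,
        "file_reference": file_reference,
        "runtime": {"environment": environment},
    }
    if description is not None:
        body["description"] = description
    if project is not None:
        body["project"] = project
    else:
        body["space"] = space
    return body


def notebook_copy(name: str, source_guid: str) -> dict:
    """Body for copying an existing notebook: only name and source_guid are needed."""
    return {"name": name, "source_guid": source_guid}


scratch_body = notebook_from_scratch(
    "my notebook",
    "notebook/my_notebook.ipynb",
    "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    project="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    description="this is my notebook",
)
copy_body = notebook_copy("my notebook", "ca3c0e27-46ca-83d4-a646-d49b11c14de9")
```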
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Created and returned a new notebook asset. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
The number of requests has exceeded the rate limit.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "space_id": "92ae0e27-9b11-4de9-a646-d46ca3c183d4" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-92ae0e27-9b11-4de9-a646-d46ca3c183d4", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?space_id=92ae0e27-9b11-4de9-a646-d46ca3c183d4" } }
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "notebook", "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "rate_limit", "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later." } ] }
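The error bodies above share one shape: a trace identifier plus an errors array whose entries carry a code, a message and an optional target. A client can flatten them into readable lines, as in this sketch. The function name describe_errors is hypothetical, and error_body reproduces the first error sample shown above.

```python
def describe_errors(body: dict) -> list:
    """Return one 'code: message [target]' string per entry in the errors array."""
    lines = []
    for err in body.get("errors", []):
        text = f"{err.get('code', 'unknown')}: {err.get('message', '')}"
        target = err.get("target")
        if target:
            text += f" [{target.get('type')} {target.get('name')}]"
        lines.append(text)
    return lines


# The first error sample shown above (bad request).
error_body = {
    "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
    "errors": [{
        "code": "invalid_type",
        "message": "The `project` field needs to be a uuid v4, but is 12345.",
        "target": {"type": "field", "name": "project"},
    }],
}

messages = describe_errors(error_body)
```

Logging the trace value alongside these messages makes it easier to correlate a failure with service-side logs.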
Retrieve the details of a large number of notebooks inside a project.
POST /v2/notebooks/list
Request
Query Parameters
The guid of the project.
Additional info that will be included into the notebook details. Possible values are:
- runtime
Payload for a notebook list request.
List notebooks
{
"notebooks": [
"ca3c0e27-46ca-83d4-a646-d49b11c14de9"
]
}
The list of notebooks whose details will be retrieved.
Response
A list of notebook info as returned by a list query.
The number of items in the resources list.
Example:
1
An array of notebooks.
Status Code
Success. Returned a list of notebook assets. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "41d09a9a-f771-48a2-9534-50c0c622356d", "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d" }, "entity": { "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "asset": { "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "asset_type": "notebook", "created_at": "2021-07-01T12:37:01Z", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "version": 2, "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
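As a rough illustration of the list request described above, the following Python sketch builds (but does not send) a `POST /v2/notebooks/list` request. The query parameter names `project_id` and `include` are assumptions, because this reference only describes them as "the guid of the project" and "additional info". The base URL and token are placeholders; substitute the base URL of your own deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_list_notebooks_request(base_url, token, project_id, notebook_guids,
                                 include_runtime=False):
    """Build, without sending, a POST /v2/notebooks/list request."""
    # Assumption: the query parameters are named "project_id" and "include".
    query = {"project_id": project_id}
    if include_runtime:
        query["include"] = "runtime"
    url = f"{base_url.rstrip('/')}/v2/notebooks/list?{urlencode(query)}"
    # The body shape {"notebooks": [...]} is taken from the sample payload.
    body = json.dumps({"notebooks": list(notebook_guids)}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


request = build_list_notebooks_request(
    "https://cluster.example.com",  # placeholder base URL
    "my-token",                     # placeholder bearer token
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    ["ca3c0e27-46ca-83d4-a646-d49b11c14de9"],
    include_runtime=True,
)
```

The returned object can be passed to `urllib.request.urlopen` once the placeholders are replaced with real values.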
Delete a particular notebook, including the notebook asset.
DELETE /v2/notebooks/{notebook_guid}
Response
Status Code
Successful request. Notebook is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Revert the main notebook to a version.
PUT /v2/notebooks/{notebook_guid}
Request
Path Parameters
The guid of the main notebook.
Payload for a request to revert to a specific notebook version.
Revert the notebook to a version
{
"source": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
}
The guid of the notebook version.
Example:
ca3c0e27-46ca-83d4-a646-d49b11c14de9
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Reverted the main notebook to a version. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook v4.2", "description": "this is my notebook v4.2", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
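The revert request above can be sketched in Python as follows. The payload key `source` and the `PUT /v2/notebooks/{notebook_guid}` path come from this reference; the base URL and token are placeholders, and the helper name `build_revert_request` is purely illustrative.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_revert_request(base_url, token, notebook_guid, version_guid):
    """Build, without sending, a request that reverts a notebook to a version."""
    # The main notebook guid goes in the path, the version guid in the body.
    url = f"{base_url.rstrip('/')}/v2/notebooks/{quote(notebook_guid, safe='')}"
    body = json.dumps({"source": version_guid}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="PUT")


revert = build_revert_request(
    "https://cluster.example.com",            # placeholder base URL
    "my-token",                               # placeholder bearer token
    "41d09a9a-f771-48a2-9534-50c0c622356d",   # main notebook
    "ca3c0e27-46ca-83d4-a646-d49b11c14de9",   # version to revert to
)
```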
Request
Path Parameters
The guid of the notebook.
Payload for a notebook update request.
Update a notebook
{
"environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
"spark_monitoring_enabled": false,
"kernel": {
"display_name": "Python 3.9 with Spark",
"name": "python39",
"language": "python3"
}
}
The guid of the environment on which the notebook runs.
Example:
d46ca0e27-a646-4de9-a646-9b113c183d4
Spark monitoring enabled or not.
A notebook kernel.
Response
Notebook information as returned by a GET request.
Metadata of a notebook.
Entity of a notebook.
Status Code
Success. Updated the notebook. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4", "spark_monitoring_enabled": false }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Promote a notebook from project to space.
Promote a notebook from project to space.
POST /v2/notebooks/{notebook_guid}/promote
Request
Path Parameters
The guid of the notebook.
Query Parameters
The guid of the notebook version.
The id of the project from which a notebook will be promoted.
Body parameters for promoting a notebook. space_id is required; name and description are optional. If not specified, the name and description of the source notebook in the project are used.
The id of the space to which a notebook will be promoted
Example:
b275be5f-10ff-47ee-bfc9-63f1ce5addbf
The name of the new notebook in space. If not specified, the name of the notebook in project will be used.
Example:
my notebook
The description of the new notebook in space. If not specified, the description of the notebook in project will be used.
Example:
this is my notebook in space
A list of tags for the new notebook in space. If not specified, the tags will be ['notebook'].
Example:
[ "test", "promote" ]
Response
Notebook information in a space as returned by promoting a notebook from project to space.
Metadata of a notebook in a space.
Entity of a notebook without runtime.
Status Code
Success. Returned the notebook asset in the space.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
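One way to assemble the promote body described above is to include only the fields the caller sets, so that the documented defaults apply to the rest. This is a minimal sketch: `space_id`, `name` and `description` are named in this reference, while the key `tags` is an assumption based on the description "a list of tags".

```python
def build_promote_body(space_id, name=None, description=None, tags=None):
    """Return the JSON body for promoting a notebook from a project to a space."""
    if not space_id:
        raise ValueError("space_id is required")
    body = {"space_id": space_id}
    # Optional fields are omitted when unset so the server-side defaults
    # (source notebook name and description) are used.
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    if tags is not None:
        body["tags"] = list(tags)  # assumed key name
    return body


minimal = build_promote_body("b275be5f-10ff-47ee-bfc9-63f1ce5addbf")
full = build_promote_body(
    "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    name="my notebook",
    description="this is my notebook in space",
    tags=["test", "promote"],
)
```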
Create a new version.
Create a version of a given notebook.
POST /v2/notebooks/{notebook_guid}/versions
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the notebook version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
List the versions of a notebook.
List all versions of a particular notebook.
GET /v2/notebooks/{notebook_guid}/versions
Response
A list of notebook versions in a project.
The number of items in the resources array.
Example:
1
An array of notebook versions.
Status Code
Success. Returned a list of versions of the notebook.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } } ] }
{ "total_results": 1, "resources": [ { "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
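The version list returned above can be reduced to the fields most callers need. The sketch below works on a trimmed copy of the sample response and assumes that `created_at` is a Unix timestamp in milliseconds, which the magnitude of the sample value suggests but this reference does not state.

```python
from datetime import datetime, timezone


def summarize_versions(payload):
    """Return guid, rev_id and creation time for each version, newest revision first."""
    summary = []
    for item in payload.get("resources", []):
        meta, entity = item["metadata"], item["entity"]
        # Assumption: created_at is expressed in milliseconds since the epoch.
        created = datetime.fromtimestamp(meta["created_at"] / 1000, tz=timezone.utc)
        summary.append({
            "guid": meta["guid"],
            "rev_id": entity["rev_id"],
            "created": created,
        })
    return sorted(summary, key=lambda version: version["rev_id"], reverse=True)


response = {
    "total_results": 1,
    "resources": [
        {
            "metadata": {
                "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
                "created_at": 1543681714106,
            },
            "entity": {
                "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2",
                "rev_id": 1,
            },
        }
    ],
}
versions = summarize_versions(response)
```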
Retrieve a notebook version.
Retrieve a particular version of a notebook.
GET /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Delete a notebook version.
Delete a particular version of a given notebook.
DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
Status Code
Success. The version is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
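The error samples for the notebook endpoints share one shape: a `trace` identifier and an `errors` array whose entries carry `code`, `message` and, in some cases, `target`. A small helper such as the hypothetical `describe_errors` below can turn that shape into log-friendly lines. It is a sketch based only on the samples shown in this reference.

```python
import json


def describe_errors(payload):
    """Return one readable line per entry in an error response."""
    trace = payload.get("trace", "unknown")
    lines = []
    for error in payload.get("errors", []):
        target = error.get("target")
        # Some samples (for example endpoint_access_forbidden) have no target.
        where = f" [{target['type']}: {target['name']}]" if target else ""
        lines.append(f"{error['code']}{where}: {error['message']} (trace {trace})")
    return lines


sample = json.loads("""
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
  "errors": [ { "code": "invalid_auth_token",
                "message": "The IAM bearer token is not valid.",
                "target": { "type": "header", "name": "Authentication" } } ] }
""")
messages = describe_errors(sample)
```

Including the trace value in each line makes it easier to quote when reporting a problem.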
Create a new prompt / prompt template
This creates a new prompt with the provided parameters.
POST /v1/prompts
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat, detached]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
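Because the model version must match the semantic-version pattern quoted above, it can be checked on the client before a prompt is created. The sketch below reuses that regular expression verbatim; the helper name `is_valid_model_version` is illustrative only.

```python
import re

# Pattern copied from the model_version description in this reference.
SEMVER_PATTERN = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def is_valid_model_version(version):
    """Return True when the string is an acceptable semantic version."""
    return SEMVER_PATTERN.fullmatch(version) is not None


checks = {
    "2.0.0-rc.7": is_valid_model_version("2.0.0-rc.7"),  # documented example
    "1.0": is_valid_model_version("1.0"),                # patch part missing
    "01.2.3": is_valid_model_version("01.2.3"),          # leading zero
}
```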
Get a prompt
This retrieves a prompt / prompt template with the given id.
GET /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Only return a set of model parameters compatible with inferencing.
Default:
true
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Update a prompt
This updates a prompt / prompt template with the given id.
PATCH /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt
This deletes a prompt / prompt template with the given id.
DELETE /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
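Each prompt endpoint states that one target must be supplied per request. The sketch below reads that as "exactly one of project or space" and builds the URL for a single prompt accordingly. The query parameter names `project_id` and `space_id` are assumptions, since this reference describes the parameters without naming them, and the base URL is a placeholder.

```python
from urllib.parse import quote, urlencode


def target_query(project_id=None, space_id=None):
    """Return the query parameters for exactly one target."""
    # Reject both "no target" and "two targets".
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    if project_id is not None:
        return {"project_id": project_id}  # assumed parameter name
    return {"space_id": space_id}          # assumed parameter name


def prompt_url(base_url, prompt_id, project_id=None, space_id=None):
    """Build the URL of /v1/prompts/{prompt_id} for the chosen target."""
    query = urlencode(target_query(project_id, space_id))
    return f"{base_url.rstrip('/')}/v1/prompts/{quote(prompt_id, safe='')}?{query}"


url = prompt_url(
    "https://cluster.example.com",  # placeholder base URL
    "1c29d9a1-9ba6-422d-aa39-517b26adc147",
    project_id="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
)
```

The same URL serves the GET, PATCH and DELETE operations on a prompt; only the HTTP method changes.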
Prompt lock modifications
Modifies the current locked state of a prompt.
PUT /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
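A lock change can be prepared along the following lines. This is a hedged sketch: the body keys `locked` and `lock_type` are assumptions inferred from the field descriptions above, and the locked-by value is left out because the reference says the server computes it.

```python
ALLOWED_LOCK_TYPES = ("edit", "governance")


def build_lock_body(locked, lock_type=None):
    """Return the JSON body for PUT /v1/prompts/{prompt_id}/lock."""
    if not isinstance(locked, bool):
        raise TypeError("locked must be True or False")
    body = {"locked": locked}  # assumed key name
    if lock_type is not None:
        if lock_type not in ALLOWED_LOCK_TYPES:
            raise ValueError(f"lock_type must be one of {ALLOWED_LOCK_TYPES}")
        body["lock_type"] = lock_type  # assumed key name
    # No locked-by field: it is computed by the server.
    return body


acquire = build_lock_body(True, "edit")
release = build_lock_body(False)
```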
Get current prompt lock status
Retrieves the current locked state of a prompt.
GET /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get the inference input string for a given prompt
Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.
POST /v1/prompts/{prompt_id}/input
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override input string that will be used to generate the response. The string can contain template parameters.
Possible values: Value must match regular expression
.*
Example:
Some text with variables.
Supply only to replace placeholders. Object content must be key:value pairs where the 'key' is the parameter to replace and 'value' is the value to use.
- prompt_variables
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
var1
Response
The prompt's input string used for inferences.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
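The request body for computing an input string might be assembled as shown below. `prompt_variables` is named in this reference; the key `input` for the override string is an assumption. The check that every variable is a string key:value pair is a defensive choice of this sketch, not a documented server rule.

```python
def build_input_body(input_text=None, prompt_variables=None):
    """Return the JSON body for POST /v1/prompts/{prompt_id}/input."""
    body = {}
    if input_text is not None:
        body["input"] = input_text  # assumed key name for the override string
    if prompt_variables:
        for key, value in prompt_variables.items():
            # Each entry maps a parameter to replace onto the value to use.
            if not isinstance(key, str) or not isinstance(value, str):
                raise TypeError("prompt variables must be string key:value pairs")
        body["prompt_variables"] = dict(prompt_variables)
    return body


body = build_input_body(
    input_text="Some text with variables.",
    prompt_variables={"var1": "value1"},
)
```

Omitting both arguments yields an empty body, which leaves the stored prompt state untouched.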
Add a new chat item to a prompt
This adds new chat items to the given prompt.
POST /v1/prompts/{prompt_id}/chat_items
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
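The semantic version pattern listed for model_version above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name parse_model_version is hypothetical, and the pattern is copied from the field description.

```python
import re

# Pattern copied from the model_version field description.
SEMVER_PATTERN = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def parse_model_version(value: str) -> dict:
    """Hypothetical helper: split a semantic version string into its parts."""
    match = SEMVER_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a valid semantic version: {value!r}")
    major, minor, patch, prerelease, build = match.groups()
    return {
        "major": int(major),
        "minor": int(minor),
        "patch": int(patch),
        "prerelease": prerelease,  # None when no "-..." suffix is present
        "build": build,            # None when no "+..." suffix is present
    }


# The documented example value.
parsed = parse_model_version("2.0.0-rc.7")
print(parsed)
```

Leading zeros such as "1.02.3" are rejected by the pattern, so the helper raises ValueError for them.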
Get a prompt session
This retrieves a prompt session with the given id.
GET /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Include the most recent entry
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
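As a rough illustration of how the path and query parameters above combine, the sketch below builds the URL for this GET request and checks both IDs against the documented pattern. The function name, the project_id query parameter name, and the example host are assumptions; the base URL is passed in because it depends on the cluster.

```python
import re
from urllib.parse import quote, urlencode

# Pattern documented for the session ID and the project ID.
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def prompt_session_url(base_url: str, session_id: str, project_id: str) -> str:
    """Hypothetical helper: build the URL for GET /v1/prompt_sessions/{session_id}."""
    for label, value in (("session_id", session_id), ("project_id", project_id)):
        if ID_PATTERN.fullmatch(value) is None:
            raise ValueError(f"{label} contains characters outside [a-zA-Z0-9-]")
    # "project_id" as the query parameter name is an assumption.
    query = urlencode({"project_id": project_id})
    return f"{base_url.rstrip('/')}/v1/prompt_sessions/{quote(session_id, safe='')}?{query}"


# The host and the /wx prefix are placeholders; use the base URL of your cluster.
url = prompt_session_url(
    "https://cluster.example.com/wx",
    "1c29d9a1-9ba6-422d-aa39-517b26adc147",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(url)
```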
Update a prompt session
This updates a prompt session with the given id.
PATCH /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned when the update succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
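The length limits on the session name and description can be enforced before the PATCH request is sent. This is a minimal sketch under stated assumptions: the body field names name and description are inferred from the field descriptions, and build_session_update is a hypothetical helper.

```python
import json
import re

# Patterns copied from the request field descriptions.
NAME_PATTERN = re.compile(r"^.{0,100}$")
DESCRIPTION_PATTERN = re.compile(r"^[\s\S]{0,250}")


def build_session_update(name=None, description=None) -> str:
    """Hypothetical helper: build a JSON body for PATCH /v1/prompt_sessions/{session_id}."""
    body = {}
    if name is not None:
        if NAME_PATTERN.fullmatch(name) is None:
            raise ValueError("name must be a single line of at most 100 characters")
        body["name"] = name  # field name is an assumption
    if description is not None:
        # The documented pattern has no end anchor; fullmatch enforces the 250 limit.
        if DESCRIPTION_PATTERN.fullmatch(description) is None:
            raise ValueError("description must be at most 250 characters")
        body["description"] = description  # field name is an assumption
    if not body:
        raise ValueError("nothing to update")
    return json.dumps(body)


payload = build_session_update(name="Session 1", description="My First Prompt Session")
print(payload)
```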
Delete a prompt session
This deletes a prompt session with the given id.
DELETE /v1/prompt_sessions/{session_id}
Add a new prompt to a prompt session
This creates a new prompt associated with the given session.
POST /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Possible values: [structured, freeform, chat]
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get entries for a prompt session
List entries from a given session.
GET /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Bookmark from a previously limited get request
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Limit for results to retrieve, default 20
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
- results
The prompt entry's ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
The prompt entry's name
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Name of an entry
The prompt entry's description
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Description of an entry
The prompt entry's creation time in milliseconds
Example:
1711504485261
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Success - Returned when search completes
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
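The bookmark and limit parameters suggest a page-by-page loop. The sketch below shows one way to walk all entries; it assumes the response carries the entries under results and a bookmark value for the next page (the bookmark field name is an assumption). The HTTP call is injected as a function so the loop can be shown without a live cluster, and the fake pages are illustrative data only.

```python
from typing import Callable, Iterator, Optional


def iter_entries(
    fetch_page: Callable[[Optional[str], int], dict], limit: int = 20
) -> Iterator[dict]:
    """Yield every entry, following bookmarks until no further page is offered."""
    bookmark = None
    while True:
        page = fetch_page(bookmark, limit)
        yield from page.get("results", [])
        bookmark = page.get("bookmark")  # assumed name of the next-page token
        if not bookmark:
            break


# Stand-in for the real GET request: two fake pages keyed by bookmark.
FAKE_PAGES = {
    None: {"results": [{"name": "entry-1"}, {"name": "entry-2"}], "bookmark": "page-2"},
    "page-2": {"results": [{"name": "entry-3"}]},
}


def fake_fetch(bookmark, limit):
    return FAKE_PAGES[bookmark]


names = [entry["name"] for entry in iter_entries(fake_fetch, limit=2)]
print(names)
```

In real use, fetch_page would issue the GET request with the bookmark and limit as query parameters and return the decoded JSON body.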
Add a new chat item to a prompt session entry
This adds new chat items to the given entry.
POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Prompt session lock modifications
Modifies the current locked state of a prompt session.
PUT /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get current prompt session lock status
Retrieves the current locked state of a prompt session.
GET /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt session entry
This retrieves a prompt session entry with the given id.
GET /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt session entry
This deletes a prompt session entry with the given id.
DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
Since Cloud Pak for Data 5.1.0.
POST /ml/v1/text/chat
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens.
text_chat
A text chat example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
},
{
"role": "assistant",
"content": "The Los Angeles Dodgers won the World Series in 2020."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Where was it played?"
}
}
],
"max_tokens": 100,
"temperature": 0,
"time_limit": 1000
}
tool_call
A tool calling example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "user",
"content": {
"type": "text",
"text": "What is the weather like in Boston today?"
}
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"description": "The city, e.g. San Francisco, CA",
"type": "string"
},
"unit": {
"enum": [
"celsius",
"fahrenheit"
],
"type": "string"
}
},
"required": [
"location"
]
}
}
}
],
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
}
}
}
json_mode
A text chat example with json output.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"response_format": {
"type": "json_object"
},
"messages": [
{
"role": "system",
"content": "You are a helpful assistant designed to output JSON."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Who won the world series in 2020?"
}
}
]
}
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Using auto means the model can pick between generating a message or calling one or more tools. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default:
0
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default:
false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.
Default:
1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default:
1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default:
0
The chat response format parameters.
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default:
1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default:
1
Time limit in milliseconds. If generation has not completed within this time, it stops and the text generated so far is returned along with the TIME_LIMIT stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example:
600000
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] }, { "role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020." }, { "role": "user", "content": [ { "type": "text", "text": "Where was it played?" } ] } ], "max_tokens": 100, "temperature": 0, "time_limit": 1000 }'
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "What is the weather like in Boston today?" } ] } ], "tools": [ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "description": "The city, e.g. San Francisco, CA", "type": "string" }, "unit": { "enum": [ "celsius", "fahrenheit" ], "type": "string" } }, "required": [ "location" ] } } } ], "tool_choice": { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather for a location. Call this whenever you need to know the weather, or for example when a customer asks What is the weather like in New York" } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "response_format": { "type": "json_object" }, "messages": [ { "role": "system", "content": "You are a helpful assistant designed to output JSON." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] } ] }'
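The curl calls above can also be assembled from Python. The following is a minimal sketch using only the standard library; the helper name build_chat_request, the host cluster.example.com, and the token value are illustrative assumptions. The request object is constructed but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_chat_request(cluster_url: str, token: str, payload: dict,
                       version: str = "2023-10-25") -> Request:
    """Hypothetical helper: prepare a POST to /ml/v1/text/chat without sending it."""
    url = f"https://{cluster_url}/ml/v1/text/chat?{urlencode({'version': version})}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return Request(url, data=json.dumps(payload).encode("utf-8"),
                   headers=headers, method="POST")


# Payload values are taken from the text chat example.
payload = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",
         "content": [{"type": "text", "text": "Who won the world series in 2020?"}]},
    ],
    "max_tokens": 100,
    "temperature": 0,
    "time_limit": 1000,
}

request = build_chat_request("cluster.example.com", "my-token", payload)
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would send it. On a cluster with a self-signed certificate, the call would also need an SSL context that trusts that certificate.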
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example:
google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A text chat example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 47, "prompt_tokens": 59, "total_tokens": 106 } }
A tool calling example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "tool_calls": [ { "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99", "type": "function", "function": { "name": "get_current_weather", "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n" } } ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 18, "prompt_tokens": 19, "total_tokens": 37 } }
A text chat example with json output.
{ "id": "cmpl-09945b25c805491fb49e15439b8e5d84", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 35, "prompt_tokens": 20, "total_tokens": 55 } }
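In the tool calling sample response, the function arguments arrive as a JSON string nested inside the JSON body, so they need a second decoding step. The sketch below illustrates this with an abbreviated copy of that sample; extract_tool_calls is a hypothetical helper name.

```python
import json

# Abbreviated copy of the tool calling sample response.
SAMPLE_RESPONSE = {
    "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
    "model_id": "meta-llama/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": '{\n "location": "Boston, MA",\n "unit": "fahrenheit"\n}\n',
                        },
                    }
                ],
            },
            "finish_reason": "stop",
        }
    ],
}


def extract_tool_calls(response: dict) -> list:
    """Return (function name, decoded arguments) pairs for every tool call."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls", []):
            function = call["function"]
            # "arguments" is itself a JSON document encoded as a string.
            calls.append((function["name"], json.loads(function["arguments"])))
    return calls


tool_calls = extract_tool_calls(SAMPLE_RESPONSE)
print(tool_calls)
```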
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
Since Cloud Pak for Data 5.1.0.
POST /ml/v1/text/chat_stream
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Using auto means the model can pick between generating a message or calling one or more tools. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default:
0
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default:
false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.
Default:
1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default:
1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default:
0
The chat response format parameters.
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default:
1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default:
1
Time limit in milliseconds. If generation has not completed within this time, it stops and the text generated so far is returned along with the TIME_LIMIT stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example:
600000
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
A unique identifier for the chat completion.
The model used for the chat completion.
Example:
google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
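Because the stream arrives as lines of the form data: {<json event>}, a client has to strip the prefix and decode each event separately. The following is a minimal sketch of that step only; iter_sse_events is a hypothetical helper, and the two sample lines are made-up placeholders that use field names from the response schema, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode the JSON payload of every 'data:' line in a server-sent event stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            # Skip blank separator lines and other SSE fields such as "id:" or "event:".
            continue
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Placeholder stream for illustration only.
SAMPLE_STREAM = [
    'data: {"id": "cmpl-1", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352}',
    "",
    'data: {"id": "cmpl-1", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352}',
    "",
]

events = list(iter_sse_events(SAMPLE_STREAM))
print(len(events), events[0]["model_id"])
```

With a live connection, the same function can be fed the decoded lines of the HTTP response body as they arrive.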
Generate embeddings
Generate embeddings from text input.
See the documentation for a description of text embeddings.
Since watsonx.ai 2.0.0.
POST /ml/v1/text/embeddings
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example:
2023-07-07
The text input for a given model to be used to generate the embeddings.
A sample request.
A simple request.
{
"model_id": "slate",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
"Youth craves thrills while adulthood cherishes wisdom."
]
}
The id of the model to be used for this request. Please refer to the list of models.
The input text.
Possible values: number of items ≤ 1000
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Parameters for text embedding requests.
curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "inputs": [ "Youth craves thrills while adulthood cherishes wisdom.", "Youth seeks ambition while adulthood finds contentment.", "Dreams chased in youth while goals pursued in adulthood." ], "model_id": "slate", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The embedding values for a given text.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The number of input tokens that were consumed.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
An array of embeddings for each input string.
{ "model_id": "slate", "results": [ { "embedding": [ -0.006929283, -0.005336422, -0.024047505 ] } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 10 }
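The request rules and the response shape above can be sketched in Python. This is a minimal sketch that uses only the standard library; the helper names `build_embeddings_request` and `extract_embeddings` are hypothetical, the request is built but not sent, and the default version date is simply the one used in the curl example.

```python
import json
import urllib.request

MAX_INPUTS = 1000  # documented limit on the number of items in "inputs"


def build_embeddings_request(cluster_url, token, model_id, inputs,
                             project_id=None, space_id=None,
                             version="2023-10-25"):
    """Build (but do not send) a POST request for /ml/v1/text/embeddings."""
    if len(inputs) > MAX_INPUTS:
        raise ValueError("at most 1000 inputs are allowed")
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    body = {"model_id": model_id, "inputs": list(inputs)}
    if project_id is not None:
        body["project_id"] = project_id
    if space_id is not None:
        body["space_id"] = space_id
    return urllib.request.Request(
        url=f"https://{cluster_url}/ml/v1/text/embeddings?version={version}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


def extract_embeddings(response_body):
    """Return one embedding vector per input from a parsed JSON response."""
    return [item["embedding"] for item in response_body["results"]]
```

The returned object can be passed to `urllib.request.urlopen` once a real cluster URL and bearer token are available.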
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
POST /ml/v1/text/generation
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personally identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
System details.
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values are lower-cased, so compare them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
The generated text from the model along with other details.
{ "model_id": "google/flan-t5-xl", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.", "generated_token_count": 118, "input_token_count": 11, "stop_reason": "eos_token", "moderations": { "pii": [ { "score": 0.8, "input": false, "position": { "start": 74, "end": 88 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 200, "end": 212 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 244, "end": 259 }, "entity": "EmailAddress" } ] } } ] }
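The response fields described above can be read with a few lines of Python. This is a minimal sketch, not part of the API: the helper name `summarize_results` is hypothetical, and treating `max_tokens`, `time_limit` and `token_limit` as "truncated" output is an assumption made for illustration.

```python
# Assumption: these stop reasons mean the output was cut short
# rather than finished naturally.
TRUNCATING_REASONS = {"max_tokens", "time_limit", "token_limit"}


def summarize_results(response_body):
    """Summarize each entry of "results" in a parsed text generation response."""
    summaries = []
    for result in response_body["results"]:
        # Compare stop reasons case-insensitively, as the field description advises.
        stop_reason = result["stop_reason"].lower()
        pii_hits = result.get("moderations", {}).get("pii", [])
        summaries.append({
            "text": result["generated_text"],
            "stop_reason": stop_reason,
            "truncated": stop_reason in TRUNCATING_REASONS,
            "pii_entities": [hit["entity"] for hit in pii_hits],
        })
    return summaries
```

Applied to the second sample response above, `pii_entities` would list the two `PhoneNumber` detections followed by the `EmailAddress` detection.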
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
POST /ml/v1/text/generation_stream
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personally identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual json event is described below.
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values are lower-cased, so compare them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
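Because no sample response is given for the event stream, the following Python sketch shows one way the `data: {<json event>}` lines could be consumed. It is a sketch under stated assumptions: the helper names `iter_sse_events` and `collect_text` are hypothetical, and it assumes that each event carries the next piece of `generated_text`, so that concatenating the pieces yields the full output.

```python
import json


def iter_sse_events(lines):
    """Yield the parsed JSON payload of each `data:` line in an event stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_text(lines):
    """Concatenate the streamed text and return it with the last stop reason."""
    parts = []
    stop_reason = "not_finished"
    for event in iter_sse_events(lines):
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
            stop_reason = result.get("stop_reason", stop_reason).lower()
    return "".join(parts), stop_reason
```

`lines` can be any iterable of text or byte lines, for example the response object returned by `urllib.request.urlopen`.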
Rerank texts
Rerank a list of input texts against a query.
POST /ml/v1/text/rerank
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input texts and the queries for reranking.
A sample request.
{
"model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
{
"text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
},
{
"text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
}
],
"query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
"parameters": {
"return_options": {
"top_n": 2
}
}
}
The id of the model to be used for this request. Please refer to the list of models.
The rank input strings.
Possible values: 0 ≤ number of items ≤ 1000
The rank query.
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The properties used for reranking.
curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "inputs": [ { "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I'\''ve come to appreciate the comforting stability of a well-established routine." }, { "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life'\''s novelties, while as a responsible adult, I'\''ve come to understand the profound value of accumulated wisdom and life experience." } ], "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.", "parameters": { "return_options": { "top_n": 2 } } }'
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The ranked results.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The number of input tokens that were consumed.
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The rank query, if requested.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
An array of ranked results, one for each input string.
{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "results": [ { "index": 1, "score": 0.7461 }, { "index": 0, "score": 0.8274 } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 20 }
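To turn the ranked results back into readable text, the scores can be joined with the original inputs. The sketch below is illustrative only: the helper name `rank_inputs` is hypothetical, and it assumes that each `index` refers to the position of the text in the request's `inputs` array. It sorts by score itself rather than relying on the order of `results`.

```python
def rank_inputs(inputs, response_body):
    """Return (text, score) pairs ordered from the highest score to the lowest."""
    ordered = sorted(response_body["results"],
                     key=lambda result: result["score"],
                     reverse=True)
    # Assumption: "index" is the position of the text in the request "inputs".
    return [(inputs[result["index"]]["text"], result["score"])
            for result in ordered]
```

With the sample request and response above, the first pair returned would be the input at index 0 with a score of 0.8274.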
Text tokenization
The text tokenize operation lets you check how the provided input is converted to tokens for a given model. It splits text into words or sub-words, which are then converted to ids through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.
POST /ml/v1/text/tokenization
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input string to tokenize.
A sample request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"model_id": "google/flan-ul2",
"input": "Write a tagline for an alumni association: Together we",
"parameters": {
"return_tokens": true
}
}
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The input string to tokenize.
Example:
Write a tagline for an alumni association: Together we
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The parameters for text tokenization.
curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-ul2", "input": "Write a tagline for an alumni association: Together we", "parameters": { "return_tokens": true }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
The tokenization result.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The result of tokenizing the input string.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The response with the token count and the tokens, if requested.
{ "model_id": "google/flan-ul2", "result": { "token_count": 11, "tokens": [ "Write", "a", "tag", "line", "for", "an", "alumni", "associ", "ation:", "Together", "we" ] } }
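The tokenization request body and response can be handled as plain dictionaries. The following Python sketch is a minimal illustration; the helper names `build_tokenization_body` and `read_tokenization` are hypothetical and not part of any IBM library.

```python
def build_tokenization_body(model_id, text, project_id, return_tokens=True):
    """Build the JSON body for POST /ml/v1/text/tokenization."""
    return {
        "model_id": model_id,
        "input": text,
        "project_id": project_id,
        "parameters": {"return_tokens": return_tokens},
    }


def read_tokenization(response_body):
    """Return (token_count, tokens); tokens is empty if they were not requested."""
    result = response_body["result"]
    return result["token_count"], result.get("tokens", [])
```

Checking the token count before sending a prompt for inference is one way to see how much of a model's input limit a prompt would consume.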
Create a new watsonx.ai training
Create a new watsonx.ai training in a project or a space.
The details of the base model and parameters for the training must be provided in the prompt_tuning object.
To deploy the tuned model, follow these steps:
- Create a WML model asset, in a space or a project, by providing the request.json as shown below:
  curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" -H "Authorization: Bearer <replace with your token>" -H "content-type: application/json" --data '{ "name": "replace_with_a_meaningful_name", "space_id": "replace_with_your_space_id", "type": "prompt_tune_1.0", "software_spec": { "name": "watsonx-textgen-fm-1.0" }, "metrics": [ from the training job ], "training": { "id": "05859469-b25b-420e-aefe-4a5cb6b595eb", "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "generation", "verbalizer": "Input: {{input}} Output:" }, "training_data_references": [ { "connection": { "id": "20933468-7e8a-4706-bc90-f0a09332b263" }, "id": "file_to_tune1.json", "location": { "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl", "path": "file_to_tune1.json" }, "type": "connection_asset" } ] }'
  Notes:
  - If you used the training request field auto_update_model: true, you can skip this step because the model was saved at the end of the training job.
  - Rather than creating the payload for the model, you can use the generated request.json that was stored in the results_reference field; look for the path in the field entity.results_reference.location.model_request_path.
  - The model type must be prompt_tune_1.0.
  - The software spec name must be watsonx-textgen-fm-1.0.
- Create a tuned model deployment as described in the create deployment documentation.
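The model asset payload from the first step can be sketched in Python, starting from a training resource as returned by the trainings API. This is a rough sketch under assumptions: the helper names are hypothetical, the mapping from training fields to payload fields is inferred from the curl example and the sample responses, and the verbalizer is passed in by the caller because the sample training responses do not show it.

```python
MODEL_TYPE = "prompt_tune_1.0"                 # required model type
SOFTWARE_SPEC_NAME = "watsonx-textgen-fm-1.0"  # required software spec name


def model_request_path(training):
    """Return the path of the generated request.json for a training resource."""
    return training["entity"]["results_reference"]["location"]["model_request_path"]


def build_model_asset_payload(name, space_id, training, verbalizer):
    """Build a model asset payload from a parsed training resource."""
    entity = training["entity"]
    tuning = entity["prompt_tuning"]
    return {
        "name": name,
        "space_id": space_id,
        "type": MODEL_TYPE,
        "software_spec": {"name": SOFTWARE_SPEC_NAME},
        # Metrics come from the training job; empty if none were reported.
        "metrics": entity["status"].get("metrics", []),
        "training": {
            "id": training["metadata"]["id"],
            "base_model": tuning["base_model"],
            "task_id": tuning["task_id"],
            "verbalizer": verbalizer,
        },
        "training_data_references": entity["training_data_references"],
    }
```

When the generated request.json is available at `model_request_path`, using that file directly avoids rebuilding the payload by hand.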
POST /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The training_data_references contain the training datasets, and the results_reference contains the connection where the results will be stored.
Start a prompt tune training job.
{
"name": "my-prompt-tune-training",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"prompt_tuning": {
"base_model": {
"model_id": "google/flan-t5-xl"
},
"tuning_type": "prompt_tuning",
"task_id": "classification",
"num_epochs": 30,
"learning_rate": 0.4,
"accumulate_steps": 3,
"batch_size": 10,
"max_input_tokens": 100,
"max_output_tokens": 100
},
"training_data_references": [
{
"id": "tune1_data.json",
"location": {
"path": "tune1_data.json"
},
"type": "container"
}
],
"auto_update_model": true,
"results_reference": {
"location": {
"path": "tune1/results"
},
"type": "container"
}
}
The name of the training.
Example:
my-prompt-training
The training results. Normally this is specified as type=container (Service) or type=fs (Software), which means that it is stored in the space or project.
Examples: { "location": { "path": "results" }, "type": "container" }
- results_reference
The data source type like connection_asset, container (Service) or fs (Software).
Allowable values: [connection_asset, container, fs]
Example:
connection_asset
Contains a set of fields that describe the location of the data with respect to the connection.
Examples: { "bucket": "wml-v4-fvt-remote-tests", "file_name": "heart_testpy379.csv" }
- location
Item identification inside a collection.
Contains a set of fields specific to each connection. See here for details about specifying connections.
Examples:{ "id": "2d07a6b4-8fa9-43ab-91c8-befcd9dab8d2" }
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
A description of the training.
Example:
My prompt training.
A list of tags for this resource.
Examples:[ "t1", "t2" ]
Properties to control the prompt tuning.
Examples:{ "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "summarization", "tuning_type": "prompt_tuning", "num_epochs": 30, "learning_rate": 0.4, "accumulate_steps": 3, "batch_size": 10, "max_input_tokens": 100, "max_output_tokens": 100 }
Training datasets.
Examples:[ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ]
User defined properties specified as key-value pairs.
Examples:{ "name": "model", "size": 2 }
- custom
If set to true then the result of the training, if successful, will be uploaded to the repository as a model.
Default:
false
Example:
true
curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "name": "my-prompt-tune-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification", "tuning_type": "prompt_tuning", "num_epochs": 30, "learning_rate": 0.4, "accumulate_steps": 3, "batch_size": 10, "max_input_tokens": 100, "max_output_tokens": 100 }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results" }, "type": "container" } }'
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
Status of the training job.
Examples:{ "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" } }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
The training job has been created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } }
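The training request body shown earlier can also be assembled programmatically. The Python sketch below mirrors the structure of the sample request; the helper name `build_prompt_tuning_request` is hypothetical, and it assumes the training data and results are stored with `type` set to `container`, as in the sample.

```python
def build_prompt_tuning_request(name, project_id, base_model_id, task_id,
                                training_file, results_path,
                                auto_update_model=True, **hyperparameters):
    """Build the JSON body for POST /ml/v4/trainings (prompt tuning)."""
    prompt_tuning = {
        "base_model": {"model_id": base_model_id},
        "tuning_type": "prompt_tuning",
        "task_id": task_id,
    }
    # Optional settings such as num_epochs, learning_rate or batch_size.
    prompt_tuning.update(hyperparameters)
    return {
        "name": name,
        "project_id": project_id,
        "prompt_tuning": prompt_tuning,
        "training_data_references": [
            {"id": training_file,
             "location": {"path": training_file},
             "type": "container"}
        ],
        "auto_update_model": auto_update_model,
        "results_reference": {"location": {"path": results_path},
                              "type": "container"},
    }
```

Serializing the returned dictionary with `json.dumps` gives a body equivalent to the one passed to `--data-raw` in the curl example.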
Retrieve the list of trainings
Retrieve the list of trainings for the specified space or project.
GET /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Compute the total count. May have performance impact.
Return only the resources with the given tag value.
Filter based on the training job state.
Allowable values: [queued, pending, running, storing, completed, failed, canceled]
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
Information for paging when querying resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
The training resources.
Optional details coming from the service and related to the API call or the associated resource.
Examples:{ "warnings": [ { "message": "This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations.", "id": "DisclaimerWarning" } ] }
- system
Any warnings coming from the system.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "limit": 100, "first": { "href": "https://{cluster_url}/ml/v4/trainings" }, "total_count": 1, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } } ] }
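Since the pagination token is only available through the href in the next field, a client can simply follow that href until it is absent. The Python sketch below illustrates this; `iter_trainings` is a hypothetical helper, and `fetch_page` stands for any caller-supplied function that performs the authenticated GET and returns the parsed JSON body.

```python
def iter_trainings(fetch_page, first_url):
    """Yield every training resource, following next.href page by page."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The last page has no "next" field, which ends the loop.
        url = (page.get("next") or {}).get("href")
```

Keeping the HTTP call outside the helper makes it easy to plug in any client and to test the paging logic without a cluster.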
Retrieve the training
Retrieve the training with the specified identifier.
GET /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
Status of the training job.
Examples: { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" } }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } }
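Once the training resource has been retrieved, the job state and the per-iteration loss can be read from `entity.status`. The sketch below assumes the response has the shape of the example above; `summarize_training` is a hypothetical helper, and the embedded JSON is a trimmed copy of that example rather than live output.

```python
import json

# Trimmed copy of the example response above; only the fields read below are kept.
EXAMPLE_RESPONSE = """
{
  "metadata": {"id": "6213cf1-252f-424b-b52d-5cdd9814956c",
               "name": "my-prompt-training"},
  "entity": {
    "status": {
      "state": "completed",
      "metrics": [
        {"iteration": 0, "ml_metrics": {"loss": 4.49988}},
        {"iteration": 1, "ml_metrics": {"loss": 3.86884}},
        {"iteration": 2, "ml_metrics": {"loss": 4.05115}}
      ]
    }
  }
}
"""


def summarize_training(resource: dict) -> dict:
    """Extract the id, the state, and (iteration, loss) pairs from a training resource."""
    status = resource.get("entity", {}).get("status", {})
    # "metrics" may be absent, for example when the status carries only a message.
    losses = [
        (metric.get("iteration"), metric["ml_metrics"]["loss"])
        for metric in status.get("metrics", [])
        if "loss" in metric.get("ml_metrics", {})
    ]
    return {
        "id": resource.get("metadata", {}).get("id"),
        "state": status.get("state"),
        "losses": losses,
    }


summary = summarize_training(json.loads(EXAMPLE_RESPONSE))
print(summary["state"], summary["losses"])
```

Reading every field with `.get` keeps the helper usable for responses that omit `metrics`, as in the status example that carries a `message` instead.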
Cancel or delete the training
Cancel or delete the specified training. Once deleted, all trace of the job is gone.
DELETE /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter must be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter must be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
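As an illustration of the request described above, the following sketch prepares a DELETE request with Python's standard library without sending it. The helper name is hypothetical, and the query parameter names `version` and `hard_delete` are assumptions, because the parameter descriptions above do not name them; check them against the full API reference before use.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_delete_request(cluster_url, training_id, token, version, *,
                         space_id=None, project_id=None, hard_delete=False):
    """Prepare DELETE /ml/v4/trainings/{training_id} (hypothetical helper)."""
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")

    # "version" and "hard_delete" are assumed parameter names, see the note above.
    params = {"version": version}
    if space_id is not None:
        params["space_id"] = space_id
    if project_id is not None:
        params["project_id"] = project_id
    if hard_delete:
        # Also delete the job or request metadata.
        params["hard_delete"] = "true"

    path_id = quote(training_id, safe="")
    url = f"https://{cluster_url}/ml/v4/trainings/{path_id}?{urlencode(params)}"
    return Request(url, method="DELETE",
                   headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    # cpd.example.com and my-token are placeholders.
    request = build_delete_request(
        "cpd.example.com",
        "6213cf1-252f-424b-b52d-5cdd9814956c",
        "my-token",
        "2023-07-07",
        space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f",
        hard_delete=True,
    )
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it possible to inspect the method, URL, and headers first, which is useful because a delete cannot be undone.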