Introduction to IBM watsonx.ai software
With the IBM watsonx.ai software APIs, you can run text inference, prompt tuning, and more on large language models (LLMs).
If you are looking for the IBM watsonx.ai as a Service APIs, see here.
Step-by-step instructions on how to use IBM watsonx.ai software can be found here.
A specialized Python library is available to access this REST API.
Endpoint URLs
The base URLs for API endpoints come from the cluster and add-on service instance. The URL follows this pattern:
https://{cluster_url}/ml/v1
{cluster_url} represents the name or IP address of your deployed cluster. Use a hostname that resolves to an IP address in the cluster.
To find the base URL, view the details for the service instance from the Cloud Pak for Data web client.
For prompts, notebooks, and vector indexes, the base URL uses the /wx path:
https://{cluster_url}/wx
Use that URL in your requests to the API.
Endpoint example
curl -k -X {request_method} -H "Authorization: Bearer {token}" "https://{cluster_url}/ml/v1/text/generation"
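The URL pattern described above can be captured in a small helper. The following is a minimal Python sketch, not part of the API: the function name base_url and the host cpd.example.com are illustrative assumptions, and only the /ml/v1 and /wx paths come from this reference.

```python
# Sketch only: base_url and the example host are illustrative assumptions.
BASE_PATHS = {"ml": "/ml/v1", "wx": "/wx"}  # paths taken from the pattern above


def base_url(cluster_url: str, area: str = "ml") -> str:
    """Build the base URL for an API area from a cluster host name or IP address."""
    if area not in BASE_PATHS:
        raise ValueError(f"unknown API area: {area!r}")
    host = cluster_url.strip().rstrip("/")
    if host.startswith("https://"):
        host = host[len("https://"):]  # tolerate a host that already carries its scheme
    return f"https://{host}{BASE_PATHS[area]}"


if __name__ == "__main__":
    print(base_url("cpd.example.com") + "/text/generation")
    print(base_url("cpd.example.com", "wx"))
```

Keeping the two base paths in one table makes it harder to send a prompt or notebook request to the /ml/v1 base by mistake.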
Disabling SSL verification
Watson Machine Learning uses Secure Sockets Layer (SSL) or Transport Layer Security (TLS) for secure connections between the client and server. The connection is verified against the local certificate store to ensure authentication, integrity, and confidentiality.
If you use a self-signed certificate, you need to disable SSL verification to make a successful connection.
Enabling SSL verification is highly recommended. Disabling SSL jeopardizes the security of the connection and data. Disable SSL only if necessary, and take steps to enable SSL as soon as possible.
To disable SSL verification for a curl request, use the --insecure (-k) option with the request.
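For a client written in Python, the equivalent of the curl option can be sketched with the standard ssl module. This is an illustration under assumptions: the helper name make_ssl_context is hypothetical, and verification stays enabled unless the caller explicitly turns it off.

```python
import ssl


def make_ssl_context(verify: bool = True) -> ssl.SSLContext:
    """Return a TLS context; verification is on unless explicitly disabled."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname must be switched off before verify_mode can be relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

The returned context can be passed to urllib.request.urlopen through its context argument. Use verify=False only for clusters with self-signed certificates, and only temporarily.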
Authentication
A bearer token is required to use any of the watsonx.ai APIs.
For more information, see the Authorization section of the Platform API reference.
Use the value of the access_token property from the example request. Set the access_token value as the authorization header parameter for requests to the APIs. The format is Authorization: Bearer {access_token_value}:
Authorization: Bearer eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6IlJTMjU2In0...
Example request that uses a username and password to retrieve the token
curl -k -X POST "https://{cluster_url}/icp4d-api/v1/authorize" -H "cache-control: no-cache" -H "content-type: application/json" -d "{\"username\":\"admin\",\"password\":\"password\"}"
Response
{
"username": "admin",
"role": "Admin",
"permissions": [
"administrator"
],
"sub": "admin",
"iss": "KNOXSSO",
"aud": "DSX",
"uid": "999",
"authenticator": "default",
"access_token": "eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6...",
"_messageCode_": "success"
}
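The token exchange above can be sketched in Python with the standard library. The function names build_authorize_request and authorization_header are hypothetical, and the sketch only prepares the request and parses a response body; it does not contact a cluster.

```python
import json
import urllib.request


def build_authorize_request(cluster_url: str, username: str, password: str) -> urllib.request.Request:
    """Prepare the POST request that exchanges credentials for a bearer token."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"https://{cluster_url}/icp4d-api/v1/authorize",
        data=body,
        headers={"content-type": "application/json", "cache-control": "no-cache"},
        method="POST",
    )


def authorization_header(response_body: str) -> dict:
    """Turn the JSON response into the Authorization header for later calls."""
    payload = json.loads(response_body)
    return {"Authorization": f"Bearer {payload['access_token']}"}
```

A caller would send the prepared request with urllib.request.urlopen and pass the decoded response text to authorization_header.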
Error handling
This API uses standard HTTP response codes to indicate whether a method completed successfully.
A 200 type response indicates success.
| HTTP Code | Description | Recovery |
|---|---|---|
| 200 | Success | The request was successful. |
| 400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body. |
| 401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. For more information about logging in, see the Authentication section. If this error persists, contact the account owner to check your permissions. |
| 403 | Forbidden | The supplied authentication is not authorized. |
| 404 | Not Found | The requested resource could not be found. |
Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.
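A client can turn the status codes above into a simple decision. The sketch below is an assumption about client-side handling, not documented API behavior: treating 429 and 503 as retryable is a design choice, and the names classify and describe are hypothetical.

```python
# Sketch only: the retry classification is an assumption, not documented API behavior.
RECOVERY_HINTS = {
    400: "Check the request body and include all required parameters.",
    401: "Log in again or provide a valid token.",
    403: "The supplied authentication is not authorized.",
    404: "The requested resource could not be found.",
}
RETRYABLE = {429, 503}  # the model may be overloaded or unavailable


def classify(status: int) -> str:
    """Return 'success', 'retry', or 'fail' for an HTTP status code."""
    if 200 <= status < 300:
        return "success"
    if status in RETRYABLE:
        return "retry"
    return "fail"


def describe(status: int) -> str:
    """Return a short recovery hint for a failed request."""
    if status in RETRYABLE:
        return "Check the error description; the model may be overloaded or unavailable."
    return RECOVERY_HINTS.get(status, "Unexpected status; inspect the response body.")
```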
Additional headers
Some additional headers might be required to make successful requests to the API. Those additional headers are described below.
You can pass an optional transaction ID with your request, which is useful for tracking calls through multiple services with one identifier. Set the header key to X-Global-Transaction-Id; the value can be anything that you choose.
If no transaction ID is passed in, one is generated randomly.
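As an illustration, the headers for a request can be assembled in one place. In this Python sketch the function name request_headers is hypothetical, and generating the transaction ID on the client (instead of letting the service generate one) is an assumption made so that the caller knows the identifier in advance.

```python
import uuid
from typing import Optional


def request_headers(token: str, transaction_id: Optional[str] = None) -> dict:
    """Build request headers with a bearer token and a transaction ID."""
    if transaction_id is None:
        # Assumption: any unique string is acceptable as the header value.
        transaction_id = uuid.uuid4().hex
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
        "X-Global-Transaction-Id": transaction_id,
    }
```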
API change log
In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API.
The change log lists changes that have been made, ordered by the date they were released.
Changes to existing API versions are designed to be compatible with existing client applications.
If a change is not compatible, a new version date is created.
Versioning
API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.
When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.
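Because every request needs the version parameter, it can help to add it in one helper. The following Python sketch uses a hypothetical function name, with_version, and checks the YYYY-MM-DD format on the client before the request is sent.

```python
import re
from datetime import date
from urllib.parse import urlencode


def with_version(url: str, version: str, **params: str) -> str:
    """Append query parameters, including version=YYYY-MM-DD, to a URL."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError(f"version must look like YYYY-MM-DD, got {version!r}")
    date.fromisoformat(version)  # also rejects impossible dates such as 2024-13-40
    query = urlencode({**params, "version": version})
    return f"{url}?{query}"
```

Pinning the version string in a single constant that is passed to this helper keeps the client on a known API behavior until you choose to move to a newer date.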
Data References
Accessing data in a remote location (such as a Cloud Object Storage bucket, or an SQL or NoSQL database) requires
the use of the connection_asset or data_asset reference types.
These reference types are created within a space or a project and are referenced in requests to represent input
data and results locations. These types contain two parameter objects, connection and location, which require
different values to be supplied based on the reference type. A data_asset requires an href to be supplied
in the location object, whereas a connection_asset requires the connection ID in the connection object
and different location fields depending on the data source type.
Example connection_asset payload:
{
"training_data_references": [
{
"type": "connection_asset",
"connection": {
"id": "<connection_guid>"
},
"location": {
"<wdp-properties depending on the type>": "<value depending on the type>"
}
}
]
}
Example data_asset payload:
{
"training_data_references": [
{
"type": "data_asset",
"location": {
"href": "/v2/assets/<asset_id>?space_id=<space_id>"
}
}
]
}
Example fs payload:
- project_id
{
"training_data_references": [
{
"type":"fs",
"location":{
"path":"/projects/<project_id>/assets/<fs_path>"
}
}
]
}
- space_id
{
"training_data_references": [
{
"type":"fs",
"location":{
"path":"/spaces/<space_id>/assets/<fs_path>"
}
}
]
}
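The payload shapes above can be produced by small builder functions. This Python sketch is illustrative only: the function names are hypothetical, and the location fields for a connection_asset are passed through unchanged because they depend on the data source type.

```python
def data_asset_reference(asset_id: str, space_id: str) -> dict:
    """Reference a data asset by href, as in the data_asset example."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def connection_asset_reference(connection_id: str, location: dict) -> dict:
    """Reference data through a connection; location fields vary by source type."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": dict(location),
    }


def fs_reference(scope: str, scope_id: str, fs_path: str) -> dict:
    """Reference a file system path under a project or a space."""
    if scope not in ("projects", "spaces"):
        raise ValueError("scope must be 'projects' or 'spaces'")
    return {"type": "fs", "location": {"path": f"/{scope}/{scope_id}/assets/{fs_path}"}}
```

A request body would then wrap one or more of these dictionaries in a list, for example under training_data_references.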
Methods
Create a new AI service
Create a new AI service with the given payload. An AI service is code that can be deployed as a deployment.
POST /ml/v4/ai_services
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Payload for creating the AI service. Either space_id or project_id must be provided.
A sample request.
{
"name": "ai-app-1",
"software_spec": {
"id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
},
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"documentation": {
"request": {
"application/json": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"query": {
"type": "string"
},
"parameters": {
"properties": {
"max_new_tokens": {
"type": "integer"
},
"top_p": {
"type": "number"
}
},
"required": [
"max_new_tokens",
"top_p"
]
}
},
"required": [
"query"
]
},
"application/png": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"image": {
"type": "string",
"format": "binary"
}
},
"required": [
"image"
]
}
},
"response": {
"application/json": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"query": {
"type": "string"
},
"result": {
"type": "string"
}
},
"required": [
"query",
"result"
]
},
"application/png": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "string",
"format": "binary"
}
}
},
"tooling": {
"reference_format": true
}
}
The space that contains the resource.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The name of the resource.
Example: my-resource
A software specification.
A description of the resource.
Example: This is my first resource.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
The type that allows the deployment service to know how to set up the code during deployment.
Allowable values: [python]
The documentation of the AI service request body and response body.
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
tooling: User defined properties specified as key-value pairs, which are propagated to the deployment.
Since watsonx.ai 2.2.0.
Examples: { "name": "reference_format", "tag": true }
curl --request POST 'https://{cluster_url}/ml/v4/ai_services?version=2024-10-17' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "name": "ai-service-1", "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "software_spec": { "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309" }, "documentation": { "request": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "parameters": { "properties": { "max_new_tokens": { "type": "integer" }, "top_p": { "type": "number" } }, "required": [ "max_new_tokens", "top_p" ] } }, "required": [ "query" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "image": { "type": "string", "format": "binary" } }, "required": [ "image" ] } }, "response": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "result": { "type": "string" } }, "required": [ "query", "result" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "string", "format": "binary" } } } }'
Response
The information for an AI service.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
entity: The details of the AI service to be created.
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
tooling: User defined properties specified as key-value pairs, which are propagated to the deployment.
Since watsonx.ai 2.2.0.
Examples: { "name": "reference_format", "tag": true }
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service created
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The response with the result.
{ "metadata": { "id": "b53c5118-b1ca-43ef-a597-ef839ff7129f", "name": "ai-app-1", "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-05-02T16:27:51Z" }, "entity": { "software_spec": { "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309" }, "documentation": { "request": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "parameters": { "properties": { "max_new_tokens": { "type": "integer" }, "top_p": { "type": "number" } }, "required": [ "max_new_tokens", "top_p" ] } }, "required": [ "query" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "image": { "type": "string", "format": "binary" } }, "required": [ "image" ] } }, "response": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "result": { "type": "string" } }, "required": [ "query", "result" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "string", "format": "binary" } } } } }
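The request body for creating an AI service can be assembled and checked before it is sent. The Python sketch below is a hedged illustration: the function names are hypothetical, and the validation mirrors only the constraints stated in this section (identifier length of 36, the regular expression, and the need for space_id or project_id).

```python
import re

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")  # regular expression from the parameter description


def check_id(value: str) -> str:
    """Validate a space or project identifier (36 characters; letters, digits, hyphens)."""
    if len(value) != 36 or not ID_PATTERN.fullmatch(value):
        raise ValueError(f"invalid identifier: {value!r}")
    return value


def ai_service_payload(name, software_spec_id, space_id=None, project_id=None, **extra):
    """Build the JSON body for POST /ml/v4/ai_services."""
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be provided")
    payload = {"name": name, "software_spec": {"id": software_spec_id}}
    if space_id is not None:
        payload["space_id"] = check_id(space_id)
    if project_id is not None:
        payload["project_id"] = check_id(project_id)
    payload.update(extra)  # for example documentation, tooling, or tags
    return payload
```

Serializing the result with json.dumps gives a body equivalent in shape to the sample request shown earlier in this section.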
Retrieve the AI services
Retrieve the AI services for the specified space or project.
GET /ml/v4/ai_services
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
Return only the resources with the given tag values, separated by "or" or "and" to support multiple tags.
Example: tf2.0 or tf2.1
Returns only resources that match this search string. The path to the field must be the complete path to the field, and this field must be one of the indexed fields for this resource type. Note that the search string must be URL encoded.
These are the fields that can be searched in the metadata:
- /metadata/name
Note that tags are filtered using the tag query parameter, and the tag query parameter takes precedence over the search query parameter.
The metadata fields, on all assets, can be searched like this:
/metadata/name=asset2 -> search=%2Fmetadata%2Fname%3Dasset2
These are the fields that can be searched in the entity and that depend on the asset type:
- model: /entity/type, /entity/software_spec.id
- function: /entity/software_spec.id
- ai_service: /entity/software_spec.id
The entity fields can be searched like this:
/entity/type=tensorflow_2.14 -> search=%2Fentity%2Ftype%3Dtensorflow_2.14
Possible values: length ≥ 1
curl --request GET 'https://{cluster_url}/ml/v4/ai_services?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
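The URL encoding of a search string can be reproduced with the Python standard library. In this sketch the function name search_param is hypothetical; the encoding itself uses urllib.parse.quote with no characters treated as safe, so that the slashes and the equals sign are percent-encoded.

```python
from urllib.parse import quote


def search_param(field_path: str, value: str) -> str:
    """Return the URL-encoded search query parameter for one field comparison."""
    return "search=" + quote(f"{field_path}={value}", safe="")


if __name__ == "__main__":
    print(search_param("/metadata/name", "asset2"))
```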
Retrieve the AI service
Retrieve the AI service with the specified identifier. If the rev query parameter is provided, rev=latest fetches the latest revision, and rev={revision_number} fetches the record for the given revision number. Either space_id or project_id must be provided.
GET /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The revision number of the resource.
Example: 2
curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
Response
The information for an AI service.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
entity: The details of the AI service to be created.
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
tooling: User defined properties specified as key-value pairs, which are propagated to the deployment.
Since watsonx.ai 2.2.0.
Examples: { "name": "reference_format", "tag": true }
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Update the AI service
Update the AI service with the provided patch data. The following fields can be patched:
- /tags
- /name
- /description
- /custom
PATCH /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Input For Patch. This is the patch body which corresponds to the JavaScript Object Notation (JSON) Patch standard (RFC 6902).
The operation to be performed.
Allowable values: [add, remove, replace]
The pointer that identifies the field that is the target of the operation.
The value to be used within the operation.
curl --request PATCH "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." -H "Content-Type: application/json" -H "Accept: application/json" -d '[ { "op": "replace", "path": "/description", "value": "New Description" } ]'
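A patch body in the JSON Patch format can be built and checked on the client. The Python sketch below is an illustration under assumptions: the function name patch_operation is hypothetical, and the path check simply reflects the patchable fields listed for this method.

```python
import json

PATCHABLE_PATHS = ("/tags", "/name", "/description", "/custom")
ALLOWED_OPS = ("add", "remove", "replace")


def patch_operation(op: str, path: str, value=None) -> dict:
    """Build one JSON Patch operation for an AI service."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op!r}")
    if not any(path == p or path.startswith(p + "/") for p in PATCHABLE_PATHS):
        raise ValueError(f"field cannot be patched: {path!r}")
    operation = {"op": op, "path": path}
    if op != "remove":
        operation["value"] = value  # remove operations carry no value
    return operation


if __name__ == "__main__":
    body = json.dumps([patch_operation("replace", "/description", "New Description")])
    print(body)
```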
Response
The information for an AI service.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
entity: The details of the AI service to be created.
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
tooling: User defined properties specified as key-value pairs, which are propagated to the deployment.
Since watsonx.ai 2.2.0.
Examples: { "name": "reference_format", "tag": true }
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service has been patched successfully
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Delete the AI service
Delete the AI service with the specified identifier. This also deletes all revisions of the AI service. For each revision, all attachments are also deleted.
DELETE /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request DELETE "https://{cluster_url}/ml/v4/ai_services/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
Upload the AI service code
Upload the AI service code. AI services expect a zip file that contains the code files that make up the AI service.
PUT /ml/v4/ai_services/{id}/code
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
A gzip file containing code files.
curl --request PUT "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." -H "Content-Type: application/gzip"
Download the AI service code
Download the AI service code.
It is possible to download the code for a given revision of the AI service.
AI services expect a zip file that contains the code files that make up the AI service.
GET /ml/v4/ai_services/{id}/code
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The revision number of the resource.
Example: 2
curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/code?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&rev=1&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
Create a new AI service revision
Create a new AI service revision.
The current metadata and content for id are taken and a new revision is created.
Either space_id or project_id must be provided.
POST /ml/v4/ai_services/{id}/revisions
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The details for the revision.
{
"space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f",
"commit_message": "Updated for TF 2.0"
}
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
An optional commit message for the revision.
curl --request POST "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." -H "Content-Type: application/json" -H "Accept: application/json" -d '{ "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "commit_message": "New Code" }'
Response
The information for an AI service.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
entity: The details of the AI service to be created.
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
tooling: User defined properties specified as key-value pairs, which are propagated to the deployment.
Since watsonx.ai 2.2.0.
Examples: { "name": "reference_format", "tag": true }
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service revision created
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Retrieve the AI service revisions
Retrieve the AI service revisions.
GET /ml/v4/ai_services/{id}/revisions
Request
Path Parameters
AI service identifier.
Example:
64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
curl --request GET "https://{cluster_url}/ml/v4/ai_services/{id}/revisions?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&limit=100&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
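Because the pagination token is only available through the href in the next field, a client typically follows that href until it is absent. The Python sketch below shows the loop under stated assumptions: the function name iter_resources is hypothetical, the fetch callable stands in for an authenticated GET that returns parsed JSON, and the resources key for the list of items is an assumption that this reference does not confirm.

```python
from typing import Callable, Iterator


def iter_resources(first_url: str, fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield items from every page by following the href in the next field."""
    url = first_url
    while url:
        page = fetch(url)
        # Assumption: items are listed under "resources"; adjust to the real response.
        yield from page.get("resources", [])
        url = (page.get("next") or {}).get("href")
```

Injecting fetch keeps the loop independent of any HTTP library, so it can be exercised with canned pages before it is pointed at a cluster.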
Transcribe audio
Transcribe audio into text.
Since watsonx.ai 2.2.1.
POST /ml/v1/audio/transcriptions
Request
Form Parameters
The model to use for audio transcriptions.
The path to an mp3 or wav audio file to transcribe.
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Optional target language to which to transcribe; for example, fr for French. Default is English.
Response
Audio transcriptions response fields.
The model used for audio transcriptions.
The transcribed text.
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Number of estimated tokens from returned text.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
An audio transcriptions example.
{ "model": "openai/whisper-tiny", "text": "the ending was terrific.", "created_at": "2023-07-21T16:52:32.190Z", "token_count": 8 }
Create a new AutoAI RAG run
Create a new AutoAI RAG run that finds the best RAG pattern from the data that is provided in the request.
Since watsonx.ai 2.1.0.
POST /ml/v1/autoai/rags
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The details of the AutoAI RAG run with the data used to find the best RAG patterns.
The name of the job.
A hardware specification.
results_reference: The training results. Normally this is specified as type=container (Service) or type=fs (Software), which means that it is stored in the space or project.
Examples: { "location": { "path": "results" }, "type": "container" }
The data source type like connection_asset, container (Service) or fs (Software).
Allowable values: [connection_asset, container, fs]
Example: connection_asset
location: Contains a set of fields that describe the location of the data with respect to the connection.
Item identification inside a collection.
Contains a set of fields specific to each connection. See here for details about specifying connections.
The description of the job.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The parameters for an AutoAI RAG run.
A set of input data references.
Possible values: 1 ≤ number of items ≤ 20
A set of test data references.
Possible values: number of items = 1
A set of vector store references.
Possible values: number of items = 1
custom: User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
curl --request POST 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "name": "AutoAI RAG Example", "description": "AutoAI RAG Example description", "parameters": { "constraints": { "max_number_of_rag_patterns": 4 }, "optimization": { "metrics": [ "answer_correctness" ] }, "output_logs": true }, "project_id": "dc178286-21d1-4262-9000-e543cf4c7742", "input_data_references": [ { "type": "data_asset", "location": { "href": "/v2/assets/4cc2f990-cd83-4e62-bd61-33b21605cf0e?project_id=dc178286-21d1-4262-9000-e543cf4c7742", "id": "4cc2f990-cd83-4e62-bd61-33b21605cf0e" } } ], "test_data_references": [ { "type": "data_asset", "location": { "href": "/v2/assets/d0d1607f-1ac1-4a88-8098-c5c8b6e4b78a?project_id=dc178286-21d1-4262-9000-e543cf4c7742", "id": "d0d1607f-1ac1-4a88-8098-c5c8b6e4b78a" } } ], "results_reference": { "type": "fs", "location": { "path": "/projects/dc178286-21d1-4262-9000-e543cf4c7742/assets/auto_ml/auto_ml.e274332a-cd3f-4d31-83bc-5072d6dfb535/wml_data" } }, "hardware_spec": { "id": "a6c4923b-b8e4-444c-9f43-8a7ec3020110", "name": "L" } }'
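The item-count limits stated for the data references can be checked before the request is sent. This Python sketch is a hedged illustration: the function name autoai_rag_payload is hypothetical, the field names follow the curl example for this method, and vector store references are left out because their field name is not shown in this section.

```python
def autoai_rag_payload(name, input_data_references, test_data_references,
                       results_reference, hardware_spec,
                       project_id=None, space_id=None, parameters=None):
    """Build the JSON body for POST /ml/v1/autoai/rags with basic checks."""
    if not 1 <= len(input_data_references) <= 20:
        raise ValueError("input_data_references must contain 1 to 20 items")
    if len(test_data_references) != 1:
        raise ValueError("test_data_references must contain exactly 1 item")
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    payload = {
        "name": name,
        "input_data_references": list(input_data_references),
        "test_data_references": list(test_data_references),
        "results_reference": results_reference,
        "hardware_spec": hardware_spec,
    }
    if project_id is not None:
        payload["project_id"] = project_id
    if space_id is not None:
        payload["space_id"] = space_id
    if parameters is not None:
        payload["parameters"] = parameters
    return payload
```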
Response
The response of an AutoAI RAG run.
The request fields that are not part of the returned entity.
The status of an AutoAI RAG run.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Retrieve the AutoAI RAG runs
Retrieve the list of AutoAI RAG requests for the specified space or project.
This operation does not save the history; any requests that were deleted or purged will not appear in this list.
Since watsonx.ai 2.1.0.
GET /ml/v1/autoai/rags
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
curl --request GET 'https://{cluster_url}/ml/v1/autoai/rags?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
Response
A paginated list of training definitions.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when the total_count=true query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
A list of training definitions.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Get an AutoAI RAG run
Get the results of an AutoAI RAG run, or details if the job failed.
Since watsonx.ai 2.1.0.
GET /ml/v1/autoai/rags/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request GET 'https://{cluster_url}/ml/v1/autoai/rags/{id}?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
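The constraints on the query parameters (a YYYY-MM-DD version, and a 36-character space_id or project_id matching ^[a-zA-Z0-9-]*$) can be checked on the client before a request is sent. The sketch below is one way to do that in Python. The helper name `scope_query` is hypothetical, and requiring exactly one of the two ids is this sketch's reading of "either ... or", not a documented rule.

```python
import re
from typing import Optional
from urllib.parse import urlencode

_ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")
_VERSION_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def scope_query(version: str, space_id: Optional[str] = None,
                project_id: Optional[str] = None) -> str:
    """Return an encoded query string with the version and one scope id."""
    if not _VERSION_PATTERN.fullmatch(version):
        raise ValueError("version must look like YYYY-MM-DD")
    # Assumption of this sketch: exactly one of the two ids is supplied.
    if (space_id is None) == (project_id is None):
        raise ValueError("give exactly one of space_id or project_id")
    name, value = (("space_id", space_id) if space_id is not None
                   else ("project_id", project_id))
    if len(value) != 36 or not _ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must be 36 letters, digits or hyphens")
    return urlencode({"version": version, name: value})


query = scope_query("2023-10-25",
                    project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc")
```

The resulting string can be appended after the `?` of any of the endpoint URLs in this section.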
Response
The response of an AutoAI RAG run.
The request fields that are not part of the returned entity.
The status of an AutoAI RAG run.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "description": "My autoai rag experiment for 2023 financial documents", "name": "AutoAI RAG" }, "entity": { "timestamp": "2023-09-22T02:52:03.324Z", "hardware_spec": { "id": "c076e82c-b2a7-4d20-9c0f-1f0c2fdf5a24", "name": "L" }, "parameters": { "constraints": { "embedding_models": [ "ibm/slate-125m-english-rtrvr" ], "generation": { "foundation_models": [ { "model_id": "meta-llama/llama-3-3-70b-instruct" }, { "model_id": "mistralai/mixtral-8x7b-instruct-v01" } ] }, "max_number_of_rag_patterns": 8 }, "optimization": { "metrics": [ "answer_correctness" ] }, "output_logs": true }, "input_data_references": [ { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "path": "files/document.pdf" } } ], "test_data_references": [ { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "path": "files/qa_document.json" } } ], "vector_store_references": [ { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" } } ], "results_reference": { "type": "container", "location": { "path": "results_autoai", "training": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5", "training_status": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training-status.json", "assets_path": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets", "training_log": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/training.log" } }, "results": [ { "metrics": { "test_data": [ { "metric_name": "answer_correctness", "mean": 0.51, "ci_high": 0.68, "ci_low": 0.43 } ] }, "context": { "rag_pattern": { "composition_steps": [ "vector_store", "chunking", "embeddings", "retrieval", "generation" ], "location": { "evaluation_results": 
"results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/evaluation_results.json", "indexing_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/indexing_notebook.ipynb", "inference_notebook": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_notebook.ipynb", "inference_service_code": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_service_code.gz", "inference_service_metadata": "results_autoai/6a9362f6-7ea2-4419-8c7d-a7e07432dec5/rag/assets/Pattern1/inference_service_metadata.json" }, "name": "Pattern 1", "settings": { "vector_store": { "datasource_type": "milvus", "index_name": "autoai_rag_1234_iteration_5_index", "distance_metric": "euclidean", "operation": "upsert", "schema": { "id": "autoai_rag_1.0.0", "name": "AutoAI RAG document schema", "type": "struct", "fields": [ { "name": "text", "description": "text field", "type": "string", "role": "text" }, { "name": "document_id", "description": "document name field", "type": "string", "role": "document_name" }, { "name": "start_index", "description": "chunk starting token position in the source document", "type": "number", "role": "start_index" }, { "name": "sequence_number", "description": "chunk number per document", "type": "number", "role": "sequence_number" }, { "name": "vector", "description": "vector embeddings", "type": "array", "role": "vector_embeddings" } ] } }, "chunking": { "method": "recursive", "chunk_size": 256, "chunk_overlap": 64 }, "embeddings": { "truncate_strategy": "left", "truncate_input_tokens": 384, "model_id": "ibm/slate-125m-english-rtrvr" }, "retrieval": { "method": "simple", "number_of_chunks": 5 }, "generation": { "model_id": "meta-llama/llama-3-1-70b-instruct", "prompt_template_text": "Answer the following questions based on provided context:\\n ...", "context_template_text": "[Document]\n{document}\n[End]", "word_to_token_ratio": 2.2 } } }, "iteration": 1, 
"max_combinations": 160 } } ], "status": { "state": "running", "step": "vector_store", "message": { "level": "info", "text": "Pipeline 1 of 8 is completed." }, "running_at": "2023-08-04T13:22:48.000Z" } } }
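Each entry in results pairs a RAG pattern with its evaluation metrics, so a client can rank patterns once a run has produced results. The Python sketch below picks the pattern with the highest mean for one metric. The function name is hypothetical, the input is a trimmed-down version of the sample above with a second, made-up pattern added for illustration, and "higher mean is better" is an assumption about answer_correctness.

```python
from typing import Dict, Optional, Tuple


def best_pattern(run: Dict, metric: str = "answer_correctness") -> Optional[Tuple[str, float]]:
    """Return (pattern name, mean score) of the highest-scoring result."""
    best = None
    for result in run.get("entity", {}).get("results", []):
        name = result["context"]["rag_pattern"]["name"]
        for entry in result["metrics"]["test_data"]:
            if entry["metric_name"] != metric:
                continue
            # Assumption: a higher mean indicates a better pattern.
            if best is None or entry["mean"] > best[1]:
                best = (name, entry["mean"])
    return best


# Trimmed sample; "Pattern 2" and its score are invented for the example.
sample_run = {
    "entity": {
        "results": [
            {"metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.51}]},
             "context": {"rag_pattern": {"name": "Pattern 1"}}},
            {"metrics": {"test_data": [{"metric_name": "answer_correctness", "mean": 0.62}]},
             "context": {"rag_pattern": {"name": "Pattern 2"}}},
        ],
        "status": {"state": "running"},
    }
}
winner = best_pattern(sample_run)
```

A run that has not produced any results yet yields None, which lets the caller keep polling on status.state.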
Cancel or delete an AutoAI RAG run
Cancel or delete the specified AutoAI RAG run. Once deleted, all trace of the run job is gone.
Since watsonx.ai 2.1.0.
DELETE /ml/v1/autoai/rags/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
curl --request DELETE "https://{cluster_url}/ml/v1/autoai/rags/{id}?space_id=12ac4cf1-252f-424b-b52d-5cdd9814987f&version=2024-10-17" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
Retrieve the custom foundation models
Retrieve the custom foundation models.
In order to deploy a custom foundation model using one of the models in this list you need to follow the following steps:
- Create a model asset, in a space or a project, by providing the custom foundation model details as shown below:
  curl -X POST "https://{cluster_url}/ml/v4/models?version=2024-01-29" \
    -H "Authorization: Bearer <replace with your token>" \
    -H "content-type: application/json" \
    --data '{ "name": "replace_with_a_meaningful_name", "space_id": "replace_with_your_space_id", "foundation_model": { "model_id": "replace_with_your_model_id" }, "type": "custom_foundation_model_1.0", "software_spec": { "name": "watsonx-cfm-caikit-1.0" } }'
  Notes:
  - The model type must be custom_foundation_model_1.0.
  - The software spec name must be watsonx-cfm-caikit-1.0.
- Create a custom foundation model deployment as described in the create deployment documentation.
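The body of the model asset request in the first step has two fixed values (the model type and the software spec name) and three values supplied by the caller. A small Python sketch of assembling that body follows. The function name is hypothetical and the example arguments are placeholders, not real identifiers.

```python
import json
from typing import Dict


def custom_model_asset_body(name: str, space_id: str, model_id: str) -> Dict:
    """Build the JSON body for the model asset request of the first step."""
    return {
        "name": name,
        "space_id": space_id,
        "foundation_model": {"model_id": model_id},
        # Both values below are fixed, per the notes in the steps above.
        "type": "custom_foundation_model_1.0",
        "software_spec": {"name": "watsonx-cfm-caikit-1.0"},
    }


asset_body = custom_model_asset_body(
    "replace_with_a_meaningful_name",
    "replace_with_your_space_id",
    "replace_with_your_model_id",
)
asset_payload = json.dumps(asset_body)  # string to send as the request data
```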
Since watsonx.ai 1.1.x.
GET /ml/v4/custom_foundation_models
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
curl --request GET 'https://{cluster_url}/ml/v4/custom_foundation_models?version=2023-05-02&limit=10' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
Response
Pagination information and list of models and common parameters.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources.
Example: 1
A reference to the first item of the next page, if any.
A list of models.
A list of common parameters that apply to all models, but can be overridden in each model description.
Status Code
OK
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The list of custom foundation models that were created and registered.
{ "total_count": 1, "limit": 10, "first": { "href": "https://{cpd_cluster}/ml/v4/custom_foundation_models" }, "resources": [ { "model_id": "my_flan_t5_xl", "description": "A tuned version of flan_t5_xl", "tags": [ "flan_t5_xl" ], "parameters": [ { "name": "max_batch_weight", "display_name": "Maximum batch weight", "default": 10000, "description": "The maximum batch weight that is allowed for this model.", "type": "number", "min": 0, "max": 100000 } ] } ], "parameters": [ { "name": "max_batch_weight", "display_name": "Maximum batch weight", "default": 1000, "description": "The maximum batch weight that is allowed for all models.", "type": "number", "min": 0, "max": 10000 } ] }
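The response carries a top-level parameters list that applies to all models and a per-model parameters list that can override it. The Python sketch below shows one way to compute the parameters in effect for a model. It assumes an override replaces the whole common entry with the same name, which the reference does not spell out, and the function name is hypothetical.

```python
from typing import Dict, List


def effective_parameters(model: Dict, common: List[Dict]) -> Dict[str, Dict]:
    """Merge common parameters with a model's own; the model's entries win."""
    merged = {param["name"]: param for param in common}
    # Assumption: an override replaces the common entry of the same name.
    merged.update({param["name"]: param for param in model.get("parameters", [])})
    return merged


# Values taken from the sample response above, trimmed to the relevant keys.
common_parameters = [
    {"name": "max_batch_weight", "default": 1000, "min": 0, "max": 10000},
]
model_entry = {
    "model_id": "my_flan_t5_xl",
    "parameters": [
        {"name": "max_batch_weight", "default": 10000, "min": 0, "max": 100000},
    ],
}
effective = effective_parameters(model_entry, common_parameters)
```

For the sample, the model's own max_batch_weight entry takes precedence over the common one.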
Create a new watsonx.ai deployment
Create a new deployment. Currently the only supported type is online.
If this is a deployment for a prompt template then the prompt_template object should exist and the id must be the id of the prompt template to be deployed.
If this is a deployment for a custom foundation model then the online object must exist, the asset object must exist and point to the model object that describes the custom foundation model, and the hardware_spec is mandatory. Note that the base_model_id will be returned and will be the base model id that is defined in the model asset (asset.id).
If this is a deployment for a fine tuned model then the asset.id must point to the model that was created after the fine tuning. In case of a fine tuned model with a template, the field base_deployment_id will be the tuned model deployment.
Pre-defined hardware specifications are provided for custom and base foundation model deployments:
- WX-S: 1 GPU, Request 1 CPU, Limit 2 CPU and 60 GB (Request and Limit) - 1B to 20B parameters
- WX-M: 2 GPU, Request 2 CPU, Limit 3 CPU and 120 GB (Request and Limit) - 21B to 40B parameters
- WX-L: 4 GPU, Request 4 CPU, Limit 5 CPU and 240 GB (Request and Limit) - 41B to 80B parameters
- WX-XL: 8 GPU, Request 8 CPU, Limit 9 CPU and 600 GB (Request and Limit) - 81B to 200B parameters
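The parameter ranges in the list above can be turned into a lookup that suggests a hardware specification for a model of a given size. The Python sketch below does that. The function name is hypothetical, and treating sizes that fall between two listed ranges (for example 20.5B) or below 1B as belonging to the next specification up is an assumption of this sketch.

```python
from typing import Optional

# (name, upper bound in billions of parameters), from the list above.
_SPEC_LIMITS = [("WX-S", 20), ("WX-M", 40), ("WX-L", 80), ("WX-XL", 200)]


def pick_hardware_spec(billions_of_parameters: float) -> Optional[str]:
    """Return the smallest pre-defined spec whose range covers the model."""
    if billions_of_parameters <= 0:
        raise ValueError("model size must be positive")
    for name, upper in _SPEC_LIMITS:
        if billions_of_parameters <= upper:
            return name
    # Larger than every listed range: no pre-defined spec applies.
    return None


spec_for_13b = pick_hardware_spec(13)
spec_for_70b = pick_hardware_spec(70)
```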
A prompt template can be used in conjunction with a custom foundation model by specifying the prompt_template object with the id pointing to the prompt template.
POST /ml/v4/deployments
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The deployment request entity.
The following important fields are described for each use case:
- Prompt template:
  - base_model_id: required
  - prompt_template.id: required
  - online: required
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - response deployed_asset_type: foundation_model
- Custom foundation model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: required
  - hardware_request: forbidden
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: custom_foundation_model
- Custom foundation model with template:
  - base_deployment_id: required
  - prompt_template.id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - asset.id: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: custom_foundation_model
- Fine tuned model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: required
  - hardware_request: forbidden
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: fine_tune
- Fine tuned model with template:
  - base_deployment_id: required
  - prompt_template.id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - asset.id: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: fine_tune
- Base Foundation model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: required
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: base_foundation_model
- Base Foundation model for LoRA:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model.enable_lora: required
  - hardware_spec: required
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: base_foundation_model
- LoRA adapter model:
  - asset.id: required
  - base_deployment_id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: lora_adapter
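The required and forbidden fields per use case lend themselves to a client-side check before the request is posted. The Python sketch below encodes three of the use cases listed above as data and reports violations. The use-case keys and function names are hypothetical labels chosen for this example, and the remaining use cases would be added to the table in the same way.

```python
from typing import Dict, List

# Transcribed from the list above; dotted names denote nested fields.
RULES = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_spec"],
        "forbidden": ["hardware_request", "base_model_id", "base_deployment_id"],
    },
    "lora_adapter": {
        "required": ["asset.id", "base_deployment_id", "online"],
        "forbidden": ["online.parameters.foundation_model", "hardware_spec",
                      "base_model_id"],
    },
}


def _has(body: Dict, dotted: str) -> bool:
    """True if the dotted path exists in the body (an empty object counts)."""
    node = body
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def check_request(body: Dict, use_case: str) -> List[str]:
    """Return a list of rule violations; an empty list means none found."""
    rules = RULES[use_case]
    problems = [f"missing {field}" for field in rules["required"] if not _has(body, field)]
    problems += [f"unexpected {field}" for field in rules["forbidden"] if _has(body, field)]
    return problems


template_body = {
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "name": "text_classification",
    "base_model_id": "google/flan-ul2",
    "prompt_template": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
    "online": {},
}
template_problems = check_request(template_body, "prompt_template")
```

Keeping the rules as data rather than branching code makes it straightforward to compare the table against the list in this reference.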
Create a prompt template deployment.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "text_classification",
"base_model_id": "google/flan-ul2",
"prompt_template": {
"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
},
"online": {}
}
Create a custom foundation model deployment.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "my_tuned_flan",
"asset": {
"id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
},
"online": {
"parameters": {
"serving_name": "myflan",
"foundation_model": {
"max_batch_weight": 10000,
"max_sequence_length": 8192
}
}
},
"hardware_spec": {
"name": "WX-S",
"num_nodes": 1
}
}
Create a prompt template deployment with a custom foundation model.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"name": "my_tuned_flan_template",
"base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc",
"prompt_template": {
"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
},
"online": {
"parameters": {
"serving_name": "myflan_template"
}
}
}
Deploy a curated model.
{
"space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
"name": "my_granite_13b_chat_v2",
"asset": {
"id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
},
"online": {
"parameters": {
"serving_name": "granite_13b_chat_v2"
}
}
}
The name of the resource.
Possible values: 1 ≤ length ≤ 250
Example: my-resource
Indicates that this is an online deployment. An object has to be specified but can be empty. The serving_name can be provided in the online.parameters.
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
A description of the resource.
Possible values: length ≤ 1000
Example: This is my first resource.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
A reference to a resource.
A hardware specification.
The requested hardware for deployment.
A reference to a resource.
The base model that is required for this deployment if this is for a prompt template for an IBM foundation model (so this does not apply for custom foundation models).
Example: google/flan-t5-xl
The base deployment when this is a custom foundation model with a prompt template. The id must be the id of the custom foundation model deployment.
Possible values: length ≤ 128, Value must match regular expression ^[-0-9a-z]+$
Example: a12b278b-e40c-4ca4-bfa0-a4e8583b58e1
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03", "name": "my_fm", "asset": { "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4" }, "online": { "parameters": { "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192 } } }, "hardware_spec": { "name": "WX-S", "num_nodes": 1 } }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03", "name": "my_fm", "asset": { "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4" }, "online": { "parameters": { "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192, "enable_lora": true, "max_gpu_loras": 8, "max_cpu_loras": 16, "max_lora_rank": 32 } } }, "hardware_spec": { "name": "WX-S", "num_nodes": 1 } }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03", "name": "my_lora_adapter", "asset": { "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4" }, "online": {}, "base_deployment_id": "bdda3999-1012-45bd-a726-045d8774e622" }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": ["classification"], "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {} }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": ["classification"], "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "base_model_id": "google/flan-t5-xl", "online": {} }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan", "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "online": { "parameters": { "serving_name": "myflan" } } }'
curl --request POST 'https://{cluster_url}/ml/v4/deployments?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03", "name": "my_granite_13b_chat_v2", "asset": { "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4" }, "base_model_id": "ibm/granite-13b-chat-v2-curated", "hardware_request": { "size": "gpu_s", "num_nodes": 1 }, "online": { "parameters": { "serving_name": "granite_13b_chat_v2" } } }'
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
The definition of the deployment.
Status Code
Deployment created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "online": { "parameters": { "serving_name": "myflan", "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192 } } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://<cluster_url>/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://<cluster_url>/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan_template" }, "entity": { "base_deployment_id": "a77190a2-f52d-4f2a-be3d-7867b5f46edc", "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": { "parameters": { "serving_name": "myflan_template" } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://<cluster_url>/ml/v1/deployments/myflan_template/text/generation", "uses_serving_name": true }, { "url": "https://<cluster_url>/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://<cluster_url>/ml/v1/deployments/myflan_template/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
Retrieve the deployments
Retrieve the list of deployments for the specified space or project.
GET /ml/v4/deployments
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Retrieves the deployment, if any, that contains this serving_name.
Example: classification
Retrieves only the resources with the given tag value.
Retrieves only the resources with the given asset_id; the asset_id is the model id.
Retrieves only the resources with the given prompt_template_id.
Retrieves only the resources with the given name.
Retrieves the resources filtered with the given type. There are the deployment types as well as an additional prompt_template if the deployment type includes a prompt template.
The supported deployment types are (see the description for deployed_asset_type in the deployment entity):
- foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
- custom_foundation_model - when a custom foundation model is deployed.
- lora_adapter - when a lora adapter model is deployed.
- fine_tune - when a fine tune model is deployed.
These can be combined with the flag prompt_template like this:
- type=foundation_model - return all prompt template deployments.
- type=foundation_model and prompt_template - return all prompt template deployments - this is the same as the previous query because a foundation_model can only exist with a prompt template.
- type=custom_foundation_model - return all custom model deployments.
- type=custom_foundation_model and prompt_template - return all custom model deployments with a prompt template.
- type=prompt_template - return all deployments with a prompt template.
Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.
Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.
Default: false
curl --request GET 'https://{cluster_url}/ml/v4/deployments?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&serving_name=ibm&asset_id=259efabd-7850-40fc-843d-6dddcfc286d1&state=ready&version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
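A filtered list request like the one above is easier to get right when the query string is assembled and encoded by code rather than by hand. The Python sketch below builds it from keyword arguments and checks the state filter against the allowed values. The function name is hypothetical, and only the filter names that appear in the curl example (serving_name, asset_id, state) are accepted, since the names of the other filters are not shown here.

```python
from typing import Optional
from urllib.parse import urlencode

ALLOWED_STATES = {"initializing", "updating", "ready", "failed"}


def deployments_query(version: str, space_id: str,
                      serving_name: Optional[str] = None,
                      asset_id: Optional[str] = None,
                      state: Optional[str] = None) -> str:
    """Return the encoded query string for a filtered deployments list."""
    if state is not None and state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {sorted(ALLOWED_STATES)}")
    params = {"space_id": space_id, "version": version}
    optional = {"serving_name": serving_name, "asset_id": asset_id, "state": state}
    # Only include the filters that were actually supplied.
    params.update({key: value for key, value in optional.items() if value is not None})
    return urlencode(params)


filter_query = deployments_query(
    "2023-05-02",
    "aa6dc728-958e-42b7-acdf-d403e16d1e9e",
    serving_name="ibm",
    state="ready",
)
```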
Response
The deployment resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when the total_count=true query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
A list of deployment resources.
System details including warnings.
Status Code
OK.
serving_name is available for use. Returned when serving_name and conflict query parameters are used.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
Returned when serving_name and conflict query parameters are used. The response body will contain the reason.
No Sample Response
Retrieve the deployment details
Retrieve the deployment details with the specified identifier.
GET /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request GET "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..."
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
The definition of the deployment.
Status Code
Deployment details.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "hardware_spec": { "id": "WX-S", "num_nodes": 1 }, "online": { "parameters": { "serving_name": "myflan", "foundation_model": { "max_batch_weight": 10000, "max_sequence_length": 8192 } } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
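The inference list in the deployment status distinguishes entries by two optional flags: sse for the streaming endpoint and uses_serving_name for URLs built from the serving name instead of the deployment id. The Python sketch below selects a URL by those flags. The function name is hypothetical, the sample is trimmed from the second response above with the host replaced by a placeholder, and treating a missing flag as false is an assumption of this sketch.

```python
from typing import Dict, Optional


def inference_url(deployment: Dict, stream: bool = False,
                  by_serving_name: bool = False) -> Optional[str]:
    """Return the first inference URL whose flags match the request."""
    entries = deployment["entity"]["status"].get("inference", [])
    for entry in entries:
        # Assumption: an absent flag means false.
        is_stream = bool(entry.get("sse", False))
        is_named = bool(entry.get("uses_serving_name", False))
        if is_stream == stream and is_named == by_serving_name:
            return entry["url"]
    return None


base = "https://<cluster_url>/ml/v1/deployments"
sample_deployment = {"entity": {"status": {"state": "ready", "inference": [
    {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
    {"url": f"{base}/myflan/text/generation", "uses_serving_name": True},
    {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": True},
    {"url": f"{base}/myflan/text/generation_stream", "sse": True, "uses_serving_name": True},
]}}}
named_stream_url = inference_url(sample_deployment, stream=True, by_serving_name=True)
```

A deployment without a serving name simply returns None when by_serving_name is requested.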
Update the deployment metadata
Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.
- /name
- /description
- /tags
- /custom
- /online/parameters
- /asset - replace only
- /prompt_template - replace only
- /hardware_spec
- /hardware_request
- /base_model_id - replace only (applicable only to prompt template deployments referring to IBM base foundation models). Since Cloud Pak for Data 5.0.3.
The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
PATCH /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The JSON patch.
The operation to be performed.
Allowable values: [add, remove, replace]
The pointer that identifies the field that is the target of the operation.
The value to be used within the operation.
curl --request PATCH "https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02" -H "Authorization: Bearer eyJhbGciOiJSUzUxM..." -H "Content-Type: application/json" -H "Accept: application/json" -d '[ { "op": "replace", "path": "/description", "value": "New Description" } ]'
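The patch body can also be assembled programmatically. The following is a minimal sketch, not an official client: the `patch_op` helper and its client-side checks are hypothetical, and it assumes that the paths marked "replace only" accept nothing but the `replace` operation and that sub-paths of a listed path are acceptable.

```python
import json

# Paths listed in the reference as patchable (assumed to also cover sub-paths).
PATCHABLE = ("/name", "/description", "/tags", "/custom", "/online/parameters",
             "/asset", "/prompt_template", "/hardware_spec", "/hardware_request",
             "/base_model_id")
# Paths the reference marks as "replace only".
REPLACE_ONLY = ("/asset", "/prompt_template", "/base_model_id")
ALLOWED_OPS = ("add", "remove", "replace")


def _under(path, roots):
    """True if path equals one of the roots or is nested below it."""
    return any(path == root or path.startswith(root + "/") for root in roots)


def patch_op(op, path, value=None):
    """Build one JSON Patch operation (hypothetical helper with client-side checks)."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported op: {op}")
    if not _under(path, PATCHABLE):
        raise ValueError(f"path is not patchable: {path}")
    if _under(path, REPLACE_ONLY) and op != "replace":
        raise ValueError(f"{path} supports replace only")
    operation = {"op": op, "path": path}
    if op != "remove":  # a remove operation carries no value
        operation["value"] = value
    return operation


# Same body as the curl example above.
body = json.dumps([patch_op("replace", "/description", "New Description")])
print(body)
```

The resulting string is what would be sent as the request body with `Content-Type: application/json`.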
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
The definition of the deployment.
Status Code
Deployment accepted
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Delete the deployment
Delete the deployment with the specified identifier.
DELETE /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request DELETE 'https://{cluster_url}/ml/v4/deployments/{deployment_id}?space_id=aa6dc728-958e-42b7-acdf-d403e16d1e9e&version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
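The query parameter rules for the deployment endpoints (a version date, plus a 36-character space_id or project_id) can be checked before a request is sent. This is a minimal sketch under stated assumptions: `deployment_url` is a hypothetical helper, `cluster.example.com` and `my-deployment-id` are placeholders, and the checks only mirror the constraints quoted above.

```python
import re
from urllib.parse import urlencode

ID_RE = re.compile(r"^[a-zA-Z0-9-]*$")            # pattern quoted in the reference
VERSION_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")   # YYYY-MM-DD


def deployment_url(cluster_url, deployment_id, version, space_id=None, project_id=None):
    """Build the /ml/v4/deployments/{deployment_id} URL with validated query parameters."""
    if not VERSION_RE.match(version):
        raise ValueError("version must have the form YYYY-MM-DD")
    if space_id is None and project_id is None:
        raise ValueError("either space_id or project_id has to be given")
    query = {}
    for name, value in (("space_id", space_id), ("project_id", project_id)):
        if value is None:
            continue
        if len(value) != 36 or not ID_RE.match(value):
            raise ValueError(f"{name} must be 36 characters matching {ID_RE.pattern}")
        query[name] = value
    query["version"] = version
    return f"https://{cluster_url}/ml/v4/deployments/{deployment_id}?{urlencode(query)}"


# Placeholder host and deployment id; the space_id is the one from the curl example.
url = deployment_url("cluster.example.com", "my-deployment-id", "2023-05-02",
                     space_id="aa6dc728-958e-42b7-acdf-d403e16d1e9e")
print("DELETE", url)
```

The same URL shape applies to the PATCH operation, since both act on `/ml/v4/deployments/{deployment_id}`.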
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
This API is legacy, consider using Deployment Text Chat.
Return options
Note that there is currently a limitation in this operation when using return_options:
for input, only input_text will be returned if requested;
for output, the input_tokens and generated_tokens will not be returned.
POST /ml/v1/deployments/{id_or_name}/text/generation
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens.
A prompt template request.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "how far is paris from bangalore:\n",
"parameters": {
"max_new_tokens": 100
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
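The two request shapes above (a plain prompt, or prompt variables for a prompt template deployment) can be sketched as a small body builder. This is an illustration only: `generation_body` is a hypothetical helper, and placing `prompt_variables` inside `parameters` simply follows the curl example. Serializing with `json.dumps` avoids hand-written JSON errors such as trailing commas.

```python
import json


def generation_body(input_text=None, prompt_variables=None, **parameters):
    """Build a body for POST .../text/generation (hypothetical helper)."""
    params = dict(parameters)
    if prompt_variables is not None:
        # Mirrors the curl example, which nests prompt_variables under parameters.
        params["prompt_variables"] = prompt_variables
    body = {"parameters": params}
    if input_text is not None:
        # The reference recommends not leaving trailing spaces in the prompt.
        body["input"] = input_text.rstrip(" ")
    return body


plain = generation_body("how far is paris from bangalore: ",
                        max_new_tokens=100, time_limit=1000)
templated = generation_body(prompt_variables={"name": "joe", "count": 3},
                            max_new_tokens=100, time_limit=1000)
print(json.dumps(plain))
print(json.dumps(templated))
```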
Response
System details.
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example: Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for values case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example: 3
The number of input tokens consumed.
Example: 11
The seed used, if it exists.
Example: 42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details for a prompt template.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
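A client reading this response will usually want the generated text and the stop reason. The sketch below parses the sample response above and compares stop reasons case-insensitively, as the reference advises. The `summarize` helper and the grouping of some stop reasons as "truncated" are illustrative assumptions, not part of the API.

```python
import json

# Assumption for illustration: these stop reasons mean the output was cut short.
TRUNCATED = {"max_tokens", "token_limit", "time_limit"}


def summarize(response):
    """Return (generated_text, stop_reason, truncated) for each result."""
    summary = []
    for result in response["results"]:
        reason = result["stop_reason"].lower()  # compare case-insensitively
        summary.append((result["generated_text"], reason, reason in TRUNCATED))
    return summary


# The sample response shown above.
sample = json.loads("""
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z",
  "results": [ { "generated_text": "4,000 km", "generated_token_count": 4,
                 "input_token_count": 12, "stop_reason": "eos_token" } ] }
""")
print(summarize(sample))
```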
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters.
This operation will return the output tokens as a stream of events.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
This API is legacy, consider using Deployment Text Chat Stream.
Return options
Note that there is currently a limitation in this operation when using return_options:
for input, only input_text will be returned if requested;
for output, the input_tokens and generated_tokens will not be returned, and the rank and top_tokens will also not be returned.
POST /ml/v1/deployments/{id_or_name}/text/generation_stream
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
{
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"decoding_method": "sample",
"temperature": 0.8,
"max_new_tokens": 200
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example: Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for values case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example: 3
The number of input tokens consumed.
Example: 11
The seed used, if it exists.
Example: 42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
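Because the stream arrives as lines of the form data: {<json event>}, a client has to pick out the data lines and decode each payload. The following is a simplified sketch rather than a full server-sent events parser: it assumes each event fits on a single data line and ignores other SSE fields. The two events in `stream` are made-up illustrations shaped like the schema above, not captured service output.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of every 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Illustrative events only (hypothetical content).
stream = [
    'data: {"results": [{"generated_text": "4,000", "stop_reason": "not_finished"}]}\n',
    "\n",
    'data: {"results": [{"generated_text": " km", "stop_reason": "eos_token"}]}\n',
]

text = ""
last_reason = None
for event in iter_sse_events(stream):
    for result in event["results"]:
        text += result["generated_text"]
        last_reason = result["stop_reason"].lower()
print(text, last_reason)
```

In a real client the `lines` argument would be the decoded lines of the HTTP response body, read incrementally so that tokens can be displayed as they arrive.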
Infer chat completions
Infers the next chat message for a given deployment. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.
If stream is true, this operation will return the output tokens in a server-sent events (SSE) stream.
POST /ml/v1/deployments/{id_or_name}/chat/completions
Request
Custom Headers
Allowable values: [application/json, text/event-stream]
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
From a given prompt, infer the next chat message.
A chat prompt template request.
A prompt template request.
{
"messages": [
{
"role": "user",
"content": "Who won the world series in 2020?"
},
{
"role": "assistant",
"content": "The Los Angeles Dodgers won the World Series in 2020."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Where was it played?"
}
}
]
}
A chat prompt template request with system_prompt and context.
A prompt template request.
{
"context": "Today is Wednesday",
"messages": [
{
"role": "user",
"content": {
"type": "text",
"text": "Who are you and which day is tomorrow?"
}
}
]
}
The messages for this chat session.
If the deployment references a prompt template then the system role cannot be in messages. For such deployments, depending on the model, the content of the system role may be taken from the system_prompt of the prompt template, and will be automatically inserted into messages. As an example, depending on the model, if the system_prompt of a prompt template is "You are Granite Chat, an AI language model developed by IBM. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", a message with the system role having content the same as system_prompt is inserted.
Possible values: 1 ≤ number of items ≤ 1000
- messages
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
If specified, context will be inserted into messages. Depending on the model, context may be inserted into the content with the system role, or into the content of the last message with the user role.
In the example, the context "Today is Wednesday" is inserted so that the content of the user message becomes "Today is Wednesday. Who are you and which day is tomorrow?"
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples: { "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default: 0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples: { "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default: false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default: 1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default: 1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default: 1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default: 0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example: 41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples: [ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example: true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default: 1
Time limit in milliseconds - if not completed within this time, generation will stop. The text generated so far will be returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan, and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example: 600000
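Two of the parameter rules above lend themselves to a client-side check: max_tokens is ignored when max_completion_tokens is also given, and temperature must lie strictly between 0 and 2. The sketch below applies both before a request is built. `normalize_chat_parameters` is a hypothetical helper; the service applies its own validation regardless, so this only surfaces mistakes earlier.

```python
def normalize_chat_parameters(params):
    """Return a cleaned copy of chat parameters (hypothetical client-side helper)."""
    cleaned = dict(params)  # leave the caller's dict untouched
    # max_tokens is deprecated and ignored when max_completion_tokens is present,
    # so drop it to make the effective limit explicit.
    if "max_completion_tokens" in cleaned:
        cleaned.pop("max_tokens", None)
    temperature = cleaned.get("temperature")
    if temperature is not None and not (0 < temperature < 2):
        raise ValueError("temperature must satisfy 0 < value < 2")
    return cleaned


requested = {"max_tokens": 50, "max_completion_tokens": 200, "temperature": 0.8}
effective = normalize_chat_parameters(requested)
print(effective)
```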
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example: google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation. Content-Type: text/event-stream if stream is true.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details for a prompt template.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "ibm/granite-3-2b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "The 2020 World Series was played at the Globe Life Field in Arlington, Texas.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 27, "prompt_tokens": 186, "total_tokens": 213 } }
The generated text from the model along with other details for a prompt template.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "ibm/granite-3-2b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "Hello! I am Granite Chat, created by IBM. I am here to assist you. Today is Wednesday.tomorrow is Thursday.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 32, "prompt_tokens": 154, "total_tokens": 186 } }
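Reading a chat completion usually comes down to taking the assistant message from the first choice and, optionally, checking the usage counts. The sketch below does this for a trimmed copy of the first sample response above. The helper names are hypothetical, and the check that prompt and completion tokens add up to the total is an observation about the samples rather than a documented guarantee.

```python
import json


def first_message(response):
    """Return the assistant text of the first choice, without trailing whitespace."""
    return response["choices"][0]["message"]["content"].rstrip()


def usage_adds_up(response):
    """Check that prompt and completion tokens sum to the reported total."""
    usage = response["usage"]
    return usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]


# Trimmed copy of the first sample response shown above.
sample = json.loads("""
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "ibm/granite-3-2b-instruct",
  "choices": [ { "index": 0,
                 "message": { "role": "assistant",
                              "content": "The 2020 World Series was played at the Globe Life Field in Arlington, Texas.\\n" },
                 "finish_reason": "stop" } ],
  "usage": { "completion_tokens": 27, "prompt_tokens": 186, "total_tokens": 213 } }
""")
print(first_message(sample))
print(usage_adds_up(sample))
```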
Infer text chat
Infers the next chat message for a given deployment. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.
You can also use Deployment Chat Completions to achieve the same result.
POST /ml/v1/deployments/{id_or_name}/text/chat
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next chat message.
The messages for this chat session.
If the deployment references a prompt template then the system role cannot be in messages. For such deployments, depending on the model, the content of the system role may be taken from the system_prompt of the prompt template, and will be automatically inserted into messages. As an example, depending on the model, if the system_prompt of a prompt template is "You are Granite Chat, an AI language model developed by IBM. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", a message with the system role having content the same as system_prompt is inserted.
Possible values: 1 ≤ number of items ≤ 1000
- messages
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
If specified, context will be inserted into messages. Depending on the model, context may be inserted into the content with the system role, or into the content of the last message with the user role.
In the example, the context "Today is Wednesday" is inserted so that the content of the user message becomes "Today is Wednesday. Who are you and which day is tomorrow?"
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples: { "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default: 0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples: { "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default: false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default: 1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default: 1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default: 1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default: 0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example: 41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples: [ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example: true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default: 1
Time limit in milliseconds - if not completed within this time, generation will stop. The text generated so far will be returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan, and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example: 600000
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example: google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
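The request schema for this operation says to specify either tool_choice_option (the model picks) or tool_choice (a specific tool is forced), not both. A minimal sketch of enforcing that rule on the client side follows. `tool_choice_fields` is a hypothetical helper, and `my_function` is the placeholder tool name used in the reference.

```python
TOOL_CHOICE_OPTIONS = ("auto", "none", "required")


def tool_choice_fields(option=None, function_name=None):
    """Return the tool-choice part of a chat request body (hypothetical helper)."""
    if option is not None and function_name is not None:
        raise ValueError("specify either tool_choice_option or tool_choice, not both")
    if option is not None:
        if option not in TOOL_CHOICE_OPTIONS:
            raise ValueError(f"tool_choice_option must be one of {TOOL_CHOICE_OPTIONS}")
        return {"tool_choice_option": option}
    if function_name is not None:
        # Shape quoted in the reference for forcing a particular tool.
        return {"tool_choice": {"type": "function", "function": {"name": function_name}}}
    return {}  # nothing set: the documented default behaviour (auto) applies


body = {"messages": [{"role": "user", "content": "Who won the world series in 2020?"}]}
body.update(tool_choice_fields(function_name="my_function"))
print(body)
```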
Infer text chat event stream
Infers the next chat message for a given deployment. This operation will return the output tokens as a stream of events. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model. When a prompt template is referenced, the model used for the chat request is specified by the deployment's base_model_id. Chat parameters are derived from the prompt template's model_parameters. If a serving_name is provided, it must match the serving_name returned in the inference section at the time of deployment creation.
Related guides:
You can also use Deployment Chat Completions with stream option set to true to achieve the same result.
POST /ml/v1/deployments/{id_or_name}/text/chat_stream
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction. The deployment must reference either a prompt template with input_mode set to chat, a custom foundation model, or a curated foundation model.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next chat message in a server-sent events (SSE) stream.
The messages for this chat session.
If the deployment references a prompt template, then the system role cannot be in messages. For such deployments, depending on the model, the content of the system role may come from the system_prompt of the prompt template, and will be automatically inserted into messages. As an example, depending on the model, if the system_prompt of a prompt template is "You are Granite Chat, an AI language model developed by IBM. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior.", a message with the system role having content the same as system_prompt is inserted.
Possible values: 1 ≤ number of items ≤ 1000
- messages
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
If specified, context will be inserted into messages. Depending on the model, context may be inserted into the content with system role, or into the content of the last message with user role.
In the example, the context "Today is Wednesday" is inserted so that the content of the user message becomes "Today is Wednesday. Who are you and which day is tomorrow?"
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples:
{ "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default:
0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples:
{ "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default:
false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default:
1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default:
1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default:
1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default:
0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example:
41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples:
[ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example:
true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default:
1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default:
1
Time limit in milliseconds. If generation is not completed within this time, it stops and the text generated so far is returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan and the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example:
600000
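As a rough sketch of how a client might assemble the request body for this operation, the helper below checks a few of the documented ranges before building the JSON. The function name build_chat_body is hypothetical, and only field names that appear in this reference (messages, temperature, max_completion_tokens) are used; the checks are client-side conveniences, not a statement of how the service validates requests.

```python
import json


def build_chat_body(messages, temperature=None, max_completion_tokens=None):
    """Assemble a chat request body, checking documented value ranges
    (hypothetical client-side helper)."""
    # The reference lists 1 to 1000 items for messages.
    if not 1 <= len(messages) <= 1000:
        raise ValueError("messages must contain between 1 and 1000 items")
    body = {"messages": messages}
    if temperature is not None:
        # The reference lists 0 < value < 2 for temperature.
        if not 0 < temperature < 2:
            raise ValueError("temperature must be between 0 and 2 (exclusive)")
        body["temperature"] = temperature
    if max_completion_tokens is not None:
        # 0 selects the model's configured maximum, per the reference.
        if max_completion_tokens < 0:
            raise ValueError("max_completion_tokens must not be negative")
        body["max_completion_tokens"] = max_completion_tokens
    return body


body = build_chat_body(
    [{"role": "user", "content": "Who are you and which day is tomorrow?"}],
    temperature=0.2,
    max_completion_tokens=256,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the POST body with the bearer token header described in the Authentication section.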
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
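A minimal sketch of reading events of the form data: {<json event>} follows. It assumes the stream has already been split into text lines (for example by an HTTP client); the function name parse_sse_events and the field values in the sample events are invented for illustration and do not reproduce the real event schema.

```python
import json


def parse_sse_events(lines):
    """Yield the decoded JSON payload of each `data:` line
    (illustrative helper; ignores other SSE fields)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments, or other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream content; real events follow the schema below.
sample_lines = [
    'data: {"id": "chat-1", "model": "google/flan-ul2"}',
    "",
    'data: {"id": "chat-1", "model": "google/flan-ul2"}',
]
for event in parse_sse_events(sample_lines):
    print(event["id"], event["model"])
```

Processing events as they arrive, rather than collecting the whole body, is what lets a client display tokens incrementally.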
A unique identifier for the chat completion.
The model used for the chat completion.
Example:
google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example:
2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Time series forecast
Generate forecasts, or predictions for future time points, given historical time series data.
POST /ml/v1/deployments/{id_or_name}/time_series/forecast
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The forecast request.
A sample request.
{
"schema": {
"timestamp_column": "date",
"id_columns": [
"ID1"
]
},
"data": {
"date": [
"2020-01-01T00:00:00",
"2020-01-01T01:00:00",
"2020-01-05T01:00:00"
],
"ID1": [
"D1",
"D1",
"D1"
],
"TARGET1": [
1.46,
2.34,
4.55
]
}
}
A payload of data matching schema. We assume the following about your data:
- All timeseries are of equal length and are uniform in nature (the time difference between two successive rows is constant). This implies that there are no missing rows of data;
- The data meet the minimum model-dependent historical context length which can be any number of rows per timeseries;
Note that the example payloads shown are for illustration purposes only. An actual payload would necessarily be much larger to meet minimum model-specific context lengths.
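The equal-length and constant-spacing assumptions above can be checked on the client before sending a request. The sketch below handles a payload that contains a single time series; the function name is_uniform is hypothetical, and a payload with several ids would need the same check applied per id.

```python
from datetime import datetime


def is_uniform(data, timestamp_column):
    """Return True when every column has the same length and the
    timestamps are evenly spaced (single-series sketch)."""
    lengths = {len(column) for column in data.values()}
    if len(lengths) != 1:
        return False  # columns of different length mean missing values
    stamps = [datetime.fromisoformat(t) for t in data[timestamp_column]]
    # Collect the distinct gaps between successive rows.
    steps = {later - earlier for earlier, later in zip(stamps, stamps[1:])}
    return len(steps) <= 1


hourly = {
    "date": ["2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-01T02:00:00"],
    "TARGET1": [1.46, 2.34, 4.55],
}
with_gap = {
    "date": ["2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-05T01:00:00"],
    "TARGET1": [1.46, 2.34, 4.55],
}
print(is_uniform(hourly, "date"))    # evenly spaced rows
print(is_uniform(with_gap, "date"))  # a gap between the last two rows
```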
- data
Contains metadata about your timeseries data input.
The parameters for the forecast request.
Exogenous or supporting features that extend into the forecasting horizon (e.g., a weather forecast or calendar of special promotions) which are known in advance.
future_data would be in the same format as data except that all timestamps would be in the forecast horizon and it would not include previously specified target_columns.
- future_data
Response
The time series forecast response.
The model used to generate the forecast.
Example:
ibm/ttm-1024-96-r2
The time when the response was created in ISO 8601 format.
Example:
2020-05-02T16:27:51Z
The list of prediction results. There will be a forecast for each time series in the input data. The prediction_length field in the request specifies the number of predictions in the results. The actual number of rows in the results will be equal to the prediction length multiplied by the number of unique ids in id_columns. The timestamp_column field in the request indicates the name of the timestamp column in the results.
Examples:
[ { "date": [ "2020-01-01T03:00:00", "2020-01-01T04:00:00", "2020-01-01T05:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.78, 6.78 ] } ]
- results
The number of input data points (number of rows in data * number of input columns in data).
Example:
512
The number of forecasted data points (prediction_length * number of target_columns * number of unique ids in id_columns).
Example:
1024
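The two counts above are simple products, which the following sketch works through. The function names are hypothetical, and which columns count as "input columns" is passed in explicitly because the reference does not spell that out; the numbers are illustrative only.

```python
def input_data_points(row_count, input_column_count):
    """Rows in data multiplied by the number of input columns."""
    return row_count * input_column_count


def output_data_points(prediction_length, target_columns, data, id_columns):
    """prediction_length * number of target columns * number of unique ids."""
    row_count = len(next(iter(data.values())))
    # A unique id is the combination of values across all id columns.
    unique_ids = {
        tuple(data[column][row] for column in id_columns)
        for row in range(row_count)
    }
    return prediction_length * len(target_columns) * len(unique_ids)


data = {
    "date": ["2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-01T02:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.46, 2.34, 4.55],
}
print(input_data_points(row_count=3, input_column_count=1))
print(output_data_points(3, ["TARGET1"], data, ["ID1"]))
```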
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "model_id": "bc35d16e-dd21-472e-9cde-c6c3ad88e3b5", "created_at": "2020-05-02T16:27:51Z", "results": [ { "date": [ "2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.24, 6.78 ] } ], "input_data_points": 512, "output_data_points": 1024 }
Create a fine tuning job
Create a fine tuning job that will fine tune an LLM.
Since Cloud Pak for Data 5.0.3.
POST /ml/v1/fine_tunings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The details of the fine tuning job with the data used to tune the LLM.
full fine tuning with file system results reference
An example of a request to create a fine-tuning job.
Since watsonx.ai 2.2.1.
{
"project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
"name": "Example - Full fine tuning",
"auto_update_model": true,
"parameters": {
"base_model": {
"model_id": "meta-llama/llama-3-1-8b"
},
"task_id": "classification",
"accumulate_steps": 1,
"num_epochs": 10,
"learning_rate": 0.00005,
"batch_size": 16,
"max_seq_length": 2048,
"response_template": "\n### Response:",
"verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
"gpu": {
"num": 1
},
"gradient_checkpointing": true
},
"results_reference": {
"connection": {},
"location": {
"path": "fine-tuning/experiment3"
},
"type": "fs"
},
"training_data_references": [
{
"connection": {},
"location": {
"id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
"href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
},
"type": "data_asset"
}
]
}
lora fine tuning with file system results reference
An example of a request to create a fine-tuning job with LoRA parameters.
Since watsonx.ai 2.2.1.
{
"project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
"name": "Example - Lora fine tuning",
"auto_update_model": true,
"parameters": {
"base_model": {
"model_id": "meta-llama/llama-3-1-8b"
},
"task_id": "classification",
"accumulate_steps": 1,
"num_epochs": 10,
"learning_rate": 0.00005,
"batch_size": 16,
"max_seq_length": 2048,
"response_template": "\n### Response:",
"verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
"gpu": {
"num": 1
},
"peft_parameters": {
"type": "lora",
"rank": 16,
"target_modules": [
"all-linear"
],
"lora_alpha": 32,
"lora_dropout": 0.05
},
"gradient_checkpointing": true
},
"results_reference": {
"connection": {},
"location": {
"path": "fine-tuning/experiment4"
},
"type": "fs"
},
"training_data_references": [
{
"connection": {},
"location": {
"id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
"href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
},
"type": "data_asset"
}
]
}
qlora fine tuning with file system results reference
An example of a request to create a fine-tuning job with minimal QLoRA parameters.
Since watsonx.ai 2.2.1.
{
"project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
"name": "Example - QLora fine tuning",
"auto_update_model": false,
"parameters": {
"base_model": {
"model_id": "meta-llama/llama-3-1-70b-gptq"
},
"gpu": {
"num": 1
},
"peft_parameters": {
"type": "qlora"
}
},
"results_reference": {
"connection": {},
"location": {
"path": "fine-tuning/experiment5"
},
"type": "fs"
},
"training_data_references": [
{
"connection": {},
"location": {
"id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
"href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
},
"type": "data_asset"
}
]
}
The name of the job.
The training datasets.
Possible values: 1 ≤ number of items ≤ 20
The training results. Normally this is specified as type=container which means that it is stored in the space or project.
Examples:
{ "location": { "path": "results" }, "type": "container" }
- results_reference
The data source type.
The possible types will depend on the API and platform being used.
Contains a set of fields that describe the location of the data with respect to the connection.
- location
Contains a set of fields specific to each connection. See here for details about specifying connections.
Item identification inside a collection, if appropriate.
The description of the job.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples:
[ "t1", "t2" ]
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
If set to true then the result of the training, if successful, will be uploaded to the repository as a model.
Default:
false
The parameters for the job. Note that if verbalizer is provided then response_template must also be provided (and vice versa).
The holdout/test datasets.
Possible values: 1 ≤ number of items ≤ 20
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
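Several of the constraints listed above can be checked before a job is submitted. The sketch below is a hypothetical client-side helper (validate_fine_tuning_request is not part of any IBM library) that covers only the rules stated in this section: the id format, the number of training data references, the tag limit, and the pairing of verbalizer with response_template.

```python
import re

ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")


def validate_fine_tuning_request(body):
    """Return a list of problems found in a fine tuning request body
    (hypothetical helper covering only the documented constraints)."""
    problems = []
    container_ids = [body.get("project_id"), body.get("space_id")]
    if not any(container_ids):
        problems.append("either space_id or project_id has to be given")
    for value in filter(None, container_ids):
        if len(value) != 36 or not ID_PATTERN.match(value):
            problems.append(f"invalid id: {value}")
    references = body.get("training_data_references", [])
    if not 1 <= len(references) <= 20:
        problems.append("training_data_references needs 1 to 20 items")
    if len(body.get("tags", [])) > 64:
        problems.append("at most 64 tags are allowed")
    parameters = body.get("parameters", {})
    # verbalizer and response_template must be provided together.
    if ("verbalizer" in parameters) != ("response_template" in parameters):
        problems.append("verbalizer and response_template go together")
    return problems


request = {
    "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
    "name": "Example - Full fine tuning",
    "parameters": {"verbalizer": "### Input: {{input}} \n\n### Response: {{output}}"},
    "training_data_references": [{"type": "data_asset"}],
}
print(validate_fine_tuning_request(request))
```

Here the request is reported as incomplete because verbalizer is set without response_template.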
curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
    "name": "Example - Full fine tuning",
    "auto_update_model": true,
    "parameters": {
      "base_model": {"model_id": "meta-llama/llama-3-1-8b"},
      "task_id": "classification",
      "accumulate_steps": 1,
      "num_epochs": 10,
      "learning_rate": 0.00005,
      "batch_size": 16,
      "max_seq_length": 2048,
      "response_template": "\n### Response:",
      "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
      "gpu": {"num": 1},
      "gradient_checkpointing": true
    },
    "results_reference": {
      "connection": {},
      "location": {"path": "fine-tuning/experiment3"},
      "type": "fs"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
          "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
        },
        "type": "data_asset"
      }
    ]
  }'

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
    "name": "Example - Lora fine tuning",
    "auto_update_model": true,
    "parameters": {
      "base_model": {"model_id": "meta-llama/llama-3-1-8b"},
      "task_id": "classification",
      "accumulate_steps": 1,
      "num_epochs": 10,
      "learning_rate": 0.00005,
      "batch_size": 16,
      "max_seq_length": 2048,
      "response_template": "\n### Response:",
      "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}",
      "gpu": {"num": 1},
      "peft_parameters": {
        "type": "lora",
        "rank": 16,
        "target_modules": ["all-linear"],
        "lora_alpha": 32,
        "lora_dropout": 0.05
      },
      "gradient_checkpointing": true
    },
    "results_reference": {
      "connection": {},
      "location": {"path": "fine-tuning/experiment4"},
      "type": "fs"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
          "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
        },
        "type": "data_asset"
      }
    ]
  }'

curl --request POST 'https://{cluster_url}/ml/v1/fine_tunings?version=2024-05-16' \
  -H 'Authorization: Bearer eyJraWQiOiIyMDI1MDcxOT...' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -d '{
    "project_id": "dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb",
    "name": "Example - QLora fine tuning",
    "auto_update_model": false,
    "parameters": {
      "base_model": {"model_id": "meta-llama/llama-3-1-70b-gptq"},
      "gpu": {"num": 1},
      "peft_parameters": {"type": "qlora"}
    },
    "results_reference": {
      "connection": {},
      "location": {"path": "fine-tuning/experiment5"},
      "type": "fs"
    },
    "training_data_references": [
      {
        "connection": {},
        "location": {
          "id": "69f07f10-ccfa-4137-816c-7a781f8c6b74",
          "href": "https://{cluster_url}/v2/assets/69f07f10-ccfa-4137-816c-7a781f8c6b74?project_id=dbbbfd33-1cca-4c6b-a9fa-c939b5f611eb"
        },
        "type": "data_asset"
      }
    ]
  }'
Response
The response of a fine tuning job.
Common metadata for a resource where project_id or space_id must be present.
Examples:
{ "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-fine-tuning-job", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "owner": "guy", "created_at": "2023-08-04T13:22:55.289Z", "rev": "2", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
- metadata
The unique id of the resource.
The time when the resource was created.
The revision of the resource.
The user id which created this resource.
The time when the resource was last modified.
The id of the parent resource where applicable.
The name of the resource.
A description of the resource.
The id of the space this resource belongs to.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples:
[ "t1", "t2" ]
Information related to the revision.
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Status of the training job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A fine tuning response with a file system result reference.
Since watsonx.ai 2.2.1.
{ "entity": { "auto_update_model": true, "parameters": { "accumulate_steps": 1, "base_model": { "model_id": "meta-llama/llama-3-1-8b" }, "batch_size": 2, "gpu": { "num": 1 }, "gradient_checkpointing": true, "learning_rate": 0.00005, "max_seq_length": 2048, "num_epochs": 10, "response_template": "\n### Response:", "task_id": "classification", "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}" }, "results_reference": { "connection": {}, "location": { "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3", "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53", "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/training-status.json", "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/model", "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets/320e8a31-a696-4ea1-afa6-440fd3ac8e53/resources/wml_model/request.json", "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/data/fine_tunings/training.log", "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets" }, "type": "fs" }, "status": { "completed_at": "2025-07-31T15:16:44.876Z", "metrics": [ { "context": { "fine_tuning": { "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment3/320e8a31-a696-4ea1-afa6-440fd3ac8e53/assets/320e8a31-a696-4ea1-afa6-440fd3ac8e53/resources/training_logs.jsonl" } }, "fine_tuning_metrics": { "training_loss": [ { "epoch": 1, "step": 101, "timestamp": "2025-07-31T15:09:31.354986", "value": 2.3847 }, { "epoch": 2,
"step": 202, "timestamp": "2025-07-31T15:10:14.620318", "value": 0.8851 }, { "epoch": 3, "step": 303, "timestamp": "2025-07-31T15:10:58.228559", "value": 0.457 }, { "epoch": 4, "step": 404, "timestamp": "2025-07-31T15:11:41.689645", "value": 0.2944 }, { "epoch": 5, "step": 505, "timestamp": "2025-07-31T15:12:24.963389", "value": 0.2052 }, { "epoch": 6, "step": 606, "timestamp": "2025-07-31T15:13:08.119829", "value": 0.1561 }, { "epoch": 7, "step": 707, "timestamp": "2025-07-31T15:13:51.672343", "value": 0.1255 }, { "epoch": 8, "step": 808, "timestamp": "2025-07-31T15:14:35.192001", "value": 0.1089 }, { "epoch": 9, "step": 909, "timestamp": "2025-07-31T15:15:18.613572", "value": 0.1013 }, { "epoch": 10, "step": 1010, "timestamp": "2025-07-31T15:16:01.777923", "value": 0.0983 } ] }, "timestamp": "2025-07-31T15:16:02.286Z" } ], "running_at": "2025-07-31T15:05:15.787Z", "state": "completed" }, "training_data_references": [ { "connection": {}, "location": { "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2", "id": "c195ff4c-c287-475c-a7e0-7159c9381511" }, "type": "data_asset" } ], "tuned_model": { "id": "1b0cb90f-c1b8-4e27-afc4-f59797c62bcf", "name": "model-320e8a31-a696-4ea1-afa6-440fd3ac8e53" } }, "metadata": { "created_at": "2025-07-31T14:59:29.150Z", "id": "320e8a31-a696-4ea1-afa6-440fd3ac8e53", "modified_at": "2025-07-31T15:16:44.887Z", "name": "Example - Full fine tuning", "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2" } }
A lora fine tuning response with a file system result reference.
Since watsonx.ai 2.2.1.
{ "entity": { "auto_update_model": true, "parameters": { "accumulate_steps": 1, "base_model": { "model_id": "meta-llama/llama-3-1-8b" }, "batch_size": 32, "gpu": { "num": 1 }, "gradient_checkpointing": true, "learning_rate": 0.00005, "max_seq_length": 2048, "num_epochs": 10, "peft_parameters": { "lora_alpha": 32, "lora_dropout": 0.05, "rank": 16, "target_modules": [ "all-linear" ], "type": "lora" }, "response_template": "\n### Response:", "task_id": "classification", "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}" }, "results_reference": { "connection": {}, "location": { "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4", "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd", "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/training-status.json", "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/model", "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets/928bb582-7872-40bc-875f-c56d8676b9bd/resources/wml_model/request.json", "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/data/fine_tunings/training.log", "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets" }, "type": "fs" }, "status": { "completed_at": "2025-07-31T15:25:57.603Z", "metrics": [ { "context": { "fine_tuning": { "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment4/928bb582-7872-40bc-875f-c56d8676b9bd/assets/928bb582-7872-40bc-875f-c56d8676b9bd/resources/training_logs.jsonl" } }, "fine_tuning_metrics": {
"training_loss": [ { "epoch": 1, "step": 7, "timestamp": "2025-07-31T15:24:49.806135", "value": 2.9094 }, { "epoch": 2, "step": 14, "timestamp": "2025-07-31T15:24:55.845440", "value": 2.102 }, { "epoch": 3, "step": 21, "timestamp": "2025-07-31T15:25:01.923967", "value": 1.7618 }, { "epoch": 4, "step": 28, "timestamp": "2025-07-31T15:25:07.940105", "value": 1.7091 }, { "epoch": 5, "step": 35, "timestamp": "2025-07-31T15:25:14.032330", "value": 1.5757 }, { "epoch": 6, "step": 42, "timestamp": "2025-07-31T15:25:20.094345", "value": 1.4984 }, { "epoch": 7, "step": 49, "timestamp": "2025-07-31T15:25:26.189688", "value": 1.4543 }, { "epoch": 8, "step": 56, "timestamp": "2025-07-31T15:25:32.269618", "value": 1.4443 }, { "epoch": 9, "step": 63, "timestamp": "2025-07-31T15:25:38.357029", "value": 1.3861 }, { "epoch": 10, "step": 70, "timestamp": "2025-07-31T15:25:44.431574", "value": 1.3456 } ] }, "timestamp": "2025-07-31T15:25:45.535Z" } ], "running_at": "2025-07-31T15:22:17.260Z", "state": "completed" }, "training_data_references": [ { "connection": {}, "location": { "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2", "id": "c195ff4c-c287-475c-a7e0-7159c9381511" }, "type": "data_asset" } ], "tuned_model": { "id": "5079b31b-a050-4ae2-a55c-864b91055ad8", "name": "model-928bb582-7872-40bc-875f-c56d8676b9bd" } }, "metadata": { "created_at": "2025-07-31T15:21:18.328Z", "id": "928bb582-7872-40bc-875f-c56d8676b9bd", "modified_at": "2025-07-31T15:25:57.613Z", "name": "Example - Lora fine tuning", "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2" } }
A qlora tuning response with a file system result reference.
Since watsonx.ai 2.2.1.
{ "entity": { "auto_update_model": false, "parameters": { "accumulate_steps": 1, "base_model": { "model_id": "meta-llama/llama-3-1-70b-gptq" }, "batch_size": 5, "gpu": { "num": 1 }, "gradient_checkpointing": true, "learning_rate": 0.00001, "max_seq_length": 1024, "num_epochs": 10, "peft_parameters": { "lora_alpha": 32, "lora_dropout": 0.05, "rank": 8, "target_modules": [], "type": "qlora" }, "response_template": "\n### Response:", "task_id": "generation", "verbalizer": "### Input: {{input}} \n\n### Response: {{output}}" }, "results_reference": { "connection": {}, "location": { "path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5", "training": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e", "training_status": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/training-status.json", "model_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/model", "model_request_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets/b3ed41c5-d7e1-47ec-b110-d1796f64577e/resources/wml_model/request.json", "training_log": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/data/fine_tunings/training.log", "assets_path": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets" }, "type": "fs" }, "status": { "completed_at": "2025-07-31T16:38:38.856Z", "metrics": [ { "context": { "fine_tuning": { "metrics_location": "/projects/1d59ce3d-d55f-4e38-a515-6cda5af08ab2/assets/fine-tuning/experiment5/b3ed41c5-d7e1-47ec-b110-d1796f64577e/assets/b3ed41c5-d7e1-47ec-b110-d1796f64577e/resources/training_logs.jsonl" } }, "fine_tuning_metrics": {
"training_loss": [ { "epoch": 1, "step": 41, "timestamp": "2025-07-31T16:28:05.466593", "value": 2.9058 }, { "epoch": 2, "step": 82, "timestamp": "2025-07-31T16:29:13.709110", "value": 2.5783 }, { "epoch": 3, "step": 123, "timestamp": "2025-07-31T16:30:21.861881", "value": 1.9994 }, { "epoch": 4, "step": 164, "timestamp": "2025-07-31T16:31:29.929725", "value": 1.7515 }, { "epoch": 5, "step": 205, "timestamp": "2025-07-31T16:32:38.147384", "value": 1.6546 }, { "epoch": 6, "step": 246, "timestamp": "2025-07-31T16:33:46.835779", "value": 1.6284 }, { "epoch": 7, "step": 287, "timestamp": "2025-07-31T16:34:55.447423", "value": 1.6256 }, { "epoch": 8, "step": 328, "timestamp": "2025-07-31T16:36:04.004780", "value": 1.6072 }, { "epoch": 9, "step": 369, "timestamp": "2025-07-31T16:37:12.342551", "value": 1.5981 }, { "epoch": 10, "step": 410, "timestamp": "2025-07-31T16:38:21.105589", "value": 1.5792 } ] }, "timestamp": "2025-07-31T16:38:22.615Z" } ], "running_at": "2025-07-31T16:23:29.325Z", "state": "completed" }, "training_data_references": [ { "connection": {}, "location": { "href": "https://{cluster_url}/v2/assets/c195ff4c-c287-475c-a7e0-7159c9381511?project_id=1d59ce3d-d55f-4e38-a515-6cda5af08ab2", "id": "c195ff4c-c287-475c-a7e0-7159c9381511" }, "type": "data_asset" } ], "tuned_model": { "name": "model-b3ed41c5-d7e1-47ec-b110-d1796f64577e" } }, "metadata": { "created_at": "2025-07-31T16:22:51.651Z", "id": "b3ed41c5-d7e1-47ec-b110-d1796f64577e", "modified_at": "2025-07-31T16:38:38.866Z", "name": "Example - QLora fine tuning", "project_id": "1d59ce3d-d55f-4e38-a515-6cda5af08ab2" } }
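The sample responses above nest the loss history under entity.status.metrics[].fine_tuning_metrics.training_loss. A small sketch of reading the final loss from such a response follows; the function name final_training_loss is hypothetical, and the input is a trimmed copy of the QLoRA sample rather than a live response.

```python
def final_training_loss(job):
    """Return (epoch, value) for the training_loss entry with the highest
    step, or None when no loss was reported (illustrative helper)."""
    metrics = job.get("entity", {}).get("status", {}).get("metrics", [])
    for metric in metrics:
        losses = metric.get("fine_tuning_metrics", {}).get("training_loss", [])
        if losses:
            last = max(losses, key=lambda entry: entry["step"])
            return last["epoch"], last["value"]
    return None


# Trimmed copy of the QLoRA sample response shown above.
job = {
    "entity": {
        "status": {
            "state": "completed",
            "metrics": [
                {
                    "fine_tuning_metrics": {
                        "training_loss": [
                            {"epoch": 9, "step": 369, "value": 1.5981},
                            {"epoch": 10, "step": 410, "value": 1.5792},
                        ]
                    }
                }
            ],
        }
    }
}
print(final_training_loss(job))
```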
Retrieve the list of fine tuning jobs
Retrieve the list of fine tuning jobs for the specified space or project.
Since Cloud Pak for Data 5.0.3.
GET /ml/v1/fine_tunings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned.
Possible values: value ≤ 200
Default: 100
Compute the total count. May have performance impact.
Return only the resources with the given tag value.
Filter based on the job state: queued, running, completed, failed, etc.
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
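As a rough illustration of how these query parameters combine, the following Python sketch builds the request URL for this endpoint. The helper name and the host cpd.example.com are hypothetical. The lower bound of 1 on limit is borrowed from the response schema, and the check that exactly one of space_id or project_id is given is an assumption about how the service interprets "either".

```python
from urllib.parse import urlencode


def fine_tunings_list_url(cluster_url, version, *, project_id=None,
                          space_id=None, limit=100, total_count=False):
    """Build a GET /ml/v1/fine_tunings URL (hypothetical helper)."""
    # Assumption: exactly one of project_id / space_id is supplied.
    if (project_id is None) == (space_id is None):
        raise ValueError("give exactly one of project_id or space_id")
    # The reference caps limit at 200; the lower bound of 1 is an assumption.
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    params = {"version": version}
    if project_id is not None:
        params["project_id"] = project_id
    else:
        params["space_id"] = space_id
    params["limit"] = limit
    if total_count:
        # Ask for the total only when needed; it may cost performance.
        params["total_count"] = "true"
    return f"https://{cluster_url}/ml/v1/fine_tunings?{urlencode(params)}"
```

The returned string can be passed to any HTTP client together with the Authorization: Bearer header described in the Authentication section.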
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when total_count=true query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
The response of a fine tuning job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Get a fine tuning job
Get the results of a fine tuning job, or details if the job failed.
Since CloudPak for Data 5.0.3.
GET /ml/v1/fine_tunings/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
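The length and regular-expression constraints on space_id and project_id can be checked on the client before a request is sent. The sketch below is one way to do that in Python and to build the URL for a single job. Both function names and the host cpd.example.com are hypothetical, and requiring exactly one container id is an assumption.

```python
import re
from urllib.parse import quote, urlencode

# Pattern quoted in the reference: ^[a-zA-Z0-9-]*$ with length = 36.
_CONTAINER_ID = re.compile(r"[a-zA-Z0-9-]*")


def is_valid_container_id(value):
    """Return True if value satisfies the documented space/project id rules."""
    return len(value) == 36 and _CONTAINER_ID.fullmatch(value) is not None


def fine_tuning_url(cluster_url, job_id, version, *, project_id=None,
                    space_id=None):
    """Build a GET /ml/v1/fine_tunings/{id} URL (hypothetical helper)."""
    # Assumption: exactly one of project_id / space_id is supplied.
    if (project_id is None) == (space_id is None):
        raise ValueError("give exactly one of project_id or space_id")
    key, value = (("project_id", project_id) if project_id is not None
                  else ("space_id", space_id))
    if not is_valid_container_id(value):
        raise ValueError(f"{key} must be 36 characters of [a-zA-Z0-9-]")
    query = urlencode({"version": version, key: value})
    # Percent-encode the path segment so an odd id cannot alter the path.
    return (f"https://{cluster_url}/ml/v1/fine_tunings/"
            f"{quote(job_id, safe='')}?{query}")
```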
Response
The response of a fine tuning job.
Common metadata for a resource where project_id or space_id must be present.
Examples:
{ "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-fine-tuning-job", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "owner": "guy", "created_at": "2023-08-04T13:22:55.289Z", "rev": "2", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
- metadata
The unique id of the resource.
The time when the resource was created.
The revision of the resource.
The user id which created this resource.
The time when the resource was last modified.
The id of the parent resource where applicable.
The name of the resource.
A description of the resource.
The id of the space this resource belongs to.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
Information related to the revision.
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Status of the training job.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Cancel or delete a fine tuning job
Delete a fine tuning job if it exists. Once deleted, all trace of the job is gone.
Since CloudPak for Data 5.0.3.
DELETE /ml/v1/fine_tunings/{id}
Request
Path Parameters
The id is the identifier that was returned in the metadata.id field of the request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job, request output and metadata.
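A minimal Python sketch of preparing this DELETE call with the standard library is shown below. It only constructs the request object and does not send it. The helper name and host are hypothetical, and the query parameter name hard_delete is an assumption, because this section describes the flag without naming it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def delete_fine_tuning_request(cluster_url, job_id, version, token, *,
                               project_id=None, space_id=None,
                               hard_delete=False):
    """Prepare (but do not send) a DELETE /ml/v1/fine_tunings/{id} request."""
    params = {"version": version}
    if project_id is not None:
        params["project_id"] = project_id
    elif space_id is not None:
        params["space_id"] = space_id
    else:
        raise ValueError("project_id or space_id has to be given")
    if hard_delete:
        # Assumed parameter name for "also delete the job ... and metadata".
        params["hard_delete"] = "true"
    url = (f"https://{cluster_url}/ml/v1/fine_tunings/"
           f"{quote(job_id, safe='')}?{urlencode(params)}")
    # Bearer token format as described in the Authentication section.
    return Request(url, method="DELETE",
                   headers={"Authorization": f"Bearer {token}"})
```

Passing the result to urllib.request.urlopen would send it; on clusters with self-signed certificates that call also needs an SSL context chosen according to the guidance on SSL verification earlier in this reference.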
List the available foundation models
Retrieve the list of deployed foundation models.
GET /ml/v1/foundation_model_specs
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
A set of filters to specify the list of models, filters are described as the pattern shown below.
  pattern: tfilter[,tfilter][:(or|and)]
  tfilter: filter | !filter
    filter: Requires existence of the filter.
    !filter: Requires absence of the filter.
  filter: one of
    modelid_*: Filters by model id. Namely, select a model with a specific model id.
    provider_*: Filters by provider. Namely, select all models with a specific provider.
    source_*: Filters by source. Namely, select all models with a specific source.
    input_tier_*: Filters by input tier. Namely, select all models with a specific input tier.
    output_tier_*: Filters by output tier. Namely, select all models with a specific output tier.
    tier_*: Filters by tier. Namely, select all models with a specific input or output tier.
    task_*: Filters by task id. Namely, select all models that support a specific task id.
    lifecycle_*: Filters by lifecycle state. Namely, select all models that are currently in the specified lifecycle state.
    function_*: Filters by function. Since CloudPak for Data 5.0.0. Namely, select all models that support a specific function.
Possible values: 1 ≤ length ≤ 1000, Value must match regular expression ^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$
Example: modelid_ibm/granite-13b-instruct-v1,modelid_ibm/granite-13b-instruct-v2:or
See all the Tech Preview models if entitled.
Default: false
curl --request GET 'https://{cluster_url}/ml/v1/foundation_model_specs?version=2019-10-25&filters=function_time_series_forecast' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
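The filter grammar above can be assembled and checked on the client before it is placed in the filters query parameter. The sketch below is one possible helper, not part of any IBM library: build_filters is a hypothetical name, and it validates the result against the regular expression and length limits quoted in the parameter description.

```python
import re

# Regular expression quoted in the reference for the filters parameter.
FILTERS_PATTERN = re.compile(r"([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?")


def build_filters(filters, combine=None):
    """Join individual filters into one filters value.

    filters: iterable such as ["task_code", "!provider_BigCode"];
             a leading "!" requires absence of that filter.
    combine: None, "or" or "and" (the optional suffix of the pattern).
    """
    if combine not in (None, "or", "and"):
        raise ValueError("combine must be None, 'or' or 'and'")
    value = ",".join(filters)
    if combine is not None:
        value += f":{combine}"
    # Documented limits: 1 <= length <= 1000 and a match on the pattern.
    if not 1 <= len(value) <= 1000 or FILTERS_PATTERN.fullmatch(value) is None:
        raise ValueError(f"invalid filters value: {value!r}")
    return value
```

For example, build_filters(["function_time_series_forecast"]) yields the value used in the curl request above. When placing the value in a URL, percent-encode it, for instance with urllib.parse.urlencode.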
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources.
Example: 1
A reference to the first item of the next page, if any.
The supported foundation models.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The models that are currently deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02" }, "resources": [ { "model_id": "bigcode/starcoder", "label": "starcoder-15.5b", "provider": "BigCode", "source": "Hugging Face", "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions", "tasks": [ { "id": "code", "ratings": { "quality": 3 } } ], "min_shot_size": 0, "input_tier": "class_2", "output_tier": "class_2", "number_params": "15.5b" } ] }
List the supported tasks
Retrieve the list of tasks that are supported by the foundation models.
GET /ml/v1/foundation_model_tasks
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources.
Example: 1
A reference to the first item of the next page, if any.
The supported foundation model tasks.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The tasks that are currently supported by models deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02" }, "resources": [ { "task_id": "question_answering", "label": "Question answering", "rank": 1, "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance." } ] }
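Because the pagination token is generated by the service and delivered inside the href of the next field, a client usually follows that link instead of constructing the token itself. The sketch below shows the loop in Python. The fetch_page callable is a hypothetical stand-in for whatever HTTP client is used; it must take a URL and return the decoded JSON page. The shape of next (an object with an href, like first in the sample above) is inferred from the parameter description.

```python
def iter_resources(fetch_page, first_url):
    """Yield every item of a paginated list response.

    fetch_page: callable(url) -> dict, the decoded JSON body of one page.
    first_url:  URL of the first page, for example a foundation_model_tasks URL.
    """
    url = first_url
    while url:
        page = fetch_page(url)
        # Each page carries its items in the resources array.
        yield from page.get("resources", [])
        # Follow next.href when present; stop on the last page.
        url = (page.get("next") or {}).get("href")
```

Usage with a stubbed fetch function, which keeps the example independent of any network access:

```python
pages = {
    "page-1": {"resources": [{"task_id": "question_answering"}],
               "next": {"href": "page-2"}},
    "page-2": {"resources": [{"task_id": "summarization"}]},
}
task_ids = [item["task_id"] for item in iter_resources(pages.__getitem__, "page-1")]
```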
Create a new notebook.
Create a new notebook
- either from scratch
- or by copying another notebook.
To create a notebook from scratch, first upload the notebook content (ipynb format) to your project or space storage using the Assets-files API, and then reference it with the attribute file_reference.
The other required attributes are name, project/space and runtime.
The attribute runtime is used to specify the environment on which the notebook runs.
Either project or space must be specified in the request body.
To copy a notebook, you only need to provide name and source_guid in the request body.
POST /v2/notebooks
Request
Specification of the notebook to be created.
Create a notebook from scratch in a project
{
"name": "my notebook",
"description": "this is my notebook",
"project": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"file_reference": "notebook/my_notebook.ipynb",
"runtime": {
"environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"spark_monitoring_enabled": true
}
}
The name of the new notebook.
Example: my notebook
The reference to the file in the object storage.
Example: notebook/my_notebook.ipynb
A notebook runtime.
The guid of the project in which to create the notebook.
Example: 92ae0e27-9b11-4de9-a646-d46ca3c183d4
A more verbose description of the notebook.
Example: this is my notebook
The notebook origin.
A notebook kernel.
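The two ways of creating a notebook lead to two differently shaped request bodies. The sketch below builds both as plain Python dictionaries that can be serialized with json.dumps. The function names are hypothetical. The body key space for the space case is an assumption by analogy with project, and treating project and space as mutually exclusive is also an assumption.

```python
def notebook_from_scratch(name, file_reference, environment, *, project=None,
                          space=None, description=None,
                          spark_monitoring_enabled=None):
    """Body for creating a notebook from an uploaded ipynb file."""
    # Either project or space must be specified in the request body.
    if (project is None) == (space is None):
        raise ValueError("give exactly one of project or space")
    runtime = {"environment": environment}
    if spark_monitoring_enabled is not None:
        runtime["spark_monitoring_enabled"] = spark_monitoring_enabled
    body = {"name": name, "file_reference": file_reference, "runtime": runtime}
    if project is not None:
        body["project"] = project
    else:
        body["space"] = space  # assumed key name
    if description is not None:
        body["description"] = description
    return body


def notebook_copy(name, source_guid):
    """Body for copying a notebook: only name and source_guid are needed."""
    return {"name": name, "source_guid": source_guid}
```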
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Created and returned a new notebook asset. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
The number of requests has exceeded the rate limit.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "space_id": "92ae0e27-9b11-4de9-a646-d46ca3c183d4" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-92ae0e27-9b11-4de9-a646-d46ca3c183d4", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?space_id=92ae0e27-9b11-4de9-a646-d46ca3c183d4" } }
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "notebook", "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "rate_limit", "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later." } ] }
Retrieve the details of a large number of notebooks inside a project.
POST /v2/notebooks/list
Request
Query Parameters
The guid of the project.
Additional info that will be included into the notebook details. Possible values are:
- runtime
Payload for a notebook list request.
List notebooks
{
"notebooks": [
"ca3c0e27-46ca-83d4-a646-d49b11c14de9"
]
}
The list of notebooks whose details will be retrieved.
Response
A list of notebook info as returned by a list query.
The number of items in the resources list.
Example: 1
An array of notebooks.
Status Code
Success. Returned a list of notebook assets. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "41d09a9a-f771-48a2-9534-50c0c622356d", "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d" }, "entity": { "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "asset": { "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "asset_type": "notebook", "created_at": "2021-07-01T12:37:01Z", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "version": 2, "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Delete a particular notebook, including the notebook asset.
DELETE /v2/notebooks/{notebook_guid}
Response
Status Code
Successful request. Notebook is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
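The error bodies shown for the notebook endpoints share one shape: a trace identifier plus an errors array whose entries carry a code, a message and, in some cases, a target. The Python sketch below flattens such a body into tuples for logging or display. The function name is hypothetical, and the handling of a missing target reflects only what the sample bodies show, not a documented guarantee.

```python
import json


def summarize_errors(body_text):
    """Return (trace, [(code, message, target_name), ...]) for an error body."""
    body = json.loads(body_text)
    items = []
    for error in body.get("errors", []):
        # target is absent in some samples (for example the forbidden case).
        target = error.get("target") or {}
        items.append((error.get("code"), error.get("message"),
                      target.get("name")))
    return body.get("trace"), items
```

Quoting the trace value when reporting a problem is a common convention, since it identifies the individual request on the server side.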
Revert the main notebook to a version.
PUT /v2/notebooks/{notebook_guid}
Request
Path Parameters
The guid of the main notebook.
Payload for a request to revert to a specific notebook version.
Revert the notebook to a version
{
"source": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
}
The guid of the notebook version.
Example: ca3c0e27-46ca-83d4-a646-d49b11c14de9
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Reverted the main notebook to a version. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook v4.2", "description": "this is my notebook v4.2", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Request
Path Parameters
The guid of the notebook.
Payload for a notebook update request.
Update a notebook
{
"environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
"spark_monitoring_enabled": false,
"kernel": {
"display_name": "Python 3.9 with Spark",
"name": "python39",
"language": "python3"
}
}
The guid of the environment on which the notebook runs.
Example: d46ca0e27-a646-4de9-a646-9b113c183d4
Spark monitoring enabled or not.
A notebook kernel.
Response
Notebook information as returned by a GET request.
Metadata of a notebook.
Entity of a notebook.
Status Code
Success. Updated the notebook. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4", "spark_monitoring_enabled": false }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Promote a notebook from project to space.
POST /v2/notebooks/{notebook_guid}/promote
Request
Path Parameters
The guid of the notebook.
Query Parameters
The guid of the notebook version.
The id of the project from which a notebook will be promoted.
Body parameters for promoting a notebook. space_id is required. name and description are optional. If not specified, the name and description of the source notebook in project will be used.
The id of the space to which a notebook will be promoted.
Example: b275be5f-10ff-47ee-bfc9-63f1ce5addbf
The name of the new notebook in space. If not specified, the name of the notebook in project will be used.
Example: my notebook
The description of the new notebook in space. If not specified, the description of the notebook in project will be used.
Example: this is my notebook in space
A list of tags for the new notebook in space. If not specified, the tags will be ['notebook'].
Examples: [ "test", "promote" ]
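A small Python sketch of the promote request body follows. Only space_id is required; optional fields are left out so that the service falls back to the source notebook's name and description and to the default tags. The function name is hypothetical, and the key name tags is an assumption, since this section describes the field without showing a sample body.

```python
def promote_body(space_id, name=None, description=None, tags=None):
    """Body for POST /v2/notebooks/{notebook_guid}/promote."""
    body = {"space_id": space_id}
    # Omit unset fields so the defaults described above apply.
    if name is not None:
        body["name"] = name
    if description is not None:
        body["description"] = description
    if tags is not None:
        body["tags"] = list(tags)  # assumed key name
    return body
```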
Response
Notebook information in a space as returned by promoting a notebook from project to space.
Metadata of a notebook in a space.
Entity of a notebook without runtime.
Status Code
Success. Returned the notebook asset in the space.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Create a new version.
Create a version of a given notebook.
POST /v2/notebooks/{notebook_guid}/versions
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the notebook version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
List the versions of a notebook.
List all versions of a particular notebook.
GET /v2/notebooks/{notebook_guid}/versions
Response
A list of notebook versions in a project.
The number of items in the resources array.
Example: 1
An array of notebook versions.
Status Code
Success. Returned a list of versions of the notebook.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } } ] }
{ "total_results": 1, "resources": [ { "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Retrieve a notebook version.
Retrieve a particular version of a notebook.
GET /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "space_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Delete a notebook version.
Delete a particular version of a given notebook.
DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
Status Code
Success. The version is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Create a new prompt / prompt template
This creates a new prompt with the provided parameters.
POST /v1/prompts
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat, detached]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
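Since no sample is given for this endpoint, here is a minimal Python sketch of how a create-prompt request might be assembled. It only builds the request object. The query parameter names project_id and space_id, the body keys name, description and input_mode, and the reading of "one target" as exactly one are assumptions inferred from the descriptions above, not confirmed names; the real schema may require further fields. The /wx base path comes from the introduction; the cluster URL and token are placeholders.

```python
import json
import urllib.parse
import urllib.request

# Input modes listed as allowable values for this endpoint.
INPUT_MODES = {"structured", "freeform", "chat", "detached"}


def build_create_prompt_request(base_url, token, body, project_id=None, space_id=None):
    """Build (but do not send) a POST /v1/prompts request."""
    # Assumption: exactly one of the two targets is expected.
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    mode = body.get("input_mode")
    if mode is not None and mode not in INPUT_MODES:
        raise ValueError(f"unsupported input_mode: {mode!r}")
    target = {"project_id": project_id} if project_id else {"space_id": space_id}
    url = f"{base_url}/v1/prompts?{urllib.parse.urlencode(target)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical body; key names are assumptions based on the field descriptions.
prompt_body = {
    "name": "My Prompt",
    "description": "My First Prompt",
    "input_mode": "freeform",
}
create_req = build_create_prompt_request(
    "https://cluster.example.com/wx",
    "my-token",
    prompt_body,
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(create_req.get_method(), create_req.full_url)
```

Validating the target and input mode on the client keeps the most likely causes of a Bad Request response out of the call.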
Get a prompt
This retrieves a prompt / prompt template with the given id.
GET /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Only return a set of model parameters compatible with inferencing.
Default:
true
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
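The semantic-version pattern documented under model_version can be checked on the client before a prompt is created or updated. The Python sketch below compiles the regular expression exactly as listed in this reference; the helper name is_valid_model_version is hypothetical.

```python
import re

# Pattern copied verbatim from the model_version field description.
MODEL_VERSION_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def is_valid_model_version(version):
    """Return True when the string is a semantic version accepted by the pattern."""
    return MODEL_VERSION_RE.fullmatch(version) is not None


for candidate in ("2.0.0-rc.7", "1.2.3+build.5", "1.0", "01.2.3"):
    print(candidate, is_valid_model_version(candidate))
```

Using fullmatch rather than match avoids accepting a version that is followed by a trailing newline.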
Update a prompt
This updates a prompt / prompt template with the given id.
PATCH /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt
This deletes a prompt / prompt template with the given id.
DELETE /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt lock modifications
Modifies the current locked state of a prompt.
PUT /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
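A lock request body could be assembled along the lines of the following Python sketch. The key names locked and lock_type are assumptions inferred from the field descriptions above (the reference does not show the property names), and build_lock_body is a hypothetical helper. The "locked by" value is deliberately never set, since the reference states that the server computes it.

```python
import json

# Lock types listed as allowable values for PUT /lock requests.
LOCK_TYPES = {"edit", "governance"}


def build_lock_body(locked, lock_type=None):
    """Return a dict for a PUT .../lock request body (key names are assumed)."""
    if not isinstance(locked, bool):
        raise TypeError("locked must be a boolean")
    body = {"locked": locked}
    if lock_type is not None:
        if lock_type not in LOCK_TYPES:
            raise ValueError(f"lock_type must be one of {sorted(LOCK_TYPES)}")
        body["lock_type"] = lock_type
    return body


lock_body = build_lock_body(True, "edit")
print(json.dumps(lock_body))
```

The resulting dictionary would be serialized as JSON and sent with the PUT request to the lock endpoint.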
Get current prompt lock status
Retrieves the current locked state of a prompt.
GET /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get the inference input string for a given prompt
Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.
POST /v1/prompts/{prompt_id}/input
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override input string that will be used to generate the response. The string can contain template parameters.
Possible values: Value must match regular expression
.*
Example:
Some text with variables.
Supply only to replace placeholders. Object content must be key:value pairs where the 'key' is the parameter to replace and 'value' is the value to use.
- prompt_variables
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
var1
Response
The prompt's input string used for inferences.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
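To make the placeholder replacement described above concrete, the Python sketch below builds a request body with prompt_variables and previews locally what a substitution could look like. The body key input and the {name} placeholder syntax are assumptions, and preview_substitution is a hypothetical local helper; the server's own substitution rules may differ.

```python
import json


def build_input_body(input_text=None, prompt_variables=None):
    """Return a dict for POST .../input (the 'input' key name is assumed)."""
    body = {}
    if input_text is not None:
        body["input"] = input_text
    if prompt_variables:
        # Keys are the parameters to replace, values are the replacements.
        body["prompt_variables"] = dict(prompt_variables)
    return body


def preview_substitution(template, prompt_variables):
    """Local preview only: replace each assumed {key} placeholder with its value."""
    result = template
    for key, value in prompt_variables.items():
        result = result.replace("{" + key + "}", str(value))
    return result


variables = {"var1": "world"}
input_body = build_input_body("Hello {var1}", variables)
print(json.dumps(input_body))
print(preview_substitution("Hello {var1}", variables))
```

The preview is only a convenience for checking variable names before calling the endpoint; the string returned by the API is the authoritative result.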
Add a new chat item to a prompt
This adds new chat items to the given prompt.
POST /v1/prompts/{prompt_id}/chat_items
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt session
This retrieves a prompt session with the given id.
GET /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Include the most recent entry
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Update a prompt session
This updates a prompt session with the given id.
PATCH /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
An optional description for the prompt.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt session
This deletes a prompt session with the given id.
DELETE /v1/prompt_sessions/{session_id}
Add a new prompt to a prompt session
This creates a new prompt associated with the given session.
POST /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Possible values: [structured, freeform, chat]
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get entries for a prompt session
List entries from a given session.
GET /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Bookmark from a previously limited GET request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Limit for results to retrieve. The default is 20.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
- results
The prompt entry's ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
The prompt entry's name
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Name of an entry
The prompt entry's description
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Description of an entry
The prompt entry's creation time in milliseconds
Example:
1711504485261
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Success - Returned when search completes
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
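The bookmark and limit parameters suggest bookmark-style paging. The Python sketch below shows one way a client might walk through all entries. The query parameter names bookmark and limit, and the presence of a bookmark field in the response next to results, are assumptions; fetch_page stands in for whatever function performs the actual GET, and fake_fetch exists only so the sketch runs without a cluster.

```python
def iter_entries(fetch_page, limit=20):
    """Yield every entry by following bookmarks until none is returned.

    fetch_page(params) must return the decoded JSON response as a dict.
    """
    bookmark = None
    while True:
        params = {"limit": str(limit)}
        if bookmark:
            params["bookmark"] = bookmark
        page = fetch_page(params)
        yield from page.get("results", [])
        # Assumption: the response carries the bookmark for the next page.
        bookmark = page.get("bookmark")
        if not bookmark:
            break


# Stand-in for the real HTTP call, keyed by the bookmark that was requested.
FAKE_PAGES = {
    None: {"results": [{"id": "e1"}, {"id": "e2"}], "bookmark": "b1"},
    "b1": {"results": [{"id": "e3"}]},
}


def fake_fetch(params):
    return FAKE_PAGES[params.get("bookmark")]


entry_ids = [entry["id"] for entry in iter_entries(fake_fetch, limit=2)]
print(entry_ids)
```

Injecting the fetch function keeps the paging logic separate from authentication and transport, so the loop can be tested on its own.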
Add a new chat item to a prompt session entry
This adds new chat items to the given entry.
POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Prompt session lock modifications
Modifies the current locked state of a prompt session.
PUT /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get current prompt session lock status
Retrieves the current locked state of a prompt session.
GET /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt session entry
This retrieves a prompt session entry with the given id.
GET /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User-provided semantic version for tracking in IBM AI Factsheets.
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt session entry
This deletes a prompt session entry with the given id.
DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Chat Completions
Infer the next tokens for a given deployed model with a set of parameters.
If stream is true, this operation will return the output tokens in a server-sent events (SSE) stream.
Since watsonx.ai 2.2.0.
POST /ml/v1/chat/completions
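As a minimal sketch of calling this endpoint from Python, the code below builds the POST request without sending it and extracts the data payloads from a server-sent events stream. Using the Accept header to choose between the two media types, and carrying stream as a body field, are assumptions. The cluster URL and token are placeholders, and your deployment may require additional query parameters that are not shown here.

```python
import json
import urllib.request


def build_chat_request(base_url, token, body):
    """Build (but do not send) a POST /ml/v1/chat/completions request."""
    # Assumption: streaming is requested with a boolean "stream" body field.
    accept = "text/event-stream" if body.get("stream") else "application/json"
    return urllib.request.Request(
        f"{base_url}/ml/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": accept,
        },
    )


def iter_sse_data(lines):
    """Yield the payload of each 'data:' line from an SSE stream of text lines."""
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, one leading space after the colon is dropped.
            yield value[1:] if value.startswith(" ") else value


chat_body = {
    "model": "meta-llama/llama-3-8b-instruct",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
    "max_tokens": 100,
}
chat_req = build_chat_request("https://cluster.example.com", "my-token", chat_body)
print(chat_req.get_method(), chat_req.full_url, chat_req.get_header("Accept"))

# Hypothetical stream content, only to exercise the parser.
sample_stream = ['data: {"a": 1}\n', ": keep-alive comment\n", "\n", "data:done\n"]
print(list(iter_sse_data(sample_stream)))
```

The parser makes no assumption about the JSON shape of each event; decode each yielded string with json.loads once you know the structure your model returns.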
Request
Custom Headers
Allowable values: [application/json, text/event-stream]
From a given prompt, infer the next tokens.
chat_completions
A chat example.
{
"model": "meta-llama/llama-3-8b-instruct",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
},
{
"role": "assistant",
"content": "The Los Angeles Dodgers won the World Series in 2020."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Where was it played?"
}
}
],
"max_tokens": 100,
"temperature": 0,
"time_limit": 1000
}
tool_call
A tool calling example.
{
"model": "meta-llama/llama-3-8b-instruct",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "user",
"content": {
"type": "text",
"text": "What is the weather like in Boston today?"
}
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"description": "The city, e.g. San Francisco, CA",
"type": "string"
},
"unit": {
"enum": [
"celsius",
"fahrenheit"
],
"type": "string"
}
},
"required": [
"location"
]
}
}
}
],
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
}
}
}
json_mode
A chat example with json output.
{
"model": "meta-llama/llama-3-8b-instruct",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"response_format": {
"type": "json_object"
},
"messages": [
{
"role": "system",
"content": "You are a helpful assistant designed to output JSON."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Who won the world series in 2020?"
}
}
]
}
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples: { "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default: 0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples: { "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default: false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default: 1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default: 1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default: 1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default: 0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example: 41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples: [ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example: true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default: 1
Time limit in milliseconds. If generation is not completed within this time, it stops, and the text generated so far is returned along with the TIME_LIMIT stop reason. Depending on the user's plan, and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example: 600000
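Several of the value ranges documented above can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of the watsonx.ai API or its Python library. The helper name check_chat_parameters is hypothetical, and the property names frequency_penalty, presence_penalty, top_logprobs, and stop are assumptions, because the descriptions above do not spell them out.

```python
def check_chat_parameters(body: dict) -> list[str]:
    """Return a list of problems found in a chat request body (empty if none)."""
    problems = []

    # The messages array is documented as holding 1 to 1000 items.
    messages = body.get("messages", [])
    if not 1 <= len(messages) <= 1000:
        problems.append("messages must contain between 1 and 1000 items")

    # Both penalties are documented with exclusive bounds: -2 < value < 2.
    # The property names are assumed, see the lead-in.
    for name in ("frequency_penalty", "presence_penalty"):
        value = body.get(name)
        if value is not None and not -2 < value < 2:
            problems.append(f"{name} must satisfy -2 < value < 2")

    # The top-token count is limited to 0..20 and needs logprobs set to true.
    top_logprobs = body.get("top_logprobs")
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 20:
            problems.append("top_logprobs must be between 0 and 20")
        if body.get("logprobs") is not True:
            problems.append("top_logprobs requires logprobs to be true")

    # At most 4 stop sequences, all unique.
    stop = body.get("stop", [])
    if len(stop) > 4 or len(set(stop)) != len(stop):
        problems.append("stop must hold at most 4 unique strings")

    # Only one of the two tool selection properties may be given.
    if "tool_choice" in body and "tool_choice_option" in body:
        problems.append("specify either tool_choice or tool_choice_option, not both")

    return problems


ok_body = {"messages": [{"role": "user", "content": "Where was it played?"}]}
bad_body = {"messages": [], "top_logprobs": 5, "stop": ["a", "a"], "presence_penalty": 2}
ok_problems = check_chat_parameters(ok_body)
bad_problems = check_chat_parameters(bad_body)
print(ok_problems)
print(bad_problems)
```

The temperature and top_p bounds are deliberately left out of the sketch: the documented range excludes 0, yet the request examples above send "temperature": 0, so a strict client-side check could reject requests that the service accepts.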
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example: google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation.
Content-Type: text/event-stream if stream is true.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A chat example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model": "meta-llama/llama-3-8b-instruct", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 47, "prompt_tokens": 59, "total_tokens": 106 } }
A tool calling example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model": "meta-llama/llama-3-8b-instruct", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "tool_calls": [ { "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99", "type": "function", "function": { "name": "get_current_weather", "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n" } } ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 18, "prompt_tokens": 19, "total_tokens": 37 } }
A chat example with json output.
{ "id": "cmpl-09945b25c805491fb49e15439b8e5d84", "model": "meta-llama/llama-3-8b-instruct", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 35, "prompt_tokens": 20, "total_tokens": 55 } }
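In the tool calling response above, the arguments value is itself a JSON-encoded string, so a client has to decode it a second time before passing the values to its own function. The following Python sketch shows one way to do that with the standard library; extract_tool_calls is a hypothetical helper, and the response text is a trimmed copy of the tool calling example.

```python
import json

# Trimmed copy of the tool calling response example.
response_text = r"""
{
  "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
  "model_id": "meta-llama/llama-3-8b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "tool_calls": [
          {
            "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
            "type": "function",
            "function": {
              "name": "get_current_weather",
              "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"completion_tokens": 18, "prompt_tokens": 19, "total_tokens": 37}
}
"""


def extract_tool_calls(response: dict) -> list[tuple[str, dict]]:
    """Collect (function name, decoded arguments) pairs from all choices."""
    calls = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls", []):
            function = call["function"]
            # The arguments field is a JSON string, not a JSON object.
            calls.append((function["name"], json.loads(function["arguments"])))
    return calls


response = json.loads(response_text)
tool_calls = extract_tool_calls(response)
print(tool_calls)
```

A message without tool_calls yields an empty list, so the same helper can be applied to plain chat responses.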
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
You can also use /ml/v1/chat/completions to achieve the same result.
Since watsonx.ai 2.1.0.
POST /ml/v1/text/chat
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens.
text_chat
A text chat example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
},
{
"role": "assistant",
"content": "The Los Angeles Dodgers won the World Series in 2020."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Where was it played?"
}
}
],
"max_tokens": 100,
"temperature": 0,
"time_limit": 1000
}
tool_call
A tool calling example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "user",
"content": {
"type": "text",
"text": "What is the weather like in Boston today?"
}
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"description": "The city, e.g. San Francisco, CA",
"type": "string"
},
"unit": {
"enum": [
"celsius",
"fahrenheit"
],
"type": "string"
}
},
"required": [
"location"
]
}
}
}
],
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
}
}
}
json_mode
A text chat example with json output.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"response_format": {
"type": "json_object"
},
"messages": [
{
"role": "system",
"content": "You are a helpful assistant designed to output JSON."
},
{
"role": "user",
"content": {
"type": "text",
"text": "Who won the world series in 2020?"
}
}
]
}
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
If specified, the output will be exactly one of the choices.
If specified, the output will follow the regex pattern.
If specified, the output will follow the context free grammar.
If specified, the output will follow the JSON schema. See the JSON Schema reference for documentation about the format.
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace the name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save the CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
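The two key_ref formats from Step 2 can be assembled with plain string formatting. The Python sketch below only follows the patterns quoted in the setup instructions; the function names and all argument values are hypothetical placeholders, not real identifiers or keys.

```python
def key_protect_key_ref(region: str, account_id: str, service_instance: str,
                        key_id: str, ciphertext: str) -> str:
    """Build a key_ref for IBM Key Protect from the documented CRN pattern."""
    return (
        f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
        f"{service_instance}:key:{key_id}:wdek:{ciphertext}"
    )


def aws_kms_key_ref(region: str, account_id: str, key_id: str,
                    ciphertext_blob: str) -> str:
    """Build a key_ref for AWS KMS from the documented ARN pattern."""
    # The CiphertextBlob stays base64-encoded, as the notes above require.
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


# Placeholder values for illustration only.
ibm_ref = key_protect_key_ref("us-south", "12345", "67890", "abcd-1234", "BASE64WDEK")
aws_ref = aws_kms_key_ref("us-east-1", "111122223333", "abcd-1234", "BASE64EDEK")
print(ibm_ref)
print(aws_ref)
```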
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples: { "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default: 0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples: { "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default: false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default: 1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default: 1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default: 1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default: 0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example: 41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples: [ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example: true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default: 1
Time limit in milliseconds. If generation is not completed within this time, it stops, and the text generated so far is returned along with the TIME_LIMIT stop reason. Depending on the user's plan, and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example: 600000
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] }, { "role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020." }, { "role": "user", "content": [ { "type": "text", "text": "Where was it played?" } ] } ], "max_tokens": 100, "temperature": 0, "time_limit": 1000 }'

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "What is the weather like in Boston today?" } ] } ], "tools": [ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "description": "The city, e.g. San Francisco, CA", "type": "string" }, "unit": { "enum": [ "celsius", "fahrenheit" ], "type": "string" } }, "required": [ "location" ] } } } ], "tool_choice": { "type": "function", "function": { "name": "get_current_weather" } } }'

curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "response_format": { "type": "json_object" }, "messages": [ { "role": "system", "content": "You are a helpful assistant designed to output JSON." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] } ] }'
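The curl commands above translate directly to other HTTP clients. The following is a minimal sketch using only the Python standard library, not the watsonx.ai Python library. The helper build_text_chat_request is hypothetical, cluster.example.com and the token are placeholders, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_text_chat_request(cluster_url: str, token: str, body: dict,
                            version: str = "2023-10-25") -> urllib.request.Request:
    """Prepare a POST request for /ml/v1/text/chat without sending it."""
    url = f"https://{cluster_url}/ml/v1/text/chat?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


body = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [{"type": "text", "text": "Who won the world series in 2020?"}]},
    ],
    "max_tokens": 100,
    "time_limit": 1000,
}

# Placeholder host and token; substitute the values for your cluster.
request = build_text_chat_request("cluster.example.com", "<token>", body)
print(request.get_method(), request.full_url)
```

Passing the prepared object to urllib.request.urlopen would send it. A cluster with a self-signed certificate would additionally need an SSL context, which corresponds to the -k option of curl.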
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example: google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A text chat example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 47, "prompt_tokens": 59, "total_tokens": 106 } }
A tool calling example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "tool_calls": [ { "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99", "type": "function", "function": { "name": "get_current_weather", "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n" } } ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 18, "prompt_tokens": 19, "total_tokens": 37 } }
A text chat example with json output.
{ "id": "cmpl-09945b25c805491fb49e15439b8e5d84", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 35, "prompt_tokens": 20, "total_tokens": 55 } }
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters.
This operation sets stream in the request body to true and returns the output tokens as a stream of events.
You can also use /ml/v1/chat/completions with the stream option set to true to achieve the same result.
Since watsonx.ai 2.1.0.
POST /ml/v1/text/chat_stream
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Using auto means the model can pick between generating a message or calling one or more tools. Default is auto.
Using none means the model will not call any tool and instead generates a message.
Using required means the model must call one or more tools.
Allowable values: [auto, none, required]
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
If specified, the output will be exactly one of the choices.
If specified, the output will follow the regex pattern.
If specified, the output will follow the context free grammar.
If specified, the output will follow the JSON schema. See the JSON Schema reference for documentation about the format.
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace the name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save the CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
Additional kwargs to pass to the chat template, described as a JSON Schema object. See the JSON Schema reference for documentation about the format.
Examples: { "thinking": true }
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default: 0
Whether to include reasoning_content in the response. Default is true.
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples: { "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default: false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
Default: 1024
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length. Set to 0 for the model's configured max generated tokens.
This value is now deprecated in favor of max_completion_tokens. If specified together with max_completion_tokens, max_tokens will be ignored.
Default: 1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default: 1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default: 0
A lower reasoning effort can result in faster responses, fewer tokens used, and shorter reasoning_content in the responses. Supported values are low, medium, and high.
Allowable values: [low, medium, high]
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example: 41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples: [ "this", "the" ]
If set to true, this operation will return the output tokens as a stream of events, and usage is always included in the response.
Example: true
What sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default: 1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default: 1
Time limit in milliseconds. If generation is not completed within this time, it stops, and the text generated so far is returned along with the TIME_LIMIT stop reason. Depending on the user's plan, and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example: 600000
Response
A set of server sent events, each event contains a response for one or more tokens. The results will be an array of events of the form data: {<json event>} where the schema of the individual json event is described below.
A unique identifier for the chat completion.
The model used for the chat completion.
Example: google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
HAP and PII detection for text
This operation is used mainly for Hate and Profanity (HAP)
and Personally Identifiable Information (PII) filtering.
This is a detection-only endpoint. It supports natural language input and output and returns the result of the detection. It can be configured for HAP, PII, or any combination with other available detectors.
Since CloudPak for Data 5.1.0.
POST /ml/v1/text/detection
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a content, detect text.
text PII detection
A PII text detection example.
{
"input": "my text to check",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"detectors": {
"pii": {}
}
}
text HAP detection
A HAP text detection example.
{
"input": "my text to check",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"detectors": {
"hap": {
"threshold": 0.5
}
}
}
text detection with multiple detectors
A text detection with multiple detectors.
{
"input": "my text to check",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"detectors": {
"pii": {},
"hap": {
"threshold": 0.6
}
}
}
The text to be examined.
The detectors to use, these can be IBM provided HAP or PII detectors or a custom content detector.
- detectors
The detectors to use, this is a map of detector-name with a map of optional key/value pairs.
- any property
The optional key/value pairs for the detector.
- any property
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external key management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen key management service before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace the name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save the CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
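The two key_ref formats described in the setup instructions can be assembled with plain string formatting. The following is a minimal sketch: both function names are hypothetical helpers, not part of any IBM or AWS library, and the values passed in the usage lines are placeholders.

```python
def key_protect_key_ref(region, account_id, service_instance, key_id, ciphertext):
    """Build a key_ref for IBM Key Protect from its documented pattern.

    `ciphertext` is the base64-encoded wrapped key, passed through unchanged.
    """
    return (
        f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
        f"{service_instance}:key:{key_id}:wdek:{ciphertext}"
    )


def aws_kms_key_ref(region, account_id, key_id, ciphertext_blob):
    """Build a key_ref for AWS KMS from its documented pattern.

    `ciphertext_blob` stays base64-encoded, as the notes above require.
    """
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


# Placeholder values for illustration only.
kp_ref = key_protect_key_ref("us-south", "12345", "my-instance", "abcd-1234", "Q0lQSEVS")
aws_ref = aws_kms_key_ref("us-east-1", "111122223333", "key-1", "QUJD")
```

Keeping the construction in one place makes it harder to drop the `wdek` or `edek` segment by accident.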
curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "input": "my text to check", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "detectors": { "pii": {} } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "input": "my text to check", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "detectors": { "hap": { "threshold": 0.5 } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/detection?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "input": "my text to check", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "detectors": { "pii": {}, "hap": { "threshold": 0.6 } } }'
Response
The response for text detection.
The text that was detected.
Possible values: number of items ≥ 0
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A PII text detection example.
{ "detections": [ { "start": 20, "end": 24, "detection_type": "pii", "detection": "xxxx", "score": 0.846 } ] }
A HAP text detection example.
{ "detections": [ { "start": 122, "end": 239, "detection_type": "hap", "detection": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "score": 0.846 } ] }
A text detection with multiple detectors.
{ "detections": [ { "start": 20, "end": 24, "detection_type": "pii", "detection": "xxxx", "score": 0.846 }, { "start": 122, "end": 239, "detection_type": "hap", "detection": "xxxxxxxxxxxxxxxxxxxxxxxxxx", "score": 0.846 } ] }
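One common use of the `detections` array is to mask the flagged spans in the original input. The following is a minimal sketch, assuming that `start` and `end` are character offsets into the input with `end` exclusive; that reading is consistent with the PII sample above (offsets 20 to 24 for a four-character detection) but is not stated by the reference. The function name and sample text are hypothetical.

```python
def redact(text, detections, mask="*"):
    """Replace each detected span with mask characters, preserving length.

    Assumes start/end are character offsets and end is exclusive.
    """
    chars = list(text)
    for det in detections:
        start = max(0, det["start"])
        end = min(len(chars), det["end"])
        for i in range(start, end):
            chars[i] = mask
    return "".join(chars)


response = {
    "detections": [
        {"start": 20, "end": 24, "detection_type": "pii", "detection": "xxxx", "score": 0.846}
    ]
}
text = "My card PIN code is 1234, keep it safe."
redacted = redact(text, response["detections"])
```

Clamping the offsets to the text length keeps the helper safe if a span extends past the end of the string it is applied to.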
Detection task on input content based on context documents
This operation supports context relevance and faithfulness (or groundedness).
The input is analyzed, along with the context information,
and the model will return any detections that it found.
POST /ml/v1/text/detection/context
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a content, detect text using a detection model.
Since CloudPak for Data 5.1.0.
The text to be examined.
The detectors to use, this is a map of detector-name with a map of optional key/value pairs.
- detectors
The optional key/value pairs for the detector.
- any property
The type of the context.
Allowable values: [docs]
Context documents.
Possible values: 1 ≤ number of items ≤ 100
Examples:
[ "https://en.wikipedia.org/wiki/IBM", "https://research.ibm.com/" ]
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
Response
The response for text context detection.
The text that was detected.
Possible values: number of items ≥ 0
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
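Because no sample is provided for this operation, the following sketch shows how a request body might be assembled and checked against the documented limits. The field names `context_type` and `context`, the detector name `context_relevance`, and the function name are all assumptions made for illustration; only `input`, `detectors`, `project_id`, and `space_id` appear in the examples for the sibling detection endpoint.

```python
def build_context_detection_body(input_text, context_docs, detectors,
                                 project_id=None, space_id=None):
    """Assemble a body for POST /ml/v1/text/detection/context.

    Field names `context_type` and `context` are assumptions; check them
    against the schema of your installed version before use.
    """
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    if not 1 <= len(context_docs) <= 100:
        raise ValueError("context must contain between 1 and 100 documents")
    body = {
        "input": input_text,
        "context_type": "docs",  # the only documented allowable value
        "context": list(context_docs),
        "detectors": dict(detectors),
    }
    if project_id is not None:
        body["project_id"] = project_id
    if space_id is not None:
        body["space_id"] = space_id
    return body


body = build_context_detection_body(
    "What does IBM research?",
    ["https://research.ibm.com/"],
    {"context_relevance": {}},  # hypothetical detector name
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
```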
Detection task performing detection on prompt and generated text
This operation supports answer relevance.
The prompt is analyzed, along with the generated text,
and the model will return any detections that it found.
Since CloudPak for Data 5.1.0.
POST /ml/v1/text/detection/generated
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a content, detect text using a detection model.
The text prompt.
The generated text.
The detectors to use, this is a map of detector-name with a map of optional key/value pairs.
- detectors
The optional key/value pairs for the detector.
- any property
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
Response
The response for generated text detection.
The text that was detected.
Possible values: number of items ≥ 0
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Generate embeddings
Generate embeddings from text input.
See the documentation for a description of text embeddings.
Since watsonx.ai 2.0.0.
POST /ml/v1/text/embeddings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The text input for a given model to be used to generate the embeddings.
A sample request.
A simple request.
{
"model_id": "slate",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
"Youth craves thrills while adulthood cherishes wisdom."
]
}
The id of the model to be used for this request. Please refer to the list of models.
The input text.
Possible values: number of items ≤ 1000
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Parameters for text embedding requests.
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Examples:
{ "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "inputs": [ "Youth craves thrills while adulthood cherishes wisdom.", "Youth seeks ambition while adulthood finds contentment.", "Dreams chased in youth while goals pursued in adulthood." ], "model_id": "slate", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f" }'
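Since the `inputs` array accepts at most 1000 items, a larger corpus has to be split across several requests. The following is a minimal sketch of that batching; the function names are hypothetical, and the body layout mirrors the sample request above.

```python
def batch_inputs(inputs, max_items=1000):
    """Split a list of texts into chunks of at most `max_items` items."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [inputs[i:i + max_items] for i in range(0, len(inputs), max_items)]


def embedding_bodies(inputs, model_id, project_id, max_items=1000):
    """Build one request body per chunk, in the shape of the sample request."""
    return [
        {"model_id": model_id, "project_id": project_id, "inputs": chunk}
        for chunk in batch_inputs(inputs, max_items)
    ]


texts = [f"sentence {i}" for i in range(2500)]
bodies = embedding_bodies(texts, "slate", "12ac4cf1-252f-424b-b52d-5cdd9814987f")
```

Each body can then be sent as a separate POST, and the `results` arrays concatenated in request order.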
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The embedding values for a given text.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example:
2020-05-02T16:27:51Z
The number of input tokens that were consumed.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
An array of embeddings for each input string.
{ "model_id": "slate", "results": [ { "embedding": [ -0.006929283, -0.005336422, -0.024047505 ] } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 10 }
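Embedding vectors are usually compared with cosine similarity. The following is a minimal sketch that reads the vectors from a response shaped like the sample above and compares two of them; the function names are hypothetical, and the second vector in the usage lines is made up for illustration.

```python
import math


def embeddings_from_response(response):
    """Extract the embedding vectors from the `results` array, in order."""
    return [item["embedding"] for item in response["results"]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


response = {
    "model_id": "slate",
    "results": [{"embedding": [-0.006929283, -0.005336422, -0.024047505]}],
    "created_at": "2024-02-21T17:32:28Z",
    "input_token_count": 10,
}
vectors = embeddings_from_response(response)
similarity = cosine_similarity(vectors[0], [0.01, 0.0, -0.02])  # second vector is illustrative
```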
Start a text extraction request
Start a request to extract text and metadata from documents.
See the documentation for a description of text extraction.
Since watsonx.ai 2.1.0.
POST /ml/v1/text/extractions
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input for the text extraction request.
simple_request
A simple request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "connection_asset",
"connection": {
"id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
},
"location": {
"file_name": "files/document.pdf"
}
},
"results_reference": {
"type": "connection_asset",
"connection": {
"id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
},
"location": {
"file_name": "results"
}
},
"steps": {
"tables_processing": {
"enabled": true
}
}
}
container_request
A simple request with document and result references to a container.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "container",
"location": {
"path": "files/document.pdf"
}
},
"results_reference": {
"type": "container",
"location": {
"path": "results/"
}
},
"parameters": {
"requested_outputs": [
"assembly"
],
"mode": "high_quality",
"ocr_mode": "enabled",
"languages": [
"latn"
]
}
}
ocr_request
An OCR request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "connection_asset",
"connection": {
"id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
},
"location": {
"file_name": "files/document.pdf"
}
},
"results_reference": {
"type": "connection_asset",
"connection": {
"id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
},
"location": {
"file_name": "results"
}
},
"steps": {
"ocr": {
"languages_list": [
"en",
"fr"
]
},
"tables_processing": {
"enabled": false
}
}
}
Multiple outputs
A request for multiple outputs.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "connection_asset",
"connection": {
"id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
},
"location": {
"file_name": "files/document.pdf"
}
},
"results_reference": {
"type": "connection_asset",
"connection": {
"id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
},
"location": {
"file_name": "results"
}
},
"parameters": {
"requested_outputs": [
"assembly",
"md"
],
"mode": "high_quality",
"ocr_mode": "enabled"
}
}
A reference to data.
A reference to data.
The parameters for the text extraction.
Since watsonx.ai 2.2.0.
The steps for the text extraction pipeline.
Use parameters instead.
Set this as an empty object to specify json output.
Use parameters.requested_outputs instead.
Default:
{}
Set this as an empty object to specify markdown output.
Use parameters.requested_outputs instead.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "ocr": { "languages_list": [ "en" ] }, "tables_processing": { "enabled": false } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "parameters": { "requested_outputs": [ "assembly", "md" ], "mode": "high_quality", "ocr_mode": "enabled" } }'
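The request bodies above share the same reference structure, so they can be generated from a small helper. The following is a minimal sketch that reproduces the "Multiple outputs" body using the `parameters` form; the function names are hypothetical, and the default argument values are copied from the samples, not from a documented default.

```python
def connection_asset_ref(connection_id, file_name):
    """Build a data reference of type connection_asset, as in the samples."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": {"file_name": file_name},
    }


def build_extraction_body(project_id, document_ref, results_ref,
                          requested_outputs=("assembly",),
                          mode="high_quality", ocr_mode="enabled"):
    """Assemble a text extraction request that uses `parameters`."""
    return {
        "project_id": project_id,
        "document_reference": document_ref,
        "results_reference": results_ref,
        "parameters": {
            "requested_outputs": list(requested_outputs),
            "mode": mode,
            "ocr_mode": ocr_mode,
        },
    }


body = build_extraction_body(
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    connection_asset_ref("6f5688fd-f3bf-42c2-a18b-49c0d8a1920d", "files/document.pdf"),
    connection_asset_ref("2a7c11bc-2913-48d0-9581-a8d9f40fa159", "results"),
    requested_outputs=("assembly", "md"),
)
```

Serializing `body` with `json.dumps` yields a payload equivalent to the third curl example.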
Response
The text extraction response.
Common metadata for a resource where project_id or space_id must be present.
The document details for the text extraction.
- entity
A reference to data.
A reference to data.
The current status of the text extraction.
The parameters for the text extraction.
Since watsonx.ai 2.2.0.
The steps for the text extraction pipeline.
Use parameters instead.
Set this as an empty object to specify json output.
Use parameters.requested_outputs instead.
Set this as an empty object to specify markdown output.
Use parameters.requested_outputs instead.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Created. The Content-Location header will contain the URI reference to the created resource.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "container", "location": { "path": "files/document.pdf" } }, "results_reference": { "type": "container", "location": { "path": "results/" } }, "parameters": { "requested_outputs": [ "assembly" ], "mode": "high_quality", "ocr_mode": "enabled" }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "ocr": { "languages_list": [ "en", "fr" ] }, "tables_processing": { "enabled": false } }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "parameters": { "requested_outputs": [ "assembly", "md" ], "mode": "high_quality", "ocr_mode": "enabled" }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
Retrieve the text extraction requests
Retrieve the list of text extraction requests for the specified space or project.
This operation does not save the history: any requests that were deleted or purged will not appear in this list.
Since watsonx.ai 2.1.0.
GET /ml/v1/text/extractions
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Response
A paginated list of resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when total_count=true query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
A list of resources.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "limit": 10, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text_extractions" }, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "results": { "status": "completed", "number_pages_processed": 3, "running_at": "2023-05-02T16:28:03Z", "completed_at": "2023-05-02T16:29:31Z" } } } ] }
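Token-based pagination can be followed by reading the token out of the `next` href and passing it to the following request. The following is a minimal sketch under stated assumptions: the query parameter name `start` is an assumption (the parameter is described above but not named in this section), and `fetch_page` is a hypothetical callable that performs the GET and returns the decoded JSON page.

```python
from urllib.parse import parse_qs, urlparse


def next_start_token(page, param="start"):
    """Return the pagination token from page['next']['href'], or None.

    The query parameter name `start` is an assumption for this sketch.
    """
    nxt = page.get("next")
    if not nxt or "href" not in nxt:
        return None
    values = parse_qs(urlparse(nxt["href"]).query).get(param)
    return values[0] if values else None


def iter_resources(fetch_page):
    """Yield every resource across pages.

    `fetch_page(token)` is a hypothetical callable: it receives None for
    the first page, then the token taken from the previous page.
    """
    token = None
    while True:
        page = fetch_page(token)
        yield from page.get("resources", [])
        token = next_start_token(page)
        if token is None:
            break
```

Passing the fetch function in keeps the loop independent of any particular HTTP client.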
Get the results of the request
Retrieve the text extraction request with the specified identifier.
Note that there is a retention period of 2 days. If this retention
period is exceeded then the request will be deleted and the results
no longer available. In this case this operation will return 404.
Since watsonx.ai 2.1.0.
GET /ml/v1/text/extractions/{id}
Request
Path Parameters
The identifier of the extraction request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request GET 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
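Because extraction runs asynchronously, a client typically polls this operation until the request reaches a final status. The following is a minimal sketch: `fetch` is a hypothetical callable that performs the GET and returns the decoded JSON, the statuses `submitted`, `running`, and `completed` come from the sample responses in this reference, and `failed` is an assumed terminal status.

```python
import time

# "completed" appears in the samples; "failed" is an assumption.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_extraction(fetch, interval_s=5.0, timeout_s=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll `fetch()` until entity.results.status is terminal.

    Returns the last response document, or raises TimeoutError if the
    status is still non-terminal once `timeout_s` seconds have passed.
    """
    deadline = clock() + timeout_s
    while True:
        doc = fetch()
        status = doc["entity"]["results"]["status"]
        if status in TERMINAL_STATUSES:
            return doc
        if clock() >= deadline:
            raise TimeoutError(f"extraction still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

The `sleep` and `clock` arguments exist so that the loop can be exercised without real delays; a 404 from the underlying GET (for example after the retention period) would need to be handled inside `fetch`.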
Response
The text extraction response.
Common metadata for a resource where project_id or space_id must be present.
The document details for the text extraction.
- entity
A reference to data.
A reference to data.
The current status of the text extraction.
The parameters for the text extraction.
Since watsonx.ai 2.2.0.
The steps for the text extraction pipeline.
Use parameters instead.
Set this as an empty object to specify json output.
Use parameters.requested_outputs instead.
Set this as an empty object to specify markdown output.
Use parameters.requested_outputs instead.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } }, "results": { "status": "running", "number_pages_processed": 2, "running_at": "2023-05-02T16:28:03Z" } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "ocr": { "languages_list": [ "en", "fr" ] }, "tables_processing": { "enabled": false } }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "parameters": { "requested_outputs": [ "assembly", "md" ], "mode": "high_quality", "ocr_mode": "enabled" }, "results": { "status": "running", "number_pages_processed": 2, "running_at": "2023-05-02T16:28:03Z" } } }
Delete the request
Cancel the specified text extraction request and delete any associated results.
Since watsonx.ai 2.1.0.
DELETE /ml/v1/text/extractions/{id}
Request
Path Parameters
The identifier of the extraction request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
curl --request DELETE 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
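The DELETE request above can also be assembled in code. The following is a minimal Python sketch, not an official client: the helper name build_delete_extraction_url and the host cluster.example.com are hypothetical, and the rule "exactly one of project_id or space_id" is this sketch's reading of "either has to be given". It only builds the URL and checks the documented identifier constraints (length 36, pattern ^[a-zA-Z0-9-]*$); it does not send anything.

```python
import re
from urllib.parse import quote, urlencode

# Constraint documented for space_id and project_id.
_ID_PATTERN = re.compile(r"^[a-zA-Z0-9-]*$")


def build_delete_extraction_url(cluster_url, extraction_id, version,
                                project_id=None, space_id=None):
    """Build the URL for DELETE /ml/v1/text/extractions/{id}.

    Hypothetical helper for illustration. Assumes exactly one of
    project_id or space_id is supplied.
    """
    if (project_id is None) == (space_id is None):
        raise ValueError("give exactly one of project_id or space_id")
    if project_id is not None:
        name, value = "project_id", project_id
    else:
        name, value = "space_id", space_id
    if len(value) != 36 or not _ID_PATTERN.fullmatch(value):
        raise ValueError(f"{name} must be 36 characters matching {_ID_PATTERN.pattern}")
    query = urlencode({"version": version, name: value})
    path_id = quote(extraction_id, safe="")
    return f"https://{cluster_url}/ml/v1/text/extractions/{path_id}?{query}"


url = build_delete_extraction_url(
    "cluster.example.com",  # placeholder cluster host
    "6213cf1-252f-424b-b52d-5cdd9814956c",
    "2023-10-25",
    project_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(url)
```

The resulting URL would then be sent with the DELETE method and an Authorization: Bearer header, as in the curl example.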
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
This API is legacy; consider using Text Chat instead.
POST /ml/v1/text/generation
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example: google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen keys management service, before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
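The two key_ref formats described in the setup instructions are plain string templates, so they can be filled in with a few lines of code. The sketch below is illustrative only: the function names and every sample value are hypothetical placeholders, and the ciphertext is passed through unchanged, since the instructions say to keep the base64 output encoded.

```python
def key_protect_key_ref(region, account_id, service_instance, key_id, ciphertext):
    """Fill the documented IBM Key Protect template:
    crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
    """
    return (f"crn:v1:bluemix:public:kms:{region}:a/{account_id}:"
            f"{service_instance}:key:{key_id}:wdek:{ciphertext}")


def aws_kms_key_ref(region, account_id, key_id, ciphertext_blob):
    """Fill the documented AWS KMS template:
    arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
    """
    return f"arn:aws:kms:{region}:{account_id}:key/{key_id}:edek:{ciphertext_blob}"


# All values below are made-up placeholders, not real identifiers.
kp_ref = key_protect_key_ref("us-south", "ACCOUNT", "INSTANCE", "KEY", "WRAPPED==")
aws_ref = aws_kms_key_ref("us-east-1", "000000000000", "KEY", "BLOB==")
print(kp_ref)
print(aws_ref)
```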
curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
System details.
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example: Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example: 3
The number of input tokens consumed.
Example: 11
The seed used, if it exists.
Example: 42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
The generated text from the model along with other details.
{ "model_id": "google/flan-t5-xl", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.", "generated_token_count": 118, "input_token_count": 11, "stop_reason": "eos_token", "moderations": { "pii": [ { "score": 0.8, "input": false, "position": { "start": 74, "end": 88 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 200, "end": 212 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 244, "end": 259 }, "entity": "EmailAddress" } ] } } ] }
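A caller typically needs only the generated text and the stop reason from a response like the ones above. The following Python sketch shows one way to read them; it is an illustration, not part of the API. The helper name summarize_generation is hypothetical, and grouping max_tokens, token_limit and time_limit as "output was cut short" is this sketch's own assumption. The stop reason is lower-cased before comparison, following the advice to test these values case-insensitively.

```python
import json

# Stop reasons that this sketch treats as "output was cut short".
# The grouping is an assumption for illustration, not defined by the API.
LIMIT_REASONS = {"max_tokens", "token_limit", "time_limit"}


def summarize_generation(response):
    """Return (generated_text, stop_reason, hit_limit) for each result."""
    summary = []
    for result in response["results"]:
        # Normalize case before comparing, as the reference advises.
        reason = result["stop_reason"].lower()
        summary.append((result["generated_text"], reason, reason in LIMIT_REASONS))
    return summary


# The first sample response from this section.
sample = json.loads("""
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z",
  "results": [ { "generated_text": "4,000 km", "generated_token_count": 4,
                 "input_token_count": 12, "stop_reason": "eos_token" } ] }
""")
print(summarize_generation(sample))
```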
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
This API is legacy; consider using Text Chat Stream instead.
POST /ml/v1/text/generation_stream
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example: google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen keys management service, before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
A set of server-sent events; each event contains a response for one or more tokens. The results will be an array of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example: Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example: 3
The number of input tokens consumed.
Example: 11
The seed used, if it exists.
Example: 42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
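Because no sample response is given for the stream, the following Python sketch shows how the documented data: {<json event>} lines could be consumed. It is illustrative only: the function names are hypothetical, the sample events are hand-written rather than captured from the service, and the sketch assumes that each event carries only the newly generated text, so the chunks are concatenated.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON object from each 'data: {...}' line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_text(lines):
    """Concatenate generated_text across events.

    Returns (text, stop_reason), where stop_reason is the lower-cased
    value from the last result seen, or None if there was none.
    """
    parts, stop_reason = [], None
    for event in iter_sse_events(lines):
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
            stop_reason = result.get("stop_reason", stop_reason)
    return "".join(parts), (stop_reason.lower() if stop_reason else None)


# Hand-written sample events for illustration, not real service output.
sample_stream = [
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": "4,000", "stop_reason": "not_finished"}]}',
    "",
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": " km", "stop_reason": "eos_token"}]}',
    "",
]
print(collect_text(sample_stream))
```

In a real client the lines would come from the HTTP response body, read incrementally rather than from a list.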
POST /ml/v1/text/rerank
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The input texts and the queries for reranking.
A sample request.
{
"model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
{
"text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
},
{
"text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
}
],
"query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
"parameters": {
"return_options": {
"top_n": 2
}
}
}
The id of the model to be used for this request. Please refer to the list of models.
The rank input strings.
Possible values: 0 ≤ number of items ≤ 1000
The rank query.
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The properties used for reranking.
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen keys management service, before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "inputs": [ { "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I'\''ve come to appreciate the comforting stability of a well-established routine." }, { "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life'\''s novelties, while as a responsible adult, I'\''ve come to understand the profound value of accumulated wisdom and life experience." } ], "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.", "parameters": { "return_options": { "top_n": 2 } } }'
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The ranked results.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The number of input tokens that were consumed.
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression ^\d+.\d+.\d+$
The rank query, if requested.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The ranked results for the input strings.
{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "results": [ { "index": 1, "score": 0.7461 }, { "index": 0, "score": 0.8274 } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 20 }
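The sample response above lists results by index and score, and the entries are not in descending score order. The following Python sketch maps each result back to its input text and sorts by score. It is a hedged illustration: the helper name rank_inputs is hypothetical, the input texts are placeholders, and it assumes that index is the zero-based position of the text in the request's inputs array.

```python
def rank_inputs(inputs, response):
    """Return (score, text) pairs ordered from highest to lowest score.

    Assumes result["index"] is the zero-based position in `inputs`.
    """
    ranked = sorted(response["results"], key=lambda r: r["score"], reverse=True)
    return [(r["score"], inputs[r["index"]]["text"]) for r in ranked]


# Placeholder texts standing in for the two passages of the sample request.
inputs = [{"text": "first passage"}, {"text": "second passage"}]
response = {
    "model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
    "results": [{"index": 1, "score": 0.7461}, {"index": 0, "score": 0.8274}],
    "created_at": "2024-02-21T17:32:28Z",
    "input_token_count": 20,
}
print(rank_inputs(inputs, response))
```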
Text tokenization
The text tokenize operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which then are converted to ids through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.
POST /ml/v1/text/tokenization
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The input string to tokenize.
A sample request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"model_id": "google/flan-ul2",
"input": "Write a tagline for an alumni association: Together we",
"parameters": {
"return_tokens": true
}
}
The id of the model to be used for this request. Please refer to the list of models.
Example: google/flan-ul2
The input string to tokenize.
Example: Write a tagline for an alumni association: Together we
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The parameters for text tokenization.
Information required to retrieve a Data-Encryption-Key (DEK) that is stored in an external keys management service. The retrieved DEK is used to encrypt and decrypt the inference request during the inference process.
Setup Instructions
To enable encryption, you must first configure credentials for your chosen keys management service, before making inference calls that require encryption.
For IBM Key Protect
Step 1: Generate Wrapped DEK
Create your Key Protect Instance using IBM Key Protect and generate a wrapped key:
- Navigate to your root key in the Key Protect dashboard
- Click the three-dot menu → Select Envelope encryption
- Select Wrap key for me
- Click Wrap key
- Copy the displayed Ciphertext value (base64-encoded wrapped key)
Alternatively, follow Wrap a Data Encryption Key to call the API endpoint.
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
crn:v1:bluemix:public:kms:<region>:a/<account-id>:<service-instance>:key:<key-id>:wdek:<ciphertext>
Step 3: Create a task_credentials object with the type key_manager_api_key by making a curl call such as the following:
curl --request POST \
  --url <DATAPLATFORM_ENDPOINT>/v1/task_credentials \
  --header 'authorization: Bearer <TOKEN>' \
  --header 'content-type: application/json' \
  --data '{ "name": "task credentials", "description": "This is my task credentials", "type": "key_manager_api_key" }'
Replace name and description fields accordingly.
For AWS KMS
Step 1: Generate Encrypted DEK (CiphertextBlob)
Use the AWS CLI to generate a wrapped DEK:
aws kms generate-data-key \
  --key-id <KMS_KEY_ID> \
  --key-spec AES_256 \
  --output text \
  --query CiphertextBlob
Important Notes:
- Save CiphertextBlob value securely
- Do NOT decode the base64 output (keep it encoded for key_ref)
Step 2: Format Key Reference (key_ref)
Construct the key_ref parameter using the format below:
arn:aws:kms:<region>:<account_id>:key/<key_id>:edek:<ciphertextblob>
Step 3: Follow the official guide to configure Account Delegation.
This grants IBM permission to use temporary credentials to call AWS KMS on your behalf.
Examples: { "key_ref": "crn:v1:bluemix:public:kms:us-south:a/12345:b/67890::key:abcd-1234-ef56-7890", "algorithm": "AES-256" }
curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-ul2", "input": "Write a tagline for an alumni association: Together we", "parameters": { "return_tokens": true }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
The tokenization result.
The id of the model to be used for this request. Please refer to the list of models.
Example: google/flan-ul2
The result of tokenizing the input string.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The response with the token count and the tokens, if requested.
{ "model_id": "google/flan-ul2", "result": { "token_count": 11, "tokens": [ "Write", "a", "tag", "line", "for", "an", "alumni", "associ", "ation:", "Together", "we" ] } }
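One common use of the token count is to check a prompt against a token budget before sending it for inference. The sketch below reads token_count from the sample response above; the helper name token_budget and the budget of 30 tokens are hypothetical values chosen for illustration, not limits defined by the API.

```python
def token_budget(response, budget):
    """Return (token_count, remaining) for a tokenization response."""
    count = response["result"]["token_count"]
    return count, budget - count


# The sample tokenization response from this section.
sample = {
    "model_id": "google/flan-ul2",
    "result": {
        "token_count": 11,
        "tokens": ["Write", "a", "tag", "line", "for", "an", "alumni",
                   "associ", "ation:", "Together", "we"],
    },
}
count, remaining = token_budget(sample, budget=30)  # 30 is an arbitrary example budget
print(count, remaining)
```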
Time series forecast
Generate forecasts, or predictions for future time points, given historical time series data.
POST /ml/v1/time_series/forecast
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The forecast request.
A sample request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"model_id": "ibm/ttm-1024-96-r2",
"schema": {
"timestamp_column": "date",
"id_columns": [
"ID1"
]
},
"data": {
"date": [
"2020-01-01T00:00:00",
"2020-01-01T01:00:00",
"2020-01-05T01:00:00"
],
"ID1": [
"D1",
"D1",
"D1"
],
"TARGET1": [
1.46,
2.34,
4.55
]
}
}
The model to be used for generating a forecast. You can get the list of models by using Foundation Model Specs with filters=function_time_series_forecast.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression ^\S+$
A payload of data matching schema. We assume the following about your data:
- All timeseries are of equal length and are uniform in nature (the time difference between two successive rows is constant). This implies that there are no missing rows of data;
- The data meet the minimum model-dependent historical context length, which can be 512 or more rows per timeseries.
Note that the example payloads shown are for illustration purposes only. An actual payload would necessarily be much larger to meet minimum model-specific context lengths.
- data
Contains metadata about your timeseries data input.
The project id of the resource.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The parameters for the forecast request.
Response
The time series forecast response.
The model used to generate the forecast.
Example: ibm/ttm-1024-96-r2
The time when the response was created in ISO 8601 format.
Example: 2020-05-02T16:27:51Z
The list of prediction results. There will be a forecast for each time series in the input data. The prediction_length field in the request specifies the number of predictions in the results. The actual number of rows in the results will be equal to the prediction_length multiplied by the number of unique ids in id_columns. The timestamp_column field in the request indicates the name of the timestamp column in the results.
Examples: [ { "date": [ "2020-01-01T03:00:00", "2020-01-01T04:00:00", "2020-01-01T05:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.78, 6.78 ] } ]
- results
The number of input data points (number of rows in data * number of input columns in data).
Example: 512
The number of forecasted data points (prediction_length * number of target_columns * number of unique ids in id_columns).
Example: 1024
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "model_id": "ibm/ttm-1024-96-r2", "created_at": "2020-05-02T16:27:51Z", "results": [ { "date": [ "2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.24, 6.78 ] } ], "input_data_points": 512, "output_data_points": 1024 }
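The request description states that every time series must have the same length and evenly spaced timestamps, and the response description gives a formula for the number of forecasted data points. The Python sketch below checks the first and evaluates the second. It is an illustration under assumptions: both function names are hypothetical, timestamps are assumed to be ISO 8601 strings that datetime.fromisoformat accepts, and the minimum context length is not checked because it depends on the model.

```python
from collections import defaultdict
from datetime import datetime


def check_uniform_series(data, timestamp_column, id_columns=()):
    """Validate a column-oriented payload; return the number of series.

    Raises ValueError if columns differ in length, if a series has
    unevenly spaced timestamps, or if series differ in length.
    """
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"columns differ in length: {lengths}")

    # Group timestamps by the combination of id column values.
    series = defaultdict(list)
    for row, stamp in enumerate(data[timestamp_column]):
        key = tuple(data[col][row] for col in id_columns)
        series[key].append(datetime.fromisoformat(stamp))
    if not series:
        raise ValueError("no rows in data")

    for key, stamps in series.items():
        steps = {later - earlier for earlier, later in zip(stamps, stamps[1:])}
        if len(steps) > 1:
            raise ValueError(f"series {key} is not evenly spaced")
    if len({len(stamps) for stamps in series.values()}) != 1:
        raise ValueError("series differ in length")
    return len(series)


def expected_output_points(prediction_length, target_columns, unique_ids):
    """prediction_length * number of target_columns * number of unique ids."""
    return prediction_length * target_columns * unique_ids


# Illustrative hourly data; a real payload needs far more rows.
data = {
    "date": ["2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-01T02:00:00"],
    "ID1": ["D1", "D1", "D1"],
    "TARGET1": [1.46, 2.34, 4.55],
}
n_series = check_uniform_series(data, "date", ["ID1"])
print(n_series, expected_output_points(96, 1, n_series))
```

The prediction length of 96 in the usage line is an arbitrary example value.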
Create a new watsonx.ai training
Create a new watsonx.ai training in a project or a space.
To deploy the tuned model, follow these steps:
- Create a WML model asset, in a space or a project, by providing the request.json as shown below:
curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{ "name": "replace_with_a_meaningful_name", "space_id": "replace_with_your_space_id", "type": "prompt_tune_1.0", "software_spec": { "name": "watsonx-textgen-fm-1.0" }, "metrics": [ from the training job ], "training": { "id": "05859469-b25b-420e-aefe-4a5cb6b595eb", "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "generation", "verbalizer": "Input: {{input}} Output:" }, "training_data_references": [ { "connection": { "id": "20933468-7e8a-4706-bc90-f0a09332b263" }, "id": "file_to_tune1.json", "location": { "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl", "path": "file_to_tune1.json" }, "type": "connection_asset" } ] }'
Notes:
  - If you used the training request field auto_update_model: true then you can skip this step, because the model is saved at the end of the training job.
  - Rather than creating the payload for the model, you can use the generated request.json that was stored in the results_reference field. Look for the path in the field entity.results_reference.location.model_request_path.
  - The model type must be prompt_tune_1.0.
  - The software spec name must be watsonx-textgen-fm-1.0.
- Create a tuned model deployment as described in the create deployment documentation.
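The notes in the first step can be expressed as small checks that run before any request is sent. This is a sketch under stated assumptions: the helper names are hypothetical, and the training resource is assumed to be a plain dictionary in which results_reference sits under entity, as the path entity.results_reference.location.model_request_path suggests.

```python
REQUIRED_TYPE = "prompt_tune_1.0"
REQUIRED_SOFTWARE_SPEC = "watsonx-textgen-fm-1.0"


def needs_manual_model_asset(training_request):
    """The model asset step can be skipped when auto_update_model was true."""
    return not training_request.get("auto_update_model", False)


def model_request_path(training):
    """Return the path of the generated request.json, or None if it is absent."""
    location = (
        training.get("entity", {})
        .get("results_reference", {})
        .get("location", {})
    )
    return location.get("model_request_path")


def check_model_payload(payload):
    """Return a list of problems with a model asset payload (empty if none)."""
    problems = []
    if payload.get("type") != REQUIRED_TYPE:
        problems.append(f"type must be {REQUIRED_TYPE}")
    if payload.get("software_spec", {}).get("name") != REQUIRED_SOFTWARE_SPEC:
        problems.append(f"software_spec.name must be {REQUIRED_SOFTWARE_SPEC}")
    if not (payload.get("space_id") or payload.get("project_id")):
        problems.append("a space_id or a project_id is required")
    return problems
```

Running check_model_payload on the generated request.json before posting it to /ml/v4/models is one way to catch a wrong type or software spec early.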
POST /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The training_data_references contain the training datasets and the
results_reference the connection where results will be stored.
{
"name": "my-prompt-tune-training",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"prompt_tuning": {
"base_model": {
"model_id": "google/flan-t5-xl"
},
"tuning_type": "prompt_tuning",
"num_epochs": 30,
"learning_rate": 0.4,
"accumulate_steps": 3,
"batch_size": 10,
"max_input_tokens": 100,
"max_output_tokens": 100,
"task_id": "classification"
},
"training_data_references": [
{
"id": "tune1_data.json",
"location": {
"path": "tune1_data.json"
},
"type": "container"
}
],
"auto_update_model": true,
"results_reference": {
"location": {
"path": "tune1/results"
},
"type": "container"
}
}
The name of the training.
Example: my-prompt-training
The training results. Normally this is specified as type=container (Service) or type=fs (Software) which means that it is stored in the space or project.
Examples: { "location": { "path": "results" }, "type": "container" }
- results_reference
The data source type like connection_asset, container (Service) or fs (Software).
Allowable values: [connection_asset, container, fs]
Example: connection_asset
Contains a set of fields that describe the location of the data with respect to the connection.
- location
Item identification inside a collection.
Contains a set of fields specific to each connection. See here for details about specifying connections.
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
A description of the training.
Example: My prompt training.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
Training datasets.
Examples: [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ]
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
If set to true then the result of the training, if successful, will be uploaded to the repository as a model.
Default: false
curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "name": "my-prompt-tune-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification", "tuning_type": "prompt_tuning", "num_epochs": 30, "learning_rate": 0.4, "accumulate_steps": 3, "batch_size": 10, "max_input_tokens": 100, "max_output_tokens": 100 }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results" }, "type": "container" } }'
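The same request can be prepared from Python with only the standard library. This is a minimal sketch, not the official Python client: build_training_request is a hypothetical helper, the host name in the usage lines is a placeholder, and the request object is only constructed here, not sent. The check on space_id and project_id reflects the rule that one of them has to be given.

```python
import json
import urllib.request


def build_training_request(cluster_url, token, body, version="2023-05-02"):
    """Build (but do not send) a POST /ml/v4/trainings request."""
    if not (body.get("project_id") or body.get("space_id")):
        raise ValueError("either space_id or project_id has to be given")
    url = f"https://{cluster_url}/ml/v4/trainings?version={version}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


body = {
    "name": "my-prompt-tune-training",
    "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    "auto_update_model": True,
}
request = build_training_request("cluster.example.com", "my-token", body)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it. On clusters with a self-signed certificate, the SSL considerations described earlier in this reference apply.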
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
Status of the training job.
Examples:{ "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
The training job has been created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Retrieve the list of trainings
Retrieve the list of trainings for the specified space or project.
GET /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
Compute the total count. May have performance impact.
Return only the resources with the given tag value.
Filter based on the training job state.
Allowable values: [queued, pending, running, storing, completed, failed, canceled]
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
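The constraints on these query parameters can be enforced on the client before the request is made. The sketch below is illustrative only: trainings_list_url is a hypothetical helper, and the parameter names limit and state are assumptions, since this section describes those parameters without naming them (version, total_count, space_id and project_id do appear in this reference).

```python
from urllib.parse import urlencode

ALLOWED_STATES = {
    "queued", "pending", "running", "storing", "completed", "failed", "canceled",
}


def trainings_list_url(cluster_url, version, space_id=None, project_id=None,
                       limit=100, state=None, total_count=False):
    """Build the URL for GET /ml/v4/trainings after validating the inputs."""
    if not (space_id or project_id):
        raise ValueError("either space_id or project_id has to be given")
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if state is not None and state not in ALLOWED_STATES:
        raise ValueError(f"unknown state: {state}")

    params = {"version": version, "limit": limit}  # "limit" is an assumed name
    if space_id:
        params["space_id"] = space_id
    if project_id:
        params["project_id"] = project_id
    if state:
        params["state"] = state  # "state" is an assumed name
    if total_count:
        params["total_count"] = "true"
    return f"https://{cluster_url}/ml/v4/trainings?{urlencode(params)}"


print(trainings_list_url("cluster.example.com", "2023-07-07",
                         project_id="a77190a2-f52d-4f2a-be3d-7867b5f46edc",
                         limit=50, state="completed"))
```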
Response
Information for paging when querying resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when total_count=true query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
The training resources.
Optional details coming from the service and related to the API call or the associated resource.
Examples: { "warnings": [ { "message": "This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations.", "id": "DisclaimerWarning" } ] }
- system
Any warnings coming from the system.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
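Because the pagination token is generated by the service and delivered through the href of the next field, a client typically follows that href until no next page remains. The following sketch shows that loop with the HTTP call injected as a function, so that it runs without a cluster. The response field name resources and the exact shape {"next": {"href": ...}} are assumptions; this section describes those fields but does not show a sample response.

```python
def iter_trainings(first_url, fetch_page):
    """Yield every training resource by following next.href page by page.

    fetch_page is any callable that takes a URL and returns the decoded
    JSON body as a dict (for example, a thin wrapper around an HTTP client).
    """
    url = first_url
    while url:
        page = fetch_page(url)
        # "resources" is an assumed field name for the list of trainings.
        yield from page.get("resources", [])
        # Stop when the service no longer returns a next reference.
        url = (page.get("next") or {}).get("href")


# Usage with canned pages instead of a live service.
pages = {
    "page-1": {"resources": [{"id": 1}, {"id": 2}], "next": {"href": "page-2"}},
    "page-2": {"resources": [{"id": 3}]},
}
print([t["id"] for t in iter_trainings("page-1", pages.__getitem__)])
```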
Retrieve the training
Retrieve the training with the specified identifier.
GET /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "rev": "2", "owner": "guy", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" } }
Status of the training job.
Examples:{ "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
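A common use of this endpoint is to poll a training until it finishes. The sketch below is one possible approach, not a documented client: wait_for_training is a hypothetical helper, the retrieval call is injected as a function, the state is assumed to live at entity.status.state in the returned resource, and completed, failed and canceled are assumed to be the terminal states among the allowable values listed for the state filter.

```python
import time

# Assumption: these three states mean the job will not change any further.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(fetch_training, interval_seconds=30.0, max_polls=120,
                      sleep=time.sleep):
    """Poll until the training reaches a terminal state and return that state.

    fetch_training is a callable that performs
    GET /ml/v4/trainings/{training_id} and returns the decoded JSON body.
    """
    for _ in range(max_polls):
        training = fetch_training()
        state = training["entity"]["status"]["state"]  # assumed location
        if state in TERMINAL_STATES:
            return state
        sleep(interval_seconds)
    raise TimeoutError("training did not reach a terminal state in time")


# Usage with canned states instead of a live service.
states = iter(["queued", "running", "completed"])
final = wait_for_training(
    lambda: {"entity": {"status": {"state": next(states)}}},
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(final)
```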
Cancel or delete the training
Cancel or delete the specified training. Once deleted, all trace of the job is gone.
DELETE /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
Create a vector index
This creates a new vector index with the provided parameters.
POST /v1/vector_indexes
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
To determine if a new index was used.
Name of the database
Allowable values: [default, rag]
- settings
The number of text (or tokens) that are grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores. REQUIRED FOR EXTERNAL VECTOR STORES
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
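The numeric limits on the settings object lend themselves to a client-side check before a vector index is created. This is a sketch under assumptions: validate_settings is a hypothetical helper, and the field names chunk_size, chunk_overlap and top_k are taken from the sample response for this endpoint and matched to the three ranges above by their descriptions.

```python
# Documented ranges, keyed by the field names used in the sample response.
SETTING_RANGES = {
    "chunk_size": (100, 10000),   # text grouped together before vectorizing
    "chunk_overlap": (0, 2000),   # characters to overlap when chunking
    "top_k": (1, 10),             # most similar results to retrieve
}


def validate_settings(settings):
    """Return a list of range violations for a settings dict (empty if none)."""
    errors = []
    for key, (low, high) in SETTING_RANGES.items():
        if key in settings and not low <= settings[key] <= high:
            errors.append(f"{key}={settings[key]} is outside {low}..{high}")
    return errors


settings = {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "top_k": 3,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
}
print(validate_settings(settings))
```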
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
To determine if a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The number of text (or tokens) that are grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Create a vector index.
{ "id": "43499a2a-7656-43d6-8ce0-374d34449d4f", "name": "Milvus-VI-New", "description": "", "created_at": 1739888788777, "created_by": "IBMid-6910003SE8", "last_updated_at": 1739888804362, "last_updated_by": "IBMid-6910003SE8", "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ], "store": { "type": "watsonx.data", "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a", "index": "wx_test_collection_japanese", "new_index": true, "database": "default" }, "settings": { "chunk_size": 2000, "chunk_overlap": 200, "split_pdf_pages": true, "top_k": 3, "rerank": false, "embedding_model_id": "sentence-transformers/all-minilm-l6-v2", "schema_fields": { "document_name": "document_name", "text": "text", "page_number": "page" } }, "build": { "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c", "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35" }, "status": "ready" }
Get a vector index
This retrieves a vector index with the given id.
GET /v1/vector_indexes/{index_id}
Request
Path Parameters
Vector index ID
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
To determine if a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The number of text (or tokens) that are grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Get vector index.
{ "id": "43499a2a-7656-43d6-8ce0-374d34449d4f", "name": "Milvus-VI-New", "description": "", "created_at": 1739888788777, "created_by": "IBMid-6910003SE8", "last_updated_at": 1739888804362, "last_updated_by": "IBMid-6910003SE8", "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ], "store": { "type": "watsonx.data", "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a", "index": "wx_test_collection_japanese", "new_index": true, "database": "default" }, "settings": { "chunk_size": 2000, "chunk_overlap": 200, "split_pdf_pages": true, "top_k": 3, "rerank": false, "embedding_model_id": "sentence-transformers/all-minilm-l6-v2", "schema_fields": { "document_name": "document_name", "text": "text", "page_number": "page" } }, "build": { "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c", "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35" }, "status": "ready" }
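As noted under Endpoint URLs, vector index requests use the /wx base path rather than /ml/v1. The sketch below builds the full URL for this GET call and applies the documented ID pattern first. It is illustrative only: vector_index_url is a hypothetical helper, and the query parameter name project_id is an assumption, since this section describes the project ID target without naming the parameter.

```python
import re
from urllib.parse import urlencode

# Documented pattern for vector index and project IDs.
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def vector_index_url(cluster_url, index_id, project_id):
    """Build the URL for GET /v1/vector_indexes/{index_id} under the /wx base."""
    for label, value in (("index_id", index_id), ("project_id", project_id)):
        # Empty values pass the documented pattern but are useless here.
        if not value or not ID_PATTERN.fullmatch(value):
            raise ValueError(f"{label} must be non-empty and match [a-zA-Z0-9-]*")
    query = urlencode({"project_id": project_id})  # assumed parameter name
    return f"https://{cluster_url}/wx/v1/vector_indexes/{index_id}?{query}"


print(vector_index_url(
    "cluster.example.com",
    "43499a2a-7656-43d6-8ce0-374d34449d4f",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
))
```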
Update a vector index
This updates a vector index with the given id.
PATCH /v1/vector_indexes/{index_id}
Request
Path Parameters
Vector index ID
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
To determine if a new index was used.
Name of the database
Allowable values: [default, rag]
- settings
The number of text (or tokens) that are grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
To determine if a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The number of text (or tokens) that are grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Response with updated vector index.
{ "id": "43499a2a-7656-43d6-8ce0-374d34449d4f", "name": "Milvus-VI-Patched", "description": "", "created_at": 1739888788777, "created_by": "IBMid-6910003SE8", "last_updated_at": 1739888804362, "last_updated_by": "IBMid-6910003SE8", "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ], "store": { "type": "watsonx.data", "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a", "index": "wx_test_collection_japanese", "new_index": true, "database": "default" }, "settings": { "chunk_size": 2000, "chunk_overlap": 200, "split_pdf_pages": true, "top_k": 3, "rerank": false, "embedding_model_id": "sentence-transformers/all-minilm-l6-v2", "schema_fields": { "document_name": "document_name", "text": "text", "page_number": "page" } }, "build": { "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c", "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35" }, "status": "ready" }
Delete a vector index
This deletes a vector index with the given id.
DELETE /v1/vector_indexes/{index_id}
Vector Index attachments modifications
TO BE USED ONLY WITH IN-MEMORY VECTOR STORE. This updates the attachments/objects associated with the vector index.
PUT /v1/vector_indexes/{index_id}/attachment
Request
Path Parameters
Vector index ID
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
{
  "object_key": "vector_index/canadaConstitution__da__5bpykqpyzx.gz"
}
The object key of the gzipped file that is available in the attached COS bucket.
Possible values: Value must match regular expression vector_index/[a-zA-Z0-9-]__da__*.gz
Example: vector_index/myvectorindex__da__vi82w8pghq.gz
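As a rough sketch, the attachment request could be assembled in Python with only the standard library. The snippet builds the PUT call without sending it. It assumes that the /wx base URL described under Endpoint URLs applies to this endpoint and that the project is passed as a query parameter named project_id (the parameter name is not stated in this section). The host, IDs, and token are placeholders, and build_attachment_request is a hypothetical helper, not part of any IBM library.

```python
import json
import urllib.parse
import urllib.request


def build_attachment_request(cluster_url, index_id, project_id, object_key, token):
    """Build (but do not send) the PUT request that updates a vector index attachment."""
    # Assumption: vector index endpoints live under the /wx base URL and the
    # project is supplied as a `project_id` query parameter.
    query = urllib.parse.urlencode({"project_id": project_id})
    url = f"https://{cluster_url}/wx/v1/vector_indexes/{index_id}/attachment?{query}"
    body = json.dumps({"object_key": object_key}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
request = build_attachment_request(
    cluster_url="cluster.example.com",
    index_id="12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a",
    project_id="my-project-id",
    object_key="vector_index/myvectorindex__da__vi82w8pghq.gz",
    token="example-token",
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; with a self-signed certificate the caveats under Disabling SSL verification apply.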
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
OK - Returned when the attachment is successful.
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Create a vector index.
This creates a new vector index with the provided parameters.
POST /v1/transactional_vector_indexes
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Allowable values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Create a vector index.
{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-New",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
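The numeric limits listed for the settings fields can be checked on the client before a create request is sent. The following is a minimal sketch, assuming the request body uses the same field names that appear in the sample response (chunk_size, chunk_overlap, top_k, and so on); validate_settings and build_create_payload are hypothetical helpers written for this illustration.

```python
import json

# Ranges taken from the settings description for this endpoint.
# Assumption: the request fields are named as in the sample response.
SETTING_RANGES = {
    "chunk_size": (100, 10000),
    "chunk_overlap": (0, 2000),
    "top_k": (1, 10),
}


def validate_settings(settings):
    """Return a list of range violations; an empty list means all checked fields are in range."""
    problems = []
    for field, (low, high) in SETTING_RANGES.items():
        if field in settings and not low <= settings[field] <= high:
            problems.append(f"{field}={settings[field]} is outside {low}..{high}")
    return problems


def build_create_payload(name, data_assets, store, settings):
    """Assemble a body for POST /v1/transactional_vector_indexes, rejecting out-of-range settings."""
    problems = validate_settings(settings)
    if problems:
        raise ValueError("; ".join(problems))
    return {"name": name, "data_assets": data_assets, "store": store, "settings": settings}


payload = build_create_payload(
    name="My vector index",
    data_assets=["713259df-0540-4301-982e-693a81da462c"],
    store={"type": "watsonx.data", "index": "wx_my_sample_collection", "database": "default"},
    settings={"chunk_size": 2000, "chunk_overlap": 200, "top_k": 3},
)
print(json.dumps(payload, indent=2))
```

Checking locally only saves a round trip; the service remains the authority on what it accepts and would return Bad Request for invalid parameters.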
Update a vector index
This updates a vector index with the given id.
PATCH /v1/transactional_vector_indexes/{index_id}
Request
Path Parameters
Vector index ID
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Allowable values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Response
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Response with updated vector index.
{
  "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
  "name": "Milvus-VI-Patched",
  "description": "",
  "created_at": 1739888788777,
  "created_by": "IBMid-6910003SE8",
  "last_updated_at": 1739888804362,
  "last_updated_by": "IBMid-6910003SE8",
  "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ],
  "store": {
    "type": "watsonx.data",
    "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
    "index": "wx_test_collection_japanese",
    "new_index": true,
    "database": "default"
  },
  "settings": {
    "chunk_size": 2000,
    "chunk_overlap": 200,
    "split_pdf_pages": true,
    "top_k": 3,
    "rerank": false,
    "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
    "schema_fields": {
      "document_name": "document_name",
      "text": "text",
      "page_number": "page"
    }
  },
  "build": {
    "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
    "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
  },
  "status": "ready"
}
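The created_at and last_updated_at values in the sample response look like Unix epoch timestamps in milliseconds, although this section does not say so explicitly. Under that assumption, a short Python sketch can turn them into readable UTC times; epoch_millis_to_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(millis):
    """Convert an integer millisecond timestamp to a timezone-aware UTC datetime."""
    # Assumption: the API reports times as milliseconds since the Unix epoch.
    # Adding a timedelta keeps the arithmetic in integers, so no float rounding.
    return EPOCH + timedelta(milliseconds=millis)


# Values copied from the sample response above.
sample = {"created_at": 1739888788777, "last_updated_at": 1739888804362}

created = epoch_millis_to_utc(sample["created_at"])
updated = epoch_millis_to_utc(sample["last_updated_at"])
print("created:", created.isoformat())
print("updated:", updated.isoformat())
print("seconds between:", (updated - created).total_seconds())
```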
Get status of a vector index
This retrieves the status of a vector index with the given id.
GET /v1/transactional_vector_indexes/{index_id}/status
Request
Path Parameters
Vector index ID
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Response
Current status of vector index
Possible values: [STARTING, PENDING, READY, RUNNING, COMPLETED, FAILED]
The vector asset data
- asset
The vector index's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
Name used to display the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My vector index
An optional description for the vector index asset.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: My first vector index
Time the vector index was created.
Example: 1711504485261
The ID of the original vector index creator.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
Time the vector index was updated.
Example: 1711504485261
The ID of the last user that modified the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: IBMid-000000YYY0
The IDs of the associated data assets used in the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "713259df-0540-4301-982e-693a81da462c", "713259df-0540-4301-982e-693a81da573d" ]
External vector store. (elasticsearch or watsonx.data)
- store
The type of the vector store
Example: watsonx.data
The ID of the external store connection.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 1c29d9a1-9ba6-422d-aa39-517b26adc147
The name of the index in the vector store
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: wx_my_sample_collection
Indicates whether a new index was used.
Name of the database
Possible values: [default, rag]
- settings
The amount of text (or number of tokens) that is grouped together before converting into a vector.
Possible values: 100 ≤ value ≤ 10000
The number of characters to overlap for chunking data
Possible values: 0 ≤ value ≤ 2000
Number of most similar results to retrieve (lower values lead to greater similarity between the question and answer)
Possible values: 1 ≤ value ≤ 10
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: sentence-transformers/all-minilm-l6-v2
- schema_fields
Field to use for finding the document name.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: document_name
Field to use for the text in the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: text
Field to use for finding the document page number.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: page
The vector_query_field that contains the text_expansion query. Only applicable for Elasticsearch
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: vector
The optional field that contains a URL for the document.
Possible values: Value must match regular expression [a-zA-Z0-9-//]*
Example: documentURL
The associated build to process the data for external vector stores
- build
The ID of the associated notebook.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The ID of the associated job run.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: 12c45f78-a2b4-1b3d-aa2c-zy09y87w6a3a
The status of the vector index.
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Example: ready
Tags attached to the asset
Possible values: Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "new vector index 1, document for cars" ]
Frequently asked questions
Possible values: number of items ≤ 6, Value must match regular expression [a-zA-Z0-9-]*
Examples: [ "summarize the document, name of the author" ]
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
Get vector index.
{
  "status": "COMPLETED",
  "asset": {
    "id": "43499a2a-7656-43d6-8ce0-374d34449d4f",
    "name": "Milvus-VI-New",
    "description": "",
    "created_at": 1739888788777,
    "created_by": "IBMid-6910003SE8",
    "last_updated_at": 1739888804362,
    "last_updated_by": "IBMid-6910003SE8",
    "data_assets": [ "9624a20d-ecd0-450e-b7d2-9941ce7d1c57" ],
    "store": {
      "type": "watsonx.data",
      "connection_id": "9bdb5dbe-d896-4539-b969-5bf17fcc1f0a",
      "index": "wx_test_collection_japanese",
      "new_index": true,
      "database": "default"
    },
    "settings": {
      "chunk_size": 2000,
      "chunk_overlap": 200,
      "split_pdf_pages": true,
      "top_k": 3,
      "rerank": false,
      "embedding_model_id": "sentence-transformers/all-minilm-l6-v2",
      "schema_fields": {
        "document_name": "document_name",
        "text": "text",
        "page_number": "page"
      }
    },
    "build": {
      "notebook_id": "53a61d7d-7d09-4392-9d5d-eb90e46aee1c",
      "job_id": "843e36f9-3550-4b0c-8755-4976e41bce35"
    },
    "status": "ready"
  }
}
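A client that polls this endpoint needs to decide when to stop. The sketch below parses a status response and reports whether the index has finished. Which statuses count as final is an assumption made here (READY, COMPLETED, and FAILED); the reference lists the possible values but does not describe their lifecycle. summarize_status is a hypothetical helper.

```python
import json

# Status values listed for this endpoint.
KNOWN_STATUSES = {"STARTING", "PENDING", "READY", "RUNNING", "COMPLETED", "FAILED"}
# Assumption: these states mean no further progress is expected.
FINAL_STATUSES = {"READY", "COMPLETED", "FAILED"}


def summarize_status(response_text):
    """Parse a status response body and report whether polling can stop."""
    payload = json.loads(response_text)
    status = payload["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    asset = payload.get("asset", {})
    return {
        "status": status,
        "done": status in FINAL_STATUSES,
        "failed": status == "FAILED",
        "index_id": asset.get("id"),
    }


# Trimmed version of the sample response above.
sample_body = '{"status": "COMPLETED", "asset": {"id": "43499a2a-7656-43d6-8ce0-374d34449d4f"}}'
print(summarize_status(sample_body))
```

A polling loop would call the status endpoint, pass the body to this function, and sleep between attempts until done is true.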