Introduction to IBM watsonx.ai as a Service
Using the IBM watsonx.ai as a Service APIs, you can run text inference, prompt tuning, and more on Large Language Models (LLMs).
If you are looking for the IBM watsonx.ai software APIs, see here.
Step-by-step instructions on how to use IBM watsonx.ai as a Service can be found here.
A specialized Python library is available to access this REST API.
Endpoint URLs
The following URLs are the base URLs for the watsonx.ai API endpoints. When you call the API, take the base URL for your region and append the path of the method to form the complete API endpoint for your request.
- Dallas: https://us-south.ml.cloud.ibm.com
- Frankfurt: https://eu-de.ml.cloud.ibm.com
- London: https://eu-gb.ml.cloud.ibm.com
- Tokyo: https://jp-tok.ml.cloud.ibm.com
- Sydney: https://au-syd.ml.cloud.ibm.com
Note that for prompts and notebooks the base URLs are the following:
- Dallas: https://api.dataplatform.cloud.ibm.com/wx
- Frankfurt: https://api.eu-de.dataplatform.cloud.ibm.com/wx
- London: https://api.eu-gb.dataplatform.cloud.ibm.com/wx
- Tokyo: https://api.jp-tok.dataplatform.cloud.ibm.com/wx
- Sydney: https://api.au-syd.dai.cloud.ibm.com/wx
Example request to a Dallas endpoint:
curl -H "Authorization: Bearer {token}" -X {request_method} "https://us-south.ml.cloud.ibm.com/{method_endpoint}"
Replace {request_method} and {method_endpoint} in this example with the values for your particular API call. See the Authentication section below for more details about the bearer {token}.
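As a minimal sketch of how a base URL and a method path combine, the Python helper below joins one of the regional base URLs listed above with a path. The function name `build_endpoint` and the lowercase region keys are illustrative choices, not part of the API.

```python
# Regional base URLs for the watsonx.ai API, copied from the list above.
BASE_URLS = {
    "dallas": "https://us-south.ml.cloud.ibm.com",
    "frankfurt": "https://eu-de.ml.cloud.ibm.com",
    "london": "https://eu-gb.ml.cloud.ibm.com",
    "tokyo": "https://jp-tok.ml.cloud.ibm.com",
    "sydney": "https://au-syd.ml.cloud.ibm.com",
}


def build_endpoint(region: str, method_path: str) -> str:
    """Join a regional base URL with a method path such as /ml/v4/models."""
    base = BASE_URLS[region.lower()]
    # Normalize slashes so exactly one separates the base URL and the path.
    return base.rstrip("/") + "/" + method_path.lstrip("/")


if __name__ == "__main__":
    print(build_endpoint("Dallas", "/ml/v4/models"))
```

An unknown region raises `KeyError`, which keeps a typo from silently producing a wrong host.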
Authentication
This API uses IBM Cloud Identity and Access Management (IAM) to authenticate requests.
To work with the API, authenticate your application or service by including your IBM Cloud IAM access token in API requests.
IAM authentication. Replace {token} and {url}/{method} with your service credentials, and {request_method} with the HTTP method of the call.
curl -H "Authorization: Bearer {token}" -X {request_method} "{url}/{method}"
For example, if the token is tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5 in the service credentials, include the credentials in your call like this:
curl -H "Authorization: Bearer tzLbqWhyALQawBg5TjRIf5sAznhrKQyvBFFaZbtF60m5" -X GET "https://us-south.ml.cloud.ibm.com/ml/v4/models"
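The same header can be set from Python. The sketch below only builds the request object with the standard library and does not send it; the helper name `authorized_request` is hypothetical, and a real IAM access token is needed before the request would succeed.

```python
import urllib.request


def authorized_request(url: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request that carries the IAM bearer token."""
    request = urllib.request.Request(url, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    return request


if __name__ == "__main__":
    # "{token}" is a placeholder, exactly as in the curl example above.
    req = authorized_request("https://us-south.ml.cloud.ibm.com/ml/v4/models", "{token}")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would perform the call.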
Error handling
This API uses standard HTTP response codes to indicate whether a method completed successfully.
A 200 type response indicates success.
HTTP Code | Description | Recovery |
---|---|---|
200 | Success | The request was successful. |
400 | Bad Request | The input parameters in the request body are either incomplete, or in the wrong format, or some other input validation failed. Be sure to include all required parameters in your request and check the request body. |
401 | Unauthorized | You are not authorized to make this request. Log in and try again or provide a valid token. See Authenticating with IAM tokens for instructions on logging in. If this error persists, contact the account owner to check your permissions. |
403 | Forbidden | The supplied authentication is not authorized. |
404 | Not Found | The requested resource could not be found. |
Note that 429 and 503 errors may mean that the model is overloaded or unavailable. Check the error description for more details.
Error response
Name | Description |
---|---|
trace | An identifier that can be used to trace the request. This can be set using X-Global-Transaction-Id. |
errors | The list of errors. |
Errors
Name | Description |
---|---|
code | A simple string code that should convey the general sense of the error. |
message | The message that describes the error. |
more_info | A reference to a more detailed explanation when available. |
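To illustrate how the error fields above fit together, the sketch below turns an error response body into one readable line per error. It assumes the body is JSON with the `trace` and `errors` fields from the tables; the function name and the sample body are made up for illustration.

```python
import json


def summarize_error(body: str) -> str:
    """Turn an error response body into one readable line per error."""
    payload = json.loads(body)
    trace = payload.get("trace", "<no trace>")
    lines = []
    for err in payload.get("errors", []):
        line = f"[{trace}] {err.get('code')}: {err.get('message')}"
        # more_info is only present when a detailed explanation is available.
        if err.get("more_info"):
            line += f" (see {err['more_info']})"
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical error body, not captured from the service.
    sample = json.dumps({
        "trace": "abc123",
        "errors": [{"code": "invalid_request", "message": "Missing space_id."}],
    })
    print(summarize_error(sample))
```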
Additional headers
Some additional headers might be required to make successful requests to the API. Those additional headers are described below.
An optional transaction ID can be passed with your request, which can be useful for tracking calls through multiple services using one identifier. The header key must be set to X-Global-Transaction-Id and the value is anything that you choose.
If no transaction ID is passed in, one is generated randomly.
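A small sketch of setting this header on the client side follows. The helper name is hypothetical, and using a random UUID as the value is just one convenient choice, since the value is free-form.

```python
import uuid
from typing import Dict, Optional


def with_transaction_id(headers: Dict[str, str],
                        transaction_id: Optional[str] = None) -> Dict[str, str]:
    """Return a copy of headers with X-Global-Transaction-Id set."""
    result = dict(headers)
    # Fall back to a random identifier when the caller does not supply one.
    result["X-Global-Transaction-Id"] = transaction_id or uuid.uuid4().hex
    return result


if __name__ == "__main__":
    print(with_transaction_id({"Accept": "application/json"}, "my-trace-id"))
```

Reusing one identifier across several related calls makes it easier to correlate them later through the `trace` field of an error response.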
API change log
In this change log you can learn about the latest changes, improvements, and updates for the watsonx.ai API.
The change log lists changes that have been made, ordered by the date they were released.
Changes to existing API versions are designed to be compatible with existing client applications. If this is not the case, a new version date will be created.
18 April 2024
The /ml/v1/text/embeddings API was added to watsonx.ai. This is a non-breaking change that adds this single API operation.
Versioning
API requests require a version parameter that takes the date in the format version=YYYY-MM-DD. Send the version parameter with every API request.
When the API is changed in a way that is not compatible with previous versions, a new minor version is released. To take advantage of the changes in a new version, change the value of the version parameter to the new date. If you're not ready to update to that version, don't change your version date.
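One way to make sure every request carries the parameter is to add it in a single helper. The sketch below is an illustration under that assumption; `add_version` is a hypothetical name, and the format check simply mirrors the YYYY-MM-DD rule stated above.

```python
import re
from datetime import date
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_version(url: str, version: str) -> str:
    """Return url with version=YYYY-MM-DD set, keeping other query parameters."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", version):
        raise ValueError(f"version must look like YYYY-MM-DD, got {version!r}")
    date.fromisoformat(version)  # also rejects impossible dates such as 2023-02-30
    parts = urlsplit(url)
    # Drop any existing version parameter so the new date replaces it.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "version"]
    query.append(("version", version))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(add_version("https://us-south.ml.cloud.ibm.com/ml/v4/ai_services", "2023-07-07"))
```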
Data References
Accessing data in a remote location (such as a Cloud Object Storage bucket, or an SQL/no-SQL database) requires the use of connection_asset or data_asset reference types.
These reference types are created within a space or a project and are referenced in WML requests to represent input data and results locations. These types contain two parameter objects, connection and location, which require different values to be supplied based on the reference type. Using a data_asset requires an href to be supplied to the location object, whereas using a connection_asset requires the connection_id for the connection object and different location fields depending on the data source type.
Example connection_asset payload:
{
  "training_data_references": [
    {
      "type": "connection_asset",
      "connection": {
        "id": "<connection_guid>"
      },
      "location": {
        "<wdp-properties depending on the type>": "<value depending on the type>"
      }
    }
  ]
}
Example data_asset payload:
{
  "training_data_references": [
    {
      "type": "data_asset",
      "location": {
        "href": "/v2/assets/<asset_id>?space_id=<space_id>"
      }
    }
  ]
}
Example container payload:
{
  "training_data_references": [
    {
      "type": "container",
      "location": {
        "path": "filename_in_project_or_space"
      }
    }
  ]
}
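The three reference shapes above can be produced by small builder functions, sketched here in Python. The function names are hypothetical, and the `bucket` and `file_name` location keys in the demo are placeholders only, since the real keys depend on the data source type.

```python
import json
from typing import Any, Dict


def connection_asset_ref(connection_id: str, location: Dict[str, Any]) -> Dict[str, Any]:
    """Reference data through a connection; location fields depend on the source type."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": location,
    }


def data_asset_ref(asset_id: str, space_id: str) -> Dict[str, Any]:
    """Reference a data asset in a space through its href."""
    return {
        "type": "data_asset",
        "location": {"href": f"/v2/assets/{asset_id}?space_id={space_id}"},
    }


def container_ref(path: str) -> Dict[str, Any]:
    """Reference a file stored in the project or space itself."""
    return {"type": "container", "location": {"path": path}}


if __name__ == "__main__":
    refs = [
        # Placeholder location keys; they are assumptions, not documented names.
        connection_asset_ref("<connection_guid>", {"bucket": "b", "file_name": "train.json"}),
        data_asset_ref("<asset_id>", "<space_id>"),
        container_ref("filename_in_project_or_space"),
    ]
    print(json.dumps({"training_data_references": refs}, indent=2))
```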
Activity Tracker events
You can monitor API activity within your account by using the IBM Cloud Activity Tracker service. Whenever an API method is called, an event is generated that you can then track and audit from within Activity Tracker. The specific event type is listed for each individual method.
Methods
Create a new AI service
Create a new AI service with the given payload. An AI service is some code that can be deployed as a deployment.
POST /ml/v4/ai_services
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
Payload for creating the AI service. Either space_id or project_id must be provided.
A sample request.
{
"name": "ai-app-1",
"software_spec": {
"id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309"
},
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"documentation": {
"request": {
"application/json": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"query": {
"type": "string"
},
"parameters": {
"properties": {
"max_new_tokens": {
"type": "integer"
},
"top_p": {
"type": "number"
}
},
"required": [
"max_new_tokens",
"top_p"
]
}
},
"required": [
"query"
]
},
"application/png": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"image": {
"type": "string",
"format": "binary"
}
},
"required": [
"image"
]
}
},
"response": {
"application/json": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"query": {
"type": "string"
},
"result": {
"type": "string"
}
},
"required": [
"query",
"result"
]
},
"application/png": {
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "string",
"format": "binary"
}
}
}
}
The space that contains the resource.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The name of the resource.
Example: my-resource
A software specification.
A description of the resource.
Example: This is my first resource.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
The type that allows the deployment service to know how to set up the code during deployment.
Allowable values: [python]
The documentation of the AI service request body and response body.
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
curl --request POST 'https://{cluster_url}/ml/v4/ai_services?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "name": "ai-service-1", "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "software_spec": { "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309" }, "documentation": { "request": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "parameters": { "properties": { "max_new_tokens": { "type": "integer" }, "top_p": { "type": "number" } }, "required": [ "max_new_tokens", "top_p" ] } }, "required": [ "query" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "image": { "type": "string", "format": "binary" } }, "required": [ "image" ] } }, "response": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "result": { "type": "string" } }, "required": [ "query", "result" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "string", "format": "binary" } } } }'
Response
The information for a flow.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The details of the AI service to be created.
- entity
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service created
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The response with the result.
{ "metadata": { "id": "b53c5118-b1ca-43ef-a597-ef839ff7129f", "name": "ai-app-1", "space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-05-02T16:27:51Z" }, "entity": { "software_spec": { "id": "45f12dfe-aa78-5b8d-9f38-0ee223c47309" }, "documentation": { "request": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "parameters": { "properties": { "max_new_tokens": { "type": "integer" }, "top_p": { "type": "number" } }, "required": [ "max_new_tokens", "top_p" ] } }, "required": [ "query" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "image": { "type": "string", "format": "binary" } }, "required": [ "image" ] } }, "response": { "application/json": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "object", "properties": { "query": { "type": "string" }, "result": { "type": "string" } }, "required": [ "query", "result" ] }, "application/png": { "$schema": "http://json-schema.org/draft-07/schema#", "type": "string", "format": "binary" } } } } }
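A minimal sketch of assembling the request body for this method is shown below. It covers only the `name`, `software_spec`, `space_id`, and `project_id` fields from the sample request; the helper name is hypothetical, and the check only enforces that at least one of `space_id` or `project_id` is given, as the description states.

```python
import json
from typing import Any, Dict, Optional


def ai_service_payload(name: str, software_spec_id: str,
                       space_id: Optional[str] = None,
                       project_id: Optional[str] = None) -> Dict[str, Any]:
    """Build a minimal body for POST /ml/v4/ai_services."""
    if not (space_id or project_id):
        raise ValueError("either space_id or project_id must be provided")
    payload: Dict[str, Any] = {
        "name": name,
        "software_spec": {"id": software_spec_id},
    }
    # Include only the identifiers the caller actually supplied.
    if space_id:
        payload["space_id"] = space_id
    if project_id:
        payload["project_id"] = project_id
    return payload


if __name__ == "__main__":
    body = ai_service_payload(
        "ai-app-1",
        "45f12dfe-aa78-5b8d-9f38-0ee223c47309",
        space_id="12ac4cf1-252f-424b-b52d-5cdd9814987f",
    )
    print(json.dumps(body, indent=2))
```

The optional `documentation` object from the sample request can be added to the returned dictionary before it is serialized.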
Retrieve the AI services
Retrieve the AI services for the specified space or project.
GET /ml/v4/ai_services
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
Return only the resources with the given tag values, separated by "or" or "and" to support multiple tags.
Example: tf2.0 or tf2.1
Returns only resources that match this search string. The path to the field must be the complete path to the field, and this field must be one of the indexed fields for this resource type. Note that the search string must be URL encoded.
Possible values: length ≥ 1
Response
A paginated list of AI services.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when the 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
A list of AI services.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
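Because the pagination token is generated by the service and delivered in the href of the next field, a client can simply follow that href until it disappears. The sketch below shows the loop with the HTTP call abstracted behind a `fetch` callable. The `resources` field name for the list of items is an assumption, as are the function name and the fake pages in the demo.

```python
from typing import Any, Callable, Dict, Iterator


def iter_resources(first_url: str,
                   fetch: Callable[[str], Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages by following next.href until it is absent."""
    url = first_url
    while url:
        page = fetch(url)
        # "resources" as the name of the list field is an assumption.
        yield from page.get("resources", [])
        # The service sets the pagination token inside next.href.
        url = (page.get("next") or {}).get("href")


if __name__ == "__main__":
    # Fake pages standing in for real responses.
    pages = {
        "page-1": {"resources": [{"id": "a"}], "next": {"href": "page-2"}},
        "page-2": {"resources": [{"id": "b"}]},
    }
    for item in iter_resources("page-1", pages.__getitem__):
        print(item["id"])
```

In real use, `fetch` would issue an authenticated GET and return the decoded JSON body.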
Retrieve the AI service
Retrieve the AI service with the specified identifier. If the rev query parameter is provided, rev=latest will fetch the latest revision. A call with rev={revision_number} will fetch the given revision_number record. Either space_id or project_id must be provided.
GET /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The revision number of the resource.
Example: 2
Response
The information for a flow.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The details of the AI service to be created.
- entity
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
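As an illustration of how the path and query parameters of this method combine, the sketch below builds the request URL, including the optional rev parameter. The helper name is hypothetical; the parameter names `version`, `space_id`, `project_id`, and `rev` are the ones used in the description above.

```python
from typing import Optional, Union
from urllib.parse import quote, urlencode


def ai_service_url(base_url: str, service_id: str, version: str,
                   space_id: Optional[str] = None,
                   project_id: Optional[str] = None,
                   rev: Optional[Union[int, str]] = None) -> str:
    """Build the URL for GET /ml/v4/ai_services/{id}."""
    if not (space_id or project_id):
        raise ValueError("either space_id or project_id must be provided")
    params = {"version": version}
    if space_id:
        params["space_id"] = space_id
    if project_id:
        params["project_id"] = project_id
    if rev is not None:
        # Either a revision number or the literal string "latest".
        params["rev"] = str(rev)
    path = f"/ml/v4/ai_services/{quote(service_id, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


if __name__ == "__main__":
    print(ai_service_url(
        "https://us-south.ml.cloud.ibm.com",
        "64dc8921-345f-234b-462d-78e41246987f",
        "2023-07-07",
        space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f",
        rev="latest",
    ))
```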
Update the AI service
Update the AI service with the provided patch data. The following fields can be patched:
- /tags
- /name
- /description
- /custom
PATCH /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Input for patch. This is the patch body, which corresponds to the JavaScript Object Notation (JSON) Patch standard (RFC 6902).
The operation to be performed.
Allowable values: [add, remove, replace]
The pointer that identifies the field that is the target of the operation.
The value to be used within the operation.
Response
The information for a flow.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The details of the AI service to be created.
- entity
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service has been patched successfully
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
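The patch body follows RFC 6902, so each operation is an object with `op`, `path`, and, except for `remove`, `value`. The sketch below builds such operations and rejects paths outside the four patchable fields. The helper name is hypothetical, and allowing sub-paths such as /custom/size is an assumption about what the service accepts.

```python
import json
from typing import Any, Dict

PATCHABLE_PATHS = ("/tags", "/name", "/description", "/custom")
ALLOWED_OPS = ("add", "remove", "replace")
_MISSING = object()  # sentinel so that None can still be passed as a value


def patch_op(op: str, path: str, value: Any = _MISSING) -> Dict[str, Any]:
    """Build one RFC 6902 operation limited to the patchable fields."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op}")
    # Accept a patchable field itself or a pointer below it (assumption).
    if not any(path == p or path.startswith(p + "/") for p in PATCHABLE_PATHS):
        raise ValueError(f"field cannot be patched: {path}")
    operation: Dict[str, Any] = {"op": op, "path": path}
    if op != "remove":
        if value is _MISSING:
            raise ValueError(f"operation {op} requires a value")
        operation["value"] = value
    return operation


if __name__ == "__main__":
    body = [
        patch_op("replace", "/name", "my-renamed-service"),
        patch_op("add", "/tags", ["t1", "t2"]),
    ]
    print(json.dumps(body, indent=2))
```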
Delete the AI service
Delete the AI service with the specified identifier. This will delete all revisions of this flow as well. For each revision all attachments will also be deleted.
DELETE /ml/v4/ai_services/{id}
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Upload the AI service code
Upload the flow code. AI services expect a zip file that contains the code files that make up the flow.
PUT /ml/v4/ai_services/{id}/code
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
A gzip file containing code files.
Response
The metadata related to the attachment.
The content id for the attachment.
Example: fd45606f-8098-459c-8961-32b136123fgc
Status Code
AI service code uploaded
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Download the AI service code
Download the AI service code. It is possible to download the code for a given revision of the flow.
AI services expect a zip file that contains the code files that make up the flow.
GET /ml/v4/ai_services/{id}/code
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The revision number of the resource.
Example: 2
Create a new AI service revision
Create a new AI service revision. The current metadata and content for id will be taken and a new revision created. Either space_id or project_id must be provided.
POST /ml/v4/ai_services/{id}/revisions
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The details for the revision.
{
  "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f",
  "commit_message": "Updated for TF 2.0"
}
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
An optional commit message for the revision.
Response
The information for a flow.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The details of the AI service to be created.
- entity
A software specification.
The type that allows the deployment service to know how to set up the code during deployment.
Possible values: [python]
The documentation of the AI service request body and response body.
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service revision created
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Retrieve the AI service revisions
Retrieve the AI service revisions.
GET /ml/v4/ai_services/{id}/revisions
Request
Path Parameters
AI service identifier.
Example: 64dc8921-345f-234b-462d-78e41246987f
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default: 100
Example: 50
Response
A paginated list of AI services.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example: 10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when the 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example: 1
A reference to the first item of the next page, if any.
A list of AI services.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
AI service revisions
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Create a new watsonx.ai deployment
Create a new deployment. Currently the only supported type is online.
If this is a deployment for a prompt tune then the asset object must exist and the id must be the id of the model that was created after the prompt training.
If this is a deployment for a prompt template then the prompt_template object should exist and the id must be the id of the prompt template to be deployed.
POST /ml/v4/deployments
Request
Query Parameters
The version date for the API, in the form YYYY-MM-DD.
Example: 2023-07-07
The deployment request entity.
The following important fields are described for each use case:
- Prompt template:
  - base_model_id: required
  - prompt_template.id: required
  - online: required
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - response deployed_asset_type: foundation_model
- Prompt tune:
  - asset.id: required
  - online: required
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - base_model_id: forbidden
  - response deployed_asset_type: prompt_tune
- Custom foundation model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: optional
  - hardware_spec: forbidden
  - hardware_request: required
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - response deployed_asset_type: custom_foundation_model
- Deploy on Demand model:
  - asset.id: required
  - online: required
  - online.parameters.foundation_model: forbidden
  - hardware_spec: forbidden
  - hardware_request: forbidden
  - base_model_id: forbidden
  - base_deployment_id: forbidden
  - space_id: required
  - project_id: forbidden
  - response deployed_asset_type: curated_foundation_model
Create a prompt tune deployment.
{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "asset": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}
Create a prompt template deployment.
{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "text_classification",
  "base_model_id": "google/flan-ul2",
  "prompt_template": {
    "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"
  },
  "online": {}
}
Create a custom foundation model deployment.
{
  "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
  "name": "my_tuned_flan",
  "asset": {
    "id": "366c31e9-1a6b-417a-8e25-06178a1514a1"
  },
  "online": {
    "parameters": {
      "serving_name": "myflan"
    }
  }
}
Deploy a curated model.
{
  "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03",
  "name": "my_granite_13b_chat_v2",
  "asset": {
    "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4"
  },
  "online": {
    "parameters": {
      "serving_name": "granite_13b_chat_v2"
    }
  }
}
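The required and forbidden fields per use case can be checked on the client before a request is sent. The sketch below encodes the rules listed for each use case in a table and reports violations. The use-case keys, the function names, and the message wording are hypothetical, and the service performs its own validation regardless.

```python
from typing import Any, Dict, List

# Rules transcribed from the per-use-case field list; dotted names are nested keys.
RULES: Dict[str, Dict[str, List[str]]] = {
    "prompt_template": {
        "required": ["base_model_id", "prompt_template.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request"],
    },
    "prompt_tune": {
        "required": ["asset.id", "online"],
        "forbidden": ["hardware_spec", "hardware_request", "base_model_id"],
    },
    "custom_foundation_model": {
        "required": ["asset.id", "online", "hardware_request"],
        "forbidden": ["hardware_spec", "base_model_id", "base_deployment_id"],
    },
    "deploy_on_demand": {
        "required": ["asset.id", "online", "space_id"],
        "forbidden": ["online.parameters.foundation_model", "hardware_spec",
                      "hardware_request", "base_model_id", "base_deployment_id",
                      "project_id"],
    },
}

_ABSENT = object()


def _lookup(payload: Any, dotted: str) -> Any:
    """Walk a dotted path through nested dicts; return _ABSENT if any key is missing."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return _ABSENT
        node = node[key]
    return node


def check_deployment(payload: Dict[str, Any], use_case: str) -> List[str]:
    """Return a list of rule violations; an empty list means the payload passes."""
    rules = RULES[use_case]
    problems = [f"missing required field: {name}"
                for name in rules["required"] if _lookup(payload, name) is _ABSENT]
    problems += [f"forbidden field present: {name}"
                 for name in rules["forbidden"] if _lookup(payload, name) is not _ABSENT]
    return problems


if __name__ == "__main__":
    prompt_tune = {
        "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "name": "text_classification",
        "asset": {"id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab"},
        "online": {},
    }
    print(check_deployment(prompt_tune, "prompt_tune"))
```

An empty `online` object counts as present, matching the rule that the object must be specified but can be empty.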
The name of the resource.
Possible values: 1 ≤ length ≤ 250
Example: my-resource
Indicates that this is an online deployment. An object has to be specified but can be empty. The serving_name can be provided in the online.parameters.
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 3fc54cf1-252f-424b-b52d-5cdd9814987f
A description of the resource.
Possible values: length ≤ 1000
Example: This is my first resource.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples: [ "t1", "t2" ]
User defined properties specified as key-value pairs.
Examples: { "name": "model", "size": 2 }
- custom
A reference to a resource.
A hardware specification.
The requested hardware for deployment.
A reference to a resource.
The base model that is required for this deployment if this is for a prompt template or a prompt tune for an IBM foundation model.
Example: google/flan-t5-xl
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "prompt_tune", "base_model_id": "google/flan-ul2", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "online": { "parameters": { "serving_name": "myflan" } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
{ "metadata": { "id": "c9240431-8697-42ad-8ab3-1cced97fc6db", "created_at": "2024-12-12T10:42:52.298Z", "name": "my_granite_13b_chat_v2", "space_id": "8ca6eec6-ce39-4285-877b-97a9720cdd03" }, "entity": { "asset": { "id": "38d30589-286c-4b9f-82d5-5006d5fa3bb4" }, "base_model_id": "ibm/granite-13b-chat-v2-curated", "deployed_asset_type": "curated_foundation_model", "hardware_request": { "num_nodes": 1, "size": "gpu_s" }, "online": { "parameters": { "serving_name": "granite_13b_chat_v2" } }, "status": { "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/curated_test_22/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/curated_test_22/text/generation_stream", "uses_serving_name": true, "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/c9240431-8697-42ad-8ab3-1cced97fc6db/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/c9240431-8697-42ad-8ab3-1cced97fc6db/text/generation_stream", "sse": true } ], "state": "ready" } } }
Retrieve the deployments
Retrieve the list of deployments for the specified space or project.
GET /ml/v4/deployments
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Retrieves the deployment, if any, that contains this serving_name.
Example: classification
Retrieves only the resources with the given tag value.
Retrieves only the resources with the given asset_id. The asset_id is the model id.
Retrieves only the resources with the given prompt_template_id.
Retrieves only the resources with the given name.
Retrieves the resources filtered with the given type. The value can be one of the deployment types, plus an additional prompt_template flag if the deployment includes a prompt template.
The supported deployment types are (see the description for deployed_asset_type in the deployment entity):
- prompt_tune - when a prompt tuned model is deployed.
- foundation_model - when a prompt template is used on a pre-deployed IBM provided model.
- custom_foundation_model - when a custom foundation model is deployed.
These can be combined with the flag prompt_template like this:
- type=prompt_tune - return all prompt tuned model deployments.
- type=prompt_tune and prompt_template - return all prompt tuned model deployments with a prompt template.
- type=foundation_model - return all prompt template deployments.
- type=foundation_model and prompt_template - return all prompt template deployments. This is the same as the previous query because a foundation_model can only exist with a prompt template.
- type=prompt_template - return all deployments with a prompt template.
Retrieves the resources filtered by state. Allowed values are initializing, updating, ready and failed.
Returns whether serving_name is available for use or not. This query parameter cannot be combined with any other parameter except for serving_name.
Default: false
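The query parameters above can be assembled into a request URL as in the following minimal Python sketch. It only builds the URL and does not call the service. The Dallas base URL comes from the endpoint list; the parameter names version, space_id, project_id and type appear in this reference, while the name state for the state filter and the reading of "either" as "exactly one" are assumptions. The helper list_deployments_url is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://us-south.ml.cloud.ibm.com"  # Dallas base URL
ALLOWED_STATES = {"initializing", "updating", "ready", "failed"}


def list_deployments_url(version, space_id=None, project_id=None,
                         deployment_type=None, state=None):
    """Build the GET /ml/v4/deployments URL (hypothetical helper)."""
    # Assumption: exactly one of space_id / project_id is supplied.
    if (space_id is None) == (project_id is None):
        raise ValueError("give exactly one of space_id or project_id")
    if state is not None and state not in ALLOWED_STATES:
        raise ValueError(f"unsupported state: {state}")

    params = {"version": version}
    if space_id is not None:
        params["space_id"] = space_id
    else:
        params["project_id"] = project_id
    if deployment_type is not None:
        params["type"] = deployment_type
    if state is not None:
        params["state"] = state  # assumed query parameter name
    return f"{BASE_URL}/ml/v4/deployments?{urlencode(params)}"


url = list_deployments_url(
    "2023-07-07",
    space_id="63dc4cf1-252f-424b-b52d-5cdd9814987f",
    deployment_type="prompt_tune",
    state="ready",
)
print(url)
```

The resulting URL would be sent with the Authorization: Bearer {token} header described in the Authentication section.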
Response
The deployment resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
A list of deployment resources.
System details including warnings.
Status Code
OK.
serving_name is available for use. Returned when serving_name and conflict query parameters are used.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
Returned when serving_name and conflict query parameters are used. The response body will contain the reason.
{ "limit": 10, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments" }, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "deployed_asset_type": "prompt_tune", "online": {}, "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } } ] }
Retrieve the deployment details
Retrieve the deployment details with the specified identifier.
GET /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment details.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt tuned model deployment", "tags": [ "classification" ] }, "entity": { "asset": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "prompt_tune", "base_model_id": "google/flan-ul2", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "text_classification", "description": "Classification prompt template deployment", "tags": [ "classification" ] }, "entity": { "prompt_template": { "id": "4cedab6d-e8e4-4214-b81a-2ddb122db2ab" }, "online": {}, "deployed_asset_type": "foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/2cd0bcda-581d-4f04-8028-ec2bc90cc375/text/generation_stream", "sse": true } ] } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "my_tuned_flan" }, "entity": { "asset": { "id": "366c31e9-1a6b-417a-8e25-06178a1514a1" }, "online": { "parameters": { "serving_name": "myflan" } }, "deployed_asset_type": "custom_foundation_model", "base_model_id": "google/flan-t5-xl", "status": { "state": "ready", "message": { "level": "info", "text": "The deployment is successful" }, "inference": [ { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation" }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation", "uses_serving_name": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream", "sse": true }, { "url": "https://us-south.ml.cloud.ibm.com/ml/v1/deployments/myflan/text/generation_stream", "sse": true, "uses_serving_name": true } ] } } }
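The inference list in the sample responses above carries one URL per combination of the sse and uses_serving_name flags. As a rough illustration, the following Python sketch picks the matching URL from a deployment document. The function pick_inference_url is hypothetical, and treating a missing flag as false is an assumption based on the samples.

```python
def pick_inference_url(deployment, stream=False, by_serving_name=False):
    """Return the first inference URL whose flags match, or None."""
    for entry in deployment["entity"]["status"]["inference"]:
        # Assumption: an absent "sse" / "uses_serving_name" flag means false.
        is_stream = bool(entry.get("sse", False))
        uses_name = bool(entry.get("uses_serving_name", False))
        if is_stream == stream and uses_name == by_serving_name:
            return entry["url"]
    return None


# Trimmed copy of the custom foundation model sample response.
base = "https://us-south.ml.cloud.ibm.com/ml/v1/deployments"
deployment = {
    "entity": {
        "status": {
            "state": "ready",
            "inference": [
                {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation"},
                {"url": f"{base}/myflan/text/generation", "uses_serving_name": True},
                {"url": f"{base}/6213cf1-252f-424b-b52d-5cdd9814956c/text/generation_stream",
                 "sse": True},
                {"url": f"{base}/myflan/text/generation_stream",
                 "sse": True, "uses_serving_name": True},
            ],
        }
    }
}

print(pick_inference_url(deployment))
print(pick_inference_url(deployment, stream=True, by_serving_name=True))
```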
Update the deployment metadata
Update the deployment metadata. The following parameters of deployment metadata are supported for the patch operation.
- /name
- /description
- /tags
- /custom
- /online/parameters
- /asset - replace only
- /prompt_template - replace only
- /hardware_spec
- /hardware_request
- /base_model_id - replace only (applicable only to prompt template deployments referring to IBM base foundation models)
The PATCH operation with path specified as /online/parameters can be used to update the serving_name.
PATCH /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
The JSON patch.
The operation to be performed.
Allowable values: [add, remove, replace]
The pointer that identifies the field that is the target of the operation.
The value to be used within the operation.
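A patch body for this operation might be assembled as in the following Python sketch. It assumes the standard JSON Patch (RFC 6902) field names op, path and value, which this reference describes but does not name. The shape of the /online/parameters value and the helper patch_op are also assumptions made for illustration.

```python
import json

ALLOWED_OPS = {"add", "remove", "replace"}
SUPPORTED_PATHS = (
    "/name", "/description", "/tags", "/custom", "/online/parameters",
    "/asset", "/prompt_template", "/hardware_spec", "/hardware_request",
    "/base_model_id",
)
REPLACE_ONLY_PATHS = ("/asset", "/prompt_template", "/base_model_id")


def _under(path, roots):
    """True if path equals one of the roots or points below it."""
    return any(path == root or path.startswith(root + "/") for root in roots)


def patch_op(op, path, value=None):
    """Build one JSON Patch operation (hypothetical helper)."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported op: {op}")
    if not _under(path, SUPPORTED_PATHS):
        raise ValueError(f"path not supported for patching: {path}")
    if op != "replace" and _under(path, REPLACE_ONLY_PATHS):
        raise ValueError(f"{path} supports replace only")
    operation = {"op": op, "path": path}
    if op != "remove":
        operation["value"] = value
    return operation


patch = [
    patch_op("replace", "/name", "text_classification_v2"),
    # Assumed value shape for changing the serving_name.
    patch_op("replace", "/online/parameters", {"serving_name": "classification_v2"}),
]
print(json.dumps(patch, indent=2))
```

The serialized list would be sent as the body of PATCH /ml/v4/deployments/{deployment_id}.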
Response
A deployment resource.
Common metadata for a resource where project_id or space_id must be present.
Examples: { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
The definition of the deployment.
Status Code
Deployment accepted
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
Delete the deployment
Delete the deployment with the specified identifier.
DELETE /ml/v4/deployments/{deployment_id}
Request
Path Parameters
The deployment id.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
The space that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: 63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either the space_id or the project_id query parameter has to be given.
Possible values: length = 36, value must match regular expression ^[a-zA-Z0-9-]*$
Example: a77190a2-f52d-4f2a-be3d-7867b5f46edc
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Return options
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned.
POST /ml/v1/deployments/{id_or_name}/text/generation
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens.
A prompt tune request.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "how far is paris from bangalore:\n",
"parameters": {
"max_new_tokens": 100
}
}
A prompt tune request with moderations.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
A prompt template request.
{
"space_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "how far is paris from bangalore:\n",
"parameters": {
"max_new_tokens": 100
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
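For readers who prefer Python to curl, the request can be prepared as in the sketch below. It only builds the URL and the JSON body and sends nothing. The helper build_generation_request is hypothetical, and stripping trailing spaces from the prompt is one way to follow the recommendation above, not a documented requirement.

```python
import json
from urllib.parse import quote


def build_generation_request(cluster_url, id_or_name, version,
                             input_text=None, parameters=None, stream=False):
    """Return (url, json_body) for a text generation call (hypothetical helper)."""
    endpoint = "generation_stream" if stream else "generation"
    url = (f"https://{cluster_url}/ml/v1/deployments/"
           f"{quote(id_or_name, safe='')}/text/{endpoint}?version={version}")
    body = {}
    if input_text is not None:
        # The reference recommends not leaving trailing spaces in the prompt.
        body["input"] = input_text.rstrip(" ")
    if parameters:
        body["parameters"] = parameters
    return url, json.dumps(body)


url, payload = build_generation_request(
    "us-south.ml.cloud.ibm.com",
    "myflan",
    "2023-05-02",
    input_text="how far is paris from bangalore: ",
    parameters={"max_new_tokens": 100, "time_limit": 1000},
)
print(url)
print(payload)
```

Serializing the body with json.dumps avoids hand-written JSON mistakes such as trailing commas. The pair can be passed to any HTTP client together with the Authorization: Bearer {token} and Content-Type: application/json headers.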
Response
System details.
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so compare against them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details for a prompt tune.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
The generated text from the model along with other details for a prompt tune with moderations.
{ "model_id": "google/flan-t5-xl", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.", "generated_token_count": 118, "input_token_count": 11, "stop_reason": "eos_token", "moderations": { "pii": [ { "score": 0.8, "input": false, "position": { "start": 74, "end": 88 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 200, "end": 212 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 244, "end": 259 }, "entity": "EmailAddress" } ] } } ] }
The generated text from the model along with other details for a prompt template.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
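A response such as the samples above could be post-processed as in the following Python sketch. It compares stop_reason case-insensitively, as the field description advises. Grouping max_tokens, time_limit and token_limit as "cut short" is an interpretation made for this example, not a classification defined by the API, and summarize_results is a hypothetical helper.

```python
# Interpretation for this sketch: these reasons suggest the output was cut short.
CUT_SHORT_REASONS = {"max_tokens", "time_limit", "token_limit"}


def summarize_results(response):
    """Return (text, stop_reason, cut_short) tuples for each result."""
    summary = []
    for result in response["results"]:
        reason = result["stop_reason"].lower()  # compare case-insensitively
        summary.append((result["generated_text"], reason,
                        reason in CUT_SHORT_REASONS))
    return summary


# Copy of the prompt tune sample response.
response = {
    "model_id": "google/flan-ul2",
    "created_at": "2023-07-21T16:52:32.190Z",
    "results": [
        {
            "generated_text": "4,000 km",
            "generated_token_count": 4,
            "input_token_count": 12,
            "stop_reason": "eos_token",
        }
    ],
}

for text, reason, cut_short in summarize_results(response):
    print(text, reason, cut_short)
```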
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters.
This operation will return the output tokens as a stream of events.
If a serving_name is used then it must match the serving_name that is returned in the inference section when the deployment was created.
Return options
Note that there is currently a limitation in this operation when using return_options: for input, only input_text will be returned if requested; for output, the input_tokens and generated_tokens will not be returned, and the rank and top_tokens will not be returned either.
POST /ml/v1/deployments/{id_or_name}/text/generation_stream
Request
Path Parameters
The id_or_name can be either the deployment_id that identifies the deployment or a serving_name that allows a predefined URL to be used to post a prediction.
The WML instance that is associated with the deployment will be used for limits and billing (if a paid plan).
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
{
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"decoding_method": "sample",
"temperature": 0.8,
"max_new_tokens": 200
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
This field is ignored if there is a prompt template.
The template properties if this request refers to a prompt template.
Examples: { "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50, "typical_p": 0.5 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples: { "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 } }'
curl --request POST 'https://{cluster_url}/ml/v1/deployments/{id_or_name}/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "parameters": { "max_new_tokens": 100, "time_limit": 1000, "prompt_variables": { "name": "joe", "count": 3 } } }'
Response
A set of server-sent events. Each event contains a response for one or more tokens. The results will be an array of events of the form data: {<json event>}, where the schema of the individual json event is described below.
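The data: {<json event>} framing can be decoded with a few lines of Python, sketched below under some assumptions: the stream has already been split into text lines, each event's generated_text holds only the newly generated fragment, and the sample lines are invented for illustration and not captured from the service. Both function names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of every 'data:' line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(lines):
    """Concatenate generated_text fragments across events."""
    parts = []
    for event in iter_sse_events(lines):
        for result in event.get("results", []):
            # Assumption: each event carries only the new text fragment.
            parts.append(result.get("generated_text", ""))
    return "".join(parts)


# Invented sample stream shaped like the event schema in this section.
sample_lines = [
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": "4,000", "stop_reason": "not_finished"}]}',
    "",
    'data: {"model_id": "google/flan-ul2", "results": [{"generated_text": " km", "stop_reason": "eos_token"}]}',
    "",
]

print(collect_text(sample_lines))
```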
The id of the model for inference.
Example: google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so compare against them case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example: token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples: [ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but for decoder-only models.
Possible values: number of items ≥ 1
Examples: [ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
List the available foundation models
Retrieve the list of deployed foundation models.
GET /ml/v1/foundation_model_specs
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
A set of filters to specify the list of models. Filters are described as the pattern shown below.
pattern: tfilter[,tfilter][:(or|and)]
tfilter: filter | !filter
- filter: Requires existence of the filter.
- !filter: Requires absence of the filter.
filter: one of
- modelid_*: Filters by model id. Namely, select a model with a specific model id.
- provider_*: Filters by provider. Namely, select all models with a specific provider.
- source_*: Filters by source. Namely, select all models with a specific source.
- input_tier_*: Filters by input tier. Namely, select all models with a specific input tier.
- output_tier_*: Filters by output tier. Namely, select all models with a specific output tier.
- tier_*: Filters by tier. Namely, select all models with a specific input or output tier.
- task_*: Filters by task id. Namely, select all models that support a specific task id.
- lifecycle_*: Filters by lifecycle state. Namely, select all models that are currently in the specified lifecycle state.
- function_*: Filters by function. Namely, select all models that support a specific function.
Possible values: 1 ≤ length ≤ 1000, Value must match regular expression
^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$
Example:
modelid_ibm/granite-13b-instruct-v2
See all the Tech Preview models if entitled.
Default: false
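The filter pattern described above can be composed and checked locally before calling the service. The Python sketch below uses the regular expression and the length limit given in this reference. The helper build_filters and the second example's filter values (provider_ibm, task_code) are illustrative assumptions, not values confirmed by the API.

```python
import re

# Regular expression and length bounds as stated for the filter value.
FILTER_PATTERN = re.compile(r"^([!]?[^,!]+)(,[!]?[^,!]+)*(:(or|and))?$")
MAX_LENGTH = 1000


def build_filters(include=(), exclude=(), combine=None):
    """Compose a filter string: required terms, then negated terms."""
    terms = list(include) + [f"!{name}" for name in exclude]
    value = ",".join(terms)
    if combine is not None:
        if combine not in ("or", "and"):
            raise ValueError("combine must be 'or' or 'and'")
        value += f":{combine}"
    if not 1 <= len(value) <= MAX_LENGTH or not FILTER_PATTERN.match(value):
        raise ValueError(f"invalid filter expression: {value!r}")
    return value


# The documented example value.
print(build_filters(include=["modelid_ibm/granite-13b-instruct-v2"]))
# Illustrative only: require one filter, forbid another, combine with "and".
print(build_filters(include=["provider_ibm"], exclude=["task_code"], combine="and"))
```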
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources.
Example:
1
A reference to the first item of the next page, if any.
The supported foundation models.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The models that are currently deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_specs?version=2023-05-02" }, "resources": [ { "model_id": "bigcode/starcoder", "label": "starcoder-15.5b", "provider": "BigCode", "source": "Hugging Face", "short_description": "The StarCoder models are 15.5B parameter models that can generate code from natural language descriptions", "tasks": [ { "id": "code", "ratings": { "quality": 3 } } ], "min_shot_size": 0, "input_tier": "class_2", "output_tier": "class_2", "number_params": "15.5b" } ] }
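Listing responses are paginated, so a client typically keeps requesting pages until no further page is referenced. The following Python sketch shows that loop with the HTTP call abstracted behind a fetch_page callable. It assumes the next field has the same { "href": ... } shape as the first field in the sample; iter_resources and the stand-in pages are hypothetical.

```python
def iter_resources(fetch_page, first_url):
    """Yield resources from every page, following the next reference."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # Assumption: "next" mirrors "first", i.e. {"href": "<url>"}.
        url = (page.get("next") or {}).get("href")


# Stand-in pages keyed by URL so the sketch runs without network access.
fake_pages = {
    "page1": {
        "limit": 2,
        "resources": [{"model_id": "a"}, {"model_id": "b"}],
        "next": {"href": "page2"},
    },
    "page2": {"limit": 2, "resources": [{"model_id": "c"}]},
}

model_ids = [item["model_id"]
             for item in iter_resources(fake_pages.__getitem__, "page1")]
print(model_ids)
```

In real use, fetch_page would perform an authenticated GET and return the decoded JSON body.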
List the supported tasks
Retrieve the list of tasks that are supported by the foundation models.
GET /ml/v1/foundation_model_tasks
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example: 2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and it is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Response
System details.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources.
Example:
1
A reference to the first item of the next page, if any.
The supported foundation model tasks.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK
Bad request, the response body should contain the reason.
The specified resource was not found.
The tasks that are currently supported by models deployed in the cluster.
{ "total_count": 1, "limit": 100, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/foundation_model_tasks?version=2023-05-02" }, "resources": [ { "task_id": "question_answering", "label": "Question answering", "rank": 1, "description": "Based on a set of documents or dynamic content, create a chatbot or a question-answering feature grounded on specific content. E.g. building a Q&A resource from a broad knowledge base, providing customer service assistance." } ] }
Create a new notebook.
Create a new notebook
- either from scratch
- or by copying another notebook.
To create a notebook from scratch, you need to first upload the notebook content (ipynb format) to the project Cloud Object Storage (COS) and then reference it with the attribute file_reference.
The other required attributes are name, project and runtime.
The attribute runtime is used to specify the environment on which the notebook runs.
To copy a notebook, you only need to provide name and source_guid in the request body.
POST /v2/notebooks
Request
Specification of the notebook to be created.
Create a notebook from scratch in a project
{
"name": "my notebook",
"description": "this is my notebook",
"project": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"file_reference": "notebook/my_notebook.ipynb",
"runtime": {
"environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
"spark_monitoring_enabled": true
}
}
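The two creation modes described above need different request bodies. As a minimal sketch, the Python helpers below build each body from the attributes named in this section; the function names are hypothetical and the source_guid value is reused from the sample response for illustration only.

```python
import json


def notebook_from_scratch(name, project, file_reference, environment,
                          description=None):
    """Body for creating a notebook from content already uploaded to COS."""
    body = {
        "name": name,
        "project": project,
        "file_reference": file_reference,
        "runtime": {"environment": environment},
    }
    if description is not None:
        body["description"] = description
    return body


def notebook_copy(name, source_guid):
    """Body for copying an existing notebook."""
    return {"name": name, "source_guid": source_guid}


scratch_body = notebook_from_scratch(
    name="my notebook",
    project="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    file_reference="notebook/my_notebook.ipynb",
    environment="spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    description="this is my notebook",
)
copy_body = notebook_copy("my notebook copy",
                          "ca3c0e27-46ca-83d4-a646-d49b11c14de9")

print(json.dumps(scratch_body, indent=2))
print(json.dumps(copy_body, indent=2))
```

Either body would be sent with POST /v2/notebooks against the notebooks base URL for your region.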
The name of the new notebook.
Example:
my notebook
The reference to the file in the object storage.
Example:
notebook/my_notebook.ipynb
A notebook runtime.
The guid of the project in which to create the notebook.
Example:
92ae0e27-9b11-4de9-a646-d46ca3c183d4
A more verbose description of the notebook.
Example:
this is my notebook
The notebook origin.
A notebook kernel.
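The two creation modes described above can be sketched as two small helpers that build the request body. This is a minimal illustration and not an official client: the function names are hypothetical, and only the attribute names (name, project, file_reference, runtime, description, source_guid) come from this reference.

```python
import json


def notebook_from_scratch(name, project, file_reference, environment,
                          description=None, spark_monitoring_enabled=None):
    """Body for creating a notebook from an ipynb file already uploaded to COS."""
    runtime = {"environment": environment}
    if spark_monitoring_enabled is not None:
        runtime["spark_monitoring_enabled"] = spark_monitoring_enabled
    body = {
        "name": name,
        "project": project,
        "file_reference": file_reference,
        "runtime": runtime,
    }
    if description is not None:
        body["description"] = description
    return body


def notebook_copy(name, source_guid):
    """Body for copying an existing notebook: only name and source_guid."""
    return {"name": name, "source_guid": source_guid}


if __name__ == "__main__":
    body = notebook_from_scratch(
        name="my notebook",
        project="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
        file_reference="notebook/my_notebook.ipynb",
        environment="spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
        description="this is my notebook",
        spark_monitoring_enabled=True,
    )
    # Serialize the body before sending it with POST /v2/notebooks.
    print(json.dumps(body, indent=2))
```

Keeping the two modes in separate helpers makes it harder to mix attributes from one mode into the other by accident.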
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Created and returned a new notebook asset. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
The number of requests has exceeded the rate limit.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python3", "language": "python3" }, "originates_from": { "type": "notebook", "guid": "ca3c0e27-46ca-83d4-a646-d49b11c14de9" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "rate_limit", "message": "The requests from IBMid-310000A00A exceeds rate limit. Please try again later." } ] }
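The sample error bodies above share one shape: a trace value plus an errors array whose entries carry code, message, and an optional target. A minimal sketch for turning such a body into readable lines follows. The function name is hypothetical, and the sketch assumes every error response follows the shape of the samples.

```python
import json


def summarize_errors(body_text):
    """Return (trace, lines) for an error body shaped like the samples."""
    payload = json.loads(body_text)
    trace = payload.get("trace")
    lines = []
    for err in payload.get("errors", []):
        # target is optional; the 403 and 429 samples omit it.
        target = err.get("target") or {}
        where = f" ({target.get('type')} {target.get('name')})" if target else ""
        lines.append(f"[{err.get('code')}]{where} {err.get('message')}")
    return trace, lines


if __name__ == "__main__":
    sample = json.dumps({
        "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2",
        "errors": [{
            "code": "invalid_auth_token",
            "message": "The IAM bearer token is not valid.",
            "target": {"type": "header", "name": "Authentication"},
        }],
    })
    trace, lines = summarize_errors(sample)
    print(trace)
    for line in lines:
        print(line)
```

Logging the trace value alongside each line is useful when reporting a problem, since it identifies the failing request.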
Retrieve the details of a large number of notebooks inside a project.
Retrieve the details of a large number of notebooks inside a project.
POST /v2/notebooks/list
Request
Query Parameters
The guid of the project.
Additional information to include in the notebook details. Possible values are:
- runtime
Payload for a notebook list request.
List notebooks
{
"notebooks": [
"ca3c0e27-46ca-83d4-a646-d49b11c14de9"
]
}
The list of notebooks whose details will be retrieved.
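A list request combines the query parameters with the JSON payload shown above. The sketch below builds the request without sending it. The query parameter names project_id and include are assumptions, because the parameter names are not preserved in this text. The helper name is hypothetical, and the base URL should be taken from the Endpoint URLs section.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_list_notebooks_request(base_url, token, project_id, notebook_guids,
                                 include=None):
    """Build (but do not send) a POST /v2/notebooks/list request."""
    query = {"project_id": project_id}  # assumed parameter name
    if include:
        query["include"] = include      # assumed parameter name, e.g. "runtime"
    url = f"{base_url.rstrip('/')}/v2/notebooks/list?{urlencode(query)}"
    data = json.dumps({"notebooks": list(notebook_guids)}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=data, headers=headers, method="POST")


if __name__ == "__main__":
    req = build_list_notebooks_request(
        "https://api.dataplatform.cloud.ibm.com/wx",
        "{token}",
        "b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
        ["ca3c0e27-46ca-83d4-a646-d49b11c14de9"],
        include="runtime",
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req) with a valid IAM token.
```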
Response
A list of notebook info as returned by a list query.
The number of items in the resources list.
Example:
1
An array of notebooks.
Status Code
Success. Returned a list of notebook assets. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "41d09a9a-f771-48a2-9534-50c0c622356d", "url": "/v2/notebooks/41d09a9a-f771-48a2-9534-50c0c622356d" }, "entity": { "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "asset": { "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "asset_type": "notebook", "created_at": "2021-07-01T12:37:01Z", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "version": 2, "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Delete a particular notebook, including the notebook asset.
Delete a particular notebook, including the notebook asset.
DELETE /v2/notebooks/{notebook_guid}
Response
Status Code
Successful request. Notebook is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Revert the main notebook to a version.
Revert the main notebook to a version.
PUT /v2/notebooks/{notebook_guid}
Request
Path Parameters
The guid of the main notebook.
Payload for a request to revert to a specific notebook version.
Revert the notebook to a version
{
"source": "ca3c0e27-46ca-83d4-a646-d49b11c14de9"
}
The guid of the notebook version.
Example:
ca3c0e27-46ca-83d4-a646-d49b11c14de9
Response
Notebook information in a project as returned by a GET request.
Metadata of a notebook in a project.
Entity of a notebook.
Status Code
Success. Reverted the main notebook to a version. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook v4.2", "description": "this is my notebook v4.2", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "spark33py39-b275be5f-10ff-47ee-bfc9-63f1ce5addbf", "spark_monitoring_enabled": true }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Update a notebook.
Request
Path Parameters
The guid of the notebook.
Payload for a notebook update request.
Update a notebook
{
"environment": "d46ca0e27-a646-4de9-a646-9b113c183d4",
"spark_monitoring_enabled": false,
"kernel": {
"display_name": "Python 3.9 with Spark",
"name": "python39",
"language": "python3"
}
}
The guid of the environment on which the notebook runs.
Example:
d46ca0e27-a646-4de9-a646-9b113c183d4
Spark monitoring enabled or not.
A notebook kernel.
Response
Notebook information as returned by a GET request.
Metadata of a notebook.
Entity of a notebook.
Status Code
Success. Updated the notebook. Format follows v2/assets.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "name": "my notebook", "description": "this is my notebook", "asset_type": "notebook", "created": 1540471021134, "created_at": "2021-07-01T12:37:01Z", "owner_id": "IBMid-310000SG2Y", "catalog_id": "463cb8d8-8480-4a98-b75a-f7443b7d0af9", "asset_id": "41d09a9a-f771-48a2-9534-50c0c622356d", "project_id": "b275be5f-10ff-47ee-bfc9-63f1ce5addbf" }, "entity": { "notebook": { "kernel": { "display_name": "Python 3.9 with Spark", "name": "python39", "language": "python3" }, "originates_from": { "type": "blank" } }, "runtime": { "environment": "d46ca0e27-a646-4de9-a646-9b113c183d4", "spark_monitoring_enabled": false }, "href": "/v2/assets/41d09a9a-f771-48a2-9534-50c0c622356d?project_id=b275be5f-10ff-47ee-bfc9-63f1ce5addbf" } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Create a new version.
Create a version of a given notebook.
POST /v2/notebooks/{notebook_guid}/versions
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the notebook version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
List the versions of a notebook.
List all versions of a particular notebook.
GET /v2/notebooks/{notebook_guid}/versions
Response
A list of notebook versions in a project.
The number of items in the resources array.
Example:
1
An array of notebook versions.
Status Code
Success. Returned a list of versions of the notebook.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "total_results": 1, "resources": [ { "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
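The sample version list above can be reduced to a short summary per version. This sketch assumes that created_at is a count of milliseconds since the Unix epoch, which the 13-digit sample value suggests but the text does not state. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def version_summaries(payload):
    """Extract guid, rev_id and creation time from a version list response."""
    summaries = []
    for item in payload.get("resources", []):
        meta = item.get("metadata", {})
        entity = item.get("entity", {})
        created_ms = meta.get("created_at")
        # Assumption: created_at is epoch milliseconds.
        created = None
        if created_ms is not None:
            created = EPOCH + timedelta(milliseconds=created_ms)
        summaries.append({
            "guid": meta.get("guid"),
            "rev_id": entity.get("rev_id"),
            "created_at": created,
        })
    return summaries


if __name__ == "__main__":
    sample = {
        "total_results": 1,
        "resources": [{
            "metadata": {
                "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0",
                "created_at": 1543681714106,
            },
            "entity": {"rev_id": 1},
        }],
    }
    for summary in version_summaries(sample):
        print(summary["guid"], summary["rev_id"], summary["created_at"].isoformat())
```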
Retrieve a notebook version.
Retrieve a particular version of a notebook.
GET /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
A notebook version in a project.
Notebook version metadata.
A notebook version entity in a project.
Status Code
Success. Returned the version definition.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "metadata": { "guid": "19d63b6b-81a1-4c05-bad2-36a2957bd6d0", "url": "v2/notebooks/a528b427-d1cd-4039-8ddc-04203c2521e2/versions/1a1329e0-fd05-409a-8411-52db106e2142", "created_at": 1543681714106 }, "entity": { "master_notebook_guid": "a528b427-d1cd-4039-8ddc-04203c2521e2", "project_id": "0f7c1111-a79d-45b2-9699-d4950e742964", "created_by_iui": "IBMid-123456ABCD", "file_reference": "myproject-donotdelete-pr-6p65nym92j1bv0/notebooks/GPU_ENVIRONMENT_DEFAULT_GBUXVKHH_version_1543781324804.ipynb", "rev_id": 1 } }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Delete a notebook version.
Delete a particular version of a given notebook.
DELETE /v2/notebooks/{notebook_guid}/versions/{version_guid}
Response
Status Code
Success. The version is deleted.
Bad request. One of the fields has invalid format/content.
Unauthorized. No/Malformed authentication provided.
Forbidden. User is not allowed to perform the target operation.
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_type", "message": "The `project` field needs to be a uuid v4, but is 12345.", "target": { "type": "field", "name": "project" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "invalid_auth_token", "message": "The IAM bearer token is not valid.", "target": { "type": "header", "name": "Authentication" } } ] }
{ "trace": "b12692e1-8582-4628-88ca-7a13fefb73e2", "errors": [ { "code": "endpoint_access_forbidden", "message": "max.mustermann@ibm.com is neither editor/admin of project b275be5f-10ff-47ee-bfc9-63f1ce5addbf nor allowlisted Service ID." } ] }
Create a new prompt / prompt template
This creates a new prompt with the provided parameters.
POST /v1/prompts
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat, detached]
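Two of the constraints listed above can be checked locally before sending a request: the semantic version pattern and the allowable input modes. The sketch below copies the regular expression from this reference. The function names are hypothetical, and the exact request field that carries the version string is not shown in this text, so the helpers take plain values.

```python
import re

# Pattern copied from the "User provided semantic version" field above.
SEMVER_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)

# Allowable input modes when creating a prompt.
CREATE_INPUT_MODES = {"structured", "freeform", "chat", "detached"}


def is_valid_semver(value):
    """True if value matches the semantic version pattern."""
    return SEMVER_RE.fullmatch(value) is not None


def is_valid_input_mode(value, allowed=CREATE_INPUT_MODES):
    """True if value is one of the allowable input modes."""
    return value in allowed


if __name__ == "__main__":
    for candidate in ("2.0.0-rc.7", "1.0", "01.0.0"):
        print(candidate, is_valid_semver(candidate))
```

Checking these values on the client side avoids a round trip that would otherwise end in a Bad Request response.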
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt
This retrieves a prompt / prompt template with the given id.
GET /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Only return a set of model parameters compatible with inferencing.
Default:
true
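Each prompt request needs exactly one target, either a project or a space. The sketch below builds the URL for a single prompt and enforces that rule. The query parameter names project_id and space_id are assumptions, because the parameter names are not preserved in this text. The function name is hypothetical, and the base URL comes from the Endpoint URLs section.

```python
import re
from urllib.parse import urlencode

ID_RE = re.compile(r"[a-zA-Z0-9-]*")  # pattern listed for the ID values


def prompt_url(base_url, prompt_id, project_id=None, space_id=None):
    """Build the URL for /v1/prompts/{prompt_id} with exactly one target."""
    if (project_id is None) == (space_id is None):
        raise ValueError("supply exactly one of project_id or space_id")
    target_name = "project_id" if project_id is not None else "space_id"
    target_value = project_id if project_id is not None else space_id
    for label, value in (("prompt_id", prompt_id), (target_name, target_value)):
        if not value or not ID_RE.fullmatch(value):
            raise ValueError(f"{label} must be non-empty and match [a-zA-Z0-9-]*")
    query = urlencode({target_name: target_value})
    return f"{base_url.rstrip('/')}/v1/prompts/{prompt_id}?{query}"


if __name__ == "__main__":
    print(prompt_url(
        "https://api.dataplatform.cloud.ibm.com/wx",
        "1c29d9a1-9ba6-422d-aa39-517b26adc147",
        project_id="b275be5f-10ff-47ee-bfc9-63f1ce5addbf",
    ))
```

The same URL serves the GET, PATCH, and DELETE operations on a prompt; only the HTTP method changes.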
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Update a prompt
This updates a prompt / prompt template with the given id.
PATCH /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User-provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Delete a prompt
This deletes a prompt / prompt template with the given id.
DELETE /v1/prompts/{prompt_id}
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt lock modifications
Modifies the current locked state of a prompt.
PUT /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and should not be passed in requests.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
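The lock request body described above can be sketched as follows. The field names locked and lock_type are assumptions, because the property names are not preserved in this text, and the function name is hypothetical. The locked-by value is deliberately left out, since the server computes it.

```python
LOCK_TYPES = {"edit", "governance"}  # allowable values listed above


def build_lock_body(locked, lock_type=None):
    """Body for PUT /v1/prompts/{prompt_id}/lock (field names assumed)."""
    if not isinstance(locked, bool):
        raise TypeError("locked must be a boolean")
    body = {"locked": locked}
    if lock_type is not None:
        if lock_type not in LOCK_TYPES:
            raise ValueError(f"lock_type must be one of {sorted(LOCK_TYPES)}")
        body["lock_type"] = lock_type
    return body


if __name__ == "__main__":
    print(build_lock_body(True, "edit"))
    print(build_lock_body(False))
```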
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and should not be passed in requests.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
OK - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get current prompt lock status
Retrieves the current locked state of a prompt.
GET /v1/prompts/{prompt_id}/lock
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and should not be passed in requests.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
OK - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get the inference input string for a given prompt
Computes the inference input string based on the state of a prompt. Optionally replaces template parameters.
POST /v1/prompts/{prompt_id}/input
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override input string that will be used to generate the response. The string can contain template parameters.
Possible values: Value must match regular expression
.*
Example:
Some text with variables.
Supply only to replace placeholders. Object content must be key:value pairs where the 'key' is the parameter to replace and 'value' is the value to use.
- prompt_variables
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
var1
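To make the key:value replacement described above concrete, the sketch below previews a substitution locally. It rests on an assumption that this text does not confirm: placeholders are written here as {name}, purely for illustration. The service performs the real substitution, and its syntax and handling of unknown placeholders may differ. The function name is hypothetical.

```python
import re

# Assumed placeholder form: {name}, with names limited to [a-zA-Z0-9-].
PLACEHOLDER_RE = re.compile(r"\{([a-zA-Z0-9-]+)\}")


def preview_input(template, prompt_variables):
    """Replace {key} placeholders with values; leave unknown keys untouched."""
    def substitute(match):
        key = match.group(1)
        return str(prompt_variables.get(key, match.group(0)))
    return PLACEHOLDER_RE.sub(substitute, template)


if __name__ == "__main__":
    text = "Some text with {var1} and {var2}."
    print(preview_input(text, {"var1": "variables"}))
```

Leaving unknown placeholders untouched makes missing values easy to spot in the preview.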
Response
The prompt's input string used for inferences.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Add a new chat item to a prompt
This adds new chat items to the given prompt.
POST /v1/prompts/{prompt_id}/chat_items
Request
Path Parameters
Prompt ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the space ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Request
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt session
This retrieves a prompt session with the given id.
GET /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Include the most recent entry
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
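As a rough illustration of the request above, the following Python sketch assembles the URL for retrieving a prompt session. It uses the Dallas base URL for prompts from the Endpoint URLs section and checks both identifiers against the documented pattern. The query parameter name `project_id` and the helper name `prompt_session_url` are assumptions made for this sketch; the reference only states that a project ID must be supplied as the target.

```python
import re
from urllib.parse import quote, urlencode

# Base URL for prompts in the Dallas region, as listed under "Endpoint URLs".
BASE_URL = "https://api.dataplatform.cloud.ibm.com/wx"

# Pattern the reference gives for the path and query identifiers.
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")


def prompt_session_url(session_id: str, project_id: str) -> str:
    """Build the URL for GET /v1/prompt_sessions/{session_id} (hypothetical helper)."""
    for label, value in (("session_id", session_id), ("project_id", project_id)):
        if not value or ID_PATTERN.fullmatch(value) is None:
            raise ValueError(f"{label} must be non-empty and match [a-zA-Z0-9-]*")
    # Assumption: the project target is sent as a `project_id` query parameter.
    query = urlencode({"project_id": project_id})
    return f"{BASE_URL}/v1/prompt_sessions/{quote(session_id, safe='')}?{query}"


url = prompt_session_url(
    "1c29d9a1-9ba6-422d-aa39-517b26adc147",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
)
print(url)
```

The resulting URL would be sent with the `Authorization: Bearer {token}` header described in the Authentication section.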
Update a prompt session
This updates a prompt session with the given id.
PATCH /v1/prompt_sessions/{session_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
An optional description for the prompt.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Response
Name used to display the prompt session.
Possible values: Value must match regular expression
^.{0,100}$
Example:
Session 1
The prompt session's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]{32}
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt session.
Possible values: Value must match regular expression
^[\s\S]{0,250}
Example:
My First Prompt Session
Time the session was created.
Example:
1711504485261
The ID of the original session creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the session was updated.
Example:
1711504485261
The ID of the last user that modified the session.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: 0 ≤ number of items ≤ 50
Status Code
OK - Returned when the update succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
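The length limits on the two updatable values can be checked on the client before a PATCH is sent. The sketch below applies the documented patterns (`^.{0,100}$` for the display name, `^[\s\S]{0,250}` for the description) and serializes the body. The JSON field names `name` and `description` and the helper name `session_patch_body` are assumptions; the reference does not name the fields in this section.

```python
import json
import re
from typing import Optional

# Patterns copied from the reference for the display name and the description.
NAME_PATTERN = re.compile(r"^.{0,100}$")
DESCRIPTION_PATTERN = re.compile(r"^[\s\S]{0,250}")


def session_patch_body(name: Optional[str] = None,
                       description: Optional[str] = None) -> str:
    """Return a JSON body for the update, raising ValueError on invalid input."""
    body = {}
    if name is not None:
        if NAME_PATTERN.fullmatch(name) is None:
            raise ValueError("name must be at most 100 characters on a single line")
        body["name"] = name  # assumed field name
    if description is not None:
        # fullmatch makes the 250-character cap strict even though the
        # documented pattern has no trailing `$`.
        if DESCRIPTION_PATTERN.fullmatch(description) is None:
            raise ValueError("description must be at most 250 characters")
        body["description"] = description  # assumed field name
    return json.dumps(body)


patch_body = session_patch_body(name="Session 1", description="My First Prompt Session")
print(patch_body)
```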
Delete a prompt session
This deletes a prompt session with the given id.
DELETE /v1/prompt_sessions/{session_id}
Add a new prompt to a prompt session
This creates a new prompt associated with the given session.
POST /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Allowable values: [structured, freeform, chat]
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
Time the prompt was created.
Example:
1711504485261
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
- prompt_variables
- any property
Input mode in use for the prompt
Possible values: [structured, freeform, chat]
Status Code
Created - Returned when created
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get entries for a prompt session
List entries from a given session.
GET /v1/prompt_sessions/{session_id}/entries
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Bookmark from a previously limited get request
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Limit for results to retrieve, default 20
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
- results
The prompt entry's ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
The prompt entry's name
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Name of an entry
The prompt entry's description
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
Description of an entry
The prompt entry's create time in millis
Example:
1711504485261
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
Success - Returned when search completes
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
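The bookmark and limit parameters suggest a page-by-page listing. A minimal sketch of how a client might walk all pages follows. It assumes that each response carries the entries under `results` (as listed above) and that the bookmark for the next page comes back in a field named `bookmark`; that field name, and the `fetch_page` callable standing in for the HTTP call, are hypothetical.

```python
from typing import Callable, Iterator, Optional


def iter_entries(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every entry, following bookmarks until no further page is offered."""
    bookmark = None
    while True:
        page = fetch_page(bookmark)
        yield from page.get("results", [])
        bookmark = page.get("bookmark")  # assumed response field name
        if not bookmark:
            break


# Stand-in for real HTTP calls: two fake pages keyed by the bookmark requested.
_FAKE_PAGES = {
    None: {"results": [{"id": "entry-1"}, {"id": "entry-2"}], "bookmark": "page-2"},
    "page-2": {"results": [{"id": "entry-3"}]},
}

entry_ids = [entry["id"] for entry in iter_entries(lambda bookmark: _FAKE_PAGES[bookmark])]
print(entry_ids)
```

Passing the page fetcher in as a callable keeps the paging logic separate from authentication and transport, so it can be exercised without network access.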
Add a new chat item to a prompt session entry
This adds new chat items to the given entry.
POST /v1/prompt_sessions/{session_id}/entries/{entry_id}/chat_items
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Allowable values: [question, answer]
Possible values: Value must match regular expression
.*
Example:
Some text
Allowable values: [ready, error]
Example:
1711504485261
Prompt session lock modifications
Modifies the current locked state of a prompt session.
PUT /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Override a lock if it is currently taken.
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Allowable values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned when lock change is successful
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
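To make the lock request more concrete, here is a sketch that prepares the method, path, and body for a lock change. The body field names `locked` and `lock_type`, the query parameter names `project_id` and `force`, and the helper `lock_request` are all assumptions for illustration; the reference describes these values but does not name them in this section.

```python
import json
from urllib.parse import urlencode

# Lock types allowed by the reference.
ALLOWED_LOCK_TYPES = {"edit", "governance"}


def lock_request(session_id: str, project_id: str, locked: bool,
                 lock_type: str = "edit", force: bool = False):
    """Return (method, path_with_query, json_body) for a lock change."""
    if lock_type not in ALLOWED_LOCK_TYPES:
        raise ValueError("lock_type must be 'edit' or 'governance'")
    params = {"project_id": project_id}  # assumed query parameter name
    if force:
        # Assumed name for the "override a lock if it is currently taken" flag.
        params["force"] = "true"
    path = f"/v1/prompt_sessions/{session_id}/lock?{urlencode(params)}"
    # The "locked by" value is computed by the server, so it is not sent.
    body = json.dumps({"locked": locked, "lock_type": lock_type})
    return "PUT", path, body


method, path, body = lock_request(
    "1c29d9a1-9ba6-422d-aa39-517b26adc147",
    "12ac4cf1-252f-424b-b52d-5cdd9814987f",
    locked=True,
    force=True,
)
print(method, path)
print(body)
```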
Get current prompt session lock status
Retrieves the current locked state of a prompt session.
GET /v1/prompt_sessions/{session_id}/lock
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
True if the prompt is currently locked.
Lock type: 'edit' for working on prompts/templates or 'governance'. Can only be supplied in PUT /lock requests.
Possible values: [edit, governance]
Locked by is computed by the server and shouldn't be passed.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Status Code
Ok - Returned on success
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
Get a prompt session entry
This retrieves a prompt session entry with the given id.
GET /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Response
Name used to display the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My Prompt
The prompt's id. This value cannot be set. It is returned in responses only.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
1c29d9a1-9ba6-422d-aa39-517b26adc147
An optional description for the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
My First Prompt
Time the prompt was created.
Example:
1711504485261
The ID of the original prompt creator.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Time the prompt was updated.
Example:
1711504485261
The ID of the last user that modified the prompt.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Example:
IBMid-000000YYY0
Possible values: number of items = 1, Value must match regular expression
[a-zA-Z0-9-]*
Input mode in use for the prompt
Possible values: [structured, freeform, chat, detached]
- model_version
User provided semantic version for tracking in IBM AI Factsheets
Possible values: Value must match regular expression
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Example:
2.0.0-rc.7
User provided tag.
Possible values: Value must match regular expression
.*
Example:
tag
Description of the version.
Possible values: Value must match regular expression
.*
Example:
Description of the model version.
- prompt_variables
- any property
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Status Code
OK - Returned from GET when it succeeds
Bad Request - Returned when the request parameters are invalid
Unauthorized - Returned when caller does not have a valid authorization token, or it is missing
No Sample Response
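The model_version value must match the semantic-version pattern shown above. The following sketch compiles that exact pattern and splits a version string into its parts; the helper name `parse_model_version` is hypothetical.

```python
import re

# Semantic-version pattern copied from the model_version description.
SEMVER_PATTERN = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)"
    r"(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"
)


def parse_model_version(value: str):
    """Return (major, minor, patch, prerelease, build) or None if invalid."""
    match = SEMVER_PATTERN.fullmatch(value)
    return match.groups() if match else None


parsed = parse_model_version("2.0.0-rc.7")
print(parsed)
print(parse_model_version("2.0"))
```

Leading zeros in the numeric parts (for example `01.0.0`) and missing components (for example `2.0`) are rejected by the pattern.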
Delete a prompt session entry
This deletes a prompt session entry with the given id.
DELETE /v1/prompt_sessions/{session_id}/entries/{entry_id}
Request
Path Parameters
Prompt Session ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Prompt Session Entry ID
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Query Parameters
[REQUIRED] Specifies the project ID as the target. One target must be supplied per request.
Possible values: Value must match regular expression
[a-zA-Z0-9-]*
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
POST /ml/v1/text/chat
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens.
text_chat
A text chat example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Who won the world series in 2020?"
},
{
"role": "assistant",
"content": "The Los Angeles Dodgers won the World Series in 2020."
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Where was it played?"
}
]
}
],
"max_tokens": 100,
"temperature": 0,
"time_limit": 1000
}
tool_call
A tool calling example.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is the weather like in Boston today?"
}
]
}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"description": "The city, e.g. San Francisco, CA",
"type": "string"
},
"unit": {
"enum": [
"celsius",
"fahrenheit"
],
"type": "string"
}
},
"required": [
"location"
]
}
}
}
],
"tool_choice": {
"type": "function",
"function": {
"name": "get_current_weather",
"description": "Get the current weather for a location.\nCall this whenever you need to know the weather,\nor for example when a customer asks 'What is the weather like in New York'\n"
}
}
}
json_mode
A text chat example with json output.
{
"model_id": "meta-llama/llama-3-8b-instruct",
"project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
"response_format": {
"type": "json_object"
},
"messages": [
{
"role": "system",
"content": "You are a helpful assistant designed to output JSON."
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "Who won the world series in 2020?"
}
]
}
]
}
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Using auto means the model can pick between generating a message or calling one or more tools. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default:
0
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples:{ "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default:
false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.
Default:
1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default:
1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default:
0
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example:
41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples:[ "this", "the" ]
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default:
1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default:
1
Time limit in milliseconds. If generation is not completed within this time, it will stop, and the text generated so far will be returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example:
600000
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "system", "content": "You are a helpful assistant." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] }, { "role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020." }, { "role": "user", "content": [ { "type": "text", "text": "Where was it played?" } ] } ], "max_tokens": 100, "temperature": 0, "time_limit": 1000 }'
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "What is the weather like in Boston today?" } ] } ], "tools": [ { "type": "function", "function": { "name": "get_current_weather", "description": "Get the current weather in a given location", "parameters": { "type": "object", "properties": { "location": { "description": "The city, e.g. San Francisco, CA", "type": "string" }, "unit": { "enum": [ "celsius", "fahrenheit" ], "type": "string" } }, "required": [ "location" ] } } } ], "tool_choice": { "type": "function", "function": { "name": "get_current_weather" } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/chat?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "meta-llama/llama-3-8b-instruct", "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f", "response_format": { "type": "json_object" }, "messages": [ { "role": "system", "content": "You are a helpful assistant designed to output JSON." }, { "role": "user", "content": [ { "type": "text", "text": "Who won the world series in 2020?" } ] } ] }'
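The first curl call above can be mirrored in Python with only the standard library. This is a sketch rather than the official client: it prepares the request object without sending it, uses the Dallas endpoint from the Endpoint URLs section as the cluster URL, and leaves `{token}` as a placeholder for an IAM access token. The helper name `build_chat_request` is hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Dallas endpoint from the "Endpoint URLs" section; substitute your region.
BASE_URL = "https://us-south.ml.cloud.ibm.com"


def build_chat_request(token: str, payload: dict, version: str = "2023-10-25") -> Request:
    """Prepare (but do not send) a POST to /ml/v1/text/chat."""
    url = f"{BASE_URL}/ml/v1/text/chat?{urlencode({'version': version})}"
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )


payload = {
    "model_id": "meta-llama/llama-3-8b-instruct",
    "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": [{"type": "text", "text": "Who won the world series in 2020?"}]},
    ],
    "max_tokens": 100,
    "time_limit": 1000,
}

request = build_chat_request("{token}", payload)
print(request.get_method(), request.full_url)
```

With a real token in place, passing `request` to `urllib.request.urlopen` would send it.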
Response
System details.
A unique identifier for the chat completion.
The model used for the chat completion.
Example:
google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
A text chat example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "The 2020 World Series was played at Globe Life Field in Arlington, Texas,\nwhich is the home stadium of the Texas Rangers.\nHowever, the series was played with no fans in attendance due to the COVID-19 pandemic.\n" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 47, "prompt_tokens": 59, "total_tokens": 106 } }
A tool calling example.
{ "id": "cmpl-15475d0dea9b4429a55843c77997f8a9", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "tool_calls": [ { "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99", "type": "function", "function": { "name": "get_current_weather", "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n" } } ] }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 18, "prompt_tokens": 19, "total_tokens": 37 } }
A text chat example with json output.
{ "id": "cmpl-09945b25c805491fb49e15439b8e5d84", "model_id": "meta-llama/llama-3-8b-instruct", "created": 1689958352, "created_at": "2023-07-21T16:52:32.190Z", "choices": [ { "index": 0, "message": { "role": "assistant", "content": "[\"The Los Angeles Dodgers won the World Series in 2020. They defeated the Tampa Bay Rays in six games.\"]" }, "finish_reason": "stop" } ], "usage": { "completion_tokens": 35, "prompt_tokens": 20, "total_tokens": 55 } }
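The sample responses above share one shape: the assistant message sits under `choices[n].message`, with either a `content` string or a `tool_calls` list whose `arguments` value is itself a JSON-encoded string. A minimal sketch of reading both cases follows; the helper name `summarize_choice` is hypothetical, and the sample data is trimmed from the tool calling example.

```python
import json


def summarize_choice(response: dict, index: int = 0):
    """Return (content, tool_calls) for one choice of a chat response."""
    message = response["choices"][index]["message"]
    tool_calls = [
        # `arguments` arrives as a JSON string, so it is decoded a second time.
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in message.get("tool_calls", [])
    ]
    return message.get("content"), tool_calls


# Trimmed copy of the tool calling sample response.
sample_response = {
    "id": "cmpl-15475d0dea9b4429a55843c77997f8a9",
    "model_id": "meta-llama/llama-3-8b-instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "tool_calls": [
                    {
                        "id": "chatcmpl-tool-ef093f0cbbff4c6a973aa0873f73fc99",
                        "type": "function",
                        "function": {
                            "name": "get_current_weather",
                            "arguments": "{\n \"location\": \"Boston, MA\",\n \"unit\": \"fahrenheit\"\n}\n",
                        },
                    }
                ],
            },
            "finish_reason": "stop",
        }
    ],
}

content, tool_calls = summarize_choice(sample_response)
print(content, tool_calls)
```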
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
POST /ml/v1/text/chat_stream
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
The model to use for the chat completion.
Please refer to the list of models.
The messages for this chat session.
Possible values: 1 ≤ number of items ≤ 1000
- messages
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Tool functions that can be called with the response.
Possible values: 1 ≤ number of items ≤ 128
Using auto means the model can pick between generating a message or calling one or more tools. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Specifying a particular tool via {"type": "function", "function": {"name": "my_function"}} forces the model to call that tool. Specify either tool_choice_option to allow the model to pick or tool_choice to force the model to call a tool.
Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Possible values: -2 < value < 2
Default:
0
Increasing or decreasing probability of tokens being selected during generation; a positive bias makes a token more likely to appear, while a negative bias makes it less likely.
Examples:{ "1003": -100, "1004": -100 }
Whether to return log probabilities of the output tokens or not. If true, returns the log probabilities of each output token returned in the content of message.
Default:
false
An integer specifying the number of most likely tokens to return at each token position, each with an associated log probability. The option logprobs must be set to true if this parameter is used.
Possible values: 0 ≤ value ≤ 20
The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.
Default:
1024
How many chat completion choices to generate for each input message. Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.
Default:
1
Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Possible values: -2 < value < 2
Default:
0
The chat response format parameters.
Random number generator seed to use in sampling mode for experimental repeatability.
Example:
41
Stop sequences are one or more strings which will cause the text generation to stop if/when they are produced as part of the output. Stop sequences encountered prior to the minimum number of tokens being generated will be ignored.
Possible values: 0 ≤ number of items ≤ 4, contains only unique items
Examples:[ "this", "the" ]
The sampling temperature to use. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
We generally recommend altering this or top_p but not both.
Possible values: 0 < value < 2
Default:
1
An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend altering this or temperature but not both.
Possible values: 0 < value < 1
Default:
1
Time limit in milliseconds. If generation is not completed within this time, it will stop, and the text generated so far will be returned along with the `TIME_LIMIT` stop reason. Depending on the user's plan and on the model being used, there may be an enforced maximum time limit.
Possible values: value > 0
Example:
600000
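Several of the limits listed above can be checked before a request is sent. The sketch below covers only the bounds that the reference states without ambiguity. The field names `messages`, `tools`, `logprobs`, and `time_limit` appear in this document; `frequency_penalty`, `presence_penalty`, `top_logprobs`, and `stop` are assumed names for parameters that are described here without a name. The temperature and top_p ranges are left out on purpose, because the sample request uses a temperature of 0 while the listed range excludes it.

```python
from typing import List


def check_chat_parameters(body: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    if not 1 <= len(body.get("messages", [])) <= 1000:
        problems.append("messages must contain between 1 and 1000 items")

    if "tools" in body and not 1 <= len(body["tools"]) <= 128:
        problems.append("tools must contain between 1 and 128 items")

    # Assumed field names for the two penalty parameters.
    for name in ("frequency_penalty", "presence_penalty"):
        if name in body and not -2 < body[name] < 2:
            problems.append(f"{name} must be strictly between -2 and 2")

    # Assumed field name `top_logprobs`; `logprobs` is named in the reference.
    if "top_logprobs" in body:
        if not 0 <= body["top_logprobs"] <= 20:
            problems.append("top_logprobs must be between 0 and 20")
        if body.get("logprobs") is not True:
            problems.append("top_logprobs requires logprobs to be true")

    # Assumed field name `stop` for the stop sequences.
    stop = body.get("stop", [])
    if len(stop) > 4 or len(set(stop)) != len(stop):
        problems.append("stop must hold at most 4 unique sequences")

    if "time_limit" in body and not body["time_limit"] > 0:
        problems.append("time_limit must be greater than 0")

    return problems


problems = check_chat_parameters(
    {
        "messages": [{"role": "user", "content": "Where was it played?"}],
        "top_logprobs": 5,
        "stop": ["this", "this"],
        "time_limit": 1000,
    }
)
for problem in problems:
    print(problem)
```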
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are an array of events of the form data: {<json event>}, where the schema of the individual json event is described below.
A unique identifier for the chat completion.
The model used for the chat completion.
Example:
google/flan-ul2
The Unix timestamp (in seconds) of when the chat completion was created.
A list of chat completion choices. Can be more than one if n is greater than 1.
Possible values: number of items ≥ 1
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
Usage statistics for the completion request.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
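Since no sample response is given, the following sketch shows one way a client might decode the `data: {<json event>}` lines of the stream. It assumes each `data:` line carries one complete JSON object, which simplifies the general server-sent events format, and it ignores every other line. The sample lines are invented for illustration and are not real service output.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON event from every `data:` line of an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields such as `id:`
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Invented sample stream; only the `data:` framing comes from the reference.
sample_stream = [
    "id: 1",
    'data: {"id": "cmpl-1", "model_id": "meta-llama/llama-3-8b-instruct", "choices": []}',
    "",
    "id: 2",
    'data: {"id": "cmpl-1", "model_id": "meta-llama/llama-3-8b-instruct", "choices": []}',
    "",
]

events = list(iter_sse_json(sample_stream))
print(len(events), events[0]["model_id"])
```

In a real client the `lines` argument would come from reading the HTTP response body line by line and decoding each line as UTF-8.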
Generate embeddings
Generate embeddings from text input.
See the documentation for a description of text embeddings.
POST /ml/v1/text/embeddings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The text input for a given model to be used to generate the embeddings.
A sample request.
A simple request.
{
"model_id": "slate",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
"Youth craves thrills while adulthood cherishes wisdom."
]
}
The id of the model to be used for this request. Please refer to the list of models.
The input text.
Possible values: number of items ≤ 1000
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either
space_id
orproject_id
has to be given.Possible values: length = 36, Value must match regular expression
^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Parameters for text embedding requests.
curl --request POST 'https://{cluster_url}/ml/v1/text/embeddings?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "inputs": [ "Youth craves thrills while adulthood cherishes wisdom.", "Youth seeks ambition while adulthood finds contentment.", "Dreams chased in youth while goals pursued in adulthood." ], "model_id": "slate", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The embedding values for a given text.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The number of input tokens that were consumed.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
An array of embeddings for each input string.
{ "model_id": "slate", "results": [ { "embedding": [ -0.006929283, -0.005336422, -0.024047505 ] } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 10 }
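The request constraints described above (at most 1000 inputs, a 36-character space or project ID, and at least one of the two IDs present) can be checked on the client before the call is made. The following is a minimal sketch only; the helper names `check_container_id` and `build_embeddings_body` are hypothetical and not part of the API or of any IBM library.

```python
import re

# Same character class as the documented regular expression ^[a-zA-Z0-9-]*$
ID_PATTERN = re.compile(r"[a-zA-Z0-9-]*")
MAX_INPUTS = 1000  # documented upper bound on the number of inputs


def check_container_id(value):
    """Return value if it is 36 characters long and matches ID_PATTERN."""
    if len(value) != 36 or ID_PATTERN.fullmatch(value) is None:
        raise ValueError(f"not a valid space or project id: {value!r}")
    return value


def build_embeddings_body(model_id, inputs, project_id=None, space_id=None):
    """Build the JSON body for POST /ml/v1/text/embeddings (hypothetical helper)."""
    if len(inputs) > MAX_INPUTS:
        raise ValueError(f"at most {MAX_INPUTS} inputs are allowed")
    if project_id is None and space_id is None:
        raise ValueError("either space_id or project_id has to be given")
    body = {"model_id": model_id, "inputs": list(inputs)}
    if project_id is not None:
        body["project_id"] = check_container_id(project_id)
    if space_id is not None:
        body["space_id"] = check_container_id(space_id)
    return body
```

The returned dictionary can be serialized with `json.dumps` and sent as the request body with a `Content-Type: application/json` header.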
Start a text extraction request
Start a request to extract text and metadata from documents.
See the documentation for a description of text extraction.
POST /ml/v1/text/extractions
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input for the text extraction request.
simple_request
A simple request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "connection_asset",
"connection": {
"id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
},
"location": {
"file_name": "files/document.pdf"
}
},
"results_reference": {
"type": "connection_asset",
"connection": {
"id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
},
"location": {
"file_name": "results"
}
},
"steps": {
"tables_processing": {
"enabled": true
}
}
}
ocr_request
A request that configures the OCR step.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"document_reference": {
"type": "connection_asset",
"connection": {
"id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d"
},
"location": {
"file_name": "files/document.pdf"
}
},
"results_reference": {
"type": "connection_asset",
"connection": {
"id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159"
},
"location": {
"file_name": "results"
}
},
"steps": {
"ocr": {
"languages_list": [
"en",
"fr"
]
},
"tables_processing": {
"enabled": false
}
}
}
A reference to data.
A reference to data.
The steps for the text extraction pipeline.
Set this as an empty object to specify json output. Note that this is not strictly required because if an assembly_md object is not found then the default will be json.
Default:
{}
Set this as an empty object to specify markdown output.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } } }'
curl --request POST 'https://{cluster_url}/ml/v1/text/extractions?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "ocr": { "languages_list": [ "en" ] }, "tables_processing": { "enabled": false } } }'
Response
The text extraction response.
Common metadata for a resource where project_id or space_id must be present.
The document details for the text extraction.
- entity
A reference to data.
A reference to data.
The current status of the text extraction.
The steps for the text extraction pipeline.
Set this as an empty object to specify json output. Note that this is not strictly required because if an assembly_md object is not found then the default will be json.
Set this as an empty object to specify markdown output.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Created. The Content-Location header will contain the URI reference to the created resource.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "ocr": { "languages_list": [ "en", "fr" ] }, "tables_processing": { "enabled": false } }, "results": { "status": "submitted", "number_pages_processed": 0 } } }
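The two sample requests above differ only in their steps object. As a minimal sketch, the body could be assembled as follows; the helper names `connection_reference` and `build_extraction_body` are hypothetical, and only the fields shown in the samples are covered.

```python
def connection_reference(connection_id, file_name):
    """Build a document_reference or results_reference of type connection_asset."""
    return {
        "type": "connection_asset",
        "connection": {"id": connection_id},
        "location": {"file_name": file_name},
    }


def build_extraction_body(project_id, document, results,
                          ocr_languages=None, tables=True):
    """Build the JSON body for POST /ml/v1/text/extractions (hypothetical helper)."""
    steps = {}
    if ocr_languages:
        # Only add the OCR step when languages were requested, as in ocr_request.
        steps["ocr"] = {"languages_list": list(ocr_languages)}
    steps["tables_processing"] = {"enabled": tables}
    return {
        "project_id": project_id,
        "document_reference": document,
        "results_reference": results,
        "steps": steps,
    }
```

Calling `build_extraction_body(..., ocr_languages=["en", "fr"], tables=False)` reproduces the shape of the ocr_request sample, while the defaults reproduce simple_request.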
Retrieve the text extraction requests
Retrieve the list of text extraction requests for the specified space or project.
This operation does not save the history; any requests that were deleted or purged will not appear in this list.
GET /ml/v1/text/extractions
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Response
A paginated list of resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
A list of resources.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "limit": 10, "first": { "href": "https://us-south.ml.cloud.ibm.com/ml/v1/text/extractions" }, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "results": { "status": "completed", "number_pages_processed": 3, "running_at": "2023-05-02T16:28:03Z", "completed_at": "2023-05-02T16:29:31Z" } } } ] }
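Because the pagination token is generated by the service and delivered inside the href of the next field, a client can page through the list by following that href rather than constructing the token itself. The sketch below assumes the caller supplies a `fetch_page` callable that takes a URL and returns the decoded JSON page; `iter_resources` and `fetch_json` are hypothetical names, and `fetch_json` is just one possible way to implement the callable with the standard library.

```python
import json
import urllib.request


def fetch_json(url, token):
    """GET a URL with a bearer token and decode the JSON body (illustrative only)."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_resources(fetch_page, first_url):
    """Yield every resource, following next.href until no next page is reported."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The next field is only present when another page exists.
        url = (page.get("next") or {}).get("href")
```

A call such as `iter_resources(lambda url: fetch_json(url, token), first_url)` would walk all pages, where `first_url` already carries the version, the limit, and the space_id or project_id query parameters.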
Get the results of the request
Retrieve the text extraction request with the specified identifier.
Note that there is a retention period of 2 days. If this retention period is exceeded then the request will be deleted and the results will no longer be available. In this case this operation will return 404.
GET /ml/v1/text/extractions/{id}
Request
Path Parameters
The identifier of the extraction request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
curl --request GET 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Accept: application/json'
Response
The text extraction response.
Common metadata for a resource where project_id or space_id must be present.
The document details for the text extraction.
- entity
A reference to data.
A reference to data.
The current status of the text extraction.
The steps for the text extraction pipeline.
Set this as an empty object to specify json output. Note that this is not strictly required because if an assembly_md object is not found then the default will be json.
Set this as an empty object to specify markdown output.
Example:
{}
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
Optional details coming from the service and related to the API call or the associated resource.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "created_at": "2023-05-02T16:27:51Z", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "name": "extract" }, "entity": { "document_reference": { "type": "connection_asset", "connection": { "id": "6f5688fd-f3bf-42c2-a18b-49c0d8a1920d" }, "location": { "file_name": "files/document.pdf" } }, "results_reference": { "type": "connection_asset", "connection": { "id": "2a7c11bc-2913-48d0-9581-a8d9f40fa159" }, "location": { "file_name": "results" } }, "steps": { "tables_processing": { "enabled": true } }, "results": { "status": "running", "number_pages_processed": 2, "running_at": "2023-05-02T16:28:03Z" } } }
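Since a text extraction runs asynchronously, a client typically polls this operation until the status leaves the in-progress states. The sketch below is an assumption-laden illustration: it treats "submitted" and "running" (the values that appear in the sample responses) as in progress and any other status as final, and it takes a caller-supplied `fetch` callable that returns the decoded response of GET /ml/v1/text/extractions/{id}. The name `wait_for_extraction` and its parameters are hypothetical.

```python
import time

# Statuses seen in the sample responses while work is still pending.
IN_PROGRESS = {"submitted", "running"}


def wait_for_extraction(fetch, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll fetch() until the extraction status is no longer in progress."""
    for _ in range(max_polls):
        response = fetch()
        status = response["entity"]["results"]["status"]
        if status not in IN_PROGRESS:
            return response  # any other status is treated as final (assumption)
        sleep(poll_seconds)
    raise TimeoutError(f"extraction still in progress after {max_polls} polls")
```

The `sleep` parameter exists only so that the delay can be replaced in tests; the 2-day retention period still applies, so results should be read soon after completion.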
Delete the request
Cancel the specified text extraction request and delete any associated results.
DELETE /ml/v1/text/extractions/{id}
Request
Path Parameters
The identifier of the extraction request.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
curl --request DELETE 'https://{cluster_url}/ml/v1/text/extractions/{id}?version=2023-10-25&project_id=12ac4cf1-252f-424b-b52d-5cdd9814987f' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...'
Infer text
Infer the next tokens for a given deployed model with a set of parameters.
POST /ml/v1/text/generation
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples:
{ "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples:
{ "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/text/generation?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
System details.
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for values case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples:[ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but only for decoder-only models.
Possible values: number of items ≥ 1
Examples:[ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The generated text from the model along with other details.
{ "model_id": "google/flan-ul2", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "4,000 km", "generated_token_count": 4, "input_token_count": 12, "stop_reason": "eos_token" } ] }
The generated text from the model along with other details.
{ "model_id": "google/flan-t5-xl", "created_at": "2023-07-21T16:52:32.190Z", "results": [ { "generated_text": "c/o USPS, PO Box 3000, Washington, D.C. 20001-5000, www.usps.com, or call **************. You can also visit the website at https://www.usps.com/contactus/. You can also contact them by telephone at 1-************. You can also send an email to ***************. You can find the US Postal Service on Facebook at https://www.facebook.com/postalservice/.", "generated_token_count": 118, "input_token_count": 11, "stop_reason": "eos_token", "moderations": { "pii": [ { "score": 0.8, "input": false, "position": { "start": 74, "end": 88 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 200, "end": 212 }, "entity": "PhoneNumber" }, { "score": 0.8, "input": false, "position": { "start": 244, "end": 259 }, "entity": "EmailAddress" } ] } } ] }
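Because stop_reason should be compared case-insensitively, it helps to normalize it once when reading a response. The following sketch does that and flags results that ended on a limit; treating max_tokens, time_limit and token_limit as "cut short" is this example's own interpretation of the list above, and `summarize_generation` is a hypothetical helper name.

```python
# Stop reasons this sketch interprets as "the output was cut short by a limit".
LIMIT_REASONS = {"max_tokens", "time_limit", "token_limit"}


def summarize_generation(response):
    """Return one summary dict per entry in the results array of the response."""
    summaries = []
    for result in response.get("results", []):
        # Normalize so that "EOS_TOKEN" and "eos_token" compare equal.
        reason = str(result.get("stop_reason", "")).lower()
        summaries.append({
            "text": result.get("generated_text", ""),
            "stop_reason": reason,
            "cut_short": reason in LIMIT_REASONS,
            "tokens": result.get("generated_token_count"),
        })
    return summaries
```

A caller might, for example, retry with a larger max_new_tokens when `cut_short` is true.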
Infer text event stream
Infer the next tokens for a given deployed model with a set of parameters. This operation will return the output tokens as a stream of events.
POST /ml/v1/text/generation_stream
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
From a given prompt, infer the next tokens in a server-sent events (SSE) stream.
A request without moderations.
A simple request.
{
"model_id": "google/flan-ul2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Generate a marketing email advertising a new sale with the following characteristics:\n\nCompany: Swimwear Unlimited\n\nOffer Keywords: {Select customers only, mid-summer fun, swimwear sale}\n\nOffer End Date: July 15\n\nAdvertisement Tone: Exciting!\n\nInclude no URLs.\n\nInclude no telephone numbers.\n",
"parameters": {
"temperature": 0.8,
"max_new_tokens": 30
}
}
A request with moderations.
A simple request with moderations.
{
"model_id": "google/flan-t5-xl",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"input": "Tell me how to reach the US Postal service",
"parameters": {
"max_new_tokens": 120,
"min_new_tokens": 100,
"repetition_penalty": 2
},
"moderations": {
"hap": {
"output": {
"enabled": true,
"threshold": 0.5
}
},
"pii": {
"output": {
"enabled": true
},
"mask": {
"remove_entity_value": true
}
}
}
}
The prompt to generate completions. Note: The method tokenizes the input internally. It is recommended not to leave any trailing spaces.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
Properties that control the model and response.
Examples:
{ "temperature": 0.8, "top_p": 0.5, "top_k": 50, "random_seed": 111, "repetition_penalty": 2, "min_new_tokens": 30, "max_new_tokens": 50 }
Properties that control the moderations, for usages such as Hate and profanity (HAP) and Personal identifiable information (PII) filtering. This list can be extended with new types of moderations.
Examples:
{ "hap": { "output": { "enabled": true, "threshold": 0.5 } }, "pii": { "output": { "enabled": true }, "mask": { "remove_entity_value": true } } }
- moderations
curl --request POST 'https://{cluster_url}/ml/v1/text/generation_stream?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: text/event-stream' --data-raw '{ "model_id": "google/flan-t5-xxl", "input": "how far is paris from bangalore:", "parameters": { "max_new_tokens": 100, "time_limit": 1000 }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
A set of server-sent events, where each event contains a response for one or more tokens. The results are a sequence of events of the form data: {<json event>}, where the schema of the individual JSON event is described below.
The id of the model for inference.
Example:
google/flan-ul2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The generated tokens.
Possible values: number of items ≥ 1
- results
The text that was generated by the model.
Example:
Swimwear Unlimited- Mid-Summer Sale! ...
The reason why the call stopped, can be one of:
- not_finished - Possibly more tokens to be streamed.
- max_tokens - Maximum requested tokens reached.
- eos_token - End of sequence token encountered.
- cancelled - Request canceled by the client.
- time_limit - Time limit reached.
- stop_sequence - Stop sequence encountered.
- token_limit - Token limit reached.
- error - Error encountered.
Note that these values will be lower-cased, so test for values case-insensitively.
Possible values: [not_finished, max_tokens, eos_token, cancelled, time_limit, stop_sequence, token_limit, error]
Example:
token_limit
The number of generated tokens.
Example:
3
The number of input tokens consumed.
Example:
11
The seed used, if it exists.
Example:
42
The list of individual generated tokens. Extra token information is included based on the other flags in the return_options of the request.
Possible values: number of items ≥ 1
Examples:[ { "text": "_", "rank": 1, "logprob": -2.5, "top_tokens": [ { "text": "_", "logprob": -2.5 }, { "text": "_2", "logprob": -3.1777344 } ] }, { "text": "4,000", "rank": 1, "logprob": -3.0957031, "top_tokens": [ { "text": "4,000", "logprob": -3.0957031 }, { "text": "57", "logprob": -3.3691406 } ] } ]
The list of input tokens. Extra token information is included based on the other flags in the return_options of the request, but only for decoder-only models.
Possible values: number of items ≥ 1
Examples:[ { "text": "_how" }, { "text": "_far" }, { "text": "_is" }, { "text": "</s>" } ]
The result of any detected moderations.
- moderations
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation (Content-Type: text/event-stream).
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
No Sample Response
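As no sample response is given for the stream, here is a minimal sketch of how a client might read events of the form data: {<json event>} and assemble the generated text. It rests on two assumptions that the text above does not confirm: each JSON event sits on a single data: line, and each event's generated_text holds only the newly generated fragment. The function names `iter_sse_json` and `collect_text` are hypothetical.

```python
import json


def iter_sse_json(lines):
    """Yield the decoded JSON payload of every 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.rstrip("\r\n")
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields such as id/event
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


def collect_text(lines):
    """Concatenate generated_text fragments and return (text, last stop_reason)."""
    parts = []
    stop_reason = None
    for event in iter_sse_json(lines):
        for result in event.get("results", []):
            parts.append(result.get("generated_text", ""))
            # Compare stop reasons case-insensitively, as the schema advises.
            stop_reason = str(result.get("stop_reason", "")).lower() or stop_reason
    return "".join(parts), stop_reason
```

The `lines` argument can be any iterable of text or bytes lines, for example an HTTP response object that is iterated line by line.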
Rerank texts
Rerank the input texts against a query.
POST /ml/v1/text/rerank
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input texts and the queries for reranking.
A sample request.
{
"model_id": "cross-encoder/ms-marco-minilm-l-12-v2",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"inputs": [
{
"text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I've come to appreciate the comforting stability of a well-established routine."
},
{
"text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life's novelties, while as a responsible adult, I've come to understand the profound value of accumulated wisdom and life experience."
}
],
"query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.",
"parameters": {
"return_options": {
"top_n": 2
}
}
}
The id of the model to be used for this request. Please refer to the list of models.
The rank input strings.
Possible values: 0 ≤ number of items ≤ 1000
The rank query.
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The properties used for reranking.
curl --request POST 'https://{cluster_url}/ml/v1/text/rerank?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "inputs": [ { "text": "In my younger years, I often reveled in the excitement of spontaneous adventures and embraced the thrill of the unknown, whereas in my grownup life, I'\''ve come to appreciate the comforting stability of a well-established routine." }, { "text": "As a young man, I frequently sought out exhilarating experiences, craving the adrenaline rush of life'\''s novelties, while as a responsible adult, I'\''ve come to understand the profound value of accumulated wisdom and life experience." } ], "query": "As a Youth, I craved excitement while in adulthood I followed Enthusiastic Pursuit.", "parameters": { "return_options": { "top_n": 2 } } }'
Response
System details.
The id of the model to be used for this request. Please refer to the list of models.
The ranked results.
Possible values: number of items ≥ 0
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The number of input tokens that were consumed.
The model version (using semantic versioning) if set.
Possible values: 5 ≤ length ≤ 20, Value must match regular expression
^\d+.\d+.\d+$
The rank query, if requested.
Optional details coming from the service and related to the API call or the associated resource.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The ranked results for the input strings.
{ "model_id": "cross-encoder/ms-marco-minilm-l-12-v2", "results": [ { "index": 1, "score": 0.7461 }, { "index": 0, "score": 0.8274 } ], "created_at": "2024-02-21T17:32:28Z", "input_token_count": 20 }
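To use the ranked results, a client usually maps each entry back to the text it was computed for. The sketch below assumes that index is the zero-based position of the text in the inputs array of the request, which the sample suggests but the schema text above does not state, and it sorts by score on the client side rather than relying on the order of the results array. `rank_inputs` is a hypothetical helper name.

```python
def rank_inputs(inputs, response):
    """Return (text, score) pairs ordered from highest to lowest score."""
    ranked = sorted(response["results"], key=lambda item: item["score"], reverse=True)
    # Assumption: item["index"] points into the request's inputs array.
    return [(inputs[item["index"]]["text"], item["score"]) for item in ranked]
```

For the sample response above this puts the input at index 0 (score 0.8274) ahead of the input at index 1 (score 0.7461).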
Text tokenization
The text tokenize operation allows you to check the conversion of provided input to tokens for a given model. It splits text into words or sub-words, which are then converted to IDs through a look-up table (vocabulary). Tokenization allows the model to have a reasonable vocabulary size.
POST /ml/v1/text/tokenization
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The input string to tokenize.
A sample request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"model_id": "google/flan-ul2",
"input": "Write a tagline for an alumni association: Together we",
"parameters": {
"return_tokens": true
}
}
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The input string to tokenize.
Example:
Write a tagline for an alumni association: Together we
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The parameters for text tokenization.
curl --request POST 'https://{cluster_url}/ml/v1/text/tokenization?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "model_id": "google/flan-ul2", "input": "Write a tagline for an alumni association: Together we", "parameters": { "return_tokens": true }, "project_id": "63dc4cf1-252f-424b-b52d-5cdd9814987f" }'
Response
The tokenization result.
The id of the model to be used for this request. Please refer to the list of models.
Example:
google/flan-ul2
The result of tokenizing the input string.
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
The response with the token count and the tokens, if requested.
{ "model_id": "google/flan-ul2", "result": { "token_count": 11, "tokens": [ "Write", "a", "tag", "line", "for", "an", "alumni", "associ", "ation:", "Together", "we" ] } }
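One common use of the token count is to check, before calling text generation, whether a prompt leaves room for the requested output. The sketch below assumes that a model's context window covers both input and generated tokens and that the caller knows the window size for the chosen model; neither value comes from this API response, and `prompt_fits` is a hypothetical helper name.

```python
def prompt_fits(tokenization_response, max_new_tokens, context_window):
    """Return True if prompt tokens plus max_new_tokens fit in context_window."""
    used = tokenization_response["result"]["token_count"]
    return used + max_new_tokens <= context_window
```

With the sample response above (token_count of 11), a request for 30 new tokens would fit a window of 4096 tokens but not a window of 40.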
Time series forecast
Generate forecasts, or predictions for future time points, given historical time series data.
POST /ml/v1/time_series/forecast
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The forecast request.
A sample request.
{
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"model_id": "ibm/ttm-1024-96-r2",
"schema": {
"timestamp_column": "date",
"id_columns": [
"ID1"
]
},
"data": {
"date": [
"2020-01-01T00:00:00",
"2020-01-01T01:00:00",
"2020-01-05T01:00:00"
],
"ID1": [
"D1",
"D1",
"D1"
],
"TARGET1": [
1.46,
2.34,
4.55
]
}
}
The model to be used for generating a forecast.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^\S+$
A payload of data matching schema. We assume the following about your data:
- All timeseries are of equal length and are uniform in nature (the time difference between two successive rows is constant). This implies that there are no missing rows of data;
- The data meet the minimum model-dependent historical context length, which can be 512 or more rows per timeseries.
Note that the example payloads shown are for illustration purposes only. An actual payload would necessarily be much larger to meet minimum model-specific context lengths.
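The assumptions listed above can be checked locally before a request is sent. The following is a rough sketch only: `check_series` is a hypothetical helper, it assumes the column-oriented payload layout of the sample request, and it assumes the rows of each series are already in time order. The `min_rows` argument is left to the caller because the minimum context length is model-dependent.

```python
from datetime import datetime


def check_series(data, timestamp_column, id_columns=(), min_rows=None):
    """Return a list of human-readable problems; an empty list means none found."""
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        return [f"columns differ in length: {lengths}"]

    # Group timestamps by the values of the id columns (one group per series).
    series = {}
    for i, raw in enumerate(data[timestamp_column]):
        key = tuple(data[col][i] for col in id_columns)
        series.setdefault(key, []).append(datetime.fromisoformat(raw))

    problems = []
    if len({len(ts) for ts in series.values()}) > 1:
        problems.append("time series are not of equal length")
    for key, ts in series.items():
        # Uniform spacing means every successive difference is identical.
        steps = {later - earlier for earlier, later in zip(ts, ts[1:])}
        if len(steps) > 1:
            problems.append(f"series {key} has non-uniform spacing")
        if min_rows is not None and len(ts) < min_rows:
            problems.append(f"series {key} has fewer than {min_rows} rows")
    return problems
```

Run against the three-row sample payload, this check reports non-uniform spacing, which is consistent with the note that the sample is for illustration only.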
- data
Contains metadata about your timeseries data input.
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The parameters for the forecast request.
curl --request POST 'https://{cluster_url}/ml/v1/time_series/forecast?version=2023-10-25' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' -d '{ "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "model_id": "ibm/ttm-1024-96-r2", "schema": { "timestamp_column": "date", "id_columns": [ "ID1" ] }, "data": { "date": [ "2020-01-01T00:00:00", "2020-01-01T01:00:00", "2020-01-05T01:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.46, 2.34, 4.55 ] } }'
Response
The time series forecast response.
The model used to generate the forecast.
Example:
ibm/ttm-1024-96-r2
The time when the response was created in ISO 8601 format.
Example:
2024-01-01T02:00:00
The list of prediction results. There will be a forecast for each time series in the input data. The prediction_length field in the request specifies the number of predictions in the results. The actual number of rows in the results will be equal to the prediction length multiplied by the number of unique ids in id_columns. The timestamp_column field in the request indicates the name of the timestamp column in the results.
Examples:
[ { "date": [ "2020-01-01T03:00:00", "2020-01-01T04:00:00", "2020-01-01T05:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.78, 6.78 ] } ]
- results
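The row-count rule for the results can be written out as a short calculation. This is an illustrative sketch: `expected_result_rows` is a hypothetical helper, and it assumes that "unique ids" means the distinct combinations of values across all id columns in a column-oriented payload.

```python
def expected_result_rows(data, id_columns, prediction_length):
    """Rows expected in the results: prediction_length times the number of series."""
    row_count = len(next(iter(data.values()))) if data else 0
    # One series per distinct combination of id column values.
    unique_ids = {tuple(data[col][i] for col in id_columns) for i in range(row_count)}
    return prediction_length * len(unique_ids)
```

For the sample request, which has a single id value `D1`, a prediction length of 3 would give 3 rows, matching the three entries per column in the example results.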
Status Code
Successful operation
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "model_id": "ibm/ttm-1024-96-r2", "created_at": "2024-01-01T02:00:00", "results": [ { "date": [ "2020-01-05T02:00:00", "2020-01-05T03:00:00", "2020-01-06T00:00:00" ], "ID1": [ "D1", "D1", "D1" ], "TARGET1": [ 1.86, 3.24, 6.78 ] } ] }
Create a new watsonx.ai training
Create a new watsonx.ai training in a project or a space.
The details of the base model and parameters for the training
must be provided in the prompt_tuning
object.
To deploy the tuned model, follow these steps:
-
Create a WML model asset, in a space or a project, by providing the request.json as shown below:
curl -X POST "https://{cpd_cluster}/ml/v4/models?version=2024-01-29" \
  -H "Authorization: Bearer <replace with your token>" \
  -H "content-type: application/json" \
  --data '{ "name": "replace_with_a_meaningful_name", "space_id": "replace_with_your_space_id", "type": "prompt_tune_1.0", "software_spec": { "name": "watsonx-textgen-fm-1.0" }, "metrics": [ from the training job ], "training": { "id": "05859469-b25b-420e-aefe-4a5cb6b595eb", "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "generation", "verbalizer": "Input: {{input}} Output:" }, "training_data_references": [ { "connection": { "id": "20933468-7e8a-4706-bc90-f0a09332b263" }, "id": "file_to_tune1.json", "location": { "bucket": "wxproject-donotdelete-pr-xeyivy0rx3vrbl", "path": "file_to_tune1.json" }, "type": "connection_asset" } ] }'
Notes:
- If you used the training request field auto_update_model: true then you can skip this step, as the model will have been saved at the end of the training job.
- Rather than creating the payload for the model, you can use the generated request.json that was stored in the results_reference field; look for the path in the field entity.results_reference.location.model_request_path.
- The model type must be prompt_tune_1.0.
- The software spec name must be watsonx-textgen-fm-1.0.
-
Create a tuned model deployment as described in the create deployment documentation.
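The notes for the first step can be sketched as a small lookup on a training resource. The helper names `needs_manual_model_asset` and `model_request_path` are hypothetical; the field paths are the ones named in the notes, and the input is assumed to be a training resource already parsed from JSON.

```python
def needs_manual_model_asset(training):
    """True when auto_update_model was not set, so the model asset must be created by hand."""
    return not training.get("entity", {}).get("auto_update_model", False)


def model_request_path(training):
    """Return entity.results_reference.location.model_request_path, or None if absent."""
    location = (
        training.get("entity", {})
        .get("results_reference", {})
        .get("location", {})
    )
    return location.get("model_request_path")
```

When `needs_manual_model_asset` returns true, the path from `model_request_path` points at the generated request.json, which can be used as the payload in place of a hand-written one.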
POST /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The training_data_references field contains the training datasets, and results_reference identifies the connection where the results will be stored.
Start a prompt tune training job.
{
"name": "my-prompt-tune-training",
"project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f",
"prompt_tuning": {
"base_model": {
"model_id": "google/flan-t5-xl"
},
"tuning_type": "prompt_tuning",
"task_id": "classification",
"num_epochs": 30,
"learning_rate": 0.4,
"accumulate_steps": 3,
"batch_size": 10,
"max_input_tokens": 100,
"max_output_tokens": 100
},
"training_data_references": [
{
"id": "tune1_data.json",
"location": {
"path": "tune1_data.json"
},
"type": "container"
}
],
"auto_update_model": true,
"results_reference": {
"location": {
"path": "tune1/results"
},
"type": "container"
}
}
The name of the training.
Example:
my-prompt-training
The training results. Normally this is specified as type=container (Service) or type=fs (Software), which means that it is stored in the space or project.
Examples:
{ "location": { "path": "results" }, "type": "container" }
- results_reference
The data source type, such as connection_asset, container (Service) or fs (Software).
Allowable values: [connection_asset, container]
Example:
connection_asset
Contains a set of fields that describe the location of the data with respect to the connection.
Examples:
{ "bucket": "wml-v4-fvt-remote-tests", "file_name": "heart_testpy379.csv" }
- location
Item identification inside a collection.
Contains a set of fields specific to each connection. See here for details about specifying connections.
Examples:
{ "id": "2d07a6b4-8fa9-43ab-91c8-befcd9dab8d2" }
The space that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
3fc54cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
12ac4cf1-252f-424b-b52d-5cdd9814987f
A description of the training.
Example:
My prompt training.
A list of tags for this resource.
Possible values: number of items ≤ 64
Examples:
[ "t1", "t2" ]
Properties to control the prompt tuning.
Examples:
{ "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "summarization", "tuning_type": "prompt_tuning", "num_epochs": 30, "learning_rate": 0.4, "accumulate_steps": 3, "batch_size": 10, "max_input_tokens": 100, "max_output_tokens": 100 }
Training datasets.
Examples:
[ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ]
User defined properties specified as key-value pairs.
Examples:
{ "name": "model", "size": 2 }
- custom
If set to true then the result of the training, if successful, will be uploaded to the repository as a model.
Default:
false
Example:
true
curl --request POST 'https://{cluster_url}/ml/v4/trainings?version=2023-05-02' -H 'Authorization: Bearer eyJhbGciOiJSUzUxM...' -H 'Content-Type: application/json' -H 'Accept: application/json' --data-raw '{ "name": "my-prompt-tune-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification", "tuning_type": "prompt_tuning", "num_epochs": 30, "learning_rate": 0.4, "accumulate_steps": 3, "batch_size": 10, "max_input_tokens": 100, "max_output_tokens": 100 }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results" }, "type": "container" } }'
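The request body used in the curl call above can also be assembled programmatically. The sketch below is illustrative: `build_prompt_tuning_training` is a hypothetical helper, it covers only the container-type references shown in the sample, and it performs no validation of the tuning options it passes through.

```python
import json


def build_prompt_tuning_training(name, project_id, model_id, task_id,
                                 data_path, results_path, *,
                                 auto_update_model=False, **tuning_options):
    """Build a body for POST /ml/v4/trainings using container-type references."""
    prompt_tuning = {
        "base_model": {"model_id": model_id},
        "tuning_type": "prompt_tuning",
        "task_id": task_id,
    }
    # Optional settings such as num_epochs or learning_rate are passed through as given.
    prompt_tuning.update(tuning_options)
    return {
        "name": name,
        "project_id": project_id,
        "prompt_tuning": prompt_tuning,
        "training_data_references": [
            {"id": data_path, "location": {"path": data_path}, "type": "container"}
        ],
        "auto_update_model": auto_update_model,
        "results_reference": {"location": {"path": results_path}, "type": "container"},
    }


if __name__ == "__main__":
    body = build_prompt_tuning_training(
        "my-prompt-tune-training",
        "12ac4cf1-252f-424b-b52d-5cdd9814987f",
        "google/flan-t5-xl",
        "classification",
        "tune1_data.json",
        "tune1/results",
        auto_update_model=True,
        num_epochs=30,
        learning_rate=0.4,
    )
    print(json.dumps(body, indent=2))
```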
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples:
{ "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
Status of the training job.
Examples:{ "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" } }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
The training job has been created.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } }
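The `metrics` array in the example response can be reduced to a loss curve with a few lines of Python. This is a sketch only: `loss_by_iteration` and `lowest_loss` are hypothetical helpers, and they assume each metrics entry carries `iteration` and `ml_metrics.loss` as in the example.

```python
def loss_by_iteration(training):
    """Return [(iteration, loss), ...] from entity.status.metrics, in response order."""
    metrics = training.get("entity", {}).get("status", {}).get("metrics", [])
    return [(m["iteration"], m["ml_metrics"]["loss"]) for m in metrics]


def lowest_loss(training):
    """Return the (iteration, loss) pair with the smallest loss, or None if no metrics."""
    curve = loss_by_iteration(training)
    return min(curve, key=lambda pair: pair[1]) if curve else None
```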
Retrieve the list of trainings
Retrieve the list of trainings for the specified space or project.
GET /ml/v4/trainings
Request
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
Token required for token-based pagination. This token cannot be determined by the end user. It is generated by the service and is set in the href available in the next field.
How many resources should be returned. By default limit is 100. Max limit allowed is 200.
Possible values: 1 ≤ value ≤ 200
Default:
100
Example:
50
Compute the total count. May have performance impact.
Return only the resources with the given tag value.
Filter based on the training job state.
Allowable values: [queued, pending, running, storing, completed, failed, canceled]
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
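The constraints on these query parameters can be enforced when the query string is built. The sketch below is a hypothetical helper, not part of the API. The names `version`, `limit`, `total_count`, `space_id` and `project_id` appear in this reference; the name `state` for the state filter is an assumption, since this section describes the filter without naming the parameter. The sketch also assumes exactly one of `project_id` or `space_id` is given.

```python
from urllib.parse import urlencode

# Allowable values for the training job state filter, as listed above.
ALLOWED_STATES = ("queued", "pending", "running", "storing",
                  "completed", "failed", "canceled")


def trainings_query(version, *, project_id=None, space_id=None,
                    limit=None, state=None, total_count=False):
    """Build the query string for GET /ml/v4/trainings (hypothetical helper)."""
    if (project_id is None) == (space_id is None):
        raise ValueError("give exactly one of project_id or space_id")
    if limit is not None and not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    if state is not None and state not in ALLOWED_STATES:
        raise ValueError(f"state must be one of {ALLOWED_STATES}")

    params = [("version", version)]
    if project_id is not None:
        params.append(("project_id", project_id))
    else:
        params.append(("space_id", space_id))
    if limit is not None:
        params.append(("limit", str(limit)))
    if state is not None:
        # Assumption: the state filter is passed as a parameter named "state".
        params.append(("state", state))
    if total_count:
        params.append(("total_count", "true"))
    return urlencode(params)
```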
Response
Information for paging when querying resources.
The number of items to return in each page.
Possible values: 1 ≤ value ≤ 200
Example:
10
The reference to the first item in the current page.
The total number of resources. Computed explicitly only when 'total_count=true' query parameter is present. This is in order to avoid performance penalties.
Example:
1
A reference to the first item of the next page, if any.
The training resources.
Optional details coming from the service and related to the API call or the associated resource.
Examples:{ "warnings": [ { "message": "This model is a Non-IBM Product governed by a third-party license that may impose use restrictions and other obligations.", "id": "DisclaimerWarning" } ] }
- system
Any warnings coming from the system.
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "limit": 100, "first": { "href": "https://{cluster_url}/ml/v4/trainings" }, "total_count": 1, "resources": [ { "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } } ] }
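Because the pagination token is only available through the href in the `next` field, a client can walk the list by following that href until it is absent. The following is a minimal sketch: `iter_trainings` is a hypothetical helper, and `fetch_page` is a caller-supplied function (an assumption of this example) that performs the authenticated GET for a URL and returns the parsed JSON.

```python
def iter_trainings(fetch_page, first_url):
    """Yield every training resource, following next.href from page to page."""
    url = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("resources", [])
        # The last page has no "next" field, which ends the loop.
        url = (page.get("next") or {}).get("href")
```

Keeping the HTTP call behind `fetch_page` leaves authentication and error handling to the caller and makes the loop easy to exercise with canned pages.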
Retrieve the training
Retrieve the training with the specified identifier.
GET /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Response
Training resource.
Common metadata for a resource where project_id or space_id must be present.
Examples:
{ "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "rev": "2", "owner": "guy", "created_at": "2020-05-02T16:27:51Z", "modified_at": "2020-05-02T16:30:51Z", "parent_id": "dfe1cf1-252f-424b-b52d-5cdd9814600c", "name": "my-name", "description": "My resource", "tags": [ "t1", "t2" ], "commit_info": { "committed_at": "2020-05-02T16:27:51Z", "commit_message": "Updated to TF 2.0" }, "space_id": "3fc54cf1-252f-424b-b52d-5cdd9814987f" }
Status of the training job.
Examples:{ "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" } }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "message": { "level": "info", "text": "Training job 360c40f7-ac0c-43ca-a95f-1a5421f93b82 completed" } } }
Status Code
OK.
Bad request, the response body should contain the reason.
Unauthorized.
Forbidden, an authentication error including trying to access an unauthorized space or project.
The specified resource was not found.
{ "metadata": { "id": "6213cf1-252f-424b-b52d-5cdd9814956c", "name": "my-prompt-training", "project_id": "12ac4cf1-252f-424b-b52d-5cdd9814987f", "created_at": "2023-08-04T13:22:47.000Z" }, "entity": { "prompt_tuning": { "base_model": { "model_id": "google/flan-t5-xl" }, "task_id": "classification" }, "training_data_references": [ { "id": "tune1_data.json", "location": { "path": "tune1_data.json" }, "type": "container" } ], "auto_update_model": true, "results_reference": { "location": { "path": "tune1/results", "training": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82", "training_status": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/training-status.json", "assets_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets", "model_request_path": "tune1/results/360c40f7-ac0c-43ca-a95f-1a5421f93b82/assets/c29e7544-dfd0-4427-bc66-20fa6023e2e0/resources/wml_model/request.json" }, "type": "container" }, "status": { "state": "completed", "running_at": "2023-08-04T13:22:48.000Z", "completed_at": "2023-08-04T13:22:55.289Z", "metrics": [ { "iteration": 0, "ml_metrics": { "loss": 4.49988 }, "timestamp": "2023-09-22T02:52:03.324Z" }, { "iteration": 1, "ml_metrics": { "loss": 3.86884 }, "timestamp": "2023-09-22T02:52:03.689Z" }, { "iteration": 2, "ml_metrics": { "loss": 4.05115 }, "timestamp": "2023-09-22T02:52:04.053Z" } ] } } }
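A common use of this endpoint is to poll a training until it finishes. The sketch below is one way to do that under stated assumptions: `wait_for_training` is a hypothetical helper, `fetch_training` is a caller-supplied function that performs the GET and returns the parsed JSON, and treating `completed`, `failed` and `canceled` as the terminal states is an assumption based on the state names, not something this reference states.

```python
import time

# Assumption: these three states are final; the others mean work is still pending.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def wait_for_training(fetch_training, *, interval=10.0, max_polls=60,
                      sleep=time.sleep):
    """Poll until entity.status.state is terminal and return the last response."""
    for _ in range(max_polls):
        training = fetch_training()
        if training["entity"]["status"]["state"] in TERMINAL_STATES:
            return training
        sleep(interval)
    raise TimeoutError(f"training not finished after {max_polls} polls")
```

The `sleep` argument exists so that the wait can be replaced in tests or swapped for a backoff strategy.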
Cancel or delete the training
Cancel or delete the specified training. Once deleted, all trace of the job is gone.
DELETE /ml/v4/trainings/{training_id}
Request
Path Parameters
The training identifier.
Query Parameters
The version date for the API of the form YYYY-MM-DD.
Example:
2023-07-07
The space that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
63dc4cf1-252f-424b-b52d-5cdd9814987f
The project that contains the resource. Either space_id or project_id query parameter has to be given.
Possible values: length = 36, Value must match regular expression ^[a-zA-Z0-9-]*$
Example:
a77190a2-f52d-4f2a-be3d-7867b5f46edc
Set to true in order to also delete the job or request metadata.
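The URL for this call can be assembled from the path and query parameters described above. This is a sketch under assumptions: `training_delete_url` is a hypothetical helper, and the parameter name `hard_delete` for the flag that also deletes the job or request metadata is an assumption, because this section describes the flag without naming it. The sketch also assumes exactly one of `project_id` or `space_id` is given.

```python
from urllib.parse import quote, urlencode


def training_delete_url(base_url, training_id, version, *,
                        project_id=None, space_id=None, hard_delete=False):
    """Build the URL for DELETE /ml/v4/trainings/{training_id} (hypothetical helper)."""
    if (project_id is None) == (space_id is None):
        raise ValueError("give exactly one of project_id or space_id")
    params = [("version", version)]
    if project_id is not None:
        params.append(("project_id", project_id))
    else:
        params.append(("space_id", space_id))
    if hard_delete:
        # Assumption: the metadata-deletion flag is named "hard_delete".
        params.append(("hard_delete", "true"))
    path = f"/ml/v4/trainings/{quote(training_id, safe='')}"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"
```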