Introduction
IMPORTANT: Watson Data API documentation is deprecated and might be out of date. The relevant functionality is now described in the following documents:
You can use a collection of Watson Data REST APIs associated with Watson Studio and Watson Knowledge Catalog to manage data-related assets and the people who need to use these assets.
Refine data Use the sampling APIs to create representative subsets of the data on which to test and refine your data cleansing and shaping operations. To better understand the contents of your data, you can create profiles of your data assets that include a classification of the data and additional distribution information which assists in determining the data quality.
Catalog data Use the catalog APIs to create catalogs to administer your assets, associate properties with those assets, and organize the users who use the assets. Assets can be notebooks or connections to files, database sources, or data assets from a connection.
Data policies Use the data policy APIs to implement data policies and a business glossary that fits to your organization to control user access rights to assets and to make it easier to find data.
Ingest streaming data Use the streams flow APIs to hook up continuous, unidirectional flows of massive volumes of moving data that you can analyze in real time.
API Endpoint
https://api.dataplatform.cloud.ibm.com
Creating an IAM bearer token
Before you can call a Watson Data API you must first create an IAM bearer token. Each token is valid only for one hour, and after a token expires you must create a new one if you want to continue using the API. The recommended method to retrieve a token programmatically is to create an API key for your IBM Cloud identity and then use the IAM token API to exchange that key for a token.
You can create a token in IBM Cloud or by using the IBM Cloud command line interface (CLI).
To create a token in the IBM Cloud:
- Log in to IBM Cloud and select Manage > Access (IAM) > API keys.
- Create an API key for your own personal identity, copy the key value, and save it in a secure place. After you leave the page, you will no longer be able to access this value.
- With your API key, set up Postman or another REST API tool and run the command shown under "Curl command with API key to retrieve token" below.
You can read more about managing API keys on the Understanding API keys documentation page.
- Use the value of the access_token property for your Watson Data API calls. Set the access_token value as the authorization header parameter for requests to the Watson Data APIs. The format is Authorization: Bearer <access_token_value_here>. For example: Authorization: Bearer eyJhbGciOiJIUz......sgrKIi8hdFs
To create a token by using the IBM Cloud CLI:
- Follow the steps to install the CLI, log in to IBM Cloud, and get the token described here. Remove Bearer from the returned IAM token value in your API calls.
Curl command with API key to retrieve token
curl -X POST 'https://iam.cloud.ibm.com/identity/token' -H 'Content-Type: application/x-www-form-urlencoded' -d 'grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=MY_APIKEY'
Response
{
"access_token": "eyJhbGciOiJIUz......sgrKIi8hdFs",
"refresh_token": "SPrXw5tBE3......KBQ+luWQVY=",
"token_type": "Bearer",
"expires_in": 3600,
"expiration": 1473188353
}
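The curl command above can also be sketched in Python using only the standard library. The function names here (build_token_request, authorization_header, fetch_token) are illustrative and not part of any IBM SDK; the URL, grant type, and form fields are taken from the curl command.

```python
import json
import urllib.parse
import urllib.request

IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build the POST request that exchanges an API key for an IAM token."""
    form = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def authorization_header(token_response: dict) -> dict:
    """Turn a parsed token response into the header used on API calls."""
    return {"Authorization": "Bearer " + token_response["access_token"]}


def fetch_token(api_key: str) -> dict:
    """Send the request and parse the JSON response (performs network I/O)."""
    with urllib.request.urlopen(build_token_request(api_key)) as resp:
        return json.load(resp)
```

Because a token is valid for one hour, a long-running client would typically call fetch_token again shortly before the expiration time reported in the response.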
Versioning
Watson Data API has a major, minor, and patch version, following industry conventions on semantic versioning. In the version number format MAJOR.MINOR.PATCH, the MAJOR version is incremented when incompatible API changes are made, the MINOR version is incremented when functionality is added in a backwards-compatible manner, and the PATCH version is incremented when backwards-compatible bug fixes are made. The service major version is represented in the URL path.
Sorting
Some of the Watson Data API collections provide custom sorting support. Custom sorting is implemented using the sort query parameter. Service collections can support single-field or multi-field sorting. The sort parameter in collections that support single-field sorting can contain any one of the valid sort fields.
For example, the following expression would sort accounts on company name (ascending): GET /v2/accounts?sort=company_name
You can also prefix the field with a + or - character, indicating “ascending” or “descending,” respectively.
For example, the expression below would sort on the last name of the account owner, in descending order: GET /v2/accounts?sort=-owner.last_name
The sort parameter in collections that support sorting on multiple fields can contain a comma-separated sequence of fields (each, optionally, with a + or -) in the same format as the single-field sorting. Sorts are applied to the data set in the order that they are provided. For example, the expression below would sort accounts first on company name (ascending) and second on owner last name (descending): GET /v2/accounts?sort=company_name,-owner.last_name
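As a minimal sketch, the sort expression can be assembled from an ordered list of fields. The helper names are hypothetical, and the /v2/accounts collection is the same illustrative one used in the examples above. Ascending fields are written without a prefix, because a literal + in a query string is commonly decoded as a space.

```python
import urllib.parse


def build_sort_param(fields):
    """Join (field_name, descending) pairs into a sort expression.

    Fields are listed in the order in which the sorts should be applied.
    """
    return ",".join(("-" if descending else "") + name for name, descending in fields)


def accounts_url(fields):
    """Build the request path for the illustrative /v2/accounts collection."""
    query = urllib.parse.urlencode({"sort": build_sort_param(fields)}, safe=",")
    return "/v2/accounts?" + query


# Company name ascending, then owner last name descending.
print(accounts_url([("company_name", False), ("owner.last_name", True)]))
```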
Filtering
Some of the Watson Data API collections provide filtering support. You can specify one or more filters where each supported field is required to match a specific value for basic filtering. The query parameter names for a basic filter must exactly match the name of a primitive field on a resource in the collection or a nested primitive field where the '.' character is the hierarchical separator. The only exception to this rule is for primitive arrays. In primitive arrays, such as tags, a singular form of the field is supported as a filter that matches the resource if the array contains the supplied value. Some of the Watson Data API collections can also support extended filtering comparisons for the following field types: Integer and float, date and date/time, identifier and enumeration, and string.
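A basic filter can be sketched as an ordinary query string in which each parameter name is the field name. The field names below (owner.last_name, and tag as the singular form of a tags array) are hypothetical examples, not fields confirmed for any particular collection.

```python
import urllib.parse


def build_filter_query(filters: dict) -> str:
    """Encode basic filters, where each key is a (possibly nested) field name.

    Nested fields use '.' as the hierarchical separator. For a primitive
    array, the key is the singular form of the array field.
    """
    return urllib.parse.urlencode(filters)


# Hypothetical fields: a nested primitive field and a singular array filter.
print("/v2/accounts?" + build_filter_query({"owner.last_name": "Smith", "tag": "finance"}))
```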
Rate Limiting
The following rate limiting headers are supported by some of the Watson Data service APIs: 1. X-RateLimit-Limit: If rate limiting is active, this header indicates the number of requests permitted per hour; 2. X-RateLimit-Remaining: If rate limiting is active, this header indicates the number of requests remaining in the current rate limit window; 3. X-RateLimit-Reset: If rate limiting is active, this header indicates the time at which the current rate limit window resets, as a UNIX timestamp.
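A client could use these headers to decide how long to pause before retrying. The sketch below assumes the headers are available as a plain dictionary keyed by the exact header names above; the function name is illustrative.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request is likely to succeed.

    Returns 0.0 when the rate limiting headers are absent (rate limiting is
    not active) or when requests remain in the current window.
    """
    remaining = headers.get("X-RateLimit-Remaining")
    reset = headers.get("X-RateLimit-Reset")
    if remaining is None or reset is None:
        return 0.0
    if int(remaining) > 0:
        return 0.0
    current = time.time() if now is None else now
    # X-RateLimit-Reset is a UNIX timestamp, so the wait is a simple difference.
    return max(0.0, float(reset) - current)
```

The optional now argument exists only to make the function easy to test; in normal use it is left unset.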
Error Handling
Responses with 400-series or 500-series status codes are returned when a request cannot be completed. The body of these responses follows the error model, which contains a code field to identify the problem and a message field to explain how to solve the problem. Each individual endpoint has specific error messages. All responses with 500 or 503 status codes are logged and treated as a critical failure requiring an emergency fix.
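One way to surface these errors in client code is to convert them into exceptions. This sketch assumes that the code and message fields sit at the top level of the JSON body; the exact nesting of the error model is not shown here, so treat that as an assumption. The class and function names are illustrative.

```python
import json


class WatsonDataApiError(Exception):
    """Raised for responses with 400-series or 500-series status codes."""

    def __init__(self, status, code, message):
        super().__init__(f"HTTP {status} [{code}]: {message}")
        self.status = status
        self.code = code
        self.message = message


def raise_for_error(status: int, body: str) -> None:
    """Raise WatsonDataApiError if the status code signals a failure."""
    if status < 400:
        return
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Assumption: "code" and "message" are top-level fields of the error body.
    raise WatsonDataApiError(
        status,
        payload.get("code", "unknown"),
        payload.get("message", body),
    )
```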
Connections
A connection is the information necessary to create a connection to a data source or a repository. You create a connection asset by providing the connection information.
List data source types
Data sources are where data can be written or read and might include relational database systems, file systems, object storage systems and others.
To list supported data source types, call the following GET method:
GET /v2/datasource_types
The response to the GET method includes information about each of the sources and targets that are currently supported. The response includes a unique ID property value (metadata.asset_id), a name, and a label. The metadata.asset_id property value should be used for the data source in other APIs that reference a data source type. Additional useful information, such as whether that data source can be used as a source or target (or both), is also included.
You can also view a table of the individual data source properties obtained by this GET method at https://dataplatform.cloud.ibm.com/connections/docs.
Use the connection_properties=true query parameter to return a set of properties for each data source type that is used to define a connection to it. Use the interaction_properties=true query parameter to return a set of properties for each data source type that is used to interact with a created connection. Interaction properties for a relational database might include the table name and schema from which to retrieve data.
Use the _sort query parameter to order the list of data source types returned in the response.
A default maximum of 100 data source type entries are returned per page of results. Use the _limit query parameter with an integer value to specify a lower limit.
More data source types than those on the first page of results might be available. Additional properties generated from the page size initially specified with _limit are returned in the response. Call a GET method using the value of the next.href property to retrieve the next page of results. Call a GET method using the value in the prev.href property to retrieve the previous page of results. Call a GET method using the value in the last.href property to retrieve the last page of results.
These URIs use the _offset and _limit query parameters to retrieve a specific block of data source types from the full list. Alternatively, you can use a combination of the _offset and _limit query parameters to retrieve a custom block of results.
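The paging links can be followed with a small generator. In this sketch, fetch_json is a hypothetical caller-supplied function that performs an authenticated GET on a URL and returns the parsed JSON body; it is not part of the API. The loop stops when a page has no next.href value.

```python
from typing import Callable, Iterator


def iter_pages(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield each page of results, following next.href until it is absent."""
    url = first_url
    seen = set()
    # The seen set guards against a next link that points back to an earlier page.
    while url and url not in seen:
        seen.add(url)
        page = fetch_json(url)
        yield page
        url = (page.get("next") or {}).get("href")
```

For example, iter_pages("https://api.dataplatform.cloud.ibm.com/v2/datasource_types?_limit=50", fetch_json) would walk the full list of data source types 50 entries at a time.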
Create a connection
Connections to any of the supported data source types returned by the previous method can be created and persisted in a catalog or project.
To create a connection, call the following POST method:
POST /v2/connections
A new connection can be created in a catalog or project. Use the catalog_id or project_id query parameter to specify where to create the connection asset. Either catalog_id or project_id is required.
The request body for the method is a UTF-8 encoded JSON document and includes the data source type ID (obtained in the List data source types section), its unique name in the catalog or project space, and a set of connection properties specific to the data source. Some connection properties are required.
The following example shows the request body used for creating a connection to IBM dashDB:
{
"datasource_type": "cfdcb449-1204-44ba-baa6-9a8a878e6aa7",
"name":"My-DashDB-Connection",
"properties": {
"host":"dashDBhost.com",
"port":"50001",
"database":"MYDASHDB",
"password": "mypassword",
"username": "myusername"
}
}
By default, the physical connection to the data source is tested when the connection is created. Use the test=false query parameter to disable the connection test.
A response payload containing a connection ID and other metadata is returned when a connection is successfully created. Use the connection ID as path parameter in other REST APIs when a connection resource must be referenced.
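The request described above could be assembled as follows. This is a sketch, not an official client: the function name is illustrative, the Content-Type of application/json is an assumption based on the JSON request body, and requiring exactly one of catalog_id or project_id is a stricter reading of "either is required".

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.dataplatform.cloud.ibm.com"


def build_create_connection_request(token, body, catalog_id=None, project_id=None, test=True):
    """Build the POST /v2/connections request without sending it."""
    # Assumption: exactly one of catalog_id or project_id should be supplied.
    if (catalog_id is None) == (project_id is None):
        raise ValueError("specify exactly one of catalog_id or project_id")
    if catalog_id is not None:
        query = {"catalog_id": catalog_id}
    else:
        query = {"project_id": project_id}
    if not test:
        # Skip the physical connection test that runs by default.
        query["test"] = "false"
    url = f"{API_BASE}/v2/connections?{urllib.parse.urlencode(query)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed for a JSON body
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; the connection ID can then be read from the parsed response payload.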
Discover connection assets
Data sources contain data and metadata describing the data they contain.
To discover or browse the data or metadata in a data source, call the following GET method:
GET /v2/connections/{connection_id}/assets?path=
Use the catalog_id or project_id query parameter to specify where the connection asset was created. Either catalog_id or project_id is required.
connection_id is the ID of the connection asset returned from the POST https://{service_URL}/v2/connections method, which created the connection asset.
The path query parameter is required and is used to specify the hierarchical path of the asset within the data source to be browsed. In a relational database, for example, the path might represent a schema and table. For a file object, the path might represent a folder hierarchy.
Each asset in the assets array returned by this method includes a property containing its path in the hierarchy to facilitate the next call to drill down deeper in the hierarchy.
For example, starting at the root path in an RDBMS will return a list of schemas:
{
"path": "/",
"asset_types": [
{
"type": "schema",
"dataset": false,
"dataset_container": true
}
],
"assets": [
{
"id": "GOSALES",
"type": "schema",
"name": "GOSALES",
"path": "/GOSALES"
}
],
"fields": [],
"first": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=0&_limit=100"
},
"prev": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=0&_limit=100"
},
"next": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=100&_limit=100"
}
}
Drill down into the GOSALES schema using the path property for the GOSALES schema asset to discover the list of table assets in the schema.
GET /v2/connections/{connection_id}/assets?catalog_id={catalog_id}&path=/GOSALES
The list of table type assets is returned in the response.
{
"path": "/GOSALES",
"asset_types": [
{
"type": "table",
"dataset": true,
"dataset_container": false
}
],
"assets": [
{
"id": "BRANCH",
"type": "table",
"name": "BRANCH",
"description": "BRANCH contains address information for corporate offices and distribution centers.",
"path": "/GOSALES/BRANCH"
},
{
"id": "CONVERSION_RATE",
"type": "table",
"name": "CONVERSION_RATE",
"description": "CONVERSION_RATE contains currency exchange values.",
"path": "/GOSALES/CONVERSION_RATE"
}
],
"fields": [],
"first": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=0&_limit=100"
},
"prev": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=0&_limit=100"
},
"next": {
"href": "https://wdp-dataconnect-ys1dev.stage1.mybluemix.net/v2/connections/4b28b5c1-d818-4ad2-bcf9-7de08e776fde/assets?catalog_id=75a3062b-e40f-4bc4-9519-308ee1b5b251&_offset=100&_limit=100"
}
}
Use the fetch query parameter with a value of either data, metadata, or both. Data can only be fetched for data set assets. In the response above, note that the entry in asset_types has a type property value of table and a dataset property value of true. This means that data can be fetched from table type assets. However, if you fetched assets from the connection root, the response would contain schema asset types, which are not data sets, so fetching data is not relevant for them.
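The distinction between data sets and containers can be read directly from a discovery response. The sketch below uses only the asset_types and assets properties shown in the example responses; the function names are illustrative.

```python
def dataset_asset_paths(response: dict) -> list:
    """Return the paths of assets from which data can be fetched."""
    dataset_types = {
        t["type"] for t in response.get("asset_types", []) if t.get("dataset")
    }
    return [a["path"] for a in response.get("assets", []) if a.get("type") in dataset_types]


def container_asset_paths(response: dict) -> list:
    """Return the paths of assets that can be browsed one level deeper."""
    container_types = {
        t["type"] for t in response.get("asset_types", []) if t.get("dataset_container")
    }
    return [a["path"] for a in response.get("assets", []) if a.get("type") in container_types]
```

Applied to the root response, container_asset_paths returns the schema paths to use in the next path query, and dataset_asset_paths returns an empty list. Applied to the /GOSALES response, dataset_asset_paths returns the table paths.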
A default maximum of 100 metadata assets are returned per page of results. Use the _limit query parameter with an integer value to specify a lower limit. More assets than those on the first page of results might be available.
Additional properties generated from the page size initially specified with _limit are returned in the response. Call a GET method using the value of the next.href property to retrieve the next page of results. Call a GET method using the value in the prev.href property to retrieve the previous page of results. Call a GET method using the value in the last.href property to retrieve the last page of results.
These URIs use the _offset and _limit query parameters to retrieve a specific block of assets from the full list. Alternatively, use a combination of the _offset and _limit query parameters to retrieve a custom block of results.
Specify properties for reading delimited files
When reading a delimited file using this method, specify property values to correctly parse the file based on its format. These properties are passed to the method as a JSON object using the properties query parameter. The default file format (property file_format) is a CSV file. If the file is a CSV, the following property values are set by default:
Property Name | Property Description | Default Value | Value Description |
---|---|---|---|
quote_character | quote character | double_quote | double quotation mark |
field_delimiter | field delimiter | comma | comma |
row_delimiter | row delimiter | carriage_return_linefeed | carriage return followed by line feed |
escape_character | escape character | double_quote | double quotation mark |
For CSV file formats, these property values cannot be overridden. If it is necessary to modify these properties to properly read a delimited file, set the file_format property to delimited. For generic delimited files, these properties have the following values:
Property Name | Property Description | Default Value | Value Description |
---|---|---|---|
quote_character | quote character | none | no character is used for a quote |
field_delimiter | field delimiter | null | no field delimiter value is set by default |
row_delimiter | row delimiter | new_line | any new line representation |
escape_character | escape character | none | no character is used for an escape |
This example sets file format properties for a generic delimited file:
GET https://{service_URL}/v2/connections/{connection_id}/assets?catalog_id={catalog_id}&path=/myFolder/myFile.txt&fetch=data&properties={"file_format":"delimited", "quote_character":"single_quote","field_delimiter":"colon","escape_character":"backslash"}
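The JSON object in the properties parameter contains characters such as braces and quotation marks that generally need to be percent-encoded in a URL. A minimal sketch of building the query string, with an illustrative function name:

```python
import json
import urllib.parse


def build_assets_query(catalog_id, path, properties, fetch="data"):
    """Build the query string for reading a delimited file.

    The properties dictionary is serialized to compact JSON and then
    percent-encoded together with the other parameters.
    """
    return urllib.parse.urlencode({
        "catalog_id": catalog_id,
        "path": path,
        "fetch": fetch,
        "properties": json.dumps(properties, separators=(",", ":")),
    })


query = build_assets_query(
    "{catalog_id}",  # placeholder, replace with a real catalog ID
    "/myFolder/myFile.txt",
    {
        "file_format": "delimited",
        "quote_character": "single_quote",
        "field_delimiter": "colon",
        "escape_character": "backslash",
    },
)
```

The resulting string is appended after the ? in the /v2/connections/{connection_id}/assets URL.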
For more information about this method see the REST API Reference.
Discover assets using a transient connection
A data source's assets can be discovered without creating a persistent connection.
To browse assets without first creating a persistent connection, call the following POST method:
POST https://{service_URL}/v2/connections/assets?path=
This method is identical in behavior to the GET method in the Discover connection assets section except for two differences:
- You define the connection properties in the request body of the REST API. You do not reference the connection ID of a persistent connection with a query parameter. The same JSON object used to create a persistent connection is used in the request body.
- You do not specify a catalog or project ID with a query parameter.
See the previous section to learn how to set properties used to read delimited files.
For more information about this method see the REST API Reference.
Update a connection
To modify the properties of a connection, call the following PATCH method:
PATCH /v2/connections/{connection_id}
connection_id is the ID of the connection asset returned from the POST https://{service_URL}/v2/connections method, which created the connection asset.
Use the catalog_id or project_id query parameter to specify where the connection asset was created. Either catalog_id or project_id is required.
Set the Content-Type header to application/json-patch+json. The request body contains the connection properties to update using a JSON object in JSON Patch format.
Change the port number of the connection and add a description using this JSON Patch:
[
{
"op": "add",
"path": "/description",
"value": "My new PATCHed description"
},
{
"op":"replace",
"path":"/properties/port",
"value":"40001"
}
]
By default, the physical connection to the data source is tested when the connection is modified. Use the test=false query parameter to disable the connection test.
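The PATCH request can be sketched in the same way as the create request. The function name is illustrative, and for brevity this version handles only a connection stored in a catalog; a project would use project_id in place of catalog_id.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.dataplatform.cloud.ibm.com"


def build_patch_connection_request(token, connection_id, operations, catalog_id, test=True):
    """Build the PATCH /v2/connections/{connection_id} request without sending it."""
    query = {"catalog_id": catalog_id}
    if not test:
        # Skip the physical connection test that runs by default.
        query["test"] = "false"
    url = "{}/v2/connections/{}?{}".format(
        API_BASE,
        urllib.parse.quote(connection_id, safe=""),
        urllib.parse.urlencode(query),
    )
    return urllib.request.Request(
        url,
        data=json.dumps(operations).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json-patch+json",
        },
        method="PATCH",
    )


# The same two operations as in the JSON Patch example above.
operations = [
    {"op": "add", "path": "/description", "value": "My new PATCHed description"},
    {"op": "replace", "path": "/properties/port", "value": "40001"},
]
```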
For more information about this method see the REST API Reference.
Delete a connection
To delete a persistent connection, call the following DELETE method:
DELETE /v2/connections/{connection_id}
connection_id is the ID of the connection asset returned from the POST https://{service_URL}/v2/connections method, which created the connection asset.
Use the catalog_id or project_id query parameter to specify where the connection asset was created. Either catalog_id or project_id is required.
Schedules
Introduction
Schedules allow you to run a data flow, a notebook, a data profile, or any other given source more than once. The scheduling service supports the repeat types hour, day, week, month, and year, with two repeat end options: an end date and a maximum number of runs.
Create a schedule
To create a schedule in a specified catalog or project, call the following POST method:
HTTP Method : POST
URI : /v2/schedules
Before you create a schedule, you must consider the following points:
- You must have a valid IAM token to make REST API calls and a project or catalog ID.
- You must be authorized (be assigned the correct role) to create schedules in the catalog or project.
- The start and end dates must be in the following format: YYYY-MM-DDTHH:mm:ssZ or YYYY-MM-DDTHH:mm:ss.sssZ (specified in RFC 3339).
- The supported repeat types are hour, day, week, month, and year.
- There are 2 repeat end options, namely max_invocations and end_date.
- The supported repeat interval is 1.
- There are 3 statuses for schedules, namely enabled, disabled, and finished. To create a schedule, the status must be enabled. The scheduling service updates the status to finished once it has finished running. You can stop or pause the scheduling service by updating the status to disabled.
- You can update the endpoint URI in the target HREF. Supported target methods are POST, PUT, PATCH, DELETE, and GET.
- Set generate_iam_token=true. When this option is set to true, the scheduling service generates an IAM token and passes it to the target URL at runtime. This IAM token is required to run schedules automatically at the scheduled intervals. This token is not to be confused with the IAM token required to make Watson Data API REST calls.
This POST method creates a schedule in a catalog with a defined start and a given end date:
{
"catalog_id": "aeiou",
"description": "aeiou",
"name": "aeiou",
"tags": ["aeiou"],
"start_date": "2017-08-22T01:02:14.859Z",
"status": "enabled",
"repeat": {
"repeat_interval": 1,
"repeat_type": "hour"
},
"repeat_end": {
"end_date": "2017-08-24T01:02:14.859Z"
},
"target": {
"href": "https://api.dataplatform.cloud.ibm.com/v2/data_profiles?start=false",
"generate_iam_token": true,
"method": "POST",
"payload": "aeiou",
"headers": [
{
"name": "content-type",
"value": "application/json",
"sensitive": false
}
]
}
}
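A request body like the one above can be generated from Python datetime values, which avoids hand-formatting the RFC 3339 timestamps. This is a sketch with illustrative function names; it covers only a subset of the fields in the example and always uses POST as the target method.

```python
from datetime import datetime, timezone

REPEAT_TYPES = {"hour", "day", "week", "month", "year"}


def rfc3339_millis(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:mm:ss.sssZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_schedule(catalog_id, name, start, repeat_type, target_href,
                   end=None, max_invocations=None):
    """Build a schedule request body for POST /v2/schedules."""
    if repeat_type not in REPEAT_TYPES:
        raise ValueError(f"unsupported repeat type: {repeat_type}")
    repeat_end = {}
    if end is not None:
        repeat_end["end_date"] = rfc3339_millis(end)
    if max_invocations is not None:
        repeat_end["max_invocations"] = max_invocations
    return {
        "catalog_id": catalog_id,
        "name": name,
        "start_date": rfc3339_millis(start),
        "status": "enabled",  # a new schedule must be created as enabled
        "repeat": {"repeat_interval": 1, "repeat_type": repeat_type},
        "repeat_end": repeat_end,
        "target": {
            "href": target_href,
            "generate_iam_token": True,
            "method": "POST",
        },
    }
```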
Get multiple schedules in a catalog or project
To get all schedules in the specified catalog or project, call the following GET method:
HTTP Method: GET
URI :/v2/schedules
You need the following information to get multiple schedules:
- A valid IAM token, schedule ID, and the catalog or project ID.
- You must be authorized to get schedules in the catalog or project.
You can filter the returned results by using the options entity.schedule.name and entity.schedule.status and can filter matching types by using StartsWith(starts:) and Equals(e:).
You can sort the returned results either in ascending or descending order by using one or more of the following options: entity.schedule.name, metadata.create_time, and entity.schedule.status.
Get a schedule
To get a schedule in the specified catalog or project, call the following GET method:
HTTP Method: GET
URI :/v2/schedules/{schedule_id}
You need the following information to get a schedule:
- A valid IAM token, schedule ID, and the catalog or project ID.
- You must be authorized to get a schedule in the catalog or project.
Update a schedule
To update a schedule in the specified catalog or project, call the following PATCH method:
HTTP Method: PATCH
URI :/v2/schedules/{schedule_id}
You need the following information to update a schedule:
- A valid IAM token, schedule ID, and the catalog or project ID.
- You must be authorized to update a schedule in the catalog or project.
You can update all the attributes under entity but can't update the attributes under metadata.
Patch supports the replace, add, and remove operations. The replace operation can be used with all the attributes under entity. The add and remove operations can only be used with the repeat end options, namely max_invocations and end_date.
The start and end dates must be in the following format: YYYY-MM-DDTHH:mm:ssZ or YYYY-MM-DDTHH:mm:ss.sssZ (specified in RFC 3339).
This PATCH method replaces the repeat type, removes the max invocations and adds an end date:
[
{
"op": "remove",
"path": "/entity/schedule/repeat_end/max_invocations",
"value": 20
},
{
"op": "add",
"path": "/entity/schedule/repeat_end/end_date",
"value": "date"
},
{
"op": "replace",
"path": "/entity/schedule/repeat/repeat_type",
"value": "week"
}
]
Delete a schedule
To delete a schedule in the specified catalog or project, call the following DELETE method:
HTTP Method : DELETE
URI :{GATEWAY_URL}/v2/schedules/{schedule_id}
":guid" represents the schedule_id of the deleted schedule.
You need the following information to delete a schedule:
- A valid IAM token, schedule ID, and the catalog or project ID.
- You must be authorized to delete a schedule in the catalog or project.
Delete multiple schedules
To delete multiple schedules in the specified catalog or project, call the following DELETE method:
HTTP Method: DELETE
URI :{GATEWAY_URL}/v2/schedules
":guid" represents the schedule_id of the deleted schedule.
You need the following information to delete multiple schedules:
- A valid IAM token, schedule ID, and the catalog or project ID.
- You must be authorized to delete schedules in the catalog or project.
- A comma-separated list of the schedule IDs. If schedule IDs are not listed in the parameter schedule_ids, the scheduling service will delete all the schedules in the catalog or project.
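Because an empty schedule_ids list deletes every schedule, a client may want to guard against building such a request by accident. The sketch below assumes that both schedule_ids and catalog_id are passed as query parameters, which is not stated explicitly here; the function name is illustrative.

```python
import urllib.parse


def build_delete_schedules_path(catalog_id, schedule_ids):
    """Build the path and query string for DELETE /v2/schedules.

    Raises ValueError for an empty ID list, because omitting schedule_ids
    would delete all schedules in the catalog.
    """
    ids = [schedule_id for schedule_id in schedule_ids if schedule_id]
    if not ids:
        raise ValueError("no schedule IDs given; refusing to delete all schedules")
    # Assumption: catalog_id and schedule_ids are both query parameters.
    query = urllib.parse.urlencode(
        {"catalog_id": catalog_id, "schedule_ids": ",".join(ids)},
        safe=",",
    )
    return "/v2/schedules?" + query
```

A caller that really intends to delete every schedule would need a separate, explicitly named code path.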
Catalogs
IBM Knowledge Catalog helps you easily organize, find and share data assets, analytical assets, etc. for many data science projects and for the users who need to use those assets.
You can use the Catalog API to create catalogs which are rich metadata repositories for organizing and exploring metadata.
There are two phrases that will be used repeatedly throughout this (and the "Assets" and "Asset Types") documentation:
- asset resource: The primary content of the asset. Many assets have a resource that is stored in an external repository: a data file, connected data set, notebook file, dashboard definition, or model definition.
- asset metadata: The information about the asset resource. Each asset has a primary metadata document in a project or catalog and might have additional metadata documents.
See the Asset Terminology section for more information about those two phrases.
There is one special user-provided storage that must be specified by the creator of a catalog at the time the catalog is created: a Cloud Object Storage bucket for public cloud deployment and a file system for hybrid cloud deployment. We'll informally call that the "catalog's bucket". The creator of the catalog owns that bucket, but by providing that bucket's identification info during catalog creation, the catalog creator is allowing the IBM Knowledge Catalog graphical user interface to store asset resources in that bucket and is allowing other IBM Knowledge Catalog APIs to store (extended) asset metadata in that bucket.
If a user wants to store and retrieve asset resources (like spreadsheets, images, etc.) in the catalog's bucket, then that user can use the Assets API to assist in that process.
In some cases, one of the other IBM Knowledge Catalog APIs (for example, the "Profiling" API) will store (extended) asset metadata documents in the catalog's bucket.
This section describes some of the individual Catalog APIs.
Get a Catalog
You can get metadata about a catalog using the get Catalog API. (Note: you aren't retrieving the actual data catalog with the GET Catalog API - you're just retrieving metadata that describes the catalog.)
Get Catalog - Request URL:
GET {service_URL}/v2/catalogs/{catalog_id}
Get Catalog - Response Body:
{
"metadata": {
"guid": "c6f3cbd8-2b7f-42fb-aa60-___",
"url": "https://api.dataplatform.cloud.ibm.com/v2/catalogs/c6f3cbd8-2b7f-42fb-aa60-___",
"creator_id": "IBMid-___",
"create_time": "2018-11-06T17:40:32Z"
},
"entity": {
"name": "CatalogForGettingStartedDoc",
"description": "Catalog created for Getting Started doc",
"generator": "Your catalog generator",
"bss_account_id": "12345___",
"capacity_limit": 0,
"is_governed": false,
"saml_instance_name": "IBM w3id"
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/catalogs/c6f3cbd8-2b7f-42fb-aa60-___"
}
In this case, the response for the Get Catalog request is identical to the response for the Create Catalog request. If more activity had occurred with the catalog between the Create Catalog and the Get Catalog requests then there might have been some differences between the two responses.
Get Catalogs
To obtain the metadata for all the catalogs that you have access to (that is, the catalogs of which you are a collaborator), you can call the GET Catalogs API.
Get Catalogs - Request URL:
GET {service_URL}/v2/catalogs
Note: the above URL is the simplest URL for getting catalogs because it doesn't contain any parameters. There are a number of optional parameters (limit, bookmark, skip, include, bss_account_id) to the above URL that you can use to limit the number of catalogs for which metadata is returned.
Get Catalogs - Response Body:
{
"catalogs": [
{
"metadata": {
"guid": "c6f3cbd8-2b7f-42fb-aa60-___",
"url": "https://api.dataplatform.cloud.ibm.com/v2/catalogs/c6f3cbd8-2b7f-42fb-aa60-___",
"creator_id": "IBMid-___",
"create_time": "2018-11-06T17:40:32Z"
},
"entity": {
"name": "CatalogForGettingStartedDoc",
"description": "Catalog created for Getting Started doc",
"generator": "Your catalog generator",
"bss_account_id": "12345___",
"capacity_limit": 0,
"is_governed": false,
"saml_instance_name": "IBM w3id"
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/catalogs/c6f3cbd8-2b7f-42fb-aa60-___"
}
],
"nextBookmark": "g1AAAAFCeJzLYWBgYMlgTmHQSklKzi9KdUhJMjT___",
"nextSkip": 0
}
In the above example, metadata for only one catalog is returned - the catalog created above. An advantage of calling the GET Catalogs API is you don't have to remember the ID of any particular catalog in order to get the metadata for that catalog.
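When there are more catalogs than fit in one response, the bookmark parameter and the nextBookmark response property suggest a paging loop like the one sketched below. Two things are assumptions: that passing nextBookmark back as bookmark returns the following page, and that an empty catalogs array marks the end of the list. fetch_json is a hypothetical caller-supplied function that performs an authenticated GET and returns the parsed JSON body.

```python
import urllib.parse
from typing import Callable, Iterator


def iter_catalogs(fetch_json: Callable[[str], dict], limit: int = 25) -> Iterator[dict]:
    """Yield catalog metadata documents across pages of GET /v2/catalogs."""
    bookmark = None
    while True:
        query = {"limit": limit}
        if bookmark:
            query["bookmark"] = bookmark
        page = fetch_json("/v2/catalogs?" + urllib.parse.urlencode(query))
        catalogs = page.get("catalogs", [])
        yield from catalogs
        next_bookmark = page.get("nextBookmark")
        # Assumption: stop on an empty page or when the bookmark stops advancing.
        if not catalogs or not next_bookmark or next_bookmark == bookmark:
            return
        bookmark = next_bookmark
```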
Assets
From a high level, an asset is an item of data or data analysis in a project or catalog. Most of these assets consist of two parts:
- Asset resource: The primary content of the asset. Many assets have a resource that is stored in an external repository: a data file (for example, a text file, image, or video), connected data set (for example, a database table), notebook file, dashboard definition, or model definition. The Assets API does not affect this part of the asset. Think of this as the object that's being described by asset metadata (that is, an asset resource is a "describee").
- Asset metadata: The information about the asset resource. Each asset has a primary metadata document in a project or catalog and might have additional metadata documents. This is the part of the asset that you can get, create, or operate on with the Assets API. Think of this as the object that's doing the describing of an asset resource (that is, asset metadata is a "describer").
A library is a useful analogy for understanding the scope of the Assets API. A library contains a set of books and an index. The index, or card catalog, contains a card about each book. A card has information about the book, including the location of the book. A Watson project or catalog contains only the card catalog part of the library. The books, or asset resources, are elsewhere. Consequently, the Assets API can return the location of an asset resource, but not affect the asset resource in any way.
The term asset encapsulates the following:
- [1] asset resource: the primary / initial resource that a user wants described by a primary metadata document.
- [2] primary metadata document: a document added to a catalog to describe an asset resource.
- [3] attributes: chunks of data inside a primary metadata document that describe either the asset resource or a secondary / extended metadata document.
- [4] secondary / extended metadata documents: additional documents containing information related to the asset resource. Attached to the primary metadata document. Can be generated by catalog processes, such as profiling.
- [5] a combination of all of the above: the IBM Knowledge Catalog UI presents information from each of the above on a single page and calls all that information an "asset".
For example, when you call the Get Assets API, you receive asset metadata (in a primary metadata document). The asset metadata might point to the location of the asset resource, but the Get Assets API does not return the asset resource. Similarly, when you run the Create Assets API, you create a primary metadata document that can, eventually, include the location of an existing asset resource.
This overview section provides a picture of the parts of a "primary metadata document" and then explains the parts of that picture. The picture provides a kind of "map" of a primary metadata document, so it's recommended to spend a few minutes studying it. Readers who prefer API examples can skip over the explanation of that picture that follows, and go straight to the Assets API Examples section. However, the Assets API Examples section will often refer back to the terms and explanations discussed in this Assets API Overview section.
Note: when calling any of the endpoints in the Assets API you must specify either a catalog ID or a project ID to indicate whether the metadata for an asset is (to be) in a catalog or a project. Because the Assets API endpoints can be applied to either a catalog or a project, rather than repeating the phrase "either a catalog or a project" over and over throughout the rest of this documentation, only the term "catalog" will be used. The possibility of instead using a "project" will be implied.
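The scoping rule in the note above can be sketched as a small helper. This is only an illustration: the `catalog_id` query parameter appears in the request URLs in this documentation, but the `project_id` parameter name, the helper name `scope_params`, and the choice to reject calls that pass both IDs are assumptions.

```python
from typing import Dict, Optional


def scope_params(catalog_id: Optional[str] = None,
                 project_id: Optional[str] = None) -> Dict[str, str]:
    """Return the query parameter that scopes an Assets API call.

    Exactly one of catalog_id / project_id must be given.
    The "project_id" parameter name is an assumption for illustration.
    """
    # True when both are missing or both are present: neither is acceptable here.
    if (catalog_id is None) == (project_id is None):
        raise ValueError("specify exactly one of catalog_id or project_id")
    if catalog_id is not None:
        return {"catalog_id": catalog_id}
    return {"project_id": project_id}
```

The returned dictionary can be URL-encoded and appended to any Assets API endpoint.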
Asset Primary Metadata Document (or Card)
A primary metadata document is a document that contains the primary metadata for an asset resource. Once a primary metadata document has been created and stored in the catalog, it's often informally said that that asset resource has been "cataloged", or "added to the catalog". Note: being cataloged, or added to the catalog, does not mean the asset resource has been moved or copied and is now physically stored inside the catalog - it just means a primary metadata document has been created for that asset resource, and that primary metadata document is now stored in the catalog.
Almost every Assets API endpoint revolves around creating, reading, modifying or deleting a primary metadata document. JSON is natively used to store primary metadata documents in a catalog, and to transfer those documents in Assets API REST calls. So, JSON examples of primary metadata documents will be used throughout this documentation.
In this documentation, the term card (as in, an index card in a library's catalog) will often be used as a short nickname for the phrase "primary metadata document". In this documentation, "card" and "primary metadata document" mean exactly the same thing. The term "card" just saves us from reading and writing the lengthier phrase "primary metadata document" over and over.
A primary metadata document (ie, card) is a JSON object that's composed of up to three top-level fields, named as follows:
1. [**metadata**](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document__metadata_group): a JSON object containing metadata _common to all_ [asset types](#Section_Assets__Overview_and_Terminology__Asset_Type)
2. [**entity**](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document__entity_group): a JSON object containing [attributes](#Section_Assets__Overview_and_Terminology__Attributes), each containing metadata _specific to one_ asset type
3. [**attachments**](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document__attachments_group): an optional JSON array, each item of which is a JSON object containing _metadata for_ an attached (ie, externally stored) [asset resource](#Section_Asset__Terminology_Overview__definition__Asset_Resource) or [extended metadata document](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document_Overview__attachment__extended_metadata)
For a pictorial representation of a primary metadata document (ie, card) and its associated asset resource and extended metadata documents, see the Parts of a Primary Metadata Document figure below:
In particular, note that:
- red rectangles are used in the figure to highlight the [three top-level fields of a card](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document_Overview__three_top_level_fields_of_a_card).
- the green rectangles illustrate how important the _name_ of the [primary asset type](#Section_Assets__Overview_and_Terminology__Asset_Type__Primary_Asset_Type_definition) is in relating various parts of the card, and the attached [asset resource](#Section_Asset__Terminology_Overview__definition__Asset_Resource), to each other. In the example figure, the value of `"metadata.asset_type"` is "data\_asset". The value you'll see in your card depends on the "asset_type" you've specified for your asset.
"metadata" field of a Primary Metadata Document
The "metadata" field of a primary metadata document (ie, of a card) is a JSON object that contains metadata fields that are common across all types of assets. (See the top red rectangle in the parts figure.) The Assets API specifies the names of the fields that go into the "metadata" part of the card. The user must supply values for some of the fields in "metadata"; the values of other fields in "metadata" will be filled in by the Assets API during the life of the card. Here's a list of some of the fields inside "metadata" (see example cards in the Get Asset section for more extensive lists):
- "asset_id":
- The ID of the card (ie, primary metadata document) rather than of the asset resource described by the card.
- Created internally by the Assets API at the time the card is created. That is, you do not supply this value.
- "asset_type":
- You must supply this value.
- Declares the primary asset type of this card.
- Describes the type of the asset resource attached (if any) to this card.
- Specifies the name of the primary attribute in this card.
- See Asset Types for more details on asset types.
- "asset_attributes":
- You must not supply any value for this field when creating a primary metadata document. The Assets APIs maintain the contents of this field.
- An array of attribute names (only the names, not the actual attributes).
- Each attribute / asset type name listed in this array will have a correspondingly named attribute in the "entity" field of the card.
- The name of each attribute must match the name of an existing asset type, so this is also an array of the names of the primary and secondary / extended asset types used by this card.
- "name": the name of the asset resource this card describes
- "description": a description of the asset resource
- "origin_country": the originating country for the asset resource
- "tags": an array of terms that users want to associate with the asset resource
- "rov": Rules Of Visibility.
- "mode": -1 - this is the default, which corresponds to "mode" : 0, public (see below)
- "mode": 0 - indicates public visibility, in which everybody can view and search the values of the asset's primary metadata document (card), and preview the asset's data. Note: access can still be denied based on actionable governance policy rules.
"rov": {
"mode": 0,
"collaborator_ids": []
}
- "mode": 8 - indicates private visibility, which allows users listed as members of the asset (as denoted by the collaborator_ids list) to view and search all fields (including metadata, entity, and attachments) of the asset's primary metadata document (card), and preview the asset's data. Non-members are allowed to only view and search the metadata field, and cannot preview the asset's data. This mode is only available through the API, and is not exposed in the IKC UI. Note: access can still be denied based on actionable governance policy rules.
"rov": {
"mode": 8,
"collaborator_ids": [
{
"IBMid-06___": {
"user_iam_id": "IBMid-06___"
}
},
{
"IBMid-27___": {
"user_iam_id": "IBMid-27___"
}
}
]
}
- "mode": 16 - indicates hidden visibility, in which only users listed as members of the asset (as denoted by the collaborator_ids list) have any access to fields in the asset, and non-members have no access to the asset. Note: access can still be denied based on actionable governance policy rules.
"rov": {
"mode": 16,
"collaborator_ids": [
{
"IBMid-06___": {
"user_iam_id": "IBMid-06___"
}
},
{
"IBMid-27___": {
"user_iam_id": "IBMid-27___"
}
}
]
}
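The three visibility modes above can be summarized in a short sketch that reports which top-level fields of a card a given user could see. It is a reading of the descriptions above rather than the service's actual logic: the function names are hypothetical, members are assumed to see all three fields in mode 16, and governance policy rules (which can still deny access) are not modeled.

```python
from typing import Any, Dict, Set

PUBLIC, PRIVATE, HIDDEN = 0, 8, 16
ALL_FIELDS = ("metadata", "entity", "attachments")


def member_ids(rov: Dict[str, Any]) -> Set[str]:
    """Collect member IDs from "collaborator_ids".

    Accepts a list of single-key objects (as in the snippets above)
    as well as an empty list or a keyed object.
    """
    raw = rov.get("collaborator_ids") or []
    if isinstance(raw, dict):
        return set(raw)
    ids: Set[str] = set()
    for item in raw:
        ids.update(item)  # adds the keys of each collaborator object
    return ids


def visible_fields(rov: Dict[str, Any], user_id: str) -> Set[str]:
    """Return the top-level card fields that user_id may view."""
    mode = rov.get("mode", -1)
    if mode == -1:  # default, corresponds to public
        mode = PUBLIC
    if mode not in (PUBLIC, PRIVATE, HIDDEN):
        raise ValueError(f"unknown rov mode: {mode}")
    if mode == PUBLIC or user_id in member_ids(rov):
        return set(ALL_FIELDS)
    # Non-members: metadata only when private, nothing when hidden.
    return {"metadata"} if mode == PRIVATE else set()
```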
"entity" field of a Primary Metadata Document
The "entity" field of a card (ie, primary metadata document) is a JSON object that contains additional JSON objects called attributes, each of which contains metadata fields that are specific to one asset type. (See the middle red rectangle in the parts figure.) The only contents of the "entity" field are attributes, which are discussed in the next section.
Note: the fact that the "entity" section contains attributes for more than one asset type does not mean that a single card contains metadata for more than one asset resource. A card always contains metadata for exactly one asset resource, and that asset resource will have exactly one attribute associated with it (see primary attribute below). All the other attributes in the "entity" field contain extended metadata describing the single asset resource that the card was created for. Really, asset types ought to be thought of as attribute types because asset types literally define (some of) the fields that will appear in attributes.
Attributes
An attribute is a JSON object that:
- is contained directly inside the ["entity"](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document__entity_group) field of the [primary metadata document](#Section_Assets__Overview_and_Terminology__definition__primary_metadata_document).
- is identically named with, and has fields that are partially defined by, an [Asset Type](#Section_Assets__Overview_and_Terminology__Asset_Type)
- describes an [asset resource](#Section_Asset__Terminology_Overview__definition__Asset_Resource) or something related to that asset resource, such as an [extended metadata document](#Section_Assets__Overview_and_Terminology__Asset_Metadata_Document_Overview__attachment__extended_metadata)
There is one attribute in the "entity" field for each attribute name that appears in the "metadata.asset_attributes" array. So, for example, if the "metadata.asset_attributes" array contains these two attribute names:
"metadata": {
...
"asset_attributes": [
"data_asset",
"data_profile"
],
}
then the "entity" field will contain these two correspondingly named attributes:
"entity": {
"data_asset": { // attribute name matches "data_asset" in "metadata.asset_attributes"
...attribute contents...
},
"data_profile": { // attribute name matches "data_profile" in "metadata.asset_attributes"
...attribute contents...
}
}
The name of each attribute in "entity" must also match the name of an existing asset type. That is, an attribute named "X" will contain metadata related to an asset type also named "X". So, an attribute's name can be thought of as simultaneously telling us that attribute's "type". For example, in this asset metadata document example, both the attribute names "data_asset" and "data_profile" refer to asset types with those same names.
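The correspondence between "metadata.asset_attributes" and the keys of "entity" can be checked mechanically. A minimal sketch, assuming the card has already been parsed into a Python dictionary; the function name `attribute_mismatches` is hypothetical.

```python
from typing import Any, Dict, Set, Tuple


def attribute_mismatches(card: Dict[str, Any]) -> Tuple[Set[str], Set[str]]:
    """Compare "metadata.asset_attributes" with the keys of "entity".

    Returns (listed_but_missing, present_but_unlisted); both sets are
    empty when the two parts of the card agree.
    """
    listed = set(card.get("metadata", {}).get("asset_attributes", []))
    present = set(card.get("entity", {}))
    return listed - present, present - listed
```

For a card whose "asset_attributes" array is ["data_asset", "data_profile"] and whose "entity" field holds attributes with those same two names, both returned sets are empty.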
There is one special attribute that will be referred to as the primary attribute. The primary attribute is the main attribute used to describe an asset resource. Every primary metadata document will have exactly one primary attribute. The name of the primary attribute is the same as the name that appears in the "metadata.asset_type" field.
Any attribute other than the primary attribute is a "secondary" / "extended" attribute whose name must match the name of a secondary / extended asset type. A common example of an attribute for extended metadata is named "data_profile", which is created by the Profiling API. For example, see the underlined names in the Parts of a Primary Metadata Document figure, or the "entity.data_profile" field in this asset metadata document.
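The distinction between the primary attribute and the secondary / extended attributes can be expressed as a simple split on the value of "metadata.asset_type". The following is a sketch only; `split_attributes` is a hypothetical helper that operates on a card already parsed into a dictionary.

```python
from typing import Any, Dict, Tuple


def split_attributes(card: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (primary_attribute, extended_attributes) for a card.

    The primary attribute is the "entity" member whose name equals
    "metadata.asset_type"; every other member is treated as extended.
    """
    asset_type = card["metadata"]["asset_type"]
    entity = card.get("entity", {})
    if asset_type not in entity:
        raise KeyError(f'no primary attribute named "{asset_type}" in "entity"')
    primary = entity[asset_type]
    extended = {name: attr for name, attr in entity.items() if name != asset_type}
    return primary, extended
```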
Although the Assets API restricts the names of attribute objects to match the names of asset types, the Assets API does not (in general) specify what the contents of those attributes should be. So, in some sense, the fields within an attribute are the opposite of the fields within the "metadata" field:
- the Assets API "owns" (or, specifies) which fields go inside "metadata"
- the user "owns" (or, specifies) which fields go inside the attributes (except for some fields of already available asset types)
The following example shows two attributes, whose names must match asset types, but whose contents are (for the most part) up to the user:
"entity": {
"data_asset": { // attribute name must match some asset type's name
...
data_asset *type creator* and
data_asset *attribute creator*
decide what fields go here
...
},
"data_profile": { // attribute name must match some asset type's name
...
data_profile *type creator* and
data_profile *attribute creator*
decide what fields go here
...
}
}
Because the Asset Types API is itself the creator of some already available asset types, the Asset Types API specifies some of the fields for any attribute whose name corresponds to one of those already available asset types. For example, see the discussion of the already available asset type called "data_asset".
Note: there is a GET attribute API that can be used to retrieve just the attributes in the "entity" section of the primary metadata document, instead of the entire primary metadata document as returned by the GET asset API.
"attachments" (optional) field of a Primary Metadata Document
The "attachments" field of a card (ie, primary metadata document) is a JSON array, each item of which contains metadata for one attachment. (See the bottom red rectangle in the parts figure.)
When discussing attachments, it helps to distinguish the following:
- the "attachments" array in the primary metadata document
- an attachment item in the "attachments" array
- a metadata document that will be returned from a call to the GET Attachment API. That metadata document will contain information that points to, and can be used to retrieve, either:
  - the asset resource being described by the primary metadata document
  - an extended metadata document stored in the catalog's bucket and containing extended metadata for the asset resource
Each attribute in the "entity" field can have a corresponding attachment item in the "attachments" array. An attribute and its corresponding attachment item are related to each other by using the name of the attribute as the value for the attachment item's "asset_type" field. For example, notice in the following card snippet how the attribute name "data_asset" is used to link that "data_asset" attribute to its attachment item in the "attachments" array:
"entity": {
...other attributes
"data_asset": { // <-- attribute's name matches its...
...
},
...other attributes
},
"attachments": [
...other attachment items
{
...
"asset_type": "data_asset", // <-- ...attachment's asset_type
...
"connection_id": "...", // connection_ fields are one way
"connection_path": "...", // that item points to attached object
...
},
...other attachment items
]
Notice also in the above card snippet that, in this case, the attachment item contains two "connection_..." fields that point to the attachment object located in external storage. So, an attribute has an attachment item which points to an attachment object.
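The link between attributes and attachment items can be sketched as a lookup keyed on the attachment item's "asset_type" value. This is an illustration under the assumption that the card is available as a parsed dictionary; the helper name `attachments_by_attribute` is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def attachments_by_attribute(card: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Map each attribute name in "entity" to its attachment items.

    An attachment item belongs to the attribute whose name equals the
    item's "asset_type" value. Attributes without attachments map to [].
    """
    grouped: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for item in card.get("attachments", []):
        grouped[item.get("asset_type")].append(item)
    return {name: grouped.get(name, []) for name in card.get("entity", {})}
```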
Like the fields of "metadata", the fields of an attachment item are specified by the Assets API. Some of the most important fields in an attachment item are:
- "asset_type":
- describes the type of the attachment
- figuratively connects the attachment item to the attribute with the same name
- "connection_id" and "connection_path" (optional):
  - this pair of fields specifies the ID of a WDP Connection and a path in the associated data repository that points to the attached object
  - always used for an attached asset like a database table
  - can also be used for an attached asset resource (eg, spreadsheet) that can be stored in the catalog's bucket
  - the presence of these two fields means the attachment will be known as a remote attachment
- "object_key" and "handle" (optional):
For any attachment, only one of the following two pairs of fields will be used: "connection_id" and "connection_path" (ie, remote attachment), or "object_key" and "handle" (ie, referenced attachment).
Interestingly, being remote does not tell you whether or not an attachment is in the catalog. Remote only tells you how the attached object can be retrieved: by using a connection.
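The rule that an attachment item uses only one of the two field pairs can be sketched as a small classifier. The helper name `attachment_kind` and the "unknown" fallback are assumptions made for illustration; a field that is absent or null is treated as not in use.

```python
from typing import Any, Dict


def attachment_kind(item: Dict[str, Any]) -> str:
    """Classify an attachment item as "remote" or "referenced".

    remote:     uses "connection_id" and "connection_path"
    referenced: uses "object_key" and "handle"
    """
    remote = (item.get("connection_id") is not None
              and item.get("connection_path") is not None)
    referenced = (item.get("object_key") is not None
                  and item.get("handle") is not None)
    if remote and referenced:
        raise ValueError("attachment item uses both field pairs")
    if remote:
        return "remote"
    if referenced:
        return "referenced"
    return "unknown"  # neither pair is present (hypothetical fallback)
```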
An attachment item (in the card) points to one of two kinds of attached object (in external storage):
- an asset resource, or
- an extended metadata document.
Those are briefly discussed in the next 2 sections.
Asset Resource Attachment
The most typical attachment object is the asset resource being described by the card.
Follow the green arrows in the Parts of a Primary Metadata Document figure to see how:
- the asset's type name leads to
- an attribute name, which leads to
- a primary attribute, which leads to
- an attachment metadata item for that attribute, which finally leads to
- the attached asset resource.
For a full example that shows an attachment metadata item for an attached csv file, see the (only) item in the "attachments" array in Get Asset - CSV File - Response Body - Before Profiling.
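The chain described by the green arrows can also be followed in code. The sketch below walks from "metadata.asset_type" to the attachment item for the primary attribute and returns whichever location fields that item carries. The function name `locate_asset_resource` is hypothetical, and the card is assumed to be a parsed dictionary.

```python
from typing import Any, Dict, Optional

LOCATION_FIELDS = ("connection_id", "connection_path", "object_key", "handle")


def locate_asset_resource(card: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Follow asset type -> attribute name -> primary attribute -> attachment item.

    Returns the location fields of the attachment item that points to the
    asset resource, or None when the card has no such attachment.
    """
    metadata = card["metadata"]
    asset_type = metadata["asset_type"]                          # the asset's type name
    if asset_type not in metadata.get("asset_attributes", []):   # the attribute name
        raise ValueError(f'"{asset_type}" is not listed in "asset_attributes"')
    if asset_type not in card.get("entity", {}):                 # the primary attribute
        raise ValueError(f'"entity" has no attribute named "{asset_type}"')
    for item in card.get("attachments", []):                     # the attachment item
        if item.get("asset_type") == asset_type:
            return {key: item[key] for key in LOCATION_FIELDS if key in item}
    return None
```

A card with no "attachments" array, such as one for a connection, yields None.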
Extended Metadata Document Attachment(s)
The other kind of attachment objects are extended metadata documents. A card can have 0, 1, or many attached extended metadata documents. These documents each contain a related set of (additional) metadata describing the asset resource. Extended metadata documents are stored externally in the catalog's bucket.
See the underlined "data_profile" type name in the Parts of a Primary Metadata Document figure for a visualization of how, for one extended metadata document, the three parts ("metadata", "entity", "attachments") of a card are related to each other.
See the second item in the "attachments" array in Get Asset - CSV File - Response Body - After Profiling for an example showing an attachment item for a "data_profile" extended metadata document.
Uses of "asset_type" value
From the previous sections, you can see that the "asset_type" value shows up in:
- the "metadata.asset_type" field
- the "metadata.asset_attributes" array
- a field (ie, object) in the "entity" field. This object is the primary attribute.
- the asset_type field of the primary attribute's attachment (if such an attachment exists, which it typically does). This (primary) attachment will be the asset resource (eg, database table, spreadsheet, csv file, etc.).
For example, see the Parts of a Primary Metadata Document figure above, where the name of the primary attribute is, in this case, "data_asset" and is highlighted with green rectangles in all the places it's used. The path shown by the green arrows in the figure starts at the "metadata.asset_type" field and ends at the asset resource, in this case a file called Sample.csv.
Other Assets API Objects
Finally, here is a brief list of some of the remaining objects that can be manipulated with the Assets APIs:
- owner
- the owner of the asset
- collaborators
- users who are allowed to see and possibly edit (some parts of) the asset
- perms
- permissions for viewing / editing an asset
- ratings
- indications of how popular or useful the asset is
- stats
- statistics on how often and when the asset was viewed or edited, and who did that viewing or editing.
Getting an Asset
It's important to understand that the GET Asset API does not return an asset resource like a database table, a spreadsheet, a csv file, etc. Instead, it returns a primary metadata document (ie, card) that describes an asset resource.
Obviously, a primary metadata document (ie, card) must have been created before it can be retrieved. Still, it's instructive to see actual examples of a card and its parts before attempting to create those things. After all, many users will retrieve cards that were previously created by someone else.
This and the following sections show how to retrieve asset metadata and attachments (eg, an asset resource and extended metadata documents).
Getting an Asset - for a Connection
We'll start by retrieving a common primary metadata document (ie, card): one for a "connection" asset type. This is a simple card because it has no attachments. That makes it an easy example to start with, even though many of the other cards you'll encounter do have attachments.
Use the following GET Asset API to retrieve the primary metadata document for a connection. Note that this requires that you know and supply the IDs of both the primary metadata document (ie, card) and of the catalog that contains the card. Either someone has given you both of those IDs or you can browse to the asset's page using the IBM Knowledge Catalog UI and then extract both the catalog ID and the primary metadata document ID from within the URL in the browser's address bar.
Getting an Asset - Request URL:
GET {service_URL}/v2/assets/{asset_id}?catalog_id={catalog_id}
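As a rough sketch, the request above could be assembled in Python with the standard library. The helper name `build_get_asset_request` is hypothetical, and sending the IAM bearer token in an `Authorization: Bearer` header is an assumption based on the token requirement described earlier in this documentation.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_get_asset_request(service_url: str, asset_id: str,
                            catalog_id: str, token: str) -> Request:
    """Build (but do not send) a GET Asset request for one card."""
    base = service_url.rstrip("/")
    query = urlencode({"catalog_id": catalog_id})
    url = f"{base}/v2/assets/{quote(asset_id, safe='')}?{query}"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed header for the IAM token
        "Accept": "application/json",
    }
    return Request(url, headers=headers, method="GET")
```

Passing the returned object to `urllib.request.urlopen` and parsing the response body with `json.load` would then yield the card as a dictionary.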
The following is the primary metadata document (ie, card) that's returned.
Note: you may find it helpful to look at the Parts of a Primary Metadata Document Figure before looking at the following Response Body.
Getting an Asset - Connection - Response Body:
{
"metadata": {
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2018-11-06T17:40:37Z",
"last_updater_id": "IBMid-___",
"last_update_time": 1541526037227,
"last_accessed_at": "2018-11-06T17:40:37Z",
"last_access_time": 1541526037227,
"last_accessor_id": "IBMid-___",
"access_count": 0
},
"name": "ConnectionForCSVFile",
"description": "Connection for CSV file",
"tags": [],
"asset_type": "connection",
"origin_country": "us",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-2b7f-42fb-aa60-___",
"created": 1541526037227,
"created_at": "2018-11-06T17:40:37Z",
"owner_id": "IBMid-___",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"connection"
],
"asset_id": "070e9be2-40a8-4e0e-___",
"asset_category": "SYSTEM"
},
"entity": {
"connection": {
"datasource_type": "193a97c1-4475-4a19-b90c-295c4fdc6517",
"context": "source,target",
"properties": {
"bucket": "catalogforgettingsta___",
"secret_key": "{wdpaes}12345___=",
"api_key": "{wdpaes}eo/12345_=",
"resource_instance_id": "crn:v1:bluemix:public:cloud-object-storage:global:a/12345c___:7240b198-b0f6-___::",
"access_key": "12345___",
"region": "us-geo",
"url": "https://s3.us-south.objectstorage.softlayer.net"
},
"flags": []
}
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/070e9be2-40a8-4e0e-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___"
}
The above response has two of the three primary groups of metadata that were described in the Primary Metadata Document section: "metadata" and "entity".
As discussed in the Assets API Overview section, the contents of the "metadata" field are common to all primary metadata documents (ie, cards). The set of fields in "metadata" is completely defined by the Assets API. The values for some of those fields must be provided by the creator of the card, while other fields' values will be populated by various Assets APIs during the life of the card. Note the following fields' values in particular:
- "metadata" fields whose values are provided by the creator of the card:
  - "name": "ConnectionForCSVFile"
  - "description": "Connection for CSV file"
  - "asset_type": "connection"
  - "asset_attributes": [ "connection" ]
- "metadata" fields whose values are set by various Assets APIs during the life of the card:
  - "usage": contains various statistics describing usage of the card/asset
  - "catalog_id": the ID of the catalog that contains the card
  - "created_at": the time and date at which the card was created
  - "asset_id": the ID of the card (not the asset resource)
For more info about the "metadata" fields, see the discussion on "metadata" in the Assets API Overview section above.
The contents of the "entity" field are only partially defined by the Assets API. In particular, the "entity" field shown in the above card contains a field whose name must match the value in "metadata.asset_type", in this case, "connection". That field is the primary attribute.
On the other hand, both the names and the values of all the fields inside the primary attribute "entity.connection" are completely determined by the creator of the "connection" asset type and the creator of the "connection" attribute. The Assets API does not, in general, decide what fields go inside the primary attribute (or any other attribute). In the example "connection" attribute above, some of the more interesting fields are:
- "datasource_type": specifies the ID of the type of the data source to which a connection will be formed.
- "properties": specifies connection metadata specific to the type of the datasource. The exact contents of this field will change according to the type of the datasource.
For more info on the contents of "entity" in general, see the discussion on "entity" in the Assets API Overview section.
Notice the above card contains no "attachments" array. That means there is no attached asset resource associated with this card. A natural question is: how can "connection" asset metadata exist for, or describe, a non-existent "connection" asset resource? Actually, a "connection" asset resource does exist, but only when the metadata in the connection's primary metadata document is used to create a client-server connection at runtime.
Get Asset - for a CSV File
This section shows a far more typical example in which the primary metadata document (ie, card) does have an attached asset resource - in this case, a csv file named Sample.csv. Here's the very simple contents of the Sample.csv file:
Sample.csv file contents
Name,Number
abc,123
def,456
Use the GET Asset API to retrieve the asset metadata for the Sample.csv asset resource. Note: the GET Asset API only returns a primary metadata document (ie, card) that describes the Sample.csv file - it does not return the actual Sample.csv file.
Get Asset - Request URL:
GET {service_URL}/v2/assets/{asset_id}?catalog_id={catalog_id}
It's instructive to show two different versions of the primary metadata document for the Sample.csv asset:
- Before profiling (which returns a small metadata document - without extended metadata)
- After profiling (which returns a much larger metadata document - with extended metadata)
Note: you may find it helpful to look at the Parts of a Primary Metadata Document Figure before looking at either of the following two Get Asset Response Bodies.
Here is the smaller primary metadata document that exists before the Profile API is invoked on the Sample.csv file.
Get Asset - CSV File - Response Body - Before Profiling:
{
"metadata": {
"name": "Sample.csv",
"description": "A simple csv file.",
"asset_type": "data_asset",
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2018-11-06T17:45:23Z",
"last_updater_id": "IBMid-___",
"last_update_time": 1541526323713,
"last_accessed_at": "2018-11-06T17:45:23Z",
"last_access_time": 1541526323713,
"last_accessor_id": "IBMid-___",
"access_count": 0
},
"origin_country": "united states",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-2b7f-42fb-aa60-___",
"created": 1541526321437,
"created_at": "2018-11-06T17:45:21Z",
"owner_id": "IBMid-___",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"data_asset"
],
"asset_id": "45f4ab8c-37d5-45a1-8adf-___",
"asset_category": "USER"
},
"entity": {
"data_asset": {
"mime_type": "text/csv",
"dataset": false
}
},
"attachments": [
{
"id": "b8c7a390-e857-4c34-add8-___",
"version": 2,
"asset_type": "data_asset",
"name": "remote",
"description": "remote",
"connection_id": "070e9be2-40a8-4e0e-___",
"connection_path": "catalogforgettingsta-datacatalog-r1s___/data_asset/Sample_SyjEQUy6m.csv",
"create_time": 1541526323713,
"size": 0,
"is_remote": true,
"is_managed": false,
"is_referenced": false,
"is_object_key_read_only": false,
"is_user_provided_path_key": true,
"transfer_complete": true,
"is_partitioned": false,
"complete_time_ticks": 1541526323713,
"user_data": {},
"test_doc": 0,
"usage": {
"access_count": 0,
"last_accessor_id": "IBMid-___",
"last_access_time": 1541526323713
}
}
],
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/45f4ab8c-37d5-45a1-8adf-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___"
}
The above primary metadata document has all three primary groups of metadata ("metadata", "entity", and "attachments") that were described in the Assets API Overview section.
The contents of the "metadata" field are very similar to those shown above for the Connection card example. The most important difference is the value that the user specified as the "asset type" for the Sample.csv asset, namely "data_asset". That asset type name shows up in two places inside the "metadata" section of the primary metadata document:
- "metadata":
  - "asset_type": "data_asset"
  - "asset_attributes": [ "data_asset" ]
As discussed in the Attributes section, the fact that "metadata.asset_type" has the value "data_asset" means the "entity" field of the card must contain a primary attribute called "data_asset". The Asset Types API provides the predefined asset type "data_asset". That "data_asset" type definition declares that there are two mandatory fields in a "data_asset" attribute: "mime_type" and "dataset", as can be seen in the card above and repeated here:
- "entity":
  - "data_asset":
    - "mime_type": "text/csv"
      - specifies the mime type of the asset resource. Here, the mime type indicates that the asset resource is a text csv file.
    - "dataset": false
      - false because there is no "columns" field in this primary attribute.
      - Note: false does not mean there are no columns in the asset resource. Clearly, our Sample.csv file does have columns. The problem here is that no one has (yet) told the card that the asset resource has columns. Compare this "data_asset" attribute to the one shown in the next example Get Asset - CSV File - Response Body - After Profiling, where the value of "dataset" has been changed to true, and the primary attribute does have a "columns" field.
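The relationship between "dataset" and "columns" can be illustrated with a short sketch that lists the columns a "data_asset" attribute knows about. The helper name `column_summary` is hypothetical, and the column layout ("name" plus a nested "type" object) is taken from the After Profiling response body in this documentation.

```python
from typing import Any, Dict, List, Tuple


def column_summary(data_asset: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (column name, column type) pairs recorded in the attribute.

    An empty list means the card has not been told about any columns,
    not that the asset resource itself has none.
    """
    if not data_asset.get("dataset"):
        return []
    return [
        (column["name"], column["type"]["type"])
        for column in data_asset.get("columns", [])
    ]
```

Applied to the attribute shown above ("dataset": false), the result is an empty list even though Sample.csv itself has two columns.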
Unlike in the Connection card example above, the card for the Sample.csv file does have an "attachments" field. In this case, the "attachments" array has one item in it. That item contains metadata that points to the attached asset resource (ie, the Sample.csv file). Some of the more interesting fields in that attachment item are:
- "id": "b8c7a390-e857-4c34-add8-___"
  - identifies the metadata document that points to the attached asset resource
- "asset_type": "data_asset"
  - matches the name of the primary attribute in "entity", thereby linking the primary attribute to this attachment item and designating this item as the item that points to the asset resource.
- "connection_id": "070e9be2-40a8-4e0e-___"
  - identifies a connection primary metadata document (ie, card) which contains credentials and other info that can be used to connect to the external repository that contains the attached asset resource (ie, the "Sample.csv" file)
  - not coincidentally, the particular connection card referred to by "070e9be2-40a8-4e0e-___" is the exact same connection card shown above in Get Asset - Connection Primary Metadata Document
- "connection_path": "catalogforgettingsta-datacatalog-r1s___/data_asset/Sample_SyjEQUy6m.csv"
  - identifies the path in the external repository that contains the attached asset (ie, the "Sample.csv" file)
- "is_remote": true
  - as discussed in the "attachments" overview section, is_remote is true because "connection_id" and "connection_path" are being used to describe how to get the Sample.csv asset resource.
- "is_referenced": false (at most one of "is_referenced" and "is_remote" will be true)
Get Asset - CSV File - Response Body - After Profiling:
Now, let's compare what GET {service_URL}/v2/assets/{asset_id}?catalog_id={catalog_id} returns for the same asset after the Profile API has been invoked on the Sample.csv file:
{
"metadata": {
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2018-11-12T15:33:34Z",
"last_updater_id": "iam-ServiceId-12345___",
"last_update_time": 1542036814782,
"last_accessed_at": "2018-11-12T15:33:34Z",
"last_access_time": 1542036814782,
"last_accessor_id": "iam-ServiceId-12345___",
"access_count": 0
},
"name": "Sample.csv",
"description": "Simple csv file for experiment for getting started document.",
"tags": [],
"asset_type": "data_asset",
"origin_country": "united states",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-2b7f-42fb-aa60-___",
"created": 1541526321437,
"created_at": "2018-11-06T17:45:21Z",
"owner_id": "IBMid-___",
"size": 9238,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"data_asset",
"data_profile"
],
"asset_id": "45f4ab8c-37d5-45a1-8adf-___",
"asset_category": "USER"
},
"entity": {
"data_asset": {
"mime_type": "text/csv",
"dataset": true,
"columns": [
{
"name": "Name",
"type": {
"type": "varchar",
"length": 1024,
"scale": 0,
"nullable": true,
"signed": false
}
},
{
"name": "Number",
"type": {
"type": "varchar",
"length": 1024,
"scale": 0,
"nullable": true,
"signed": false
}
}
]
},
"data_profile": {
"971e9c66-be4c-44b4-91f3-___": {
"metadata": {
"guid": "971e9c66-be4c-44b4-91f3-___",
"asset_id": "971e9c66-be4c-44b4-91f3-___",
"dataset_id": "45f4ab8c-37d5-45a1-8adf-___",
"url": "https://api.dataplatform.cloud.ibm.com/v2/data_profiles/971e9c66-be4c-44b4-91f3-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___&dataset_id=45f4ab8c-37d5-45a1-8adf-___",
"catalog_id": "c6f3cbd8-2b7f-42fb-aa60-___",
"created_at": "2018-11-12T15:32:53.902Z",
"accessed_at": "2018-11-12T15:32:53.902Z",
"owner_id": "IBMid-___",
"last_updater_id": "IBMid-___"
},
"entity": {
"data_profile": {
"options": {
"disable_profiling": false,
"max_row_count": 5000,
"max_distribution_size": 100,
"max_numeric_stats_bins": 200,
"classification_options": {
"disabled": false,
"use_all_ibm_classes": true,
"ibm_class_codes": [],
"custom_class_codes": []
}
},
"execution": {
"status": "finished",
"is_supported": true,
"dataflow_id": "3f1ace02-4d40-451d-9bc7-___",
"dataflow_run_id": "f774f92f-5a61-49ca-8a68-___"
},
"columns": [],
"attachment_id": "8d614be0-6900-403b-ab50-___"
}
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/data_profiles/971e9c66-be4c-44b4-91f3-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___&dataset_id=45f4ab8c-37d5-45a1-8adf-___"
},
"attribute_classes": [
"NoClassDetected",
"Organization Name"
]
}
},
"attachments": [
{
"id": "b8c7a390-e857-4c34-add8-___",
"version": 2,
"asset_type": "data_asset",
"name": "remote",
"description": "remote",
"connection_id": "070e9be2-40a8-4e0e-___",
"connection_path": "catalogforgettingsta-datacatalog-r1s___/data_asset/Sample_SyjEQUy6m.csv",
"create_time": 1541526323713,
"size": 0,
"is_remote": true,
"is_managed": false,
"is_referenced": false,
"is_object_key_read_only": false,
"is_user_provided_path_key": true,
"transfer_complete": true,
"is_partitioned": false,
"complete_time_ticks": 1541526323713,
"user_data": {},
"test_doc": 0,
"usage": {
"access_count": 0,
"last_accessor_id": "IBMid-___",
"last_access_time": 1541526323713
}
},
{
"id": "8d614be0-6900-403b-ab50-___",
"version": 2,
"asset_type": "data_profile",
"name": "data_profile_971e9c66-be4c-44b4-91f3-___",
"object_key": "data_profile_971e9c66-be4c-44b4-91f3-___",
"create_time": 1542036813627,
"size": 9238,
"is_remote": false,
"is_managed": false,
"is_referenced": true,
"is_object_key_read_only": false,
"is_user_provided_path_key": true,
"transfer_complete": true,
"is_partitioned": false,
"complete_time_ticks": 1542036813627,
"user_data": {},
"test_doc": 0,
"handle": {
"bucket": "catalogforgettingsta-datacatalog-r1s___",
"location": "us-geo",
"key": "data_profile_971e9c66-be4c-44b4-91f3-___",
"upload_id": "done",
"max_part_num": 1
},
"usage": {
"access_count": 0,
"last_accessor_id": "iam-ServiceId-12345___",
"last_access_time": 1542036813627
}
}
],
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/45f4ab8c-37d5-45a1-8adf-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___"
}
Let's look at a few of the most important differences between the primary metadata document for the Sample.csv file before and after profiling:
- "metadata":
  - "asset_attributes": [ "data_asset", "data_profile" ]
    - Note the "data_profile" attribute name has been added
- "entity":
  - "data_asset":
    - "columns": the Profile API has added the "columns" field to the data_asset attribute
    - "dataset": the Profile API caused this to change from false to true because of the newly added "columns" field
  - "data_profile":
    - this "data_profile" attribute is entirely new, and was added by the Profile API.
    - the name of this secondary attribute matches the name of the secondary asset type "data_profile", which was (previously) created by the Profile API.
    - the contents of this "data_profile" attribute were entirely decided by the Profile API, not by the Assets API.
    - this attribute contains a lot of extended metadata about the "data_profile" run that produced a "data_profile" extended metadata document.
- "attachments":
  - a new item has been added to the "attachments" array
  - that new item contains the following metadata about an extended metadata document:
    - "id": "8d614be0-6900-403b-ab50-___"
    - "asset_type": "data_profile"
      - note that the value "data_profile" matches the name of the "data_profile" attribute that this attachment item belongs to, which links the attachment item to the attribute.
    - "handle": contains various fields pointing to the actual attached extended metadata document, which is located in some external repository. That extended metadata document will contain a great deal more metadata about the asset resource, that is, about the "Sample.csv" file.
The next section shows how to retrieve the Extended Metadata Document that's referred to by the new "data_profile" "attachments" item just described above.
Get Attachment - Extended Metadata Document:
The following example builds on the GET Asset
example from the previous section and shows how to retrieve an attachment that is an extended metadata document.
An attachment can be retrieved in 4 steps.
Step 1: Choose the asset_type of the attachment you want to retrieve.
The only choices you have for asset_type in a given primary metadata document are listed in that document's "metadata.asset_attributes" field. In the example above those values are:
- "data_asset"
- "data_profile"
The asset_type of the extended metadata document we want is "data_profile".
Step 2: Get the "id" of the "attachments" item whose "asset_type" field has the value you chose in Step 1.
In the primary metadata document, look for the only "attachments" item whose "asset_type" field has the value you chose in Step 1, namely "data_profile". In our example primary metadata document above, that "attachments" item has the "id" value "8d614be0-6900-403b-ab50-___".
Step 3: Invoke the Get Attachment
API to get attachment metadata for the attached extended metadata document.
Get Asset Attachment - Request URL
GET /v2/assets/{asset_id}/attachments/{attachment_id}
The values for the above URL parameters are obtained as follows:
- {asset_id}: the same value that appears in the "metadata.asset_id" field of the above primary metadata document, namely "45f4ab8c-37d5-45a1-8adf-___"
- {attachment_id}: the value of "id" that was obtained in Step 2, namely "8d614be0-6900-403b-ab50-___"
Invoke the above GET Attachment API with the above values, which will return an attachment metadata document as shown in the following response body:
Get Asset Attachment - Response Body:
{
"attachment_id": "8d614be0-6900-403b-ab50-___",
"asset_type": "data_profile",
"is_partitioned": false,
"name": "data_profile_971e9c66-be4c-44b4-91f3-___",
"created_at": "2018-11-12T15:33:33Z",
"object_key": "data_profile_971e9c66-be4c-44b4-91f3-___",
"object_key_is_read_only": false,
"bucket": {
"bucket_name": "catalogforgettingsta-datacatalog-r1s___",
"bluemix_cos_connection": {
"viewer": {
"bucket_connection_id": "5b6bc03d-577d-4609-b3a4-___"
},
"editor": {
"bucket_connection_id": "070e9be2-40a8-4e0e-a468-___"
}
}
},
"url": "https://s3.us-south.objectstorage.softlayer.net/catalogforgettingsta-datacatalog-r1s___/data_profile_971e9c66-be4c-44b4-91f3-___?response-content-disposition=attachment%3B%20filename%3D%22data_profile_971e9c66-be4c-44b4-91f3-___%22&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20190423T162446Z&X-Amz-SignedHeaders=host&X-Amz-Expires=86400&X-Amz-Credential=d2d518b66ac64de___%2F2019___%2Fus-geo%2Fs3%2Faws4_request&X-Amz-Signature=ce7322d7291396c511a6df38635df4e85b7c78c173___",
"transfer_complete": true,
"size": 9238,
"user_data": {},
"creator_id": "iam-ServiceId-12345___",
"usage": {
"access_count": 1,
"last_accessor_id": "IBMid-___",
"last_access_time": 1556036686480
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/45f4ab8c-37d5-45a1-8adf-726c65b68008/attachments/8d614be0-6900-403b-ab50-___?catalog_id=c6f3cbd8-2b7f-42fb-aa60-___"
}
It's important to understand that the GET Attachment
API only returns a metadata document that describes where, or how, an attached asset resource or extended metadata document can be accessed or retrieved.
The most important field in the above response is "url"
which contains a signed URL that can be used to retrieve the actual extended metadata document. Note that the "url"
points to a completely different server than the server that responds to "Assets API" calls! Extended metadata documents are not stored in the catalog.
Step 4: Use the "url"
in the response from Step 3 to call the relevant server to get the extended metadata document.
The simplest way to use that "url"
value is to paste it into the address bar of a browser, and let the browser retrieve the extended metadata document. Here's a peek at some of the contents of the large extended metadata document that can be retrieved using that "url"
value. That large extended metadata document was created by the Profile API and contains a great deal of extended metadata about our small Sample.csv file:
{
"summary": {
"version": "1.9.3",
"row_count": 2,
"score": 1,
"score_stats": {
"n": 2,
"mean": 1.0,
"variance": 0.0,
"stddev": 0.0,
"min": 1.0,
"max": 1.0,
"sum": 2.0
},
...
},
"columns": [{
"name": "Name",
"value_analysis": {
"distinct_count": 2,
"null_count": 0,
"empty_count": 0,
"unique_count": 2,
"max_value_frequency": 1,
"min_string": "abc",
"max_string": "def",
"inferred_type": {
"type": {
"length": 3,
"precision": 0,
"scale": 0,
"type": "STRING"
}
},
...
}, {
"name": "Number",
"value_analysis": {
"distinct_count": 2,
"null_count": 0,
"empty_count": 0,
"unique_count": 2,
"max_value_frequency": 1,
"min_string": "123",
"max_string": "456",
"min_number": 123.0,
"max_number": 456.0,
"inferred_type": {
"type": {
"length": 3,
"precision": 3,
"scale": 0,
"type": "INT16"
}
},
...
]
}
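The four steps above can be sketched in Python as follows. This is a minimal illustration, not an official client: the function names are hypothetical, the bearer token is assumed to have been created as described earlier, and passing catalog_id as a query parameter on the Get Attachment call is an assumption based on the "href" values in the example responses.

```python
import json
import urllib.parse
import urllib.request


def choose_attachment_id(asset_doc, asset_type):
    """Steps 1 and 2: return the "id" of the single attachment with this asset_type."""
    if asset_type not in asset_doc["metadata"]["asset_attributes"]:
        raise ValueError(f"{asset_type!r} is not listed in metadata.asset_attributes")
    matches = [a for a in asset_doc.get("attachments", []) if a.get("asset_type") == asset_type]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one {asset_type!r} attachment, found {len(matches)}")
    return matches[0]["id"]


def attachment_request_url(service_url, asset_doc, attachment_id):
    """Step 3: build the Get Attachment URL (the catalog_id query parameter is an assumption)."""
    meta = asset_doc["metadata"]
    query = urllib.parse.urlencode({"catalog_id": meta["catalog_id"]})
    return f"{service_url}/v2/assets/{meta['asset_id']}/attachments/{attachment_id}?{query}"


def fetch_extended_metadata(service_url, token, asset_doc, asset_type):
    """Steps 3 and 4: get the attachment metadata, then follow its signed "url"."""
    attachment_id = choose_attachment_id(asset_doc, asset_type)
    request = urllib.request.Request(
        attachment_request_url(service_url, asset_doc, attachment_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        attachment_meta = json.load(response)
    # The signed URL points at a different server, so no bearer token is sent to it.
    with urllib.request.urlopen(attachment_meta["url"]) as response:
        return json.load(response)


# Trimmed copy of the example primary metadata document shown earlier.
example_doc = {
    "metadata": {
        "asset_id": "45f4ab8c-37d5-45a1-8adf-___",
        "catalog_id": "c6f3cbd8-2b7f-42fb-aa60-___",
        "asset_attributes": ["data_asset", "data_profile"],
    },
    "attachments": [
        {"id": "b8c7a390-e857-4c34-add8-___", "asset_type": "data_asset"},
        {"id": "8d614be0-6900-403b-ab50-___", "asset_type": "data_profile"},
    ],
}

profile_attachment_id = choose_attachment_id(example_doc, "data_profile")
profile_url = attachment_request_url(
    "https://api.dataplatform.cloud.ibm.com", example_doc, profile_attachment_id
)
```

Only the first two helpers run without network access; fetch_extended_metadata performs the two HTTP calls and is shown for completeness.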
Get Attachment - Asset Resource:
The 4 steps given above to retrieve an extended metadata document can also be used to retrieve an asset resource such as the Sample.csv file.
The main difference is that in Step 1 you would choose the asset_type "data_asset" because that is the primary asset type of the primary metadata document, i.e., the asset_type that identifies both the primary attribute and the primary attachment (the asset resource).
Create Asset: book
Before you can create a primary metadata document (i.e., card), the asset type that you want to use for that card must already exist. You can use one of the already available asset types, or you can use an asset type that you have created.
The Create Asset Type: book section shows how to create an asset type named book. In this section, that asset type will be used to create a primary metadata document for a book asset resource. That primary metadata document will have:
- a "metadata.asset_type" field with the value "book"
- a primary attribute called "book".
Use the following endpoint to create a primary metadata document for a book asset resource:
Create Asset: book - Request URL:
POST {service_URL}/v2/assets?catalog_id={catalog_id}
Create Asset: book - Request Body:
{
"metadata": {
"name": "Getting Started with Assets",
"description": "Describes how to create and use metadata for assets",
"tags": ["getting", "started", "documentation"],
"asset_type": "book",
"origin_country": "us",
"rov": {
"mode": 0
}
},
"entity": {
"book": {
"author": {
"first_name": "Tracy",
"last_name": "Smith"
},
"price": 29.95
}
}
}
The above request body specifies the preliminary contents for the primary metadata document about to be created. Most of the fields have been described previously in the Asset's Primary Metadata Document section. However, there are a few things to note in particular about the above request:
- "metadata": you supply the values of only some of the fields that will end up appearing inside the "metadata" field of the primary metadata document about to be created, including:
  - "asset_type": the value "book" matches the name of the asset type for this document
  - "name": the name to use for the asset being described by this document
  - "description": a description for the asset
Notice that you do not supply a "metadata.asset_attributes" field in the request body. If you include a "metadata.asset_attributes" field in your Create Asset request body then the request will be rejected because it tried to supply a reserved value. The Assets API reserves control of the contents of the "metadata.asset_attributes" field.
- "entity": you supply the entire contents of the "entity" field
  - "book":
    - this is the primary attribute of the primary metadata document
    - the name of this attribute matches the name of the corresponding primary asset type "book"
    - contains metadata describing a book (does not contain the actual book asset resource)
Notice the above "book" attribute doesn't contain a field called "title", a field which might be expected in an attribute for a book. In this case, we've chosen to put the title of the book in the "metadata.name" field of the card. However, the creator of the "book" attribute is free to include whatever fields they want in that attribute, including a field called "title" if desired.
Create Asset: book - Response Body:
{
"metadata": {
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2019-04-30T14:37:57Z",
"last_updater_id": "IBMid-___",
"last_update_time": 1556635077746,
"last_accessed_at": "2019-04-30T14:37:57Z",
"last_access_time": 1556635077746,
"last_accessor_id": "IBMid-___",
"access_count": 0
},
"name": "Getting Started with Assets",
"description": "Describes how to create and use metadata for assets",
"tags": [
"getting",
"started",
"documentation"
],
"asset_type": "book",
"origin_country": "us",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-___",
"created": 1556635077746,
"created_at": "2019-04-30T14:37:57Z",
"owner_id": "IBMid-___",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"book"
],
"asset_id": "3da5389d-d4a4-43da-be1f-___",
"asset_category": "USER"
},
"entity": {
"book": {
"author": {
"first_name": "Tracy",
"last_name": "Smith"
},
"price": 29.95
}
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/3da5389d-d4a4-43da-be1f-___?catalog_id=c6f3cbd8-___",
"asset_id": "3da5389d-d4a4-43da-be1f-___"
}
Notice that the card returned in the Create Asset Response Body has many more fields than were present in the Request Body. The Create Asset API has added a lot of information to the "metadata" part of the primary metadata document:
- "asset_id": most importantly, the Create Asset API has given your primary metadata document an id
- "owner_id": the API has made the caller of the API the owner of the asset
- "created_at": the API has recorded the time at which the metadata document was created. In general, this is not the same as the time at which an attached asset resource was created (although in this case there is no attached asset resource).
- "total_ratings": contains the number of ratings this asset has received. 0 for now because the primary metadata document is brand new.
- "usage": usage statistics. Since this is a brand new card these statistics don't yet contain much interesting data.
- "asset_attributes": notice that the Create Asset API has added the name of the primary attribute to this array.
On the other hand, notice that the Create Asset API did not modify the contents of the "entity" field in any way. In particular, the Create Asset API did not modify the contents of the primary attribute "book".
Your catalog now contains a primary metadata document for a "book" asset resource.
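The Create Asset call walked through in this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than an official client: the helper names build_create_asset_body and create_asset are hypothetical, and the bearer token is assumed to have been created as described earlier. The sketch names the primary attribute after the asset type and leaves out "metadata.asset_attributes", which the Assets API reserves.

```python
import json
import urllib.request


def build_create_asset_body(asset_type, name, description, attribute, tags=(), origin_country="us"):
    """Build a Create Asset request body whose primary attribute is named after asset_type."""
    return {
        "metadata": {
            "name": name,
            "description": description,
            "tags": list(tags),
            "asset_type": asset_type,
            "origin_country": origin_country,
            "rov": {"mode": 0},
            # "asset_attributes" is deliberately absent: the Assets API reserves it.
        },
        # The primary attribute key must match the asset type name.
        "entity": {asset_type: attribute},
    }


def create_asset(service_url, catalog_id, token, body):
    """POST the body to the Create Asset endpoint and return the parsed response."""
    request = urllib.request.Request(
        f"{service_url}/v2/assets?catalog_id={catalog_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Rebuilds the request body shown in "Create Asset: book - Request Body".
book_body = build_create_asset_body(
    asset_type="book",
    name="Getting Started with Assets",
    description="Describes how to create and use metadata for assets",
    attribute={"author": {"first_name": "Tracy", "last_name": "Smith"}, "price": 29.95},
    tags=["getting", "started", "documentation"],
)
```

Keeping the body builder separate from the HTTP call makes it easy to inspect or log the request before sending it.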
Associate a Term with an Asset
You can associate one or more terms with an asset. Within the asset's primary metadata document a pre-defined attribute called "asset_terms"
(one of the global asset_types) is used to contain the array of terms associated with the asset.
The first time any term is associated with an asset, the Create Asset Attribute API must be used to create and populate an instance of the "asset_terms"
attribute inside the asset's primary metadata document. In the following example, we're only associating a single term with an asset, so the "list" value in the request body is an array with a single element containing info related to that single term.
POST {service_URL}/v2/assets/{asset_id}/attributes?catalog_id={catalog_id}
Body:
{
"name": "asset_terms",
"entity": {
"list": [
{
"term_id": "--insert your term's id here--",
"term_display_name": "first_term"
}
]
}
}
Note: the field named "term_display_name" is not strictly mandatory, but is highly recommended because its value is what will be used when searching for assets by term. Its value is also the term name that users will see in the Catalog UI.
The response should look like:
{
"asset_id": "--your asset id--",
"asset_terms": {
"list": [
{
"term_id": "--your term id--",
"term_display_name": "first_term"
}
]
},
"href": ".../v2/assets/{asset_id}/attributes/asset_terms?catalog_id={catalog_id}"
}
The next time you GET the asset you should see the "asset_terms"
attribute you just created:
{
"metadata": {
...
},
"entity": {
...
"asset_terms": {
"list": [
{
"term_id": "--your term id--",
"term_display_name": "first_term"
}
]
}
},
"attachments": [
...
]
}
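If you want to associate additional terms with an asset, or remove existing terms, then use the Update Asset Attribute API. For example, the following adds an additional term to the already existing "asset_terms" attribute of the asset used in the above example. Because the "replace" operation overwrites the whole "list" array, the request body repeats the existing term alongside the new one: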
PATCH {service_URL}/v2/assets/{asset_id}/attributes/asset_terms?catalog_id={catalog_id}
Body:
[
{
"op": "replace",
"path": "/list",
"value": [
{
"term_id": "--insert your term's id here--",
"term_display_name": "first_term"
},
{
"term_id": "--insert your second term's id here--",
"term_display_name": "second_term"
}
]
}
]
You can GET the asset to see the new collection of terms in the asset_terms attribute of the asset.
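Since the PATCH body above replaces the entire "list" array, a client has to send the complete list it wants to end up with. The following Python sketch shows one way to build such a body from the terms already on the asset; the helper names and the term ids ("term-1", "term-2") are hypothetical placeholders, and sending the request is left out.

```python
import json


def build_terms_patch(existing_terms, new_terms):
    """Return a JSON Patch that replaces /list with existing plus new terms, skipping repeats."""
    merged = list(existing_terms)
    known_ids = {term["term_id"] for term in existing_terms}
    for term in new_terms:
        if term["term_id"] not in known_ids:
            merged.append(term)
            known_ids.add(term["term_id"])
    return [{"op": "replace", "path": "/list", "value": merged}]


def build_terms_removal_patch(existing_terms, term_ids_to_remove):
    """Return a JSON Patch that replaces /list with the terms that should remain."""
    remaining = [t for t in existing_terms if t["term_id"] not in set(term_ids_to_remove)]
    return [{"op": "replace", "path": "/list", "value": remaining}]


# Terms as they appear in entity.asset_terms.list of a previously fetched asset.
current_terms = [{"term_id": "term-1", "term_display_name": "first_term"}]

add_patch = build_terms_patch(
    current_terms, [{"term_id": "term-2", "term_display_name": "second_term"}]
)
remove_patch = build_terms_removal_patch(add_patch[0]["value"], ["term-1"])

# Serialized form that would be sent as the PATCH request body.
add_patch_json = json.dumps(add_patch)
```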
Associate a Classification with an Asset
You can associate one or more classifications with an asset. The classifications associated with the asset reside within the "data_profile" attribute of the asset. When a user runs asset profiling, the profiling job adds the suitable classifications. However, users can also add classifications to an asset manually.
If profiling has not been run on the asset, then the "data_profile" attribute will not exist. In such a scenario, the Create Asset Attribute API must be used to create and populate an instance of the "data_profile" attribute inside the asset's primary metadata document. Since the classification is added manually, the property added will be "data_classification_manual".
In the example below, we are adding a single classification to an asset in a catalog. If you don't know the id and global id of the classification, you can call the GET artifact of a given type API. If you know the id of the classification, you can obtain the global_id of the classification by calling the GET classification API.
POST {service_URL}/v2/assets/{asset_id}/attributes?catalog_id={catalog_id}
Body:
{
"name": "data_profile",
"entity": {
"data_classification_manual": [
{
"id": "--insert your classification's id here--",
"name": "your_classification_name",
"global_id": "--insert your classification's global id here--"
}
]
}
}
The response should look like:
{
"asset_id": "{asset_id}",
"data_profile": {
"data_classification_manual": [
{
"id": "{classification id}",
"name": "your_classification_name",
"global_id": "{classification global id}"
}
]
},
"href": "/v2/assets/{asset_id}/attributes/data_profile?catalog_id={catalog_id}"
}
You can GET the asset to see the new collection of classifications in the data_profile attribute of the asset:
{
"metadata": {
...
},
"entity": {
...
"data_profile": {
"data_classification_manual": [
{
"id": "{classification id}",
"name": "your_classification_name",
"global_id": "{classification global id}"
}
]
}
},
"attachments": [
...
]
}
If you want to associate additional classifications with an asset, or remove existing classifications, then use the Update Asset Attribute API. For example, the following adds an additional classification to the existing "data_profile" attribute of the asset, without replacing the classification added in the previous example:
PATCH /v2/assets/{asset_id}/attributes/data_profile?catalog_id={catalog_id}
Body:
[
{
"op": "add",
"path": "/data_classification_manual/1",
"value":
{
"id":"--insert your second classification's id here--",
"name":"your_second_classification_name",
"global_id":"--insert your second classification's global id here--"
}
}
]
The response should look like:
{
"data_classification_manual": [
{
"id": "{classification id}",
"name": "your_classification_name",
"global_id": "{classification global id}"
},
{
"id": "{second classification id}",
"name": "your_second_classification_name",
"global_id": "{second classification global id}"
}
]
}
The following example replaces all the classifications associated with the asset with a new list:
PATCH /v2/assets/{asset_id}/attributes/data_profile?catalog_id={catalog_id}
Body:
[
{
"op": "replace",
"path": "/data_classification_manual",
"value":
[
{
"id":"--insert your classification's id here--",
"name":"your_classification_name",
"global_id":"--insert your classification's global id here--"
}
]
}
]
The response should look like:
{
"data_classification_manual": [
{
"id": "{classification id}",
"name": "your_classification_name",
"global_id": "{classification global id}"
}
]
}
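To make the difference between the two PATCH bodies above concrete, the following Python sketch applies them to a local copy of the "data_profile" attribute. It is a local simulation of the JSON Patch "add" and "replace" operations for these two path shapes only, not the service's implementation; the function name apply_patch and the classification ids are hypothetical.

```python
import copy


def apply_patch(attribute, operations):
    """Apply "add" (array insert) and "replace" (member overwrite) operations to a copy."""
    result = copy.deepcopy(attribute)
    for operation in operations:
        parts = operation["path"].lstrip("/").split("/")
        if operation["op"] == "replace" and len(parts) == 1:
            if parts[0] not in result:
                raise KeyError(f"cannot replace missing member {parts[0]!r}")
            result[parts[0]] = operation["value"]
        elif operation["op"] == "add" and len(parts) == 2:
            target = result[parts[0]]
            index = int(parts[1])
            if not 0 <= index <= len(target):
                raise IndexError(f"index {index} is out of range for {parts[0]!r}")
            # "add" inserts at the index and keeps the existing elements.
            target.insert(index, operation["value"])
        else:
            raise ValueError(f"unsupported operation: {operation}")
    return result


first = {"id": "class-1", "name": "your_classification_name", "global_id": "global-1"}
second = {"id": "class-2", "name": "your_second_classification_name", "global_id": "global-2"}
profile = {"data_classification_manual": [first]}

# "add" at index 1 appends after the one existing classification.
after_add = apply_patch(
    profile, [{"op": "add", "path": "/data_classification_manual/1", "value": second}]
)
# "replace" discards the whole array and substitutes the new list.
after_replace = apply_patch(
    after_add, [{"op": "replace", "path": "/data_classification_manual", "value": [second]}]
)
```

Note that the index in the "add" path has to match the current length of the array for the new item to land at the end, so a client would normally GET the asset first.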
Duplicate Asset
Duplicate Asset Overview
When a CAMS call tries to create an asset (e.g., create a new asset, promote/publish/clone an asset, etc.), CAMS can optionally detect pre-existing duplicate assets and take appropriate actions based on configurations and query parameters, e.g., ignoring the duplicates and creating a new asset, failing the call and returning an error saying duplicates were found, or updating the existing duplicate.
Similarly, when a CAMS call tries to update an asset, CAMS can optionally detect whether the asset would have any duplicates if the incoming change were persisted, and take appropriate actions based on configurations and query parameters, e.g., ignoring the duplicates and updating the asset, or failing the call and returning an error saying the change would result in duplicates.
This process is called duplicate asset processing. This section describes how duplicate asset processing works in CAMS and how you can configure it to behave the way you want.
What is a duplicate
An asset is considered a duplicate if it fits any of the following scenarios:
- Original asset - the asset that the incoming asset was originally cloned/published from.
For instance, if you cloned/published an asset A to a project/catalog, resulting in asset B, and then try to publish/clone asset B back to the original catalog/project, asset A will be seen as the original asset and considered a duplicate.
- Copies of the same asset - an asset that was cloned/published/promoted from the same asset as the incoming asset.
For example, if you cloned/published/promoted an asset A to a project/catalog/space, resulting in asset B, and then try to clone/publish/promote asset A again to the same target project/catalog/space, asset B will be seen as a copy of the same asset and considered a duplicate.
- Asset with the same values - an asset that has the same values as the incoming asset based on the effective duplicate detection strategy of the asset type. See the Duplicate Detection Strategy section for more details about duplicate detection strategy.
Let's say that the effective duplicate detection strategy of the asset type data_asset in a project is DUPLICATE_DETECTION_BY_NAME (i.e., duplicate detection is based on the metadata.name field). If you try to create an asset of type data_asset with the name KPIReport2021 in this project, and there is an existing asset A of type data_asset with the same name, then asset A will be considered a duplicate.
What to do with a duplicate
Users can set the configuration duplicate_action of the asset containers and/or specify the query parameter duplicate_action while calling endpoints to control how the service handles duplicate assets. The valid values of duplicate_action for calls creating a new asset are:
- IGNORE - ignore the duplicates and create a new asset
- REJECT - fail the call and return an error response similar to the one below (no asset will be created):
{
"trace": "290c281c-4adc-4e40-aa49-aaf7cd2dbf6a",
"errors": [
{
"code": "already_exists",
"message": "ASTSV3040E: Duplicate assets exist. '[cc5f7412-5c96-4d66-9c14-40b3c944ad79, 244a3612-63a8-4140-9423-f40841be33ee]'"
}
]
}
- UPDATE - update the duplicate with the incoming changes. See Multiple duplicates for how the duplicate to update is chosen if more than one duplicate is found. See How the duplicate asset is updated or overwritten for how the duplicate asset is updated.
- REPLACE - overwrite the duplicate with the input values. See Multiple duplicates for how the duplicate to overwrite is chosen if more than one duplicate is found. See How the duplicate asset is updated or overwritten for how the duplicate asset is overwritten.
The valid values of duplicate_action for calls making changes to an existing asset (including restoring a deleted asset) are:
- IGNORE - ignore the duplicates and update the asset
- REJECT - fail the call and return an error response similar to the one below. The asset will not be updated or restored.
{
"trace": "8ea27315-c958-435a-b415-e3632a664dbc",
"errors": [
{
"code": "already_exists",
"message": "ASTSV3128E: The asset will have duplicate assets with IDs '[a0a54f65-8a3d-4f22-95b7-5456d8e43a71]' after saving the change."
}
]
}
Or the response below when restoring an asset. The message also includes the id of the asset that would have duplicates, because restoring an asset may restore multiple related assets and some of those related assets may have duplicates.
{
"trace": "7b2668a4-c794-4720-83dd-f4f16e9d0a04",
"errors": [
{
"code": "already_exists",
"message": "ASTSV3128E: The asset '8c740814-5fc0-4294-a2be-f042f15098f1' will have duplicate assets with IDs '[a0a54f65-8a3d-4f22-95b7-5456d8e43a71]' after being restored."
}
]
}
UPDATE and REPLACE are not allowed as values of the query parameter duplicate_action for calls updating assets. However, if the query parameter duplicate_action is not supplied and the configuration duplicate_action is set to one of these values at the asset container level, that value becomes the effective duplicate_action, in which case it is treated the same as REJECT.
The configuration duplicate_action can be set at the asset container level during the creation of a container and can be modified later by using the endpoint PUT /v2/asset_containers/configurations. If the configuration duplicate_action is not specified in an asset container, it is equivalent to IGNORE.
The following example shows how to supply the configuration duplicate_action
(along with other duplicate asset processing related configurations) while creating a catalog:
{
"name": "my catalog",
...
"configurations": {
"duplicate_action": "REJECT",
...
}
}
The following example shows how to update the configuration by using the endpoint PUT /v2/asset_containers/configurations
:
{
"duplicate_action": "REPLACE",
...
}
The configuration duplicate_action can be overridden by the query parameter duplicate_action for individual calls to control how CAMS handles duplicates for those particular calls. The endpoints that support the query parameter are listed below. Note that the allowed options may differ from endpoint to endpoint, depending on whether the endpoint supports all available options.
POST /v2/assets
POST /v2/data_assets
POST /v2/assets/{asset_id}/publish
POST /v2/assets/{asset_id}/clone
POST /v2/assets/{asset_id}/promote
POST /v2/assets/{asset_id}/deepcopy
POST /v2/assets/bulk_create
POST /v2/assets/bulk_patch
PATCH /v2/assets/{asset_id}
POST /v2/assets/{asset_id}/attributes
PATCH /v2/assets/{asset_id}/attributes/{attribute_key}
DELETE /v2/assets/{asset_id}/attributes/{attribute_key}
POST /v2/assets/{asset_id}/attachments
DELETE /v2/assets/{asset_id}/attachments/{attachment_id}
POST /v2/trashed_assets/{asset_id}/restore
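The rules above for combining the container configuration with the query parameter can be summarized in a short Python sketch. This only restates the documented behavior; it is not CAMS code, the function name effective_duplicate_action is hypothetical, and treating a disallowed query value as an error is an assumption about how such a call would be handled.

```python
CREATE_ACTIONS = {"IGNORE", "REJECT", "UPDATE", "REPLACE"}
UPDATE_QUERY_ACTIONS = {"IGNORE", "REJECT"}


def effective_duplicate_action(query_value, container_value, is_update_call):
    """Resolve the duplicate_action that applies to one call.

    query_value: the duplicate_action query parameter, or None if not supplied.
    container_value: the duplicate_action configuration of the container, or None.
    is_update_call: True for calls that change (or restore) an existing asset.
    """
    if query_value is not None:
        allowed = UPDATE_QUERY_ACTIONS if is_update_call else CREATE_ACTIONS
        if query_value not in allowed:
            raise ValueError(f"duplicate_action={query_value} is not allowed for this call")
        # The query parameter overrides the container configuration.
        return query_value
    # An unset container configuration is equivalent to IGNORE.
    action = container_value or "IGNORE"
    if is_update_call and action in {"UPDATE", "REPLACE"}:
        # Inherited UPDATE/REPLACE is treated the same as REJECT on update calls.
        return "REJECT"
    return action
```

For example, a container configured with REPLACE overwrites duplicates on create calls, but the same setting causes update calls that would produce duplicates to be rejected.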
Duplicate Detection Strategy
The duplicate detection strategy defines which fields are used to determine whether assets of a particular asset type are duplicates. The available duplicate detection strategies are:
- DUPLICATE_DETECTION_BY_NAME - the metadata.name field
- DUPLICATE_DETECTION_BY_RESOURCE_KEY - the metadata.resource_key field
- DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY - the metadata.name and the metadata.resource_key fields
- DUPLICATE_DETECTION_NOT_APPLICABLE - no duplicate will be determined by the strategy. This strategy also disables the duplicate detection for original asset and copies of the same asset. In other words, it disables duplicate detection for assets of the asset type completely.
The duplicate detection strategy can be set at several levels as shown below (from the highest priority to the lowest priority). A setting with a higher priority takes precedence over a setting with a lower priority.
- Strategy of the asset type defined in the asset container configuration, e.g.,
{
"duplicate_strategies": [
{
"asset_type": "data_asset",
"strategy": "DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY"
}
]
}
- Strategy specified in the asset type definition, e.g.,
{
"description": "Job Run",
"fields": [],
"identity": {
"strategy": "DUPLICATE_DETECTION_NOT_APPLICABLE"
}
}
- System default strategy of the asset type in the residing asset container, i.e.,
  - Project/Space:
    - connection: DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
  - Catalog:
    - connection: DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
    - data_asset: DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
- The default_duplicate_strategy specified in the asset container configuration (which applies to all asset types), i.e.,
{
"default_duplicate_strategy": "DUPLICATE_DETECTION_BY_NAME"
}
- System default strategy DUPLICATE_DETECTION_BY_NAME (which applies to all asset types)
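The priority order above can be expressed as a lookup that stops at the first level providing a strategy. The Python sketch below restates the documented order; it is not CAMS code, and the function name effective_strategy, the container_kind labels, and the shape of the arguments are assumptions made for illustration.

```python
BY_NAME = "DUPLICATE_DETECTION_BY_NAME"
BY_NAME_AND_KEY = "DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY"

# System defaults per asset type, keyed by the kind of asset container.
SYSTEM_TYPE_DEFAULTS = {
    "project": {"connection": BY_NAME_AND_KEY},
    "space": {"connection": BY_NAME_AND_KEY},
    "catalog": {"connection": BY_NAME_AND_KEY, "data_asset": BY_NAME_AND_KEY},
}


def effective_strategy(asset_type, container_kind, container_config, type_definition):
    """Return the duplicate detection strategy that applies, highest priority first."""
    # 1. Per-asset-type strategy in the asset container configuration.
    for entry in container_config.get("duplicate_strategies", []):
        if entry["asset_type"] == asset_type:
            return entry["strategy"]
    # 2. Strategy in the asset type definition.
    identity = type_definition.get("identity", {})
    if "strategy" in identity:
        return identity["strategy"]
    # 3. System default for this asset type in this kind of container.
    per_type = SYSTEM_TYPE_DEFAULTS.get(container_kind, {})
    if asset_type in per_type:
        return per_type[asset_type]
    # 4. Container-wide default, then 5. the system-wide default.
    return container_config.get("default_duplicate_strategy", BY_NAME)
```

One consequence of this order is that a container-wide default_duplicate_strategy does not change the behavior of connection assets (or data_asset in a catalog); those need an explicit duplicate_strategies entry.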
Multiple duplicates
It is possible that a call may find more than one duplicate for a given asset. The duplicates might have been created before CAMS started processing duplicate assets, or because multiple calls were creating the same asset at the same time.
If the effective value for duplicate_action is REJECT, CAMS will fail the call and return the asset ids of all the duplicates in the error response. If the effective value for duplicate_action is UPDATE or REPLACE, CAMS will rank the duplicates in the following order and choose, as the target for updating or overwriting, the highest-ranked duplicate that the caller has permission to update. When all else is equal, the duplicates are ranked by created time (i.e., the metadata.created field) from the earliest to the most recent.
- Original asset
- Copies of the same asset
- Asset with the same values
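The selection rule above amounts to sorting the candidates by duplicate category and then by creation time. The following Python sketch illustrates that ordering; it is not CAMS code, and the "kind" labels, the record shape, and the can_update callback are hypothetical stand-ins for information the service has internally.

```python
# Lower number means higher priority, following the documented order.
RANK = {"original": 0, "copy": 1, "same_values": 2}


def choose_target(duplicates, can_update):
    """Pick the duplicate to update or overwrite, or None if the caller may update none.

    duplicates: dicts with "asset_id", "kind" (a key of RANK) and "created" (epoch millis).
    can_update: callable returning True if the caller has permission to update the asset.
    """
    candidates = [d for d in duplicates if can_update(d)]
    if not candidates:
        return None
    # Ties within a category go to the earliest created asset.
    return min(candidates, key=lambda d: (RANK[d["kind"]], d["created"]))


found = [
    {"asset_id": "a1", "kind": "same_values", "created": 1000},
    {"asset_id": "a2", "kind": "copy", "created": 3000},
    {"asset_id": "a3", "kind": "copy", "created": 2000},
    {"asset_id": "a4", "kind": "original", "created": 4000},
]

# The caller lacks permission on the original asset, so the earliest copy wins.
target = choose_target(found, can_update=lambda d: d["asset_id"] != "a4")
```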
How the duplicate asset is updated or overwritten
If the effective value of duplicate_action is UPDATE or REPLACE, the following information of the chosen duplicate asset (the highest-ranked duplicate that the caller has permission to update) will be updated or overwritten with the values from the incoming asset, according to the rules below. No other aspects of the duplicate asset will be updated or overwritten, e.g., privacy settings, collaborators, owners, source asset, ratings, revisions, non-inline relationships, related assets, etc.
- metadata
  - metadata.name
  - metadata.description
  - metadata.resource_key
  - metadata.origin_country
  - metadata.tags
- attributes (e.g., entity.data_asset, entity.data_profile, entity.column_info, etc.)
- attachments (i.e., attachments[*])
If duplicate_action is UPDATE, the chosen duplicate asset will be updated as described below. The result will look the same as the duplicate asset but with the incoming changes.
- The listed metadata fields will be replaced with the values from the incoming asset if the value from the incoming asset is not null
- All attributes in the incoming asset will be copied over and any existing attribute will be replaced. Some exceptions may apply, e.g.,
  - If the existing or incoming attribute connection represents a reference connection, the existing attribute connection will not be replaced.
  - Reference copy terms are preserved
- All attachments in the incoming asset will be copied over. Any existing attachments with the same asset_type as the attachments in the incoming asset will be removed.
If duplicate_action is REPLACE, the chosen duplicate asset will be updated as described below. The result will look the same as the incoming asset with some exceptions.
- All listed metadata fields will be replaced with the values from the incoming asset (even if the value from the incoming asset is null)
- All existing attributes will be removed and all attributes in the incoming asset will be copied over, with the exception of the attribute connection. If the existing or incoming attribute connection represents a reference connection, the existing attribute connection will remain unchanged.
- All existing attachments will be removed and all attachments in the incoming asset will be copied over
Suppose you have an existing asset that looks like the following
{
"metadata": {
"name":"B",
"description":"C",
"tags":[
"confidential"
]
},
"entity": {
"data_asset": {
"mime_type": "binary"
},
"something": {}
},
"attachments":[
{
"asset_type": "data_asset",
"name": "attachment 1"
},
{
"asset_type": "something",
"name": "attachment 2"
}
]
}
and you try to add the following asset
{
"metadata": {
"name":"A"
},
"entity": {
"data_asset": {
"mime_type": "text/csv"
}
},
"attachments":[
{
"asset_type": "data_asset",
"name": "attachment 3"
}
]
}
If the effective duplicate_action is UPDATE, the existing asset will be modified to be the following
{
"metadata": {
"name":"A",
"description":"C",
"tags":[
"confidential"
]
},
"entity": {
"data_asset": {
"mime_type": "text/csv"
},
"something": {}
},
"attachments":[
{
"asset_type": "something",
"name": "attachment 2"
},
{
"asset_type": "data_asset",
"name": "attachment 3"
}
]
}
If the effective duplicate_action is REPLACE, the existing asset will be modified to be the following
{
"metadata": {
"name":"A"
},
"entity": {
"data_asset": {
"mime_type": "text/csv"
}
},
"attachments":[
{
"asset_type": "data_asset",
"name": "attachment 3"
}
]
}
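The two results above can be reproduced with a short Python sketch of the UPDATE and REPLACE rules. It illustrates the rules as described in this section and is not CAMS code: the function name `apply_duplicate_action` is hypothetical, a null incoming value is modelled as an absent key, and the exceptions for reference connections and reference copy terms are deliberately left out.

```python
import copy

# The metadata fields that duplicate processing may update or overwrite.
METADATA_FIELDS = ("name", "description", "resource_key", "origin_country", "tags")


def apply_duplicate_action(existing, incoming, action):
    """Return a modified copy of `existing` after applying UPDATE or REPLACE."""
    result = copy.deepcopy(existing)
    meta = result.setdefault("metadata", {})
    in_meta = incoming.get("metadata", {})
    in_entity = copy.deepcopy(incoming.get("entity", {}))
    in_attachments = copy.deepcopy(incoming.get("attachments", []))

    if action == "UPDATE":
        # Only non-null incoming metadata values replace existing ones.
        for field in METADATA_FIELDS:
            if in_meta.get(field) is not None:
                meta[field] = copy.deepcopy(in_meta[field])
        # Incoming attributes replace same-named attributes; others are kept.
        result.setdefault("entity", {}).update(in_entity)
        # Existing attachments sharing an asset_type with an incoming one are dropped.
        incoming_types = {a.get("asset_type") for a in in_attachments}
        kept = [a for a in result.get("attachments", [])
                if a.get("asset_type") not in incoming_types]
        result["attachments"] = kept + in_attachments
    elif action == "REPLACE":
        # Every listed metadata field is overwritten, even with null (removed here).
        for field in METADATA_FIELDS:
            value = in_meta.get(field)
            if value is None:
                meta.pop(field, None)
            else:
                meta[field] = copy.deepcopy(value)
        # All existing attributes and attachments are replaced by the incoming ones.
        result["entity"] = in_entity
        result["attachments"] = in_attachments
    else:
        raise ValueError(f"unsupported duplicate_action: {action}")
    return result
```

Calling the function with the existing and incoming assets shown above yields the UPDATE and REPLACE results shown above, respectively.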
Backup revision
When the best duplicate is updated as a result of asset duplicate processing, a backup revision is created so that the change can later be reviewed or reverted. The revision will contain commit information similar to the following:
{
"committed_at": "2020-11-17T08:13:39.103Z",
"commit_message": "Backup prior to update the best duplicate",
"reason": "update_duplicate",
"duplicate_source": {
"operation": "clone",
"asset_id": "ca0007d9-051d-478f-b87f-82f38fc6997c",
"catalog_id": "c548b021-e026-49a6-aa60-35fe478afdb5"
}
}
In such a case, the asset in the response will contain the previous_revision field, which can be used to determine whether the call created a new asset or updated an existing asset.
{
"metadata": {
...,
"commit_info": {
"previous_revision": 1
}
},
"entity": {
...
},
"asset_id": "c1bf6686-836c-4f93-b173-c2ed52da8e76"
}
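A caller could test for that field with a helper like the one below. This is a sketch based only on the response shape shown above; the function name is hypothetical, and it assumes the field is absent when a new asset was created.

```python
def updated_existing_asset(response_asset):
    """True if the create call ended up updating an existing duplicate."""
    commit_info = response_asset.get("metadata", {}).get("commit_info") or {}
    # previous_revision is present only when a backup revision was made.
    return "previous_revision" in commit_info
```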
Check duplicates before creating an asset
Duplicate asset processing automatically kicks in when a CAMS call tries to create an asset. However, in some cases, you may want to check for possible duplicates before creating an asset, or before publishing, cloning, promoting, or deep-copying an asset. CAMS provides the endpoint POST /v2/assets/duplicates/search to help you do this. You can supply either an existing asset or an asset payload to check for duplicates. The endpoint will list all the potential duplicates and why they were considered duplicates.
Lineage and Activity event messages change
If the query parameter duplicate_action is UPDATE or REPLACE and duplicate assets are found, the calls will change from creating a new asset to updating an existing asset. As a result, the corresponding Lineage and Activity event messages will also change from creating an asset to updating an asset.
Known issues/limitations
Duplicate assets may be created due to race condition
When a call is made to create an asset, CAMS searches for potential duplicates and creates the asset only if no duplicate is found. If multiple calls are made at exactly the same time to create the same asset, a duplicate can result. For example, two calls try to create the same asset at the same time; both calls search for potential duplicates and find none, so both conclude there is no duplicate and create the asset. As a result, the same asset is created twice and each asset is a duplicate of the other.
There is currently no way to prevent such cases from happening. If this happens, the user must choose one of these assets and delete the others. Internally, CAMS always favours the asset that has the earliest value in the metadata.created field and chooses it for future duplicate updating/overwriting operations.
Troubleshooting
Duplicate assets are not detected
Sometimes you may see assets that look like duplicates but are not detected as duplicates. You can call the GET /v2/assets/{asset_id} API to get the JSON representation of these assets and compare the fields that are used for identifying duplicates. If the fields are not the same, then the assets are not duplicates. Otherwise, this may be one of the situations in which real duplicates are not detected. Please see the Known issues/limitations section for more details.
For instance, if the effective duplicate detection strategy is DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY, you can compare the metadata.name and the metadata.resource_key fields of these assets and see if they are indeed the same; if the effective duplicate detection strategy is DUPLICATE_DETECTION_BY_NAME, you can compare the metadata.name field of these assets. For details about which fields are used for a strategy, please see the Duplicate detection strategy section.
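The comparison can be scripted once the two JSON documents have been fetched. The Python sketch below covers only the two strategies mentioned in this troubleshooting section; the function name `differing_identity_fields` and the mapping are assumptions made for illustration.

```python
# Fields compared for the two strategies discussed above.
IDENTITY_FIELDS = {
    "DUPLICATE_DETECTION_BY_NAME": ("name",),
    "DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY": ("name", "resource_key"),
}


def differing_identity_fields(asset_a, asset_b, strategy):
    """List the metadata fields that keep two assets from being duplicates."""
    if strategy not in IDENTITY_FIELDS:
        raise ValueError(f"strategy not covered by this sketch: {strategy}")
    meta_a = asset_a.get("metadata", {})
    meta_b = asset_b.get("metadata", {})
    return [f"metadata.{field}" for field in IDENTITY_FIELDS[strategy]
            if meta_a.get(field) != meta_b.get(field)]
```

An empty list means the identifying fields match, so the assets would be expected to be detected as duplicates under that strategy.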
Asset Types
Asset Types serve multiple purposes in the Assets API. Asset types fall into two categories:
- Primary asset type:
  - describes the primary type of an asset
  - every primary metadata document (i.e., card) will have exactly one primary asset type, whose name will be stored in the card's "metadata.asset_type" field
  - every card will have exactly one primary attribute whose name matches the name of the primary asset type
  - a very common example of a primary asset type is the "data_asset" type, examples of which are shown throughout this documentation
- Secondary / Extended asset type:
  - a secondary / extended asset type describes an inter-related group of additional metadata for an asset resource
  - a primary metadata document can have 0, 1, or many secondary / extended asset types
  - information for a secondary / extended asset type is stored in a secondary / extended attribute in a primary metadata document
  - a very common example of a secondary / extended asset type is "data_profile". See Get Asset - CSV File - Response Body - After Profiling for an example "data_profile" attribute.
  - secondary / extended / decorator asset types can be used to simulate type inheritance
The names of various asset types are used in the following ways, all at once, within a single primary metadata document:
- describe the type of an asset resource, via the "metadata.asset_type" field
- describe the type of an object that contains extended information for an asset resource. For example, the type of an extended metadata document via an "attachments[_].asset_type" field.
- assign names and types to attributes in the "entity" field of a primary metadata document
- implicitly tie various related parts of a primary metadata document to each other. For example, see the green rectangles and arrows in the Parts of a Primary Metadata Document Figure.
The content, or definition, of an asset type serves the following purposes:
- tell the catalog what fields of an attribute should be indexed for searching
- specify search paths and cross attribute searching
- specify additional features like relationships and external asset previews (both of which are beyond the scope of this document)
An asset type must exist in the catalog before it can be used for any of the above purposes.
As of this writing there are several asset types available, including the following:
- data_asset
- folder_asset
- policy_transform
- asset_terms
- column_info
- connection
- ai_training_definition
- data_flow
- activity
- notebook
- machine-learning-stream
- dashboard
- data_profile_nlu
You are free to use any of the above asset types. You do not have to, nor are you allowed to, create or over-write any of the above asset types.
Use the Create Asset Type API to create your own asset type. See Asset Type Fields for an overview of the specification of an asset type. See Create Asset Type: book for an example of creating an asset type.
Asset Type Fields
Here is a description for each of the fields in the definition of an asset type. You supply values for these fields when creating an asset type. You will see those same values returned when you get a list of asset types or get a specific asset type.
"name":
- the name and identifier for the asset type
- should contain only lowercase letters
- will be used in various places in primary metadata documents
- can be used in catalog searches of attribute contents
"description": a description for this asset type
"fields":
- an array that contains information for the fields in the corresponding attribute that should be indexed for subsequent searches.
- does not (necessarily) describe all the fields in attributes of this asset type.
- there must be at least one item in this array. In other words, there must be at least one index for an asset type.
- see the following Fields Table for a description of the contents of an item in the "fields" array
- see "fields" and "properties" Note below
"properties":
- an object that contains "non-index" information for the fields in the corresponding attribute. This information is typically used by UIs that display/edit assets.
- does not (necessarily) describe all the fields in attributes of this asset type.
- see the following Properties Table for a description of the contents of an item in the "properties" object
- see "fields" and "properties" Note below
"external_asset_preview": beyond the scope of this document
"relationships": beyond the scope of this document
"localized_metadata_attributes": a description for this asset type
"decorates": list of pointers to types this type can decorate
"identity": identity definition of the assets of the type
"global_search_searchable": list of fields made searchable through Global Search functionality
- The fields correspond to the "key" defined in the field
"attribute_behavior": defines the behavior only if this asset type is used as an attribute
- it contains a field "remove_on_copy". This field can take boolean values true or false.
- when "remove_on_copy" is true, then during cloning, publishing or promoting the asset, the attribute is removed from the asset
"attribute_only": takes boolean values true or false. When true, this attribute will be used as a custom attribute
"data_protection": asset type properties and attachments that should be removed when a user who does not have access tries to view the data
"is_column_custom_attribute": takes boolean values true or false. When true, this attribute can be applied as a column level attribute. See Columns and Column Level Attributes
"can_have_image": takes boolean values true or false. Set to true if the asset type is expected to support uploading of images
"icon_id": name or id of the svg image to be used as the icon for the asset type
"allow_decorators": takes boolean values true or false. When true, custom attributes can decorate this asset type
Note: "fields" and "properties" can, optionally, both be used to describe the exact same field in an attribute. Whether you use "fields" and/or "properties" depends on what you want to specify for a field. For example, if you're creating an asset type named "person" and a person has a field called "birthdate" (resulting in "entity.person.birthdate" being present in the primary metadata document) then:
- if you want birthdate to be indexed (for searching) then you would include an entry in the "fields" array for birthdate
- if you want a UI to understand/display the birthdate properly then you would include an entry in the "properties" object for that same birthdate field
See this example which shows both an example "fields" array and an example "properties" object.
Fields Table
Key | Description | Example | Required |
---|---|---|---|
key | the name of both the field that will appear in an attribute for this asset type, and the name of the corresponding index for that attribute field | data_asset.mime_type | Yes |
type | the data type of the field being indexed | boolean, or number, or string | Yes |
facet | beyond the scope of this document | true or false | No. Defaults to false. |
search_path | a json path that locates a field in the attribute | See Search Path Examples below. | Yes |
is_searchable_across_types | specifies whether this field can be used in a query without specifying the asset type | true or false | No. Defaults to false. |
Properties Table
Name | Type | Description |
---|---|---|
type | String | Specifies the data type for the property. This value is required. Possible types are: string, number |
description | String | A displayable string to describe the property. |
is_array | boolean | true if the property value is multi-valued (json array). |
required | boolean | true if the property requires a value to be set. |
hidden | boolean | true if the application UI should not display the property or value. |
readonly | boolean | true if the property should not be changed once set. |
default_value | matches the "type" | A value that should be set if no value is provided when the asset attribute is created. |
placeholder | string | A string an application UI can use as a prompt before a value is entered. |
values | array, elements matching "type" | An array of allowed values for the property. Used to describe a limited enumeration or "choice list". |
minimum | integer/number | For an integer or number property, the minimum allowed value. |
maximum | integer/number | For an integer or number property, the maximum allowed value. If both minimum and maximum are specified, minimum must be less than or equal to maximum. |
min_length | integer | For a string property, the minimum allowed length. If specified, must be greater than or equal to zero. |
max_length | integer | For a string property, the maximum allowed length. If specified, must be greater than or equal to zero. If both min_length and max_length are specified, min_length must be less than or equal to max_length. |
properties | object | For a property of type 'object', the recursive definition of the properties, described as in this table. This allows describing nested object-valued properties. |
Search Path Examples
- See the request body in Create Asset Type: book for an example of where a search path is used in the definition of an asset type.
  - Note: when you specify a search path in the definition of an asset type's "field", you only specify the path within the correspondingly named attribute. You needn't specify the attribute name. For example, if you have an attribute called "book" that has a field called "author.last_name" within it, you only need to specify "author.last_name" as the search path, not "book.author.last_name".
- See Search Asset Type: attribute - book for an example of where a search path is used in the body of a search.
  - Note: when you specify a search path in the body of a search you must specify the name of the attribute being searched. For example, if you have an attribute called "book" that has a field called "author.last_name" within it, you would include the name of the attribute in the search path: "book.author.last_name".
- "price": a simple path contains just the name of the field to be searched. In this case the attribute being searched should have a simple field called "price".
- "tags[]": traverse a json array called "tags". Because tags[] is not followed by any further names it must be a basic type (e.g. string, boolean, or number), and so its elements will be indexed directly.
- "asset_terms[].name": this search path indicates a path starting with a json object named "term_assignments" at the top, traversing through a json array named asset_terms (you use the [] at the end of the field name to indicate it's an array), landing on another json object that has a field called "name". The "name" field will be indexed.
- "asset_terms[0].name": same as above but only the first element in the "asset_terms" array will be traversed.
- "columns.*.tags[]": traverse an object called "columns" followed by any column name (the '*' indicates a wildcard), followed by a json array called "tags". Because tags[] is not followed by any further names it must be a basic type (e.g. string, boolean, or number), and so its elements will be indexed directly.
- "column_tags.*[]": the json object "column_tags" contains a series of arrays indicated by *[]. The name of the array object doesn't matter - we want to index it.
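To make the path notation concrete, here is a small Python sketch that evaluates such paths against an attribute held as a dictionary. It is only an illustration of the notation described above, not the catalog's indexer: the function name `resolve_search_path` and the token grammar (a name or `*`, optionally followed by `[]` or `[n]`) are assumptions inferred from these examples.

```python
import re

# One path segment: a field name or "*", optionally followed by [] or [n].
TOKEN = re.compile(r"^(\*|\w+)(\[(\d*)\])?$")


def resolve_search_path(attribute, path):
    """Return the list of values that `path` reaches inside `attribute`."""
    values = [attribute]
    for token in path.split("."):
        match = TOKEN.match(token)
        if match is None:
            raise ValueError(f"unsupported path segment: {token}")
        name, bracket, index = match.group(1), match.group(2), match.group(3)
        next_values = []
        for value in values:
            if not isinstance(value, dict):
                continue
            # "*" matches every key of the object; a name matches one key.
            if name == "*":
                children = list(value.values())
            else:
                children = [value[name]] if name in value else []
            for child in children:
                if bracket is None:
                    next_values.append(child)
                elif isinstance(child, list):
                    if index == "":
                        next_values.extend(child)  # [] traverses all elements
                    elif int(index) < len(child):
                        next_values.append(child[int(index)])  # [n] picks one
        values = next_values
    return values
```

With an attribute such as `{"columns": {"c1": {"tags": ["t1"]}, "c2": {"tags": ["t2", "t3"]}}}`, the path "columns.*.tags[]" reaches the three tag strings regardless of the column names.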
Global Search searchable custom attributes
Fields that have been identified as Global Search searchable, by including their key in the global_search_searchable array, will be synchronized as custom attributes to the Global Search microservice and will become searchable via Global Search. Note that a field will only be synchronized to Global Search if its definition contains a valid search_path. Values found at that search_path will be synchronized to Global Search; otherwise the field will be ignored. Any values in the global_search_searchable array that do not correspond to an existing field of the type will be ignored.
An example of a valid configuration of global_search_searchable attributes would be:
{
"name": "book",
"description": "Book asset type",
"fields": [
{
"key": "last_name",
"type": "string",
"facet": false,
"is_array": false,
"search_path": "author.last_name",
"is_searchable_across_types": true
}
],
"global_search_searchable": [
"last_name"
],
"properties": {
"price" : {
"type": "number",
"description": "Suggested retail price"
}
}
}
The following are key characteristics of a valid definition:
- The key of the field last_name is included in the global_search_searchable array.
- The field last_name has a properly defined search_path which contains a valid json path.
Common pitfalls:
- Omitting search_path from the field definition will cause the field to be ignored and prevent it from being indexed. Similarly, a search_path that is not formatted correctly, or that points to the wrong location, will prevent the field value from being indexed as global_search_searchable.
- Make sure that only the key value of the field is used inside global_search_searchable. In the above example last_name is the key of the field, so the global_search_searchable array contains last_name.
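The two pitfalls can be caught before submitting a definition with a check like the following Python sketch. It mirrors only the rule as worded above (a listed key must match a field key, and that field must have a search_path); it does not verify that the path itself is valid, and the function name is hypothetical.

```python
def ignored_global_search_keys(asset_type):
    """Map each global_search_searchable entry that would be ignored to a reason."""
    fields_by_key = {field.get("key"): field for field in asset_type.get("fields", [])}
    ignored = {}
    for key in asset_type.get("global_search_searchable", []):
        field = fields_by_key.get(key)
        if field is None:
            ignored[key] = "no field with this key"
        elif not field.get("search_path"):
            ignored[key] = "field has no search_path"
    return ignored
```

For the "book" definition above the result is an empty dictionary, since last_name is a field key and that field defines a search_path.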
data_asset Type
"data_asset" is by far the most commonly used of the asset types that are already available. It can be seen in:
- the Parts of a Primary Metadata Document Figure
- many of the examples in the Assets API Examples and Asset Types API Examples sections
- the default asset type used when you drag an asset resource file onto the Create Asset page.
The reason "data_asset" is so popular is that it is a generic asset type that allows you to declare a specific type for a given asset resource without explicitly creating an asset type named after that specific type. For example, say you want to create a primary metadata document for a csv file. You could first create a specific asset type named, say, "csv_file", and then create a primary metadata document (for that csv file) and specify "csv_file" as the value for "metadata.asset_type". However, you can avoid creating a specific "csv_file" asset type by instead using the generic "data_asset" asset type and then using the "mime_type" field of the "data_asset" attribute to declare that the specific type of your asset resource is a csv file. To do so, the primary metadata document for the csv file would have:
- a "metadata.asset_type" value of the generic type "data_asset"
- an "entity.data_asset.mime_type" value of the specific type "text/csv".
The fields "asset_type" and "mime_type" both describe the "type" of the asset resource. However:
- the type specified by the "metadata.asset_type" field (i.e., "data_asset") is generic
- the type specified by the "entity.data_asset.mime_type" field (i.e., "text/csv") is specific
It is the "mime_type" field of the data_asset type that allows you to declare a specific type for an asset without creating that specific type.
So, in its most basic use, the "data_asset" asset type is a very "lite" asset type. It's used to avoid creating many other "heavier" asset types. However, if you need to create more complex attributes with indexes for specific fields in your attribute then you will have to create your own asset type (see Create Asset Type: book for an example).
The other two fields of the type "data_asset" are "dataset" and "columns".
- a "dataset" value of false means that the "columns" field is absent in a "data_asset" attribute
- a "dataset" value of true means that the "columns" field is present in a "data_asset" attribute
The "columns" field of a "data_asset" attribute is optionally used to specify metadata for columns of assets that have columns, like csv files, spreadsheets, database tables, etc.
The full definition of the "data_asset" type is shown in Get Asset Type: data_asset - Response Body.
See Get Asset - CSV File - Response Body - Before Profiling and Get Asset - CSV File - Response Body - After Profiling for examples where a "data_asset" is used for a csv asset resource.
Using Secondary / Extended / Decorator Asset Types to Simulate Type Inheritance
Asset type inheritance/subclassing is not supported by the Asset Type API. Instead, asset types can be combined through the use of Secondary / Extended / Decorator Asset Types, simulating inheritance through composition of multiple types. A "decorator" can hold a common set of properties that, under a traditional inheritance model, would be part of the superclass, and "decorate" the types that would traditionally be the child classes. The asset type decorates field specifies the child asset types that are decorated. Note that type decoration is not enforced by the API, i.e., there is no checking that the contents of an asset have fields that match what is declared in any asset type. However, the IKC UI honors the type decorator setting by presenting the fields in decorator types as custom attributes in the "Details" section of the Asset Overview screen. For more details on using the asset type decorator feature, see Tailor your Data Catalog with custom asset attributes.
Get Asset Types
You can get a list of the asset types in a catalog using the following Asset Types API:
Get Asset Types - Request URL:
GET {service_URL}/v2/asset_types?catalog_id={catalog_id}
Get Asset Types - Response Body:
{
"resources": [
{
"description": "Data Asset Type",
"fields": [
{
"key": "dataset",
"type": "boolean",
"facet": true,
"is_array": false,
"is_searchable_across_types": false
},
{
"key": "mime_type",
"type": "string",
"facet": true,
"is_array": false,
"is_searchable_across_types": false
},
{
"key": "columns",
"type": "string",
"facet": true,
"is_array": true,
"search_path": "columns[].name",
"is_searchable_across_types": true
}
],
"global_search_searchable": [
"mime_type"
],
"external_asset_preview": {},
"relationships": [],
"name": "data_asset",
"version": 3
},
{
"description": "An asset type you can use to describe the columns of a data asset. Normally attached as a property to an existing data asset.",
"fields": [
{
"key": "column_info_term_display_name",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "*.column_terms[].term_display_name",
"is_searchable_across_types": true
},
{
"key": "column_info_term_id",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "*.column_terms[].term_id",
"is_searchable_across_types": false
},
{
"key": "column_info_tag",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "*.column_tags[]",
"is_searchable_across_types": true
},
{
"key": "column_info_description",
"type": "string",
"facet": false,
"is_array": false,
"search_path": "*.column_description",
"is_searchable_across_types": true
},
{
"key": "column_info_omrs_guid",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "*.omrs_guid",
"is_searchable_across_types": true
}
],
"external_asset_preview": {},
"relationships": [],
"name": "column_info",
"version": 4
},
{
"description": "An asset type that you can use to assign terms from a business glossary to any asset. Attach items of this type as attributes to other assets.",
"fields": [
{
"key": "asset_term_display_name",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "list[].term_display_name",
"is_searchable_across_types": true
},
{
"key": "asset_term_id",
"type": "string",
"facet": true,
"is_array": false,
"search_path": "list[].term_id",
"is_searchable_across_types": false
}
],
"external_asset_preview": {},
"relationships": [],
"name": "asset_terms",
"version": 1
},
...
]
}
See Asset Type Fields for descriptions of the fields in each of the above asset types.
In a scenario in which the user has not yet created any of their own asset types, the result will contain only the pre-existing, global asset types. For brevity, the sample result shown above includes only a subset of those asset types. Try the GET Asset Types API on your catalog to see the complete set of pre-existing, global asset types.
Get Asset Type: data_asset
You can get an individual asset type in a catalog using the following Asset Types API:
Get Asset Type: data_asset - Request URL:
GET {service_URL}/v2/asset_types/{type_name}?catalog_id={catalog_id}
Supplying "data_asset" as the value for the {type_name} parameter in the above url will produce a response like the following:
Get Asset Type: data_asset - Response Body:
{
"description": "Data Asset Type",
"fields": [
{
"key": "mime_type",
"type": "string",
"facet": true,
"is_array": false,
"is_searchable_across_types": false
},
{
"key": "dataset",
"type": "boolean",
"facet": true,
"is_array": false,
"is_searchable_across_types": false
},
{
"key": "columns",
"type": "string",
"facet": true,
"is_array": true,
"search_path": "columns[].name",
"is_searchable_across_types": true
}
],
"global_search_searchable": [
"mime_type"
],
"external_asset_preview": {},
"relationships": [],
"name": "data_asset",
"version": 3
}
See Asset Type Fields for descriptions of the fields in the above asset type definition.
Since an asset type called "data_asset" exists, you can create a primary metadata document (i.e., card) with a "metadata.asset_type" value of "data_asset". That card must then also have a primary attribute called "data_asset".
The most interesting item in the "fields" array in the above "data_asset" asset type definition is the item with "key" value "mime_type". That item means that a primary attribute named "data_asset" will have a field called "mime_type". The value of that "mime_type" attribute field will declare the specific type of the asset resource represented by the primary metadata document. For example, see the field "entity.data_asset.mime_type" in Get Asset - CSV File - Response Body - Before Profiling where the "mime_type" value is "text/csv".
Notice the "data_asset" attribute in Get Asset - CSV File - Response Body - Before Profiling only contains two fields: "mime_type" and "dataset". The "columns" field specified in the definition of the "data_asset" asset type is not present in the "data_asset" attribute.
Now compare all the items in the "fields" array in the above "data_asset" asset type definition with the "entity.data_asset" attribute fields as shown, for example, in Get Asset - CSV File - Response Body - After Profiling. Notice that now all the fields described in the "fields" array of the "data_asset" type are present as fields in the "entity.data_asset" attribute. In particular, profiling has added the "columns" field to the "data_asset" attribute.
The Before Profiling and After Profiling examples illustrate that not all the fields defined in an asset type need be present in a corresponding attribute.
Lastly, note that the asset type definition includes a global_search_searchable list of field keys, including the value mime_type. That indicates that the mime_type value of every instance of this asset type will be searchable via the Global Search microservice.
Create Asset Type: book
Say you have a book asset resource and you want to create a primary metadata document to describe that book. You will first need to create an asset type called "book" (as shown below) so you can then:
- use the name of that asset type as the value for the "metadata.asset_type" field in the primary metadata document
- create a primary attribute named "book" that will contain data about your book.
Say you want that primary attribute to look like the following:
"book": {
"author": {
"first_name": "Tracy",
"last_name": "Smith"
},
"price": 29.95
}
The above "book" attribute has:
- one complex field called "author" (complex fields are allowed in attributes)
- one simple field called "price".
For this example, assume you'll want to be able to search inside the "author.last_name" field of "book" attributes.
In addition, let's assume that you would like to use the value of the "author.last_name" field to search for "books" via the Global Search microservice.
To create an asset type named "book" that will allow you to do all of the above, use a request like the following:
Create Asset Type: book - Request URL:
POST {service_URL}/v2/asset_types?catalog_id={catalog_id}
Create Asset Type: book - Request Body:
{
"name": "book",
"description": "Book asset type",
"fields": [
{
"key": "author.last_name",
"type": "string",
"facet": false,
"is_array": false,
"search_path": "author.last_name",
"is_searchable_across_types": true
}
],
"global_search_searchable": [
"author.last_name"
],
"properties": {
"price" : {
"type": "number",
"description": "Suggested retail price"
}
}
}
The purpose of most of the fields used in the above request was described in the Asset Type Fields section. Here are some things to note specifically in the above request:
"name": uses only lowercase letters, i.e., "book"
"fields": even though our goal attribute has multiple fields in it, there is only one item in the asset type's "fields" array. That is because the "fields" array should only contain items for the fields of an attribute that we want the catalog to create an index for. In this case, we only want an index for the "author.last_name" field of "book" attributes.
- "key": the name of the attribute field that we want indexed, and the name for that index. In this case, "author.last_name".
- "type": the type of the "author.last_name" field is "string"
- "facet": an explanation of this field is beyond the scope of this document
- "is_array": false because "author.last_name" is not an array
- "search_path": this is the path inside the attribute to the value that we want indexed
- "is_searchable_across_types": an explanation of this field is beyond the scope of this document
"global_search_searchable": since we would like to be able to search for the author.last_name value using Global Search, we include the corresponding field.key value.
Create Asset Type: book - Response Body:
{
"description": "Book asset type",
"fields": [
{
"key": "author.last_name",
"type": "string",
"facet": false,
"is_array": false,
"search_path": "author.last_name",
"is_searchable_across_types": true
}
],
"global_search_searchable": [
"author.last_name"
],
"relationships": [],
"name": "book",
"version": 1
}
The response to the POST /v2/asset_types API echoes the input, with two additional fields:
- `relationships`: an explanation of the contents of this field is beyond the scope of this document
- `version`: the version of the newly created asset type
You now have an asset type called "book" that specifies one indexed, searchable field called "author.last_name". See Create Asset: book for an example of the ways in which that "book" asset type can be used when creating a primary metadata document.
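The request described in this section can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_asset_type_request and the placeholder catalog ID are hypothetical, the helper assumes every indexed field is a string, and the check that global search keys also appear in "fields" is a local assumption rather than documented server behavior. It only builds the URL and body and does not send anything.

```python
import json


def build_asset_type_request(service_url, catalog_id, name, indexed_paths,
                             global_search_keys=()):
    """Return (url, body) for a POST /v2/asset_types call; nothing is sent."""
    fields = [
        {
            "key": path,
            "type": "string",  # assumption: every indexed field is a string
            "facet": False,
            "is_array": False,
            "search_path": path,
            "is_searchable_across_types": True,
        }
        for path in indexed_paths
    ]
    known_keys = {field["key"] for field in fields}
    unknown = [key for key in global_search_keys if key not in known_keys]
    if unknown:
        # Local sanity check (an assumption, not documented server behavior).
        raise ValueError(f"not listed in fields: {unknown}")
    body = {
        "name": name,
        "description": f"{name.capitalize()} asset type",
        "fields": fields,
        "global_search_searchable": list(global_search_keys),
    }
    return f"{service_url}/v2/asset_types?catalog_id={catalog_id}", body


url, body = build_asset_type_request(
    "https://api.dataplatform.cloud.ibm.com",  # endpoint from this document
    "my-catalog-id",                           # placeholder catalog ID
    "book",
    indexed_paths=["author.last_name"],
    global_search_keys=["author.last_name"],
)
print(url)
print(json.dumps(body, indent=2))
```

The resulting body can be sent with any HTTP client, together with an IAM bearer token in the Authorization header.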
Search Asset Type: attribute - book
The Search Asset Type API can be used to search inside a catalog for all the primary metadata documents that satisfy both of the following conditions:
- have a "metadata.asset_type" value that matches the asset type name specified in the {type_name} URL parameter
- have an attribute whose fields' values match those specified in the request body.
Recall that one of the primary reasons for creating an asset type is to specify fields in attributes (named after that asset type) that will be indexed for searching. The Create Asset Type: book section showed how to create an asset type named "book". The Create Asset: book section showed how to create a primary metadata document whose "metadata.asset_type" value and primary attribute name are both "book". So, if you use the value "book" for the {type_name} parameter in the URL below, and if you supply the following request body, then you'll get back matching metadata for books.
Search Asset Type: attribute - book - Request URL
POST {service_URL}/v2/asset_types/{type_name}/search?catalog_id={catalog_id}
Search Asset Type: attribute - book - Request Body:
{
"query":"book.author.last_name:Smith"
}
Notice how the query specifies both the attribute (book) to be searched and the search path (author.last_name) within that attribute. The value to match is specified after the colon (:). In this case, the value is Smith.
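As a rough illustration of the query format just described, the Python sketch below builds the search URL and body from their parts. The helper name build_attribute_search and the placeholder catalog ID are hypothetical, and the sketch assumes the attribute being searched is the primary attribute (so its name equals the asset type name). It does no escaping of special characters in the value.

```python
def build_attribute_search(service_url, catalog_id, type_name, search_path, value):
    """Return (url, body) for an attribute search; nothing is sent."""
    url = f"{service_url}/v2/asset_types/{type_name}/search?catalog_id={catalog_id}"
    # Query format: <attribute>.<search path>:<value to match>
    body = {"query": f"{type_name}.{search_path}:{value}"}
    return url, body


search_url, search_body = build_attribute_search(
    "https://api.dataplatform.cloud.ibm.com",  # endpoint from this document
    "my-catalog-id",                           # placeholder catalog ID
    "book",
    "author.last_name",
    "Smith",
)
print(search_url)
print(search_body)
```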
The following is the result of the above search:
Search Asset Type: attribute - book - Response Body:
{
"total_rows": 1,
"results": [
{
"metadata": {
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2019-05-01T18:58:51Z",
"last_updater_id": "IBMid-___",
"last_update_time": 1556737131140,
"last_accessed_at": "2019-05-01T18:58:51Z",
"last_access_time": 1556737131140,
"last_accessor_id": "IBMid-___",
"access_count": 0
},
"name": "Getting Started with Assets",
"description": "Describes how to create and use metadata for assets",
"tags": [
"getting",
"started",
"documentation"
],
"asset_type": "book",
"origin_country": "us",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-___",
"created": 1556635077746,
"created_at": "2019-04-30T14:37:57Z",
"owner_id": "IBMid-___",
"size": 0,
"version": 0,
"asset_state": "available",
"asset_attributes": [
"book"
],
"asset_id": "3da5389d-d4a4-43da-be1f-___",
"asset_category": "USER"
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/3da5389d-d4a4-43da-be1f-___?catalog_id=c6f3cbd8-___"
}
]
}
In this case, there is only one primary metadata document returned in the "results" array (namely, the primary metadata document that was created in the Create Asset: book section). In general, there can be many matching documents in the "results" array.
Notice that the results of an Asset Type Search, as shown above, only contain the "metadata" section of a primary metadata document. In particular, the "entity" section that contains the attributes is not returned. That is done to reduce the size of the response because, in general, the "entity" section of a primary metadata document can be much larger than the "metadata" section. Use the value of the "metadata.asset_id" in one of the items in "results" to retrieve either:
- the entire primary metadata document (using the GET Asset API), or
- just the attributes of the primary metadata document (using the GET Attributes API).
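To follow up on a search, the asset IDs first have to be pulled out of the "results" array. The Python sketch below shows one way to do that on a trimmed copy of the response shown above; the helper name matching_assets and the shortened sample values are illustrative only.

```python
def matching_assets(search_response):
    """Return (asset_id, name, href) for each item in a search response."""
    return [
        (item["metadata"]["asset_id"], item["metadata"]["name"], item.get("href"))
        for item in search_response.get("results", [])
    ]


# Trimmed sample with the same shape as the response body shown above.
sample_response = {
    "total_rows": 1,
    "results": [
        {
            "metadata": {
                "name": "Getting Started with Assets",
                "asset_type": "book",
                "asset_id": "3da5389d-example",  # placeholder ID
            },
            "href": "https://api.dataplatform.cloud.ibm.com/v2/assets/3da5389d-example?catalog_id=c6f3cbd8-example",
        }
    ],
}

for asset_id, name, href in matching_assets(sample_response):
    print(asset_id, name, href)
```

Each returned asset_id can then be used as the {asset_id} value in a follow-up request for the full document or its attributes.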
Notes:
- searching is not limited to just primary attributes (like book above). Searches may also be performed on:
  - secondary, or extended, attributes
  - the "metadata" field of a primary metadata document, as shown in the next section.
- other parameters available for searches are:
  - limit (number): limit number of search results
  - sort (string): sort columns for search results
  - counts: beyond the scope of this document
  - drilldown: beyond the scope of this document
Search Asset Type: metadata - name
You're not limited to searching within attributes (like the attribute search shown in the previous section). You can also search within the "metadata" section of a primary metadata document.
Search Asset Type: metadata - name - Request URL:
POST {service_URL}/v2/asset_types/{type_name}/search?catalog_id={catalog_id}
Search Asset Type: metadata - name - Request Body:
{
"query":"asset.name:Getting Started with Assets"
}
Notice that the query signifies that the search should take place in the "metadata" section of the primary metadata document by using the term asset at the beginning of the search path. Then the field to be searched within "metadata" is specified (name in the example above). The value to match is specified after the colon (:); in this case the value is Getting Started with Assets.
The following is the result of the above search:
Search Asset Type: metadata - name - Response Body:
{
"total_rows": 1,
"results": [
{
"metadata": {
"rov": {
"mode": 0,
"collaborator_ids": {}
},
"usage": {
"last_updated_at": "2019-04-30T17:27:56Z",
"last_updater_id": "IBMid___",
"last_update_time": 1556645276827,
"last_accessed_at": "2019-04-30T17:27:56Z",
"last_access_time": 1556645276827,
"last_accessor_id": "IBMid___",
"access_count": 0
},
"name": "Getting Started with Assets",
"description": "Describes how to create and use metadata for assets",
"tags": [
"getting",
"started",
"documentation"
],
"asset_type": "book",
"origin_country": "us",
"rating": 0,
"total_ratings": 0,
"catalog_id": "c6f3cbd8-___",
"created": 1556635077746,
"created_at": "2019-04-30T14:37:57Z",
"owner_id": "IBMid-___",
"size": 0,
"version": 0,
"asset_state": "available",
"asset_attributes": [
"book"
],
"asset_id": "3da5389d-d4a4-43da-be1f-___",
"asset_category": "USER"
},
"href": "https://api.dataplatform.cloud.ibm.com/v2/assets/3da5389d-d4a4-43da-be1f-___?catalog_id=c6f3cbd8-___"
}
]
}
In this case, the result is the same as was described in Search Asset Type: attribute - book - Response Body. See that section for more details.
Data Assets and Columns
Data assets represent tables or files from a connection to a data source, or files uploaded into the cloud object storage associated with a project or catalog. Data assets extend the generic asset service and attachments functionality. This means that, like generic assets, data assets also have a name, description, visibility, members, tags, classifications and asset_type. In addition, data assets have a mime type, column-related attributes and properties. The asset type for a data asset is data_asset, which is one of the asset types that is already available.
As with other assets, attributes can be added to or updated on data assets. In addition, data assets can have attributes that are associated with individual columns. Given below is a simplified format of a data asset that has attributes associated with columns.
{
"metadata": {
...
...
"name": "Name of the data asset",
"asset_type": "data_asset",
"asset_attributes": [
"data_asset",
"column_info",
"column_level_custom_attribute",
... other attributes
],
"asset_id": "id of the data asset"
},
"entity": {
"data_asset": {
"mime_type": "text/csv",
"dataset": false,
"columns": [
{
"name": "NameOfColumn", // Name of the column
"type": {
"type": "varchar",
"length": 1024,
"scale": 0,
"nullable": true,
"signed": false
}
}
]
},
"column_info": {
"NameOfColumn": { // we are listing the column_info attributes associated with each column
"column_description": "description of the column",
"column_terms": [
{
"term_display_name": "test_term",
"term_id": "d3d667e1-____"
}
]
}
},
"column_level_custom_attribute": { // Column level custom attribute
"columns": [
{
"name": "NameOfColumn",
"OtherColumnProperty": "dummy value"
}
]
},
... other attributes
},
"attachments": [
...
...
]
}
All asset APIs listed in Assets API Overview apply to data assets as well. In addition, data assets have special APIs.
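Because column information is spread across several attributes of the document ("data_asset" holds the column types, "column_info" holds descriptions and terms keyed by column name), it can help to join them per column. The Python sketch below does this for a trimmed copy of the structure shown above; the helper name columns_overview is hypothetical and the sample omits the elided parts of the document.

```python
def columns_overview(asset):
    """Combine per-column data from the data_asset and column_info attributes."""
    entity = asset.get("entity", {})
    column_info = entity.get("column_info", {})
    overview = {}
    for column in entity.get("data_asset", {}).get("columns", []):
        name = column["name"]
        info = column_info.get(name, {})  # a column may have no column_info entry
        overview[name] = {
            "type": column.get("type", {}).get("type"),
            "description": info.get("column_description"),
            "terms": [t["term_display_name"] for t in info.get("column_terms", [])],
        }
    return overview


# Trimmed sample following the simplified data asset format shown above.
sample_asset = {
    "metadata": {"name": "Name of the data asset", "asset_type": "data_asset"},
    "entity": {
        "data_asset": {
            "mime_type": "text/csv",
            "columns": [
                {"name": "NameOfColumn",
                 "type": {"type": "varchar", "length": 1024, "nullable": True}}
            ],
        },
        "column_info": {
            "NameOfColumn": {
                "column_description": "description of the column",
                "column_terms": [
                    {"term_display_name": "test_term", "term_id": "d3d667e1-example"}
                ],
            }
        },
    },
}

overview = columns_overview(sample_asset)
print(overview)
```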
Creating a Data Asset
Creating a data asset is similar to creating any other asset. A data asset has:
- a "metadata.asset_type" field that holds the value "data_asset"
- a primary attribute called "data_asset".
In the first example, we will create a connected data asset. It represents a table or a file from an existing connection. The sample request body and the corresponding response are given below.
Create Data Asset: Request URL:
POST {service_URL}/v2/data_assets
Create Data Asset: -Sample Request Body:
{
"metadata": {
"name": "CUSTOMER", // Name of table
"description": "customer table", // description
"tags": [
"public"
],
"asset_type": "data_asset",
"origin_country": "us",
"rov": {
"mode": 0
}
},
"entity": {
"data_asset": {
"mime_type": "application/x-ibm-rel-table",
"dataset": true,
"properties": [
{
"name": "schema_name",
"value": "DB2INST1"
},
{
"name": "table_name",
"value": "CUSTOMER"
}
],
"columns": [ // list of columns in the data asset
{
"name": "CID",
"type": {
"type": "bigint",
"length": 19,
"scale": 0,
"nullable": false,
"signed": true,
"native_type": "BIGINT"
}
},
{
"name": "INFO",
"type": {
"type": "sqlxml",
"length": 0,
"scale": 0,
"nullable": true,
"signed": false,
"native_type": "XML"
}
},
{
"name": "HISTORY",
"type": {
"type": "sqlxml",
"length": 0,
"scale": 0,
"nullable": true,
"signed": false,
"native_type": "XML"
}
}
]
}
},
"attachments": [ // details of the connection
{
"asset_type": "data_asset",
"name": "CUSTOMER",
"description": "customer table",
"mime": "application/x-ibm-rel-table",
"connection_id": "3a______",
"connection_path": "/DB2_path_/CUSTOMER_table_path",
"is_partitioned": false
}
]
}
Create Data Asset: - Response Body:
{
"metadata": {
"usage": {
"last_updated_at": "2023-04-20T20:13:39Z",
"last_updater_id": "-- creator id--",
"last_update_time": 1682021619570,
"last_accessed_at": "2023-04-20T20:13:39Z",
"last_access_time": 1682021619570,
"last_accessor_id": "-- creator id--",
"access_count": 0
},
"rov": {
"mode": 0,
"collaborator_ids": {},
"member_roles": {
"-- creator id--": {
"user_iam_id": "-- creator id--",
"roles": [
"OWNER"
]
}
}
},
"name": "CUSTOMER",
"description": "customer table",
"tags": [
"public"
],
"asset_type": "data_asset",
"origin_country": "us",
"resource_key": "0000:0000:0000:0000:0000:FFFF:0914:9C3E|50000|sample:/DB2INST1/CUSTOMER",
"rating": 0,
"total_ratings": 0,
"catalog_id": "e1____",
"created": 1682021619570,
"created_at": "2023-04-20T20:13:39Z",
"owner_id": "-- creator id--",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"data_asset"
],
"asset_id": "12____",
"asset_category": "USER",
"creator_id": "-- creator id--"
},
"entity": {
"data_asset": {
"mime_type": "application/x-ibm-rel-table",
"dataset": true,
"properties": [
{
"name": "schema_name",
"value": "DB2INST1"
},
{
"name": "table_name",
"value": "CUSTOMER"
}
],
"columns": [
{
"name": "CID",
"type": {
"type": "bigint",
"length": 19,
"scale": 0,
"nullable": false,
"signed": true,
"native_type": "BIGINT"
}
},
{
"name": "INFO",
"type": {
"type": "sqlxml",
"length": 0,
"scale": 0,
"nullable": true,
"signed": false,
"native_type": "XML"
}
},
{
"name": "HISTORY",
"type": {
"type": "sqlxml",
"length": 0,
"scale": 0,
"nullable": true,
"signed": false,
"native_type": "XML"
}
}
]
}
},
"href": "/v2/assets/12____?catalog_id=e1____",
"asset_id": "12____",
"attachments": [
{
"attachment_id": "2a____",
"asset_type": "data_asset",
"attachment_type": "remote",
"datasource_type": "8c____",
"is_partitioned": false,
"name": "CUSTOMER",
"description": "customer table",
"mime": "application/x-ibm-rel-table",
"connection_id": "3a____",
"connection_path": "/DB2INST1/CUSTOMER",
"user_data": {},
"href": "/v2/assets/12____/attachments/2a____?catalog_id=e1____",
"asset_category": "USER"
}
]
}
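The request body for a connected table is repetitive, so it can be generated from a short column list. The following Python sketch mirrors the sample request above; the helper names column and connected_table_asset are hypothetical, the connection ID and path are placeholders, and fixed values such as "origin_country": "us" are simply copied from the sample rather than being required values.

```python
import json


def column(name, type_name, length, nullable=True, signed=False, scale=0,
           native_type=None):
    """Build one entry of the data_asset "columns" list."""
    column_type = {"type": type_name, "length": length, "scale": scale,
                   "nullable": nullable, "signed": signed}
    if native_type is not None:
        column_type["native_type"] = native_type
    return {"name": name, "type": column_type}


def connected_table_asset(name, description, schema, connection_id,
                          connection_path, columns):
    """Build a request body shaped like the connected data asset sample."""
    mime = "application/x-ibm-rel-table"
    return {
        "metadata": {
            "name": name,
            "description": description,
            "asset_type": "data_asset",
            "origin_country": "us",  # copied from the sample request
            "rov": {"mode": 0},
        },
        "entity": {
            "data_asset": {
                "mime_type": mime,
                "dataset": True,
                "properties": [
                    {"name": "schema_name", "value": schema},
                    {"name": "table_name", "value": name},
                ],
                "columns": columns,
            }
        },
        "attachments": [
            {
                "asset_type": "data_asset",
                "name": name,
                "description": description,
                "mime": mime,
                "connection_id": connection_id,
                "connection_path": connection_path,
                "is_partitioned": False,
            }
        ],
    }


asset_body = connected_table_asset(
    "CUSTOMER", "customer table", "DB2INST1",
    connection_id="my-connection-id",      # placeholder
    connection_path="/DB2INST1/CUSTOMER",
    columns=[
        column("CID", "bigint", 19, nullable=False, signed=True, native_type="BIGINT"),
        column("INFO", "sqlxml", 0, native_type="XML"),
    ],
)
print(json.dumps(asset_body, indent=2))
```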
In the next example, we will create a data asset that represents a local file uploaded to a catalog. The URL for the API remains the same. The sample request body and the corresponding response are given below.
Create Data Asset for Local File: -Sample Request Body:
{
"metadata": {
"name": "New data Asset",
"description": "data table",
"tags": [
"public"
],
"asset_type": "data_asset",
"origin_country": "us",
"rov": {
"mode": 0
}
},
"entity": {
"data_asset": {
"mime_type": "application/octet-stream",
"dataset": true,
"columns": [
{
"name": "COLUMN1",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN2",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN3",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN4",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN5",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN6",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN7",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN8",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
}
],
"properties": [
{
"name": "bucket",
"value": "--bucket name--"
},
{
"name": "file_name",
"value": "file_path/file_name-csv"
},
{
"name": "first_line_header",
"value": "false"
},
{
"name": "encoding",
"value": "UTF-8"
},
{
"name": "invalid_data_handling",
"value": "fail"
},
{
"name": "file_format",
"value": "csv"
}
]
}
},
"attachments": [
{
"asset_type": "data_asset",
"name": "remote",
"description": "remote",
"mime": "text/csv",
"connection_id": "26____",
"connection_path": "Cloud_object_storage_path/file_name-csv.csv",
"is_partitioned": false
}
]
}
Create Data Asset For Local File: - Response Body:
For the above request, the expected response is given below.
{
"metadata": {
"usage": {
"last_updated_at": "2023-04-20T20:23:21Z",
"last_updater_id": "-- creator id--",
"last_update_time": 1682022201699,
"last_accessed_at": "2023-04-20T20:23:21Z",
"last_access_time": 1682022201699,
"last_accessor_id": "-- creator id--",
"access_count": 0
},
"rov": {
"mode": 0,
"collaborator_ids": {},
"member_roles": {
"-- creator id--": {
"user_iam_id": "-- creator id--",
"roles": [
"OWNER"
]
}
}
},
"name": "New data Asset",
"description": "data table",
"tags": [
"public"
],
"asset_type": "data_asset",
"origin_country": "us",
"resource_key": "New data Asset",
"rating": 0,
"total_ratings": 0,
"catalog_id": "e1______",
"created": 1682022201699,
"created_at": "2023-04-20T20:23:21Z",
"owner_id": "-- creator id--",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"data_asset"
],
"asset_id": "80___",
"asset_category": "USER",
"creator_id": "-- creator id--"
},
"entity": {
"data_asset": {
"mime_type": "application/octet-stream",
"dataset": true,
"columns": [
{
"name": "COLUMN1",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN2",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN3",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN4",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN5",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN6",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN7",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
},
{
"name": "COLUMN8",
"type": {
"length": 1024,
"nullable": true,
"type": "varchar",
"scale": 0,
"signed": false
}
}
],
"properties": [
{
"name": "bucket",
"value": "--bucket name--"
},
{
"name": "file_name",
"value": "file_path/file_name-csv"
},
{
"name": "first_line_header",
"value": "false"
},
{
"name": "encoding",
"value": "UTF-8"
},
{
"name": "invalid_data_handling",
"value": "fail"
},
{
"name": "file_format",
"value": "csv"
}
]
}
},
"href": "/v2/assets/80__?catalog_id=e1____",
"asset_id": "805e24e6-2473-41d4-9e2a-679df9a40e7b",
"attachments": [
{
"attachment_id": "2b____",
"asset_type": "data_asset",
"attachment_type": "remote",
"is_partitioned": false,
"name": "remote",
"description": "remote",
"mime": "text/csv",
"connection_id": "26____",
"connection_path": "Cloud_object_storage_path/file_name-csv.csv",
"user_data": {},
"href": "/v2/assets/80____/attachments/2b____?catalog_id=e1____",
"asset_category": "USER"
}
]
}
Retrieving a Data Asset
A data asset can be retrieved by calling the same APIs that retrieve any other asset. Because data assets can also be created from locally uploaded files, assets such as csv files can be retrieved in the same way. However, there are also special APIs designed specifically to retrieve data assets.
If we retrieve the data asset created in the previous section, the request and response look like the following:
Retrieve a Data Asset: Request URL:
GET {service_URL}/v2/data_assets/{asset_id}?catalog_id={catalog_id}
Retrieve Data Asset: - Response Body:
{
"metadata": {
"usage": {
"last_updated_at": "2023-04-17T20:20:56Z",
"last_updater_id": "--id--",
"last_update_time": 1681762856921,
"last_accessed_at": "2023-04-17T20:20:56Z",
"last_access_time": 1681762856921,
"last_accessor_id": "--id--",
"access_count": 0
},
"rov": {
"mode": 0,
"collaborator_ids": {},
"member_roles": {
"--id--": {
"user_iam_id": "--id--",
"roles": [
"OWNER"
]
}
}
},
"name": "CUSTOMER",
"description": "customer table",
"tags": [
"public"
],
"asset_type": "data_asset",
"origin_country": "us",
"resource_key": "string",
"rating": 0,
"total_ratings": 0,
"catalog_id": "e12____",
"created": 1681762856921,
"created_at": "2023-04-17T20:20:56Z",
"owner_id": "--id--",
"size": 0,
"version": 2,
"asset_state": "available",
"asset_attributes": [
"data_asset"
],
"asset_id": "75____",
"asset_category": "USER",
"creator_id": "--id--"
},
"entity": {
"data_asset": {
"mime_type": "application/x-ibm-rel-table",
"dataset": true,
"columns": [
{
"name": "CID",
"type": {
"length": 19,
"nullable": false,
"type": "bigint",
"scale": 0,
"signed": true
}
},
{
"name": "INFO",
"type": {
"length": 0,
"nullable": true,
"type": "sqlxml",
"scale": 0,
"signed": false
}
},
{
"name": "HISTORY",
"type": {
"length": 0,
"nullable": true,
"type": "sqlxml",
"scale": 0,
"signed": false
}
}
],
"properties": [
{
"name": "schema_name",
"value": "DB2INST1"
},
{
"name": "table_name",
"value": "CUSTOMER"
}
]
}
},
"attachments": [
{
"id": "51d____",
"version": 2,
"asset_type": "data_asset",
"name": "CUSTOMER",
"description": "customer table",
"mime": "application/x-ibm-rel-table",
"connection_id": "3a____",
"connection_path": "/DB2_path_/CUSTOMER_table_path",
"datasource_type": "8c____",
"creator_id": "--id--",
"create_time": 1681762856936,
"size": 0,
"is_remote": true,
"is_managed": false,
"is_referenced": false,
"is_object_key_read_only": false,
"is_user_provided_path_key": true,
"transfer_complete": true,
"is_partitioned": false,
"complete_time_ticks": 1681762856936,
"user_data": {},
"test_doc": 0,
"usage": {
"access_count": 0,
"last_accessor_id": "--id--",
"last_access_time": 1681762856936
}
}
],
"href": "/v2/data_assets/75__?catalog_id=e1____"
}
Retrieving Columns from a Data Asset
Data assets often contain information regarding the columns in the table or file. Use the GET columns API to retrieve column-related information, such as the terms, the column description, or any other custom attribute assigned to a column.
Retrieve Columns from a Data Asset: Request URL:
GET {service_URL}/v2/data_assets/{asset_id}/columns?catalog_id={catalog_id}
Retrieve Columns from a Data Asset: - Response Body:
{
"total_rows": 1, // Total number of columns
"resources": [
{
"data_asset": {
"columns": [
{
"name": "NameOfColumn",
"type": {
"type": "varchar",
"length": 1024,
"scale": 0,
"nullable": true,
"signed": false
},
"position": 1
}
]
},
"column_info": {
"NameOfColumn": {
"column_description": "column description",
"column_terms": [
{
"term_display_name": "test_term",
"term_id": "d3d______"
}
],
"name": "NameOfColumn"
}
},
"column_level_custom_attribute": {
"columns": [
{
"name": "NameOfColumn",
"OtherColumnProperty": "dummy value"
}
]
}
}
]
}
Data assets can hold information about the columns of the data. It is important to note that they do not hold the actual columns or the column values. The information they can store includes the terms and data class assigned after profiling or manually, the data quality score of the column, the type of the column, tags associated with the column, and the description of the column.
We can also create custom attributes and assign them to columns. The Create Asset Type: book section shows how to create an asset type named book.
Among the already available asset types, the asset types "data_asset" and "column_info" contain attributes for columns.
Column Related Information Present in data_asset Asset Type
The basic structure of the data_asset asset type is described in the section data_asset Type. Here we will elaborate on the "columns" field of the data_asset.
The "columns" field consists of a list of elements. Each element represents a column. It is populated after profiling. An example of a single column element is given below:
{
"name": "MARITAL_STATUS", // Name of the column
"type": {
"type": "varchar",
"length": 1024,
"scale": 0,
"nullable": true,
"signed": false
}
}
Column Related Information Present in column_info Asset Type
The column_info attribute in a data asset represents information about each column. The information is added manually, or after running profiling or metadata enrichment. The "column_info" field consists of a map. Each key-value pair represents an individual column. Each key is the name of the column. The value of the key represents the attributes assigned to the column.
An example of the column_info for a single column is given below:
"MARITAL_STATUS": { // name of the column
"data_class": {
"selected_data_class": {
"id": "7a8e___",
"name": "status",
"setByUser": false
}
},
"inferred_type": {
"length": 10,
"precision": 0,
"scale": 0,
"type": "STRING",
"display_name": "varchar(10)"
},
"quality": {
"score": 98
},
"column_terms": [
{
"term_id": "7a8e____",
"term_display_name": "TestTerm",
"confidence": 0.984,
"specification": "Data class based assignment"
}
],
"suggested_terms": [],
"rejected_terms": [],
"name": "MARITAL_STATUS"
}
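A column_info entry like the one above can be reduced to the few values that are usually of interest. The Python sketch below is only illustrative: the helper name summarize_column and the 0.9 confidence threshold are hypothetical, and it assumes that a term without a "confidence" value was assigned manually and should therefore always be kept.

```python
def summarize_column(entry, min_confidence=0.9):
    """Reduce one column_info entry to data class, quality score, and terms."""
    data_class = entry.get("data_class", {}).get("selected_data_class", {})
    terms = [
        term["term_display_name"]
        for term in entry.get("column_terms", [])
        # Assumption: a term with no confidence value was assigned manually.
        if term.get("confidence", 1.0) >= min_confidence
    ]
    return {
        "column": entry.get("name"),
        "data_class": data_class.get("name"),
        "quality_score": entry.get("quality", {}).get("score"),
        "terms": terms,
    }


# Trimmed copy of the MARITAL_STATUS entry shown above.
marital_status = {
    "data_class": {
        "selected_data_class": {"id": "7a8e-example", "name": "status", "setByUser": False}
    },
    "quality": {"score": 98},
    "column_terms": [
        {"term_id": "7a8e-example", "term_display_name": "TestTerm", "confidence": 0.984}
    ],
    "name": "MARITAL_STATUS",
}

summary = summarize_column(marital_status)
print(summary)
```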
Assigning Terms and Data Class to a Column
Terms and a data class can also be assigned to a column manually. The terms associated with individual columns reside as an array inside the column_info attribute.
In the first scenario, we will add a single term and a data class to a file that does not have the "column_info" attribute. This can happen if a csv file has been uploaded locally. In this case, we need to use the create attribute API shown below to create and populate an instance of the "column_info" attribute inside the asset's primary metadata document.
POST {service_URL}/v2/assets/{asset_id}/attributes?catalog_id={catalog_id}
Body:
{
"name": "column_info",
"entity": {
"BUSINESS_NAME": {
"column_terms": [
{
"term_id": "--your term id--",
"term_display_name": "term_name"
}
],
"data_class": {
"selected_data_class": {
"id": "--your data class id--",
"name": "data_class_name"
}
}
}
}
}
The response should look like:
{
"asset_id": "--your asset id--",
"column_info": {
"BUSINESS_NAME": { // Name of your column
"column_terms": [
{
"term_id": "--your term id--",
"term_display_name": "term_name"
}
],
"data_class": {
"selected_data_class": {
"id": "--your data class id--",
"name": "data_class_name"
}
}
}
},
"href": "/v2/assets/b8____/attributes/column_info?catalog_id=e1____"
}
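The request body for creating the "column_info" attribute can be built with a small helper. This Python sketch reproduces the body shown above; the helper name column_info_body is hypothetical and the IDs are the same placeholders used in the sample.

```python
import json


def column_info_body(column_name, term_id, term_name,
                     data_class_id=None, data_class_name=None):
    """Build the body for POST /v2/assets/{asset_id}/attributes (column_info)."""
    entry = {
        "column_terms": [{"term_id": term_id, "term_display_name": term_name}]
    }
    if data_class_id is not None:
        entry["data_class"] = {
            "selected_data_class": {"id": data_class_id, "name": data_class_name}
        }
    return {"name": "column_info", "entity": {column_name: entry}}


attribute_body = column_info_body(
    "BUSINESS_NAME",
    term_id="--your term id--",
    term_name="term_name",
    data_class_id="--your data class id--",
    data_class_name="data_class_name",
)
print(json.dumps(attribute_body, indent=2))
```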
To associate additional terms or a data class with a column, to add terms or a data class to a column that has none, or to remove existing terms or a data class from a column, use the Update Asset Attribute API. In the following example, we replace the terms assigned to the column BUSINESS_NAME and add terms to a column that does not have any terms yet.
PATCH {service_URL}/v2/assets/{asset_id}/attributes/column_info?catalog_id={catalog_id}
Body:
[
{
"op": "replace",
"path": "/BUSINESS_NAME/column_terms", // column that already has terms and a data class
"value": [
{
"term_id": "--your term id--",
"term_display_name": "new_term_name"
}
]
},
{
"op": "add",
"path": "/ADDRESS", // column to which no terms or data classes are assigned
"value": {
"column_terms": [
{
"term_id": "--your term id--",
"term_display_name": "term_name_2"
}
]
}
}
]
The response should look like:
{
"BUSINESS_NAME": {
"column_terms": [
{
"term_id": "--your term id--",
"term_display_name": "new_term_name"
}
],
"data_class": {
"selected_data_class": {
"id": "--your data class id--",
"name": "data_class_name"
}
}
},
"ADDRESS": {
"column_terms": [
{
"term_id": "--your term id--",
"term_display_name": "term_name_2"
}
]
}
}
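The PATCH body above has the shape of a JSON Patch document (a list of "op", "path", "value" operations). To preview the effect of such a patch before sending it, the operations can be applied to a local copy of the attribute. The Python sketch below is a deliberately small, local approximation: the helper name apply_patch is hypothetical, it supports only "add" and "replace" on object keys, and it does not handle arrays or escaped characters in paths.

```python
import copy


def apply_patch(document, operations):
    """Apply simple 'add'/'replace' operations to a copy of a nested dict."""
    result = copy.deepcopy(document)
    for operation in operations:
        if operation["op"] not in ("add", "replace"):
            raise ValueError(f"unsupported op: {operation['op']}")
        if not operation["path"].startswith("/"):
            raise ValueError(f"path must start with '/': {operation['path']}")
        keys = operation["path"][1:].split("/")
        target = result
        for key in keys[:-1]:          # walk down to the parent object
            target = target[key]
        if operation["op"] == "replace" and keys[-1] not in target:
            raise KeyError(f"cannot replace missing key: {operation['path']}")
        target[keys[-1]] = copy.deepcopy(operation["value"])
    return result


current = {
    "BUSINESS_NAME": {
        "column_terms": [{"term_id": "t1", "term_display_name": "term_name"}],
        "data_class": {"selected_data_class": {"id": "dc1", "name": "data_class_name"}},
    }
}
patch = [
    {"op": "replace", "path": "/BUSINESS_NAME/column_terms",
     "value": [{"term_id": "t2", "term_display_name": "new_term_name"}]},
    {"op": "add", "path": "/ADDRESS",
     "value": {"column_terms": [{"term_id": "t3", "term_display_name": "term_name_2"}]}},
]
patched = apply_patch(current, patch)
print(patched)
```

In this preview the data class of BUSINESS_NAME is left untouched because only the column_terms key is replaced, which matches the sample response above.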
Custom Attributes for Columns
Custom attributes can be added to individual columns.
Creating an Asset Type for Column Level
The first step is to create an asset type that can be assigned to a column. Users are not allowed to create assets of an asset type that has properties associated with columns; such an asset type can only be used to create attributes on a data_asset or a cobol_copybook. All properties are at the column level. To search based on column-level custom attributes, use global search.
- Ensure is_column_custom_attribute=true. This field can be changed from false to true, but it cannot be changed from true to false.
- Ensure allow_decorators=false.
- Ensure all searchable fields are listed in the fields list. All the elements of the fields list must have a valid key and a valid search_path.
- Populate global_search_searchable with the keys mentioned in the elements of the fields list.
An example is given below:
POST {service_URL}/v2/asset_types
{
"description": "Asset type for custom column attribute",
"name": "column_attribute",
"is_column_custom_attribute" : true, // is_column_custom_attribute should be true
"allow_decorators": false, // allow_decorators should be false
"attribute_only":true,
"decorates": [
{
"asset_type_name": "data_asset" // only two values allowed; data_asset and/or cobol_copybook
}
],
"fields": [
{
"key": "name",
"type": "string",
"facet": false,
"search_path": "columns.*.name"
},
{
"key": "field1",
"type": "string",
"facet": false,
"search_path": "columns.*.field1"
},
{
"key": "field2",
"type": "string",
"facet": false,
"search_path": "columns.*.field2"
}
],
"global_search_searchable": ["name", "field1", "field2"],
"properties" : {
"name" : {
"type": "string"
},
"host" : {
"type": "string"
},
"new_prop" : {
"type": "string"
}
}
}
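The rules listed in this section can be checked locally before the request is sent. The Python sketch below is a client-side convenience only: the helper name check_column_asset_type and its messages are hypothetical, and the service may apply additional validation that is not described here.

```python
def check_column_asset_type(asset_type):
    """Return a list of problems found in a column-level asset type definition."""
    problems = []
    if asset_type.get("is_column_custom_attribute") is not True:
        problems.append("is_column_custom_attribute must be true")
    if asset_type.get("allow_decorators") is not False:
        problems.append("allow_decorators must be false")
    fields = asset_type.get("fields", [])
    for field in fields:
        if not field.get("key") or not field.get("search_path"):
            problems.append(f"field needs a key and a search_path: {field}")
    field_keys = {field.get("key") for field in fields}
    for key in asset_type.get("global_search_searchable", []):
        if key not in field_keys:
            problems.append(f"global_search_searchable key not in fields: {key}")
    for decorated in asset_type.get("decorates", []):
        if decorated.get("asset_type_name") not in ("data_asset", "cobol_copybook"):
            problems.append(f"cannot decorate: {decorated.get('asset_type_name')}")
    return problems


# Trimmed copy of the column_attribute definition shown above.
column_attribute = {
    "name": "column_attribute",
    "is_column_custom_attribute": True,
    "allow_decorators": False,
    "decorates": [{"asset_type_name": "data_asset"}],
    "fields": [
        {"key": "name", "type": "string", "facet": False, "search_path": "columns.*.name"},
        {"key": "field1", "type": "string", "facet": False, "search_path": "columns.*.field1"},
    ],
    "global_search_searchable": ["name", "field1"],
}

print(check_column_asset_type(column_attribute))
```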
Adding the Column Level Attribute to a Column
Column-level attributes are added as a map. The top-level field name is always columns, and its value is a map keyed by column name.
An example of an attribute listed as a map is:
"columns": {
"city": {
"host": "host1",
"field1": "column1field1",
"field2": "column1field2"
}
}
The details are listed in section Adding the Attribute as a Map.
Adding the Attribute as a Map
When a column-level attribute is added to an asset, the attribute in the asset's entity has the following structure. Note the top-level columns field.
"column_attribute": {
"columns": {
"city": { // column name
"host": "host1",
"field1": "column1field1",
"field2": "column1field2"
}
}
}
In the above example, for the column named "city", the associated property values are host=host1, field1=column1field1, and field2=column1field2.
To add this attribute to an asset, use the POST /v2/assets/{asset_id}/attributes API. A sample body looks like this:
{
"name": "column_attribute",
"entity": {
"columns": {
"city": {
"host": "host1",
"field1": "column1field1",
"field2": "column1field2"
}
}
}
}
To update this attribute, use the PATCH /v2/assets/{asset_id}/attributes/{attribute_key} API or the POST /v2/assets/bulk_patch API.
Data Profiles
Introduction
Data profiles contain classification and information about the distribution of your data, which helps you to understand your data better and make the appropriate data shaping decisions.
Data profiles are automatically created when a data set is added to a catalog with data policy enforcement. The profile summary helps you in analyzing your data more closely and in deciding which cleansing operations on your data will provide the best results for your use-case. You can also perform CRUD operations on data profiles for data sets in catalogs or projects without data policy enforcement.
Create a data profile
You can use this API to:
- Create a data profile
- Create and execute a data profile
To create a data profile for a data set in a specified catalog or project and not execute it, call the following POST method:
POST /v2/data_profiles?start=false
OR
POST /v2/data_profiles
To create a data profile for a data set in a specified catalog or project and execute it, call the following POST method:
POST /v2/data_profiles?start=true
The minimal request payload required to create a data profile is as follows:
{
"metadata": {
"dataset_id": "{DATASET_ID}",
"catalog_id": "{CATALOG_ID}"
}
}
OR
{
"metadata": {
"dataset_id": "{DATASET_ID}",
"project_id": "{PROJECT_ID}"
}
}
The request payload can also have an optional entity part:
{
"metadata": {
"dataset_id": "{DATASET_ID}",
"catalog_id": "{CATALOG_ID}"
},
"entity": {
"data_profile": {
"options": {
"max_row_count": {MAX_ROW_COUNT_VALUE},
"max_distribution_size": {MAX_SIZE_OF_DISTRIBUTIONS},
"max_numeric_stats_bins": {MAX_NUMBER_OF_STATIC_BINS},
"classification_options": {
"disabled": {BOOLEAN_TO_ENABLE_OR_DISABLE_CLASSIFICATION_OPTIONS},
"class_codes": {DATA_CLASS_CODE},
"items": {ITEMS}
}
}
}
}
}
The following parameters can be specified in the URI and the payload:
- start: Specifies whether to start the profiling service immediately after the data profile is created. The default is false.
- max_row_count: Specifies the maximum number of rows to perform profiling on. If no value is provided or if the value is invalid (negative), the default is 5000 rows.
- row_percentage: Specifies the percentage of rows to perform profiling on. If no value is provided or if the value is invalid (<0 or >100).
- max_distribution_size: Specifies the maximum size of various distributions produced by the profiling process. If no value is provided, the default is 100.
- max_numeric_stats_bins: Specifies the maximum number of bins to use in the numerical statistics. If no bin size is provided, the default is 100 bins.
- classification_options: Specifies the various options available for classification.
  - disabled: If true, the classification options are disabled and default values are used.
  - class_codes: Specifies the data class code to consider during profiling.
  - items: Specifies the items.
Note: You can get various data class codes through the data class service.
To create a data profile for a data set, the following steps must be completed:
- You must have a valid IAM token to make REST API calls and a project or catalog ID.
- You must have an IBM Cloud Object Storage bucket, which must be associated with your catalog in the project.
- The data set must be added to your catalog in the project.
- Construct a request payload to create a data profile with the values required in the payload.
- Send a POST request to create a data profile.
When you call the method, the payload is validated. If a required value is not specified or a value is invalid, you get a response message with an HTTP status code of 400 and information about the invalid or missing values.
The response of the method includes a location header with a value that indicates the location of the profile that was created. The response body also includes an href field, which contains the location of the created profile.
The execution.status of the profile is none if the start parameter is not set or is set to false. Otherwise, it is in the submitted state or any other state, depending on the profiling execution status.
The following are possible response codes for this API call:
Response HTTP status | Cause | Possible Scenarios |
---|---|---|
201 | Created | A data profile was created. |
400 | Bad Request | The request payload either had some invalid values or invalid/unwanted parameters. |
401 | Unauthorized | Invalid IAM token was provided in the request header. |
403 | Forbidden | User is not allowed to create a data profile. |
500 | Internal Server Error | Some runtime error occurred. |
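The steps above can be sketched in Python as follows. This is a minimal sketch, not an official client: the helper name build_create_profile_request is hypothetical, the host is the API endpoint given earlier in this documentation, and the Authorization: Bearer header format is an assumption based on the IAM bearer token requirement. The function only builds the request; it does not send it.

```python
import json
import urllib.parse
import urllib.request

API_HOST = "https://api.dataplatform.cloud.ibm.com"  # endpoint from this documentation


def build_create_profile_request(token, dataset_id, catalog_id=None,
                                 project_id=None, start=False, options=None):
    """Build (but do not send) a POST /v2/data_profiles request.

    Hypothetical helper: exactly one of catalog_id or project_id is expected.
    """
    if (catalog_id is None) == (project_id is None):
        raise ValueError("provide exactly one of catalog_id or project_id")

    metadata = {"dataset_id": dataset_id}
    if catalog_id is not None:
        metadata["catalog_id"] = catalog_id
    else:
        metadata["project_id"] = project_id

    payload = {"metadata": metadata}
    if options:
        # Optional entity part, for example {"max_row_count": 1000}.
        payload["entity"] = {"data_profile": {"options": options}}

    query = urllib.parse.urlencode({"start": "true" if start else "false"})
    return urllib.request.Request(
        f"{API_HOST}/v2/data_profiles?{query}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed header format
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, the returned object could be passed to urllib.request.urlopen; on a 201 response, the location header carries the location of the new profile.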
Get a data profile
To get a data profile for a data set in a specified catalog or project, call the following GET method:
GET /v2/data_profiles/{PROFILE_ID}?catalog_id={CATALOG_ID}&dataset_id={DATASET_ID}
OR
GET /v2/data_profiles/{PROFILE_ID}?project_id={PROJECT_ID}&dataset_id={DATASET_ID}
The value of PROFILE_ID is the value of metadata.guid from the successful response payload of the create data profile call.
For other runtime errors, you might get an HTTP status code of 500 indicating that profiling didn't finish as expected.
The following are possible response codes for this API call:
Response HTTP status | Cause | Possible Scenarios |
---|---|---|
200 | Success | Data profile is created and executed. |
202 | Accepted | Data profile is created and under execution. |
401 | Unauthorized | Invalid IAM token was provided in the request header. |
403 | Forbidden | User is not allowed to get the data profile. |
404 | Not Found | The data profile specified was not found. |
500 | Internal Server Error | Some runtime error occurred. |
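Because a 202 response means the profile is still being executed, a caller would typically repeat the GET call until it returns 200. The following minimal sketch shows how the URL and the status codes from the table could be handled; the helper names are hypothetical.

```python
import urllib.parse

# Meanings taken from the response-code table above.
PROFILE_GET_STATUS = {
    200: "finished",     # data profile is created and executed
    202: "in_progress",  # data profile is created and under execution
    401: "unauthorized",
    403: "forbidden",
    404: "not_found",
    500: "server_error",
}


def profile_path(profile_id, dataset_id, catalog_id=None, project_id=None):
    """Return the path and query string for GET /v2/data_profiles/{PROFILE_ID}."""
    if (catalog_id is None) == (project_id is None):
        raise ValueError("provide exactly one of catalog_id or project_id")
    if catalog_id is not None:
        scope = ("catalog_id", catalog_id)
    else:
        scope = ("project_id", project_id)
    query = urllib.parse.urlencode([scope, ("dataset_id", dataset_id)])
    return f"/v2/data_profiles/{urllib.parse.quote(profile_id, safe='')}?{query}"


def should_poll_again(status_code):
    """Only a 202 response means the profile is still being executed."""
    return PROFILE_GET_STATUS.get(status_code) == "in_progress"
```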
Update a data profile
To update a data profile for a data set in a specified catalog or project, call the following PATCH method:
PATCH /v2/data_profiles/{PROFILE_ID}?catalog_id={CATALOG_ID}&dataset_id={DATASET_ID}
OR
PATCH /v2/data_profiles/{PROFILE_ID}?project_id={PROJECT_ID}&dataset_id={DATASET_ID}
The value of PROFILE_ID is the value of metadata.guid from the successful response payload of the create data profile call.
The JSON request payload must be as follows:
[
{
"op": "add",
"path": "string",
"from": "string",
"value": {}
}
]
During update, the entire data profile is replaced, apart from any read-only or response-only attributes.
If profiling processes are running and the start parameter is set to true, then a data profile is only updated if the stop_in_progress_runs parameter is set to true.
The updates must be specified by using the JSON patch format, described in RFC 6902.
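As an illustration of the JSON patch format, the following minimal sketch builds a patch document in Python. It follows the RFC 6902 rule that add, replace, and test carry a value while move and copy carry a from member. The helper name patch_op is hypothetical, and the path in the example is an assumption used only to show the shape of the document; it is not a documented attribute path.

```python
import json

VALUE_OPS = {"add", "replace", "test"}  # operations that carry a "value"
FROM_OPS = {"move", "copy"}             # operations that carry a "from"


def patch_op(op, path, value=None, from_path=None):
    """Build one RFC 6902 operation, keeping only the members that op uses."""
    if op not in VALUE_OPS | FROM_OPS | {"remove"}:
        raise ValueError(f"unknown JSON Patch operation: {op}")
    operation = {"op": op, "path": path}
    if op in VALUE_OPS:
        operation["value"] = value
    if op in FROM_OPS:
        if from_path is None:
            raise ValueError(f"'{op}' requires a 'from' path")
        operation["from"] = from_path
    return operation


# Hypothetical path, shown only to illustrate the structure of the payload.
patch_document = [patch_op("replace", "/entity/data_profile/options/max_row_count", 1000)]
body = json.dumps(patch_document)
```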
Modify asset level classification
This API is used for CRUD operations on asset level classification.
To modify the asset level classification details in the data_profile parameter for a data set in a specified catalog or project, call the following PATCH method:
PATCH /v2/data_profiles/classification?catalog_id={CATALOG_ID}&dataset_id={DATASET_ID}
OR
PATCH /v2/data_profiles/classification?project_id={PROJECT_ID}&dataset_id={DATASET_ID}
The JSON request payload must be structured in the following way:
[
{
"op": "add",
"path": "/data_classification",
"value": [
{
"id":"{ASSET_LEVEL_CLASSIFICATION_ID}",
"name":"{ASSET_LEVEL_CLASSIFICATION_NAME}"
}
]
}
]
The path attribute must be set to what is written in the previous JSON request payload, otherwise you will get a validation error with an HTTP status code of 400.
The values of ASSET_LEVEL_CLASSIFICATION_ID and ASSET_LEVEL_CLASSIFICATION_NAME can be: PII and PII details respectively.
The data updates must be specified by using the JSON patch format, described in RFC 6902 [https://tools.ietf.org/html/rfc6902]. For more details about JSON patch, see [http://jsonpatch.com].
A successful response has an HTTP status code of 200 and lists the asset level classifications.
The following are possible response codes for this API call:
Response HTTP status | Cause | Possible Scenarios |
---|---|---|
200 | Success | Asset Level Classification is added to the asset. |
400 | Bad Request | The request payload either had some invalid values or invalid/unwanted parameters. |
401 | Unauthorized | Invalid IAM token was provided in the request header. |
403 | Forbidden | User is not allowed to add asset level classification to the asset. |
500 | Internal Server Error | A runtime error occurred. |
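The classification payload described above can be generated programmatically. The following is a minimal sketch; the helper name classification_patch is hypothetical, while the op and path values are the ones shown in the request payload above.

```python
import json


def classification_patch(classifications):
    """Build the PATCH body for /v2/data_profiles/classification.

    classifications is a list of (id, name) pairs, e.g. [("PII", "PII details")].
    """
    return [
        {
            "op": "add",
            # Per the text above, any other path results in a 400 validation error.
            "path": "/data_classification",
            "value": [{"id": cid, "name": name} for cid, name in classifications],
        }
    ]


body = json.dumps(classification_patch([("PII", "PII details")]), indent=2)
```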
Delete a data profile
To delete a data profile for a data set in a specified catalog or project, call the following DELETE method:
DELETE /v2/data_profiles/{PROFILE_ID}?catalog_id={CATALOG_ID}&dataset_id={DATASET_ID}&stop_in_progress_profiling_runs=false
OR
DELETE /v2/data_profiles/{PROFILE_ID}?project_id={PROJECT_ID}&dataset_id={DATASET_ID}&stop_in_progress_profiling_runs=true
The value of PROFILE_ID is the value of metadata.guid from the successful response payload of the create data profile call.
You can't delete a profile if the profiling execution status is in running state and the query parameter stop_in_progress_profiling_runs is set to false.
A successful response has an HTTP status code of 204.
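The deletion rule above can be expressed as a small check before the DELETE call is issued. This is a minimal sketch with hypothetical helper names; it assumes that the execution status value is the string running, as named in the text, and shows the project variant of the URL only.

```python
import urllib.parse


def delete_profile_path(profile_id, dataset_id, project_id, stop_in_progress=False):
    """Return the path and query string for the project variant of the DELETE call."""
    query = urllib.parse.urlencode({
        "project_id": project_id,
        "dataset_id": dataset_id,
        "stop_in_progress_profiling_runs": "true" if stop_in_progress else "false",
    })
    return f"/v2/data_profiles/{profile_id}?{query}"


def delete_is_allowed(execution_status, stop_in_progress):
    """A running profile can only be deleted when
    stop_in_progress_profiling_runs is set to true."""
    return execution_status != "running" or stop_in_progress
```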
Stream Flows
Introduction
The streams flow service provides APIs to create, update, delete, list, start, and stop streams flows.
A streams flow is a continuous flow of massive volumes of moving data that real-time analytics can be applied to. A streams flow can read data from a variety of sources, process that data by using analytic operations or your custom code, and then write it to one or more targets. You can access and analyze massive amounts of changing data as it is created. Regardless of whether the data is structured or unstructured, you can leverage data at scale to drive real-time analytics for up-to-the-minute business decisions.
The sources that are supported include Kafka, Message Hub, MQTT, and Watson IoT. Targets that are supported include Db2 Warehouse on Cloud, Cloud Object Storage, and Redis. Analytic operators that are supported include Aggregation, Python Machine Learning, Code, and Geofence.
Authorization
Authorization is done via an Identity and Access Management (IAM) bearer token. All API calls require this bearer token in the header.
Create a Streams Flow
1. Streaming Analytics instance ID
The streams flow is submitted to a Streaming Analytics service for compilation and running. When creating a flow, the Streaming Analytics instance ID must be provided. The instance ID can be found in the service credentials, which can be accessed from the service dashboard.
2. The pipeline graph
The streams flow represents its sources, targets, and operations in a pipeline graph. The pipeline graph can be generated by choosing the relevant operators in the Streams Designer canvas. To retrieve a pipeline graph created by the Streams Designer, use:
GET /v2/streams_flows/85be3e09-1c71-45d3-8d5d-220d6a6ea850?project_id=ff1ab70b-0553-409a-93f9-ccc31471c218
This will return a streams flow containing a pipeline field in the entity. This pipeline object can be copied and submitted into another flow via:
POST /v2/streams_flows/?project_id=ff1ab70b-0553-409a-93f9-ccc31471c218
Request Payload:
{
"name": "My Streams Flow",
"description": "A Sample Streams Flow.",
"engines": {
"streams": {
"instance_id": "8ff81caa-1076-41ce-8de1-f4fe8d79e30e"
}
},
"pipeline": {
"doc_type": "pipeline",
"version": "1.0",
"json_schema": "http://www.ibm.com/ibm/wdp/flow-v1.0/pipeline-flow-v1-schema.json",
"id": "",
"app_data": {
"ui_data": {
"name": "mqtt 2"
}
},
"primary_pipeline": "primary-pipeline",
"pipelines": [
{
"id": "primary-pipeline",
"runtime": "streams",
"nodes": [
{
"id": "messagehubsample_29xse4zvabe",
"type": "binding",
"op": "ibm.streams.sources.messagehubsample",
"outputs": [
{
"id": "target",
"schema_ref": "schema0",
"links": [
{
"node_id_ref": "mqtt_o6are9c4f",
"port_id_ref": "source"
}
]
}
],
"parameters": {
"schema_mapping": [
{
"name": "time_stamp",
"type": "timestamp",
"path": "/time_stamp"
},
{
"name": "customerId",
"type": "double",
"path": "/customerId"
},
{
"name": "latitude",
"type": "double",
"path": "/latitude"
},
{
"name": "longitude",
"type": "double",
"path": "/longitude"
}
]
},
"connection": {
"ref": "EXAMPLE_MESSAGE_HUB_CONNECTION",
"project_ref": "EXAMPLE",
"properties": {
"asset": {
"path": "/geofenceSampleData",
"type": "topic",
"name": "Geospatial data",
"id": "geofenceSampleData"
}
}
},
"app_data": {
"ui_data": {
"label": "Sample Data",
"x_pos": 60,
"y_pos": 90
}
}
},
{
"id": "mqtt_o6are9c4f",
"type": "binding",
"op": "ibm.streams.targets.mqtt",
"parameters": {},
"connection": {
"ref": "cd5388c3-b203-4c77-803b-bc902d864a30",
"project_ref": "a912d673-54d3-4e5c-800f-5088554d3aa8",
"properties": {
"asset": "t"
}
},
"app_data": {
"ui_data": {
"label": "MQTT",
"x_pos": 420,
"y_pos": 90
}
}
},
{
"id": "mqtt_y84zc3vfche",
"type": "binding",
"op": "ibm.streams.sources.mqtt",
"outputs": [
{
"id": "target",
"schema_ref": "schema1",
"links": [
{
"node_id_ref": "debug_9avg3zdig25",
"port_id_ref": "source"
}
]
}
],
"parameters": {
"schema_mapping": [
{
"name": "time_stamp",
"type": "timestamp",
"path": "/time_stamp"
},
{
"name": "customerId",
"type": "double",
"path": "/customerId"
},
{
"name": "latitude",
"type": "double",
"path": "/latitude"
},
{
"name": "longitude",
"type": "double",
"path": "/longitude"
}
]
},
"connection": {
"ref": "cd5388c3-b203-4c77-803b-bc902d864a30",
"project_ref": "a912d673-54d3-4e5c-800f-5088554d3aa8",
"properties": {
"asset": "t"
}
},
"app_data": {
"ui_data": {
"label": "MQTT",
"x_pos": -120,
"y_pos": -210
}
}
},
{
"id": "debug_9avg3zdig25",
"type": "binding",
"op": "ibm.streams.targets.debug",
"parameters": {},
"app_data": {
"ui_data": {
"label": "Debug",
"x_pos": 240,
"y_pos": -270
}
}
}
]
}
],
"schemas": [
{
"id": "schema0",
"fields": [
{
"name": "time_stamp",
"type": "timestamp"
},
{
"name": "customerId",
"type": "double"
},
{
"name": "latitude",
"type": "double"
},
{
"name": "longitude",
"type": "double"
}
]
},
{
"id": "schema1",
"fields": [
{
"name": "time_stamp",
"type": "timestamp"
},
{
"name": "customerId",
"type": "double"
},
{
"name": "latitude",
"type": "double"
},
{
"name": "longitude",
"type": "double"
}
]
}
]
}
}
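Copying a pipeline from an existing flow into a new one, as described above, could look like the following in Python. This is a minimal sketch: the helper name clone_flow_payload is hypothetical, and it assumes that the GET response has already been parsed into a dictionary whose entity field holds the pipeline.

```python
import copy


def clone_flow_payload(source_flow, name, description, instance_id):
    """Build a POST /v2/streams_flows payload that reuses an existing pipeline.

    source_flow is the parsed JSON returned by GET /v2/streams_flows/{id};
    the pipeline object is read from its "entity" field.
    """
    # Deep copy so later edits to the new payload do not alter the source flow.
    pipeline = copy.deepcopy(source_flow["entity"]["pipeline"])
    return {
        "name": name,
        "description": description,
        "engines": {"streams": {"instance_id": instance_id}},
        "pipeline": pipeline,
    }
```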
Streams Flow Lifecycle
After a streams flow is created, it is in the STOPPED state unless it has been submitted as a job to be started. When a job is started, a Cloudant asset is created to track the status of the streams flow run. The start job operation can take up to a minute to complete, during which time the streams flow is in the STARTING state. Once the submission and compilation have completed, the streams flow is in the RUNNING state.
To change the run state, use the POST API:
POST /v2/streams_flows/85be3e09-1c71-45d3-8d5d-220d6a6ea850/runs?project_id=ff1ab70b-0553-409a-93f9-ccc31471c218
Request Payload:
{
"state": "started",
"allow_streams_start": true
}
- To start the streams flow run, use { state: started }. To stop the streams flow run, use { state: stopped }.
- Specify "allow_streams_start" to start the Streaming Analytics service in the event that it is stopped.
The start job operation triggers a long running process on the Streaming Analytics service instance. During this time the progress/status of this job can be viewed:
GET https://api.dataplatform.cloud.ibm.com/v2/streams_flows/85be3e09-1c71-45d3-8d5d-220d6a6ea850/runs?project_id=ff1ab70b-0553-409a-93f9-ccc31471c218
A version of the pipeline that has been deployed is saved to represent the Runtime Pipeline. The streams flow can still be edited in the Streams Designer; the edits have no impact on the deployed Runtime Pipeline until the user stops the running flow and starts it again.
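The run state payload can be built with a small helper such as the one below. This is a minimal sketch with a hypothetical helper name; omitting allow_streams_start when stopping a flow is an assumption, because the text only describes that field for starting.

```python
def run_state_payload(state, allow_streams_start=False):
    """Build the body for POST /v2/streams_flows/{id}/runs."""
    if state not in ("started", "stopped"):
        raise ValueError("state must be 'started' or 'stopped'")
    payload = {"state": state}
    if state == "started":
        # Lets the call start a stopped Streaming Analytics service first.
        payload["allow_streams_start"] = allow_streams_start
    return payload
```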
Metadata Discovery
Metadata Discovery can be used to automatically discover assets from a connection. The connection used for a discovery run can be associated with a catalog or project, but new data assets will be created in a project. Each asset that is discovered from a connection is added as a data asset to the project.
For a list of the supported types of connections against which the Metadata Discovery service can be invoked, see Discover data assets from a connection.
In general, the discovery process takes a significant amount of time. Therefore, the API to create a discovery run actually only queues a discovery run and then returns immediately (typically before the discovery run is even started). Subsequent calls to different APIs can then be made to monitor the progress of the discovery run (see Monitoring a metadata discovery run and Retrieving discovered assets).
The following example shows a request to create a metadata discovery run. It assumes that a project, a connection, and a catalog have already been created, and that their IDs are known by the caller. If a catalog is provided (as in the following example), the connection is associated with the catalog. If no catalog is provided, the connection is associated with the project.
Note: In the following examples, the discovered assets are found in a connection to a DB2 database, but the details of the database are hidden within the connection. So, the caller of the data_discoveries API specifies the database to discover indirectly via the connection.
API request - Create discovery run:
POST /v2/data_discoveries
Request payload:
{
"entity": {
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
In the example request payload, you can see the ID of the connection whose assets will be discovered, and the ID of the project into which the newly created assets will be added.
Response payload:
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z"
},
"entity": {
"status": "CREATED",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
In the response, you can see that the discovery run was created with the ID dcb8a234ad5e438d904a4cdbe0ba70e2, which you'll need to use if you want to get the status of the discovery run that you just created. Also shown in the response are:
- `invoked_by`: the IAM ID of the account that kicked off the discovery process
- `bss_account_id`: the BSS account ID of the catalog
- `created_at`: the creation date and time of the discovery job
To get the status of a discovery run, use the GET data_discoveries API. You can request the status of a discovery run as often as desired. In the following sections, you will be shown a few such calls to illustrate the progression of a discovery run.
API Request - Get status of discovery run:
GET /v2/data_discoveries/dcb8a234ad5e438d904a4cdbe0ba70e2
There is no request payload for the previous GET data_discoveries request. Instead, the ID of the discovery run whose status is being requested is supplied as a path parameter. In the previous URL, use the discovery run ID that was returned by the earlier call to POST data_discoveries. If you no longer have access to the ID of the discovery run for which you want to see status information, see the section Call Discovery API to get the ID of a metadata discovery run.
The following examples show various responses to the same GET data_discoveries monitor request previously shown, made at various points during the discovery run.
Response to status request immediately after creation of a discovery run:
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z"
},
"entity": {
"status": "CREATED",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
In the previous response, you can see that the status of the discovery run has not yet changed: it is still CREATED. This is because the request to discover assets is put into a queue and will be initiated in the order in which it was received.
Response to status request immediately after a discovery run has actually started:
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z",
"started_at": "2018-06-22T15:42:06.167Z",
"ref_project_connection_id": "2526ed95-dedd-4904-bb31-c06d9cb1e105"
},
"entity": {
"statistics": {
},
"status": "RUNNING",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
Now notice that the status has changed to RUNNING, which indicates that the discovery process has actually started. Also, the metadata field has some additional fields added to it:
- started_at: the date and time at which the discovery run started
- ref_project_connection_id: a reference to a cloned project connection ID, internally set when a discovery is created for a connection in a catalog
In addition, notice that a new statistics object was introduced into the response body. In the response, that object is empty because the discovery run, which has just started, hasn't yet discovered any assets.
Response to status request after some assets were discovered:
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z",
"started_at": "2018-06-22T15:42:06.167Z",
"discovered_at": "2018-06-22T15:42:27.970Z",
"ref_project_connection_id": "2526ed95-dedd-4904-bb31-c06d9cb1e105"
},
"entity": {
"statistics": {
"discovered": 128,
"submit_succ": 128
},
"status": "RUNNING",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
Notice the statistics object now contains two fields:
- discovered: the number of assets discovered so far during the discovery run
- submit_succ: the number of assets successfully submitted for creation so far during the discovery run. A discovered asset goes through an internal pipeline with various stages from being discovered at the connection to being created in the project. Here, submitted means the asset was submitted to the internal pipeline.
Refer to Watson Data API schema for the complete list of the possible fields that might show up in the statistics object.
Because the discovery run isn't yet finished, the status in the previous response is still RUNNING.
Response to status request after the discovery run was completed:
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z",
"started_at": "2018-06-22T15:42:06.167Z",
"discovered_at": "2018-06-22T15:42:27.970Z",
"processed_at": "2018-06-22T15:42:45.877Z",
"finished_at": "2018-06-22T15:43:14.969Z",
"ref_project_connection_id": "2526ed95-dedd-4904-bb31-c06d9cb1e105"
},
"entity": {
"statistics": {
"discovered": 179,
"submit_succ": 179,
"create_succ": 179
},
"status": "COMPLETED",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
Notice the status field has changed to COMPLETED to indicate that the discovery run is finished. Other response fields to note:
- finished_at: the date and time at which the discovery run finished
- discovered: indicates that 179 assets were discovered at the connection
- submit_succ: indicates that 179 of the discovered assets were successfully submitted to the discovery run's internal asset processing pipeline.
- create_succ: indicates that 179 assets were successfully created in the project
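The status progression shown above (CREATED, RUNNING, COMPLETED) lends itself to a polling loop. The following is a minimal sketch rather than a definitive implementation: fetch_status is a hypothetical caller-supplied function that performs GET /v2/data_discoveries/{id} and returns the parsed JSON, and the set of terminal statuses contains only the values that appear in this documentation.

```python
import time

# COMPLETED and ABORTED appear in this documentation; other terminal
# statuses may exist, so max_polls bounds the loop.
TERMINAL_STATUSES = {"COMPLETED", "ABORTED"}


def wait_for_discovery(fetch_status, discovery_id, interval_seconds=5.0, max_polls=120):
    """Poll until the run reaches a terminal status or max_polls is exhausted.

    fetch_status(discovery_id) must return the parsed status response.
    """
    last = None
    for attempt in range(max_polls):
        last = fetch_status(discovery_id)
        if last["entity"]["status"] in TERMINAL_STATUSES:
            return last
        if attempt < max_polls - 1:
            time.sleep(interval_seconds)
    status = last["entity"]["status"] if last else "unknown"
    raise TimeoutError(f"discovery run {discovery_id} is still {status}")
```

Injecting fetch_status keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.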
At any time during or after a discovery run, you can call the Asset APIs to get the list of metadata for the currently discovered assets in the project. To retrieve metadata for any list of assets, you can make the following call:
POST /v2/asset_types/{type_name}/search?project_id={project_id}
More specifically, to find the metadata for discovered assets, the value to use for the {type_name} path parameter is discovered_asset. So, for the discovery run we created, the call to retrieve metadata for the discovered assets would look like this:
API Request - Get metadata for discovered assets:
POST /v2/asset_types/discovered_asset/search?project_id=960f6aff-295f-4de1-a9d7-f3b6805b3590
where the project_id query parameter value 960f6aff-295f-4de1-a9d7-f3b6805b3590 is the same value that was specified in the body of the POST request that was used to create the discovery run.
In addition, the ID of the connection that the discovery was run against has to be specified in the body of the POST, like this:
{
"query": "discovered_asset.connection_id:\"f638398f-fcc7-4856-b78d-5c8efa5b9282\""
}
Here is part of the response body for the previous query:
{
"total_rows": 179,
"results": [
{
"metadata": {
"name": "EMP_SURVEY_TOPIC_DIM",
"description": "Warehouse table EMP_SURVEY_TOPIC_DIM describes employee survey questions for employees of the Great Outdoors Company, in supported languages.",
"tags": [
"discovered",
"GOSALESDW"
],
"asset_type": "data_asset",
"origin_country": "ca",
"rating": 0.0,
"total_ratings": 0,
"sandbox_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590",
"catalog_id": "a682c698-6019-437d-a0b9-224aa0a4dbc9",
"created": 0,
"created_at": "2018-06-22T15:41:47Z",
"owner": "abc123@us.ibm.com",
"owner_id": "IBMid-50S...",
"size": 0,
"version": 0.0,
"usage": {
"last_update_time": 1.52968210955E12,
"last_updater_id": "iam-ServiceId-87f49...",
"access_count": 0.0,
"last_accessor_id": "iam-ServiceId-87f49...",
"last_access_time": 1.52968210955E12,
"last_updater": "ServiceId-87f49...",
"last_accessor": "ServiceId-87f49..."
},
"asset_state": "available",
"asset_attributes": [
"data_asset",
"discovered_asset"
],
"rov": {
"mode": 0
},
"asset_category": "USER",
"asset_id": "e35cfd4d-590f-40a5-b75c-ec07c0a4bcbc"
}
},
...
]
}
Notice that the total_rows value 179 matches the create_succ value that was returned in the result of the API call to get the final status of the completed discovery run.
The results array in the previous response body has an entry containing metadata for each asset that was discovered by the discovery run. In the previous code snippet, for brevity, only the first of the 179 entries is shown. The metadata created by the discovery run includes:
- name: in this case, the name of the DB2 table that was discovered
- description: a description of the table as provided by DB2
- tags: these are useful for searching. The discovered tag is one of the tags set for a discovered asset.
- asset_type: the type of the asset that was created in the project
Each entry in the results array also contains an href field that points to the actual asset that was created by the discovery run.
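The search body and the handling of the response can be sketched in Python as follows. The helper names are hypothetical; the query string format and the results and metadata.name fields are the ones shown in the example above.

```python
def discovered_asset_query(connection_id):
    """Build the search body that filters discovered assets by connection."""
    return {"query": f'discovered_asset.connection_id:"{connection_id}"'}


def asset_names(search_response):
    """Return the name of every asset in a parsed search response."""
    return [item["metadata"]["name"] for item in search_response.get("results", [])]
```

Serializing the dictionary with json.dumps produces the escaped quotation marks seen in the request body above.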
There might be times in which you no longer have the ID of the metadata discovery run whose status you're interested in, and so might not be able to call the following API for the specific discovery run you're interested in (which requires that ID):
GET /v2/data_discoveries/dcb8a234ad5e438d904a4cdbe0ba70e2
The following example illustrates how to get the IDs of metadata discovery runs for the connection and catalog that were used in the previous call to create a discovery run:
API Request - Get information for discovery runs:
GET /v2/data_discoveries?offset=0&limit=1000&connection_id=f638398f-fcc7-4856-b78d-5c8efa5b9282&catalog_id=816882fa-dcda-46e1-8c6b-fa23c3cbad14
Note that the values of the query parameters connection_id and catalog_id correspond to the values for the identically named fields in the payload for the previous request to create a discovery run.
Notice also that you can use the offset and limit query parameters to focus on a particular subset of the full list of related discoveries.
The response payload will look like this:
{
"resources": [
{
"metadata": {
"id": "dcb8a234ad5e438d904a4cdbe0ba70e2",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:42:02.843Z",
"started_at": "2018-06-22T15:42:06.167Z",
"discovered_at": "2018-06-22T15:42:27.970Z",
"processed_at": "2018-06-22T15:42:45.877Z",
"finished_at": "2018-06-22T15:43:14.969Z",
"ref_project_connection_id": "2526ed95-dedd-4904-bb31-c06d9cb1e105"
},
"entity": {
"statistics": {
"discovered": 179,
"submit_succ": 179,
"create_succ": 179
},
"status": "COMPLETED",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
],
"first": {
"href": "http://localhost:9080/v2/data_discoveries?offset=0&limit=1000&connection_id=f638398f-fcc7-4856-b78d-5c8efa5b9282&catalog_id=816882fa-dcda-46e1-8c6b-fa23c3cbad14"
},
"next": {
"href": "http://localhost:9080/v2/data_discoveries?offset=1000&limit=1000&connection_id=f638398f-fcc7-4856-b78d-5c8efa5b9282&catalog_id=816882fa-dcda-46e1-8c6b-fa23c3cbad14"
},
"limit": 1000,
"offset": 0
}
Anything that is found because it matches the query criteria is returned in the resources array. In the previous response, there is only one entry and it corresponds to the discovery run which was created in the previous Create a metadata discovery run section.
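Paging through the list of discovery runs with offset and limit could be handled as in the following minimal sketch. The helper names are hypothetical, and the stopping rule (a page with fewer entries than limit is treated as the last page) is an assumption, not documented behavior.

```python
import urllib.parse


def discoveries_list_path(connection_id, catalog_id, offset=0, limit=1000):
    """Return the path and query string for GET /v2/data_discoveries."""
    query = urllib.parse.urlencode({
        "offset": offset,
        "limit": limit,
        "connection_id": connection_id,
        "catalog_id": catalog_id,
    })
    return f"/v2/data_discoveries?{query}"


def discovery_ids(page):
    """Extract the discovery run IDs from one parsed response page."""
    return [run["metadata"]["id"] for run in page["resources"]]


def next_offset(page):
    """Return the offset of the following page, or None for the last page.

    Assumption: a page holding fewer entries than "limit" is the last one.
    """
    if len(page["resources"]) < page["limit"]:
        return None
    return page["offset"] + page["limit"]
```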
There might be times when you want to stop a discovery run before it's completed. To do so, use the PATCH data_discoveries API. The following illustrates how to abort a discovery run (a different discovery run than the one used in the previous examples):
API Request - Abort a discovery run:
PATCH /v2/data_discoveries/09cbff0981f84c51be4b4d93becc17b0
The previous PATCH request requires the following request body to set the status of the discovery run to "ABORTED":
{
"op": "replace",
"path": "/entity/status",
"value": "ABORTED"
}
The response payload will look like this:
{
"metadata": {
"id": "09cbff0981f84c51be4b4d93becc17b0",
"invoked_by": "IBMid-50S...",
"bss_account_id": "e348e...",
"created_at": "2018-06-22T15:45:54.638Z",
"started_at": "2018-06-22T15:45:56.202Z",
"finished_at": "2018-06-22T15:46:02.274Z",
"ref_project_connection_id": "2526ed95-dedd-4904-bb31-c06d9cb1e105"
},
"entity": {
"statistics": {
},
"status": "ABORTED",
"connection_id": "f638398f-fcc7-4856-b78d-5c8efa5b9282",
"catalog_id": "816882fa-dcda-46e1-8c6b-fa23c3cbad14",
"project_id": "960f6aff-295f-4de1-a9d7-f3b6805b3590"
}
}
Notice in the previous response payload that the status has now been set to ABORTED.
Any assets discovered before the run was aborted will remain discovered. In the example, the abort occurred so quickly after the creation of the discovery run that no assets had been discovered, hence the statistics object is empty.
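The abort call can be prepared in Python as shown in this minimal sketch. The helper name build_abort_request is hypothetical, the host is the API endpoint given earlier in this documentation, and the Authorization: Bearer header format is an assumption. The request body is the one shown above; the function builds the request without sending it.

```python
import json
import urllib.request

API_HOST = "https://api.dataplatform.cloud.ibm.com"  # endpoint from this documentation


def build_abort_request(token, discovery_id):
    """Build (but do not send) the PATCH request that aborts a discovery run."""
    body = {"op": "replace", "path": "/entity/status", "value": "ABORTED"}
    return urllib.request.Request(
        f"{API_HOST}/v2/data_discoveries/{discovery_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumed header format
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```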
Lineage
Introduction
The lineage of an asset includes information about all events, and other assets, that have led to its current state and its further usage. Asset and Event are the two main entities that are part of the lineage data model. An asset can either be generated from or used in subsequent events. An event can be any of:
- asset-generation-events
- asset-modification-events
- asset-usage-events.
Use the Lineage API to publish events on an asset or to query the lineage of an asset.
Publish a lineage event
The following example shows a sample lineage event that can be posted when a data set is published from a project to a catalog:
Request URL
POST /v2/lineage_events
Request Body
{
"message_version": "v1",
"user_id": "IAM-Id_of_User",
"account_id": "e86f2b06b0b267d559e7c387ceefb089",
"event_details": {
"event_id": "sample-event1",
"event_type": "DATASET_PUBLISHED",
"event_category": [
"additions"
],
"event_time": "2018-04-03T14:01:08.603Z",
"event_source_service": "Watson Knowledge Catalog"
},
"generates_assets": [
{
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset03",
"asset_type": "DataSet",
"relation": {
"name": "Created"
},
"properties": {
"dataset": {
"type": "dataset",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset03",
"name": "Asset Name in Catalog XX",
"catalog_id": "9f9c961a-78d1-4c06-a601-4b589catalog"
}
},
"catalog": {
"type": "catalog",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b589catalog"
}
}
}
}
],
"uses_assets": [
{
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset02",
"asset_type": "DataSet",
"relation": {
"name": "Used"
},
"properties": {
"dataset": {
"type": "dataset",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset02",
"name": "2017_sales_data",
"project_id": "9f9c961a-78d1-4c06-a601-4b589project"
}
},
"project": {
"type": "project",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b589project"
}
}
}
}
]
}
Response Body
{
"metadata": {
"id": "01014d1f-31cf-4956-bd41-7a77ba14004c",
"source_event_id": "sample-event1"
}
}
The id generated in the response can be used to query the details of the published event with the following request:
Request URL
GET /v2/lineage_events/01014d1f-31cf-4956-bd41-7a77ba14004c
For more details on each field in the lineage event JSON payload, refer to the Lineage Events section of API documentation.
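A lineage event like the sample above can be assembled from a few values. The following is a minimal sketch with hypothetical helper names; it reproduces only the fields that appear in the sample request body and assumes that event_time is reported in UTC with millisecond precision, as in the sample.

```python
from datetime import datetime, timezone


def dataset_ref(asset_id, relation, name, container_type, container_id):
    """Describe one data set in the shape used by generates_assets and uses_assets.

    container_type is "catalog" or "project", as in the sample event.
    """
    return {
        "id": asset_id,
        "asset_type": "DataSet",
        "relation": {"name": relation},
        "properties": {
            "dataset": {
                "type": "dataset",
                "value": {"id": asset_id, "name": name,
                          f"{container_type}_id": container_id},
            },
            container_type: {"type": container_type, "value": {"id": container_id}},
        },
    }


def published_event(event_id, user_id, account_id, generated, used, event_time=None):
    """Build a DATASET_PUBLISHED event body for POST /v2/lineage_events."""
    when = (event_time or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%S") + f".{when.microsecond // 1000:03d}Z"
    return {
        "message_version": "v1",
        "user_id": user_id,
        "account_id": account_id,
        "event_details": {
            "event_id": event_id,
            "event_type": "DATASET_PUBLISHED",
            "event_category": ["additions"],
            "event_time": stamp,
            "event_source_service": "Watson Knowledge Catalog",
        },
        "generates_assets": generated,
        "uses_assets": used,
    }
```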
Query lineage of an asset
The lineage of an asset involved in the sample event can be queried using the following request:
Request URL
GET /v2/asset_lineages/9f9c961a-78d1-4c06-a601-4b5890fdataset03
Response Body
{
"resources": [
{
"metadata": {
"id": "01014d1f-31cf-4956-bd41-7a77ba14004c",
"source_event_id": "sample-event1",
"created_at": "2018-04-03T14:01:08.603Z",
"created_by": "IAM-Id_of_User"
},
"entity": {
"type": "DATASET_PUBLISHED",
"generates_assets": [
{
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset03",
"type": "DataSet",
"relation": {
"name": "Created"
},
"properties": {
"catalog": {
"type": "catalog",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b589catalog"
}
},
"dataset": {
"type": "dataset",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset03",
"name": "Asset Name in Catalog XX",
"catalog_id": "9f9c961a-78d1-4c06-a601-4b589catalog"
}
}
}
}
],
"uses_assets": [
{
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset02",
"type": "DataSet",
"relation": {
"name": "Used"
},
"properties": {
"dataset": {
"type": "dataset",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset02",
"name": "2017_sales_data",
"project_id": "9f9c961a-78d1-4c06-a601-4b589project"
}
},
"project": {
"type": "project",
"value": {
"id": "9f9c961a-78d1-4c06-a601-4b589project"
}
}
}
}
],
"properties": {
"event_time": "2018-04-03T14:01:08.603Z",
"event_category": [
"additions"
],
"event_source_service": "Watson Knowledge Catalog"
}
}
}
],
"limit": 50,
"offset": 0,
"first": {
"href": "https://api.dataplatform.cloud.ibm.com/v2/asset_lineages/9f9c961a-78d1-4c06-a601-4b5890fdataset03?offset=0&_=1528182675331"
}
}
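To make the shape of this response easier to work with, the following sketch pulls out, for each lineage event, the event type and the ids of the generated and used assets. The function name `summarize_lineage` is hypothetical, and the sample dictionary is a trimmed copy of the response above.

```python
def summarize_lineage(lineage_response):
    """Return (event type, generated asset ids, used asset ids) per lineage event."""
    summary = []
    for resource in lineage_response.get("resources", []):
        entity = resource.get("entity", {})
        generated = [asset["id"] for asset in entity.get("generates_assets", [])]
        used = [asset["id"] for asset in entity.get("uses_assets", [])]
        summary.append((entity.get("type"), generated, used))
    return summary


# Trimmed copy of the sample response shown above.
sample_lineage = {
    "resources": [
        {
            "entity": {
                "type": "DATASET_PUBLISHED",
                "generates_assets": [{"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset03"}],
                "uses_assets": [{"id": "9f9c961a-78d1-4c06-a601-4b5890fdataset02"}],
            }
        }
    ]
}
print(summarize_lineage(sample_lineage))
```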
Simple Query
The simple query can be invoked like this:
GET /v3/search?query='fred flintstone'&limit=100
With the simple query you can perform simple textual searches using the Lucene syntax. The above query will return items containing fred, flintstone, or both.
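Because the query text contains quotes and spaces, it usually needs to be URL-encoded before it is sent. The sketch below builds the simple query URL with the Python standard library. The helper name `simple_search_url` and the host `https://cpd.example.com` are placeholders for illustration, not values from this documentation.

```python
from urllib.parse import quote, urlencode


def simple_search_url(base_url, query, limit=100):
    """Build a URL-encoded simple search request (hypothetical helper)."""
    params = {"query": query, "limit": limit}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return f"{base_url}/v3/search?{urlencode(params, quote_via=quote)}"


# The host below is a placeholder; substitute the address of your own deployment.
print(simple_search_url("https://cpd.example.com", "'fred flintstone'"))
```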
Advanced Query
You can use the Global Search API to issue queries using the full capabilities of the Elasticsearch Query Language to search for Catalog assets and Governance artifacts. For details on the structure of an item indexed in Global Search, see below. The advanced query can look something like this:
POST /v3/search -d '
{
"_source":["provider_type_id", "artifact_id", "metadata.name"],
"query": {
"query_string" : { "query" : "flintstone" }
}
}'
The above query returns any items containing the string "flintstone".
Searching with authorization cache
Global search searches across the Cloud Pak for Data platform and restricts search results to content that a user is authorized to view. For faster search results, you can use cached authorization information by setting the auth_cache parameter to true. The auth_cache parameter is set to false by default to use the most current authorization information.
A simple query using cached authorization information:
GET /v3/search?query='fred flintstone'&limit=100&auth_cache=true
An advanced query using cached authorization information:
POST /v3/search?auth_cache=true -d '
{
"_source":["provider_type_id", "artifact_id", "metadata.name"],
"query": {
"query_string" : { "query" : "flintstone" }
}
}'
Searching with limited authorization scope
Global Search searches across the Cloud Pak for Data platform. For a faster search, you can limit your search scope to certain platform components using the auth_scope parameter.
For example, to limit the scope of your search to assets within catalogs, you can use auth_scope=catalog, or to limit your search to assets within projects, you can use auth_scope=project.
Valid values for the auth_scope parameter are catalog, project, space, category, ibm_watsonx_governance_catalog, ibm_data_product_catalog and all (default). Note that only a single value can be provided.
A simple query limiting the scope of the search to catalogs only:
GET /v3/search?query='fred flintstone'&limit=100&auth_scope=catalog
An advanced query limiting the scope of the search to catalogs only:
POST /v3/search?auth_scope=catalog -d '
{
"_source":["provider_type_id", "artifact_id", "metadata.name"],
"query": {
"query_string" : { "query" : "flintstone" }
}
}'
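The two query parameters described above can be checked on the client before a request is sent. The following is a minimal sketch under the rules stated in this section (a single auth_scope value from the documented list, auth_cache off by default). The helper name `search_options` is hypothetical.

```python
VALID_AUTH_SCOPES = {
    "catalog",
    "project",
    "space",
    "category",
    "ibm_watsonx_governance_catalog",
    "ibm_data_product_catalog",
    "all",
}


def search_options(auth_scope="all", auth_cache=False):
    """Return query parameters for /v3/search, validating auth_scope (hypothetical helper)."""
    if auth_scope not in VALID_AUTH_SCOPES:
        # Also rejects comma-separated lists, since only a single value is allowed.
        raise ValueError(f"unsupported auth_scope: {auth_scope!r}")
    params = {}
    if auth_scope != "all":  # "all" is the documented default, so it can be omitted
        params["auth_scope"] = auth_scope
    if auth_cache:  # false is the documented default
        params["auth_cache"] = "true"
    return params


print(search_options("catalog", auth_cache=True))
```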
Searching for terms in specific fields
The above query searched for the word flintstone anywhere in an indexed artifact. Instead of searching throughout the document, you can specify which fields to search in, as in the following example:
```json
{
"_source":["provider_type_id", "artifact_id", "metadata.name"],
"query": {
"match" : { "metadata.name" : "flintstone" }
}
}
```
In the above example, the query is searching for the term flintstone but only in the metadata.name field.
Key-Value Search
Use key-value pairs to restrict the search to specific properties such as the name, description, tags, column names, terms, custom properties, and more.
- key:value : Matches value in the key property.
- key:"value here" : Matches value here in the key property. A quoted value is treated as a whole phrase.
- key1:value1 key2:value2 key3:value3 : Multiple key-value pairs imply an AND between all pairs. This matches (value1 in the key1 property) AND (value2 in the key2 property) AND (value3 in the key3 property).
- text before key1:value1 in-between key2:value2 key3:value3 after : Key-value pairs mixed with regular strings. This matches (text OR before OR in-between OR after in any property) AND (value1 in the key1 property) AND (value2 in the key2 property) AND (value3 in the key3 property).
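The key-value syntax described above can be assembled programmatically. The sketch below joins key-value pairs with spaces (an implicit AND) and quotes values that contain whitespace so they are treated as a whole phrase. The helper name `key_value_search_string` is hypothetical.

```python
def key_value_search_string(pairs, free_text=""):
    """Compose a search string from key-value pairs plus optional free text."""
    parts = [free_text] if free_text else []
    for key, value in pairs.items():
        if any(ch.isspace() for ch in value):
            value = f'"{value}"'  # quoted values are matched as a whole phrase
        parts.append(f"{key}:{value}")
    return " ".join(parts)


search_string = key_value_search_string({"name": "job", "tag": "Toronto"})
body = {"query": {"gs_user_query": {"search_string": search_string}}}
print(body)
```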
The following properties can be specified as the key.
name: Search within the name of an asset or artifact
desc: Search within the description of an asset or artifact
type: Search by the type of an asset (asset_type) or artifact (artifact_type)
owner: Search by the user ID of the owner of an asset
term: Search in assets and artifacts with the specified business term assigned
tag: Search in assets and artifacts with the specified tag
category: Search for artifacts with the specified primary category
category2: Search for artifacts with the specified secondary category
abbr: Search by the abbreviation of a business term
syn: Search by the synonym of a business term
classification: Search by the classification of an asset or artifact
column: Search with the name of a column in a data asset
columnDesc: Search within the description of a column in a data asset
columnTerm: Search with a business term assigned to a column in a data asset
columnTag: Search with a tag on a column in a data asset
columnDataclass: Search with a data class of a column in a data asset
columnClassification: Search with a classification on a column in a data asset
connection: Search with a connection path of an asset
schema: Search for data assets with the specified schema name
table: Search for data assets with the specified table name
resourceKey: Search with a resource key of an asset
steward: Search by the user ID of the steward of an artifact
Global Search searchable custom attribute names can also be used as a key to restrict search to the specified custom attribute.
Restrictions for keys:
- For key-value search by type:, the value must be an asset_type such as data_asset (as referenced by GET /v2/asset_types) or an artifact_type such as data_class (as referenced by GET /v3/governance_artifact_types). The value is case-sensitive.
- For key-value search by owner: or steward:, the value must be the user ID and not the user name or display name. The value is case-sensitive.
Sample query to restrict search to few properties
{
"query":{
"bool":{
"must":[
{
"gs_user_query":{
"search_string":" name:job tag:Toronto",
"nlq_analyzer_enabled": true}
}
]
}
}
}
In this sample query, we want to query on any asset or artifact having job in the name property AND Toronto in the tags property.
Sample query to restrict search to a custom attribute
{
"query":{
"bool":{
"must":[
{
"nested": {
"path": "custom_attributes",
"query": {
"gs_user_query":{
"search_string": "book.author.last_name:Smith",
"nlq_analyzer_enabled": true,
"nested": true
}
}
}
}
]
}
}
}
In this sample query, we want to query on assets of type book, having a Global Search searchable ("global_search_searchable") field called author.last_name with a value of Smith.
Sample Query with Sort
{
"_source":["provider_type_id", "artifact_id", "metadata.name"],
"query": {
"query_string" : {
"query" : "flintstone"
}
},
"sort": [
{"metadata.modified_on": {"order": "desc","unmapped_type": "date"}}
]
}
The above query will sort the search results based on the date the item was modified.
Sample Query with Aggregation
Here is a sample query of a search for the word flintstone with an aggregation (a count) of the words that people put in their tags fields and their terms fields. See below for the fields that exist in documents indexed in Global Search.
{
"query": {
"query_string" : {
"query" : "flintstone"
}
},
"aggregations" : {
"num_tags" : {"terms" : { "field" : "metadata.tags.keyword" }},
"num_terms" : {"terms" : { "field" : "metadata.terms.keyword" }}
}
}
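The counts requested by the aggregation above then need to be read out of the response. The sketch below assumes that the response mirrors the standard Elasticsearch terms-aggregation shape (an `aggregations` object whose entries hold a `buckets` list of `key` and `doc_count` pairs). That shape is an assumption, not something this documentation confirms, and both the helper name `bucket_counts` and the sample response are made up for illustration.

```python
def bucket_counts(search_response, aggregation_name):
    """Map each bucket key to its document count for one named aggregation."""
    aggregation = search_response.get("aggregations", {}).get(aggregation_name, {})
    return {bucket["key"]: bucket["doc_count"] for bucket in aggregation.get("buckets", [])}


# Hypothetical response fragment, assuming Elasticsearch-style buckets.
sample_search_response = {
    "aggregations": {
        "num_tags": {"buckets": [{"key": "bedrock", "doc_count": 3}, {"key": "cartoon", "doc_count": 1}]},
        "num_terms": {"buckets": []},
    }
}
print(bucket_counts(sample_search_response, "num_tags"))
```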
Nested Queries and Custom Attributes
You can add any number of custom attributes to an item you index with Global Search. Each custom attribute consists of a name field and a value field.
"custom_attributes": [
{
"last_updated_at": 0,
"attribute_name": "string",
"attribute_value": "string"
}
]
Because custom attributes normally consist of two fields acting as one, they are nested objects and you must use nested queries to query on those nested objects.
The custom_attributes fields will be included in the response for any result which has custom_attributes. Nested queries are only required to query properties of the custom_attributes, such as custom_attributes.attribute_name or custom_attributes.attribute_value.
Sample Nested Query
In this sample query, we want to query on any asset having a custom attribute named city with a value of ottawa, and a second custom attribute named colour with a value of red. In this example, the city attribute is treated as a text field, while the colour attribute simulates an enumerated list of colours with exact values (for example, red, blue, green).
{
"_source":["metadata.name", "custom_attributes"],
"query": {
"bool": {
"must": [
{
"nested": {
"path": "custom_attributes",
"query": {
"bool": {
"must": [
{"term": {"custom_attributes.attribute_name": "city"}},
{"match": {"custom_attributes.attribute_value": "Ottawa"}}
]
}
}
}
},
{
"nested": {
"path": "custom_attributes",
"query": {
"bool": {
"must": [
{"term": {"custom_attributes.attribute_name": "colour"}},
{"term": {"custom_attributes.attribute_value.keyword": "red"}}
]
}
}
}
}
]
}
},
"aggs": {
"custom_attr_count": {
"nested": {
"path": "custom_attributes"
},
"aggs": {
"city_count": {
"filter": {
"term": {"custom_attributes.attribute_name": "city"}
},
"aggs": {
"city_count": {
"terms": {
"field": "custom_attributes.attribute_value.keyword",
"size": 20
}
}
}
}
}
}
}
}
In the query body illustrated above, there's a query portion and an aggregations (aggs) portion. There can be any number of custom attributes. Because we only want counts of city, we must include a filter in the aggregation so that only attributes whose name is city are counted. Notice that the count is returned for custom_attributes.attribute_value.keyword, not custom_attributes.attribute_value. This is important to note: you cannot sort or aggregate on text fields, only on keyword fields. Every text field in Global Search has a corresponding keyword field with a .keyword extension. Use the .keyword field for things you want to count or sort on. Finally, the size parameter restricts the number of counts returned to the top 20.
Caution The key-value search is not supported inside a nested clause.
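Since every custom attribute condition needs its own nested clause, the clauses in the sample nested query can be generated by a small helper. This sketch reproduces the two clause shapes used above: a match on the text field for free-text values, and a term on the .keyword field for exact values. The helper name `custom_attribute_clause` is hypothetical.

```python
def custom_attribute_clause(name, value, exact=False):
    """Build one nested clause that matches a custom attribute by name and value."""
    if exact:
        # Exact values are compared against the .keyword field.
        value_clause = {"term": {"custom_attributes.attribute_value.keyword": value}}
    else:
        value_clause = {"match": {"custom_attributes.attribute_value": value}}
    name_clause = {"term": {"custom_attributes.attribute_name": name}}
    return {
        "nested": {
            "path": "custom_attributes",
            "query": {"bool": {"must": [name_clause, value_clause]}},
        }
    }


nested_query = {
    "query": {
        "bool": {
            "must": [
                custom_attribute_clause("city", "Ottawa"),
                custom_attribute_clause("colour", "red", exact=True),
            ]
        }
    }
}
print(nested_query)
```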
General Purpose Search Function
Global Search provides a general purpose search function that is tailored to the requirements of Cloud Pak for Data users. You can invoke it using Global Search's Advanced API (see the Methods section below). It is this function that Cloud Pak for Data uses when a user enters a search term in the search bar at the top of the Cloud Pak for Data user interface. You can invoke it anywhere you would normally invoke a standard Elasticsearch search function. For example, it can be the main function of your query:
{
"query":{
"gs_user_query":{
"search_string":"The quick red fox jumped over the lazy brown dog"
}
}
}
The search_string field is required and specifies the search query string.
The following optional parameters can be specified as part of the gs_user_query:
- search_fields - List of fields that the search will be restricted to. If not specified, the search will run across all fields in the configuration.
- nlq_analyzer_enabled - Specify true to enable the natural language analyzer. The default value is false.
- semantic_expansion_enabled - Specify true to enable semantic query expansion. The default value is false.
- nested - Specify true to optimize nested queries. The default value is false.
{
"query":{
"gs_user_query":{
"search_string": "The quick red fox jumped over the lazy brown dog",
"search_fields": ["metadata.name", "metadata.description"],
"nlq_analyzer_enabled": true,
"semantic_expansion_enabled": true,
"nested": false
}
}
}
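The parameters described above map naturally onto a small builder function. The sketch below requires search_string and only emits the optional flags when they differ from their documented default of false. The function name `gs_user_query` mirrors the query clause, but the helper itself is hypothetical.

```python
def gs_user_query(search_string, search_fields=None, nlq_analyzer_enabled=False,
                  semantic_expansion_enabled=False, nested=False):
    """Build a gs_user_query clause, omitting options left at their defaults."""
    if not search_string:
        raise ValueError("search_string is required")
    clause = {"search_string": search_string}
    if search_fields:
        clause["search_fields"] = list(search_fields)
    flags = {
        "nlq_analyzer_enabled": nlq_analyzer_enabled,
        "semantic_expansion_enabled": semantic_expansion_enabled,
        "nested": nested,
    }
    # Only non-default (true) flags are written into the clause.
    clause.update({name: True for name, enabled in flags.items() if enabled})
    return {"gs_user_query": clause}


body = {"query": gs_user_query("The quick red fox", search_fields=["metadata.name"],
                               nlq_analyzer_enabled=True)}
print(body)
```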
This search function will find:
- a single phrase
- multiple individual words
- partial words (within words or at the beginning of words)
- the first letter of a word
If no search fields are specified, the function will search the entire document, including the name field, the description field, tags, synonyms, custom attribute values, column names, and column descriptions. It will give the highest priority to the name field of the document.
You can embed gs_user_query within a compound query:
{
"query":{
"bool":{
"must":[
{"gs_user_query":{"search_string": "the quick red fox jumped over the lazy brown dog"}}
],
"filter":[
{"term":{"provider_type_id":"cams"}}
]
}
},
"sort": [
{"metadata.modified_on": {"order": "desc","unmapped_type": "date"}}
]
}
You can include gs_user_query with complex queries that include aggregations along with sorts:
{
"query": {
"gs_user_query" : {
"search_string": "fred flintstone"
}
},
"sort" : [
{"metadata.modified_on": {"order": "desc", "unmapped_type": "date"}}
],
"aggregations": {
"first_letter": {
"terms": {
"script": "doc['metadata.name.keyword'].getValue().substring(0,1)",
"order": {
"_key": "asc"
}
},
"aggs": {
"first_letter_group": {
"terms": {
"field": "metadata.name.keyword",
"order": {
"_key": "asc"
}
}
}
}
}
}
}
You can use gs_user_query in a nested query to search for custom attributes:
{
"query": {
"bool": {
"should": [
{
"gs_user_query": {
"search_string": "quick red fox",
"nlq_analyzer_enabled": true
}
},
{
"nested": {
"path": "custom_attributes",
"query": {
"gs_user_query": {
"search_string": "lazy brown dog",
"nlq_analyzer_enabled": true,
"nested": true
}
}
}
}
]
}
}
}
Searching for a quoted phrase
Wrap the phrase in quotes within your query as follows:
{
"query":{
"gs_user_query":{
"search_string":"\"The quick red fox jumped over the lazy brown dog\""
}
}
}
The above query will search for exactly the phrase "The quick red fox jumped over the lazy brown dog".
A quoted phrase can also be included in a longer string:
{
"query":{
"gs_user_query":{
"search_string":"The \"quick red fox\" jumped over the \"lazy brown dog\"",
"nlq_analyzer_enabled": true
}
}
}
The above query will search for the phrases quick red fox and lazy brown dog, and will not return results containing only quick, red, fox, lazy, brown, or dog. The query will, however, return results matching the individual words jumped or over.
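When the request body is built in code, the inner quotes around each phrase do not need to be escaped by hand. A JSON serializer produces the backslash-escaped form shown in the examples above. The helper name `quote_phrase` in this short sketch is hypothetical.

```python
import json


def quote_phrase(phrase):
    """Wrap a phrase in double quotes so it is searched as a whole phrase."""
    return f'"{phrase}"'


search_string = (
    f"The {quote_phrase('quick red fox')} jumped over the {quote_phrase('lazy brown dog')}"
)
body = {"query": {"gs_user_query": {"search_string": search_string,
                                    "nlq_analyzer_enabled": True}}}
# json.dumps escapes the inner quotes, so the serialized body contains \" sequences.
print(json.dumps(body))
```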
Searching for words starting with ...
To search for words starting with a letter or letters, enter only the first 1 to 3 letters of the word.
{
"query":{
"gs_user_query":{
"search_string":"in"
}
}
}
The above query will return documents with words like infinite and invitation, but not words like definitive.
Searching for parts of words
If your search terms include more than three letters, then Global Search will search for any partial word matches. For example:
{
"query":{
"gs_user_query":{
"search_string":"init"
}
}
}
The above query will find documents with words like initialize (i.e. at the beginning of the word) and trinitrotoluene (i.e. within the word).
Partial matches will not be found on the metadata.description or entity.assets.column_description fields.
Searching with natural language
Natural language analysis can be applied to English search strings to optimize search results in the following ways:
- Words that are not important to the search intent are removed from the search query.
- Phrases in the search string that are common in English are automatically ranked higher than results for individual words.
{
"query": {
"gs_user_query": {
"search_string": "credit card interest in United States",
"nlq_analyzer_enabled": true
}
}
}
The above query will find documents with:
- Matches for the phrases credit card interest and United States ranked highest
- Matches for individual words credit, card, interest, United, and States ranked lower
The above query will not return documents containing only the word in.
Result scoring
Results are prioritized using a combination of field priority and type of match as follows:
Type of match:
- Matches for entire phrases will score highest.
- Exact matches of complete words will score next highest.
- If the search term is 3 characters or fewer, results that contain words starting with that search term will score next highest.
- Partial matches of complete words will score next highest.
- Fuzzy matches (for example, adidas vs adadas) will match, but will score lowest.
Field:
- Name
- Synonyms, Abbreviation, Terms, Tags or Classifications
- Description, Primary or Secondary Category
- Column Descriptions, Column Terms, Column Tags or Column Data Class Names
Documents in Global Search
You can query on any of the fields within the document by including the field name in a flattened json structure. For example the field:
{
"entity":{
"artifacts":{
"artifact_id":"<id>"
}
}
}
is queried for by using the following
entity.artifacts.artifact_id
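The flattening rule above (join nested keys with dots) can be illustrated with a short recursive helper that lists every queryable field path in a document. The function name `flatten_field_paths` is hypothetical.

```python
def flatten_field_paths(document, prefix=""):
    """Return the dotted field paths for every leaf value in a nested dict."""
    paths = []
    for key, value in document.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            paths.extend(flatten_field_paths(value, path))  # descend into nested objects
        else:
            paths.append(path)
    return paths


example_document = {"entity": {"artifacts": {"artifact_id": "<id>"}}, "metadata": {"name": "string"}}
print(flatten_field_paths(example_document))
```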
Documents indexed in global search have the following structure:
{
"provider_type_id": "string",
"tenant_id": "string",
"artifact_id": "string",
"last_updated_at": 0,
"metadata": {
"name": "string",
"description": "string",
"artifact_type": "string",
"tags": [
"string"
],
"modified_on": "2021-02-11T11:25:59.384Z",
"modified_by": "string",
"terms": [
"string"
],
"term_global_ids": [
"string"
],
"steward_ids": [
"string"
],
"steward_group_ids": [
"string"
],
"state": "string",
"classifications": [
"string"
],
"classification_global_ids": [
"string"
]
},
"entity": {
"artifacts": {
"global_id": "string",
"version_id": "string",
"artifact_id": "string",
"rule_type": "string",
"effective_start_date": "2021-02-11T11:25:59.384Z",
"effective_end_date": "2021-02-11T11:25:59.384Z",
"abbreviation": [
"string"
],
"synonyms": [
"string"
],
"synonym_global_ids": [
"string"
],
"enabled": true
},
"assets": {
"asset_id": "string",
"asset_name": "string",
"attribute_quality_score": 0,
"catalog_id": "string",
"project_id": "string",
"space_id": "string",
"column_names": [
"string"
],
"column_terms": [
"string"
],
"column_term_global_ids": [
"string"
],
"column_descriptions": [
"string"
],
"connection_paths": [
"string"
],
"column_tags": [
"string"
],
"connection_ids": [
"string"
],
"column_data_class_names": [
"string"
],
"data_quality_score": 0,
"data_quality_score_dbl": 0,
"data_quality_delta": 0,
"data_quality_delta_dbl": 0,
"metadata_enrichment_area_id": "string",
"metadata_enrichment_review_date": 0,
"resource_key": "string",
"suggested_column_term_global_ids": [
"string"
],
"suggested_column_terms": [
"string"
],
"schema_name": "string",
"table_name": "string",
"structured_profiling_completed_date": "2021-02-11T11:25:59.384Z",
"structured_profiling_status": "string",
"term_assignment_completed_date": "2021-02-11T11:25:59.384Z",
"metadata_import_id": "string",
"first_imported_timestamp": "2021-02-11T11:25:59.384Z",
"last_imported_timestamp": "2021-02-11T11:25:59.384Z",
"outdated_timestamp": "2021-02-11T11:25:59.384Z",
"last_discovered_timestamp": "2021-02-11T11:25:59.384Z",
"outdated_reason": "string"
},
"data_asset_column": {
"data_class_name": "string",
"data_class_id": "string",
"data_class_set_by_user": true,
"data_quality_score": 0,
"data_quality_score_dbl": 0,
"metadata_enrichment_review_date": "2021-02-11T11:25:59.384Z",
"data_quality_delta": 0,
"data_quality_delta_dbl": 0,
"column_order": 0,
"suggested_class_ids": ["string", "string", "string"],
"suggested_class_names": ["string", "string", "string"]
},
"ref_data_value": {
"value": "string",
"value_long": 0,
"value_dbl": 0,
"ref_data_set_artifact_id": "string",
"ref_data_set_version_id": "string",
"parent_ref_data_value_artifact_id": "string"
}
},
"custom_attributes": [
{
"last_updated_at": 0,
"attribute_name": "string",
"attribute_value": "string"
}
],
"categories": [
{
"last_updated_at": 0,
"primary_category_id": "string",
"primary_category_global_id": "string",
"primary_category_name": "string",
"secondary_category_ids": [
"string"
],
"secondary_category_global_ids": [
"string"
],
"secondary_category_names": [
"string"
]
}
]
}
Methods
Archive the asset container
Archive the assets database of the asset container.
Archiving the assets database of unused asset containers helps save database storage and computing resources, and therefore reduces the footprint of CAMS. The information stored in archived asset containers is not accessible until the asset containers are restored.
POST /v2/asset_containers/archive
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Archive request
Response
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Status Code
Accepted
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
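The rule that exactly one container id must be supplied can be enforced on the client before the archive request is sent. In the sketch below, the query parameter names catalog_id, project_id, and space_id are assumptions (this page does not show them), and the helper name `container_query_params` is hypothetical.

```python
def container_query_params(catalog_id=None, project_id=None, space_id=None):
    """Return the single container id as query parameters, or raise ValueError."""
    # Parameter names are assumed; check them against the API reference.
    candidates = {"catalog_id": catalog_id, "project_id": project_id, "space_id": space_id}
    provided = {name: value for name, value in candidates.items() if value is not None}
    if len(provided) != 1:
        raise ValueError("provide exactly one of catalog_id, project_id, or space_id")
    return provided


print(container_query_params(project_id="9f9c961a-78d1-4c06-a601-4b589project"))
```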
Get archive information of the asset container
Get archive information of the asset container.
GET /v2/asset_containers/archive_info
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Response
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
Get configurations
Get the configurations of a catalog/project/space.
GET /v2/asset_containers/configurations
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Response
Action to take if duplicate assets are found for a given asset
Possible values: [IGNORE, REJECT, REPLACE, UPDATE]
Default duplicate detection strategy for asset types that do not have a duplicate detection strategy defined
Possible values: [DUPLICATE_DETECTION_BY_NAME, DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY, DUPLICATE_DETECTION_BY_NAME_AND_FOLDER, DUPLICATE_DETECTION_BY_RESOURCE_KEY, DUPLICATE_DETECTION_NOT_APPLICABLE]
Duplicate detection strategies of assets with specified asset types in this catalog/project/space
Authorize reporting in this catalog
Whether assets are purged on delete
Number of days after a soft delete after which assets are purged
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
No Sample Response
Replace or create configurations
Replace the configurations of a catalog/project/space or create the configurations if they do not exist.
PUT /v2/asset_containers/configurations
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Configurations
Action to take if duplicate assets are found for a given asset
Allowable values: [IGNORE, REJECT, REPLACE, UPDATE]
Default duplicate detection strategy for asset types that do not have a duplicate detection strategy defined
Allowable values: [DUPLICATE_DETECTION_BY_NAME, DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY, DUPLICATE_DETECTION_BY_NAME_AND_FOLDER, DUPLICATE_DETECTION_BY_RESOURCE_KEY, DUPLICATE_DETECTION_NOT_APPLICABLE]
Duplicate detection strategies of assets with specified asset types in this catalog/project/space
Authorize reporting in this catalog
Whether assets are purged on delete
Number of days after a soft delete after which assets are purged
Response
Action to take if duplicate assets are found for a given asset
Possible values: [IGNORE, REJECT, REPLACE, UPDATE]
Default duplicate detection strategy for asset types that do not have a duplicate detection strategy defined
Possible values: [DUPLICATE_DETECTION_BY_NAME, DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY, DUPLICATE_DETECTION_BY_NAME_AND_FOLDER, DUPLICATE_DETECTION_BY_RESOURCE_KEY, DUPLICATE_DETECTION_NOT_APPLICABLE]
Duplicate detection strategies of assets with specified asset types in this catalog/project/space
Authorize reporting in this catalog
Whether assets are purged on delete
Number of days after a soft delete after which assets are purged
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
Restore the asset container
Restore the assets database of the archived asset container. Only a successfully archived asset container can be restored.
POST /v2/asset_containers/restore
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Response
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Status Code
OK
Accepted
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
List all asset lists defined for an account
Retrieve a paginated list of asset lists defined for an account.
GET /v2/asset_lists
Request
Query Parameters
Query to retrieve a paginated list of asset list metadata that the requesting user has permission to view.
Search query in Common Expression Language, with special characters encoded. For example, & is %26.
Examples of query parameters:
* id
* type
* state
* access_control.owner
* created_at
* last_updated_at
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$%&=|\(\)\s\-\_\^"]+$
Example:
type==order%26%26created_at%3E=2023-08-23T21:10:40Z
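The encoded example above can be reproduced with the Python standard library. In this sketch, the choice to leave = and : unencoded is an assumption made only so that the output matches the documented example. The helper name `encode_asset_list_query` is hypothetical.

```python
from urllib.parse import quote


def encode_asset_list_query(expression):
    """Percent-encode special characters in a query expression, keeping '=' and ':'."""
    return quote(expression, safe="=:")


print(encode_asset_list_query("type==order&&created_at>=2023-08-23T21:10:40Z"))
```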
Start token for pagination
Possible values: 1 ≤ length ≤ 512, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Search result limit
Possible values: 1 ≤ value ≤ 200
Default:
200
Example:
10
This parameter can be repeated to add additional sort fields.
Default: null
Examples of sort fields (these are case insensitive):
- created_at
- -created_at
- last_updated_at
- -last_updated_at
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
curl --request GET --url '{url}/v2/asset_lists' --header 'authorization: Bearer {access_token}' --header 'accept: application/json'
Response
The Asset list collection response API model.
Collection of asset list results.
Possible values: 0 ≤ number of items ≤ 200
The limit on the number of results returned.
Example:
200
The total number of results, accounting for filtering but not pagination.
Example:
50
A page in a pagination collection.
The next page in the collection.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
{ "limit": 200, "total_count": 1, "first": { "href": "https://api.dataplatform.cloud.ibm.com/v2/asset_lists" }, "next": { "href": "https://api.dataplatform.cloud.ibm.com/v2/asset_lists?start=start_token", "start": "start_token" }, "asset_lists": [ { "name": "my_order", "description": "Order of multiple items", "type": "order", "state": "default_state", "id": "3204c622-dcb8-4728-869f-484b6ac73dff", "asset": { "id": "701f4af4-679c-42d1-80b5-0de5796a7514", "type": "ibm_data_product_version", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId" } ] }
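The next.start token in this response can drive a simple pagination loop. The sketch below takes a caller-supplied `fetch_page` function, so it stays independent of any HTTP library, and it assumes that the final page omits the next link, which this page does not state explicitly. All names in the sketch are hypothetical.

```python
def iter_asset_lists(fetch_page):
    """Yield asset lists across pages; fetch_page(start) returns one parsed response."""
    start = None
    while True:
        page = fetch_page(start)
        yield from page.get("asset_lists", [])
        next_link = page.get("next")
        if not next_link or not next_link.get("start"):
            break  # assumed: no "next" link means this was the last page
        start = next_link["start"]


# Stand-in for real HTTP calls: two fake pages keyed by start token.
fake_pages = {
    None: {"asset_lists": [{"id": "a"}], "next": {"start": "start_token"}},
    "start_token": {"asset_lists": [{"id": "b"}]},
}
print([item["id"] for item in iter_asset_lists(lambda start: fake_pages[start])])
```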
Create a new asset list
Use this API to create a new asset list.
If the state is not specified, the asset list will be created in DEFAULT state.
The list may be created with items, or without items.
The list will be owned by the request user.
POST /v2/asset_lists
Request
The asset list for creation.
Example REST body for asset list creation of type ORDER.
{
"name": "my_order",
"description": "Order of multiple items",
"type": "order",
"items": [
{
"asset": {
"id": "8509d09e-b3d0-4e8b-9682-726a7b95f69a",
"type": "data_asset",
"container": {
"id": "71993ab5-3374-4355-8a38-fc4c32debc00",
"type": "catalog"
}
},
"properties": {
"my_property_1": "Value 1",
"my_property_2": "Value 2"
}
},
{
"asset": {
"id": "8509d09e-b3d0-4e8b-9682-726a7b95f69b",
"type": "data_asset",
"container": {
"id": "71993ab5-3374-4355-8a38-fc4c32debc00",
"type": "catalog"
}
},
"properties": {
"my_property_1": "Value 1",
"my_property_map": {
"my_property_map_key": "Value 2"
}
}
}
]
}
Asset list name. A name can contain letters, numbers, underscores, dashes, spaces or periods. Names are mutable and reusable.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
My wish list
The type of the asset list. Valid types are: ['order', 'cart', 'wishlist']
Allowable values: [cart, wishlist, order]
Example:
order
An asset from a container.
The items in the asset list.
Possible values: 0 ≤ number of items ≤ 20
A description of the asset list.
Possible values: 1 ≤ length ≤ 4000, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
This list contains assets for project X.
The state of the asset list. Valid states are: ['default_state', 'confirmed']. If not specified, the default value will be 'DEFAULT'.
Allowable values: [default_state, received, under_review, ready_to_deliver, rejected, in_progress, partially_delivered, succeeded, failed, workflow_exception, cancelled, redeliver, reattempt, rerequest]
Default:
default_state
Example:
default_state
A message used for communicating reasons for state changes.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
The reason for the state change.
The properties of an asset list item.
curl --request POST --url '{url}/v2/asset_lists' --header 'accept: application/json' --header 'authorization: Bearer {access_token}' --header 'content-type: application/json' --data '{"name": "my_order","description": "Order of multiple items","items": [{"asset": {"metadata": {"id": "8509d09e-b3d0-4e8b-9682-726a7b95f69a","type": "data_asset","container": {"id": "71993ab5-3374-4355-8a38-fc4c32debc00","type": "catalog"}}},"properties": {"my_property_1": "Value 1","my_property_2": "Value 2"}},{"asset": {"metadata": {"id": "8509d09e-b3d0-4e8b-9682-726a7b95f69b","type": "data_asset","container": {"id": "71993ab5-3374-4355-8a38-fc4c32debc00","type": "catalog"}}},"properties": {"my_property_1": "Value 1","my_property_map": {"my_property_map_key": "Value 2"}}}],"type": "order"}'
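The constraints listed for the request body (name and description length and pattern, allowed types, at most 20 items) can be checked locally before the POST is sent. This is a minimal sketch, not the service's own validation. It treats name and type as required, which is an assumption, and the helper name `validate_asset_list` is hypothetical.

```python
import re

# Pattern copied from the field descriptions above.
TEXT_PATTERN = re.compile(r'^[\w\.,:$&\(\)\s\-\_\^"]+$')
ALLOWED_TYPES = {"cart", "wishlist", "order"}
MAX_ITEMS = 20


def validate_asset_list(body):
    """Return a list of problems found in an asset list creation body."""
    errors = []
    name = body.get("name", "")
    if not (1 <= len(name) <= 256) or not TEXT_PATTERN.match(name):
        errors.append("name must be 1 to 256 characters and match the documented pattern")
    if body.get("type") not in ALLOWED_TYPES:
        errors.append("type must be one of cart, wishlist, order")
    if len(body.get("items", [])) > MAX_ITEMS:
        errors.append("items must contain at most 20 entries")
    description = body.get("description")
    if description is not None and (
        not (1 <= len(description) <= 4000) or not TEXT_PATTERN.match(description)
    ):
        errors.append("description must be 1 to 4000 characters and match the documented pattern")
    return errors


print(validate_asset_list({"name": "my_order", "type": "order",
                           "description": "Order of multiple items", "items": []}))
```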
Response
The Asset list summary API model for retrieval.
The unique identifier of the asset list formatted either as a UUID or a special case such as cart_userID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
RFC 3339 timestamp when the asset list was created (system managed).
Example:
2022-04-22T03:22:32.000Z
RFC 3339 timestamp when the asset list was last updated (system managed).
Example:
2022-04-22T03:22:32.000Z
The user who last updated the asset list.
Example:
userId
Asset list name. A name can contain letters, numbers, underscores, dashes, spaces or periods. Names are mutable and reusable.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
My wish list
The type of the asset list. Valid types are: ['order', 'cart', 'wishlist']
Possible values: [cart, wishlist, order]
Example:
order
The state of the asset list. Valid states are: ['default_state', 'confirmed']. If not specified, the default value will be 'default_state'.
Possible values: [default_state, received, under_review, ready_to_deliver, rejected, in_progress, partially_delivered, succeeded, failed, workflow_exception, cancelled, redeliver, reattempt, rerequest]
Example:
default_state
An asset from a container.
Access control permissions for the asset list.
A description of the asset list.
Possible values: 1 ≤ length ≤ 4000, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
This list contains assets for project X.
A message used for communicating reasons for state changes.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
The reason for the state change.
The properties of an asset list item.
Status Code
Created
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
Internal Server Error
{ "name": "my_order", "description": "Order of multiple items", "type": "order", "state": "default_state", "id": "3204c622-dcb8-4728-869f-484b6ac73dff", "asset": { "id": "701f4af4-679c-42d1-80b5-0de5796a7514", "type": "ibm_data_product_version", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId" }
Delete asset lists
Deletes asset lists based on a query in Common Expression Language format.
DELETE /v2/asset_lists
Request
Query Parameters
Query to delete asset lists for which the request user has permission.
Search query in Common Expression Language, with special characters encoded. For example, & is %26.
Examples of query parameters:
* id
* type
* state
* access_control.owner
* created_at
* last_updated_at
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$%&=|\(\)\s\-\_\^"]+$
Example:
type==order%26%26created_at%3E=2023-08-23T21:10:40Z
curl --request DELETE --url '{url}/v2/asset_lists?query=created_at%3E%3D2023-08-23T21%3A10%3A40Z' --header 'accept: application/json' --header 'authorization: Bearer {access_token}'
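Because the query must be sent with special characters encoded, it is usually easier to write the expression in plain form and encode it programmatically. A minimal Python sketch using only the standard library follows; the function names are hypothetical.

```python
from urllib.parse import quote


def encode_query(expression):
    """Percent-encode every reserved character, e.g. & becomes %26."""
    return quote(expression, safe="")


def delete_asset_lists_url(base_url, expression):
    """Build the DELETE /v2/asset_lists URL for a plain-text query expression."""
    return f"{base_url}/v2/asset_lists?query={encode_query(expression)}"
```

For example, `encode_query("created_at>=2023-08-23T21:10:40Z")` yields the same encoded value that appears in the curl command above.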
Retrieve an asset list identified by ID
Retrieves the metadata of an asset list identified by ID.
GET /v2/asset_lists/{asset_list_id}
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
curl --request GET --url '{url}/v2/asset_lists/{asset_list_id}' --header 'authorization: Bearer {access_token}' --header 'accept: application/json'
Response
The Asset list summary API model for retrieval.
The unique identifier of the asset list formatted either as a UUID or a special case such as cart_userID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
RFC 3339 timestamp when the asset list was created (system managed).
Example:
2022-04-22T03:22:32.000Z
RFC 3339 timestamp when the asset list was last updated (system managed).
Example:
2022-04-22T03:22:32.000Z
The user who last updated the asset list.
Example:
userId
Asset list name. A name can contain letters, numbers, underscores, dashes, spaces or periods. Names are mutable and reusable.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
My wish list
The type of the asset list. Valid types are: ['order', 'cart', 'wishlist']
Possible values: [cart, wishlist, order]
Example:
order
The state of the asset list. Valid states are: ['default_state', 'confirmed']. If not specified, the default value will be 'default_state'.
Possible values: [default_state, received, under_review, ready_to_deliver, rejected, in_progress, partially_delivered, succeeded, failed, workflow_exception, cancelled, redeliver, reattempt, rerequest]
Example:
default_state
An asset from a container.
Access control permissions for the asset list.
A description of the asset list.
Possible values: 1 ≤ length ≤ 4000, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
This list contains assets for project X.
A message used for communicating reasons for state changes.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
The reason for the state change.
The properties of an asset list item.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
Internal Server Error
{ "name": "my_order", "description": "Order of multiple items", "type": "order", "state": "default_state", "id": "3204c622-dcb8-4728-869f-484b6ac73dff", "asset": { "id": "701f4af4-679c-42d1-80b5-0de5796a7514", "type": "ibm_data_product_version", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId" }
Delete an asset list identified by ID
Deletes an asset list identified by ID. The user must have permission to perform this action.
DELETE /v2/asset_lists/{asset_list_id}
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
curl --request DELETE --url '{url}/v2/asset_lists/{asset_list_id}' --header 'accept: application/json' --header 'authorization: Bearer {access_token}'
Update the metadata of an asset list
Use this API to update the asset list metadata. Specify patch operations using http://jsonpatch.com/ syntax.
Fields that can be patched:
- /name
- /description
- /type
- /state (Approvers and Functional users only)
- /message (Approvers and Functional users only)
Note: The ownership cannot be modified after creation.
PATCH /v2/asset_lists/{asset_list_id}
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Example REST body for asset list update of the state parameter.
[
{
"op": "replace",
"path": "/state",
"value": "confirmed"
}
]
The Json patch path as defined by RFC 6902.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&/\(\)\s\-\_\^"]+$
The Json patch from path as defined by RFC 6902.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&/\(\)\s\-\_\^"]+$
The Json patch value.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
The Json patch operation as defined by RFC 6902.
Allowable values: [add, replace, remove, move, copy, test]
curl --request PATCH --url '{url}/v2/asset_lists/{asset_list_id}' --header 'accept: application/json' --header 'authorization: Bearer {access_token}' --header 'content-type: application/json' --data '[{"op": "replace","path": "/state","value": "confirmed"}]'
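The patch body is a JSON array of RFC 6902 operations restricted to the patchable fields listed above. The following Python sketch shows one way to assemble and check such a body on the client; `patch_op` is a hypothetical helper, and the server remains the authority on which operations a given user may perform.

```python
import json

# Fields and operations copied from the lists in this section.
PATCHABLE_PATHS = {"/name", "/description", "/type", "/state", "/message"}
ALLOWED_OPS = {"add", "replace", "remove", "move", "copy", "test"}


def patch_op(op, path, value=None, from_path=None):
    """Build one RFC 6902 operation for an asset list (hypothetical helper)."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operation: {op!r}")
    if path not in PATCHABLE_PATHS:
        raise ValueError(f"field cannot be patched: {path!r}")
    operation = {"op": op, "path": path}
    if op in ("add", "replace", "test"):
        # RFC 6902 requires a value for these operations.
        operation["value"] = value
    elif op in ("move", "copy"):
        # RFC 6902 requires a source location for these operations.
        if from_path is None:
            raise ValueError(f"{op!r} requires from_path")
        operation["from"] = from_path
    return operation


# Same body as the curl example above.
body = json.dumps([patch_op("replace", "/state", "confirmed")])
```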
Response
The Asset list summary API model for retrieval.
The unique identifier of the asset list formatted either as a UUID or a special case such as cart_userID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
RFC 3339 timestamp when the asset list was created (system managed).
Example:
2022-04-22T03:22:32.000Z
RFC 3339 timestamp when the asset list was last updated (system managed).
Example:
2022-04-22T03:22:32.000Z
The user who last updated the asset list.
Example:
userId
Asset list name. A name can contain letters, numbers, underscores, dashes, spaces or periods. Names are mutable and reusable.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
My wish list
The type of the asset list. Valid types are: ['order', 'cart', 'wishlist']
Possible values: [cart, wishlist, order]
Example:
order
The state of the asset list. Valid states are: ['default_state', 'confirmed']. If not specified, the default value will be 'default_state'.
Possible values: [default_state, received, under_review, ready_to_deliver, rejected, in_progress, partially_delivered, succeeded, failed, workflow_exception, cancelled, redeliver, reattempt, rerequest]
Example:
default_state
An asset from a container.
Access control permissions for the asset list.
A description of the asset list.
Possible values: 1 ≤ length ≤ 4000, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
This list contains assets for project X.
A message used for communicating reasons for state changes.
Possible values: 1 ≤ length ≤ 256, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
The reason for the state change.
The properties of an asset list item.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Conflict
Unsupported Media Type
Internal Server Error
{ "name": "my_order", "description": "Order of multiple items", "type": "order", "state": "default_state", "id": "3204c622-dcb8-4728-869f-484b6ac73dff", "asset": { "id": "701f4af4-679c-42d1-80b5-0de5796a7514", "type": "ibm_data_product_version", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId" }
List the items of an asset list identified by ID
Retrieves the list items of an asset list identified by ID.
GET /v2/asset_lists/{asset_list_id}/items
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
Query Parameters
Query to retrieve a paginated list of asset list items for which the request user has permission to view.
Search query in Common Expression Language, with special characters encoded. For example, & is %26.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$%&=|\(\)\s\-\_\^"]+$
Example:
properties.data_product_id==aaff9159-8ba4-4ae5-a9d7-3e59d5903d72%26%26created_at%3E=2023-08-23T21:10:40Z
Start token for pagination
Possible values: 1 ≤ length ≤ 512, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Search result limit
Possible values: 1 ≤ value ≤ 200
Default:
200
Example:
10
This parameter can be repeated to add additional sort fields.
Default: null
Examples of sort fields (these are case insensitive):
- created_at
- -created_at
- last_update_time
- -last_update_time
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
curl --request GET --url '{url}/v2/asset_lists/{asset_list_id}/items' --header 'accept: application/json' --header 'authorization: Bearer {access_token}'
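Results are paginated: each response carries a `next` object whose `start` token is passed back as the start parameter of the following request. A minimal Python sketch of that loop follows. The `fetch_page` callable is a hypothetical stand-in for whatever HTTP client issues the GET request, and the sketch assumes the last page omits `next` (or its `start` token), which this section does not state explicitly.

```python
def iter_asset_list_items(fetch_page, limit=200):
    """Yield every item of an asset list by following pagination tokens.

    fetch_page(start, limit) must return the parsed JSON response for
    GET /v2/asset_lists/{asset_list_id}/items (hypothetical callable).
    """
    start = None
    while True:
        page = fetch_page(start, limit)
        yield from page.get("items", [])
        next_page = page.get("next")
        # Assumption: no "next" token means there are no further pages.
        if not next_page or not next_page.get("start"):
            break
        start = next_page["start"]
```

Keeping the HTTP call behind a callable makes the loop easy to exercise with canned responses before pointing it at the real endpoint.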
Response
The Asset list item collection response API model.
Collection of asset list item results.
Possible values: 0 ≤ number of items ≤ 200
The limit on the number of results returned.
Example:
200
The total count on the number of results returned, accounting for filtering but not pagination.
Example:
50
A page in a pagination collection.
The next page in the collection.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
Internal Server Error
{ "limit": 200, "total_count": 2, "first": { "href": "https://api.dataplatform.cloud.ibm.com/v2/asset_lists/3204c622-dcb8-4728-869f-484b6ac73dff/items" }, "next": { "href": "https://api.dataplatform.cloud.ibm.com/v2/asset_lists/3204c622-dcb8-4728-869f-484b6ac73dff/items?start=start_token", "start": "start_token" }, "items": [ { "asset": { "id": "8509d09e-b3d0-4e8b-9682-726a7b95f69a", "type": "data_asset", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "properties": { "my_property_1": "Value 1", "my_property_2": "Value 2" }, "id": "6357fce0-41ef-433b-95ee-03ade01bd02d", "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId", "asset_list": { "id": "3204c622-dcb8-4728-869f-484b6ac73dff" } }, { "asset": { "id": "8509d09e-b3d0-4e8b-9682-726a7b95f69b", "type": "data_asset", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "properties": { "my_property_1": "Value 1", "my_property_map": { "my_property_map_key": "Value 2" } }, "id": "cdaf6713-ede0-4a61-9fa4-d65467c59e9c", "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId", "asset_list": { "id": "3204c622-dcb8-4728-869f-484b6ac73dff" } } ] }
Retrieve an item in an asset list
Retrieves an item, identified by ID, from an asset list identified by ID.
GET /v2/asset_lists/{asset_list_id}/items/{asset_list_item_id}
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
Asset list item id (GUID)
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d73
curl --request GET --url '{url}/v2/asset_lists/{asset_list_id}/items/{asset_list_item_id}' --header 'accept: application/json' --header 'authorization: Bearer {access_token}'
Response
The Asset list item canonical API model.
The unique identifier of the asset list item formatted as a UUID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
RFC 3339 timestamp when the asset list item was created (system managed).
Example:
2022-04-22T03:22:32.000Z
RFC 3339 timestamp when the asset list item was last updated (system managed).
Example:
2022-04-22T03:22:32.000Z
The user who last updated the asset list item.
Example:
userId
The reference to the asset list to which the item belongs.
- asset_list
The unique identifier of the asset list item formatted as a UUID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
An asset from a container.
Access control permissions for the asset list.
The properties of an asset list item.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
Internal Server Error
{ "asset": { "id": "8509d09e-b3d0-4e8b-9682-726a7b95f69a", "type": "data_asset", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "properties": { "my_property_1": "Value 1", "my_property_2": "Value 2" }, "id": "6357fce0-41ef-433b-95ee-03ade01bd02d", "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId", "asset_list": { "id": "3204c622-dcb8-4728-869f-484b6ac73dff" } }
Update an asset list item
Use this API to update an asset list item. Specify patch operations using http://jsonpatch.com/ syntax.
Fields that can be patched:
- /properties
PATCH /v2/asset_lists/{asset_list_id}/items/{asset_list_item_id}
Request
Path Parameters
Asset list id (GUID, or [type]_[userId])
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d72
Asset list item id (GUID)
Possible values: length = 36, Value must match regular expression
^[\w\.,:$&\(\)\s\-\_\^"]+$
Example:
aaff9159-8ba4-4ae5-a9d7-3e59d5903d73
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Example REST body for asset list item update of the properties map.
[
{
"op": "add",
"path": "/properties/my_prop",
"value": {
"key": "Val 1"
}
},
{
"op": "replace",
"path": "/properties/my_prop2",
"value": {
"key": "Val 2"
}
}
]
The Json patch path as defined by RFC 6902.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&/\(\)\s\-\_\^"]+$
The Json patch from path as defined by RFC 6902.
Possible values: 1 ≤ length ≤ 1024, Value must match regular expression
^[\w\.,:$&/\(\)\s\-\_\^"]+$
The Json patch value.
The Json patch operation as defined by RFC 6902.
Allowable values: [add, replace, remove, move, copy, test]
curl --request PATCH --url '{url}/v2/asset_lists/{asset_list_id}/items/{asset_list_item_id}' --header 'accept: application/json' --header 'authorization: Bearer {access_token}' --header 'content-type: application/json' --data '[{"op": "add","path": "/properties/my_prop","value": {"key": "Val 1"}},{"op": "replace","path": "/properties/my_prop2/key","value": "Val 2"}]'
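The curl command above replaces the nested value at /properties/my_prop2/key, whereas the example body earlier replaces the whole /properties/my_prop2 object; both are valid JSON Patch paths. To illustrate how such paths address the properties map, here is a deliberately small Python sketch that applies only add and replace operations to plain dictionaries. It is an illustration of RFC 6902 semantics, not the service's implementation, and it ignores arrays and the ~0/~1 escape sequences.

```python
import copy


def apply_patch(document, operations):
    """Apply 'add' and 'replace' operations to nested dicts (simplified sketch)."""
    result = copy.deepcopy(document)
    for operation in operations:
        if operation["op"] not in ("add", "replace"):
            raise ValueError(f"not handled in this sketch: {operation['op']!r}")
        keys = operation["path"].lstrip("/").split("/")
        parent = result
        for key in keys[:-1]:
            parent = parent[key]
        leaf = keys[-1]
        # 'replace' requires the target to exist; 'add' creates or overwrites it.
        if operation["op"] == "replace" and leaf not in parent:
            raise KeyError(f"cannot replace missing path: {operation['path']}")
        parent[leaf] = operation["value"]
    return result


item = {"properties": {"my_prop2": {"key": "old"}}}
patched = apply_patch(item, [
    {"op": "add", "path": "/properties/my_prop", "value": {"key": "Val 1"}},
    {"op": "replace", "path": "/properties/my_prop2/key", "value": "Val 2"},
])
```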
Response
The Asset list item canonical API model.
The unique identifier of the asset list item formatted as a UUID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
RFC 3339 timestamp when the asset list item was created (system managed).
Example:
2022-04-22T03:22:32.000Z
RFC 3339 timestamp when the asset list item was last updated (system managed).
Example:
2022-04-22T03:22:32.000Z
The user who last updated the asset list item.
Example:
userId
The reference to the asset list to which the item belongs.
- asset_list
The unique identifier of the asset list item formatted as a UUID.
Example:
8509d09e-b3d0-4e8b-9682-726a7b95f69f
An asset from a container.
Access control permissions for the asset list.
The properties of an asset list item.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Unsupported Media Type
Internal Server Error
{ "asset": { "id": "8509d09e-b3d0-4e8b-9682-726a7b95f69a", "type": "data_asset", "container": { "id": "71993ab5-3374-4355-8a38-fc4c32debc00", "type": "catalog" } }, "properties": { "my_property_1": "Value 1", "my_property_2": "Value 2" }, "id": "6357fce0-41ef-433b-95ee-03ade01bd02d", "access_control": { "owner": "userId", "bss_account_id": "bss_account_id" }, "created_at": "2023-09-17T23:19:08.000Z", "last_updated_at": "2023-09-17T23:19:08.000Z", "last_updated_by": "userId", "asset_list": { "id": "3204c622-dcb8-4728-869f-484b6ac73dff" } }
Searches for relationship types
Searches for relationship types using various criteria. The query parameters here provide filters for the relationship types that are returned in the result. If no query parameters are provided, the result will include all relationship types that are visible to your BSS account. If an asset container is specified, the results are limited to relationship types where at least one end can be used with the specified asset container. This is the case if the relationship end either has no declared asset container (for example because the end references a global asset type or is untyped) or if the end has an asset container that is the same as the asset container specified. If a relname filter is specified, the results are limited to relationship types where at least one end has a relationship name in the list. In order to see relationship types where at least one end is in the container asset type(s), you must either belong to the container's BSS account or have at least editor permission on the container. All users have permission to see global relationship types.
Limit the relationship types based on the item types:
- If the asset_type is specified, the results are limited to relationship types where at least one end is either untyped or is in the specified asset type list.
- If both asset_type and related_asset_type are specified, the results are limited to relationship types where both ends are untyped, or where one end is in the specified asset type list and the other end is in the specified related asset type list.
- If the artifact_type is specified, the results are limited to relationship types where at least one end is either untyped or is in the specified artifact type list.
- If both artifact_type and related_asset_type are specified, the results are limited to relationship types where both ends are untyped, or where one end is in the specified artifact type list and the other end is in the specified related asset type list.
- If the asset_type or related_asset_type is set to *#_column, the results are limited to relationship types where one end can be used with any supported column type (currently limited to data_asset columns).
- If the item type is set to *, the results are limited to relationship types where one end can be used with any asset type or any artifact type.
Note: In CP4D environments there is just one global bss account, so all relationship types are always visible to all users.
GET /v2/asset_relationship_types
Request
Query Parameters
Required when container_id is specified and the container is not a catalog. Specifies the type of the container (CATALOG/PROJECT/SPACE). Defaults to CATALOG.
Default:
CATALOG
Container ID. This limits results to relationship types that have an endpoint with the specified ID of a catalog, project, or space. If the container is a project or space, the container_type parameter must also be specified; otherwise container_id is treated as a catalog ID. This parameter is required when querying for relationship types associated with a catalog, project, or space. If this parameter is omitted, relationship types are not filtered by their container. Note that any type with an endpoint that has no container ID will also be included in the result.
The parameter type_name is deprecated; use asset_type instead.
Asset Type Name(s). This limits the results to only relationship types that have an endpoint in the specified asset type list, or have an endpoint that is untyped.
Artifact Type Name(s). This limits the results to only relationship types that have an endpoint in the specified artifact type list, or have an endpoint that is untyped.
Related Asset Type Name(s). This parameter can only be set when either asset_type or artifact_type is also set.
When asset_type is set, this limits the results to only relationship types that have an endpoint with the specified asset type and the other endpoint with this specified asset type.
When artifact_type is set, this limits the results to only relationship types that have an endpoint with the specified artifact type and the other endpoint with this specified asset type.
Limit to use when finding relationship types.
Default:
25
Optional bookmark to use when finding relationship types.
Relationship name(s). This limits the results to relationship types that have an endpoint with the specified name. The parameter should be repeated for each allowed relationship name.
Default display name(s). This limits the results to relationship types that have an endpoint with the specified name. The parameter should be repeated for each allowed relationship name.
Specifies the scope of the relationship types, which can be GLOBAL or BSS_ACCOUNT. If GLOBAL, it returns all types scoped to Global. If BSS_ACCOUNT, it returns all types scoped to BSS_ACCOUNT. If the query parameter is omitted, it returns all.
If specified, the results are limited to relationship types that can be accessed by the specified BSS account. If container_id is also specified, the catalog's BSS account must be the same as the BSS account specified here.
If bss_account_id is not specified and the API is called with an accredited service ID access token, then there is no filtering of the results based on BSS account.
If bss_account_id is not specified, and the API is called with a regular user (i.e. non-service ID) access token, then the result depends on whether container_id is specified. If container_id is specified, then the result includes global relationship types and account-scoped relationship types for the container's BSS account. If container_id is not specified, then the result includes global relationship types, and account-scoped relationship types for all of the user's BSS accounts.
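Several of these filters can be repeated (for example, one relname per allowed relationship name), and related_asset_type is only valid together with asset_type or artifact_type. A minimal Python sketch of building such a query string follows. The function name is hypothetical, and the parameter names are taken from the prose of this section, so they should be confirmed against the live API before use.

```python
from urllib.parse import urlencode


def relationship_types_url(base_url, **filters):
    """Build a GET /v2/asset_relationship_types URL (hypothetical helper).

    List values are emitted as repeated query parameters.
    """
    params = {key: value for key, value in filters.items() if value is not None}
    if "related_asset_type" in params and not (
        "asset_type" in params or "artifact_type" in params
    ):
        raise ValueError("related_asset_type requires asset_type or artifact_type")
    query = urlencode(params, doseq=True)
    path = f"{base_url}/v2/asset_relationship_types"
    return f"{path}?{query}" if query else path


url = relationship_types_url(
    "https://api.dataplatform.cloud.ibm.com",
    container_id="71993ab5-3374-4355-8a38-fc4c32debc00",
    container_type="PROJECT",
    relname=["parent", "child"],
)
```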
Creates an asset relationship type
Use this API to create an asset relationship type. The type definition consists of two endpoints which specify the two ends of a bidirectional relationship. The endpoints define the name of the relationship at each end, and the qualified asset type or the qualified artifact type that contains the relationship endpoint. The names of the relationships that can be used with any given asset type or artifact type are required to be unique. If the qualified asset type and the qualified artifact type are omitted, that end can be used with any asset type and any artifact type in any container.
Specifying the Type
The combination of container_type, container_id, and containing_asset_type(s) controls what asset types can be used at each end of the relationship. The asset type can be a global asset type, an account-level asset type, a container-scoped asset type, or any asset type.
Global Asset Type
To define a relationship end for a global asset type, the containing_asset_type(s) field must be set to the name of the asset type and the container_id must be omitted. In this case, the relationship end can only be used with the specified global asset type. If the container_type field is set when using a global asset type, it is ignored.
Account Level Asset Type
To define a relationship end for an account-level asset type, the containing_asset_type(s) field must be set to the name of the asset type and the container_id must be omitted. In addition, the bss_account_id query parameter must be set to the BSS account that owns the account-level asset type. In this case, the relationship end can only be used with the specified account-level asset type. If the container_type field is set when using an account-level asset type, it is ignored.
Container-Scoped Asset Type
The relationship mechanism allows relationship ends to be restricted to an asset type defined in a specific catalog, project, or space. To do this, the containing_asset_type(s), container_type, and container_id fields must all be set.
Artifact Type
To define a relationship end for a governance artifact type, the containing_artifact_type(s) field must be set to the name of the artifact type and the container_id must be omitted.
The on_delete field can only be set to IGNORE on both ends.
Column Type
To define a relationship end for a column type, the containing_asset_type(s) field must be set in the format <asset_type>#_column.
When *#_column is specified, the relationship end can be used with any supported column type.
Currently, only data_asset columns are supported.
The on_delete field can only be set to IGNORE on both ends.
Any Type
To allow a relationship end to be used with any asset type, the containing_artifact_type(s), containing_asset_type(s), and container_id fields for the relationship end must be omitted. In this case, the relationship end can be used with any asset type, any artifact type, and any supported column type in any container. If the container_type field is set, it will be ignored.
Note: containing_artifact_type(s) cannot be specified on both ends.
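The rules above can be summarized as a small decision procedure over the fields of one relationship end. The Python sketch below is an illustration only: the function names are hypothetical, the plural key names `containing_asset_types` and `containing_artifact_types` are an assumption based on the "(s)" notation used in this section, and the artifact type name used in the usage note is made up. Global and account-level asset types look identical at the endpoint level, because they are distinguished by the bss_account_id query parameter rather than by a field of the end.

```python
def classify_end(end):
    """Classify one relationship end by the fields that are set (sketch)."""
    asset_types = end.get("containing_asset_types") or []
    artifact_types = end.get("containing_artifact_types") or []
    has_container = bool(end.get("container_id"))

    if artifact_types:
        if has_container:
            raise ValueError("container_id must be omitted for artifact types")
        return "artifact"
    if not asset_types:
        if has_container:
            raise ValueError("container_id must be omitted for an untyped end")
        return "any"
    if any(name.endswith("#_column") for name in asset_types):
        return "column"
    if has_container:
        if not end.get("container_type"):
            raise ValueError("container_type is required when container_id is set")
        return "container-scoped"
    return "global-or-account-level"


def check_ends(end1, end2):
    """Reject definitions with artifact types on both ends, per the note above."""
    kinds = (classify_end(end1), classify_end(end2))
    if kinds == ("artifact", "artifact"):
        raise ValueError("containing_artifact_type(s) cannot be set on both ends")
    return kinds
```

For instance, `check_ends({"containing_asset_types": ["data_asset"]}, {"containing_artifact_types": ["example_artifact"]})` returns `("global-or-account-level", "artifact")`.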
Specifying the Relationship Type Scope
A relationship type can either be global or be scoped to a BSS account.
A relationship type can be explicitly scoped to a specific BSS account using the bss_account_id query parameter. When a relationship type is scoped to a BSS account, all specified asset containers and account-level asset types must belong to that BSS account. If either endpoint for the relationship type has a specific asset container, the relationship type is automatically scoped to the BSS account for the asset container. Account-scoped relationship types can only be created if the user either belongs to the same BSS account or is an administrator of all asset containers in the relationship type definition.
By default, if neither end of the relationship type specifies an asset container, the relationship type is created in the global scope and is accessible to all users. Only service IDs with permission to create global asset types are allowed to create globally scoped relationship types. It is possible to scope a relationship type where neither end has an asset container to a specific BSS account. To do this, you must set the bss_account_id query parameter when creating the relationship type. The BSS account must be set to a BSS account that you have access to. In that case, the relationship type will be created, but its use will be restricted to asset containers associated with that BSS account.
By default, on_delete is set to IGNORE if it is not specified.
Note that in CP4D environments, there is only one BSS account (999). As a result, in that context relationship types scoped to a specific BSS account are effectively global.
Cascading clone/publish/promote to Referenced Assets
The following flags control how related assets are processed when an asset is copied using the deepcopy endpoint:
|Option |Allowed Value |Default |Description |
|-----------|----------------|----------|-----------------------------------------------------------------------------------------------------------|
|on_clone |CASCADE/IGNORE |IGNORE |If CASCADE, deepcopy from a catalog to a project causes the referenced assets to be copied as well |
|on_publish |CASCADE/IGNORE |IGNORE |If CASCADE, deepcopy from a project to a catalog causes the referenced assets to be copied as well |
|on_promote |CASCADE/IGNORE |IGNORE |If CASCADE, deepcopy from a project to a space causes the referenced assets to be copied as well |
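Read as a lookup, the table maps each copy direction to the flag that governs it. The following Python sketch expresses that mapping; the function and constant names are hypothetical, and treating any direction not listed in the table as "do not cascade" is an assumption of this sketch rather than documented behavior.

```python
# Copy direction (source container type, target container type) -> governing flag.
CASCADE_FLAG_BY_DIRECTION = {
    ("catalog", "project"): "on_clone",
    ("project", "catalog"): "on_publish",
    ("project", "space"): "on_promote",
}


def copies_referenced_assets(relationship_end, source, target):
    """Return True if a deepcopy from source to target cascades (sketch)."""
    flag = CASCADE_FLAG_BY_DIRECTION.get((source.lower(), target.lower()))
    if flag is None:
        # Assumption: directions not listed in the table never cascade.
        return False
    # Each flag defaults to IGNORE when it is not specified.
    return relationship_end.get(flag, "IGNORE") == "CASCADE"
```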
POST /v2/asset_relationship_types
Request
Query Parameters
This parameter forces the relationship type to be scoped to the specified BSS account. It allows the creation of an account-scoped relationship type where both ends are either untyped, or are global asset types. It is required if the relationship type entity references any account-level asset types. An account-scoped relationship type is only visible to users in the BSS account for the relationship type. If the query parameter is unset, such relationships will be global in scope and can only be created if the user has permission to create a global asset type. Note: the BSS account can also be set via an impersonation header. If it is set in both places, the bss_account_id query parameter has precedence. This parameter is not required if either end of the relationship type has a catalog asset type. If it is set for a relationship type that has a catalog asset type, it must be the same as the BSS account for the catalog.
Relationship Type request body
First relationship type endpoint (order is not important)
- end1
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Allowable values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Allowable values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Allowable values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Second relationship type endpoint (order is not important)
- end2
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Allowable values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Allowable values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Allowable values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Allowable values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Set of flags. These flags determine the relationship type's behavior
- flags
Description of the relationship type
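The interaction between the end-level cascade options and option_overrides described above can be illustrated with a short sketch. It assumes that an override for a matching asset type pair takes precedence over the end-level value and that IGNORE applies when nothing is set; the service's actual resolution logic is not spelled out here, and the function name effective_cascade is hypothetical. The keys this_asset_type, other_asset_type, on_delete, on_clone, on_publish and on_promote are the ones used in the patch examples in this document.

```python
from typing import Any, Dict

CASCADE_KEYS = {"on_delete", "on_clone", "on_publish", "on_promote"}


def effective_cascade(end: Dict[str, Any], key: str,
                      this_asset_type: str, other_asset_type: str) -> str:
    """Return the cascade option for one operation on one asset type pair.

    Assumption: a matching override wins over the end-level value.
    """
    if key not in CASCADE_KEYS:
        raise ValueError(f"unsupported cascade option: {key}")
    # An override for the exact asset type pair is used if it sets this option.
    for override in end.get("option_overrides", []):
        if (override.get("this_asset_type") == this_asset_type
                and override.get("other_asset_type") == other_asset_type
                and key in override):
            return override[key]
    # Otherwise fall back to the end-level value; the documented default is IGNORE.
    return end.get(key, "IGNORE")


# Illustrative end definition, not taken from a real catalog.
end1 = {
    "on_delete": "IGNORE",
    "option_overrides": [{
        "this_asset_type": "ibm_bi_report",
        "other_asset_type": "ibm_bi_report_query",
        "on_delete": "CASCADE",
    }],
}
print(effective_cascade(end1, "on_delete", "ibm_bi_report", "ibm_bi_report_query"))
print(effective_cascade(end1, "on_delete", "data_asset", "ibm_bi_report_query"))
```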
Response
relationship type metadata
- metadata
Unique ID of the relationship type
ID of the user that created the relationship type
Creation timestamp in milliseconds
Creation timestamp in yyyy-MM-dd'T'HH:mm:ss.SSSX format
Update timestamp in milliseconds
Update timestamp in yyyy-MM-dd'T'HH:mm:ss.SSSX format
ID of the user that updated the relationship type
Tenancy of the relationship type
- tenancy
Possible values: [GLOBAL, BSS_ACCOUNT]
relationship type entity
- entity
First relationship type endpoint (order is not important)
- end1
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Possible values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Possible values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Possible values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Second relationship type endpoint (order is not important)
- end2
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Possible values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Possible values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Possible values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Set of flags. These flags determine the relationship type's behavior
- flags
Description of the relationship type
Status Code
Created
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Searches for relationship type ends
Find relationship type ends that can be used with an asset type or artifact type. This API supports finding relationship type ends that use global or catalog asset types.
For an asset type:
- set asset_type. This parameter should be repeated for each allowed asset type
- All asset types in the list must either be global/account level asset types or be scoped to the asset container specified by the catalog_id, project_id, or space_id.
For a column type:
- set asset_type as <asset_type>#_column
- if asset_type is specific to a container (catalog, space, or project), then set a container ID parameter as appropriate (catalog_id, space_id, project_id).
For a governance artifact type:
- set artifact_type. This parameter should be repeated for each allowed artifact type
POST /v2/asset_relationship_types/search_ends
Request
Query Parameters
catalog_id. If no container parameters are set, any asset containers the user has access to are allowed.
project_id. If no container parameters are set, any asset containers the user has access to are allowed.
space_id. If no container parameters are set, any asset containers the user has access to are allowed.
asset_type(s). This specifies the asset type(s) whose relationship type ends should be found. This parameter should be repeated for each allowed asset type.
artifact_type(s). This specifies the artifact type(s) whose relationship type ends should be found. This parameter should be repeated for each allowed artifact type.
end_types. This filters the result to only include relationship type ends of the specified type(s) - valid filter values are {ASSET, ARTIFACT, COLUMN}. If this parameter is not specified, all relationship ends that match other criteria will be returned.
Optional limit to use when finding relationship type ends
Default:
25
Optional bookmark to use when finding relationship type ends.
orderby. This parameter can be repeated to add additional sort fields.
Default: relationship_display_name_asc
Supported sort fields (these are case insensitive):
- relationship_name_asc
- relationship_name_desc
- relationship_display_name_asc
- relationship_display_name_desc
- create_time_asc
- create_time_desc
Default:
relationship_display_name_asc
Limits results to only include the specified relationships. By default, all relationship type ends which match the other search criteria are included. Repeat the query parameter to specify multiple relationship names.
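Because asset_type and artifact_type are repeated once per value, and a column type contains a # character that has to be percent-encoded, building the query string by hand is error-prone. The following is a minimal sketch that only constructs the URL, assuming the API endpoint given at the top of this document; the helper names column_type and search_ends_url are hypothetical, and authentication and the actual POST call are left out.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.dataplatform.cloud.ibm.com"


def column_type(asset_type: str) -> str:
    """Return the column form of an asset type: <asset_type>#_column."""
    return f"{asset_type}#_column"


def search_ends_url(asset_types=(), artifact_types=(), catalog_id=None) -> str:
    """Build the search_ends URL, repeating multi-valued parameters."""
    params = []
    if catalog_id is not None:
        params.append(("catalog_id", catalog_id))
    # One asset_type / artifact_type entry per allowed type.
    params += [("asset_type", t) for t in asset_types]
    params += [("artifact_type", t) for t in artifact_types]
    # urlencode percent-encodes reserved characters such as '#'.
    return f"{BASE_URL}/v2/asset_relationship_types/search_ends?{urlencode(params)}"


print(search_ends_url(
    asset_types=["data_asset", column_type("data_asset")],
    catalog_id="c1",  # placeholder catalog ID
))
```

The same pattern would apply to project_id or space_id in place of catalog_id.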
Response
- options
Possible values: contains only unique items
Possible values: contains only unique items
Possible values: [ASSET, ARTIFACT, COLUMN]
Possible values: [RELATIONSHIP_NAME_ASC, RELATIONSHIP_NAME_DESC, RELATIONSHIP_DISPLAY_NAME_ASC, RELATIONSHIP_DISPLAY_NAME_DESC, CREATE_TIME_ASC, CREATE_TIME_DESC]
- next
Status Code
Success
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Finds relationship types that are used by an asset, column or governance artifact
Finds relationship types that are used by an asset, column or a governance artifact. This API only returns the first page of results. It does not accept a bookmark. Any additional pages of results must be fetched using the next page url included in the response.
You can refer to the options in the response to get the full list of relationship names.
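Since this API returns only the first page and expects later pages to be fetched through the next page URL, a client typically needs a small loop. The sketch below shows one way to structure it. The shape of the next element is not documented in this section, so the loop takes a caller-supplied next_url_of function; the href key and the fake pages in the demo are hypothetical, as are the function names.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_pages(first_page: Dict[str, Any],
               fetch: Callable[[str], Dict[str, Any]],
               next_url_of: Callable[[Dict[str, Any]], Optional[str]],
               max_pages: int = 100) -> Iterator[Dict[str, Any]]:
    """Yield the first page, then follow next-page URLs until none is left."""
    page = first_page
    for _ in range(max_pages):  # hard cap guards against a looping "next" link
        yield page
        url = next_url_of(page)
        if not url:
            return
        page = fetch(url)


# Demo with in-memory pages instead of HTTP calls (hypothetical response shape).
fake_pages = {"u2": {"n": 2, "next": {"href": "u3"}}, "u3": {"n": 3}}
first = {"n": 1, "next": {"href": "u2"}}
numbers = [
    page["n"]
    for page in iter_pages(first, fake_pages.__getitem__,
                           lambda p: (p.get("next") or {}).get("href"))
]
print(numbers)
```

In real use, fetch would issue an authenticated request to the given URL and return the parsed JSON body.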
POST /v2/asset_relationship_types/search_used_ends
Request
Query Parameters
Catalog ID
Project ID
Space ID
Asset ID
Artifact ID
Artifact Type
Optional limit to use when finding relationship types
Default:
25
Includes a list of the related asset types and artifact types in the result. This option adds overhead, so enable it only if the related type information is needed.
Default:
false
orderby. This parameter can be repeated to add additional sort fields.
Default: relationship_display_name_asc
Supported sort fields (these are case insensitive):
- relationship_name_asc
- relationship_name_desc
- relationship_display_name_asc
- relationship_display_name_desc
- create_time_asc
- create_time_desc
Default:
relationship_display_name_asc
Response
- options
Possible values: contains only unique items
Possible values: contains only unique items
Possible values: [ASSET, ARTIFACT, COLUMN]
Possible values: [RELATIONSHIP_NAME_ASC, RELATIONSHIP_NAME_DESC, RELATIONSHIP_DISPLAY_NAME_ASC, RELATIONSHIP_DISPLAY_NAME_DESC, CREATE_TIME_ASC, CREATE_TIME_DESC]
- next
Status Code
Success
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Patches a relationship type
Updates an existing relationship type. The patch body must be an array of json patch operations as defined in RFC 6902.
The following fields can be patched in each relationship type endpoint:
- default_display_name (value can be changed but not removed)
- localized_display_name (values can be added, changed, and removed)
- on_delete (value can be added, changed, or removed)
Relationship types where at least one endpoint is a catalog asset type can be patched either by an admin member of the catalog, or by a user in the catalog's BSS account. Global relationship types can only be patched by a user with permission to create global asset types.
Note that the patch specifies updates to apply to the current JSON representation of the relationship type. When specifying an add/replace/remove patch operation, all parent elements in the path must already exist. In particular, if the relationship type endpoint was originally created without a localized_display_name element, that element needs to be created by the patch operations.
Here are some examples of patches that can be applied:
Add a display name to an endpoint without an existing localized_display_name element:
[
  {
    "op": "add",
    "path": "/end2/localized_display_name",
    "value": {
      "en-us": "English Display Name"
    }
  }
]
Add a display name to an endpoint with an existing localized_display_name element:
[
  {
    "op": "add",
    "path": "/end2/localized_display_name/en-gb",
    "value": "British Display Name"
  }
]
Change localized display name for en-gb in end2:
[
  {
    "op": "replace",
    "path": "/end2/localized_display_name/en-gb",
    "value": "New British Display Name"
  }
]
Remove localized display name for en-gb from end1:
[
  {
    "op": "remove",
    "path": "/end1/localized_display_name/en-gb"
  }
]
Add a relationship option override, replace an existing relationship option override, delete an existing relationship option override:
[
  {
    "op": "add",
    "path": "/end1/option_overrides/-",
    "value": {
      "this_asset_type": "ibm_bi_report",
      "other_asset_type": "ibm_bi_report_query",
      "on_delete": "CASCADE",
      "on_publish": "IGNORE",
      "on_promote": "CASCADE",
      "on_clone": "IGNORE"
    }
  },
  {
    "op": "replace",
    "path": "/end1/option_overrides/1/on_delete",
    "value": "CASCADE"
  },
  {
    "op": "add",
    "path": "/end1/option_overrides/1/on_promote",
    "value": "IGNORE"
  },
  {
    "op": "remove",
    "path": "/end1/option_overrides/1/on_promote"
  }
]
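The parent-element rule described above means a client has to look at the current relationship type before deciding which operation to send. The following is a minimal sketch of that decision, assuming the caller already holds the entity part of the relationship type as a dictionary; the function name localized_name_patch is hypothetical. It also assumes the language code contains no ~ or / characters, which would otherwise need escaping in the path.

```python
import json
from typing import Any, Dict, List


def localized_name_patch(entity: Dict[str, Any], end: str,
                         language: str, name: str) -> List[Dict[str, Any]]:
    """Build JSON Patch operations that set one localized display name."""
    existing = entity.get(end, {}).get("localized_display_name")
    if existing is None:
        # Parent element is missing, so add the whole map in one operation.
        return [{"op": "add",
                 "path": f"/{end}/localized_display_name",
                 "value": {language: name}}]
    # Parent exists: "add" creates the key, "replace" requires it to exist.
    op = "replace" if language in existing else "add"
    return [{"op": op,
             "path": f"/{end}/localized_display_name/{language}",
             "value": name}]


# Illustrative entity: end1 already has a localized name, end2 has none.
entity = {"end1": {"localized_display_name": {"en-gb": "Old"}}, "end2": {}}
print(json.dumps(localized_name_patch(entity, "end2", "en-us", "English Display Name")))
print(json.dumps(localized_name_patch(entity, "end1", "en-gb", "New British Display Name")))
```

The resulting list can be serialized with json.dumps and sent as the body of the PATCH request.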
PATCH /v2/asset_relationship_types/{relationship_type_id}
Request
Path Parameters
relationship_type_id
JSON array of patch operations as defined in RFC 6902.
Note: Any ‘~’ characters need to be escaped as ~0 in the path field.
Any ‘/’ characters need to be escaped as ~1 in the path field.
For example, in `{"foo/" : {"~bar" : "value"}}`, the path for `"~bar"` is `"/foo~1/~0bar"`.
[
{
"op": "replace",
"path": "/end2/default_display_name",
"value": "new end2 display name"
},
{
"op": "replace",
"path": "/end1/default_display_name",
"value": "new end1 display name"
},
{
"op": "add",
"path": "/flags/use_for_lineage_traversal",
"value": "true or false"
},
{
"op": "add",
"path": "/flags/copy_to_knowledge_graph",
"value": "true or false"
}
]
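The escaping rule for the path field is the JSON Pointer rule from RFC 6901, and the order of the two replacements matters. A minimal sketch follows; the helper names escape_token and pointer are hypothetical.

```python
def escape_token(token: str) -> str:
    """Escape one JSON Pointer reference token (RFC 6901)."""
    # '~' must be escaped first; otherwise the '~' introduced by '~1'
    # would itself be escaped a second time.
    return token.replace("~", "~0").replace("/", "~1")


def pointer(*tokens: str) -> str:
    """Join already-unescaped tokens into a JSON Pointer path."""
    return "".join("/" + escape_token(t) for t in tokens)


print(pointer("foo/", "~bar"))  # /foo~1/~0bar
```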
Response
relationship type metadata
- metadata
Unique ID of the relationship type
ID of the user that created the relationship type
Creation timestamp in milliseconds
Creation timestamp in yyyy-MM-dd'T'HH:mm:ss.SSSX format
Update timestamp in milliseconds
Update timestamp in yyyy-MM-dd'T'HH:mm:ss.SSSX format
ID of the user that updated the relationship type
Tenancy of the relationship type
- tenancy
Possible values: [GLOBAL, BSS_ACCOUNT]
relationship type entity
- entity
First relationship type endpoint (order is not important)
- end1
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Possible values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Possible values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Possible values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Second relationship type endpoint (order is not important)
- end2
Default display name for this end of the relationship. This is what will be shown in the UI.
Internal, non-translated, name for this end of the relationship. This becomes part of the path in the APIs for manipulating and querying relationship types.
Multiplicity of this end of the relationship. This controls how many targets the relationship can point to.
Possible values: [ONE, MANY]
Map of localized display names, keyed on the ISO 639 language code (e.g. "zh")
- localized_display_name
Optionally specifies the type of the artifact that contains this relationship. If omitted with containing_artifact_types, any artifact types are allowed.
Optionally specify the types of the artifacts that contain this relationship. If omitted with containing_artifact_type, any artifact types are allowed.
Possible values: contains only unique items
Specifies the catalog, project, or space id that contains the asset type. This is only required if the asset type is scoped to a particular catalog, project, or space. It is not allowed for global asset types, for example data_asset.
Specifies the type of the container which contains the asset type. This field is required if the container_id is set. It is ignored if the container_id is unset. Allowed values: 'CATALOG', 'SPACE', 'PROJECT'
Possible values: [CATALOG, PROJECT, SPACE, EXTERNAL]
Example:
CATALOG
Whether the relationship attribute is displayed in child asset.
Possible values: [true, false]
Determines processing of the relationship target when the source object is deleted. Allowed values: CASCADE - delete the referenced object, IGNORE - do not delete the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is cloned to a target catalog. Allowed values: CASCADE - clone the referenced object, IGNORE - do not clone the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is published. Allowed values: CASCADE - publish the referenced object, IGNORE - do not publish the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Determines processing of the relationship target when the source object is promoted. Allowed values: CASCADE - promote the referenced object, IGNORE - do not promote the referenced object. Default: IGNORE
Possible values: [CASCADE, IGNORE]
Optionally specifies any override of cascade behavior on delete, clone, promote and publish based on specific asset type pairs. Note: All duplicate elements will be ignored by the option_overrides Set when provided via create or patch REST API(s).
Possible values: contains only unique items
Optionally specify the types of the assets that contain this relationship. If omitted with containing_asset_type, any asset types are allowed.
Possible values: contains only unique items
Optionally specifies the type of the asset that contains this relationship. If omitted with containing_asset_types, any asset types are allowed.
Set of flags. These flags determine the relationship type's behavior
- flags
Description of the relationship type
Status Code
Relationship type was successfully patched
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
List all asset types defined for an account, catalog, project or space.
Get all asset types in an account, catalog, project or space. Custom properties added to or updated on account-level asset types can only be fetched if the bss_account_id parameter is passed.
GET /v2/asset_types
Request
Query Parameters
This parameter allows retrieving catalog-scoped asset types. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving project-scoped asset types. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving space-scoped asset types. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving account-scoped asset types. The account may differ from the account of the user specified in the Bearer token. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter retrieves asset types that can be decorated. Provide either true or false.
Default:
FALSE
This parameter retrieves user-defined asset types that can be used only as attributes. Provide either true or false. Only one of permits_decorators and custom_attributes can be true at a time.
Default:
FALSE
Language code, such as default, en, etc.
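The rule that exactly one of catalog_id, project_id, space_id and bss_account_id must be provided can be checked on the client before the request is sent. The following is a minimal sketch of such a check; the function name asset_types_scope is hypothetical, and the IDs in the demo are placeholders.

```python
from typing import Dict, Optional


def asset_types_scope(catalog_id: Optional[str] = None,
                      project_id: Optional[str] = None,
                      space_id: Optional[str] = None,
                      bss_account_id: Optional[str] = None) -> Dict[str, str]:
    """Return the single scope query parameter, or raise if the rule is broken."""
    candidates = {
        "catalog_id": catalog_id,
        "project_id": project_id,
        "space_id": space_id,
        "bss_account_id": bss_account_id,
    }
    chosen = {name: value for name, value in candidates.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError(
            "provide exactly one of catalog_id, project_id, space_id, "
            f"bss_account_id (got {len(chosen)})"
        )
    return chosen


print(asset_types_scope(project_id="p1"))
```

The returned dictionary can be passed as the query parameters of the GET /v2/asset_types request.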
Creates an asset type in an account, catalog, project or space.
Creates an asset type in an account, catalog, project or space.
POST /v2/asset_types
Request
Query Parameters
This parameter allows the creation of a catalog-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows the creation of a project-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows the creation of a space-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows the creation of an account-scoped asset type. The account may differ from the account of the user specified in the Bearer token. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
Asset Type request body
Fields that are indexed if present in the asset.
Asset type name
Example:
connection
Asset type description
Example:
Connection asset type
External Asset Preview. Clients can make use of this to render content appropriately for this asset type.
- external_asset_preview
Base client URL
Example:
https://ibm.com
URL parameters
URL path
Example:
id
URL parameters fixed
Example:
allow_login_screen=true
Relationship definitions for this asset type.
Provides the IKC UI with information about displaying properties that can be present in the asset
- properties
Metadata about properties that can be present in an instance of a custom asset type.
Example "localized_metadata_attributes": { "name": { "default": "Data Asset", "en": "Data Asset", "fr": "Data Asset" } }
- localized_metadata_attributes
- any property
List of pointers to types this type can decorate.
Identity definition of the assets of the type
- identity
The strategy for detecting duplicate assets of this type
Allowable values: [DUPLICATE_DETECTION_BY_NAME, DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY, DUPLICATE_DETECTION_BY_NAME_AND_FOLDER, DUPLICATE_DETECTION_BY_RESOURCE_KEY, DUPLICATE_DETECTION_NOT_APPLICABLE]
List of fields made searchable through Global Search functionality
Define the behavior only if this asset type is used as an attribute.
- attribute_behavior
Remove the attribute when cloning, publishing or promoting the asset
If set to true, the type definition is a custom attribute: it can be used to decorate other asset types but cannot be used to create top-level assets of this asset type. If set to false, the type definition is a custom asset type: it can be used to create top-level assets of this asset type and can also decorate other asset types.
Asset type properties and attachments that should be filtered out for users without data access
- data_protection
Indicates if all fields of the asset type are protected. If 'all_fields' is true, then 'fields' is ignored.
Allowable values: [
true
,false
]
Indicates if all attachments of the asset type are protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Allowable values: [
true
,false
]
List of json paths that lead to the fields to be protected. e.g., "scoring.input_data", "columns[].name", "columns[0].code", "*.connection_id", etc. If 'all_fields' is true, then 'field_search_paths' is ignored.
Example:
columns[].name
List of types of attachments of the asset type to be protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Allowable values: [
MANAGED
,REFERENCED
,REMOTE
]
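The precedence rules for data_protection described above can be sketched as a small client-side helper. This is only an illustration of the documented rules (all_fields overrides field_search_paths, all_attachments overrides attachment_types); the function name effective_data_protection is hypothetical, and how the service applies the rules internally is not stated here.

```python
from typing import Any, Dict

# Attachment types listed as allowable values in this section.
ATTACHMENT_TYPES = {"MANAGED", "REFERENCED", "REMOTE"}


def effective_data_protection(dp: Dict[str, Any]) -> Dict[str, Any]:
    """Return the protection settings that actually take effect.

    Hypothetical helper: it mirrors the documented precedence, it is not
    part of the Watson Data API.
    """
    all_fields = bool(dp.get("all_fields", False))
    all_attachments = bool(dp.get("all_attachments", False))
    types = list(dp.get("attachment_types", []))

    unknown = [t for t in types if t not in ATTACHMENT_TYPES]
    if unknown:
        raise ValueError(f"unsupported attachment type(s): {unknown}")

    return {
        "all_fields": all_fields,
        "all_attachments": all_attachments,
        # Ignored when every field is protected.
        "field_search_paths": [] if all_fields else list(dp.get("field_search_paths", [])),
        # Ignored when every attachment is protected.
        "attachment_types": [] if all_attachments else types,
    }
```

For example, passing all_fields set to true together with a field_search_paths list yields an empty field_search_paths in the result, because the list has no effect in that case.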
Set it to true if this is to be used for column custom attributes. Set to false if this is used for asset types or attribute. If true, allow_decorators should be false.
Set it to true if the type is expected to support uploading of images, else set it to false.
Name/id of the svg image to be used as the icon for the to be created asset type.
The version of defined asset type
Example:
1
Response
Fields that are indexed if present in the asset.
Asset type name
Example:
connection
Asset type description
Example:
Connection asset type
External Asset Preview. Clients can make use of this to render content appropriately for this asset type.
- external_asset_preview
Base client URL
Example:
https://ibm.com
URL parameters
URL path
Example:
id
URL parameters fixed
Example:
allow_login_screen=true
Relationship definitions for this asset type.
Provides the IKC UI with information about displaying properties that can be present in the asset
- properties
Metadata about properties that can be present in an instance of a custom asset type.
Example "localized_metadata_attributes": { "name": { "default": "Data Asset", "en": "Data Asset", "fr": "Data Asset" } }
- localized_metadata_attributes
- any property
List of pointers to types this type can decorate.
Identity definition of the assets of the type
- identity
The strategy for detecting duplicate asset of the type
Possible values: [
DUPLICATE_DETECTION_BY_NAME
,DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
,DUPLICATE_DETECTION_BY_NAME_AND_FOLDER
,DUPLICATE_DETECTION_BY_RESOURCE_KEY
,DUPLICATE_DETECTION_NOT_APPLICABLE
]
List of fields made searchable through Global Search functionality
Define the behavior only if this asset type is used as an attribute.
- attribute_behavior
Remove the attribute when cloning, publishing or promoting the asset
If set to true, the type definition will be custom attributes, which can be used to decorate other asset types and cannot create this as a top level asset of this asset type. If set to false, the type definition will be custom asset type, which can be used to create top level assets of this asset type and/or also decorate other asset types.
Asset type properties and attachments that should be filtered out for users without data access
- data_protection
Indicates if all fields of the asset type are protected. If 'all_fields' is true, then 'fields' is ignored.
Possible values: [
true
,false
]
Indicates if all attachments of the asset type are protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
true
,false
]
List of json paths that lead to the fields to be protected. e.g., "scoring.input_data", "columns[].name", "columns[0].code", "*.connection_id", etc. If 'all_fields' is true, then 'field_search_paths' is ignored.
Example:
columns[].name
List of types of attachments of the asset type to be protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
MANAGED
,REFERENCED
,REMOTE
]
Set it to true if this is to be used for column custom attributes. Set to false if this is used for asset types or attribute. If true, allow_decorators should be false.
Set it to true if the type is expected to support uploading of images, else set it to false.
Name/id of the svg image to be used as the icon for the to be created asset type.
The version of defined asset type
Example:
1
The scope of defined asset type. Derived-field. Will be one of GLOBAL, ACCOUNT, CATALOG, PROJECT, or SPACE.
Example:
GLOBAL
List of pointers to types this type can be decorated by.
Status Code
Created
Accepted - indicates the asset type creation is being completed in the background
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Retrieves an asset type of a given name.
Retrieves an asset type of a given name. Custom properties added or updated on account-level asset types can be fetched only if the bss_account_id parameter is passed.
GET /v2/asset_types/{type_name}
Request
Path Parameters
Asset Type name (eg: data_asset)
Query Parameters
This parameter allows retrieving a catalog-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving a project-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving a space-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows retrieving an account-scoped asset type. May be different than account of user specified in Bearer token. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
Language code, such as default, en, etc.
This parameter can have one or more values as below:
- GLOBAL - return asset type in global scope
- BSS_ACCOUNT - return asset type in account scope
- CONTAINER - return asset type in catalog/ project/ space scope
If no value is provided, the default value is taken as null, and the asset type with the highest search scope is returned. A combination of more than one search scope is also valid. In this case, an asset type with the highest search scope from the given scopes will be returned.
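The "one scope identifier, not more than one" rule stated for the query parameters above can be checked on the client before a request is sent. The sketch below reads the rule literally as "exactly one"; the helper name scope_query is hypothetical, and only the four parameter names come from this section.

```python
from typing import Dict, Optional

# Parameter names taken from the query parameter descriptions above.
SCOPE_PARAMS = ("catalog_id", "project_id", "space_id", "bss_account_id")


def scope_query(**ids: Optional[str]) -> Dict[str, str]:
    """Return a query-parameter dict holding exactly one scope identifier.

    Hypothetical helper; it assumes the documented rule means that exactly
    one of the four identifiers must be present.
    """
    unknown = set(ids) - set(SCOPE_PARAMS)
    if unknown:
        raise ValueError(f"unknown scope parameter(s): {sorted(unknown)}")

    # Drop identifiers that were passed as None or as an empty string.
    given = {name: value for name, value in ids.items() if value}
    if len(given) != 1:
        raise ValueError("provide exactly one of: " + ", ".join(SCOPE_PARAMS))
    return given
```

The returned dict can be passed as the query string of a GET /v2/asset_types/{type_name} call with whichever HTTP client is in use.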
Response
Fields that are indexed if present in the asset.
Asset type name
Example:
connection
Asset type description
Example:
Connection asset type
External Asset Preview. Clients can make use of this to render content appropriately for this asset type.
- external_asset_preview
Base client URL
Example:
https://ibm.com
URL parameters
URL path
Example:
id
URL parameters fixed
Example:
allow_login_screen=true
Relationship definitions for this asset type.
Provides the IKC UI with information about displaying properties that can be present in the asset
- properties
Metadata about properties that can be present in an instance of a custom asset type.
Example "localized_metadata_attributes": { "name": { "default": "Data Asset", "en": "Data Asset", "fr": "Data Asset" } }
- localized_metadata_attributes
- any property
List of pointers to types this type can decorate.
Identity definition of the assets of the type
- identity
The strategy for detecting duplicate asset of the type
Possible values: [
DUPLICATE_DETECTION_BY_NAME
,DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
,DUPLICATE_DETECTION_BY_NAME_AND_FOLDER
,DUPLICATE_DETECTION_BY_RESOURCE_KEY
,DUPLICATE_DETECTION_NOT_APPLICABLE
]
List of fields made searchable through Global Search functionality
Define the behavior only if this asset type is used as an attribute.
- attribute_behavior
Remove the attribute when cloning, publishing or promoting the asset
If set to true, the type definition will be custom attributes, which can be used to decorate other asset types and cannot create this as a top level asset of this asset type. If set to false, the type definition will be custom asset type, which can be used to create top level assets of this asset type and/or also decorate other asset types.
Asset type properties and attachments that should be filtered out for users without data access
- data_protection
Indicates if all fields of the asset type are protected. If 'all_fields' is true, then 'fields' is ignored.
Possible values: [
true
,false
]
Indicates if all attachments of the asset type are protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
true
,false
]
List of json paths that lead to the fields to be protected. e.g., "scoring.input_data", "columns[].name", "columns[0].code", "*.connection_id", etc. If 'all_fields' is true, then 'field_search_paths' is ignored.
Example:
columns[].name
List of types of attachments of the asset type to be protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
MANAGED
,REFERENCED
,REMOTE
]
Set it to true if this is to be used for column custom attributes. Set to false if this is used for asset types or attribute. If true, allow_decorators should be false.
Set it to true if the type is expected to support uploading of images, else set it to false.
Name/id of the svg image to be used as the icon for the to be created asset type.
The version of defined asset type
Example:
1
The scope of defined asset type. Derived-field. Will be one of GLOBAL, ACCOUNT, CATALOG, PROJECT, or SPACE.
Example:
GLOBAL
List of pointers to types this type can be decorated by.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
No Sample Response
Replace an asset type
Replace asset attributes for the given asset type or create a new asset type if the given asset type does not exist.
PUT /v2/asset_types/{type_name}
Request
Path Parameters
Asset Type name (eg: data_asset)
Query Parameters
This parameter allows creating or updating a catalog-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows creating or updating a project-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows creating or updating a space-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows creating or updating an account-scoped asset type. May be different than account of user specified in Bearer token. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
Asset Type request body
Fields that are indexed if present in the asset.
Asset type description
Example:
Connection asset type
External Asset Preview. Clients can make use of this to render content appropriately for this asset type.
- external_asset_preview
Base client URL
Example:
https://ibm.com
URL parameters
URL path
Example:
id
URL parameters fixed
Example:
allow_login_screen=true
Relationship definitions for this asset type.
Provides the IKC UI with information about displaying properties that can be present in the asset
- properties
Metadata about properties that can be present in an instance of a custom asset type.
Example "localized_metadata_attributes": { "name": { "default": "Data Asset", "en": "Data Asset", "fr": "Data Asset" } }
- localized_metadata_attributes
- any property
List of pointers to types this type can decorate.
Identity definition of the assets of the type
- identity
The strategy for detecting duplicate asset of the type
Allowable values: [
DUPLICATE_DETECTION_BY_NAME
,DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
,DUPLICATE_DETECTION_BY_NAME_AND_FOLDER
,DUPLICATE_DETECTION_BY_RESOURCE_KEY
,DUPLICATE_DETECTION_NOT_APPLICABLE
]
List of fields made searchable through Global Search functionality
Define the behavior only if this asset type is used as an attribute.
- attribute_behavior
Remove the attribute when cloning, publishing or promoting the asset
If set to true, the type definition will be custom attributes, which can be used to decorate other asset types and cannot create this as a top level asset of this asset type. If set to false, the type definition will be custom asset type, which can be used to create top level assets of this asset type and/or also decorate other asset types.
Asset type properties and attachments that should be filtered out for users without data access
- data_protection
Indicates if all fields of the asset type are protected. If 'all_fields' is true, then 'fields' is ignored.
Allowable values: [
true
,false
]
Indicates if all attachments of the asset type are protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Allowable values: [
true
,false
]
List of json paths that lead to the fields to be protected. e.g., "scoring.input_data", "columns[].name", "columns[0].code", "*.connection_id", etc. If 'all_fields' is true, then 'field_search_paths' is ignored.
Example:
columns[].name
List of types of attachments of the asset type to be protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Allowable values: [
MANAGED
,REFERENCED
,REMOTE
]
Set it to true if this is to be used for column custom attributes. Set to false if this is used for asset types or attribute. If true, allow_decorators should be false.
Set it to true if the type is expected to support uploading of images, else set it to false.
Name/id of the svg image to be used as the icon for the to be created asset type.
Response
Fields that are indexed if present in the asset.
Asset type name
Example:
connection
Asset type description
Example:
Connection asset type
External Asset Preview. Clients can make use of this to render content appropriately for this asset type.
- external_asset_preview
Base client URL
Example:
https://ibm.com
URL parameters
URL path
Example:
id
URL parameters fixed
Example:
allow_login_screen=true
Relationship definitions for this asset type.
Provides the IKC UI with information about displaying properties that can be present in the asset
- properties
Metadata about properties that can be present in an instance of a custom asset type.
Example "localized_metadata_attributes": { "name": { "default": "Data Asset", "en": "Data Asset", "fr": "Data Asset" } }
- localized_metadata_attributes
- any property
List of pointers to types this type can decorate.
Identity definition of the assets of the type
- identity
The strategy for detecting duplicate asset of the type
Possible values: [
DUPLICATE_DETECTION_BY_NAME
,DUPLICATE_DETECTION_BY_NAME_AND_RESOURCE_KEY
,DUPLICATE_DETECTION_BY_NAME_AND_FOLDER
,DUPLICATE_DETECTION_BY_RESOURCE_KEY
,DUPLICATE_DETECTION_NOT_APPLICABLE
]
List of fields made searchable through Global Search functionality
Define the behavior only if this asset type is used as an attribute.
- attribute_behavior
Remove the attribute when cloning, publishing or promoting the asset
If set to true, the type definition will be custom attributes, which can be used to decorate other asset types and cannot create this as a top level asset of this asset type. If set to false, the type definition will be custom asset type, which can be used to create top level assets of this asset type and/or also decorate other asset types.
Asset type properties and attachments that should be filtered out for users without data access
- data_protection
Indicates if all fields of the asset type are protected. If 'all_fields' is true, then 'fields' is ignored.
Possible values: [
true
,false
]
Indicates if all attachments of the asset type are protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
true
,false
]
List of json paths that lead to the fields to be protected. e.g., "scoring.input_data", "columns[].name", "columns[0].code", "*.connection_id", etc. If 'all_fields' is true, then 'field_search_paths' is ignored.
Example:
columns[].name
List of types of attachments of the asset type to be protected. If 'all_attachments' is true, then 'attachment_types' is ignored.
Possible values: [
MANAGED
,REFERENCED
,REMOTE
]
Set it to true if this is to be used for column custom attributes. Set to false if this is used for asset types or attribute. If true, allow_decorators should be false.
Set it to true if the type is expected to support uploading of images, else set it to false.
Name/id of the svg image to be used as the icon for the to be created asset type.
The version of defined asset type
Example:
1
The scope of defined asset type. Derived-field. Will be one of GLOBAL, ACCOUNT, CATALOG, PROJECT, or SPACE.
Example:
GLOBAL
List of pointers to types this type can be decorated by.
Status Code
OK
Accepted
Bad Request
Unauthorized
Forbidden
Not Found
Conflict
Too Many Requests
Internal Server Error
No Sample Response
Deletes an asset type
Deletes an asset type in the given account or container. Note that deleting an account-scoped or container-scoped asset type triggers a background cleanup of assets:
- Assets of this type are deleted.
- For other assets, attributes of this type are removed.
DELETE /v2/asset_types/{type_name}
Request
Path Parameters
Asset Type name (eg: data_asset)
Query Parameters
This parameter allows deleting a catalog-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows deleting a project-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows deleting a space-scoped asset type. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
This parameter allows deleting an account-scoped asset type. May be different than account of user specified in Bearer token. You must provide either a catalog_id, a project_id, a space_id, OR a bss_account_id but not more than one.
Finds relationship types for an asset type
Finds relationship types that originate from a given asset type. This API supports limiting the results to a specific set of relationship types as well as providing pagination and sorting options. It also supports finding relationship types that use both global and catalog asset types.
This API is deprecated. Use POST /v2/asset_relationship_types/search_ends instead.
GET /v2/asset_types/{type_name}/relationships
Request
Path Parameters
type_name. This specifies the asset type whose relationships should be found.
Query Parameters
catalog_id. If set, this restricts catalog asset types in the result to the given catalog. If no container parameters are set, any asset containers the user has access to are allowed.
project_id. If set, this restricts catalog asset types in the result to the given project. If no container parameters are set, any asset containers the user has access to are allowed.
space_id. If set, this restricts catalog asset types in the result to the given space. If no container parameters are set, any asset containers the user has access to are allowed.
Optional limit to use when finding relationship types
Default:
25
Optional bookmark to use when finding relationship types.
orderby. This parameter can be repeated to add additional sort fields.
Default: relationship_display_name_asc
Supported sort fields (these are case insensitive):
- relationship_name_asc
- relationship_name_desc
- relationship_display_name_asc
- relationship_display_name_desc
- create_time_asc
- create_time_desc
Default:
relationship_display_name_asc
Limits results to only include the specified relationships. By default, all relationship types for the asset type are included. Repeat the query parameter to specify multiple relationship names.
Response
- options
Possible values: contains only unique items
Possible values: contains only unique items
Possible values: [
ASSET
,ARTIFACT
,COLUMN
]
Possible values: [
RELATIONSHIP_NAME_ASC
,RELATIONSHIP_NAME_DESC
,RELATIONSHIP_DISPLAY_NAME_ASC
,RELATIONSHIP_DISPLAY_NAME_DESC
,CREATE_TIME_ASC
,CREATE_TIME_DESC
]
- next
Status Code
Success
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Search for asset metadata within assets of the specified type
Use this API to search for assets of the generic asset type (asset) or any specific asset type in a Catalog, Space, or Project.
The request body must contain a query field, which specifies a Lucene query to search for assets on indexed fields.
See Searchable Asset and Attachment Fields below for the list of fields that can be searched in the query.
If query contains no selection criteria ("*:*"), then the result includes all assets that are instances of the type specified in the type_name path parameter, including assets whose primary asset type (i.e. its metadata.asset_type value) is that type, and assets that have an attribute of that type.
To filter out assets whose primary type does not match the type_name parameter, the query must include an asset.asset_type:{type_name} predicate.
The query results can be sorted by specifying a sort field in the request with format field<sort_type>. The sort_type can be either number or string (e.g. "sort" : "asset.name<string>").
Search With Pagination
The limit request body field can be specified to limit the number of assets in the search results.
The default limit is 200. The maximum limit value is 200, and any greater value is ignored.
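The query, sort, and limit rules above can be combined in a small request-body builder. This is a minimal sketch: the function build_search_body is hypothetical, and clamping limit to 200 on the client is an assumption made so that the body reflects the documented maximum (the service is only documented as ignoring greater values).

```python
from typing import Any, Dict, Optional

MAX_LIMIT = 200  # documented default and maximum for "limit"


def build_search_body(
    query: str = "*:*",
    sort_field: Optional[str] = None,
    sort_type: str = "string",
    limit: int = MAX_LIMIT,
    bookmark: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a request body for POST /v2/asset_types/{type_name}/search.

    Hypothetical helper, not part of the API.
    """
    if sort_type not in ("string", "number"):
        raise ValueError("sort_type must be 'string' or 'number'")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    body: Dict[str, Any] = {"query": query, "limit": min(limit, MAX_LIMIT)}
    if sort_field:
        # Sort format is field<sort_type>, e.g. asset.name<string>.
        body["sort"] = f"{sort_field}<{sort_type}>"
    if bookmark:
        body["bookmark"] = bookmark
    return body
```

To restrict results to assets whose primary type is data_asset, the query could be written as "asset.asset_type:data_asset", following the predicate form described above.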
Sample Request Body:
{
"query" : "asset.name:Asset*",
"limit" : 2,
"sort" : "asset.name<string>"
}
Response:
{
"next": {
"bookmark": "g1AAAXXXXXXXX",
"query" : "asset.name:Asset*",
"limit" : 2,
"sort" : "asset.name<string>"
},
"results": [
{
..asset 1...
},
{
..asset 2...
}
],
"total_rows": 3
}
When more search results are available, the response contains a next JSON object. The next object contains a bookmark along with the original query, which must be sent back to retrieve the next page of results. To paginate through the entire result set, re-execute the request with the bookmark from the next.bookmark value of the previous response until there is no next object in the response (note that this response may contain an empty results array).
Sample request body to get the next page of results:
{
"bookmark": "g1AAAXXXXXXXX",
"query" : "asset.name:Asset*",
"limit" : 2,
"sort" : "asset.name<string>"
}
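The bookmark loop described above can be sketched as a generator. To keep the example independent of any HTTP library, the caller supplies a post callable that sends one request body and returns the parsed JSON response; that callable, and the name iter_search_results, are assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Iterator

SearchCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_search_results(post: SearchCall, body: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every asset from a paginated search, one page at a time.

    `post` is a caller-supplied function (hypothetical) that performs
    POST /v2/asset_types/{type_name}/search and returns the decoded JSON.
    """
    request = dict(body)
    while True:
        response = post(request)
        # The last page may carry an empty results array.
        yield from response.get("results", [])

        next_page = response.get("next")
        if not next_page:
            return  # no "next" object: the result set is exhausted
        # Re-send the original body with the bookmark of the previous response.
        request = {**body, "bookmark": next_page["bookmark"]}
```

With an HTTP client, post would typically wrap a single call that adds the bearer token header and one of the catalog, project, or space query parameters.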
Searchable Asset and Attachment Fields
The following asset and attachment fields can be searched in the query. Note that the type suffix <string> or <number> only needs to be added for use in the sort request field:
asset.asset_attributes<string>
asset.asset_category<string>
asset.asset_state<string>
asset.asset_type<string>
asset.catalog_id<string>
asset.created_at<string>
asset.delete_timestamp<string>
asset.description<string>
asset.memberIds<string>
asset.editor_id<string>
asset.viewer_id<string>
asset.member_or_owner_id<string>
asset.memberGroupIds<string>
asset.editor_group_id<string>
asset.viewer_group_id<string>
asset.member_or_owner_group_id<string>
asset.mode<number>
asset.name<string>
asset.owner_id<string>
asset.owner_group_id<string>
asset.project_id<string>
asset.rating<number>
asset.resource_key<string>
asset.identity_key<string>
asset.source_asset_id<string>
asset.source_catalog_id<string>
asset.source_project_id<string>
asset.source_space_id<string>
asset.source_system<string>
asset.source_system_asset_id<string>
asset.source_system_created_at<string>
asset.source_system_id<string>
asset.source_history_asset_id<string>
asset.source_history_system_id<string>
asset.space_id<string>
asset.tags<string>
attachment.connection_id<string>
attachment.connection_path<string>
attachment.name<string>
attachment.object_key<string>
POST /v2/asset_types/{type_name}/search
Request
Path Parameters
Asset Type name (eg: data_asset)
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
The parameter include is deprecated, please use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Search Criteria. The "include" field in the request body is a list of asset sections to include in the search results, values are comma-separated and the supported values are "entity" or "entity,attachments".
Lucene query
Example:
*:*
Returns the number of query results for each unique value of each named field.
Examples:[ "asset.tags", "asset.asset_type" ]
Restrict results to documents with a dimension equal to the specified label. Note that, multiple values for a single key in a drilldown means an OR relation between them and there is an AND relation between multiple keys.
Examples:[ { "asset.tags": [ "tag1", "untagged" ], "asset.asset_type": [ "data_asset", "job" ] } ]
- drilldown
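The OR/AND semantics of drilldown can be illustrated with a local predicate. This sketch only models the rule stated above (values under one key are OR-ed, separate keys are AND-ed); the filtering itself is done by the service, and the function matches_drilldown and the flat document shape are assumptions.

```python
from typing import Dict, List


def matches_drilldown(doc: Dict[str, List[str]], drilldown: Dict[str, List[str]]) -> bool:
    """Return True if `doc` would pass the given drilldown filter.

    `doc` maps a field name to the values that field holds for one asset
    (a simplified, hypothetical representation).
    """
    return all(
        # OR between the values listed for a single key ...
        any(value in doc.get(key, []) for value in allowed)
        # ... and AND between the different keys.
        for key, allowed in drilldown.items()
    )


drilldown = {
    "asset.tags": ["tag1", "untagged"],
    "asset.asset_type": ["data_asset", "job"],
}
```

An asset tagged tag1 with asset type job passes this filter, while an asset tagged tag1 with asset type notebook does not, because the asset.asset_type key is not satisfied.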
Bookmark of the query result
Sort order for the query
Example:
asset.name<string>
entity
Example:
entity
Get the status of an account scope or global scope asset type.
Retrieves the status of an account-scope or global-scope asset type. If bss_account_id is not provided, the API tries to retrieve the status of the asset type at the global level, if present.
GET /v2/asset_types/{type_name}/status
Create an asset
Use this API to create an asset in catalog or project. Assets contain information about the contents of your data and how to access the data. You store asset metadata in a catalog and add collaborators from your organization to analyze data. Your data can reside in a variety of sources. For example, you can keep your data in your existing on-premises data sources, cloud data services, or streaming data feeds. By adding connection information to these remote sources in the catalog, you can allow other catalog users to access the data with the stored credentials. Alternatively, you can copy a snapshot of the remote data into the catalog's encrypted cloud storage.
All asset types have a common set of properties. Some asset types have additional properties.
When you add an asset to a catalog, you specify these common properties:
- The asset name and an optional description. The name can only contain letters, numbers, underscore, dash, space, and period. The name can't be only blank spaces.
- Privacy. You can choose to restrict access to the asset with the privacy level and asset membership.
- Public = Default. No restrictions on finding or using the asset.
- Private = Only asset members can find or use the asset.
- Members. The catalog collaborators can be added as members of the asset. Members are important if you restrict access to the asset. When you create an asset, you are the owner (and implicitly a member) of it. Members can be set in metadata/rov/collaborator_ids or metadata/rov/member_roles, but not both. Members set in metadata/rov/member_roles are given an explicit set of roles. Members can have up to two roles: OWNER and either EDITOR or VIEWER. Only service ids can assign the OWNER role. The request to add the OWNER role is ignored for other callers. Asset members with the EDITOR or VIEWER role must also be members of the catalog. Members set in metadata/rov/collaborator_ids are automatically assigned a role based on their role in the catalog.
- Tags. Metadata that makes searching for the asset easier. Tags can contain only letters, numbers, underscores, dashes, and the symbols # and @.
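The naming rules for assets and tags listed above can be checked before a create request is sent. The following is a rough sketch: it assumes "letters" and "numbers" mean ASCII letters and digits, which this section does not confirm, and the function names are hypothetical.

```python
import re

# Letters, numbers, underscore, dash, space, and period (ASCII assumed).
_NAME_RE = re.compile(r"[A-Za-z0-9_\-. ]+")
# Letters, numbers, underscores, dashes, and the symbols # and @ (ASCII assumed).
_TAG_RE = re.compile(r"[A-Za-z0-9_\-#@]+")


def is_valid_asset_name(name: str) -> bool:
    """True if the name uses only allowed characters and is not blank."""
    return bool(_NAME_RE.fullmatch(name)) and name.strip(" ") != ""


def is_valid_tag(tag: str) -> bool:
    """True if the tag uses only the allowed characters."""
    return bool(_TAG_RE.fullmatch(tag))
```

The service remains the authority on what it accepts; a check like this only catches obvious mistakes early.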
POST /v2/assets
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [
IGNORE
,REJECT
,UPDATE
,REPLACE
]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Asset metadata
- Examples:
{ "data_asset": { "mime_type": "text/csv", "dataset": false } }
- entity
AssetMetadata Model
Get one or more assets.
Use this API to retrieve one or more assets located in catalog or project or space. Access to an asset is controlled by a combination of the privacy level and the members of the asset. For a governed catalog, data assets are protected from unauthorized access by the governance policies that are defined in Data Governance. Data assets in ungoverned catalogs are not subject to governance policies.
Response includes, for each asset, asset ID, status, HTTP status and, if no errors, asset details & relationship count or list of errors. For example:
{
"resources":[
{
"asset_id":"cf38d92c-eb55-462e-89bf-cdb9fe67f0b7",
"status":"success",
"http_status":200,
"asset":{...},
"relationship_count":2
},
{
"asset_id":"c6650020-3176-4d67-8fbe-5a19ba1c2c7e",
"status":"failed",
"http_status":404,
"errors":[
...
]
},
...
]
}
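Because each entry in resources carries its own status, a caller usually has to separate the assets that were returned from those that failed. A minimal sketch, using only the field names shown in the example response above (the function name split_bulk_response is hypothetical):

```python
from typing import Any, Dict, List, Tuple


def split_bulk_response(
    payload: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, List[Any]]]:
    """Split a GET /v2/assets/bulk response into successes and failures.

    Returns (found, failed): `found` maps asset_id to the asset details,
    `failed` maps asset_id to its list of errors.
    """
    found: Dict[str, Any] = {}
    failed: Dict[str, List[Any]] = {}
    for item in payload.get("resources", []):
        if item.get("status") == "success":
            found[item["asset_id"]] = item.get("asset")
        else:
            failed[item["asset_id"]] = item.get("errors", [])
    return found, failed
```

Treating every status other than "success" as a failure is an assumption; only "success" and "failed" appear in the example.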
GET /v2/assets/bulk
Request
Query Parameters
Comma separated list of asset IDs; maximum of 20. For assets with revisions, an optional revision number can be specified using a colon character in the form of '<asset id>:<revision>,<asset id>,...'. Revision numbers are non-zero positive numbers like 1, 2, 3, etc.
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated, please use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Whether to include per-asset relationship count. Note: The default will be changed to 'false' in a later release.
Default:
true
Whether to include relationships where source end is a column of the source asset in the asset's relationship counter.
Default:
false
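The asset ID list format described in the query parameters above ('<asset id>:<revision>,<asset id>,...', at most 20 entries) can be produced with a small formatter. The helper name format_asset_ids and its input shape are assumptions for illustration.

```python
from typing import Sequence, Tuple, Union

AssetRef = Union[str, Tuple[str, int]]


def format_asset_ids(assets: Sequence[AssetRef], max_ids: int = 20) -> str:
    """Format asset references as a comma separated list.

    Each entry is either an asset ID string or an (asset_id, revision)
    tuple. Hypothetical helper, not part of the API.
    """
    if not assets:
        raise ValueError("at least one asset ID is required")
    if len(assets) > max_ids:
        raise ValueError(f"at most {max_ids} asset IDs are allowed per request")

    parts = []
    for entry in assets:
        if isinstance(entry, str):
            parts.append(entry)
            continue
        asset_id, revision = entry
        # Revision numbers are non-zero positive numbers.
        if not isinstance(revision, int) or revision < 1:
            raise ValueError(f"invalid revision for {asset_id}: {revision!r}")
        parts.append(f"{asset_id}:{revision}")
    return ",".join(parts)
```

The same formatter can be used for DELETE /v2/assets/bulk with plain IDs, since that endpoint documents the same maximum of 20.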
Marks existing assets for delete
Use this API to delete assets. You can delete an asset if you are the owner of the asset or a member of the asset with Admin or Editor permissions on the catalog or project or space.
Purge Options to Delete Assets
- purge_on_delete = true and purge_after_days is not specified: the asset is purged immediately.
- purge_on_delete = true and purge_after_days = n: the asset is moved to trash and purged after n days.
- purge_on_delete = false and purge_after_days is not specified: the asset is moved to trash and never purged.
- purge_on_delete is not specified and purge_after_days is not specified: the asset will be deleted/purged based on the catalog setting.
- purge_on_delete = false or is not specified and purge_after_days = n: invalid combination - the call will fail with a 400 Bad Request error.
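The purge option combinations above can be expressed as a single decision function, which is convenient for validating parameters before calling DELETE /v2/assets/bulk. This is a sketch of the documented rules only; the function name purge_outcome and the returned strings are made up for illustration.

```python
from typing import Optional


def purge_outcome(
    purge_on_delete: Optional[bool] = None,
    purge_after_days: Optional[int] = None,
) -> str:
    """Describe what happens to a deleted asset for a given option pair.

    None stands for "not specified". Raises ValueError for the combination
    that the API rejects with 400 Bad Request.
    """
    if purge_after_days is not None:
        if purge_on_delete is True:
            return f"moved to trash, purged after {purge_after_days} days"
        # purge_on_delete is false or not specified: invalid combination.
        raise ValueError("purge_after_days requires purge_on_delete = true")
    if purge_on_delete is True:
        return "purged immediately"
    if purge_on_delete is False:
        return "moved to trash, never purged"
    return "handled according to the catalog setting"
```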
DELETE /v2/assets/bulk
Request
Query Parameters
Comma separated list of asset IDs; maximum of 20.
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
If true, asset is also deleted from the trash.
Default:
false
Number of days after which asset will be purged. Purge On Delete should be true for this to be valid
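Because a single bulk delete call accepts at most 20 asset IDs, a longer list has to be split into batches. The sketch below builds one relative URL per batch. The query parameter name asset_ids is an assumption (the parameter name is not shown in this section), and bulk_delete_urls is a hypothetical helper.

```python
from typing import Iterator, List, Optional
from urllib.parse import urlencode

MAX_IDS_PER_DELETE = 20  # documented maximum for DELETE /v2/assets/bulk


def bulk_delete_urls(asset_ids: List[str], *,
                     catalog_id: Optional[str] = None,
                     project_id: Optional[str] = None,
                     space_id: Optional[str] = None) -> Iterator[str]:
    """Yield one relative URL per batch of at most 20 asset IDs."""
    containers = {"catalog_id": catalog_id,
                  "project_id": project_id,
                  "space_id": space_id}
    given = {key: value for key, value in containers.items() if value}
    if len(given) != 1:
        raise ValueError("provide exactly one of catalog_id, project_id, space_id")
    for start in range(0, len(asset_ids), MAX_IDS_PER_DELETE):
        batch = asset_ids[start:start + MAX_IDS_PER_DELETE]
        # "asset_ids" is an assumed parameter name, not confirmed by the source.
        query = urlencode({"asset_ids": ",".join(batch), **given})
        yield f"/v2/assets/bulk?{query}"
```

Each yielded path would be appended to the API endpoint and sent as a DELETE request with the IAM bearer token.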
Get assets membership for a specific user
Returns any memberships this user has on assets, including direct membership and any access group memberships.
GET /v2/assets/bulk/members/{member_id}
Request
Path Parameters
member_id
Query Parameters
Comma separated list of asset IDs; maximum of 25.
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Should resolve member roles from usergroup(s) and collaborator roles
Default:
false
Type of the member specified by the member_id. If not specified, the service will determine if the member is a user or an access group based on the member_id, which is not always possible or accurate. Always supplying the member type is recommended.
Allowable values: [user, group]
IAM Profile ID for Trusted Profile users: If the getMemberForAssetV2 API is called to fetch details for a Trusted Profile, the member_id is the IAM ID of the actual user, and the profile_id will be the IAM ID of the trusted user. For IBM users and App ID users, this value should be null.
Copies one or more assets and, if deep copy option is on, their related assets from a source container to a target container.
Use this API to publish one or more assets from a project to a catalog or promote from a project to a space. This API can also be used to clone one or more assets from a catalog to a project. The default maximum number of assets in a request is 50.
Copy sends the same Rabbit MQ messages as publish, promote or clone. Messages are sent for all assets that are copied.
This API supports using a refresh token in place of an access token. This can be helpful if the copy is expected to be a long running operation. The copy operation uses the access token to authenticate with other services, so it is important that it not expire while the operation is in progress. When a refresh token is specified, it is used to generate an initial access token as well as additional access tokens if the initial access token expires
POST /v2/assets/bulk_copy
Request
Query Parameters
Source catalog ID. Only one of either catalog ID or project ID must be provided.
Source project ID. Only one of either catalog ID or project ID must be provided.
Determines if the copy operation should be cascaded to referenced assets. Referenced assets are determined using bidirectional external relationships as well as in-line relationships. In-line relationship types are defined as part of asset type definitions. Cascading happens only if CASCADE is set for the following relationship type property: on_clone, in case of copying from catalog to project; on_publish, in case of copying from project to catalog; and on_promote, in case of copying from project to space.
Default:
false
Whether to automatically copy connections in remote attachments
Default:
true
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
The request body should include the target container ID and a list of assets with metadata overrides, and may include an asset permissions override. The target container ID should be either that of a catalog (publish) or of a space (promote) if the source container is a project, or that of a project (clone) if the source is a catalog. Example:
{
"catalog_id": "string",
"rov_mode": "integer",
"member_roles": {
"Member's IAM ID": {
"user_iam_id": "Member's IAM ID",
"roles": ["EDITOR"]
},
"Access Group ID": {
"access_group_id": "Access Group ID",
"roles": ["VIEWER"]
},
"Another IAM ID": {
...
}
},
"copy_configurations":[
{
"asset_id": "string",
"revision_id": "string",
"metadata_overrides": {
"name":"string", "description":"string", "tags":["string","string"]
}
},
...
]
}
Fields 'rov_mode' and 'member_roles' are optional.
- If specified, they apply to all newly copied (target) assets.
- If not supplied, newly copied asset gets mode and 'member_roles' from its corresponding source asset.
- Valid values for 'rov_mode' are 0 (public), 8 (private) and 16 (hidden).
- If caller doesn't have permissions to set any of the roles then 'member_roles' is completely ignored.
- If supplied, 'member_roles' should include at least one OWNER role.
The 'metadata_overrides' field is optional and may contain attributes that overwrite the values in the original asset. Currently only name, description and tags can be overridden. If specified, 'metadata_overrides' cannot be empty and only applies to the respective root asset.
Any cross-referenced asset is copied only once.
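The rules for 'rov_mode', 'member_roles' and 'metadata_overrides' can be checked on the client before the request is sent. The following is a minimal sketch; build_copy_body is a hypothetical helper, and only the "catalog_id" target key appears in the example above, so keys for other target containers are left to the caller.

```python
from typing import Any, Dict, List, Optional

VALID_ROV_MODES = {0, 8, 16}  # 0 = public, 8 = private, 16 = hidden
OVERRIDABLE_FIELDS = {"name", "description", "tags"}


def build_copy_body(target: Dict[str, str],
                    copy_configurations: List[Dict[str, Any]],
                    rov_mode: Optional[int] = None,
                    member_roles: Optional[Dict[str, Dict[str, Any]]] = None
                    ) -> Dict[str, Any]:
    """Assemble a bulk_copy body, rejecting values the text rules out."""
    if rov_mode is not None and rov_mode not in VALID_ROV_MODES:
        raise ValueError("rov_mode must be 0, 8 or 16")
    if member_roles is not None and not any(
            "OWNER" in member.get("roles", [])
            for member in member_roles.values()):
        raise ValueError("member_roles must include at least one OWNER role")
    for config in copy_configurations:
        if "metadata_overrides" in config:
            overrides = config["metadata_overrides"]
            if not overrides:
                raise ValueError("metadata_overrides cannot be empty when specified")
            unknown = set(overrides) - OVERRIDABLE_FIELDS
            if unknown:
                raise ValueError(f"cannot override: {sorted(unknown)}")
    body: Dict[str, Any] = dict(target)  # e.g. {"catalog_id": "..."}
    if rov_mode is not None:
        body["rov_mode"] = rov_mode
    if member_roles is not None:
        body["member_roles"] = member_roles
    body["copy_configurations"] = copy_configurations
    return body
```

Omitting rov_mode and member_roles leaves them out of the body, so the copied asset takes both from its source asset as described above.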
To assign owners and other collaborators on newly copied target assets.
- member_roles
Response
Possible values: [WAITING, IN_PROGRESS, COMPLETED, FAILED]
Status Code
OK - indicates the copy operation has completed: the assets and their related assets (if applicable) have been fully copied to the target catalog.
Accepted - indicates that the copy request is being completed in the background.
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
Create one or more assets.
Use this API to create one or more assets in a catalog or a project or a space. The default maximum number of assets in a request is 20. Example of request (POST) data:
{
"assets":[
{
"name":"<Name of the first asset>",
"description":"<Description of the first asset>",
"tags":["<Tag 1>", "<Tag 2>", ...],
"asset_type":"<Asset type such as "connection", "data_asset", etc.>",
"entity":{
...
},
"attachments":{
...
}
},
{
"name":"<Name of the second asset>",
...
},
...
]
}
Response includes, for each asset in the request and in the same order, details of newly created asset or details of updated asset if a duplicate/match is found or errors if any:
{
"trace":"<Trace ID, if failed to create or update one or more assets>",
"responses":[
{
"http_status":201,
"asset":{...}
},
{
"http_status":400,
"errors":[
...
]
},
...
]
}
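Because the responses array is in the same order as the assets array in the request, the two can be paired by position. The sketch below separates created or updated assets from failures; split_bulk_create_results is a hypothetical helper and the sample payload in the usage note is made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def split_bulk_create_results(
        request_assets: List[Dict[str, Any]],
        response: Dict[str, Any],
) -> Tuple[List[Tuple[str, Dict[str, Any]]], List[Tuple[str, int, list]]]:
    """Pair each requested asset with its result, by position."""
    succeeded, failed = [], []
    for requested, result in zip(request_assets, response["responses"]):
        name = requested.get("name", "<unnamed>")
        if "errors" in result:
            failed.append((name, result["http_status"], result["errors"]))
        else:
            succeeded.append((name, result["asset"]))
    return succeeded, failed
```

For a request with two assets where the second is rejected, the helper returns one entry in each list, and the top-level trace value can then be quoted when reporting the failure.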
POST /v2/assets/bulk_create
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
JSON containing list of assets to be created.
Asset Model
Patch one or more assets.
Use this API to patch one or more assets in catalog or project or space. The default maximum number of assets in a request is 20.
The API supports the following operations:
- Patching of allowed metadata properties. See the PATCH /v2/assets/{asset_id} API for the list of allowed metadata properties that can be patched.
- Add, update and delete of attributes such as asset_terms, data_profile, etc.
- Create, update and delete of attachments:
  - The (asset) attachments field is an array and the index of the first attachment is 0, the index of the second one is 1 and so on. Adding or deleting an attachment shifts indexes of any attachments that follow it.
  - A newly created attachment is added at the specified location. If the path is /attachments/- then it will be added at the end.
  - A newly added attachment cannot be updated or deleted (in the same API call), and a newly updated attachment cannot be deleted.
  - Some of the new attachment fields like attachment_id, url1, url2, etc. are only included in the response the first time, when the attachments are created.
  - Only the following attachment fields can be updated: name, description, mime, size, partitions and transfer_complete.
The API does not support patching of whole metadata, entity and attachments.
- That is, the path cannot be one of the following values: /metadata, /entity or /attachments.
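The restriction on whole-object paths can be checked before a bulk patch is submitted. The following is a minimal sketch; check_operations is a hypothetical helper, and the set of operation names is the one defined by RFC 6902 rather than a list confirmed for this API.

```python
from typing import Any, Dict, List

# Paths the text says cannot be patched as a whole.
FORBIDDEN_PATHS = {"/metadata", "/entity", "/attachments"}
# Operation names defined by RFC 6902; the service may accept only a subset.
RFC6902_OPS = {"add", "remove", "replace", "move", "copy", "test"}


def check_operations(operations: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for index, operation in enumerate(operations):
        op = operation.get("op")
        path = operation.get("path", "")
        if op not in RFC6902_OPS:
            problems.append(f"operation {index}: unknown op {op!r}")
        if path in FORBIDDEN_PATHS:
            problems.append(f"operation {index}: cannot patch whole {path}")
    return problems
```

A path such as /attachments/- or /metadata/description passes the check, while /attachments on its own is reported.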
Access to an asset is controlled by a combination of the privacy level and the members of the asset. For a governed catalog, data assets are protected from unauthorized access by the governance policies that are defined in Data Governance. Data assets in ungoverned catalogs are not subject to governance policies.
Note: Patch 'operations' should conform to RFC 6902 specification.
Example of request (POST) data:
{
"resources": [
{
"asset_id": "cf38d92c-eb55-462e-89bf-cdb9fe67f0b7",
"operations": [
{
"op": "add",
"path": "/metadata/description",
"value": "Four score and seven years ago..."
},
{
"op": "add",
"path": "/entity/data_asset/mime_type",
"value": "text/csv"
},
{
"op": "replace",
"path": "/attachments/1/name",
"value": "New name for the 2nd attachment."
},
{
"op": "remove",
"path": "/attachments/0"
},
{
"op":"add",
"path":"/attachments/-",
"value":{...}
},
{
"op":"replace",
"path":"/attachments/1/name",
"value":"New name for the 2nd attachment."
},
{
"op":"add",
"path":"/attachments/2",
"value":{...}
}
]
},
{
"asset_id":"c6650020-3176-4d67-8fbe-5a19ba1c2c7e",
"operations": [
...
]
},
...
]
}
Response includes, for each asset, asset ID, status and updated version or errors, if any:
{
"resources": [
{
"asset_id": "cf38d92c-eb55-462e-89bf-cdb9fe67f0b7",
"status": 200,
"http_status": 200,
"asset": {
...
}
},
{
"asset_id": "c6650020-3176-4d67-8fbe-5a19ba1c2c7e",
"status": 404,
"http_status": 404,
"errors": [
...
]
},
...
]
}
Currently the API allows patching up to twenty (20) assets per request. There is no limit on the number of patch instructions. Any number of (allowed) metadata properties and/or attributes can be:
- modified,
- added,
- replaced or
- deleted
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Note: Any ~ characters need to be escaped as ~0 in the path field, and any / characters need to be escaped as ~1 in the path field. For example, in {"foo/": {"~bar": "value"}}, the path for ~bar is /foo~1/~0bar.
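The escaping rule is the one used by JSON Pointer (RFC 6901). A minimal sketch of building an escaped path from raw key names follows; the helper names escape_token and pointer are illustrative.

```python
def escape_token(token: str) -> str:
    """Escape one key for use in a path: ~ first, then /."""
    # The order matters: escaping "/" first would turn it into "~1",
    # and the following "~" replacement would corrupt it to "~01".
    return token.replace("~", "~0").replace("/", "~1")


def pointer(*tokens: str) -> str:
    """Join raw keys into a path such as /foo~1/~0bar."""
    return "".join("/" + escape_token(token) for token in tokens)
```

With these helpers, pointer("foo/", "~bar") produces the path /foo~1/~0bar from the example above.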
POST /v2/assets/bulk_patch
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets.
- IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets.
- REJECT means the call will fail and the asset will not be changed.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
JSON containing per-asset updates/patches.
Gets the status of asset copy operation
For large deep copies, portions of the copy operation are performed in the background. This API provides a way to get the status of the background processing. It also provides additional details about deep copy operations that have already been completed.
Similar to /v2/assets/deepcopy/{deep_copy_task_id} API but response can support completion details of multiple source assets.
Note: The status of completed copy operations is only available for 24 hours.
GET /v2/assets/copy_status/{copy_task_id}
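Since background copies finish asynchronously, a client typically polls this endpoint until a terminal state is reached. The sketch below is one way to do that under stated assumptions: the response field name "status" is assumed (the response schema is not shown in this section), the state names are those listed for the bulk copy response, and fetch_status is a caller-supplied function that performs the GET and returns the parsed JSON body.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"COMPLETED", "FAILED"}  # WAITING and IN_PROGRESS keep polling


def wait_for_copy(fetch_status: Callable[[], Dict[str, Any]],
                  interval_s: float = 5.0,
                  max_polls: int = 60,
                  sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the copy task reports a terminal state or polls run out."""
    for _ in range(max_polls):
        body = fetch_status()
        # "status" is an assumed field name, not confirmed by the source.
        if body.get("status") in TERMINAL_STATES:
            return body
        sleep(interval_s)
    raise TimeoutError("copy task did not finish within the polling budget")
```

Passing the HTTP call in as a function keeps the loop independent of any particular HTTP client, and because completed status is only retained for 24 hours, the result should be recorded once it is returned.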
Create a PUT signed url for an object or a file
Use this API to create a PUT signed url to an object or a file in the catalog/project/space bucket/storage.
You must specify both name and asset_type (to create a new object or a file) OR the asset_id (to overwrite an existing object or a file) in the request.
You must specify one of these -- catalog_id, project_id or the space_id as a query parameter.
Details:
- No new meta-data (asset_id) is created or updated for this request.
- Only the PUT signed url is returned for the object or the file.
- If the asset_id is given in the request, a PUT signed url is returned for that asset's attachment's object/file, if the object/file is stored in the catalog/project/space bucket/storage. If the asset_type is given, the object/file from the first attachment with that asset_type is the one for which the PUT signed url will be returned. If the asset_type is not given, the asset's own asset_type and then "data_asset" are used, in that order, to find the attachment with the object/file.
- If the asset_id is not given in the request, and asset_type and name are specified, a PUT signed url is returned for an object_key where the object_key value is created using the combination of asset_type, sanitized name, and a random string.
- The name in the request can only contain letters, numbers, underscore, dash, space, and period -- same requirements as the attachment name.
- The value of expires_in is in seconds. The default value of expires_in is 604800 (7 days). Given a value of 0, the default value will be applied. The max value is the same as the default value; if a value larger than the max value is input, the max value will be used.
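The expires_in and name rules above can be mirrored on the client. The following is a minimal sketch: the helper names are illustrative, the name pattern assumes "letters" means ASCII letters, and negative expires_in values are not covered by the text, so the sketch assumes non-negative input.

```python
import re
from typing import Optional

DEFAULT_EXPIRES_IN = 604800  # 7 days in seconds; also the documented maximum

# Letters, numbers, underscore, dash, space and period (ASCII letters assumed).
NAME_PATTERN = re.compile(r"[A-Za-z0-9_\- .]+")


def effective_expires_in(requested: Optional[int] = None) -> int:
    """Return the lifetime the service is described as applying."""
    if not requested:  # None or 0 both fall back to the default
        return DEFAULT_EXPIRES_IN
    return min(requested, DEFAULT_EXPIRES_IN)


def is_valid_object_name(name: str) -> bool:
    """Check a name against the documented character set."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, a requested lifetime of 3600 is kept as is, while any value above 604800 is capped at 604800.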
POST /v2/assets/create_object_url
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
object_url request
Gets the status of a deep copy operation
For large deep copies, portions of the copy operation are performed in the background. This API provides a way to get the status of the background processing. It also provides additional details about deep copy operations that have already been completed.
Note: The status of completed deep copy operations is only available for 24 hours.
GET /v2/assets/deepcopy/{deep_copy_task_id}
Bulk search duplicate assets
Use this API to search for duplicate assets of an array of assets in the target catalog, project, or space. The maximum number of assets in a request is 20.
Each asset in the request can either be an existing asset, or an asset object that is the same as the user would provide for creating a new asset. E.g.,
{
"resources":[
{
"asset": {
"metadata": {},
"entity": {}
}
},
{
"project_id": "6f79ff7d-9227-4cba-81e7-726db8984d16",
"asset_id": "faad7530-4a30-41da-ac27-d3a0f07b2070"
},
{
"asset": {
"metadata": {},
"entity": {}
}
}
]
}
The response contains an array of duplicate search results. The index of the results in the array is the same as the index of the corresponding asset in the request array. Each result can be a success or failure depending on the corresponding request asset. E.g.,
{
"trace":"32s0wywr3wdqpsn86b4cfzb4i",
"resources":[
{
"http_status": 200,
"total_count": 1,
"best_updatable_duplicate_asset_id": "b81c84f5-a551-43bb-9a6e-c8e7820995f9",
"results": [
{
"asset_id": "b81c84f5-a551-43bb-9a6e-c8e7820995f9",
"highest_match_score": {},
"asset": {
"metadata": {},
"entity": {}
}
}
]
},
{
"http_status": 404,
"errors": [
{
"code": "does_not_exist",
"message": "ATTSV3024E: There is no asset with an ID of 'faad7530-4a30-41da-ac27-d3a0f07b2070' in a project with an ID of '6f79ff7d-9227-4cba-81e7-726db8984d16' ."
}
]
},
{
"http_status": 200,
"total_count": 0,
"results": [
]
}
]
}
The response for each requested item contains an http_status field that can be used to determine whether the specific request was successful. If a request failed, the corresponding response will contain an errors field indicating the cause of the failure. For a detailed explanation of the other fields in the request and response of each item, refer to the POST /v2/assets/duplicates/search endpoint.
POST /v2/assets/duplicates/bulk_search
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated; use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
The maximum number of items to return for each requested asset
Default:
100
The assets to search for duplicate assets.
the assets to search duplicates for
Search duplicate assets
Use this API to search for duplicate assets of an asset in the target catalog, project, or space.
Users can provide an existing asset like below to search for duplicate assets of the given asset in the target asset container. The authenticated user needs to have the read access to the specified asset.
{
"project_id": "6f79ff7d-9227-4cba-81e7-726db8984d16",
"asset_id": "faad7530-4a30-41da-ac27-d3a0f07b2070"
}
Or provide an asset object to search for duplicate assets of the given asset payload in the target asset container. The asset payload is the same as the user would provide for creating a new asset.
{
"asset": {
"metadata": {},
"entity": {}
}
}
The query parameter limit can be used to limit the number of duplicates in the response. If not supplied, the maximum allowed value 100 is used. Note that the API does not support pagination, as users usually only care about the top x duplicates rather than all duplicates. If they do need all of them, they can use a large limit. The number of duplicates of any given asset should be far less than the maximum allowed value of the parameter limit. The system would become unusable long before the number of duplicates reaches that value.
The duplicate assets in the response are ordered by the matching score from high to low. The higher the matching score, the more certain that the asset is a duplicate of the incoming asset.
The total_count field indicates the total number of duplicates of the given asset. This is useful in case the total number of duplicates is greater than the specified limit.
The best_updatable_duplicate_asset_id field indicates the highest ranking duplicate asset that can be updated by the authenticated user. This duplicate asset would be updated/replaced if the incoming asset were to be added in the target asset container and the effective duplicate_action were UPDATE/REPLACE. If duplicates are found but this field doesn't have a value, it indicates that the authenticated user doesn't have permission to update any of the duplicates. In such a case, if the incoming asset were to be added in the target asset container and the effective duplicate_action were UPDATE/REPLACE, the transaction would fail.
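A duplicate search result can therefore be used to predict whether adding the asset would fail. The sketch below encodes the cases described above, plus the REJECT behavior described for the duplicate_action parameter; add_would_fail is a hypothetical helper, not an API call.

```python
from typing import Any, Dict


def add_would_fail(search_result: Dict[str, Any],
                   effective_duplicate_action: str) -> bool:
    """Predict failure of an add, given one duplicate search result."""
    has_duplicates = search_result.get("total_count", 0) > 0
    if not has_duplicates:
        return False
    if effective_duplicate_action == "REJECT":
        # REJECT is documented as failing whenever a duplicate would result.
        return True
    if effective_duplicate_action in {"UPDATE", "REPLACE"}:
        # Fails when the caller cannot update any of the duplicates.
        return not search_result.get("best_updatable_duplicate_asset_id")
    return False  # IGNORE creates a new asset regardless
```

This is only a prediction from the search response; the actual outcome is decided by the service when the asset is added.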
POST /v2/assets/duplicates/search
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated; use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
The maximum number of items to return
Default:
100
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Supply either an existing asset or an asset payload to search for duplicate assets.
Id of the catalog the existing asset resides in. You must provide either a catalog id, a project id, or a space id, but not more than one
Example:
b81c84f5-a551-43bb-9a6e-c8e7820995f9
Id of the project the existing asset resides in. You must provide either a catalog id, a project id, or a space id, but not more than one
Example:
814bc4e9-a8e6-4a02-9fb8-096211f09fb5
Id of the space the existing asset resides in. You must provide either a catalog id, a project id, or a space id, but not more than one
Example:
21a374d0-9d04-48d5-884b-0f6cbfc67779
Id of the existing asset
Example:
c43efbc1-7a65-44b8-bc40-c59f008716a8
Revision of the existing asset
Example:
1,2,...,latest
The asset payload, which is the same as the user would provide for creating a new asset.
- asset
- Examples:
{ "data_asset": { "mime_type": "text/csv", "dataset": false } }
- entity
AssetMetadata Model
Response
List of matching duplicate assets
The id of the best updatable duplicate asset that would be chosen for updating/overwriting in a real persistence transaction. It is possible that there are matching duplicate assets but this field is null, which indicates the caller does not have permission to update/replace any of the matching duplicate assets
Example:
b81c84f5-a551-43bb-9a6e-c8e7820995f9
Total number of duplicates
Example:
1
Status Code
OK
Bad Request
Unauthorized
Forbidden
Not Found
Internal Server Error
No Sample Response
Get an ibm_data_source asset
Use this API to get an ibm_data_source asset with provided endpoint or physical_collection. There is only one ibm_data_source asset that meets the specified parameters. Currently we only support the following three query methods:
- host + port, the host is either ipV4 or ipV6 or a DNS name
- host + port + physical_collection
- physical_collection
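The three supported query methods can be validated before calling the endpoint. The following is a minimal sketch; the query parameter names host, port and physical_collection are taken from the wording of the list above and are assumptions about the actual parameter names, and data_source_query is a hypothetical helper.

```python
from typing import Dict, Optional, Union


def data_source_query(host: Optional[str] = None,
                      port: Optional[int] = None,
                      physical_collection: Optional[str] = None
                      ) -> Dict[str, Union[str, int]]:
    """Return query parameters for one of the three supported methods."""
    if host is not None and port is not None:
        params: Dict[str, Union[str, int]] = {"host": host, "port": port}
        if physical_collection is not None:
            params["physical_collection"] = physical_collection
        return params
    if host is None and port is None and physical_collection is not None:
        return {"physical_collection": physical_collection}
    raise ValueError(
        "use host + port, host + port + physical_collection, "
        "or physical_collection alone")
```

Supplying a host without a port, or a port without a host, is rejected because neither matches a supported method.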
GET /v2/assets/ibm_data_source
Request
Query Parameters
Catalog GUID or UID
the bss_account_id
It can be database, project_id or bucket
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Publish an asset from a project which is only referenced in that project
Use this API to publish to a target catalog an asset whose metadata exists only as a reference in a project. Assets contain information about the contents of your data and how to access the data. You store asset metadata in a catalog and add collaborators from your organization to analyze data. Your data can reside in a variety of sources. For example, you can keep your data in your existing on-premises data sources, cloud data services, or streaming data feeds. By adding connection information to these remote sources in the catalog, you can allow other catalog users to access the data with the stored credentials. Alternatively, you can copy a snapshot of the remote data into the catalog's encrypted cloud storage.
All asset types have a common set of properties. Some asset types have additional properties.
When you add an asset to a catalog, you specify these common properties:
- The asset name and an optional description. The name can only contain letters, numbers, underscore, dash, space, and period. The name can't be only blank spaces.
- Privacy. You can choose to restrict access to the asset with the privacy level and asset membership.
- Public = Default. No restrictions on finding or using the asset.
- Private = Only asset members can find or use the asset.
- Members. The catalog collaborators can be added as members of the asset. Members are important if you restrict access to the asset. When you create an asset, you are the owner (and a member) of it.
- Tags. Metadata that makes searching for the asset easier. Tags can contain only letters, numbers, underscores, dashes, and the symbols # and @.
POST /v2/assets/publish
Request
Query Parameters
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Asset metadata
Publish Asset Metadata
- entity
Get an asset
Use this API to retrieve an asset located in catalog or project. Access to an asset is controlled by a combination of the privacy level and the members of the asset. For a governed catalog, data assets are protected from unauthorized access by the governance policies that are defined in Data Catalog. Data assets in ungoverned catalogs are not subject to governance policies.
GET /v2/assets/{asset_id}
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated; use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Marks an existing asset for delete
Use this API to delete an existing asset.
Permissions Required to Delete an Asset
To delete an asset in a project or space, all of the following must hold:
- Caller must have the EDITOR or ADMIN role in the project/space
To delete a public asset in a catalog, any of the following must hold:
- Caller must have the ADMIN role in the catalog
- Caller must have the EDITOR role in the catalog and caller must have either the OWNER or EDITOR role on the asset
To delete a private asset in a catalog, all of the following must hold:
- Caller must have either the ADMIN or EDITOR role in the CATALOG
- Caller must have either the OWNER, EDITOR, or VIEWER role in the asset
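The permission rules above can be expressed as a single predicate. The following is a minimal sketch for reasoning about them on the client; can_delete and its argument names are illustrative, and hidden assets are not covered by the rules above, so the sketch rejects any privacy value other than public or private.

```python
from typing import Iterable


def can_delete(container: str, container_role: str,
               privacy: str = "public",
               asset_roles: Iterable[str] = ()) -> bool:
    """Apply the documented delete rules for projects, spaces and catalogs."""
    roles = set(asset_roles)
    if container in ("project", "space"):
        return container_role in {"EDITOR", "ADMIN"}
    if container != "catalog":
        raise ValueError("container must be project, space or catalog")
    if privacy == "public":
        # Any one of: catalog ADMIN, or catalog EDITOR with asset OWNER/EDITOR.
        return container_role == "ADMIN" or (
            container_role == "EDITOR" and bool(roles & {"OWNER", "EDITOR"}))
    if privacy == "private":
        # Both: catalog ADMIN/EDITOR and some role on the asset.
        return (container_role in {"ADMIN", "EDITOR"}
                and bool(roles & {"OWNER", "EDITOR", "VIEWER"}))
    raise ValueError("privacy must be public or private")
```

Note the asymmetry the rules imply: a catalog ADMIN can delete any public asset, but needs a role on a private asset to delete it.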
Purge Options to Delete an Asset
- purge_on_delete = true and purge_after_days is not specified: the asset is purged immediately.
- purge_on_delete = true and purge_after_days = n: the asset is moved to trash and purged after n days.
- purge_on_delete = false and purge_after_days is not specified: the asset is moved to trash and never purged.
- purge_on_delete is not specified and purge_after_days is not specified: the asset will be deleted/purged based on the catalog setting.
- purge_on_delete = false or is not specified and purge_after_days = n: invalid combination - the call will fail with a 400 Bad Request error.
DELETE /v2/assets/{asset_id}
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
If true, asset is also deleted from the trash.
Default:
false
Number of days after which asset will be purged. Purge On Delete should be true for this to be valid
Update an asset
Overview
Use this API to update certain metadata fields in assets.
Fields that can be patched:
- /metadata/name
- /metadata/description
- /metadata/tags
- /metadata/origin_country
- /metadata/resource_key
- /metadata/source_system/last_modification_timestamp, expected format: yyyy-MM-ddTHH:mm:ssX
- /metadata/child_source_systems/{idx}/last_modification_timestamp, expected format: yyyy-MM-ddTHH:mm:ssX
- /metadata/source_system_history - array, use the add operation with path /metadata/source_system_history/- to append a new element
- /metadata/rov/mode - allowed values are 0 (public), 8 (private) or 16 (hidden)
- /metadata/rov/collaborator_ids
- /metadata/rov/member_roles
- /metadata/rov/member_roles/{iam id}/roles - array, allowed values are OWNER/EDITOR/VIEWER, see below for details
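A patch body for some of these fields can be assembled as a list of RFC 6902 operations. The sketch below is illustrative only: asset_patch and format_timestamp are hypothetical helpers, the "add" operation is used because RFC 6902 defines it as replacing an existing object member, and the timestamp is rendered in UTC with a trailing Z as one valid instance of the yyyy-MM-ddTHH:mm:ssX pattern.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def format_timestamp(moment: datetime) -> str:
    """Render a timezone-aware datetime as yyyy-MM-ddTHH:mm:ssX in UTC."""
    if moment.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def asset_patch(name: Optional[str] = None,
                tags: Optional[List[str]] = None,
                last_modified: Optional[datetime] = None
                ) -> List[Dict[str, Any]]:
    """Build patch operations for a few of the patchable metadata fields."""
    operations: List[Dict[str, Any]] = []
    if name is not None:
        operations.append({"op": "add", "path": "/metadata/name", "value": name})
    if tags is not None:
        operations.append({"op": "add", "path": "/metadata/tags", "value": tags})
    if last_modified is not None:
        operations.append({
            "op": "add",
            "path": "/metadata/source_system/last_modification_timestamp",
            "value": format_timestamp(last_modified),
        })
    return operations
```

The resulting list would be serialized as JSON and sent as the body of the PATCH request.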
Updating Asset Collaborators and Owners
Either the metadata/rov/collaborator_ids field or the metadata/rov/member_roles field can be updated, but not both.
The collaborator_ids field shown in the asset is derived from the information in member_roles, so updating one of these fields will automatically update the other. The collaborator_ids field shows the members in the member_roles field that have either the asset editor role or the asset viewer role. This field is present for backward compatibility with earlier versions of the catalog service.
Collaborators added to the metadata/rov/collaborator_ids field will be assigned a default asset role based on their role in the catalog. Catalog editors will be assigned the asset editor role, and Catalog viewers will be assigned the asset viewer role.
Users added to the metadata/rov/member_roles field must have either the OWNER role, the OWNER role and EDITOR role, or the OWNER role and VIEWER role. Other combinations of roles are not allowed. Asset users with the VIEWER or EDITOR role must be members of the catalog. In addition, asset users with the OWNER role must also be members of the catalog with either the EDITOR or ADMIN role unless certain service ids are used. For projects and spaces, new owners must have a role in the project/space.
Permissions Required to Modify Asset
To update non-governed fields in an asset in a project or space, all of the following must hold:
- Caller must have the EDITOR or ADMIN role in the project/space
To update non-governed fields in a private asset in a catalog, all of the following must hold:
- Caller must have either the ADMIN or EDITOR role in the CATALOG
- Caller must have either the OWNER, EDITOR, or VIEWER role in the asset
To update non-governed fields in a public asset in a catalog, any of the following must hold:
- Caller must have the ADMIN role in the CATALOG
- Caller must have the EDITOR role in the CATALOG and caller must have either the OWNER or EDITOR role in the asset
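The three rule sets above can be read as a single decision function. The sketch below is only a model of those rules for reasoning and testing on the client side; the function name and its parameters are hypothetical, and the service performs its own authoritative checks.

```python
def can_update_non_governed_fields(container_type, container_role,
                                   asset_roles=frozenset(), private=False):
    """Model of the documented rules for updating non-governed fields.

    container_type: "project", "space" or "catalog"
    container_role: the caller's role in that container, e.g. "EDITOR"
    asset_roles:    the caller's roles on the asset itself
    private:        True for a private catalog asset, False for a public one
    """
    asset_roles = set(asset_roles)
    if container_type in ("project", "space"):
        return container_role in {"EDITOR", "ADMIN"}
    if container_type != "catalog":
        raise ValueError("unknown container type: %r" % (container_type,))
    if private:
        # All conditions must hold for a private catalog asset.
        return (container_role in {"ADMIN", "EDITOR"}
                and bool(asset_roles & {"OWNER", "EDITOR", "VIEWER"}))
    # Any one condition is enough for a public catalog asset.
    if container_role == "ADMIN":
        return True
    return (container_role == "EDITOR"
            and bool(asset_roles & {"OWNER", "EDITOR"}))
```

For example, a catalog ADMIN with no role on a private asset is rejected by this model, while the same caller is accepted for a public asset.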
Permissions required to modify governed metadata fields
The "tags" field is the only governed field. For governed catalogs, a policy check is made to determine whether updates to it are allowed and whether an asset transformation is needed. The outcome of the policy check can be "ALLOW", "DENY", or "TRANSFORM".
- If the outcome is "ALLOW", the operation is allowed if the caller has either the "ADMIN" or "EDITOR" catalog role
- If the outcome is "TRANSFORM", the permission checks described above for non-governed fields are applied
- If the outcome is "DENY", the permission checks described above for non-governed fields are applied
Allowed Operations
The combination of asset roles and catalog roles determines what asset operations are allowed by users.
These abilities apply to public assets:
- All members of the catalog can find the asset and see its properties.
- All members of the catalog who have the Editor, Auditor, or Admin roles can use the asset.
These abilities apply to private assets:
- All asset collaborators can find the asset and see its properties. Asset collaborators with the Editor, Auditor, or Admin role can use the asset.
PATCH /v2/assets/{asset_id}
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json]
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets.
- IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets.
- REJECT means the call will fail and the asset will not be changed.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Note:
Any ‘~’ characters need to be escaped as ~0 in the path field.
Any ‘/’ characters need to be escaped as ~1 in the path field.
For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is "/foo~1/~0bar".
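The escaping rule in the note can be sketched in a few lines of Python. The helper names are hypothetical; the replacement order matters, because '~' has to be escaped before '/' so that the '~' introduced by '~1' is not escaped a second time.

```python
def escape_pointer_token(token):
    """Escape one path segment: '~' becomes '~0', then '/' becomes '~1'."""
    return token.replace("~", "~0").replace("/", "~1")


def build_pointer(*tokens):
    """Join already-unescaped segments into a path for a patch operation."""
    return "".join("/" + escape_pointer_token(token) for token in tokens)


# Reproduces the example from the note above.
print(build_pointer("foo/", "~bar"))
```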
Clone an asset
Use this API to clone a catalog asset to a project. This creates a new copy of the asset metadata, including asset attachments.
POST /v2/assets/{asset_id}/clone
Request
Path Parameters
asset_id
Query Parameters
catalog_id must be provided
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Request body should include project ID, and may include asset permissions override.
{
"project_id": "string",
"member_roles": {
"Member's IAM ID": {
"user_iam_id": "Member's IAM ID",
"roles": ["EDITOR"]
},
"Access Group ID": {
"access_group_id": "Access Group ID",
"roles": ["VIEWER"]
},
"Another IAM ID": {
...
}
}
}
Field 'member_roles' is optional.
- If specified, 'member_roles' apply to newly cloned asset.
- If not supplied, newly cloned asset gets 'member_roles' from its source asset.
- If caller doesn't have permissions to set any of the roles then 'member_roles' is completely ignored.
- If supplied, 'member_roles' should include at least one OWNER role.
- If caller has only EDITOR role on the source asset and if 'member_roles' does not include all the owners of the source asset then it will be ignored.
'project_id' is the target project id.
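A client can apply the "at least one OWNER" rule before sending the request. The following is a minimal sketch under that assumption; build_clone_body is a hypothetical helper, and the entry layout follows the sample request body shown above.

```python
def build_clone_body(project_id, member_roles=None):
    """Assemble a clone request body, checking the OWNER rule locally.

    member_roles maps an IAM ID or access group ID to an entry that
    carries a "roles" list, as in the sample body.
    """
    if not project_id:
        raise ValueError("project_id is required")
    body = {"project_id": project_id}
    if member_roles is not None:
        has_owner = any("OWNER" in entry.get("roles", [])
                        for entry in member_roles.values())
        if not has_owner:
            raise ValueError("member_roles should include at least one OWNER")
        body["member_roles"] = member_roles
    return body
```

Omitting member_roles leaves the field out of the body, in which case the cloned asset takes its member_roles from the source asset as described above.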
Optional owners and other collaborators to assign to newly copied target assets
- member_roles
Set the privacy level mode to be Public (0), Private (8), or Hidden (16)
Add/Update asset collaborators
Overview
This API adds and removes users as collaborators and owners on an asset. A collaborator or owner is identified by user ID. Note that access groups are not supported as asset collaborators or owners. Access groups can be configured as catalog members.
Either the metadata/rov/collaborator_ids field or the metadata/rov/member_roles field can be updated, but not both.
The collaborator_ids field shown in the asset is derived from the information in member_roles, so updating one of these fields will automatically update the other. The collaborator_ids field shows the members in the member_roles field that have either the asset editor role or the asset viewer role. This field is present for backward compatibility with earlier versions of the catalog service.
Collaborators added to the metadata/rov/collaborator_ids field will be assigned a default asset role based on their role in the catalog. Catalog editors will be assigned the asset editor role, and Catalog viewers will be assigned the asset viewer role.
Users added to the metadata/rov/member_roles field must have either the OWNER role, the OWNER role and EDITOR role, or the OWNER role and VIEWER role. Other combinations of roles are not allowed. Asset users with the VIEWER or EDITOR role must be members of the catalog. In addition, asset users with the OWNER role must also be members of the catalog with either the EDITOR or ADMIN role unless certain service IDs are used. For projects and spaces, new owners must have a role in the project/space.
Permissions Required to Modify Asset Members
To add or remove asset owners for a private asset in a catalog, any of the following must hold:
- Caller must have OWNER role on the asset
- Caller must have the ADMIN role on catalog and caller must have either the OWNER, EDITOR, or VIEWER role in the asset
To add or remove asset owners for a public asset in a catalog, any of the following must hold:
- Caller must have OWNER role on the asset
- Caller must have the ADMIN role in the catalog
To add or remove asset collaborators, other than asset owner, for a private asset in a catalog, any of the following must hold:
- Caller must have OWNER role on the asset
- Caller must have EDITOR role on the asset
- Caller must have the ADMIN role on catalog and caller must have either the OWNER, EDITOR, or VIEWER role in the asset
To add or remove asset collaborators, other than asset owner, for a public asset in a catalog, any of the following must hold:
- Caller must have OWNER role on the asset
- Caller must have EDITOR role on the asset
- Caller must have the ADMIN role on catalog
Allowed Operations
The combination of asset roles and catalog roles determines what asset operations are allowed by users.
These abilities apply to public assets:
- All members of the catalog can find the asset and see its properties.
- All members of the catalog who have the Editor, Auditor, or Admin catalog roles can use the asset.
These abilities apply to private assets:
- All asset collaborators can find the asset and see its properties. Asset collaborators with the Editor, Auditor, or Admin catalog role can use the asset.
PATCH /v2/assets/{asset_id}/collaborators
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json, application/merge-patch+json]
Path Parameters
Asset GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
[{
"op": "add",
"path": "/metadata/rov/collaborator_ids/test-iam-id",
"value": {
"user_id": "test@us.ibm.com",
"user_iam_id": "test-iam-id"
}
},
{
"op": "replace",
"path": "/metadata/rov/collaborator_ids/test-iam-id",
"value": {
"user_id": "test2-iam-id",
"user_iam_id": "test-iam-id"
}
},
{
"op": "remove",
"path": "/metadata/rov/collaborator_ids/test2-iam-id"
}
]
(DEPRECATED)
[{
"op": "add",
"path": "/metadata/rov/collaborators/test@us.ibm.com",
"value": {
"user_id": "test@us.ibm.com"
}
},
{
"op": "replace",
"path": "/metadata/rov/collaborators/test@us.ibm.com",
"value": {
"user_id": "test2@us.ibm.com"
}
},
{
"op": "remove",
"path": "/metadata/rov/collaborators/test2@us.ibm.com"
}
]
Note: Any ‘~’ characters need to be escaped as ~0 in the path field.
Any ‘/’ characters need to be escaped as ~1 in the path field.
For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is /foo~1/~0bar.
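The add and remove operations from the first (non-deprecated) sample can be generated programmatically. The sketch below assumes the path and value layout shown in that sample; the helper names are hypothetical, and the IAM ID is escaped according to the note above before it is placed in the path.

```python
import json


def _escape(token):
    # '~' must be escaped before '/' (see the note above).
    return token.replace("~", "~0").replace("/", "~1")


def add_collaborator_op(user_iam_id, user_id=None):
    """Build an 'add' operation for one collaborator."""
    value = {"user_iam_id": user_iam_id}
    if user_id is not None:
        value["user_id"] = user_id
    return {
        "op": "add",
        "path": "/metadata/rov/collaborator_ids/" + _escape(user_iam_id),
        "value": value,
    }


def remove_collaborator_op(user_iam_id):
    """Build a 'remove' operation for one collaborator."""
    return {
        "op": "remove",
        "path": "/metadata/rov/collaborator_ids/" + _escape(user_iam_id),
    }


patch = [
    add_collaborator_op("test-iam-id", "test@us.ibm.com"),
    remove_collaborator_op("test2-iam-id"),
]
print(json.dumps(patch, indent=2))
```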
Deep copies an asset and its related assets to a catalog, project, or space
Use this API to deep copy an asset to a project, catalog, or space. Unlike clone, publish, and promote, this API enables cascading the copy operation to referenced assets.
When copying from a project to a space, the on_promote relationship option is used to determine whether the related assets for a given relationship type should be copied. In addition, you must have Admin or Editor permissions on both the project and the space.
Deep copy sends the same RabbitMQ messages as clone, publish, and promote. Messages are sent for all assets that are copied.
This API supports copying from a project to a catalog (publish), from a project to a space (promote), and from a catalog to a project (clone).
This API supports using a refresh token in place of an access token. This can be helpful if the deep copy is expected to be a long-running operation. The deep copy uses the access token to authenticate with other services, so it is important that the token does not expire while the operation is in progress.
POST /v2/assets/{asset_id}/deepcopy
Request
Path Parameters
asset_id
Query Parameters
project_id
catalog_id
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Request body should include target (container) ID, and may include asset permissions and metadata overrides. Fields 'mode' and 'member_roles' are optional.
- If specified, they apply to newly copied (target) assets.
- If not supplied, newly copied asset gets mode and 'member_roles' from its corresponding source asset.
- Valid values for 'mode' are 0 (public), 8 (private) and 16 (hidden).
- If caller doesn't have permissions to set any of the roles then 'member_roles' is completely ignored.
- If supplied, 'member_roles' should include at least one OWNER role.
'space_id' is the target space ID i.e. target container ID.
'metadata' is optional and may contain attributes to overwrite the values in the original asset; currently only name, description and tags may be overwritten.
requestBody in json
{
"space_id": "string",
"mode": 0,
"member_roles": {
"Member's IAM ID": {
"user_iam_id": "Member's IAM ID",
"roles": [
"EDITOR"
]
},
"Access Group ID": {
"access_group_id": "Access Group ID",
"roles": [
"VIEWER"
]
},
"Another IAM ID": {
"iam_id": "Another IAM ID",
"roles": [
"role"
]
}
},
"metadata": {
"name": "string",
"description": "string",
"tags": [
"string",
"string"
]
}
}
Response
Status Code
OK - indicates the deep copy operation has completed - the asset and its related assets (if applicable) have been fully copied to the target container
Accepted - indicates that the copy request is being completed in the background.
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
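Because the deep copy can finish synchronously (OK) or continue in the background (Accepted), a caller typically needs to branch on the status code. The numeric codes are not shown in the list above, so the sketch below assumes the standard HTTP values for those reason phrases (200, 202, 400, 401, 403, 500); the function name and the returned labels are hypothetical.

```python
from http import HTTPStatus

_FAILURES = {
    HTTPStatus.BAD_REQUEST,
    HTTPStatus.UNAUTHORIZED,
    HTTPStatus.FORBIDDEN,
    HTTPStatus.INTERNAL_SERVER_ERROR,
}


def classify_deepcopy_status(code):
    """Map a response status code to a coarse outcome label."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        return "unexpected"
    if status is HTTPStatus.OK:
        return "completed"      # everything has been copied
    if status is HTTPStatus.ACCEPTED:
        return "in_progress"    # copy continues in the background
    if status in _FAILURES:
        return "failed"
    return "unexpected"
```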
Update the owner of an asset
Use this API to assign a new owner to an asset. You must be the current owner of the asset or a collaborator of the asset with Admin permissions to change the owner. If the asset has multiple owners and the specified user is not currently an owner, the last owner in the owner list (which is the one shown in the owner_id field) is replaced with the specified owner. If the specified user is already an owner of the asset, the user is moved to the end of the owner list so that it is reported in the owner_id field.
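The effect on the owner list described above can be modeled with a short function. This is only a sketch of the documented behavior for reasoning about results; the function name is hypothetical, and the handling of an empty owner list is an assumption that the description does not cover.

```python
def assign_owner(owners, new_owner):
    """Return the owner list after assigning new_owner.

    The last element is the one reported in the owner_id field.
    """
    owners = list(owners)
    if new_owner in owners:
        # Already an owner: move to the end of the list.
        owners.remove(new_owner)
        owners.append(new_owner)
    elif owners:
        # Not yet an owner: replace the last owner.
        owners[-1] = new_owner
    else:
        # Assumption: an empty list simply gains the new owner.
        owners.append(new_owner)
    return owners


print(assign_owner(["a", "b", "c"], "d"))
print(assign_owner(["a", "b", "c"], "a"))
```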
This API is deprecated because it is an older API that supports only a single asset owner, not assets with multiple owners. Going forward, owners should be created or updated using one of the following APIs:
/v2/assets/{asset_id}
or
/v2/assets/{asset_id}/collaborators
PUT /v2/assets/{asset_id}/owner
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Asset Owner
Example:
test@us.ibm.com
Example:
IBMid-310002980
Update privacy settings of an asset
Use this API to change privacy settings on an asset.
- The owner of the asset or asset collaborators with the Admin role can change the owner of the asset or delete the asset.
- Asset collaborators with the Editor, Auditor, or Admin role can change the asset members or the privacy setting.
PUT /v2/assets/{asset_id}/perms
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Asset ROV
Promote an asset
Use this API to promote project assets to a space. You must have Admin or Editor permissions on both the project and the space.
POST /v2/assets/{asset_id}/promote
Request
Path Parameters
asset_id
Query Parameters
project_id must be provided
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Request body should include space ID, and may include asset permissions and metadata overrides. Fields 'mode' and 'member_roles' are optional.
- If specified, they apply to newly promoted asset.
- If not supplied, newly promoted asset gets mode and 'member_roles' from source asset.
- Valid values for 'mode' are 0 (public), 8 (private) and 16 (hidden).
- If caller doesn't have permissions to set any of the roles then 'member_roles' is completely ignored.
- If supplied, 'member_roles' should include at least one OWNER role.
- If caller has only EDITOR role on the source asset and if 'member_roles' does not include all the owners of the source asset then it will be ignored.
'space_id' is the target space id.
'metadata' is optional and may contain attributes to overwrite the values in original asset; currently only name, description and tags may be overwritten.
requestBody in json
{
"space_id": "string",
"mode": 0,
"member_roles": {
"Member's IAM ID": {
"user_iam_id": "Member's IAM ID",
"roles": [
"EDITOR"
]
},
"Access Group ID": {
"access_group_id": "Access Group ID",
"roles": [
"VIEWER"
]
},
"Another IAM ID": {
"iam_id": "Another IAM ID",
"roles": [
"role"
]
}
},
"metadata": {
"name": "string",
"description": "string",
"tags": [
"string",
"string"
]
}
}
Publish an asset
Use this API to publish project assets to a catalog. You can publish data assets from a project into a catalog. You must have Admin or Editor permissions on both the project and the catalog.
POST /v2/assets/{asset_id}/publish
Request
Path Parameters
asset_id
Query Parameters
project_id must be provided
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Action to take if the call would result in a duplicate asset.
- IGNORE means the call will ignore the duplicate and create a new asset.
- REJECT means the call will fail and no asset will be created.
- UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
- REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Request body should include catalog ID, and may include asset permissions and metadata overrides. Fields 'mode' and 'member_roles' are optional.
- If specified, they apply to newly published asset.
- If not supplied, newly published asset gets mode and 'member_roles' from source asset.
- Valid values for 'mode' are 0 (public), 8 (private) and 16 (hidden).
- If caller doesn't have permissions to set any of the roles then 'member_roles' is completely ignored.
- If supplied, 'member_roles' should include at least one OWNER role.
- If caller has only EDITOR role on the source asset and if 'member_roles' does not include all the owners of the source asset then it will be ignored.
'catalog_id' is the target catalog id. To support backwards compatibility when it is not supplied, the asset is published to the catalog associated with the project.
'metadata' is optional and may contain attributes to overwrite the values in original asset; currently only name, description and tags may be overwritten.
requestBody in json
{
"catalog_id": "string",
"mode": 0,
"member_roles": {
"Member's IAM ID": {
"user_iam_id": "Member's IAM ID",
"roles": [
"EDITOR"
]
},
"Access Group ID": {
"access_group_id": "Access Group ID",
"roles": [
"VIEWER"
]
},
"Another IAM ID": {
"iam_id": "Another IAM ID",
"roles": [
"role"
]
}
},
"metadata": {
"name": "string",
"description": "string",
"tags": [
"string",
"string"
]
}
}
Finds related assets
Finds assets related to a given asset or a given governance artifact. This API supports limiting the results to a specific set of relationships as well as providing pagination and sorting options.
Finds assets and artifacts related to an asset:
- One of the {catalog_id, project_id, space_id} must be specified.
- asset_id must be specified.
- artifact_id and artifact_type cannot be specified.
Finds assets related to an artifact:
- artifact_id and artifact_type must be specified.
- asset_id and container id cannot be specified.
Finds assets and artifacts related to a column:
- One of the {catalog_id, project_id, space_id} must be specified.
- asset_id must be specified. asset_id should be in the format of <asset_id>#COLUMN#<column_identifier>
- artifact_id and artifact_type cannot be specified.
Note: When self references are present (i.e. when the source and target asset in the relationship are the same), more rows than specified by the limit may be returned. When this happens, the total rows reported is accurate and the bookmark points to the next page of results.
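Since a page may hold more rows than the limit, a client should follow the bookmark until none is returned rather than counting rows. The sketch below shows that loop in isolation. The fetch_page callable is hypothetical: it is assumed to call the endpoint with the given bookmark and to return a (rows, next_bookmark) pair, because the response field names are not described here.

```python
def iterate_relationships(fetch_page):
    """Yield every row across pages by following bookmarks.

    fetch_page(bookmark) -> (rows, next_bookmark); next_bookmark is
    None or empty when there are no more pages.
    """
    bookmark = None
    while True:
        rows, bookmark = fetch_page(bookmark)
        # Do not assume len(rows) <= limit; self references can add rows.
        yield from rows
        if not bookmark:
            break


# Usage with an in-memory stand-in for the service:
_pages = {None: (["r1", "r2", "r3"], "b1"), "b1": (["r4"], None)}
print(list(iterate_relationships(lambda bookmark: _pages[bookmark])))
```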
POST /v2/assets/get_relationships
Request
Query Parameters
Catalog ID
Project ID
Space ID
Asset ID
Artifact ID
Artifact Type
Related Asset Types
Related Artifact Types
bookmark
limit. Defaults to 25
Default:
25
orderby. This parameter can be repeated to add additional sort fields.
Default: update_time_asc
Supported sort fields (these are case insensitive):
- create_time_asc
- create_time_desc
- update_time_asc
- update_time_desc
- creator_id_asc
- creator_id_desc
- updater_id_asc
- updater_id_desc
Default:
update_time_asc
Limits results to only include the specified relationships. By default, all relationships are included. Repeat the query parameter to specify multiple relationship names.
Whether to lookup the asset names. This adds overhead, so only set this to true if the asset names are required. Defaults to false.
Default:
false
Whether to lookup the artifact names. This adds overhead, so only set this to true if the artifact names are required. Defaults to false.
Default:
false
Whether to look up the container names. This adds overhead, so only set this to true if the container names are required. Defaults to false.
Default:
false
(DEPRECATED) Whether to return relationships where target end is a column for the given source asset, artifact, or column. This parameter is deprecated. Use 'include_target_columns' instead.
Default:
false
Whether to return relationships where target end is a column for the given source asset, artifact, or column.
Default:
false
Whether to return relationships where the source end is either the given asset or a column of the given asset. When this parameter is set to 'true', 'include_target_columns' must also be set to 'true'. With both set, all relationships of the kinds asset-asset, asset-column, asset-artifact, column-asset, column-column, and column-artifact are returned. Use this parameter when 'asset_id' refers to an asset which has columns.
Default:
false
Searches for relationships
Finds relationships that match the specified criteria. There are two general kinds of criteria that can be used: criteria that at least one relationship end must match and criteria that the overall relationship must match.
Specifying conditions that at least one end of the relationship must match
There are three different conditions of this type that can be specified:
- relationship name - limits results to relationships with the specified name(s)
- Required object criteria:
  - required artifact - must specify artifact_type and artifact_id
  - required asset - must specify asset_id and one of: catalog_id, project_id, space_id
  - required asset container - must specify one of: catalog_id, project_id, space_id
  - Only one type of required object criteria can be used
- Opposite end criteria:
  - related_asset_types - specifies that opposite end must have one of the specified asset types
  - related_artifact_types - specifies that opposite end must have one of the specified artifact types
  - related_asset_types and related_artifact_types can be used together to specify that the opposite end must have either the specified asset or artifact types
Specifying global conditions that the overall relationship must match:
There are 2 global conditions that can be specified:
- modified_after - Restricts results to relationships created/updated after a specific date and time
- modified_before - Restricts results to relationships created/updated before a specific date and time
- modified_after and modified_before can be used together to filter for relationships created/updated within a specific time window.
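The criteria above can be checked on the client before the request is sent. The following is a minimal sketch of such a query-parameter builder; the function names are hypothetical, and treating the three container IDs as mutually exclusive is this sketch's reading of "one of". Timestamps are rendered in the ISO-8601 form used in the parameter descriptions for this endpoint.

```python
from datetime import datetime, timezone


def iso8601_utc(moment):
    """Format a timezone-aware datetime like 2022-09-01T14:32:17Z."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_search_params(catalog_id=None, project_id=None, space_id=None,
                        asset_id=None, artifact_id=None, artifact_type=None,
                        modified_after=None, modified_before=None):
    """Validate the documented combinations and return query parameters."""
    candidates = (("catalog_id", catalog_id), ("project_id", project_id),
                  ("space_id", space_id))
    containers = {k: v for k, v in candidates if v is not None}
    if len(containers) > 1:
        raise ValueError("specify at most one container id")
    if asset_id is not None and not containers:
        raise ValueError("asset_id requires a container id")
    if (artifact_id is None) != (artifact_type is None):
        raise ValueError("artifact_id and artifact_type go together")
    if artifact_id is not None and (containers or asset_id is not None):
        raise ValueError("only one type of required object criteria")
    params = dict(containers)
    if asset_id is not None:
        params["asset_id"] = asset_id
    if artifact_id is not None:
        params["artifact_id"] = artifact_id
        params["artifact_type"] = artifact_type
    if modified_after is not None:
        params["modified_after"] = iso8601_utc(modified_after)
    if modified_before is not None:
        params["modified_before"] = iso8601_utc(modified_before)
    return params
```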
POST /v2/assets/search_relationships
Request
Query Parameters
Catalog ID. Specifies that a specific catalog_id must be in the returned relationships. If filtering by asset container, either catalog_id, project_id, or space_id must be specified.
Project ID. Specifies that a specific project_id must be in the returned relationships. If filtering by asset container, either catalog_id, project_id, or space_id must be specified.
Space ID. Specifies that a specific space_id must be in the returned relationships. If filtering by asset container, either catalog_id, project_id, or space_id must be specified.
Asset ID. Specifies that a specific asset must be in the returned relationships. If specified, either catalog_id, project_id, or space_id must also be specified.
Artifact ID. Specifies that a specific artifact must be in the returned relationships. If specified, artifact_type must also be specified.
Artifact Type. Required if artifact_id is specified.
Related Asset Types. Filters the result to include relationships based on their target asset type. The parameter can be repeated to specify multiple asset types. If both related_artifact_types and related_asset_types are omitted, all target types are included in the result.
Related Artifact Types. Filters the result to include relationships based on their target artifact type. The parameter can be repeated to specify multiple artifact types. If both related_artifact_types and related_asset_types are omitted, all target types are included in the result.
bookmark
limit. Defaults to 25
Default:
25
orderby. This parameter can be repeated to add additional sort fields.
Default: update_time_asc
Supported sort fields (these are case insensitive):
- create_time_asc
- create_time_desc
- update_time_asc
- update_time_desc
- creator_id_asc
- creator_id_desc
- updater_id_asc
- updater_id_desc
Default:
update_time_asc
Limits results to only include the specified relationships. By default, all relationships are included. Repeat the query parameter to specify multiple relationship names.
Whether to lookup the asset names. This adds overhead, so only set this to true if the asset names are required. Defaults to false.
Default:
false
Whether to lookup the artifact names. This adds overhead, so only set this to true if the artifact names are required. Defaults to false.
Default:
false
Whether to look up the container names. This adds overhead, so only set this to true if the container names are required. Defaults to false.
Default:
false
Includes relationships modified after this date/time - specified in ISO-8601 format (examples: 2022-09-01T14:32:17Z, 2022-09-01T14:32:17.802Z). This can be used in conjunction with modified_before to specify a required modification time window.
Includes relationships modified before this date/time - specified in ISO-8601 format (examples: 2022-09-01T14:32:17Z, 2022-09-01T14:32:17.802Z). This can be used in conjunction with modified_after to specify a required modification time window.
(DEPRECATED) Whether to return relationships where target end is a column for the given source asset, artifact, or column. This parameter is deprecated. Use 'include_target_columns' instead.
Default:
false
Whether to return relationships where target end is a column for the given source asset, artifact, or column.
Default:
false
Whether to return relationships where the source end is either the given asset or a column of the given asset. When this parameter is set to 'true', 'include_target_columns' must also be set to 'true'. With both set, all relationships of the kinds asset-asset, asset-column, asset-artifact, column-asset, column-column, and column-artifact are returned. Use this parameter when 'asset_id' refers to an asset which has columns.
Default:
false
Create asset relationships
Use this API to create up to 20 asset relationships.
- The relationship name must correspond to a relationship type defined on the source type.
- The source must be an asset, column or governance artifact of the type specified in the relationship type definition.
  Relationships created with a governance artifact as the source cannot be retrieved, modified, or deleted in the IKC UI. These relationships can only be accessed through API calls such as POST /v2/assets/search_relationships, POST /v2/assets/get_relationships, and POST /v2/assets/unset_relationships.
- The target must be an asset, column or governance artifact of the type specified in the relationship type definition.
- If the specified relationship does not exist between a source and target, a new relationship is created.
- If either end of the relationship is defined as multiplicity-one, any existing value of a multiplicity-one end is replaced.
For Asset
- The asset_id and one of {catalog_id, project_id, space_id} must be set.
For Column
- The asset_id and one of {catalog_id, project_id, space_id} must be set.
- The asset_id of a column must be in the format <asset_id>#COLUMN#<column_identifier>.
For Artifact
- The artifact_id and artifact_type must be set. Use the global id of the artifact for artifact_id.
POST /v2/assets/set_relationships
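The two constraints above that are easiest to get wrong on the client side are the column asset_id format and the limit of 20 relationships per call. A minimal Python sketch of both follows; the helper names column_asset_id and batches are hypothetical, and the request body schema is deliberately not shown because it is not described in this section.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

# Limit stated in the reference: up to 20 relationships per request.
MAX_RELATIONSHIPS_PER_CALL = 20


def column_asset_id(asset_id: str, column_identifier: str) -> str:
    """Build the asset_id of a column: <asset_id>#COLUMN#<column_identifier>."""
    return f"{asset_id}#COLUMN#{column_identifier}"


def batches(items: Sequence[T], size: int = MAX_RELATIONSHIPS_PER_CALL) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items, one per API call."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

Each batch would then be sent in its own POST /v2/assets/set_relationships call.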
Deletes asset relationships
Use this API to delete up to 20 asset relationships.
- The relationship name must correspond to a relationship type defined on the source type.
- The source must be an asset or governance artifact of the type specified in the relationship type definition.
- The target must be an asset or governance artifact of the type specified in the relationship type definition.
- If the specified relationship does not exist between the source and the target, it will be ignored.
For Asset
- The asset_id and one of {catalog_id, project_id, space_id} must be set.
For Column
- The asset_id and one of {catalog_id, project_id, space_id} must be set.
- The asset_id of a column must be in the format <asset_id>#COLUMN#<column_identifier>.
For Artifact
- The artifact_id and artifact_type must be set.
POST /v2/assets/unset_relationships
Finds relationship types used by an asset
Finds relationship types that are used by a given asset. This API only returns the first page of results. It does not accept a bookmark. Any additional pages of results must be fetched using the next page URL included in the response.
You can refer to the options in the response to get the full list of relationship names.
This API is deprecated - use POST /v2/assets/search_relationships instead.
GET /v2/assets/{asset_id}/relationship_types
Request
Path Parameters
Asset ID
Query Parameters
Catalog ID
Project ID
Space ID
Optional limit to use when finding relationship types
Default:
25
orderby. This parameter can be repeated to add additional sort fields.
Supported sort fields (these are case insensitive):
- relationship_name_asc
- relationship_name_desc
- relationship_display_name_asc
- relationship_display_name_desc
- create_time_asc
- create_time_desc
Default:
relationship_display_name_asc
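As an illustration of repeating orderby, the following Python sketch builds the request URL without sending it. The query parameter names catalog_id and limit are assumptions based on the parameter descriptions above, and relationship_types_url is a hypothetical helper. The endpoint itself is deprecated, so this is only meant to show how a repeated query parameter is encoded.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dataplatform.cloud.ibm.com"


def relationship_types_url(asset_id, catalog_id, orderby, limit=25):
    """Build the URL with one orderby=... pair per sort field."""
    # A list of tuples keeps the order and allows a key to appear repeatedly.
    query = [("catalog_id", catalog_id), ("limit", limit)]
    query += [("orderby", field) for field in orderby]
    path = "/v2/assets/" + quote(asset_id, safe="") + "/relationship_types"
    return BASE_URL + path + "?" + urlencode(query)


url = relationship_types_url(
    "a1", "c1", ["relationship_name_asc", "create_time_desc"]
)
```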
Response
- options
Possible values: contains only unique items
Possible values: contains only unique items
Possible values: [ASSET, ARTIFACT, COLUMN]
Possible values: [RELATIONSHIP_NAME_ASC, RELATIONSHIP_NAME_DESC, RELATIONSHIP_DISPLAY_NAME_ASC, RELATIONSHIP_DISPLAY_NAME_DESC, CREATE_TIME_ASC, CREATE_TIME_DESC]
- next
Status Code
Success
Bad Request
Unauthorized
Forbidden
Conflict
Too Many Requests
No Sample Response
Finds related assets
Finds assets related to a given asset. This API supports limiting the results to a specific set of relationships as well as providing pagination and sorting options.
Note: When self references are present (i.e. when the source and target asset in the relationship are the same), more rows than specified by the limit may be returned. When this happens, the total rows reported is accurate and the bookmark points to the next page of results.
This API is deprecated - use POST /v2/assets/get_relationships instead.
GET /v2/assets/{asset_id}/relationships
Request
Path Parameters
Asset ID
Query Parameters
Catalog ID
Project ID
Space ID
bookmark
limit. Defaults to 25
Default:
25
orderby. This parameter can be repeated to add additional sort fields.
Supported sort fields (these are case insensitive):
- create_time_asc
- create_time_desc
- update_time_asc
- update_time_desc
- creator_id_asc
- creator_id_desc
- updater_id_asc
- updater_id_desc
Default:
update_time_asc
Limits results to only include the specified relationships. By default, all relationships are included. Repeat the query parameter to specify multiple relationship names.
Whether to look up the asset names. This adds overhead, so only set this to true if the asset names are required. Defaults to false.
Default:
false
Whether to look up the container names. This adds overhead, so only set this to true if the container names are required. Defaults to false.
Default:
false
Create asset relationships
Use this API to create relationships between a source asset and up to 20 target artifacts.
- The relationship name must correspond to a relationship type defined on the source asset's type.
- The target artifacts must be assets of the type specified in the relationship type definition.
- If the specified relationship does not exist between a source and target asset, a new relationship is created.
- If either end of the relationship is defined as multiplicity-one, any existing value of a multiplicity-one end is replaced.
This API is deprecated - use POST /v2/assets/set_relationships instead.
PUT /v2/assets/{asset_id}/relationships/{relationship_name}
Deletes asset relationships
Use this API to remove a relationship between an asset and one or more target artifacts.
- The relationship name must correspond to a relationship type defined on the source asset's type.
- If the specified relationship does not exist between the source and one or more target assets, it will be ignored.
- For this API, all of the relationship targets need to be in the same catalog, project, or space.
- If the target space/catalog/project is omitted, it is assumed to be the same as the source.
This API is deprecated - use POST /v2/assets/unset_relationships instead.
DELETE /v2/assets/{asset_id}/relationships/{relationship_name}
Request
Path Parameters
Source Asset Id
Name of relationship defined on the source asset type
Query Parameters
target asset id. To delete multiple relationships, this parameter can be repeated.
Source Asset Catalog
Source Asset Project
Source Asset Space
Target Asset Catalog
Target Asset Project
Target Asset Space
Determine if a relationship exists between an asset and a target artifact
Use this API to determine if a relationship exists between an asset and a target artifact. The relationship name must correspond to a relationship type defined on the source asset's type. The target artifact must be an asset of the type specified in the relationship type definition. The source and target assets must reside in the same catalog.
HEAD /v2/assets/{asset_id}/relationships/{relationship_name}
Request
Path Parameters
Asset GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets. IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets. REJECT means the call will fail and the asset will not be changed. No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Attachment meta-data.
The "asset_type" is required.
The "mime" is a reliable way to indicate the nature and format of the attachment. It should be used instead of other implementation-specific means, e.g., file extension.
If not supplied, or if the value given is less than 1, data_partitions defaults to 1 (for non-remote and non-referenced attachments).
For remote attachments, both connection_id and connection_path are required.
For referenced attachments, both object_key and object_key_is_read_only are required.
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
private_url
Default:
false
response-content-disposition
response-content-type
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets. IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets. REJECT means the call will fail and the asset will not be changed. No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Update attachment metadata
Update attachment metadata. Only the following fields can be updated:
- name
- description
- mime
- size
- partitions
- transfer_complete
PATCH /v2/assets/{asset_id}/attachments/{attachment_id}
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json, application/merge-patch+json]
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
[
{ "op": "add", "path": "/name", "value": "my_attachment"},
{ "op": "replace", "path": "/mime", "value": "text/csv" },
{ "op": "remove", "path":"/description"}
]
Note:
- Any ‘~’ characters need to be escaped as ~0 in the path field. Any ‘/’ characters need to be escaped as ~1 in the path field. For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is "/foo~1/~0bar".
- The transfer_complete field can only be set to true if its current value is false and the type of attachment is managed.
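The escaping rule in the note can be sketched in Python as follows. The helpers escape_token and pointer are hypothetical names; the patch operations are the ones from the example above. The order of the two replacements matters: '~' has to be escaped before '/', otherwise the '~' introduced by '~1' would be escaped a second time.

```python
import json


def escape_token(token: str) -> str:
    """Escape one JSON Pointer reference token ('~' -> '~0', '/' -> '~1')."""
    # '~' first, so that the '~' produced for '/' is not escaped again.
    return token.replace("~", "~0").replace("/", "~1")


def pointer(*tokens: str) -> str:
    """Join already-unescaped tokens into a path for the 'path' field."""
    return "".join("/" + escape_token(token) for token in tokens)


patch = [
    {"op": "add", "path": pointer("name"), "value": "my_attachment"},
    {"op": "replace", "path": pointer("mime"), "value": "text/csv"},
    {"op": "remove", "path": pointer("description")},
]
body = json.dumps(patch)  # request body for the PATCH call
```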
Mark an attachment as transfer complete
Marks an attachment as transfer complete.
POST /v2/assets/{asset_id}/attachments/{attachment_id}/complete
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
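Most endpoints in this section repeat the same rule: exactly one of a catalog id, a project id, or a space id must be supplied. A client can check this before sending a request. The sketch below assumes the query parameters are named catalog_id, project_id, and space_id; container_query is a hypothetical helper.

```python
from typing import Dict, Optional


def container_query(
    catalog_id: Optional[str] = None,
    project_id: Optional[str] = None,
    space_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return the single container parameter, or raise if zero or several are set."""
    candidates = {
        "catalog_id": catalog_id,
        "project_id": project_id,
        "space_id": space_id,
    }
    supplied = {name: value for name, value in candidates.items() if value is not None}
    if len(supplied) != 1:
        raise ValueError(
            "provide exactly one of catalog_id, project_id, space_id; got "
            + (", ".join(sorted(supplied)) or "none")
        )
    return supplied
```

The returned dictionary can be merged into the query string of any of the calls above.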
Auto update attachment's datasource_type from connection asset
Auto update attachment's datasource_type from connection asset
PATCH /v2/assets/{asset_id}/attachments/{attachment_id}/datasource_type
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Get resources info for an attachment
Get resources info for an attachment.
GET /v2/assets/{asset_id}/attachments/{attachment_id}/resources
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
Increase resources for an attachment
Increase resources for an attachment.
PUT /v2/assets/{asset_id}/attachments/{attachment_id}/resources
Request
Path Parameters
Asset GUID
Attachment GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
attachment resource to increase, e.g. {"data_partitions":5, "private_url":false}
List all attributes
Use this API to retrieve all attributes of an asset.
GET /v2/assets/{asset_id}/attributes
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated, please use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Create an attribute
Use this API to create an asset attribute. An attribute has the same name as an asset type, and its fields are partially defined by that asset type.
The name specified in the request payload must match an existing built-in or user-defined asset type.
column_info attribute
The built-in column_info type defines metadata (e.g. description, tags) for columns contained in a data asset which represents a relational database table or columnar data file (e.g. csv file).
Create a column_info asset attribute to add column descriptions and/or tags if that attribute does not already exist on the asset.
Once the column_info attribute has been created, entries can be added, updated, or removed using the PATCH /v2/assets/{asset_id}/attributes/column_info endpoint.
The column_info attribute is a map structure where the key is a column name corresponding to an existing entry in the asset's entity.data_asset.columns array.
For example, here is the request payload to create a column_info attribute that assigns a description and tags to the ACCOUNT_HOLDER_ID and NAME columns:
{
  "name": "column_info",
  "entity": {
    "ACCOUNT_HOLDER_ID": {
      "column_description": "Globally unique identifier of the account holder.",
      "column_tags": [
        "identifier",
        "unique_key"
      ]
    },
    "NAME": {
      "column_description": "Full legal name of the account holder.",
      "column_tags": [
        "legal_name"
      ]
    }
  }
}
POST /v2/assets/{asset_id}/attributes
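The column_info payload shown above can be assembled programmatically. The following Python sketch builds it from a mapping of column name to a (description, tags) pair; column_info_payload is a hypothetical helper, and the field names column_description and column_tags are taken from the example payload.

```python
import json
from typing import Dict, Optional, Sequence, Tuple

ColumnMeta = Tuple[Optional[str], Sequence[str]]


def column_info_payload(columns: Dict[str, ColumnMeta]) -> dict:
    """Build the request payload that creates a column_info attribute."""
    entity = {}
    for column_name, (description, tags) in columns.items():
        entry = {}
        if description is not None:
            entry["column_description"] = description
        if tags:
            entry["column_tags"] = list(tags)
        entity[column_name] = entry
    return {"name": "column_info", "entity": entity}


payload = column_info_payload({
    "ACCOUNT_HOLDER_ID": (
        "Globally unique identifier of the account holder.",
        ["identifier", "unique_key"],
    ),
    "NAME": ("Full legal name of the account holder.", ["legal_name"]),
})
body = json.dumps(payload)  # request body for the POST call
```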
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets. IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets. REJECT means the call will fail and the asset will not be changed. No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Attribute metadata. The "entity" is required.
Example:
TestAttribute
- entity
Get an attribute
Use this API to retrieve an attribute of an asset.
GET /v2/assets/{asset_id}/attributes/{attribute_key}
Request
Path Parameters
asset_id
attribute_key
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Whether to allow access to metadata when the evaluation outcome of data protection rules is Deny.
The parameter include is deprecated, please use exclude instead.
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Delete an attribute
Use this API to delete an attribute of an asset.
DELETE /v2/assets/{asset_id}/attributes/{attribute_key}
Request
Path Parameters
asset_id
attribute_key
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Action to take if the changes to an asset would make it a duplicate of other assets. IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets. REJECT means the call will fail and the asset will not be changed. No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
Update attributes
Use this API to update/modify an asset attribute.
Updating the column_info attribute
Use this API for updating an existing column_info attribute to set or replace the column_description or column_tags for any column.
Note: Use POST /v2/assets/{asset_id}/attributes to create the column_info attribute if it does not exist on the asset. Use GET /v2/assets/{asset_id}/attributes/column_info to check the existence of the attribute.
Payload Examples
To add a column_info entry that specifies a description and tags for the INVOICE_ID column:
[{"op":"add","path":"/INVOICE_ID","value":{"column_description":"Unique identifier of the invoice", "column_tags":["tag1", "tag2"]}}]
To add or update a description on the existing column_info entry for the ADDRESS column:
[{"op":"add","path":"/ADDRESS/column_description","value":"customer street address"}]
To unset/remove the ADDRESS column's description:
[{"op": "remove", "path":"/ADDRESS/column_description"}]
To remove the entire column_info entry for the ADDRESS column:
[{"op": "remove", "path":"/ADDRESS"}]
To update a value in the properties of data_asset:
{
  "entity": {
    "data_asset": {
      "mime_type": "text/csv",
      "dataset": true,
      "properties": [
        {
          "name": "schema_name",
          "value": "GOSALES1021"
        },
        {
          "name": "table_name",
          "value": "COUNTRY"
        }
      ]
    }
  }
}
Request body:
[
  { "op": "replace", "path": "/properties/1/value", "value": "COUNTRY2" }
]
To update the value of data_source_deployment:
{
  "entity": {
    "ibm_data_source": {
      "data_source_type_id": "8c1a4480-1c29-4b33-9086-9cb799d7b157",
      "data_source_deployment": "on_cloud",
      "data_source_flags": [],
      "data_source_state": "ACTIVE",
      "data_source_encoding": "UTF-8",
      "data_source_protection_method": "guardium"
    }
  }
}
Request body:
[
  { "op": "replace", "path": "/data_source_deployment", "value": "cpd" }
]
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Note: Any ‘~’ characters need to be escaped as ~0 in the path field. Any ‘/’ characters need to be escaped as ~1 in the path field. For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is "/foo~1/~0bar".
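To see how the paths in the payload examples resolve, the following Python sketch applies add, replace, and remove operations to a local copy of an attribute. It is a minimal illustration, not a full RFC 6902 implementation and not the service's own logic, and it assumes that paths are resolved relative to the attribute named by attribute_key, as the examples above suggest. The name apply_patch is hypothetical.

```python
import copy


def _unescape(token):
    # Reverse of the escaping rule: '~1' -> '/' first, then '~0' -> '~'.
    return token.replace("~1", "/").replace("~0", "~")


def _child(container, token):
    """Follow one path token into a dict (by key) or a list (by index)."""
    if isinstance(container, list):
        return container[int(token)]
    return container[token]


def apply_patch(attribute, operations):
    """Return a patched deep copy of `attribute`; the input is not modified."""
    result = copy.deepcopy(attribute)
    for operation in operations:
        tokens = [_unescape(t) for t in operation["path"].split("/")[1:]]
        if not tokens:
            raise ValueError("patching the root is not supported in this sketch")
        parent = result
        for token in tokens[:-1]:
            parent = _child(parent, token)
        last, kind = tokens[-1], operation["op"]
        if kind not in ("add", "replace", "remove"):
            raise ValueError("unsupported op: " + kind)
        if isinstance(parent, list):
            if kind == "add" and last == "-":
                parent.append(operation["value"])  # '-' appends to the array
            elif kind == "add":
                parent.insert(int(last), operation["value"])
            elif kind == "replace":
                parent[int(last)] = operation["value"]
            else:
                del parent[int(last)]
        elif kind == "add":
            parent[last] = operation["value"]
        elif kind == "replace":
            if last not in parent:
                raise KeyError(last)  # replace requires an existing member
            parent[last] = operation["value"]
        else:
            del parent[last]
    return result


data_asset = {
    "mime_type": "text/csv",
    "dataset": True,
    "properties": [
        {"name": "schema_name", "value": "GOSALES1021"},
        {"name": "table_name", "value": "COUNTRY"},
    ],
}
patched = apply_patch(
    data_asset,
    [{"op": "replace", "path": "/properties/1/value", "value": "COUNTRY2"}],
)
```

In the example, "/properties/1/value" addresses the value of the second entry of the properties array, which is why the table_name entry changes and the schema_name entry does not.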
PATCH /v2/assets/{asset_id}/attributes/{attribute_key}
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json]
Path Parameters
asset_id
attribute_key
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Exclude columns from the asset result. Available value: columns. Default value is empty, which will return the complete asset, including the columns.
Action to take if the changes to an asset would make it a duplicate of other assets. IGNORE means the call will continue and update the asset anyway; as a result, the asset will become a duplicate of other assets. REJECT means the call will fail and the asset will not be changed. No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT]
JSON array of patch operations as defined in RFC 6902.
Get asset document conflicts
Use this API to retrieve asset document conflicts in the database.
GET /v2/assets/{asset_id}/conflicts
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
The number of items to skip before starting to collect the result set
The maximum number of items to return
Default:
10
Whether to retrieve conflict revision document content
Delete asset document conflicts
Use this API to delete asset document conflicts in the database.
DELETE /v2/assets/{asset_id}/conflicts
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Document conflicts to delete
List of conflicting documents revs
Request
Path Parameters
Asset GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Upload an image for asset
Upload an image for an asset. To use this API, put the image's binary data in the HTTP body.
For example:
curl -X "PUT" "api_url" -H "Authorization: $access token" --data-binary "@/image_dir_path/abc.png"
PUT /v2/assets/{asset_id}/images
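A rough Python equivalent of the curl call, using only the standard library, might look like the sketch below. It builds the request object without sending it. The "Bearer" prefix on the Authorization header and the catalog_id query parameter name are assumptions, and image_upload_request is a hypothetical helper.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "https://api.dataplatform.cloud.ibm.com"


def image_upload_request(asset_id, catalog_id, image_bytes, token):
    """Build (but do not send) a PUT request carrying the raw image bytes."""
    path = "/v2/assets/" + quote(asset_id, safe="") + "/images"
    url = BASE_URL + path + "?" + urlencode({"catalog_id": catalog_id})
    # The content type the service expects is not stated in this section,
    # so none is set here; urllib adds a default one when the request is sent.
    return urllib.request.Request(
        url,
        data=image_bytes,
        method="PUT",
        headers={"Authorization": "Bearer " + token},  # assumed header format
    )


request = image_upload_request("a1", "c1", b"\x89PNG", "my-token")
# To send: urllib.request.urlopen(request)
```

In practice image_bytes would come from reading the image file in binary mode, which corresponds to --data-binary in the curl example.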
Request
Path Parameters
Asset GUID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Request
Path Parameters
Asset GUID
Image id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Get asset or column notes
Use this API to retrieve a paginated list of notes from an asset or a column within an asset. Any user who has viewer permission on an asset can retrieve its notes.
GET /v2/assets/{asset_id}/notes
Request
Path Parameters
Asset ID
Query Parameters
The maximum number of asset/column notes to return. The default value is 25. The limit cannot exceed 200. Use the next URL from the response to retrieve the next page of notes.
Default:
25
Sorting order. The following fields can be used for sorting. Use hyphen prefix (-) for descending order:
- column_name
- created_at
- updated_at
Default:
-updated_at
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Filters the result to include notes on the specified column(s). The parameter should be repeated for each column name.
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Type of notes to include in the search result:
- ALL: Include all asset and column notes
- ASSET: Include asset notes only - cannot be used with the column_names parameter
- COLUMN: Include column notes only
Allowable values: [ALL, ASSET, COLUMN]
Default:
ALL
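The constraints on these query parameters (limit of at most 200, the three note types, and ASSET being incompatible with column_names) can be checked on the client before the call. In the Python sketch below, column_names is the parameter name used in the text above, while the names limit and type are hypothetical because this section does not state them; notes_query is likewise a hypothetical helper.

```python
from typing import List, Sequence, Tuple

NOTE_TYPES = ("ALL", "ASSET", "COLUMN")
MAX_LIMIT = 200  # "The limit cannot exceed 200."


def notes_query(
    limit: int = 25,
    note_type: str = "ALL",
    column_names: Sequence[str] = (),
) -> List[Tuple[str, str]]:
    """Validate the documented constraints and return query pairs."""
    if limit > MAX_LIMIT:
        raise ValueError("limit cannot exceed %d" % MAX_LIMIT)
    if note_type not in NOTE_TYPES:
        raise ValueError("note_type must be one of " + ", ".join(NOTE_TYPES))
    if note_type == "ASSET" and column_names:
        raise ValueError("ASSET cannot be combined with column_names")
    # "limit" and "type" are assumed parameter names (not given in the text).
    pairs = [("limit", str(limit)), ("type", note_type)]
    # The column filter is repeated once per column name.
    pairs += [("column_names", name) for name in column_names]
    return pairs
```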
Create notes on an asset
Use this API to create notes for an asset and for columns within an asset. Accredited Service Editors and Viewers, and any user who has edit permission on an asset can create a note.
POST /v2/assets/{asset_id}/notes
Request
Path Parameters
Asset ID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Asset notes to be created - limited to 20 notes per request
notes to be created for an asset and for columns of an asset
Get an asset or column note
Use this API to retrieve a note from an asset or a column within an asset. Any user who has viewer permission on an asset can retrieve a note.
GET /v2/assets/{asset_id}/notes/{note_id}
Request
Path Parameters
Asset ID
Note ID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Response
Asset note metadata
- metadata
system generated unique asset note identifier
unique Identifier of asset associated with this note
Example:
892aaa55-1b0e-438a-b05f-4224246a0f3f
name of the column associated with this note
Example:
city
ID of user that created the asset note (system managed)
Example:
IBMid-66100421AN
ID of user that updated the asset note (system managed)
Example:
IBMid-66100421AN
RFC 3339 timestamp (yyyy-MM-dd'T'HH:mm:ss'Z') when the asset note was created (system managed)
Example:
2023-05-24T16:34:06.000Z
Epoch timestamp when the asset note was created (system managed)
Example:
1621874046338
RFC 3339 timestamp (yyyy-MM-dd'T'HH:mm:ss'Z') when the asset note was updated (system managed)
Example:
2023-05-24T16:34:06.000Z
Epoch timestamp when the asset note was updated (system managed)
Example:
1621874046338
Asset note entity
- entity
a note associated with an asset or a column of an asset
Example:
description of a note
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
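The response carries each timestamp twice, once as an RFC 3339 string and once as an epoch value. Judging by the 13-digit example, the epoch value appears to be in milliseconds; that is an assumption. Under it, a client could convert between the two forms as in this Python sketch (epoch_millis_to_rfc3339 is a hypothetical helper):

```python
from datetime import datetime, timezone


def epoch_millis_to_rfc3339(millis: int) -> str:
    """Format epoch milliseconds as an RFC 3339 UTC timestamp."""
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    seconds, remainder_ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    moment = moment.replace(microsecond=remainder_ms * 1000)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


formatted = epoch_millis_to_rfc3339(1621874046338)
# formatted == "2021-05-24T16:34:06.338Z"
```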
Delete an asset or a column note
Use this API to delete an asset or a column note. Accredited Service Editors and Viewers, and any user who has edit permission on an asset can delete a note.
DELETE /v2/assets/{asset_id}/notes/{note_id}
Request
Path Parameters
Asset ID
Asset or column note ID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Update an asset or column note
Use this API to update a note on an asset or a column within an asset. Accredited Service Editors and Viewers, and any user who has edit permission on an asset can update a note.
PATCH /v2/assets/{asset_id}/notes/{note_id}
Request
Path Parameters
Asset ID
Note ID
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
JSON array of patch operations as defined in RFC 6902. See http://jsonpatch.com/ for more info.
Example :
[ {"op": "replace", "path": "/note", "value": "Description of a Note" } ]
Note:
- The only supported operation is replace.
- Any ‘~’ characters need to be escaped as ~0 in the path field. Any ‘/’ characters need to be escaped as ~1 in the path field. For example, in {"foo/" : {"~bar" : "value"}}, the path for ~bar is /foo~1/~0bar.
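Because replace is the only supported operation, the request body for this endpoint is always a one-element patch array. A small Python sketch that produces it (note_replace_patch is a hypothetical helper; the /note path comes from the example above):

```python
import json


def note_replace_patch(text: str) -> list:
    """Build the JSON Patch array that replaces the text of a note."""
    return [{"op": "replace", "path": "/note", "value": text}]


body = json.dumps(note_replace_patch("Description of a Note"))
```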
Response
Asset note metadata
- metadata
system generated unique asset note identifier
unique Identifier of asset associated with this note
Example:
892aaa55-1b0e-438a-b05f-4224246a0f3f
name of the column associated with this note
Example:
city
ID of user that created the asset note (system managed)
Example:
IBMid-66100421AN
ID of user that updated the asset note (system managed)
Example:
IBMid-66100421AN
RFC 3339 timestamp (yyyy-MM-dd'T'HH:mm:ss'Z') when the asset note was created (system managed)
Example:
2023-05-24T16:34:06.000Z
Epoch timestamp when the asset note was created (system managed)
Example:
1621874046338
RFC 3339 timestamp (yyyy-MM-dd'T'HH:mm:ss'Z') when the asset note was updated (system managed)
Example:
2023-05-24T16:34:06.000Z
Epoch timestamp when the asset note was updated (system managed)
Example:
1621874046338
Asset note entity
- entity
a note associated with an asset or a column of an asset
Example:
description of a note
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
Request
Path Parameters
asset_id
Query Parameters
The maximum number of asset ratings to return. The default value is 25.
Default:
25
Bookmark that gives the start of the page.
Sorting order. Valid values: updated_at, rating. Use hyphen prefix (-) for descending order.
Default:
-updated_at
Filter results by user. Valid values: all, user, other. The default value is all.
Allowable values: [all, user, other]
Default:
all
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Asset rating to be created
user's rating of an asset
Example:
5
user's review of an asset
Example:
Such an asset!
Get the count of each rating value for the specified asset
Get the counts of each rating value for the specified asset.
GET /v2/assets/{asset_id}/ratings/stats
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Delete an asset rating
Use this API to delete an asset rating.
DELETE /v2/assets/{asset_id}/ratings/{asset_rating_id}
Request
Path Parameters
asset_id
asset_rating_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Update an asset rating
Use this API to update an asset rating.
PATCH /v2/assets/{asset_id}/ratings/{asset_rating_id}
Request
Path Parameters
asset_id
asset_rating_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Revision id (1, 2, 3, ...), or leave empty for the current asset version. Use 'latest' for the most recent revision.
Asset rating to be updated
user's rating of an asset
Example:
5
user's review of an asset
Example:
Such an asset!
Get a list of revisions for an asset
Use this API to retrieve an ordered list of revisions for an asset, most to least recent.
GET /v2/assets/{asset_id}/revisions
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
The parameter include is deprecated.
The maximum number of revisions to return. The default is 25. Maximum is 200
Default:
25
The revision number to start from, or 'latest'. Latest revision is the default.
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Commit a revision of an asset
Use this API to commit a revision of an asset.
POST /v2/assets/{asset_id}/revisions
Request
Path Parameters
asset_id
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Defaults to false. If true, deprecated fields are not included in the response. This parameter is provided as a way to check compatibility with future releases and remove duplicated information from responses. It has not been implemented for all deprecated fields. The primary fields that are omitted are collaborator_ids in the asset metadata rov and the owner_id field in the asset metadata.
Default:
false
Commit options
Delete a revision
Use this API to delete a revision of an asset.
DELETE /v2/assets/{asset_id}/revisions
Request
Path Parameters
asset_id
Query Parameters
You must provide a revision id (1, 2, 3, ...)
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Find and repair stuck temporary design documents by creating cache entries that resume their processing
Use this API to find and repair broken indexes
POST /v2/cams/admin/index_repairs
Request
Query Parameters
Optionally limits checking to certain catalogs. For projects and spaces you must use the id of the private catalog. Repeat this parameter to check multiple catalogs. If this is omitted, all catalogs are checked.
Allows rate limiting the creation of cache entries. To disable rate limiting, set to -1. If set, the operation pauses every time this number of cache entries have been created.
Default:
50
When using cache entry rate limiting, this controls how long the service pauses between creating groups of cache entries.
Default:
1
Allows restricting the total number of cache entries created.
Default:
true
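To make the rate-limiting parameters above concrete, the sketch below estimates how many pauses a repair run would take. It only models the behavior as described (a pause after each group of cache entries, with -1 disabling rate limiting); the function names are hypothetical, and it assumes pauses happen only between groups, so none follows the final group.

```python
def pause_count(total_entries, batch_size=50):
    """Number of pauses between groups of created cache entries.

    batch_size=-1 disables rate limiting, so no pauses occur.
    Assumption: the service pauses only between groups, not after the last one.
    """
    if batch_size == -1 or total_entries <= 0:
        return 0
    if batch_size <= 0:
        raise ValueError("batch_size must be positive, or -1 to disable rate limiting")
    groups = -(-total_entries // batch_size)  # ceiling division
    return groups - 1


def total_pause_minutes(total_entries, batch_size=50, pause_minutes=1):
    """Total time spent pausing, using the documented defaults of 50 and 1."""
    return pause_count(total_entries, batch_size) * pause_minutes
```

With the defaults, 120 cache entries would be created in three groups with two one-minute pauses between them.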
Response
The options being used for the indexing repair task.
- options
If rate limiting is enabled, the number of minutes to sleep between creating batches of cache entries
Number of created cache entries between pauses, if rate limiting is enabled
The configured list of catalogs to process, if there is one
The number of threads processing the catalogs
Whether preview mode is enabled
The maximum number of cache entries to create
Link that can be used to get the status of the indexing repair task.
The id of the indexing repair task that was created.
Status Code
Accepted, the processing was successfully started
Bad Request
Unauthorized
Forbidden
No Sample Response
Gets the status of a temporary design document repair operation
Use this API to get the status of the task to find and repair stuck temporary design documents
GET /v2/cams/admin/index_repairs/{id}
Response
The options being used for the indexing repair task.
- options
If rate limiting is enabled, the number of minutes to sleep between creating batches of cache entries
Number of created cache entries between pauses, if rate limiting is enabled
The configured list of catalogs to process, if there is one
The number of threads processing the catalogs
Whether preview mode is enabled
The maximum number of cache entries to create
The current state of the repair task.
Possible values: [QUEUED, RUNNING, FINISHED, FAILED]
The time when the task started.
The amount of time the repair task has been running.
The percentage of the databases that have been processed.
The estimated time when the processing will be complete.
The time when the processing completed.
The estimated amount of processing time remaining.
The number of databases that have been processed so far.
The total number of databases that need to be processed.
The number of stuck temporary design documents that have been found.
The stuck temporary design documents that have been found.
The number of stuck temporary design documents that have been repaired so far.
The number of temporary design documents that have been processed so far.
The ids of any catalogs that were unable to be processed.
The error message if the processing failed.
Status Code
OK
Bad Request
Unauthorized
Forbidden
No Sample Response
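Because the repair task moves through QUEUED and RUNNING before ending in FINISHED or FAILED, a caller would typically poll this endpoint until a terminal state appears. The following is a minimal sketch under stated assumptions: fetch_status is a hypothetical callable that performs the GET and returns the parsed JSON, and the response field name "state" is assumed, since the reference does not show field names.

```python
import time

TERMINAL_STATES = {"FINISHED", "FAILED"}


def wait_for_repair(fetch_status, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll until the repair task reaches FINISHED or FAILED.

    fetch_status: hypothetical callable returning the status response as a dict.
    The "state" key is an assumed field name for the task state.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("state") in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("repair task did not reach a terminal state")
```

Passing the sleep function as a parameter keeps the loop easy to exercise without real delays.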
Reconfigure the asset container
Use this API to reconfigure the asset container. The supported items are:
- type_search_indexes - reconfigure type search indexes for all asset types. It requires the user to have permission to register asset types.
POST /v2/cams/admin/reconfigure
Request
Query Parameters
The item to reconfigure, e.g. type_search_indexes
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
You must provide either a catalog id, a project id, or a space id, but not more than one
Add a task result to the task
Add a task result for future reference and serviceability.
POST /v2/cams/admin/tasks/search
Get metrics of the asset lists
Get metrics of the asset lists.
The caller should use the href returned by the service to navigate through pages and should not try to manually construct URLs for pagination.
The href for the first page is always available, and the href for the next page is only available if there may be a next page.
The total_rows field is only available when summary_level is CATALOG.
The caller must be in at least one of the allow-lists based on the product offering to perform the operation:
DATA_PRODUCT - data_product_account_viewers, data_product_account_managers
GET /v2/cams/metrics/asset_list_count
Request
Query Parameters
The month of the metrics to retrieve. The value must be in format yyyy-MM. If not supplied, the metrics of the current month are retrieved.
Retrieve metrics only for asset lists belonging to any of the given bss accounts. Specify as a comma-separated list.
The maximum number of items to return. If not specified, it will be set to 200.
Default:
200
Start token for pagination
Response
The user access metrics response API model.
The limit on the number of results returned.
Example:
200
The total count on the number of results returned, accounting for filtering but not pagination.
Example:
50
A page in a pagination collection.
The next page in the collection.
Collection of user access statistics
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
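The pagination guidance for this endpoint (follow the href the service returns rather than constructing URLs) can be sketched as a small generator. This is an illustration only: fetch_json is a hypothetical callable that performs the authenticated GET and returns the parsed body, and the "next" object with an "href" key is an assumed response shape based on the descriptions above.

```python
def iter_pages(fetch_json, first_href):
    """Yield each page, following the service-supplied href for the next page.

    fetch_json: hypothetical callable taking an href and returning a dict.
    Stops when the response carries no next-page href.
    """
    href = first_href
    while href:
        page = fetch_json(href)
        yield page
        # Assumed shape: {"next": {"href": "..."}}; absent on the last page.
        href = (page.get("next") or {}).get("href")
```

The caller never edits the href, so any bookmark or start token the service embeds in it is passed back unchanged.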
Get metrics of the users who accessed the catalogs
Get metrics of the users who accessed the catalogs.
The caller should use the href returned by the service to navigate through pages and should not try to manually construct URLs for pagination.
The href for the first page is always available, and the href for the next page is only available if there may be a next page.
The total_rows field is only available when summary_level is CATALOG.
The caller must be in at least one of the allow-lists based on the product offering to perform the operation:
(Note: the WKC product offering represents IBM Knowledge Catalog):
DATA_PRODUCT - data_product_account_viewers, data_product_account_managers
WATSONX_GOVERNANCE - watsonx_governance_account_viewers, watsonx_governance_account_managers
WKC - wkc_account_viewers, wkc_account_managers
GET /v2/cams/metrics/user_access_count
Request
Query Parameters
The month of the metrics to retrieve. The value must be in format yyyy-MM. If not supplied, the metrics of the current month are retrieved.
The level for aggregating the user access statistics.
Allowable values: [ACCOUNT, CATALOG]
Default: ACCOUNT
Product offering. If omitted, it defaults to WKC.
Allowable values: [WKC, DATA_PRODUCT, WATSONX_GOVERNANCE]
Default: WKC
Retrieve metrics only for catalogs belonging to any of the given bss accounts. Specify as a comma-separated list.
Retrieve metrics only for catalogs with the given ids. Specify as a comma-separated list.
The maximum number of items to return. If not specified, it will be set to 200.
Default:
200
Start token for pagination
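The query parameters above have fixed formats and allowable values, so a client can validate them before calling the endpoint. The sketch below is one way to do that; summary_level is named in the description above, while the parameter names month, product, and limit are assumptions, because the reference lists only their descriptions.

```python
import re
from urllib.parse import urlencode

MONTH_RE = re.compile(r"\d{4}-(0[1-9]|1[0-2])")  # yyyy-MM
SUMMARY_LEVELS = {"ACCOUNT", "CATALOG"}
PRODUCTS = {"WKC", "DATA_PRODUCT", "WATSONX_GOVERNANCE"}


def user_access_query(month=None, summary_level="ACCOUNT", product="WKC", limit=200):
    """Build the query string for GET /v2/cams/metrics/user_access_count.

    Parameter names other than summary_level are assumed, not confirmed.
    """
    if month is not None and not MONTH_RE.fullmatch(month):
        raise ValueError("month must be in yyyy-MM format")
    if summary_level not in SUMMARY_LEVELS:
        raise ValueError(f"summary_level must be one of {sorted(SUMMARY_LEVELS)}")
    if product not in PRODUCTS:
        raise ValueError(f"product must be one of {sorted(PRODUCTS)}")
    params = {"summary_level": summary_level, "product": product, "limit": limit}
    if month is not None:
        params["month"] = month  # omitted means the current month
    return urlencode(params)
```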
Response
The user access metrics response API model.
The limit on the number of results returned.
Example:
200
The total count on the number of results returned, accounting for filtering but not pagination.
Example:
50
A page in a pagination collection.
The next page in the collection.
Collection of user access statistics
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error
No Sample Response
List all catalogs for a specific account
Use this API to get a list of catalogs, projects, or spaces, or to get the total number of public catalogs, for a given account. You can use the 'roles' parameter to filter results by roles for standard callers.
- Standard Callers: to list all catalogs where the caller is a collaborator, omit the "bss_account_id" field and the "include" field.
- Accredited Services: to list all catalogs or projects or spaces for a specific account, enter bss account value in "bss_account_id" field and one of 'catalogs', 'projects', or 'spaces' in "include" field.
- Accredited Services: to get the number of public catalogs for a specific account, enter bss account value in "bss_account_id" field and 'total_count' in "include" field.
GET /v2/catalogs
Request
Query Parameters
limit
Default:
25
bookmark
skip
Default:
0
Limited to use by accredited services (which must also supply 'bss_account_id'). Currently, the only supported values are 'catalogs', 'projects', 'spaces' or 'total_count'. Not supplying any of these values results in public 'catalogs', 'projects', AND 'spaces' being returned for a specific account.
- Use 'catalogs' for including all public catalogs for a specific bss_account_id.
- Use 'projects' for including all projects for a specific bss_account_id.
- Use 'spaces' for including all spaces for a specific bss_account_id.
- Use 'total_count' for including total number of public catalogs for a specific account.
Used for listing catalogs, projects, or spaces for the account, or for retrieving the number of public catalogs for the account. Must be supplied when caller is an accredited service.
- If caller is a non-service id user, then the catalogs from that user's bss account are included in the list.
Filter results by user roles. Use comma separated list for multiple roles. Valid role value: admin, editor, viewer. This param is not allowed if caller is from accredited service or ICIP mode is enabled.
orderby. This parameter can be repeated to add additional sort fields.
Default: null
Supported sort fields (these are case insensitive):
- catalog_name_asc
- catalog_name_desc
- create_time_asc
- create_time_desc
- update_time_asc
- update_time_desc
(DEPRECATED) Indicates whether results should include catalogs with a subtype. If not specified, catalogs with a subtype are excluded. This parameter is deprecated. Use 'subtype' instead.
Filter results by catalogs with the specified subtypes.
If nothing is specified, by default catalogs with subtype are not returned.
- If set to 'NO_SUBTYPES', catalogs with subtype are not returned
- If set to 'ALL_SUBTYPES', only catalogs with a subtype are returned
- If set to 'ALL_SUBTYPES, NO_SUBTYPES', catalogs both with and without subtypes are returned
- If set to 'subtypeA, subtypeB' , only catalogs with those subtypes are returned.
Filter results by catalogs with a case insensitive name. This matches substrings.
Filter results by catalogs with the specified governance type. Allowed values: governed, ungoverned, all. Default: all
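As an illustration of how these query parameters combine for a standard caller, the sketch below builds a query string for GET /v2/catalogs with a comma-separated roles filter and a repeated orderby parameter. It is a rough sketch rather than a definitive client: the helper name is hypothetical, and it covers only limit, roles, and orderby.

```python
from urllib.parse import urlencode

SORT_FIELDS = {
    "catalog_name_asc", "catalog_name_desc",
    "create_time_asc", "create_time_desc",
    "update_time_asc", "update_time_desc",
}
ROLES = {"admin", "editor", "viewer"}


def list_catalogs_query(limit=25, roles=(), orderby=()):
    """Build a query string for GET /v2/catalogs (standard caller)."""
    for role in roles:
        if role not in ROLES:
            raise ValueError(f"unsupported role: {role}")
    for field in orderby:
        # Sort fields are documented as case insensitive.
        if field.lower() not in SORT_FIELDS:
            raise ValueError(f"unsupported sort field: {field}")
    params = [("limit", limit)]
    if roles:
        params.append(("roles", ",".join(roles)))  # comma-separated list
    # orderby is repeated once per sort field.
    params.extend(("orderby", field) for field in orderby)
    return urlencode(params)
```

Standard callers omit bss_account_id and include, so the sketch leaves both out.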
Create a catalog
Use this API to create a catalog to organize your assets and collaborators. You can use a catalog to easily find and share data for your organization, regardless of the location or format of the data.
On IBM Cloud, to create a catalog you must have the Manager role in your organization's IBM Knowledge Catalog account.
In CPD, to create a catalog you must have the Administrator role.
The catalog name can be 1 to 256 printable Unicode characters - ISO control characters and Unicode Special characters are not permitted.
On IBM Cloud, IBM Cloud Object Storage (S3 API) credentials are required in the request body's bucket structure. This provides encrypted storage for file assets that are copied to the catalog.
In CPD, file asset storage for a catalog is provisioned after catalog creation, via a separate call to the Asset Files API. Creating a catalog does not automatically provision this storage.
When creating a catalog in CPD, set the check_bucket_existence query parameter to false, and the request body's bucket structure should just have bucket_type set to assetfiles.
After the catalog is created, call the PUT /v2/asset_files API to create a bucket for this catalog.
Sample request body for creating a catalog in CPD:
{
  "name": "catalog1",
  "description": "A catalog in CPD",
  "generator": "catalogadmin@mycompany.com",
  "bucket": {
    "bucket_type": "assetfiles"
  }
}
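The CPD creation step described above can be sketched as a helper that prepares the URL and body for POST /v2/catalogs. This is a minimal sketch under stated assumptions: base_url stands for the address of your CPD cluster (the value in the usage note is a placeholder), the helper name is hypothetical, and only the length part of the catalog name rule is checked.

```python
import json
from urllib.parse import urlencode


def cpd_catalog_request(base_url, name, generator, description=None):
    """Return (url, json_body) for creating a catalog in CPD.

    Sets check_bucket_existence=false and uses the assetfiles bucket type,
    as described above. base_url is the caller's CPD host (an assumption).
    """
    if not 1 <= len(name) <= 256:
        raise ValueError("catalog name must be 1 to 256 characters")
    body = {
        "name": name,
        "generator": generator,
        "bucket": {"bucket_type": "assetfiles"},
    }
    if description:
        body["description"] = description
    query = urlencode({"check_bucket_existence": "false"})
    return f"{base_url}/v2/catalogs?{query}", json.dumps(body)
```

For example, cpd_catalog_request("https://cpd.example.com", "catalog1", "catalogadmin@mycompany.com") yields the POST target and body; the follow-up PUT /v2/asset_files call that provisions storage is not shown because its parameters are not described in this section.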
POST /v2/catalogs
Request
Query Parameters
Whether an existence check should be performed on the catalog bucket
Default:
true
Whether to check if catalogs with the same name already exist
Default:
false
Note: name, generator, and the bucket structure are required fields in the request body.
Name of the Catalog instance
Generator of the Catalog instance
Description of the Catalog instance
Project id
Space id
Account ID
Capacity Limit
Indicates if the assets in the Catalog are governed. If not specified, it defaults to false. For private catalogs, 'false' should always be passed, because private catalogs cannot be governed.
Indicates if the assets in the Catalog are allowed for profiling. if not specified
SAML instance type
UID
Response
Metadata of the Catalog instance
- metadata
GUID of the Catalog instance
URL of the Catalog instance
Example:
ibmid-50h088ud1b
Created time
Information about archiving the catalog
- archive_info
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Entity of the Catalog instance
- entity
Name of the Catalog instance
Generator of the Catalog instance
Description of the Catalog instance
Project id
Space id
Account ID
Capacity Limit
Indicates if the assets in the Catalog are governed. If not specified, it defaults to false. For private catalogs, 'false' should always be passed, because private catalogs cannot be governed.
Indicates if the assets in the Catalog are allowed for profiling. if not specified
SAML instance type
UID
URL of the Catalog instance
Status Code
Created
Bad Request
Unauthorized
Forbidden
Too Many Requests
No Sample Response
Response
Metadata of the Catalog instance
- metadata
GUID of the Catalog instance
URL of the Catalog instance
Example:
ibmid-50h088ud1b
Created time
Information about archiving the catalog
- archive_info
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Entity of the Catalog instance
- entity
Name of the Catalog instance
Generator of the Catalog instance
Description of the Catalog instance
Project id
Space id
Account ID
Capacity Limit
Indicates if the assets in the Catalog are governed. If not specified, it defaults to false. For private catalogs, 'false' should always be passed, because private catalogs cannot be governed.
Indicates if the assets in the Catalog are allowed for profiling. if not specified
SAML instance type
UID
URL of the Catalog instance
Status Code
OK
Unauthorized
Forbidden
Not Found
Internal Server Error
No Sample Response
Request
Query Parameters
Whether an existence check should be performed on the catalog bucket
Default:
true
Whether to check if catalogs with the same name already exist
Default:
false
Example:
{
  "catalog": {
    "name": "The Default Catalog",
    "generator": "API",
    "description": "The default catalog",
    "bss_account_id": "999",
    "is_governed": true,
    "bucket": {
      "bucket_type": "assetfiles"
    }
  }
}
Schema for catalogEntity with bucket details
Rehome one or more synced assets.
Use this API for rehoming assets in a catalog that had been synced from an external OMRS repository to be native IKC assets. The API supports rehoming a single asset, or a bulk rehome of all synced data assets of a specific external repository.
If the request is accepted, the response header contains the rehome request ID x-future-id, which can then be used to monitor the request status via /v2/catalogs/default/rehome/status?request_id=<x-future-id>.
Service ID is required to call this API.
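The hand-off from the rehome request to its status check can be sketched as follows. The helper takes the response headers of an accepted request and builds the status URL from the x-future-id value; the function name and the base_url argument are illustrative, while the header name and the status path come from the description above.

```python
from urllib.parse import urlencode


def rehome_status_url(base_url, response_headers):
    """Build the status URL from the x-future-id response header."""
    # HTTP header names are case insensitive, so normalize before lookup.
    headers = {name.lower(): value for name, value in response_headers.items()}
    request_id = headers.get("x-future-id")
    if not request_id:
        raise KeyError("x-future-id header not found in the response")
    query = urlencode({"request_id": request_id})
    return f"{base_url}/v2/catalogs/default/rehome/status?{query}"
```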
POST /v2/catalogs/default/rehome
Request
Query Parameters
Set to true to bulk re-home all synced data assets in a catalog
Synced asset to be rehomed - if omitted, all synced assets of the source_metadata_collection_id repository will be rehomed
Source metadata collection id required if all assets of a single external repository need to be rehomed
Catalog containing the synced assets to be re-homed. If not specified, synced assets in the default catalog are re-homed.
Get a catalog by catalog ID
Members of the catalog can use this API to retrieve information about a catalog.
GET /v2/catalogs/{catalog_id}
Request
Path Parameters
Catalog GUID or UID
Query Parameters
You must provide bss_account_id when querying for uid
Response
Metadata of the Catalog instance
- metadata
GUID of the Catalog instance
URL of the Catalog instance
Example:
ibmid-50h088ud1b
Created time
Information about archiving the catalog
- archive_info
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Entity of the Catalog instance
- entity
Name of the Catalog instance
Generator of the Catalog instance
Description of the Catalog instance
Project id
Space id
Account ID
Capacity Limit
Indicates if the assets in the Catalog are governed. If not specified, it defaults to false. For private catalogs, 'false' should always be passed, because private catalogs cannot be governed.
Indicates if the assets in the Catalog are allowed for profiling. if not specified
SAML instance type
UID
URL of the Catalog instance
Status Code
OK
Bad Request
Unauthorized
Forbidden
No Sample Response
Update catalog
Use this API to update the name or description of the catalog.
PATCH /v2/catalogs/{catalog_id}
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json]
Path Parameters
catalog GUID
Query Parameters
Whether to check if catalogs with the same name already exist. Only applicable if the name is to be updated.
Default:
false
JSON array of patch operations as defined in RFC 6902.
[
{ "op": "replace", "path": "/entity/name", "value": "new-name" },
{ "op": "replace", "path": "/entity/description", "value": "new-description" }
]
Note:
Any ‘~’ characters need to be escaped as ~0 in the path field.
Any ‘/’ characters need to be escaped as ~1 in the path field.
For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is "/foo~1/~0bar".
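The escaping rule in the note above follows RFC 6901 (JSON Pointer), and it can be applied mechanically when building patch paths. The sketch below shows the escaping and a small helper for a replace operation; the helper names are illustrative and not part of the API.

```python
def escape_token(token):
    """Escape one reference token: '~' becomes '~0', then '/' becomes '~1'.

    The order matters: escaping '/' first would turn the '~' of '~1' into '~0'.
    """
    return token.replace("~", "~0").replace("/", "~1")


def json_pointer(*tokens):
    """Join escaped tokens into a JSON Pointer path."""
    return "".join("/" + escape_token(token) for token in tokens)


def replace_op(tokens, value):
    """Build one RFC 6902 replace operation for the given path tokens."""
    return {"op": "replace", "path": json_pointer(*tokens), "value": value}
```

For the document in the note, json_pointer("foo/", "~bar") gives "/foo~1/~0bar", and replace_op(["entity", "name"], "new-name") reproduces the first operation of the sample patch.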
Response
Metadata of the Catalog instance
- metadata
GUID of the Catalog instance
URL of the Catalog instance
Example:
ibmid-50h088ud1b
Created time
Information about archiving the catalog
- archive_info
Possible values: [ARCHIVE_SCHEDULED, ARCHIVE_FAILED, ARCHIVED, RESTORE_SCHEDULED, RESTORE_FAILED]
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
Example:
2023-05-19T01:45:20.000Z
Example:
1684460720000
The times that the archive/restore process failed for the asset container. Note that a successful archive/restore will reset this field.
The details of the error.
Entity of the Catalog instance
- entity
Name of the Catalog instance
Generator of the Catalog instance
Description of the Catalog instance
Project id
Space id
Account ID
Capacity Limit
Indicates if the assets in the Catalog are governed. If not specified, it defaults to false. For private catalogs, 'false' should always be passed, because private catalogs cannot be governed.
Indicates if the assets in the Catalog are allowed for profiling. if not specified
SAML instance type
UID
URL of the Catalog instance
Status Code
OK
Bad Request
Unauthorized
Forbidden
Conflict
Internal Server Error
No Sample Response
Update catalog properties
Use this API to patch catalog properties.
PATCH /v2/catalogs/{catalog_id}/properties
Request
Custom Headers
Allowable values: [application/json, application/json-patch+json]
Path Parameters
catalog GUID
JSON array of patch operations as defined in RFC 6902.
[
{ "op": "add", "path": "/properties/new-property", "value": "new-value" }
]
Note:
Any ‘~’ characters need to be escaped as ~0 in the path field.
Any ‘/’ characters need to be escaped as ~1 in the path field.
For example, in {"foo/" : {"~bar" : "value"}}, the path for "~bar" is "/foo~1/~0bar".
Export as a zip file containing appropriate csv files for the selected assets
This API is responsible for preparing a zip file containing csv files related to the selected assets for export.
POST /v2/catalogs/export_assets
Request
Query Parameters
You must provide either a catalog id, a project id, or a space id, but not more than one
The id of the ibm_export_asset type asset
The list of export types the operation is expected to handle. Export types include metadata, relationship, and lineage, but currently only metadata and relationship are supported.
A json object containing a list of asset IDs; maximum of 100.
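Since a single export request accepts at most 100 asset IDs, a larger selection has to be split across several requests. The sketch below shows one way to batch the IDs; the body key "asset_ids" is a hypothetical name, because the reference does not show the field name of the JSON object.

```python
def export_batches(asset_ids, max_per_request=100):
    """Split asset IDs into request bodies of at most max_per_request IDs.

    The "asset_ids" key is an assumed field name, used for illustration only.
    """
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    ids = list(asset_ids)
    return [
        {"asset_ids": ids[start:start + max_per_request]}
        for start in range(0, len(ids), max_per_request)
    ]
```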
Process an import asset
This API begins processing of import assets. This function is used to import lineage mappings from EMD CSV documents.
POST /v2/catalogs/import_assets
Request
Query Parameters
The id of the import_asset
The action to perform on this import_asset. "process" (default): Validate and run an import job, "validate": Validate an import only without starting an import job.
You must provide either a catalog id, a project id, or a space id, but not more than one
When performing a processing request, proceed with the import even if validation fails. This option is only used when 'action' = 'process' (which is the default case)
Action to take if the call would result in a duplicate asset.
IGNORE means the call will ignore the duplicate and create a new asset.
REJECT means the call will fail and no asset will be created.
UPDATE means the best matched duplicate will be updated with the incoming changes according to the predefined rules.
REPLACE means the best matched duplicate will be overwritten with the input values according to the predefined rules.
No value means the duplicate_action specified in catalogs/projects/spaces will be used.
Allowable values: [IGNORE, REJECT, UPDATE, REPLACE]
Request temporary security credentials
Request temporary security credentials that can be used for storing asset attachment files.
The endpoint is currently only applicable for AWS deployments.
If role_arn and external_id are supplied, the caller must be a service id in the allow-list 'aws_s3_administrator'. The requested credentials will have permission for accessing the resources in the bucket of the catalog or the bucket itself depending on the supplied information. The allowed combinations of the fields are:
- role_arn, external_id: list and create buckets in the account that the role is defined in
- role_arn, external_id, bucket_name, shared (optional, default to false), catalog_id (must be supplied if shared is true): put/get/list/delete objects in the bucket, and list multipart uploads and delete the bucket if the bucket is not shared.
If catalog_id is supplied, the caller must be a member of the catalog, or a service id that is in the allow-list 'aws_s3_administrator', 'accredited_service_editors', 'accredited_service_viewers', or 'accredited_service_exporters'. The requested credentials will have permission for accessing the resources in the bucket of the catalog or the bucket itself based on the membership.
- admin or service id in allow-list 'aws_s3_administrator': put/get/list/delete objects in the bucket, and list multipart uploads and delete the bucket if the bucket is not shared.
- editor or service id in allow-list 'accredited_service_editors': put/get/delete objects in the bucket
- viewer or service id in allow-list 'accredited_service_viewers' or 'accredited_service_exporters': get objects in the bucket
If none of above is supplied, the caller must have permission to create catalogs in the account that the access token is scoped for. The requested credentials will have permission to list and create buckets in the account.
If min_duration_seconds is supplied, the requested temporary credentials must be valid for at least the specified duration. If it is not supplied, or if the cached temporary credentials for the same requirement are valid for at least the specified duration, the cached credentials will be returned.
Note: when a bucket is shared, the objects in the shared bucket must have the prefix of '{catalog_id}/', e.g. '553fa5a8-bea5-43fa-9f04-b2f5ed39e6d3/myreport.csv'. All the permissions in shared buckets are granted to only allow accessing resources with the prefix of a specific catalog id.
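The prefix rule for shared buckets can be sketched with two small helpers: one that builds an object key under a catalog's prefix, and one that checks whether a key falls inside that prefix. The function names are illustrative only.

```python
def shared_object_key(catalog_id, object_name):
    """Build the key for an object stored in a shared bucket.

    Objects in a shared bucket must carry the '{catalog_id}/' prefix.
    """
    return f"{catalog_id}/{object_name.lstrip('/')}"


def is_within_catalog(key, catalog_id):
    """Return True if the key lies under the given catalog's prefix."""
    return key.startswith(f"{catalog_id}/")
```

Using the catalog id from the note, shared_object_key("553fa5a8-bea5-43fa-9f04-b2f5ed39e6d3", "myreport.csv") returns the example key shown above.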
POST /v2/catalogs/temporary_credentials
Request
Request body
Catalog id
ARN of the AWS integration role
External id for AWS integration
Name of the bucket
Indicate if the bucket is shared by multiple asset containers
Minimum duration (in seconds) that the credentials should be valid for. The value cannot be greater than 3600 seconds. If not supplied, cached credentials that have the same requested permission can be returned.
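The field combinations described for this endpoint can be checked before the request body is sent. The sketch below encodes one reading of those rules: role_arn and external_id travel together, bucket_name is only used alongside them, catalog_id is required when shared is true, and min_duration_seconds cannot exceed 3600. The field names come from the description above; the helper itself is hypothetical and the service remains the authority on what it accepts.

```python
MAX_DURATION_SECONDS = 3600


def temporary_credentials_body(catalog_id=None, role_arn=None, external_id=None,
                               bucket_name=None, shared=False,
                               min_duration_seconds=None):
    """Validate the documented field combinations and return the request body."""
    if (role_arn is None) != (external_id is None):
        raise ValueError("role_arn and external_id must be supplied together")
    if bucket_name is not None and role_arn is None:
        raise ValueError("bucket_name requires role_arn and external_id")
    if shared and catalog_id is None:
        raise ValueError("catalog_id must be supplied when shared is true")
    if min_duration_seconds is not None and not (
        0 < min_duration_seconds <= MAX_DURATION_SECONDS
    ):
        raise ValueError("min_duration_seconds must be between 1 and 3600")
    body = {
        "catalog_id": catalog_id,
        "role_arn": role_arn,
        "external_id": external_id,
        "bucket_name": bucket_name,
        "min_duration_seconds": min_duration_seconds,
    }
    body = {key: value for key, value in body.items() if value is not None}
    if bucket_name is not None:
        body["shared"] = shared  # only meaningful when a bucket is named
    return body
```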
Response
AWS access key, used to identify the user interacting with AWS
AWS secret access key, used to authenticate the user interacting with AWS
AWS session token. This token is retrieved from an AWS token service, and is used for authenticating that this user has received temporary permission to access some resource.
The epoch timestamp when the credentials will expire.
Session policy applied on the role session created for the temporary credentials. The field is available only in development environments, not in production environments, for performance reasons.
Status Code
OK
Bad Request
Unauthorized
Forbidden
Internal Server Error