wx-data Commands and Usage
The wx-data command provides a set of commands that you can use to perform operations specific to IBM® watsonx.data. This topic lists those commands with a brief description of the tasks that each one performs.
The wx-data command performs operations such as ingesting data and managing engines, storage, and data sources in watsonx.data.
Syntax:
./cpdctl wx-data [command] [options]
The wx-data command supports the following commands:
ingestion
engine
bucket
database
sparkjob
tablemaint
service
How to Use wx-data
Command --help (-h)
To list all the commands in the wx-data plugin:
./cpdctl wx-data --help
To get details of all options and their descriptions for a specific command in the wx-data plugin:
./cpdctl wx-data [command] --help
For example:
./cpdctl wx-data ingestion -h
NAME:
ingestion - Commands for Ingestion resource.
USAGE:
cpdctl wx-data ingestion [action]
COMMANDS:
list List ingestion jobs.
create Create an ingestion job.
get Get ingestion job details.
GLOBAL OPTIONS:
--cpd-config string Configuration file path
--cpdconfig string [Deprecated] Use --cpd-config instead
-h, --help Show help
--profile string Name of the configuration profile to use
-q, --quiet Suppresses verbose messages.
--raw-output If set to true, single values in JSON output mode are not surrounded by quotes
Use "cpdctl wx-data ingestion service-command --help" for more information about a command.
To get details of all available options and arguments for a specific operation in a wx-data command:
./cpdctl wx-data [command] [options] --help
To use the wx-data plugin to execute an operation:
./cpdctl wx-data [command] [options]
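As a minimal sketch of the help workflow above, the following script prints the help command for each of the seven wx-data commands listed in this topic. The CPDCTL and DRY_RUN variables and the wx_help_all function are illustrative names, not part of cpdctl. With DRY_RUN=1 (the default in this sketch) the commands are only echoed, so you can review them before anything is executed.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the help command for every wx-data command.
# CPDCTL, DRY_RUN, and wx_help_all are illustrative names, not cpdctl features.
CPDCTL="${CPDCTL:-./cpdctl}"
DRY_RUN="${DRY_RUN:-1}"

# Command names copied from the list in this topic.
WX_COMMANDS="ingestion engine bucket database sparkjob tablemaint service"

wx_help_all() {
  for cmd in $WX_COMMANDS; do
    if [ "$DRY_RUN" = "1" ]; then
      # Dry run: only show the command that would be executed.
      echo "$CPDCTL wx-data $cmd --help"
    else
      "$CPDCTL" wx-data "$cmd" --help
    fi
  done
}

wx_help_all
```

Set DRY_RUN=0 to run the help commands against an installed cpdctl binary instead of echoing them.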
ingestion
The ingestion command runs ingestion operations in watsonx.data.
Syntax:
./cpdctl wx-data ingestion [options]
The ingestion command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data ingestion list | Lists the ingestion jobs executed in the watsonx.data instance. |
./cpdctl wx-data ingestion create | Creates an ingestion job in the watsonx.data instance. |
./cpdctl wx-data ingestion get | Gets the details of an ingestion job executed in the watsonx.data instance. |
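The ingestion help output shown earlier lists the global options --profile and --quiet. As a hedged sketch, the helper below builds an ingestion list command line with those options added when requested. The CPDCTL variable and the ingestion_list_cmd function are illustrative names, and "my-profile" is a placeholder profile name. Options specific to create and get are not covered in this topic, so check ./cpdctl wx-data ingestion [action] --help for them.

```shell
#!/usr/bin/env bash
# Sketch: build the "ingestion list" command line with optional global options.
# CPDCTL and ingestion_list_cmd are illustrative names, not part of cpdctl.
CPDCTL="${CPDCTL:-./cpdctl}"

ingestion_list_cmd() {
  profile="${1:-}"   # optional configuration profile name
  quiet="${2:-}"     # pass the word "quiet" to add --quiet
  cmd="$CPDCTL wx-data ingestion list"
  if [ -n "$profile" ]; then
    cmd="$cmd --profile $profile"
  fi
  if [ "$quiet" = "quiet" ]; then
    cmd="$cmd --quiet"
  fi
  echo "$cmd"
}

# Print the command lines; copy one into a shell to run it.
ingestion_list_cmd
ingestion_list_cmd my-profile quiet   # "my-profile" is a placeholder
```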
engine
The engine command runs engine-related operations in watsonx.data.
Syntax:
./cpdctl wx-data engine [options]
The engine command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data engine list | Lists all the engines available in the watsonx.data instance. |
./cpdctl wx-data engine create | Creates or registers an engine in the watsonx.data instance. |
./cpdctl wx-data engine delete | Deletes an engine from the watsonx.data instance. |
./cpdctl wx-data engine attach | Associates catalogs with a Presto engine in the watsonx.data instance. |
./cpdctl wx-data engine detach | Disassociates the catalogs that are associated with a Presto engine in the watsonx.data instance. |
bucket
The bucket command runs storage-related operations in watsonx.data.
Syntax:
./cpdctl wx-data bucket [options]
The bucket command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data bucket list | Lists all the storages available in the watsonx.data instance. |
./cpdctl wx-data bucket create | Registers a storage in the watsonx.data instance. |
./cpdctl wx-data bucket get | Gets the details of a registered storage in the watsonx.data instance. |
./cpdctl wx-data bucket delete | Deletes a storage from the watsonx.data instance. |
./cpdctl wx-data bucket activate | Activates a storage bucket. Applies to watsonx.data on IBM Cloud instances only. |
./cpdctl wx-data bucket deactivate | Deactivates a storage bucket. Applies to watsonx.data on IBM Cloud instances only. |
database
The database command runs data source-related operations in watsonx.data.
Syntax:
./cpdctl wx-data database [options]
The database command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data database list | Lists all the data sources available in the watsonx.data instance. |
./cpdctl wx-data database create | Creates or adds a data source in the watsonx.data instance. |
./cpdctl wx-data database get | Gets the details of a registered data source in the watsonx.data instance. |
./cpdctl wx-data database delete | Deletes a data source from the watsonx.data instance. |
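The engine, bucket, and database commands each provide a list command, so one pass over all three gives a quick inventory of an instance. The sketch below assumes that list needs no further options, which this topic does not confirm; run ./cpdctl wx-data [command] list --help to check. The CPDCTL variable and the wx_inventory function are illustrative names. If no executable is found at the CPDCTL path, the script only reports the commands it would have run.

```shell
#!/usr/bin/env bash
# Sketch: list engines, storages, and data sources in one pass.
# CPDCTL and wx_inventory are illustrative names, not part of cpdctl.
CPDCTL="${CPDCTL:-./cpdctl}"

wx_inventory() {
  for resource in engine bucket database; do
    if [ -x "$CPDCTL" ]; then
      # Binary is present: run the read-only list command.
      "$CPDCTL" wx-data "$resource" list
    else
      # Binary is missing: report what would have been run.
      echo "cpdctl not found at $CPDCTL; would run: $CPDCTL wx-data $resource list"
    fi
  done
}

wx_inventory
```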
sparkjob
The sparkjob command runs Spark-related operations in watsonx.data version 2.1.2 and later.
Syntax:
./cpdctl wx-data sparkjob [options]
The sparkjob command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data sparkjob list | Lists all applications available in a Spark engine. |
./cpdctl wx-data sparkjob create | Submits a Spark application. |
./cpdctl wx-data sparkjob get | Gets the status of a Spark application. |
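Because sparkjob requires watsonx.data version 2.1.2 or later, a script that wraps these commands might guard them with a version check. The following is a sketch only: version_ge and supports_sparkjob are hypothetical helpers, the version string is supplied by you (this topic does not describe how to query it), and the comparison assumes purely numeric, dot-separated versions.

```shell
#!/usr/bin/env bash
# Sketch: compare a version string with the 2.1.2 minimum stated for sparkjob.
# version_ge and supports_sparkjob are hypothetical helpers, not cpdctl features.
# Assumes numeric, dot-separated versions with up to three fields (for example 2.1.2).

version_ge() {
  # usage: version_ge ACTUAL MINIMUM; returns 0 when ACTUAL >= MINIMUM
  for i in 1 2 3; do
    # A trailing dot is appended so that missing fields come back empty.
    a=$(printf '%s.' "$1" | cut -d. -f"$i")
    b=$(printf '%s.' "$2" | cut -d. -f"$i")
    a=${a:-0}
    b=${b:-0}
    if [ "$a" -gt "$b" ]; then return 0; fi
    if [ "$a" -lt "$b" ]; then return 1; fi
  done
  return 0
}

supports_sparkjob() {
  version_ge "$1" "2.1.2"
}

# "2.2.0" is a sample value; substitute the version of your instance.
if supports_sparkjob "2.2.0"; then
  echo "sparkjob commands are available"
fi
```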
For more information about performing Spark table maintenance by using IBM cpdctl, see Spark table maintenance (watsonx.data on IBM Software Hub) and Spark table maintenance (watsonx.data on IBM Cloud).
tablemaint
The tablemaint command runs Iceberg table maintenance operations in watsonx.data.
Currently, this command applies only to Amazon S3 storage.
Syntax:
./cpdctl wx-data tablemaint [options]
The tablemaint command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data tablemaint rollback_to_snapshot | Rolls back (restores) the table to a specific snapshot ID. |
./cpdctl wx-data tablemaint rollback_to_timestamp | Rolls back a table to the snapshot at a specific timestamp. |
./cpdctl wx-data tablemaint set_current_snapshot | Sets the current snapshot ID for a table. |
./cpdctl wx-data tablemaint cherrypick_snapshot | Cherry-picks changes from a snapshot into the current table state. Cherry-picking creates a new snapshot from an existing snapshot without altering or removing the original. |
./cpdctl wx-data tablemaint expire_snapshot | Removes older snapshots and their files that are no longer needed. |
./cpdctl wx-data tablemaint remove_orphan | Removes files that are not referenced in any metadata file of an Iceberg table and can therefore be considered "orphaned". |
./cpdctl wx-data tablemaint rewrite_data | Rewrites the data files. |
./cpdctl wx-data tablemaint rewrite_manifests | Rewrites manifests for a table to optimize scan planning. |
./cpdctl wx-data tablemaint register_table | Creates a table. |
The following flags are listed when you run each table maintenance command:
- Force: If the value is set to TRUE, the SQL query that you are going to run is printed.
- Debug: If the value is set to TRUE, a copy of the Spark application file is stored on your computer.
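This topic does not give the exact option spelling for the table maintenance commands, so the safest first step is to display the help for the command that you intend to run. As a sketch, the helper below checks a name against the tablemaint commands listed in the table above and prints the matching help command. The CPDCTL and TABLEMAINT_ACTIONS variables and the tablemaint_help_cmd function are illustrative names, not part of cpdctl.

```shell
#!/usr/bin/env bash
# Sketch: validate a tablemaint command name, then print its help command.
# CPDCTL, TABLEMAINT_ACTIONS, and tablemaint_help_cmd are illustrative names.
CPDCTL="${CPDCTL:-./cpdctl}"

# Names copied from the tablemaint table in this topic.
TABLEMAINT_ACTIONS="rollback_to_snapshot rollback_to_timestamp set_current_snapshot cherrypick_snapshot expire_snapshot remove_orphan rewrite_data rewrite_manifests register_table"

tablemaint_help_cmd() {
  action="$1"
  for known in $TABLEMAINT_ACTIONS; do
    if [ "$action" = "$known" ]; then
      echo "$CPDCTL wx-data tablemaint $action --help"
      return 0
    fi
  done
  # Names outside the documented list are rejected.
  echo "unknown tablemaint command: $action" >&2
  return 1
}

tablemaint_help_cmd expire_snapshot
```

Rejecting unknown names up front catches typos such as a missing underscore before a command is sent to the plugin.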
service
The service command runs serviceability-related operations in watsonx.data.
Syntax:
./cpdctl wx-data service [options]
The service command supports the following commands:
Command | Description |
---|---|
./cpdctl wx-data service list-tables | Lists all table names of Hive or Iceberg connectors in the watsonx.data instance. |
./cpdctl wx-data service get-qhmm-config | Gets the name of the QHMM-enabled bucket in the watsonx.data instance. |
./cpdctl wx-data service monitor | Runs stats and QHMM-related queries in the watsonx.data instance. |