IBM Cloud Docs
wx-data Commands and Usage

The wx-data command provides a set of commands for performing operations specific to IBM® watsonx.data. This topic lists those commands with a brief description of the tasks that each one performs.

The wx-data command performs operations such as ingesting data, managing engines, storage, and data sources in watsonx.data.

Syntax:

./cpdctl wx-data [command] [options]

The wx-data command supports the following commands:

  • ingestion
  • engine
  • bucket
  • database
  • sparkjob
  • tablemaint
  • service

How to Use the wx-data --help (-h) Option

To list all the commands in the wx-data plugin:

./cpdctl wx-data --help

To get details of all options and their descriptions for a specific command in the wx-data plugin:

./cpdctl wx-data [command] --help

For example:

./cpdctl wx-data ingestion -h
NAME:
   ingestion - Commands for Ingestion resource.

USAGE:
   cpdctl wx-data ingestion [action]

COMMANDS:
   list     List ingestion jobs.
   create   Create an ingestion job.
   get      Get ingestion job details.

GLOBAL OPTIONS:
      --cpd-config string   Configuration file path
      --cpdconfig string    [Deprecated] Use --cpd-config instead
  -h, --help                Show help
      --profile string      Name of the configuration profile to use
  -q, --quiet               Suppresses verbose messages.
      --raw-output          If set to true, single values in JSON output mode are not surrounded by quotes

Use "cpdctl wx-data ingestion service-command --help" for more information about a command.
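The global options shown in the help output can be combined with an action. The following is a minimal sketch, assuming that the global options are accepted after the action (this topic does not confirm the placement). The helper name ingestion_list_cmd and the profile name my-profile are hypothetical. The sketch only prints the command line; it does not run cpdctl.

```shell
#!/bin/sh
# Path to the cpdctl binary; the examples in this topic use ./cpdctl.
CPDCTL="${CPDCTL:-./cpdctl}"

# Build an "ingestion list" invocation for a named configuration profile.
# --profile and --quiet are the global options shown in the help output;
# placing them after the action is an assumption.
ingestion_list_cmd() {
  profile="$1"
  if [ -z "$profile" ]; then
    echo "usage: ingestion_list_cmd <profile>" >&2
    return 2
  fi
  echo "$CPDCTL wx-data ingestion list --profile $profile --quiet"
}

# "my-profile" is a placeholder, not a profile that exists by default.
ingestion_list_cmd my-profile
```

Review the printed command before you run it against a watsonx.data instance.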

To get the details of all available options and arguments for a specific operation of a wx-data command:

./cpdctl wx-data [command] [options] --help

To use the wx-data plugin to execute an operation:

./cpdctl wx-data [command] [options]
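The [command] [options] pattern can be wrapped in a small helper that rejects command names that are not in the list of supported commands. This is a sketch only: the function names is_wx_data_command and wx_data and the DRY_RUN variable are hypothetical and are not part of cpdctl. By default the helper prints the command line instead of running it.

```shell
#!/bin/sh
# Path to the cpdctl binary; the examples in this topic use ./cpdctl.
CPDCTL="${CPDCTL:-./cpdctl}"

# Return 0 if $1 is one of the wx-data commands listed in this topic.
is_wx_data_command() {
  case "$1" in
    ingestion|engine|bucket|database|sparkjob|tablemaint|service) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical wrapper: check the command name, then print (DRY_RUN=1,
# the default) or run (DRY_RUN=0) the full cpdctl invocation.
wx_data() {
  if ! is_wx_data_command "$1"; then
    echo "unknown wx-data command: $1" >&2
    return 2
  fi
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CPDCTL wx-data $*"
  else
    "$CPDCTL" wx-data "$@"
  fi
}

wx_data ingestion list
wx_data bucket list
```

Set DRY_RUN=0 to run the commands instead of printing them.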

ingestion

The ingestion command runs ingestion operations in watsonx.data.

Syntax:

./cpdctl wx-data ingestion [options]

The ingestion command supports the following commands:

Supported commands by ingestion
Command                            Description
./cpdctl wx-data ingestion list    List the ingestion jobs executed in the watsonx.data instance.
./cpdctl wx-data ingestion create  Create an ingestion job in the watsonx.data instance.
./cpdctl wx-data ingestion get     Get the details of an ingestion job executed in the watsonx.data instance.

engine

The engine command runs engine-related operations in watsonx.data.

Syntax:

./cpdctl wx-data engine [options]

The engine command supports the following commands:

Supported commands by engine
Command                         Description
./cpdctl wx-data engine list    List all the engines available in the watsonx.data instance.
./cpdctl wx-data engine create  Create or register an engine in the watsonx.data instance.
./cpdctl wx-data engine delete  Delete an engine from the watsonx.data instance.
./cpdctl wx-data engine attach  Associate catalogs with a Presto engine in the watsonx.data instance.
./cpdctl wx-data engine detach  Disassociate catalogs from a Presto engine in the watsonx.data instance.

bucket

The bucket command runs storage-related operations in watsonx.data.

Syntax:

./cpdctl wx-data bucket [options]

The bucket command supports the following commands:

Supported commands by bucket
Command                             Description
./cpdctl wx-data bucket list        List all the storage available in the watsonx.data instance.
./cpdctl wx-data bucket create      Register a storage in the watsonx.data instance.
./cpdctl wx-data bucket get         Get the details of a registered storage in the watsonx.data instance.
./cpdctl wx-data bucket delete      Delete a storage from the watsonx.data instance.
./cpdctl wx-data bucket activate    Activate a storage bucket (watsonx.data on IBM Cloud only).
./cpdctl wx-data bucket deactivate  Deactivate a storage bucket (watsonx.data on IBM Cloud only).

database

The database command runs data source-related operations in watsonx.data.

Syntax:

./cpdctl wx-data database [options]

The database command supports the following commands:

Supported commands by database
Command                           Description
./cpdctl wx-data database list    List all the data sources available in the watsonx.data instance.
./cpdctl wx-data database create  Create or add a data source in the watsonx.data instance.
./cpdctl wx-data database get     Get the details of a registered data source in the watsonx.data instance.
./cpdctl wx-data database delete  Delete a data source from the watsonx.data instance.

sparkjob

The sparkjob command runs Spark-related operations in watsonx.data version 2.1.2 and later.

Syntax:

./cpdctl wx-data sparkjob [options]

The sparkjob command supports the following commands:

Supported commands by sparkjob
Command                           Description
./cpdctl wx-data sparkjob list    List all applications available in a Spark engine.
./cpdctl wx-data sparkjob create  Submit a Spark application.
./cpdctl wx-data sparkjob get     Get the status of a Spark application.

For more information about how to perform Spark table maintenance by using IBM cpdctl, see the Spark table maintenance topic for watsonx.data on IBM Software Hub or the Spark table maintenance topic for watsonx.data on IBM Cloud.

tablemaint

The tablemaint command runs Iceberg table maintenance operations in watsonx.data.

Currently, the tablemaint command applies only to Amazon S3 storage.

Syntax:

./cpdctl wx-data tablemaint [options]

The tablemaint command supports the following commands:

Supported commands by tablemaint
Command                                            Description
./cpdctl wx-data tablemaint rollback_to_snapshot   Roll back (restore) the table to a specific snapshot ID.
./cpdctl wx-data tablemaint rollback_to_timestamp  Roll back a table to the snapshot at a specific timestamp.
./cpdctl wx-data tablemaint set_current_snapshot   Set the current snapshot ID for a table.
./cpdctl wx-data tablemaint cherrypick_snapshot    Cherry-pick changes from a snapshot into the current table state. Cherry-picking creates a new snapshot from an existing snapshot without altering or removing the original.
./cpdctl wx-data tablemaint expire_snapshot        Remove older snapshots and their files that are no longer needed.
./cpdctl wx-data tablemaint remove_orphan          Remove files that are not referenced in any metadata files of an Iceberg table and can thus be considered "orphaned".
./cpdctl wx-data tablemaint rewrite_data           Rewrite the data files.
./cpdctl wx-data tablemaint rewrite_manifests      Rewrite manifests for a table to optimize scan planning.
./cpdctl wx-data tablemaint register_table         Create a table.

The following flags are listed when you run each table maintenance command:

  • Force: If the value is set to TRUE, the SQL query that you are about to run is printed.

  • Debug: If the value is set to TRUE, a copy of the Spark application file is stored on your computer.
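To see how these flags are spelled for a given table maintenance command, you can inspect the help of each one. The following is a minimal sketch, assuming that every tablemaint command accepts --help in the same way as the other wx-data commands. The function name tablemaint_help_all and the DRY_RUN variable are hypothetical. By default the sketch prints the help invocations instead of running them.

```shell
#!/bin/sh
# Path to the cpdctl binary; the examples in this topic use ./cpdctl.
CPDCTL="${CPDCTL:-./cpdctl}"

# The nine table maintenance commands listed in this topic.
TABLEMAINT_ACTIONS="rollback_to_snapshot rollback_to_timestamp set_current_snapshot
cherrypick_snapshot expire_snapshot remove_orphan rewrite_data rewrite_manifests
register_table"

# Print (DRY_RUN=1, the default) or run (DRY_RUN=0) the help invocation
# for each table maintenance command so that its flags can be inspected.
tablemaint_help_all() {
  for action in $TABLEMAINT_ACTIONS; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "$CPDCTL wx-data tablemaint $action --help"
    else
      "$CPDCTL" wx-data tablemaint "$action" --help
    fi
  done
}

tablemaint_help_all
```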

service

The service command runs serviceability-related operations in watsonx.data.

Syntax:

./cpdctl wx-data service [options]

The service command supports the following commands:

Supported commands by service
Command                                   Description
./cpdctl wx-data service list-tables      List all table names of Hive or Iceberg connectors in the watsonx.data instance.
./cpdctl wx-data service get-qhmm-config  Get the name of the QHMM-enabled bucket in the watsonx.data instance.
./cpdctl wx-data service monitor          Run statistics and QHMM-related queries in the watsonx.data instance.