Backing up and restoring databases (version 4.0.x)

You can back up and restore databases in IBM Watson® Knowledge Studio for IBM Cloud Pak® for Data version 4.0.x by running scripts.

The all-backup-restore.sh script backs up or restores all the databases. It deactivates the pods to prevent access during the operation, and then reactivates them. If you use the individual database scripts instead, you must deactivate and reactivate the pods yourself.

Restoring a backup from a different instance causes errors. Restore only a backup that was created from the same IBM Watson® Knowledge Studio instance.

Before you begin

  • Download the scripts.

  • Review how the scripts use the MinIO client (mc). The client is required for the MinIO commands.

    • The scripts download the client from the MinIO website if the client isn't installed.
    • If you want the scripts to use your installed version, verify that you can run the client by entering mc on the command line.
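
One way to check whether a client is already on the PATH is sketched below. The function name mc_status is hypothetical and not part of the Knowledge Studio scripts. Some systems install Midnight Commander under the same mc command name, so running mc --version afterward is a reasonable extra check.

```shell
# Report whether a client binary is on the PATH.
# mc_status is a hypothetical helper; the argument defaults to "mc".
mc_status() {
  client="${1:-mc}"
  if command -v "$client" >/dev/null 2>&1; then
    echo "installed"
  else
    echo "missing"
  fi
}

case "$(mc_status)" in
  installed) echo "mc found on PATH; the scripts can use the installed client." ;;
  missing)   echo "mc not found; the scripts download the client from the MinIO website." ;;
esac
```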

all-backup-restore script

The all-backup-restore.sh script backs up and restores the MongoDB, PostgreSQL, and MinIO databases, and the persistent volume claim (PVC) data.

Unless you need to back up only a single database, use the all-backup-restore script, which deactivates and reactivates Knowledge Studio for you.

The script backs up or restores the data in the following order:

  1. MongoDB
  2. PostgreSQL
  3. MinIO
  4. PVC

all-backup-restore.sh [backup|restore] [RELEASE_NAME] [BACKUP_DIR] [-n NAMESPACE]

Use either the backup or restore command.

  • backup

    Backs up each database to a subdirectory of the BACKUP_DIR directory.

  • restore

    Restores each database from its backup in the BACKUP_DIR directory.

Arguments and options

  • RELEASE_NAME

    Required. The release name that was specified when the Knowledge Studio Helm chart was installed in your cluster.

    You can find the release name as the prefix of the pod name, for example, {release_name}-ibm-watson-ks-yyy-xxx. For version 4.0.x, the value is always wks.

  • BACKUP_DIR

    Required. The base directory where backups are stored to or restored from. Each database is stored in its own subdirectory of the backup directory.

    For backup: the script creates a new timestamped folder named wks-backup-yyyymmdd_hhmmss under BACKUP_DIR, for example: BACKUP_DIR/wks-backup-yyyymmdd_hhmmss/mongodb

    For restore: set BACKUP_DIR to the timestamped wks-backup-yyyymmdd_hhmmss folder itself, so that the database subdirectories are directly under it, for example: BACKUP_DIR/mongodb

  • -n NAMESPACE

    Namespace for the pods. The default value is zen.
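
The following sketch shows how the two invocations might differ in their BACKUP_DIR argument. It only prints the command lines and does not run them. The base directory /tmp/wks-backups and the helper latest_backup are hypothetical, and the release name and namespace are the defaults mentioned above.

```shell
# Hypothetical values; replace them with your own.
RELEASE_NAME="wks"
NAMESPACE="zen"
BACKUP_ROOT="/tmp/wks-backups"

# Print the newest wks-backup-yyyymmdd_hhmmss folder under a base directory.
# Glob results are sorted, and this timestamp format sorts chronologically,
# so the last match is the newest. Returns non-zero if nothing matches.
latest_backup() {
  base="$1"
  newest=""
  for dir in "$base"/wks-backup-*/; do
    [ -d "$dir" ] && newest="${dir%/}"
  done
  [ -n "$newest" ] && echo "$newest"
}

# Backup takes the base directory; the script adds the timestamped folder.
echo "Backup:  ./all-backup-restore.sh backup $RELEASE_NAME $BACKUP_ROOT -n $NAMESPACE"

# Restore takes the timestamped folder itself.
if restore_dir="$(latest_backup "$BACKUP_ROOT")"; then
  echo "Restore: ./all-backup-restore.sh restore $RELEASE_NAME $restore_dir -n $NAMESPACE"
else
  echo "No wks-backup-* folder found under $BACKUP_ROOT"
fi
```

Picking the newest folder is only one possible policy; restoring from an older backup means passing that folder explicitly.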

Output

If the process succeeds, the script displays the following message, indicating either the backup or restore command:

[SUCCESS] MongoDB,PostgreSQL,Minio,PVC (backup|restore)

If the process fails, the following message is displayed, indicating either the backup or restore command:

[FAIL] MongoDB,PostgreSQL,Minio,PVC (backup|restore)

If the script fails, the data is corrupted. Do not use the corrupted data to restore.
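
If you capture the script output in a log, a small check such as the following can gate any later use of the backup. This is a minimal sketch: backup_succeeded is a hypothetical helper, and it assumes the status line contains the literal [SUCCESS] or [FAIL] marker shown above.

```shell
# Classify the output of a backup or restore run, read from stdin.
# Returns 0 only when a [SUCCESS] marker is present and no [FAIL] marker is.
backup_succeeded() {
  output="$(cat)"
  case "$output" in
    *"[FAIL]"*) return 1 ;;
  esac
  case "$output" in
    *"[SUCCESS]"*) return 0 ;;
  esac
  return 1
}

# Demonstration with a sample status line in the format described above.
if printf '%s\n' '[SUCCESS] MongoDB,PostgreSQL,Minio,PVC backup' | backup_succeeded; then
  echo "Backup reported success."
else
  echo "Backup failed or status unknown; do not restore from it."
fi
```

In practice, the input would come from a saved log of the script run rather than from a sample string.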

Database-specific scripts

If you need to back up or restore a single database, use one of the database-specific scripts. However, make sure that you deactivate the pods before you run the script, and reactivate the pods after the script completes successfully.

MongoDB

Use this script instead of all-backup-restore.sh to back up or restore only the MongoDB database.

mongodb-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE

Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.

Backing up MongoDB

Back up your MongoDB data. Databases named WKSDATA, ENVDATA, and escloud_sbsep store data for Knowledge Studio.

  1. Deactivate Knowledge Studio.
  2. Run the mongodb-backup-restore.sh script with the backup command. The script runs the following operations:
    • Creates a remote temporary file under the MongoDB pod and extracts the following data: WKSDATA, ENVDATA, and escloud_sbsep.
    • Copies the WKSDATA, ENVDATA, and escloud_sbsep data to the BACKUP_DIR that you specify and deletes the temporary file.
  3. Reactivate Knowledge Studio.

Restoring MongoDB data

Restore the backed-up data to MongoDB.

  1. Deactivate Knowledge Studio.
  2. Run the mongodb-backup-restore.sh script with the restore command. The script runs the following operations:
    • Creates a remote temporary file under the MongoDB pod.
    • Copies the WKSDATA, ENVDATA, and escloud_sbsep data from the BACKUP_DIR that you specify to the remote temporary file.
    • Restores the data from the temporary file and deletes the temporary file.
  3. Reactivate Knowledge Studio.

PostgreSQL

Use this script instead of all-backup-restore.sh to back up or restore only the PostgreSQL database.

postgresql-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE

Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.

Backing up PostgreSQL

Back up your PostgreSQL data by getting a data dump.

  1. Deactivate Knowledge Studio.
  2. Run the postgresql-backup-restore.sh script with the backup command. The script runs the following operations:
    • Creates a job for the PostgreSQL backup.
    • Dumps the databases. The file names are the database names, such as jobq_{release_name_underscore}, model_management_api, model_management_api_v2, and awt, with the .custom extension appended.
    • Copies the dump files to the local BACKUP_DIR that you specify.
    • Deletes the .pgpass file.
  3. Reactivate Knowledge Studio.

Restoring PostgreSQL data

Restore the backed-up data to PostgreSQL.

  1. Deactivate Knowledge Studio.
  2. Run the postgresql-backup-restore.sh script with the restore command. The script runs the following operations:
    • Creates a job for the PostgreSQL restore.
    • Restores the databases (jobq_{release_name_underscore}, model_management_api, model_management_api_v2, and awt) by loading the .custom files from the BACKUP_DIR that you specify.
    • Deletes the .pgpass file.
  3. Reactivate Knowledge Studio.

MinIO

Use this script instead of all-backup-restore.sh to back up or restore only the MinIO database.

minio-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR -n NAMESPACE

Use either the backup or restore command. For more information about the arguments and options, see the all-backup-restore script.

Backing up MinIO

Back up your MinIO database by taking a snapshot of the data. A bucket named wks-icp stores data for Knowledge Studio.

  1. Deactivate Knowledge Studio.
  2. Run the minio-backup-restore.sh script with the backup command. The script runs the following operations:
    • Establishes a connection to the pod RELEASE_NAME-ibm-minio by running kubectl -n NAMESPACE port-forward.
    • Configures a MinIO alias named wks-minio.
    • Copies data from wks-minio/wks-icp to the BACKUP_DIR you specify.
    • Closes the port-forward connection.
  3. Reactivate Knowledge Studio.
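
The operations that the script runs in step 2 can be pictured roughly as the command sequence below. This is a dry-run sketch that only prints commands. It is not the script's actual implementation. The port numbers, the URL scheme, the credential variable names, and the choice of mc mirror for the copy are all assumptions, and the pod name is the prefix given above, so the full pod name in your cluster might differ.

```shell
# Hypothetical values for illustration only.
NAMESPACE="zen"
MINIO_POD="wks-ibm-minio"            # name prefix from the text; check the full pod name
LOCAL_PORT=9000                      # assumed local port
REMOTE_PORT=9000                     # assumed MinIO port in the pod
MINIO_URL="https://localhost:$LOCAL_PORT"   # assumed scheme
BACKUP_DIR="/tmp/wks-backups/minio"

# Print (not run) one possible command sequence for the backup.
minio_backup_plan() {
  # 1. Open a port-forward connection to the MinIO pod in the background.
  printf '%s\n' "kubectl -n $NAMESPACE port-forward $MINIO_POD $LOCAL_PORT:$REMOTE_PORT & PF_PID=\$!"
  # 2. Register the alias wks-minio; credentials come from assumed variables.
  printf '%s\n' "mc alias set wks-minio $MINIO_URL \"\$MINIO_ACCESS_KEY\" \"\$MINIO_SECRET_KEY\""
  # 3. Copy the wks-icp bucket to the backup directory.
  printf '%s\n' "mc mirror wks-minio/wks-icp $BACKUP_DIR"
  # 4. Close the port-forward connection.
  printf '%s\n' "kill \"\$PF_PID\""
}

minio_backup_plan
```

A restore would reverse the direction of the copy in the third command.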

Restoring MinIO data

Restore the snapshot data to MinIO. The script deletes the existing data in the MinIO server, and then restores the backup data.

  1. Deactivate Knowledge Studio.
  2. Run the minio-backup-restore.sh script with the restore command. The script runs the following operations:
    • Establishes a connection to the pod RELEASE_NAME-ibm-minio by running kubectl -n NAMESPACE port-forward.
    • Configures a MinIO alias named wks-minio.
    • Copies data from the BACKUP_DIR you specify to wks-minio/wks-icp.
    • Closes the port-forward connection.
  3. Reactivate Knowledge Studio.

PVC

Use this script instead of all-backup-restore.sh to back up or restore only the persistent volume claim (PVC) data.

pvc-backup-restore.sh backup|restore RELEASE_NAME BACKUP_DIR DOCKERREGISTRY PVC_USER_ID -n NAMESPACE

Arguments for PVC

  • DOCKERREGISTRY

    The same Docker registry that the RELEASE_NAME-ibm-watson-ks-aql-web-tooling pod uses.

  • PVC_USER_ID

    The user ID for the running containers in the RELEASE_NAME-ibm-watson-ks-aql-web-tooling pod.

Use either the backup or restore command. For more information about the other arguments and options, see the all-backup-restore script.

Backing up PVC

  1. Identify the name of the Docker registry and the user ID before you deactivate Knowledge Studio.
  2. Deactivate Knowledge Studio.
  3. Run the pvc-backup-restore.sh script with the backup command. The script runs the following operations:
    • Creates a temporary pod named RELEASE_NAME-ibm-watson-ks-aql-web-tooling-backup.
    • Compresses /opt/ibm/watson/aql-web-tooling/target/sandbox and saves it as sandbox.tgz in the temporary pod.
    • Copies sandbox.tgz to the BACKUP_DIR that you specify.
    • Deletes the temporary pod.
  4. Reactivate Knowledge Studio.

Restoring PVC data

Restore data to the PVC. The script deletes the existing data in the sandbox, and then restores the backup data.

  1. Deactivate Knowledge Studio.
  2. Run the pvc-backup-restore.sh script with the restore command. The script runs the following operations:
    • Copies sandbox.tgz from the BACKUP_DIR that you specify to a temporary pod named RELEASE_NAME-ibm-watson-ks-aql-web-tooling-backup.
    • Deletes the data in /opt/ibm/watson/aql-web-tooling/target/sandbox.
    • Decompresses sandbox.tgz to /opt/ibm/watson/aql-web-tooling/target/sandbox.
    • Deletes the temporary pod.
  3. Reactivate Knowledge Studio.

Deactivate Knowledge Studio

You don't need to deactivate Knowledge Studio manually when you run the all-backup-restore.sh script because the script handles the process.

To ensure that users don't have access to Knowledge Studio when you back up or restore a single database, stop the Knowledge Studio front-end pods before you back up or restore data.

  1. Make sure that no training and evaluation processes are running. You can check job status with the following command:

    kubectl -n NAMESPACE get jobs
    

    Knowledge Studio training jobs are named in the format wks-train-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, and evaluation jobs are named in the format wks-batch-apply-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx. If the COMPLETIONS column of a job reads 0/1, that job is still running. Wait until all of the training and evaluation jobs finish.

  2. Deactivate Knowledge Studio by using the following commands:

    kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"global":{"quiesceMode":true}}}'
    kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"mma":{"replicas":0}}}'
    
  3. Ensure that no Knowledge Studio pods exist, except datastore pods, by running the following command (this might take a few minutes):

    kubectl -n NAMESPACE get pod | grep -Ev 'minio|etcd|mongo|postgresql|gw-instance|Completed' | grep wks
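
The checks in steps 1 and 3 can be wrapped in small filters that read kubectl output from stdin, as sketched below. The function names are hypothetical. The job filter assumes the default kubectl get jobs column layout, with NAME first and COMPLETIONS second, and the sample rows are made up for illustration.

```shell
# Print the names of training or evaluation jobs that are still running,
# given `kubectl get jobs` output on stdin.
running_wks_jobs() {
  awk '$1 ~ /^wks-(train|batch-apply)-/ && $2 == "0/1" { print $1 }'
}

# Print the Knowledge Studio pods that are not datastore pods, given
# `kubectl get pod` output on stdin. Prints nothing when none remain.
remaining_wks_pods() {
  grep -Ev 'minio|etcd|mongo|postgresql|gw-instance|Completed' | grep wks || true
}

# Demonstration with made-up sample output; prints the wks-train job name.
running_wks_jobs <<'EOF'
NAME                                                   COMPLETIONS   DURATION   AGE
wks-train-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee         0/1           5m         5m
wks-batch-apply-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee   1/1           2m         10m
EOF

# With cluster access, the filters could be used like this:
#   kubectl -n NAMESPACE get jobs | running_wks_jobs
#   while [ -n "$(kubectl -n NAMESPACE get pod | remaining_wks_pods)" ]; do sleep 10; done
```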
    

Reactivate Knowledge Studio

You don't need to reactivate Knowledge Studio manually when you run the all-backup-restore.sh script because the script handles the process.

To reactivate Knowledge Studio, use the following commands:

kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"global":{"quiesceMode":false}}}'
kubectl -n NAMESPACE patch --type=merge wks wks -p '{"spec":{"mma":{"replicas":2}}}'
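
For a single-database run, the deactivate, run, and reactivate sequence could be combined in a wrapper like the one sketched here. The helpers run, set_quiesce, and with_quiesce are hypothetical, and the backup path is a placeholder. The sketch defaults to a dry run that prints each command, and it omits the checks for running jobs and remaining pods that the deactivation procedure calls for.

```shell
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
NAMESPACE="${NAMESPACE:-zen}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = quiesceMode value (true|false), $2 = mma replica count.
set_quiesce() {
  run kubectl -n "$NAMESPACE" patch --type=merge wks wks \
    -p "{\"spec\":{\"global\":{\"quiesceMode\":$1}}}"
  run kubectl -n "$NAMESPACE" patch --type=merge wks wks \
    -p "{\"spec\":{\"mma\":{\"replicas\":$2}}}"
}

# Deactivate, run one database script, and reactivate only if it succeeded.
with_quiesce() {
  set_quiesce true 0 || return 1
  if run "$@"; then
    set_quiesce false 2
  else
    echo "Script failed; Knowledge Studio is left deactivated." >&2
    return 1
  fi
}

with_quiesce ./mongodb-backup-restore.sh backup wks /tmp/wks-backups -n "$NAMESPACE"
```

Leaving Knowledge Studio deactivated after a failure is a design choice in this sketch, so that the state can be inspected before users regain access.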