Managing Virtualization Service clusters

Virtual Private Cloud | 4.21 and later | Bare metal worker nodes only | RHCOS only

Learn how to manage your OpenShift Virtualization Service cluster, including working with pre-configured components, managing worker nodes, and performing maintenance tasks.

This service is currently available as a beta release. Access is controlled by an allowlist. During the beta period, only console-based cluster creation is supported.

Understanding managed components

Virtualization Service clusters include several pre-configured components that are managed differently from those in standard OpenShift clusters.

Core components (cannot be disabled)

The following components are essential to Virtualization Service and cannot be disabled:

OpenShift Virtualization add-on
The openshift-virtualization add-on is automatically enabled on all Virtualization Service (ROVS) clusters and cannot be disabled. This add-on manages the installation and updates of the OpenShift Virtualization, NMState, and Node Maintenance operators. For more information, see Managing the OpenShift Virtualization add-on.
OpenShift Virtualization Operator
Provides virtual machine management capabilities. This operator is automatically installed by the add-on and updated as part of the cluster lifecycle. Installation from Red Hat OperatorHub is blocked.
NMState Operator
Manages network configuration for virtual machines and nodes. This operator is automatically installed by the add-on.
Node Maintenance Operator
Handles node maintenance operations for virtual machine workloads. This operator is automatically installed by the add-on.
OpenShift Data Foundation (ODF)
Provides storage for VM disks and enables live migration. ODF is pre-configured to use local NVMe storage on bare metal nodes.

Viewing managed add-ons

List all add-ons in your cluster:

ibmcloud ks cluster addon ls --cluster <cluster-name>

Example output:

Name                         Version   Health State   Health Status
ibm-storage-operator         1.0       normal         Addon Ready. For more info: http://ibm.biz/addon-state (H1500)
openshift-virtualization     4.21      normal         Addon Ready. For more info: http://ibm.biz/addon-state (H1500)

The openshift-virtualization add-on is automatically enabled and cannot be disabled on ROVS clusters.

For detailed information about managing the OpenShift Virtualization add-on, including viewing details, checking versions, and updating, see Managing the OpenShift Virtualization add-on.
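To check the add-on health from a script, you can filter the listing for a single add-on. The following is a minimal sketch, assuming the whitespace-separated column layout shown in the example output above; the addon_health_state helper is a hypothetical name, not part of the ibmcloud CLI.

```shell
# Hypothetical helper: read `ibmcloud ks cluster addon ls` output on stdin and
# print the "Health State" column (third field) for the named add-on.
# Assumes the column layout shown in the example output.
# Exits non-zero if the add-on is not listed.
addon_health_state() {
  awk -v name="$1" '
    $1 == name { print $3; found = 1 }
    END { exit !found }
  '
}

# Usage (requires the ibmcloud CLI and a logged-in session):
#   ibmcloud ks cluster addon ls --cluster <cluster-name> | addon_health_state openshift-virtualization
```

A result other than normal suggests that the add-on needs attention before you continue with maintenance tasks.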

Managing worker nodes

Viewing worker nodes

List all worker nodes in your cluster:

ibmcloud ks workers --cluster <cluster-name>

Or use the OpenShift CLI:

oc get nodes

Adding worker nodes

Add worker nodes to an existing worker pool:

ibmcloud ks worker-pool resize --cluster <cluster-name> \
  --worker-pool default \
  --size-per-zone <number-of-workers>

All worker nodes in a Virtualization Service cluster must use supported bare metal flavors.

Replacing worker nodes

Replace a worker node:

ibmcloud ks worker replace --cluster <cluster-name> --worker <worker-id>

The replacement worker is provisioned with the same configuration as the original.

Reloading worker nodes

Reload a worker node to apply updates or fix issues:

ibmcloud ks worker reload --cluster <cluster-name> --worker <worker-id>

Before reloading a worker node, ensure that any running VMs are migrated to other nodes or can tolerate downtime.

Managing worker pools

Viewing worker pools

ibmcloud ks worker-pool ls --cluster <cluster-name>

Creating additional worker pools

Create a new worker pool with a different bare metal flavor:

ibmcloud ks worker-pool create vpc-gen2 \
  --name <pool-name> \
  --cluster <cluster-name> \
  --flavor <bare-metal-flavor> \
  --size-per-zone <number-of-workers>

All worker pools in a Virtualization Service cluster must use bare metal flavors that support the openshift-vs offering.

Adding zones to worker pools

Add a zone to an existing worker pool:

ibmcloud ks zone add vpc-gen2 \
  --cluster <cluster-name> \
  --zone <zone> \
  --subnet-id <subnet-id> \
  --worker-pool <pool-name>

Updating the cluster

Checking for updates

Check the master version that your cluster currently runs so that you can compare it against the available versions:

ibmcloud ks cluster get --cluster <cluster-name> | grep "Master Version"

View available versions:

ibmcloud ks versions --show-version openshift

Updating the cluster master

Update the cluster master to a new version:

ibmcloud ks cluster master update --cluster <cluster-name> --version <version>

The master update typically takes 30-60 minutes. During this time, you cannot access the Kubernetes API or OpenShift console.

Updating worker nodes

After updating the master, update worker nodes:

ibmcloud ks worker update --cluster <cluster-name> --worker <worker-id>

Or update all workers in a worker pool:

ibmcloud ks worker-pool update --cluster <cluster-name> --worker-pool <pool-name>

Before updating worker nodes, migrate VMs to other nodes or ensure they can tolerate downtime.

Monitoring cluster health

Checking cluster status

ibmcloud ks cluster get --cluster <cluster-name>

Look for the following fields in the output:

  • State: Should be normal
  • Master Status: Should be Ready
  • Master Health: Should be normal
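The three checks above can be automated by parsing the command output. This is a sketch only: it assumes that the output uses a "Key: value" line layout with the field names listed above, and the check_cluster_health helper is a hypothetical name introduced for illustration.

```shell
# Hypothetical helper: read `ibmcloud ks cluster get` output on stdin and verify
# State, Master Status, and Master Health.
# Assumes each field appears on its own line as "Key:   value".
check_cluster_health() {
  awk -F':[ \t]+' '
    { sub(/[ \t]+$/, "") }                 # drop trailing whitespace
    $1 == "State"         { state = $2 }
    $1 == "Master Status" { mstatus = $2 }
    $1 == "Master Health" { mhealth = $2 }
    END {
      # Master Status may carry a suffix after "Ready", so match the prefix only.
      ok = (state == "normal" && mstatus ~ /^Ready/ && mhealth == "normal")
      print (ok ? "healthy" : "attention needed")
      exit !ok
    }
  '
}

# Usage (requires the ibmcloud CLI and a logged-in session):
#   ibmcloud ks cluster get --cluster <cluster-name> | check_cluster_health
```

The helper returns a non-zero exit status when any field differs from the expected value, so it can gate later steps in a maintenance script.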

Monitoring component health

Check OpenShift Virtualization health:

oc get hyperconverged -n openshift-cnv

Check ODF health:

oc get storagecluster -n openshift-storage
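For scripted checks, you can read the status fields directly instead of scanning the table output. The sketch below assumes that the HyperConverged resource exposes an Available condition and that each StorageCluster reports a status.phase of Ready when healthy; the function names are hypothetical.

```shell
# Hypothetical helper: succeeds when the first HyperConverged resource in
# openshift-cnv reports the condition Available=True.
cnv_available() {
  status=$(oc get hyperconverged -n openshift-cnv \
    -o jsonpath='{.items[0].status.conditions[?(@.type=="Available")].status}')
  [ "$status" = "True" ]
}

# Hypothetical helper: succeeds when at least one StorageCluster exists in
# openshift-storage and every one of them is in the Ready phase.
odf_ready() {
  phases=$(oc get storagecluster -n openshift-storage \
    -o jsonpath='{.items[*].status.phase}')
  [ -n "$phases" ] || return 1
  for phase in $phases; do
    [ "$phase" = "Ready" ] || return 1
  done
}

# Usage (requires the oc CLI and a logged-in session):
#   cnv_available && odf_ready && echo "components healthy"
```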

Viewing cluster logs

View the cluster details together with its associated resources, such as add-ons and subnets:

ibmcloud ks cluster get --cluster <cluster-name> --show-resources

For detailed logging, configure IBM Log Analysis. See Logging for clusters.

Managing virtual machines

Viewing virtual machines

List all VMs in the cluster:

oc get vms -A

View VMs in a specific namespace:

oc get vms -n <namespace>

Live migrating VMs

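Before performing maintenance on a worker node, migrate its VMs to other nodes. First, list the running virtual machine instances (VMIs) to see which node each one is running on: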

oc get vmi -n <namespace>

Trigger a live migration:

virtctl migrate <vm-name> -n <namespace>

Live migration is supported only within the same zone.
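The two steps above can be combined to move every VM off a node before maintenance. This is a minimal sketch, assuming that each VMI records its node in status.nodeName and that the VMI name matches the VM name; vmis_on_node and migrate_vms_off_node are hypothetical helper names.

```shell
# Hypothetical helper: print "<namespace> <name>" for each VMI on the given node.
# Assumes the node name is recorded in the VMI status.nodeName field.
vmis_on_node() {
  oc get vmi -A \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.nodeName}{"\n"}{end}' |
    awk -v node="$1" '$3 == node { print $1, $2 }'
}

# Hypothetical helper: trigger a live migration for each VMI found on the node.
# The scheduler picks the target node; this only submits the requests.
migrate_vms_off_node() {
  vmis_on_node "$1" | while read -r ns name; do
    virtctl migrate "$name" -n "$ns" < /dev/null
  done
}

# Usage (requires the oc and virtctl CLIs and a logged-in session):
#   migrate_vms_off_node <node-name>
```

Migrations run asynchronously, so list the VMIs again after submitting the requests to confirm that none remain on the node before you start maintenance.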

Stopping and starting VMs

Stop a VM:

virtctl stop <vm-name> -n <namespace>

Start a VM:

virtctl start <vm-name> -n <namespace>

Storage management

Monitoring storage capacity

Check ODF storage capacity:

oc get cephcluster -n openshift-storage -o jsonpath='{.items[0].status.ceph.capacity}'

View storage usage:

oc get cephblockpool -n openshift-storage
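The capacity object is easier to act on as a single percentage. The following sketch assumes that the capacity status contains bytesUsed and bytesTotal fields that describe raw Ceph capacity; ceph_used_percent is a hypothetical helper name.

```shell
# Hypothetical helper: print the percentage of raw Ceph capacity in use,
# rounded to one decimal place. Prints nothing if the total is missing or zero.
# Assumes status.ceph.capacity exposes bytesUsed and bytesTotal.
ceph_used_percent() {
  oc get cephcluster -n openshift-storage \
    -o jsonpath='{.items[0].status.ceph.capacity.bytesUsed} {.items[0].status.ceph.capacity.bytesTotal}' |
    awk '$2 > 0 { printf "%.1f\n", $1 * 100 / $2 }'
}

# Usage (requires the oc CLI and a logged-in session):
#   ceph_used_percent
```

Because these figures are assumed to be raw capacity, the space that is usable for VM disks is likely lower once data replication is taken into account.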

Managing persistent volume claims

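List PVCs used by VMs. The filter matches only lines that contain the string virtualmachine, so adjust the pattern to match your naming convention: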

oc get pvc -A | grep virtualmachine

View PVC details:

oc describe pvc <pvc-name> -n <namespace>

Troubleshooting

For troubleshooting common issues with Virtualization Service clusters, see the following topics:

Next steps