Updating or replacing VPC worker nodes that use OpenShift Data Foundation
For VPC clusters with a storage solution such as OpenShift Data Foundation, you must cordon, drain, and replace each worker node sequentially. If you deployed OpenShift Data Foundation to a subset of worker nodes in your cluster, then after you replace a worker node, you must edit the ocscluster resource to include the new worker node.
The following tutorial covers both major and minor updates and worker replacement.
- Major update
- Complete the steps with this label to apply a major update, for example if you are updating your worker nodes to a new major version, such as from 4.11 to 4.12, and also updating OpenShift Data Foundation from 4.11 to 4.12.
- Minor update
- Complete the steps with this label to apply a patch update, for example if you are updating from 4.12.15_1542_openshift to 4.12.16_1544_openshift while keeping OpenShift Data Foundation at version 4.12. You must repeat these steps for each node you want to update.
- Worker replace
- Complete the steps with this label if you are replacing a worker node at the same patch version. You must repeat these steps for each node you want to replace.
Skipping versions during an upgrade, such as from 4.8 to 4.12, is not supported.
Before updating your worker nodes, make sure to back up your app data. Also, plan to complete the following steps for one worker node at a time. Repeat the steps for each worker node that you want to update.
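The rule that versions cannot be skipped can be checked before you start. The following is a minimal sketch, not part of the official tooling: the function name `is_supported_upgrade` is hypothetical, and it assumes that both versions share the same first number (for example `4`) and that a supported update moves the second number forward by at most one.

```shell
# Hypothetical helper: succeeds when the target version is the same
# MAJOR.MINOR as the current version or exactly one MINOR ahead.
# Assumption: versions look like 4.11, 4.12.15, or 4.12.15_1542_openshift.
is_supported_upgrade() {
  cur_major=${1%%.*}; cur_rest=${1#*.}; cur_minor=${cur_rest%%.*}
  new_major=${2%%.*}; new_rest=${2#*.}; new_minor=${new_rest%%.*}

  # The first version number must match (assumption, see lead-in).
  [ "$cur_major" -eq "$new_major" ] || return 1

  # The second version number may stay the same or increase by one.
  step=$((new_minor - cur_minor))
  [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}
```

For example, `is_supported_upgrade 4.11 4.12` succeeds, while `is_supported_upgrade 4.8 4.12` fails because it skips versions.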
Update the cluster master
Major update
- If you are updating your worker nodes to a new major version, such as from 4.11 to 4.12, update the cluster master first.
Command syntax:
ibmcloud oc cluster master update --cluster CLUSTER [--version MAJOR.MINOR.PATCH] [--force-update] [-f] [-q]
Example command:
ibmcloud oc cluster master update --cluster mycluster --version 4.16.19 --force-update
- Wait until the master update finishes.
Determine which storage nodes you want to update or replace
Major update Minor update Worker replace
-
List your worker nodes by using oc get nodes and determine which storage nodes you want to update.
oc get nodes
Example output
NAME           STATUS   ROLES           AGE    VERSION
10.241.0.4     Ready    master,worker   106s   v1.21.6+4b61f94
10.241.128.4   Ready    master,worker   22d    v1.21.6+bb8d50a
10.241.64.4    Ready    master,worker   22d    v1.21.6+bb8d50a
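If you are unsure which nodes currently act as storage nodes, one option is to look at where the OSD pods run. The following is a minimal sketch under stated assumptions: the helper name `storage_nodes` is hypothetical, and it expects two whitespace-separated columns (pod name, node name) on standard input, such as those produced by the `-o custom-columns` and `--no-headers` options of `oc get pods`.

```shell
# Hypothetical helper: print the unique node names that host OSD pods.
# Input (stdin): one "POD_NAME NODE_NAME" pair per line.
# Only pods named rook-ceph-osd-<integer>-... are counted, so helper pods
# such as rook-ceph-osd-prepare-... are ignored.
storage_nodes() {
  awk '$1 ~ /^rook-ceph-osd-[0-9]+-/ { print $2 }' | sort -u
}

# Possible usage against a live cluster (not run here):
#   oc get pods -n openshift-storage --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | storage_nodes
```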
Scale down OpenShift Data Foundation
Major update Minor update Worker replace
- For each worker node that you found in the previous step, find the rook-ceph-mon and rook-ceph-osd deployments.
oc get pods -n openshift-storage -o wide | grep -i <node_name>
- Scale down the deployments that you found in the previous step.
oc scale deployment rook-ceph-mon-c --replicas=0 -n openshift-storage
oc scale deployment rook-ceph-osd-2 --replicas=0 -n openshift-storage
oc scale deployment --selector=app=rook-ceph-crashcollector,node_name=NODE-NAME --replicas=0 -n openshift-storage
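The scale commands above take deployment names, while the preceding `oc get pods` command lists pod names. The following sketch shows one way to derive the deployment name from a pod name. The helper name `deployment_from_pod` is hypothetical, and it assumes the usual naming pattern for pods that belong to a deployment: the deployment name followed by a ReplicaSet hash and a pod suffix.

```shell
# Hypothetical helper: strip the last two dash-separated segments
# (ReplicaSet hash and pod suffix) from a pod name.
# Assumption: the pod is managed by a deployment and follows the
# <deployment>-<replicaset-hash>-<pod-suffix> naming pattern.
deployment_from_pod() {
  name=$1
  name=${name%-*}   # drop the pod suffix
  name=${name%-*}   # drop the ReplicaSet hash
  printf '%s\n' "$name"
}
```

For example, the pod `rook-ceph-mon-a-cd575c89b-b6k66` maps to the deployment `rook-ceph-mon-a`. Confirm the result with `oc get deployment -n openshift-storage` before you scale anything down.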
Cordon and drain the worker node
Major update Minor update Worker replace
-
Cordon the node. Cordoning the node prevents any pods from being scheduled on this node.
oc adm cordon NODE_NAME
Example output
node/10.241.0.4 cordoned
-
Drain the node to remove all the pods. When you drain the worker node, its pods are rescheduled onto the other worker nodes, which helps avoid downtime. Draining evicts pods in a way that respects any pod disruption budgets.
oc adm drain NODE_NAME --force --delete-emptydir-data --ignore-daemonsets
Example output
evicting pod "managed-storage-validation-webhooks-7fd79bc9f7-pdpv6"
evicting pod "calico-kube-controllers-647dbbd685-fmrp9"
evicting pod "certified-operators-2v852"
evicting pod "csi-snapshot-controller-77fbf474df-47ddt"
evicting pod "calico-typha-8574d89b8c-7f2cc"
evicting pod "dns-operator-6d48cbff67-vrrsw"
evicting pod "router-default-6fc798b98b-9m6kh"
evicting pod "prometheus-adapter-5b77ffdd5f-hzqrp"
evicting pod "alertmanager-main-1"
evicting pod "prometheus-k8s-0"
evicting pod "network-check-source-66c7fbb86-2r78z"
-
Wait until draining finishes, then complete the following steps to replace the worker node.
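Before you replace the node, you might want to confirm that it is still cordoned. A cordoned node reports `SchedulingDisabled` in the STATUS column of `oc get nodes`, for example `Ready,SchedulingDisabled`. The following is a minimal sketch, and the helper name `is_cordoned` is hypothetical.

```shell
# Hypothetical helper: succeed when a node STATUS string, as shown by
# `oc get nodes`, contains the SchedulingDisabled condition.
is_cordoned() {
  case ",$1," in
    *,SchedulingDisabled,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Possible usage against a live cluster (not run here):
#   is_cordoned "$(oc get node NODE_NAME --no-headers | awk '{ print $2 }')"
```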
Replace the worker node
Major update Minor update Worker replace
-
List your worker nodes by using the ibmcloud oc worker ls command and find the worker node that you cordoned and drained in the previous step.
ibmcloud oc worker ls -c CLUSTER
Example output
ID                                                 Primary IP     Flavor     State    Status   Zone        Version
kube-c85ra07w091uv4nid9ug-vpcoc-default-000001c1   10.241.128.4   bx2.4x16   normal   Ready    us-east-3   4.8.29_1544_openshift*
kube-c85ra07w091uv4nid9ug-vpcoc-default-00000288   10.241.0.4     bx2.4x16   normal   Ready    us-east-1   4.8.29_1544_openshift*
kube-c85ra07w091uv4nid9ug-vpcoc-default-00000352   10.241.64.4    bx2.4x16   normal   Ready    us-east-2   4.8.29_1544_openshift*
-
Replace the worker node.
Minor update
Example command to replace the worker node and apply the latest patch update.
ibmcloud oc worker replace -c CLUSTER --worker kube-*** --update
Worker replace
Example command to replace the worker node without applying the latest patch update.
ibmcloud oc worker replace -c CLUSTER --worker kube-***
Example output
The replacement worker node is created in the same zone with the same flavor, but gets new public or private IP addresses. During the replacement, all pods might be rescheduled onto other worker nodes and data is deleted if not stored outside the pod. To avoid downtime, ensure that you have enough worker nodes to handle your workload while the selected worker nodes are being replaced.
Replace worker node kube-c85ra07w091uv4nid9ug-cluster-default-00000288? [y/N]> y
Deleting worker node kube-c85ra07w091uv4nid9ug-cluster-default-00000288 and creating a new worker node in cluster
-
Wait for the replacement node to be provisioned and then list your worker nodes. Note that this process might take 20 minutes or more.
oc get nodes
Example output
NAME           STATUS   ROLES           AGE   VERSION
10.241.0.4     Ready    master,worker   22d   v1.21.6+bb8d50a
10.241.128.4   Ready    master,worker   22d   v1.21.6+bb8d50a
10.241.64.4    Ready    master,worker   22d   v1.21.6+bb8d50a
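Because provisioning can take 20 minutes or more, you might prefer to poll instead of rerunning `oc get nodes` by hand. The following is a sketch under stated assumptions: both helper names are hypothetical, `all_nodes_ready` expects the output of `oc get nodes --no-headers` on standard input, and `wait_until` counts only the time spent sleeping, so the real wait is slightly longer than the timeout.

```shell
# Hypothetical helper: succeed only if there is at least one node and
# every node has exactly "Ready" in the STATUS column (column 2).
all_nodes_ready() {
  awk '$2 != "Ready" { bad = 1 } END { exit (bad || NR == 0) }'
}

# Hypothetical helper: rerun a command until it succeeds or until roughly
# TIMEOUT seconds have passed, sleeping INTERVAL seconds between attempts.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
  timeout=$1; interval=$2; shift 2
  elapsed=0
  until "$@"; do
    [ "$elapsed" -lt "$timeout" ] || return 1
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Possible usage against a live cluster (not run here):
#   check() { oc get nodes --no-headers | all_nodes_ready; }
#   wait_until 1800 30 check
```

Note that a cordoned node shows `Ready,SchedulingDisabled`, so this check also fails while any node is still cordoned.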
Clean up the resources from the old node
Major update Minor update Worker replace
-
Verify that the OSD pod on the replacement node is in a Running state. If the pod is running, continue to step 7 (Add the new storage node). If the pod has failed, complete the following steps.
-
Navigate to the openshift-storage project.
oc project openshift-storage
-
Remove the failed OSD from the cluster. You can specify multiple failed OSDs if required:
oc process -n openshift-storage ocs-osd-removal -p FAILED_OSD_IDS=<failed_osd_id> -p FORCE_OSD_REMOVAL=true | oc create -f -
The FAILED_OSD_IDS value is the integer in the pod name immediately after the rook-ceph-osd prefix. The FORCE_OSD_REMOVAL value must be set to true in clusters that have only three OSDs, or in clusters with insufficient space to restore all three replicas of the data after the OSD is removed.
-
Verify that the OSD was removed successfully by checking the status of the ocs-osd-removal-job pod.
oc get pod -l job-name=ocs-osd-removal-job -n openshift-storage
-
Verify that the OSD removal is completed.
oc logs -l job-name=ocs-osd-removal-job -n openshift-storage --tail=-1 | egrep -i 'completed removal'
Example output
2023-03-10 06:50:04.501511 I | cephosd: completed removal of OSD 0
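The FAILED_OSD_IDS value used in the removal step is the integer that follows the rook-ceph-osd prefix in the pod name. The following is a minimal sketch of that extraction. The helper name `osd_id_from_pod` is hypothetical, and the pod name in the example is made up for illustration.

```shell
# Hypothetical helper: print the OSD ID embedded in an OSD pod name.
# Fails for names that lack the rook-ceph-osd- prefix or that have a
# non-numeric segment after it (for example rook-ceph-osd-prepare-...).
osd_id_from_pod() {
  rest=${1#rook-ceph-osd-}
  [ "$rest" != "$1" ] || return 1      # prefix is missing
  id=${rest%%-*}                       # first segment after the prefix
  case $id in
    ''|*[!0-9]*) return 1 ;;           # empty or not an integer
  esac
  printf '%s\n' "$id"
}
```

For example, `osd_id_from_pod rook-ceph-osd-2-6776bc469b-tzzt8` prints `2`, which is the value to pass as FAILED_OSD_IDS.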
Add the new storage node
Major update Minor update Worker replace
-
If you limited your ODF deployment to a subset of worker nodes by specifying node names during installation, you must update the ocscluster CRD to include the new name. If you did not limit your configuration to only certain worker nodes, you do not need to update the ocscluster CRD.
oc edit ocscluster
apiVersion: ocs.ibm.io/v1
kind: OcsCluster
metadata:
  name: ocscluster-auto
spec:
  . . .
  osdSize: 250Gi
  osdStorageClassName: ibmc-vpc-block-metro-10iops-tier
  workerNodes:
  - NODE-NAME # Example 10.248.128.42
  - NODE-NAME
  - NODE-NAME
-
Wait for the OpenShift Data Foundation pods to deploy to the new worker. Verify the new persistent volumes are created and that all pods are in a Running state.
oc get pv
oc get ocscluster
oc get pods -n openshift-storage
-
Verify that all other required OpenShift Data Foundation pods are in Running state.
oc get pod -n openshift-storage | grep mon
Example output:
rook-ceph-mon-a-cd575c89b-b6k66    2/2   Running   0   38m
rook-ceph-mon-b-6776bc469b-tzzt8   2/2   Running   0   38m
rook-ceph-mon-d-5ff5d488b5-7v8xh   2/2   Running   0   4m8s
-
Verify that new OSD pods are running on the replacement node:
oc get pods -o wide -n openshift-storage | egrep -i <new_node_name> | egrep osd
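The pod checks in this section can be narrowed to the pods that still need attention. The following is a minimal sketch: the helper name `non_running_pods` is hypothetical, it expects the output of `oc get pods --no-headers` on standard input, and it treats `Completed` as healthy on the assumption that finished job pods are expected in the namespace.

```shell
# Hypothetical helper: print the names of pods whose STATUS column
# (column 3) is neither Running nor Completed.
non_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Possible usage against a live cluster (not run here):
#   oc get pods -n openshift-storage --no-headers | non_running_pods
```

Empty output means that every listed pod is either running or completed.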
Update the OpenShift Data Foundation add-on
Major update
- Check the existing version.
ibmcloud oc cluster addon ls --cluster CLUSTER
- Update the add-on.
ibmcloud oc cluster addon update openshift-data-foundation --cluster CLUSTER --version VERSION
- Verify the add-on is updated.
ibmcloud oc cluster addon ls --cluster CLUSTER
Update your cluster resource
Major update
-
Get the name of your ocscluster resource.
oc get ocscluster
Example output
NAME             AGE
ocscluster-vpc   19d
-
Run the following command to edit your ocscluster resource.
oc edit ocscluster OCS-CLUSTER-NAME
-
Set the ocsUpgrade parameter to true.
...
spec:
  billingType: hourly
  monSize: 20Gi
  monStorageClassName: ibmc-vpc-block-10iops-tier
  numOfOsd: 1
  ocsUpgrade: true
  osdSize: 250Gi
  osdStorageClassName: ibmc-vpc-block-10iops-tier
status:
  storageClusterStatus: Decreasing the capacity not allowed
-
Save and close the file.
-
Wait for the update to complete.
-
Verify that the storagecluster and cephcluster resources are both deployed correctly.
oc get storagecluster -n openshift-storage
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   43h   Ready              2023-06-21T09:22:00Z   4.11.0
oc get cephcluster -n openshift-storage
NAME                             DATADIRHOSTPATH   MONCOUNT   AGE   PHASE   MESSAGE                        HEALTH      EXTERNAL
ocs-storagecluster-cephcluster   /var/lib/rook     3          43h   Ready   Cluster created successfully   HEALTH_OK
oc get csv -n openshift-storage
NAME                              DISPLAY                       VERSION   REPLACES                          PHASE
mcg-operator.v4.11.8              NooBaa Operator               4.11.8    mcg-operator.v4.11.7              Succeeded
ocs-operator.v4.11.8              OpenShift Container Storage   4.11.8    ocs-operator.v4.11.7              Succeeded
odf-csi-addons-operator.v4.11.8   CSI Addons                    4.11.8    odf-csi-addons-operator.v4.11.7   Succeeded
odf-operator.v4.11.8              OpenShift Data Foundation     4.11.8    odf-operator.v4.11.7              Succeeded
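The operator check above can be reduced to a list of operators that have not finished updating. The following is a minimal sketch: the helper name `csv_not_succeeded` is hypothetical, and it expects the output of `oc get csv --no-headers` on standard input. Because the DISPLAY column can contain spaces, the sketch reads the phase from the last field instead of a fixed column number.

```shell
# Hypothetical helper: print the names of ClusterServiceVersions whose
# PHASE (the last field on each line) is not Succeeded.
# Blank lines are skipped.
csv_not_succeeded() {
  awk 'NF && $NF != "Succeeded" { print $1 }'
}

# Possible usage against a live cluster (not run here):
#   oc get csv -n openshift-storage --no-headers | csv_not_succeeded
```

Empty output means that every listed operator reports the Succeeded phase.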