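Tuning performance for Red Hat CoreOS worker nodes
Satellite
Supported worker node operating systems
- Red Hat CoreOS (RHCOS)
You can tune the performance of Red Hat CoreOS worker nodes by enabling CPU pinning, non-uniform memory access (NUMA), and huge pages. These configurations can benefit applications with strict performance requirements. However, these customizations can cause problems when workloads are scheduled.
Instead of using MachineConfig files in Red Hat OpenShift to tune worker node performance, you can modify the host with a daemonset file. For more information, see Changing the Calico MTU or Tuning performance for Red Hat CoreOS worker nodes.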
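Deploying the Node Feature Discovery Operator
Satellite
You must deploy the Node Feature Discovery Operator before you can enable NUMA, CPU pinning, and huge pages on your worker nodes. For more information, see Node Feature Discovery Operator.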
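Enabling non-uniform memory access (NUMA), CPU pinning, and huge pages on your worker nodes
Satellite
Before you begin, make sure that you deployed the Node Feature Discovery Operator.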
- Save the following DaemonSet to a file named customize.yaml.

---
# Bind the service account to the privileged security context constraint.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ibm-user-custom-configurator-privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
- kind: ServiceAccount
  name: ibm-user-custom-configurator
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
---
# Scripts and unit file that the init container copies to the host.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
data:
  89-hugepages.conf: |
    vm.nr_hugepages=<<NUMBER_OF_HUGEPAGES>>
  configure.sh: |
    #!/usr/bin/env bash
    set -x
    # Install the script, the systemd unit, and the sysctl file on the host.
    cp -f /scripts/ibm-user-custom-configuration.sh /host-usr-local-bin/ibm-user-custom-configuration.sh
    chmod 0755 /host-usr-local-bin/ibm-user-custom-configuration.sh
    cp -f /scripts/ibm-user-custom-configuration.service /host-etc-systemd-dir/ibm-user-custom-configuration.service
    chmod 0644 /host-etc-systemd-dir/ibm-user-custom-configuration.service
    if [[ -f /scripts/89-hugepages.conf ]]; then
      cp -f /scripts/89-hugepages.conf /host-etc-systctld-dir/89-hugepages.conf
    fi
    # Enter the host namespaces to register and start the unit.
    nsenter -t 1 -m -u -i -n -p -- systemctl daemon-reload
    nsenter -t 1 -m -u -i -n -p -- systemctl enable ibm-user-custom-configuration.service
    nsenter -t 1 -m -u -i -n -p -- systemctl start ibm-user-custom-configuration.service
  ibm-user-custom-configuration.sh: |
    #!/usr/bin/env bash
    set -x
    # Round the system reserved memory up to whole GiB, then convert it to MiB plus 100.
    GIGABYTES_RESERVED_MEMORY=$(echo $SYSTEM_RESERVED_MEMORY | awk -F 'Gi' '{print $1}')
    GIGABYTES_RESERVED_MEMORY_ROUNDED_UP=$(echo $GIGABYTES_RESERVED_MEMORY | awk '{print int($1+0.999)}')
    sed -i "s/SYSTEM_RESERVED_MEMORY=.*/SYSTEM_RESERVED_MEMORY=${GIGABYTES_RESERVED_MEMORY_ROUNDED_UP}Gi/g" /etc/node-sizing.env
    TOTAL_NUMA_MEMORY_TO_ALLOCATE=$(echo "$GIGABYTES_RESERVED_MEMORY_ROUNDED_UP" "1024" | awk '{print $1 * $2 + 100}')
    # If kubelet.conf parses as JSON, merge the settings with jq. Otherwise, append a marked block.
    if cat /etc/kubernetes/kubelet.conf | jq -r .; then
      cat >/tmp/ibm-user-config.conf.json <<EOF
    {
      "topologyManagerPolicy": "<<TOPOLOGY_MANAGER_POLICY_VALUE>>",
      "memoryManagerPolicy": "Static",
      "cpuManagerPolicy": "static",
      "reservedMemory": [
        {
          "numaNode": 0,
          "limits": {
            "memory": "${TOTAL_NUMA_MEMORY_TO_ALLOCATE}Mi"
          }
        }
      ]
    }
    EOF
      if ! cat /tmp/ibm-user-config.conf.json | jq -r .; then
        exit 1
      fi
      if ! jq -s '.[0] * .[1]' /tmp/ibm-user-config.conf.json /etc/kubernetes/kubelet.conf > /etc/kubernetes/tmp-kubelet.conf; then
        exit 1
      fi
      mv -f /etc/kubernetes/tmp-kubelet.conf /etc/kubernetes/kubelet.conf
    else
      cat >/tmp/ibm-user-config.conf <<EOF
    #START USER CONFIG
    topologyManagerPolicy: <<TOPOLOGY_MANAGER_POLICY_VALUE>>
    memoryManagerPolicy: Static
    cpuManagerPolicy: static
    reservedMemory:
      - numaNode: 0
        limits:
          memory: ${TOTAL_NUMA_MEMORY_TO_ALLOCATE}Mi
    #END USER CONFIG
    EOF
      sed -i '/#START USER CONFIG/,/#END USER CONFIG/d' /etc/kubernetes/kubelet.conf
      cat /tmp/ibm-user-config.conf >>/etc/kubernetes/kubelet.conf
    fi
  ibm-user-custom-configuration.service: |
    [Unit]
    Description=Add custom user config to kubelet
    Before=kubelet.service
    After=kubelet-auto-node-size.service
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    EnvironmentFile=/etc/node-sizing.env
    ExecStart=/usr/local/bin/ibm-user-custom-configuration.sh
    [Install]
    WantedBy=multi-user.target
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: ibm-user-custom-configurator
  name: ibm-user-custom-configurator
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ibm-user-custom-configurator
  template:
    metadata:
      labels:
        app: ibm-user-custom-configurator
    spec:
      # Run only on RHCOS nodes that Node Feature Discovery labeled as NUMA capable.
      nodeSelector:
        feature.node.kubernetes.io/memory-numa: "true"
        ibm-cloud.kubernetes.io/os: RHCOS
      tolerations:
      - operator: "Exists"
      hostPID: true
      serviceAccount: ibm-user-custom-configurator
      initContainers:
      - name: configure
        image: "registry.access.redhat.com/ubi8/ubi:8.6"
        command: ['/bin/bash', '-c', 'mkdir /cache && cp /scripts/configure.sh /cache && chmod +x /cache/configure.sh && /bin/bash /cache/configure.sh']
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /scripts
          name: script-config
        - mountPath: /host-etc-systemd-dir
          name: etc-systemd-dir
        - mountPath: /host-usr-local-bin
          name: usr-local-bin
        - mountPath: /host-etc-systctld-dir
          name: etc-systctld-dir
      containers:
      - name: pause
        image: registry.ng.bluemix.net/armada-master/pause:3.2
      volumes:
      - name: etc-systemd-dir
        hostPath:
          path: /etc/systemd/system
      - name: etc-systctld-dir
        hostPath:
          path: /etc/sysctl.d
      - name: usr-local-bin
        hostPath:
          path: /usr/local/bin
      - name: script-config
        configMap:
          name: ibm-user-custom-configurator

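- Edit the DaemonSet values to tune performance. Replace each placeholder, including the surrounding angle brackets.
  NUMBER_OF_HUGEPAGES
  - Enter the number of huge pages to allocate. For example: 2048. If you don't want to enable huge pages, enter 0. The more huge pages you allocate, the less total memory is available to applications.
  TOPOLOGY_MANAGER_POLICY_VALUE
  - Enter the Topology Manager policy to use. The best-effort policy is recommended to ensure maximum scheduling availability. However, you can use other policies for stricter requirement validation, at the cost of reduced workload scheduling availability. For more information, see Topology Manager.
  You can edit the nodeSelector section to apply the configuration to only a subset of worker nodes.
- Apply the DaemonSet by running the following command.
  kubectl replace --force -f customize.yaml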
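- Verify that the pods reach the Running state.
  kubectl get pods -n kube-system -l app=ibm-user-custom-configurator -o wide
- After the pods are running, reboot each worker node.
  - Deploy a debug pod on the worker node.
    oc debug node/NODE_NAME
  - After the debug session starts, run the following command.
    nsenter -t 1 -m -u -i -n -p -- reboot
  - Repeat these steps for each worker node that you want to reboot.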
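The reserved-memory arithmetic in ibm-user-custom-configuration.sh can be traced by hand before you apply the DaemonSet. The following sketch repeats the same three awk steps outside the cluster. The function name numa_reserved_mib and the sample value 1.5Gi are illustrative assumptions and are not part of the DaemonSet.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the reserved-memory calculation from the configuration script.
# numa_reserved_mib is a hypothetical helper name used only for this illustration.
numa_reserved_mib() {
  local reserved="$1"   # for example "1.5Gi", the format of SYSTEM_RESERVED_MEMORY
  local gib rounded
  # Strip the "Gi" suffix.
  gib=$(echo "$reserved" | awk -F 'Gi' '{print $1}')
  # Round up to a whole number of GiB.
  rounded=$(echo "$gib" | awk '{print int($1+0.999)}')
  # Convert to MiB and add the extra 100 MiB that the script adds.
  echo "$rounded" "1024" | awk '{print $1 * $2 + 100}'
}

numa_reserved_mib "1.5Gi"   # prints 2148
```

In this example, 1.5Gi is rounded up to 2Gi, so the value written to reservedMemory for NUMA node 0 would be 2148Mi.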
Enabling CPU pinning and huge pages on your worker nodes
Satellite
Before you begin, make sure that you deployed the Node Feature Discovery Operator.
- Save the following DaemonSet to a file named cpu-pinning.yaml.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ibm-user-custom-configurator-privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
- kind: ServiceAccount
  name: ibm-user-custom-configurator
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
data:
  89-hugepages.conf: |
    vm.nr_hugepages=<<NUMBER_OF_HUGEPAGES>>
  configure.sh: |
    #!/usr/bin/env bash
    set -x
    cp -f /scripts/ibm-user-custom-configuration.sh /host-usr-local-bin/ibm-user-custom-configuration.sh
    chmod 0755 /host-usr-local-bin/ibm-user-custom-configuration.sh
    cp -f /scripts/ibm-user-custom-configuration.service /host-etc-systemd-dir/ibm-user-custom-configuration.service
    chmod 0644 /host-etc-systemd-dir/ibm-user-custom-configuration.service
    if [[ -f /scripts/89-hugepages.conf ]]; then
      cp -f /scripts/89-hugepages.conf /host-etc-systctld-dir/89-hugepages.conf
    fi
    nsenter -t 1 -m -u -i -n -p -- systemctl daemon-reload
    nsenter -t 1 -m -u -i -n -p -- systemctl enable ibm-user-custom-configuration.service
    nsenter -t 1 -m -u -i -n -p -- systemctl start ibm-user-custom-configuration.service
  ibm-user-custom-configuration.sh: |
    #!/usr/bin/env bash
    set -x
    # If kubelet.conf parses as JSON, merge the setting with jq. Otherwise, append a marked block.
    if cat /etc/kubernetes/kubelet.conf | jq -r .; then
      cat >/tmp/ibm-user-config.conf.json <<EOF
    {
      "cpuManagerPolicy": "static"
    }
    EOF
      if ! cat /tmp/ibm-user-config.conf.json | jq -r .; then
        exit 1
      fi
      if ! jq -s '.[0] * .[1]' /tmp/ibm-user-config.conf.json /etc/kubernetes/kubelet.conf > /etc/kubernetes/tmp-kubelet.conf; then
        exit 1
      fi
      mv -f /etc/kubernetes/tmp-kubelet.conf /etc/kubernetes/kubelet.conf
    else
      cat >/tmp/ibm-user-config.conf <<EOF
    #START USER CONFIG
    cpuManagerPolicy: static
    #END USER CONFIG
    EOF
      sed -i '/#START USER CONFIG/,/#END USER CONFIG/d' /etc/kubernetes/kubelet.conf
      cat /tmp/ibm-user-config.conf >>/etc/kubernetes/kubelet.conf
    fi
  ibm-user-custom-configuration.service: |
    [Unit]
    Description=Add custom user config to kubelet
    Before=kubelet.service
    After=kubelet-auto-node-size.service
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    EnvironmentFile=/etc/node-sizing.env
    ExecStart=/usr/local/bin/ibm-user-custom-configuration.sh
    [Install]
    WantedBy=multi-user.target
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: ibm-user-custom-configurator
  name: ibm-user-custom-configurator
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ibm-user-custom-configurator
  template:
    metadata:
      labels:
        app: ibm-user-custom-configurator
    spec:
      nodeSelector:
        ibm-cloud.kubernetes.io/os: RHCOS
      tolerations:
      - operator: "Exists"
      hostPID: true
      serviceAccount: ibm-user-custom-configurator
      initContainers:
      - name: configure
        image: "registry.access.redhat.com/ubi8/ubi:8.6"
        command: ['/bin/bash', '-c', 'mkdir /cache && cp /scripts/configure.sh /cache && chmod +x /cache/configure.sh && /bin/bash /cache/configure.sh']
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /scripts
          name: script-config
        - mountPath: /host-etc-systemd-dir
          name: etc-systemd-dir
        - mountPath: /host-usr-local-bin
          name: usr-local-bin
        - mountPath: /host-etc-systctld-dir
          name: etc-systctld-dir
      containers:
      - name: pause
        image: registry.ng.bluemix.net/armada-master/pause:3.2
      volumes:
      - name: etc-systemd-dir
        hostPath:
          path: /etc/systemd/system
      - name: etc-systctld-dir
        hostPath:
          path: /etc/sysctl.d
      - name: usr-local-bin
        hostPath:
          path: /usr/local/bin
      - name: script-config
        configMap:
          name: ibm-user-custom-configurator

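- Edit the DaemonSet values to tune performance. Replace the placeholder, including the surrounding angle brackets.
  NUMBER_OF_HUGEPAGES
  - Enter the number of huge pages to allocate. For example: 2048. If you don't want to enable huge pages, enter 0. The more huge pages you allocate, the less total memory is available to applications.
  You can edit the nodeSelector section to apply the configuration to only a subset of worker nodes.
- Apply the DaemonSet by running the following command.
  kubectl replace --force -f cpu-pinning.yaml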
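- Verify that the pods reach the Running state.
  kubectl get pods -n kube-system -l app=ibm-user-custom-configurator -o wide
- After the pods are running, reboot each worker node.
  - Deploy a debug pod on the worker node.
    oc debug node/NODE_NAME
  - After the debug session starts, run the following command.
    nsenter -t 1 -m -u -i -n -p -- reboot
  - Repeat these steps for each worker node that you want to reboot.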
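To see how much memory a given NUMBER_OF_HUGEPAGES value takes away from applications, multiply the page count by the huge page size. The following sketch assumes a huge page size of 2048 kB when the host does not report one, which is a common default on x86_64 but is not stated in this document. The helper name hugepages_mib is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: estimate the memory that vm.nr_hugepages sets aside.
# hugepages_mib is a hypothetical helper; the second argument is the page size in kB.
hugepages_mib() {
  local pages="$1"
  local page_kb="${2:-2048}"   # assumption: 2048 kB when no size is given
  echo $(( pages * page_kb / 1024 ))
}

# On a Linux host, /proc/meminfo reports the huge page size. Fall back to 2048 kB otherwise.
page_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo 2>/dev/null)
hugepages_mib 2048 "${page_kb:-2048}"
```

With 2048 kB pages, the example value of 2048 huge pages corresponds to 4096 MiB that is no longer available as regular memory.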
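Enabling the kernel-devel package
Satellite
You might need to enable the kernel-devel package to use Satellite services or storage, such as Spectrum Scale Fusion.
Complete the following steps to enable kernel-devel by applying a custom config map and machine config to your worker nodes.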
- Apply the MachineConfig by running the following commands.

ibmcloud ks cluster config --cluster CLUSTERID
cat >"/tmp/kernel-devel-payload.yaml" <<EOF
apiVersion: v1
kind: List
metadata:
  name: pvg-machine-config-tester
  annotations:
items:
- apiVersion: v1
  kind: Namespace
  metadata:
    name: ibm-machine-config
- apiVersion: v1
  data:
    config: |+
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      metadata:
        name: 97-kerneldevel
        labels:
          machineconfiguration.openshift.io/role: worker
      spec:
        config:
          ignition:
            version: 3.2.0
        extensions:
          - kernel-devel
  kind: ConfigMap
  metadata:
    labels:
      ibm-cloud.kubernetes.io/user-specified-config: "true"
    name: user-ignition-config-97-kerneldevel
    namespace: ibm-machine-config
EOF
kubectl apply -f /tmp/kernel-devel-payload.yaml

- Wait for the resources to deploy. This might take 5 minutes or more.
- Review the details of the config maps to confirm that the deployment succeeded.
  - Confirm that the config-validation="valid" field is present.
    kubectl get cm -n ibm-machine-config user-ignition-config-97-kerneldevel -o yaml | grep config-validation
  - Confirm that user-ignition-config-97-kerneldevel is present in the config map.
    kubectl get cm -n ibm-machine-config -l ibm-cloud.kubernetes.io/nodepoolfeedback="true" -o yaml | grep user-ignition-config-97-kerneldevel
- Add worker nodes to your cluster. The worker nodes that you add have kernel-devel enabled.
- Confirm that kernel-devel is enabled.
  - Start a debug pod on one of the nodes.
    oc debug node/NODEIP
  - Run the following nsenter command.
    nsenter -t 1 -m -u -i -n -p -- rpm -qa | grep kernel-devel
- Optional: If you no longer need kernel-devel, you can remove it by running the following command.
  kubectl delete cm -n ibm-machine-config user-ignition-config-97-kerneldevel
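Removing performance customizations
Satellite
If you want to remove the customizations from your worker nodes and reset them to the default configuration, apply the following DaemonSet.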
- Save the following DaemonSet to a file named remove-custom.yaml.

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: ibm-user-custom-configurator-privileged
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:openshift:scc:privileged
subjects:
- kind: ServiceAccount
  name: ibm-user-custom-configurator
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ibm-user-custom-configurator
  namespace: kube-system
data:
  89-hugepages.conf: |
    vm.nr_hugepages=0
  configure.sh: |
    #!/usr/bin/env bash
    set -x
    cp -f /scripts/ibm-user-custom-configuration.sh /host-usr-local-bin/ibm-user-custom-configuration.sh
    chmod 0755 /host-usr-local-bin/ibm-user-custom-configuration.sh
    cp -f /scripts/ibm-user-custom-configuration.service /host-etc-systemd-dir/ibm-user-custom-configuration.service
    chmod 0644 /host-etc-systemd-dir/ibm-user-custom-configuration.service
    if [[ -f /scripts/89-hugepages.conf ]]; then
      cp -f /scripts/89-hugepages.conf /host-etc-systctld-dir/89-hugepages.conf
    fi
    nsenter -t 1 -m -u -i -n -p -- systemctl daemon-reload
    nsenter -t 1 -m -u -i -n -p -- systemctl enable ibm-user-custom-configuration.service
    nsenter -t 1 -m -u -i -n -p -- systemctl start ibm-user-custom-configuration.service
  ibm-user-custom-configuration.sh: |
    #!/usr/bin/env bash
    set -x
    # If kubelet.conf parses as JSON, delete the custom keys with jq. Otherwise, delete the marked block.
    if cat /etc/kubernetes/kubelet.conf | jq -r .; then
      if ! jq 'del(.topologyManagerPolicy, .memoryManagerPolicy, .cpuManagerPolicy, .reservedMemory)' /etc/kubernetes/kubelet.conf > /etc/kubernetes/tmp-kubelet.conf; then
        exit 1
      fi
      mv -f /etc/kubernetes/tmp-kubelet.conf /etc/kubernetes/kubelet.conf
    else
      sed -i '/#START USER CONFIG/,/#END USER CONFIG/d' /etc/kubernetes/kubelet.conf
    fi
  ibm-user-custom-configuration.service: |
    [Unit]
    Description=Add custom user config to kubelet
    Before=kubelet.service
    After=kubelet-auto-node-size.service
    [Service]
    Type=oneshot
    RemainAfterExit=yes
    EnvironmentFile=/etc/node-sizing.env
    ExecStart=/usr/local/bin/ibm-user-custom-configuration.sh
    [Install]
    WantedBy=multi-user.target
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: ibm-user-custom-configurator
  name: ibm-user-custom-configurator
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: ibm-user-custom-configurator
  template:
    metadata:
      labels:
        app: ibm-user-custom-configurator
    spec:
      nodeSelector:
        ibm-cloud.kubernetes.io/os: RHCOS
      tolerations:
      - operator: "Exists"
      hostPID: true
      serviceAccount: ibm-user-custom-configurator
      initContainers:
      - name: configure
        image: "registry.access.redhat.com/ubi8/ubi:8.6"
        command: ['/bin/bash', '-c', 'mkdir /cache && cp /scripts/configure.sh /cache && chmod +x /cache/configure.sh && /bin/bash /cache/configure.sh']
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /scripts
          name: script-config
        - mountPath: /host-etc-systemd-dir
          name: etc-systemd-dir
        - mountPath: /host-usr-local-bin
          name: usr-local-bin
        - mountPath: /host-etc-systctld-dir
          name: etc-systctld-dir
      containers:
      - name: pause
        image: registry.ng.bluemix.net/armada-master/pause:3.2
      volumes:
      - name: etc-systemd-dir
        hostPath:
          path: /etc/systemd/system
      - name: etc-systctld-dir
        hostPath:
          path: /etc/sysctl.d
      - name: usr-local-bin
        hostPath:
          path: /usr/local/bin
      - name: script-config
        configMap:
          name: ibm-user-custom-configurator

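- Apply the DaemonSet to your cluster by running the following command.
  kubectl replace --force -f remove-custom.yaml
- Verify that the pods reach the Running state.
  kubectl get pods -n kube-system -l app=ibm-user-custom-configurator -o wide
- After the pods are running, reboot each worker node.
  - Deploy a debug pod on the worker node.
    oc debug node/NODE_NAME
  - After the debug session starts, run the following command.
    nsenter -t 1 -m -u -i -n -p -- reboot
  - Repeat these steps for each worker node that you want to reboot.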
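When kubelet.conf is not JSON, the removal script deletes everything between the #START USER CONFIG and #END USER CONFIG markers with a sed address range. The following sketch runs the same expression against a temporary sample file so that you can see its effect without touching a node. The sample file contents are made up for the demonstration, and the output is written to stdout instead of being edited in place.

```shell
#!/usr/bin/env bash
# Sketch: delete the marker-delimited block from a sample file.
# The file contents are illustrative and do not represent a real kubelet.conf.
sample=$(mktemp)
cat >"$sample" <<'EOF'
kind: KubeletConfiguration
maxPods: 250
#START USER CONFIG
cpuManagerPolicy: static
#END USER CONFIG
EOF

# Same address range as in the DaemonSet script, without the -i flag.
cleaned=$(sed '/#START USER CONFIG/,/#END USER CONFIG/d' "$sample")
echo "$cleaned"
rm -f "$sample"
```

Only the two lines outside the markers remain, which is why the scripts that add settings can safely run the same deletion before appending a fresh block.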