Private VPC clusters: Why can't I connect to the OpenShift console?
Troubleshoot problems connecting to the OpenShift console on a cluster that has only a private service endpoint.
The information in this troubleshooting guide pertains to VPC clusters with only a private service endpoint.
1. Understanding the cluster connection flow
The following diagram shows the connection flow for a VPC cluster with private service endpoints to connect to the OpenShift web console. This diagram assumes that the cluster has the default OAuth settings. Review this diagram and the following descriptions to better understand what troubleshooting steps might be required.
- The web browser connects through the VPN to the cluster master API server. A signed certificate is exchanged and a redirect instructs the web browser to instead connect to the OpenShift console load balancer.
- (a) The web browser connects through the VPN to the OpenShift console load balancer that exposes the OpenShift console. (b) This request is sent to one of the two openshift-console pods.
- The openshift-console pod connects to the cluster master OAuth server port to check if the connection is already authenticated. If the request is already authenticated, the connection to the OpenShift web console is complete and the web console can be accessed. If the request is not authenticated, the user is redirected to the cluster OAuth service on the cluster master.
- The web browser connects through the VPN to the cluster's OAuth server port, which redirects the client to IAM.
- The web browser connects to IAM over the public network. The user enters their password and, if required, a 2FA verification. If this step is successful, the user is redirected back to the cluster's OAuth server.
- The web browser connects through the VPN to the cluster's OAuth server port again. The connection is redirected back to the OpenShift console load balancer.
- The web browser connects through the VPN to the OpenShift console load balancer, which exposes the OpenShift console. This request is sent to one of the two openshift-console pods, which again connects to the cluster master OAuth server port to check if the connection is already authenticated. If the user has entered their password and 2FA verification, the authentication is validated and the user is connected to the OpenShift console main web page.
2. Check your VPC and cluster configuration
Verify that your VPC and cluster are properly configured. An incorrect configuration might prevent you from accessing the OpenShift web console. A short command sketch after the following list shows one way to check DNS resolution and routing from the client system.
- Ensure that your web browser runs on a client system that is either inside the same VPC as your cluster or has a VPN connection to that VPC. The OpenShift console is exposed by a private VPC load balancer that is accessible only from the VPC's private network.
- Ensure that the client system has access to the public service endpoints for IAM, which are accessible through `iam.cloud.ibm.com` and `login.ibm.com`.
- For clusters that run version 4.13:
  - If your cluster uses the default 4.13 OAuth configuration, or if you have set your cluster to use the VPE gateway for OAuth, ensure that your client uses the private DNS for the VPC and that this DNS traffic is routed through the VPN to the VPC. The private DNS resolvers are typically `161.26.0.7` and `161.26.0.8`, unless you use a custom DNS resolver. This is required so that the VPE gateway for the `apiserver` and OAuth, which does not exist in any public DNS, can be resolved through the VPC's private DNS.
- For clusters that run any supported version other than 4.13:
  - If you are using the default OAuth cluster settings, ensure that routes exist in the VPN configuration so that all `166.8.0.0/14` traffic is routed through either the same VPN or another VPN that connects to IBM Cloud. This is required to connect to the cluster's API server and OAuth server ports.
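For example, the following minimal sketch shows one way to check from the VPN-connected client that the VPC private DNS resolvers answer for a cluster hostname and which route the client uses for the IBM Cloud service network. The hostname is an illustrative placeholder, and `ip route get` assumes a Linux client; adjust both for your cluster and operating system.

```sh
# Hypothetical checks from the VPN-connected client (Linux shell; adjust for your OS).
# Ask the VPC private DNS resolver directly for a cluster hostname.
# Replace the hostname with the hostname from your cluster's Master URL.
dig +short @161.26.0.7 c100-e.private.us-south.containers.cloud.ibm.com

# Confirm which route the client uses for the 166.8.0.0/14 service network.
ip route get 166.8.0.1
```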
3. Gather cluster data
Follow these steps to gather the cluster information that is needed for troubleshooting. The outputs that you gather with these commands are used in later steps. A short sketch after each step shows one way to capture the value in a shell variable.
- Find the cluster API server URL. In later commands, this URL is referred to as `${CLUSTER_APISERVER_URL}`.
  - Run the `ibmcloud oc cluster get -c <cluster-id>` command.
  - In the Master section of the output, find the URL. The URL is in the following format: `https://c<XXX>-e.private.<REGION>.containers.cloud.ibm.com:<YYYYY>`.
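If you prefer to capture the URL in a shell variable for the later commands, a minimal sketch follows. It assumes that the JSON output of the cluster command exposes the private service endpoint URL under `serviceEndpoints.privateServiceEndpointURL` and that `jq` is installed; verify the field name against your CLI version.

```sh
# Sketch: store the private service endpoint (Master) URL for later steps.
# The JSON field path is an assumption; confirm it in your CLI output.
CLUSTER_APISERVER_URL=$(ibmcloud oc cluster get -c <cluster-id> --output json \
  | jq -r '.serviceEndpoints.privateServiceEndpointURL')
echo "${CLUSTER_APISERVER_URL}"
```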
- Find the cluster OAuth URL. In later commands, this URL is referred to as `${CLUSTER_OAUTH_URL}`.
  - Run the `kubectl get --raw /.well-known/oauth-authorization-server | grep issuer` command. Do not use the `ibmcloud oc cluster get -c <CLUSTER-ID>` command, as it might return a different URL.
  - In the output, find the URL with one of the following formats.
    - If the VPE gateway is not used for OAuth: `https://c<XXX>-e.private.<REGION>.containers.cloud.ibm.com:<ZZZZZ>`
    - If the VPE gateway is used for OAuth: `https://<CLUSTERID>.vpe.private.<REGION>.containers.cloud.ibm.com:<ZZZZZ>`
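Because the metadata endpoint returns JSON, you can also read the issuer field directly instead of grepping. The sketch below assumes `jq` is installed.

```sh
# Sketch: capture the OAuth issuer URL for later steps.
CLUSTER_OAUTH_URL=$(kubectl get --raw /.well-known/oauth-authorization-server | jq -r '.issuer')
echo "${CLUSTER_OAUTH_URL}"
```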
- Find the Ingress subdomain. In later commands, this subdomain is referred to as `${CONSOLE_LOAD_BALANCER}`.
  - Run the `ibmcloud oc cluster get -c <CLUSTER-ID>` command.
  - In the output, find the subdomain that matches the following format: `<CLUSTER-NAME-PLUS-RANDOM-UNIQUE-STRING>.<REGION>.containers.appdomain.cloud`. Note that if you have configured a custom Ingress subdomain, the format will instead match your custom configuration.
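A similar sketch for the Ingress subdomain, assuming the JSON output exposes an `ingressHostname` field and that `jq` is installed; adjust the field name if your CLI version differs.

```sh
# Sketch: capture the Ingress subdomain for later steps.
# The "ingressHostname" field name is an assumption; confirm it in your CLI output.
CONSOLE_LOAD_BALANCER=$(ibmcloud oc cluster get -c <CLUSTER-ID> --output json | jq -r '.ingressHostname')
echo "${CONSOLE_LOAD_BALANCER}"
```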
4. Verify connections and troubleshoot
Follow these steps to check the connections described in the connection flow. If you find a problem with a connection, use the information to troubleshoot the issue. A short command sketch after some steps shows optional additional checks that you can run from a shell.
- Verify that Ingress is healthy and that the router and console pods are healthy.
  - Run the following commands.
    - `ibmcloud oc cluster get -c <CLUSTERID>`
    - `ibmcloud oc ingress status-report get -c <CLUSTERID>`
  - If the output shows an error status, use the Ingress troubleshooting documentation to resolve the issue.
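If you want to inspect the pods directly, a sketch of optional checks follows. It assumes that your `oc` CLI is logged in to the cluster and that the router and console pods run in their default namespaces.

```sh
# Optional sketch: confirm that the router and console pods are Running.
oc get pods -n openshift-ingress
oc get pods -n openshift-console
```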
- Verify that the OpenShift cluster operators are healthy.
  - Run the `oc get clusteroperators` command.
  - If the output shows that any operators are not healthy or are not running at the current version, use the OpenShift cluster version troubleshooting documentation to resolve the issue. You can also search the IBM and Red Hat documentation for any specific errors that are shown.
  - If the console operator specifically is not healthy, check the `openshift-console/console...` and `openshift-console-operator/console-operator...` pod logs to see whether a security group, ACL, or DNS customization is configured in a way that prevents the pods from connecting to either the OAuth port or the OpenShift console URL.
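The following sketch shows one way to focus on the console-related operators and the pod logs mentioned above. It assumes the default deployment names `console` and `console-operator`; adjust if yours differ.

```sh
# Sketch: check the console-related cluster operators.
oc get clusteroperators console authentication

# Sketch: review recent logs from the console and console-operator pods.
oc logs -n openshift-console deployment/console --tail=50
oc logs -n openshift-console-operator deployment/console-operator --tail=50
```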
- Verify that the connection to the cluster master API server is successful.
  - Run the `curl -k -vvv ${CLUSTER_APISERVER_URL}/version` command. Specify the cluster API server URL that you found in the previous steps.
  - If the connection is not successful, make the following checks and resolve any issues that you find.
    - Check that the cluster master is healthy by running the `ibmcloud oc cluster get -c <CLUSTER-ID>` command. See Reviewing master health for information on resolving cluster master issues.
    - Check that the hostname portion of the URL is resolved by DNS. Use the `dig $(echo ${CLUSTER_APISERVER_URL} | cut -d/ -f3 | cut -d: -f1)` command and specify the cluster API server URL.
    - Verify that there is a route through your VPN that connects to the cluster API server URL.
    - If the cluster API server URL contains your cluster ID, which indicates that the cluster uses the VPE gateway for the connection to the cluster API server, check that the VPE gateway security group for the cluster master allows traffic from your VPN client subnet. Follow the steps in Accessing the OpenShift console when OAuth access is set to VPE gateway.
    - Check whether any security groups, ACLs, or custom VPC routes applied to the VPN prevent traffic between the VPN and the cluster API server. You can test this by temporarily allowing all inbound and outbound traffic through the VPN security group and ACL and then checking whether this resolves the issue. If so, make the necessary changes to your security groups, ACLs, or custom routes to allow the traffic.
    - Check whether any context-based restriction (CBR) rules or private service endpoint allowlist rules on the cluster prevent the client from connecting to the cluster API server. You can test this by temporarily adding a network zone to your CBR rule that allows all IPs and subnets. If this temporary change resolves the issue, make the necessary changes to the rule or allowlist to allow the traffic.
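A sketch that combines the DNS and reachability checks above, assuming a Linux or macOS client with `dig` and `nc` available:

```sh
# Sketch: split the API server URL into host and port, then test DNS and TCP reachability.
APISERVER_HOST=$(echo ${CLUSTER_APISERVER_URL} | cut -d/ -f3 | cut -d: -f1)
APISERVER_PORT=$(echo ${CLUSTER_APISERVER_URL} | cut -d/ -f3 | cut -d: -f2)

dig +short ${APISERVER_HOST}                      # should return an IP that is reachable over the VPN
nc -vz -w 5 ${APISERVER_HOST} ${APISERVER_PORT}   # tests TCP reachability without TLS
```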
- Verify that the connection to the cluster load balancer that exposes the OpenShift console is successful.
  - Run the `curl -k -vvv https://console-openshift-console.${CONSOLE_LOAD_BALANCER}/` command. Specify the Ingress subdomain that you found in the previous steps.
  - If the connection is not successful, make the following checks and resolve any issues that you find.
    - Check that the hostname portion of the subdomain is resolved by DNS. Use the `dig console-openshift-console.${CONSOLE_LOAD_BALANCER}` command.
    - Verify that there is a route through your VPN to this load balancer subdomain, and that the route includes all of the IPs or subnets that the load balancer uses. The output of the `dig console-openshift-console.${CONSOLE_LOAD_BALANCER}` command includes the current load balancer IPs and subnets, but note that these can change as the load balancer scales up or down.
    - If you have modified any security groups, ACLs, or custom VPC routes that are applied to the load balancer, check whether any changes or rules that you applied prevent the connection. If you have not modified any of these components and they use the default values, you can skip this check.
    - Check whether any security groups, ACLs, or custom VPC routes applied to the VPN prevent traffic between the VPN and the load balancer. You can test this by temporarily allowing all inbound and outbound traffic through the VPN security group and ACL and then checking whether this resolves the issue. If so, make the necessary changes to your security groups, ACLs, or custom routes to allow the traffic.
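To see which load balancer IPs your VPN routes must cover and to confirm basic reachability to each, you can use a sketch like the following (assumes `dig` and `nc` are available; the returned IPs can change as the load balancer scales):

```sh
# Sketch: list the current console load balancer IPs and test TCP 443 reachability to each.
for ip in $(dig +short console-openshift-console.${CONSOLE_LOAD_BALANCER}); do
  nc -vz -w 5 ${ip} 443
done
```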
- Verify that the connection to the cluster OAuth server is successful.
  - Run the `curl -k -vvv ${CLUSTER_OAUTH_URL}/healthz` command. Specify the cluster OAuth URL that you found in the previous steps.
  - If the connection is not successful, make the following checks and resolve any issues that you find.
    - Check that your cluster master is healthy by running the `ibmcloud oc cluster get -c <CLUSTER-ID>` command. See Reviewing master health for information on resolving cluster master issues.
    - Check that the hostname portion of the cluster OAuth URL is resolved by DNS. Use the `dig $(echo ${CLUSTER_OAUTH_URL} | cut -d/ -f3 | cut -d: -f1)` command and specify the cluster OAuth URL.
    - Verify that there is a route through your VPN to the cluster master OAuth URL.
    - Check whether any security groups, ACLs, or custom VPC routes applied to the VPN prevent traffic between the VPN and the cluster OAuth server. You can test this by temporarily allowing all inbound and outbound traffic through the VPN security group and ACL and then checking whether this resolves the issue. If so, make the necessary changes to your security groups, ACLs, or custom routes to allow the traffic.
    - Check whether any context-based restriction (CBR) rules or private service endpoint allowlist rules on the cluster prevent the client from connecting to the cluster OAuth server. You can test this by temporarily adding a network zone to your CBR rule that allows all IPs and subnets. If this temporary change resolves the issue, make the necessary changes to the rule or allowlist to allow the traffic.
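The same host-and-port check used for the API server applies to the OAuth URL. A sketch, assuming `dig` and `nc` are available on the client:

```sh
# Sketch: split the OAuth URL into host and port, then test DNS and TCP reachability.
OAUTH_HOST=$(echo ${CLUSTER_OAUTH_URL} | cut -d/ -f3 | cut -d: -f1)
OAUTH_PORT=$(echo ${CLUSTER_OAUTH_URL} | cut -d/ -f3 | cut -d: -f2)

dig +short ${OAUTH_HOST}
nc -vz -w 5 ${OAUTH_HOST} ${OAUTH_PORT}
```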
- Verify that the connection to IAM is successful.
  - Run the following commands.
    - `curl -vvv https://iam.cloud.ibm.com/healthz`
    - `curl -vvv -o /dev/null -s https://login.ibm.com/`
  - If either of these commands fails, check that the client system can connect to these URLs reliably and that the URLs are not blocked by any client or corporate firewalls. Note that these URLs require access to the public internet.
5. Contact support
If you completed all of the previous steps and still cannot resolve the issue, open a support case. In the case details, be sure to include any relevant log files, error messages, or command outputs.