Setting up an operational logging solution
This tutorial shows you one way to meet the IBM Cloud Framework for Financial Services requirements that are related to operational logging. This approach uses the Red Hat OpenShift on IBM Cloud Elasticsearch, Fluentd, and Kibana (EFK) stack, which you deploy by installing the Cluster Logging Operator. The log data remains within your environment so that you retain full control over it.
We provide guidance, but you are solely responsible for installing, configuring, and operating third-party software in a way that satisfies IBM Cloud Framework for Financial Services requirements. In addition, IBM does not provide support for third-party software.
The final result is a general logging solution that can be used for:
- Logs from Red Hat OpenShift on IBM Cloud applications
- Red Hat OpenShift on IBM Cloud platform logs
- Logs from applications that are running on virtual server instances
- System logs from virtual server instances
Logging solution architecture
The Red Hat OpenShift on IBM Cloud Cluster Logging Operator allows administrators to aggregate logs across a Red Hat OpenShift on IBM Cloud cluster. These logs include application container logs, infrastructure component logs, and system audit logs. Administrators deploy cluster logging by installing the Elasticsearch Operator and the Cluster Logging Operator. These operators deploy Elasticsearch servers for collecting and indexing logs, Fluentd for forwarding logs from worker nodes, and Kibana for searching and displaying the logs.
When you use Red Hat OpenShift on IBM Cloud logging for financial services environments, you must install the Elasticsearch Operator and Cluster Logging Operator in every Red Hat OpenShift on IBM Cloud cluster that requires logging. The logs are retained by the Elasticsearch servers that are running in that cluster and stored on the attached VPC block storage (which must be encrypted by using keys that are managed by Hyper Protect Crypto Services). The logs can be searched and viewed by using the Kibana server that is deployed in that cluster. The following diagram shows the Elasticsearch pods that are deployed to three zones within both the management and workload VPCs. Kibana is shown deployed to a single zone.
The Red Hat OpenShift on IBM Cloud pods that are running the Elasticsearch, Kibana, and Fluentd components are assigned to a worker pool that is dedicated to logging. A dedicated worker pool prevents logging operations from stealing resources from pods that are running other applications. Worker pools are not shown in the diagram.
Instances of the Elasticsearch server are configured to run in all three of the region's zones. The Elasticsearch cluster is configured to replicate data, which is stored in shards, to a second server in the cluster.
The logging stack that is running in each VPC can also collect application logs from virtual server instances. To forward these logs, use Fluentd's Elasticsearch plug-in on each virtual server instance to send them to the logging stack that is running in the Red Hat OpenShift on IBM Cloud cluster.
When you configure Red Hat OpenShift on IBM Cloud for both operational logging and operational monitoring, the worker nodes can be shared. You can use the same worker pool for both logging and monitoring, and you can use the same taint to steer monitoring and logging pods to the shared worker pool.
To implement your operational logging solution, you need to complete the following high-level steps:
- Provision an instance of Red Hat OpenShift on IBM Cloud.
- Configure the worker pool in your Red Hat OpenShift on IBM Cloud cluster.
- Install the Elasticsearch Operator within your Red Hat OpenShift on IBM Cloud cluster.
- Install the Cluster Logging Operator within your Red Hat OpenShift on IBM Cloud cluster.
- Create a cluster logging instance.
- Set up virtual server instance logging with Fluentd.
Before you begin
- You have a VPC provisioned.
- Subnets are provisioned across three zones within a region.
Provision Red Hat® OpenShift® on IBM Cloud®
To capture logs from workloads that are running outside of Red Hat OpenShift on IBM Cloud (such as a virtual server instance), you need to provision an instance of Red Hat OpenShift on IBM Cloud if you don't already have one.
- Provision Red Hat® OpenShift® on IBM Cloud® within the workload VPC where you plan to install the logging solution. Use the following configuration (a CLI sketch follows the list):
  - Red Hat OpenShift on IBM Cloud version: 4.6.x
  - Worker zones: User-defined subnet in each zone of the region
  - Worker nodes per zone: 1
  - Flavor: mx2.4x32 (4 vCPU, 32 GB memory)
  - Master service endpoint: Private endpoint only
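For example, a cluster with this configuration can be created from the CLI. The following is a sketch, not a definitive procedure: the cluster name `fs-logging-cluster` is hypothetical, the IDs come from your environment, and Red Hat OpenShift on IBM Cloud on VPC also requires an IBM Cloud Object Storage instance for its internal registry.

```sh
# Create the cluster in the first zone with a private service endpoint only
$ ibmcloud oc cluster create vpc-gen2 \
    --name fs-logging-cluster \
    --version 4.6_openshift \
    --zone us-south-1 \
    --vpc-id <VPC_ID> \
    --subnet-id <SUBNET_ID_1> \
    --flavor mx2.4x32 \
    --workers 1 \
    --cos-instance <COS_CRN> \
    --disable-public-service-endpoint

# Add the remaining zones of the region to the default worker pool
$ ibmcloud oc zone add vpc-gen2 --zone us-south-2 --cluster fs-logging-cluster \
    --worker-pool default --subnet-id <SUBNET_ID_2>
$ ibmcloud oc zone add vpc-gen2 --zone us-south-3 --cluster fs-logging-cluster \
    --worker-pool default --subnet-id <SUBNET_ID_3>
```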
Provision a Red Hat OpenShift on IBM Cloud worker pool
- Use a separate worker pool for the logging stack to keep the logging stack resources distinct from other workload resources. Provision a new worker pool with the following configuration (a CLI sketch follows the list):
  - Worker zones: User-defined subnet in each zone of the region
  - Worker nodes per zone: 1
  - Flavor: mx2.4x32 (4 vCPU, 32 GB memory)
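For example, such a worker pool can be created from the CLI. This is a sketch: the pool name `logging-pool` is hypothetical, and the IDs come from your environment.

```sh
# Create a dedicated worker pool for the logging stack
$ ibmcloud oc worker-pool create vpc-gen2 \
    --name logging-pool \
    --cluster <CLUSTER> \
    --flavor mx2.4x32 \
    --size-per-zone 1 \
    --vpc-id <VPC_ID>

# Attach the pool to a subnet in each zone of the region (repeat per zone)
$ ibmcloud oc zone add vpc-gen2 --zone us-south-1 --cluster <CLUSTER> \
    --worker-pool logging-pool --subnet-id <SUBNET_ID_1>
```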
- After provisioning completes, access the Red Hat OpenShift on IBM Cloud cluster.
- Taint the worker pool. Tainting the worker pool ensures that only the logging stack runs on it. For more information about taints and tolerations, see the Red Hat OpenShift on IBM Cloud documentation. A taint can be set on a worker pool with the following `ibmcloud` CLI command:

```sh
$ ibmcloud oc worker-pool taint set --worker-pool <WORKER_POOL> --cluster <CLUSTER> --taint KEY=VALUE:EFFECT
```

For example, a taint of `logging-monitoring=node:NoExecute` can be set by using the following command:

```sh
$ ibmcloud oc worker-pool taint set --worker-pool <WORKER_POOL> --cluster <CLUSTER> --taint logging-monitoring=node:NoExecute
```
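To verify that the taint was applied, you can, for example, inspect one of the pool's worker nodes:

```sh
# The Taints line should show logging-monitoring=node:NoExecute
$ oc describe node <NODE_NAME> | grep -i taints
```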
Install the Elasticsearch Operator
The Elasticsearch Operator creates and manages the Elasticsearch cluster that is used by Red Hat OpenShift on IBM Cloud cluster logging. It can be installed from the Red Hat OpenShift on IBM Cloud console.
- Install the Elasticsearch Operator.
- Ensure that the following settings are selected or configured:
- All namespaces on the cluster are selected under Installation Mode.
- openshift-operators-redhat is selected under Installed Namespace.
- Enable operator recommended cluster monitoring on this namespace is enabled.
- Automatic is selected for Automatic Approval Strategy.
- Ensure that Elasticsearch Operator is listed in all projects with a Status of Succeeded.
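You can also confirm the operator's status from the CLI; for example, by listing the ClusterServiceVersion objects:

```sh
# The elasticsearch-operator CSV should report a Succeeded phase
$ oc get csv --all-namespaces | grep elasticsearch-operator
```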
Install the Cluster Logging Operator
The Cluster Logging operator creates and manages the components of the logging stack.
- Install the Cluster Logging Operator (provided by Red Hat).
- Ensure that the following settings are selected or configured:
- A specific namespace on the cluster is selected under Installation Mode.
- The operator-recommended namespace openshift-logging is selected under Installed Namespace.
- Enable operator recommended cluster monitoring on this namespace is selected.
- Automatic is selected for Automatic Approval Strategy.
- Ensure that Cluster Logging is listed in the openshift-logging project with a Status of Succeeded.
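As with the Elasticsearch Operator, you can confirm the status from the CLI; for example:

```sh
$ oc get csv -n openshift-logging
```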
Create a cluster logging instance
You can create a cluster logging instance to deploy pods for Elasticsearch and Kibana servers. It can also deploy Fluentd for collecting infrastructure logs from the worker nodes.
The AU-11 control requires a minimum of 90 days of online audit records and one (1) year's worth of records offline. The default retention periods are lengthened to 90 days in the following example YAML file to reflect this.
- Create a cluster logging instance by following the Red Hat OpenShift on IBM Cloud documentation.
- In the YAML field, replace the code with the following example code. The sample code configures the logging stack to complete the following tasks:
- Run the logging stack on the provisioned worker pool.
- Add a retention period of 90 days.
- Add a 200 GB persistent volume to Elasticsearch.
```yaml
apiVersion: "logging.openshift.io/v1"
kind: "ClusterLogging"
metadata:
  name: "instance"
  namespace: "openshift-logging"
spec:
  managementState: "Managed"
  logStore:
    type: "elasticsearch"
    retentionPolicy:
      application:
        maxAge: 90d
      infra:
        maxAge: 90d
      audit:
        maxAge: 90d
    elasticsearch:
      nodeCount: 3
      tolerations:
      - key: "logging-monitoring"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 600
      storage:
        storageClassName: "ibmc-vpc-block-general-purpose"
        size: 200G
      resources:
        requests:
          memory: "24Gi"
      proxy:
        resources:
          limits:
            memory: 256Mi
          requests:
            memory: 256Mi
      redundancyPolicy: "MultipleRedundancy"
  visualization:
    type: "kibana"
    kibana:
      replicas: 1
      tolerations:
      - key: "logging-monitoring"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 600
  curation:
    type: "curator"
    curator:
      schedule: "30 3 * * *"
  collection:
    logs:
      type: "fluentd"
      fluentd: {}
```
`redundancyPolicy` is set to `MultipleRedundancy`, which causes Elasticsearch to replicate shards to half of the data nodes (in this case, two). To replicate shards to all three worker nodes, set `redundancyPolicy` to `FullRedundancy`.
- You might need to adjust the following items:
- Application retention policy
- Infrastructure retention policy
- Audit retention policy
- The size of the block storage created for each Elasticsearch server (200 GB by default) to hold the logs expected to accumulate over the retention period
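If you prefer the CLI to the console's YAML field, you can apply the same resource with `oc`. This is a sketch that assumes you saved the example above to a local file named `clusterlogging.yaml` (a hypothetical file name):

```sh
# Create or update the ClusterLogging custom resource
$ oc apply -f clusterlogging.yaml

# Watch the logging pods start
$ oc get pods -n openshift-logging -w
```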
Verify that the logging instance is created by completing the following tasks.
- Switch to the Workloads → Pods page.
- Select the openshift-logging project. A list of several pods displays. The list of pods looks similar to the following list and includes pods for cluster logging, Elasticsearch, Fluentd, and Kibana:
- cluster-logging-operator-cb795f8dc-xkckc
- elasticsearch-cdm-b3nqzchd-1-5c6797-67kfz
- elasticsearch-cdm-b3nqzchd-2-6657f4-wtprv
- elasticsearch-cdm-b3nqzchd-3-588c65-clg7g
- fluentd-2c7dg
- fluentd-9z7kk
- fluentd-br7r2
- fluentd-fn2sb
- fluentd-pb2f8
- fluentd-zqgqx
- kibana-7fb4fd4cc9-bvt4p
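The same check can be done from the CLI; for example:

```sh
$ oc get pods -n openshift-logging
```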
Starting Kibana
You can start Kibana from the menu bar of the Red Hat OpenShift on IBM Cloud console.
- Click the 3x3 matrix icon in the menu bar.
- Select Observability → Logging.
- The first time that you access Kibana, you must log in with your IBM Cloud credentials.
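If you need the Kibana URL directly, you can read it from the route that the Cluster Logging Operator creates (this assumes the default route name `kibana` in the `openshift-logging` project):

```sh
$ oc get route kibana -n openshift-logging -o jsonpath='{.spec.host}'
```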
Set up virtual server instance logging with Fluentd
The EFK stack that is deployed in Red Hat OpenShift on IBM Cloud can also be used to aggregate application and infrastructure logs from virtual server instances. The following instructions describe how to install and configure Fluentd on a virtual server instance so that it can forward its logs to an Elasticsearch cluster.
Preparing the Elasticsearch cluster to receive logs
Fluentd should use a Red Hat OpenShift on IBM Cloud private endpoint when sending log data to the Elasticsearch cluster.
To enable the Elasticsearch cluster to receive logs from a Fluentd client running outside of the Red Hat OpenShift on IBM Cloud cluster, you must create a route to the Elasticsearch service.
- Navigate to the Red Hat OpenShift on IBM Cloud web console.
- Select Networking → Routes.
- Select the openshift-logging project.
- Click Create Route.
- Select an intuitive name, such as `elasticsearch`.
- Select Elasticsearch for the service.
- Select 9200 → restapi (TCP) as the Target Port.
- Select Secure route.
- Select Re-encrypt for TLS Termination.
- Select None for Insecure Traffic.
- Set the Destination CA Certificate to the Elasticsearch CA. You can retrieve the certificate with the following command:

```sh
$ oc extract secret/elasticsearch --to=. --keys=admin-ca -n openshift-logging
```

- The other fields, including the other certificate and key fields, can be left blank.
- Click Create.
- After the route is created, note the hostname for the route. You can find the hostname under Networking → Routes → Elasticsearch. The hostname has this format: `elasticsearch-openshift-logging.workaround-logging-caf539b8d718205a334907f986dcee2a-0000.us-south.containers.appdomain.cloud`.
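If you prefer the CLI, a re-encrypt route with equivalent settings can be created with `oc`. This is a sketch; it assumes that the CA was extracted to a local `admin-ca` file as shown above:

```sh
# Create a re-encrypt route to the elasticsearch service on the restapi port
$ oc create route reencrypt elasticsearch --service=elasticsearch \
    --port=restapi --dest-ca-cert=admin-ca -n openshift-logging

# Print the route's hostname
$ oc get route elasticsearch -n openshift-logging -o jsonpath='{.spec.host}'
```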
Retrieving the bearer token for Elasticsearch
When forwarding logs to the Elasticsearch cluster, the Fluentd client authenticates to the Elasticsearch service by using a bearer token. You can retrieve the bearer token for the Elasticsearch Operator with the following commands:
```sh
$ oc project openshift-operators-redhat
$ oc sa get-token elasticsearch-operator
```
Save the bearer token to use in the next section.
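To confirm that the route and token work together, you can, for example, call an Elasticsearch health endpoint through the route from outside the cluster (a sketch; replace the hostname with your route's hostname):

```sh
$ TOKEN=$(oc sa get-token elasticsearch-operator -n openshift-operators-redhat)
$ curl -s -H "Authorization: Bearer ${TOKEN}" \
    "https://<elasticsearch route hostname>/_cat/health"
```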
Installing Fluentd on a virtual server instance
You can install Fluentd on a virtual server instance. For more information about installing on different operating systems, see the instructions for installing Fluentd.
For Linux, use the `td-agent` package.
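For example, on a RHEL-based virtual server instance, the Fluentd project provides a one-line installation script for td-agent 4 (check the Fluentd installation documentation for the current script for your OS version before running it):

```sh
$ curl -fsSL https://toolbelt.treasuredata.com/sh/install-redhat-td-agent4.sh | sh
```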
Log forwarding to the Red Hat OpenShift on IBM Cloud cluster is done by the Fluentd Elasticsearch plug-in. The Elasticsearch plug-in is installed as part of the td-agent installation and does not require any additional installation steps. Log forwarding is configured in the `td-agent.conf` file.
The following example is a `td-agent.conf` file that forwards logs to an Elasticsearch server. You must make the appropriate changes where noted with `<variables>`.
```
<system>
  log_level debug
</system>

# Tail Fluentd's own log file and forward the entries
<source>
  @type tail
  path /var/log/td-agent/td-agent.log
  tag tail.messages
  <parse>
    @type none
  </parse>
</source>

# Receive records over the syslog protocol on port 5140
<source>
  @type syslog
  port 5140
  tag system.local
</source>

# Forward all collected records to the Elasticsearch route over TLS
<match **>
  @type copy
  <store>
    @type elasticsearch
    @id remote_elasticsearch
    host <elasticsearch route hostname>
    port 443
    logstash_format true
    ssl_verify true
    verify_es_version_at_startup false
    scheme https
    ssl_version TLSv1_2
    custom_headers {"Authorization":"Bearer <Bearer Token>"}
  </store>
</match>
```
For more information on the custom_headers configuration, see the documentation for the Elasticsearch plug-in's `custom_headers` parameter.
This sample `td-agent.conf` file includes two `<source>` sections:
- The `@type tail` source plug-in captures log entries that are written to the Fluentd log file (`/var/log/td-agent/td-agent.log`) and forwards them.
- The `@type syslog` source plug-in enables the local Fluentd client to retrieve records by using the syslog protocol on UDP or TCP.
These `<source>` sections can be modified, and new ones can be added, to collect and forward both application and system log files. For more information, see Fluentd input plug-ins.
Edit the `/etc/td-agent/td-agent.conf` file and replace its contents with the preceding example. Make sure to set the host to the Elasticsearch route hostname and the bearer token to the token that was retrieved in the previous section.
After the `td-agent.conf` file is updated, you can start the td-agent with the following command:

```sh
$ systemctl start td-agent.service
```
The td-agent starts forwarding log messages to the Red Hat OpenShift on IBM Cloud Elasticsearch cluster. You can view them there by using Kibana.
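To confirm that forwarding is healthy, you can, for example, check the agent's status and watch its log for connection or delivery errors:

```sh
$ systemctl status td-agent.service
$ tail -f /var/log/td-agent/td-agent.log
```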