Analyze logs and monitor application health

This tutorial may incur costs. Use the Cost Estimator to generate a cost estimate based on your projected usage.

This tutorial shows how the IBM Log Analysis service can be used to configure and access logs of a Kubernetes application that is deployed on IBM Cloud. You will deploy a Python application to a cluster provisioned on IBM Cloud Kubernetes Service, configure a logging agent, generate different levels of application logs, and access worker logs, pod logs, and network logs. Then, you will search, filter, and visualize those logs through the Log Analysis web UI.

Moreover, you will also set up the IBM Cloud Monitoring service and configure a monitoring agent to monitor the performance and health of your application and your IBM Cloud Kubernetes Service cluster.

Objectives

  • Deploy an application to a Kubernetes cluster to generate log entries.
  • Access and analyze different types of logs to troubleshoot problems and pre-empt issues.
  • Gain operational visibility into the performance and health of your app and the cluster running your app.

Architecture diagram
Figure 1. Architecture diagram of the tutorial

  1. The user connects to the application and generates log entries.
  2. The application runs in a Kubernetes cluster from an image stored in the Container Registry.
  3. The user configures the IBM Log Analysis agent to access application and cluster-level logs.
  4. The user configures the IBM Cloud Monitoring agent to monitor the health and performance of the IBM Cloud Kubernetes Service cluster and the app deployed to it.

Before you begin

This tutorial requires:

  • IBM Cloud CLI,
    • IBM Cloud Kubernetes Service plugin (kubernetes-service),
    • Container Registry plugin (container-registry),
  • kubectl to interact with Kubernetes clusters,
  • git to clone source code repository.

You will find instructions to download and install these tools for your operating environment in the Getting started with solution tutorials guide.

To avoid installing these tools, you can use the Cloud Shell from the IBM Cloud console.

In addition, if you are not the Admin of your account, you will require IAM privileges to create the resources, which an Admin will need to grant.

Create a Kubernetes cluster

Kubernetes Service provides an environment to deploy highly available containerized applications that run in Kubernetes clusters.

A minimal cluster with one (1) zone, one (1) worker node and the smallest available size (Flavor) is sufficient for this tutorial. The name mycluster will be used in this tutorial.

Open the Kubernetes clusters page and click Create cluster. See the documentation referenced below for more details based on the cluster type. Summary:

  • Click Standard tier cluster
  • For Kubernetes on VPC infrastructure see the reference documentation Creating VPC clusters.
    • Click Create VPC:
      • Enter a name for the VPC.
      • Choose the same resource group as the cluster.
      • Click Create.
    • Attach a Public Gateway to each of the subnets that you create:
      • Navigate to the Virtual private clouds.
      • Click the previously created VPC used for the cluster.
      • Scroll down to subnets section and click a subnet.
      • In the Public Gateway section, click Detached to change the state to Attached.
      • Click the browser back button to return to the VPC details page.
      • Repeat the previous three steps to attach a public gateway to each subnet.
  • For Kubernetes on Classic infrastructure see the reference documentation Creating classic cluster.
  • Choose a resource group.
  • Uncheck all zones except one.
  • Scale down to 1 worker node per zone.
  • Choose the smallest Worker Pool flavor.
  • Enter a Cluster name.
  • Click Create.
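
Alternatively, you can create the cluster from the CLI. A minimal sketch for a classic cluster (the zone and flavor below are example assumptions; adjust them for your region):

# Create a single-worker classic cluster (zone and flavor are examples)
ibmcloud ks cluster create classic --name mycluster --zone dal10 \
  --flavor b3c.4x16 --hardware shared --workers 1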

Deploy and configure a Kubernetes app to forward logs

The ready-to-run code for the logging app is located in this GitHub repository. The application is written using Django, a popular Python server-side web framework. Clone or download the repository, then deploy the app to Kubernetes Service on IBM Cloud.

Clone the application

In a terminal window:

  1. Clone the GitHub repository:
    git clone https://github.com/IBM-Cloud/application-log-analysis
    
  2. Change to the application directory:
    cd application-log-analysis
    

Deploy the application

  1. Gain access to your cluster as described under the Access section of your cluster.

    For more information on gaining access to your cluster and configuring the CLI to run kubectl commands, check the CLI configure section.

  2. Define an environment variable named MYCLUSTER with your cluster name:

    MYCLUSTER=mycluster
    
  3. Make sure you are logged in, then retrieve the cluster Ingress subdomain:

    ibmcloud ks cluster get --cluster $MYCLUSTER
    
  4. Define a variable pointing to the subdomain:

    MYINGRESSSUBDOMAIN=<Ingress Subdomain value>
    
  5. Initialize the kubectl CLI environment:

    ibmcloud ks cluster config --cluster $MYCLUSTER
    
  6. Edit app-log-analysis.yaml and replace the placeholder ($MYINGRESSSUBDOMAIN) with the value captured in the previous step (a scripted alternative covering steps 3 to 6 is shown after this list). Check the table in this section below for more details.

  7. Once the YAML is updated, deploy the app with the following command:

    kubectl apply -f app-log-analysis.yaml
    
  8. You can now access the application at http://$MYINGRESSSUBDOMAIN/.

    Example and description for the environment variable MYINGRESSSUBDOMAIN:

    Variable: $MYINGRESSSUBDOMAIN
    Example value: mycluster-1234-d123456789.us-south.containers.appdomain.cloud
    Description: Retrieve from the cluster overview page or with ibmcloud ks cluster get --cluster $MYCLUSTER.
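
If you prefer to script steps 3 to 6, here is a minimal sketch. It assumes jq is installed, that the JSON output exposes the subdomain under ingress.hostname, and that the placeholder in the YAML is the literal string $MYINGRESSSUBDOMAIN:

# Extract the Ingress subdomain from the cluster details (the JSON field path is an assumption)
MYINGRESSSUBDOMAIN=$(ibmcloud ks cluster get --cluster $MYCLUSTER --output json | jq -r '.ingress.hostname')
# Replace the literal placeholder; the first $ is escaped so sed matches it literally (GNU sed; on macOS use sed -i '')
sed -i 's/\$MYINGRESSSUBDOMAIN/'"$MYINGRESSSUBDOMAIN"'/g' app-log-analysis.yaml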

Validate Log Analysis instance configuration

Applications deployed to an IBM Cloud Kubernetes Service cluster in IBM Cloud will likely generate some level of diagnostic output, i.e. logs. As a developer or an operator, you may want to access and analyze different types of logs such as worker logs, pod logs, app logs, or network logs to troubleshoot problems and pre-empt issues.

By using the Log Analysis service, it is possible to aggregate logs from various sources and retain them as long as needed. This allows you to analyze the "big picture" when required and to troubleshoot more complex situations.

During creation of the IBM Cloud Kubernetes Service cluster, it is expected that you completed the steps to also connect to a Log Analysis service.

  1. From the Kubernetes clusters page, click on the name of the Kubernetes cluster you just created and click Overview on the left pane.

  2. Scroll down to Integrations and open the Log Analysis UI by clicking Launch. It may take a few minutes before you start seeing logs.

    If instead of Launch you see a Connect button, you can click it to create the integration if it was not done during the creation of the cluster. It simplifies the installation of a logdna-agent pod on each node of your cluster. The logging agent reads log files from the pod where it is installed, and forwards the log data to your logging instance.

  3. To check whether the logdna-agent pods on each node of your cluster are in Running status, run the following command in a shell:

    kubectl get pods --namespace ibm-observe
    

    You should see an output similar to the one below, with one logdna-agent running per worker node that you have deployed in your cluster.

    NAME                 READY   STATUS    RESTARTS   AGE
    logdna-agent-4nlsw   1/1     Running   0          39s
    logdna-agent-lgq9f   1/1     Running   0          39s
    logdna-agent-ls6dc   1/1     Running   0          39s
    

Generate and access application logs

In this section, you will generate application logs and review them in Log Analysis.

Generate application logs

The application deployed in the previous steps allows you to log a message at a chosen log level. The available log levels are critical, error, warn, info and debug. The application's logging infrastructure is configured to allow only log entries on or above a set level to pass. Initially, the logger level is set to warn. Thus, a message logged at info with a server setting of warn would not show up in the diagnostic output.

Take a look at the code in the file views.py. The code contains print statements as well as calls to logger functions. Printed messages are written to the stdout stream (regular output, application console / terminal), logger messages appear in the stderr stream (error log).
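You can also watch the raw container output with kubectl; both streams end up in the container log. A quick check, assuming the deployment is named app-log-analysis-deployment as referenced later in this tutorial:

# Tail the app's container log; print() output (stdout) and logger output (stderr) both appear here
kubectl logs deployment/app-log-analysis-deployment --tail=20 --timestamps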

  1. Open the web app at http://$MYINGRESSSUBDOMAIN/ and click on the Logging tab.
  2. Generate several log entries by submitting messages at different levels. The UI also allows you to change the server-side logger level; change it in between messages to make the output more interesting. For example, you can log a "500 internal server error" as an error or "This is my first log entry" as an info message.

Access application logs

You can access the application-specific logs in the Log Analysis UI using the filters.

  1. On the top bar, click on Apps.
  2. Under containers, check app-log-analysis. A new unsaved view is shown with application logs of all levels. You can also type app:app-log-analysis in the Search... field.
  3. To see logs of specific log levels, click Levels and select one or more levels such as Error, Info, or Warning.

Search and filter logs

The Log Analysis UI, by default, shows all available log entries (Everything). The most recent entries are shown at the bottom through an automatic refresh. In this section, you will modify what and how much is displayed and save this as a View for future use.

Search logs

  1. In the Search input box located at the bottom of the page in the Log Analysis UI,

    • you can search for lines that contain a specific text like "This is my first log entry" or 500 internal server error.
    • or a specific log level by entering level:info where level is a field that accepts string value.

    For more search fields and help, click the syntax help icon next to the search input box.

  2. To jump to a specific timeframe, enter 5 mins ago in the Jump to timeframe input box. Click the icon next to the input box to find the other time formats within your retention period.

  3. To highlight the terms, click the Toggle Viewer Tools icon.

  4. Type error as your highlight term in the first input box and hit Enter on your keyboard. Check the highlighted lines with the terms.

  5. Type container as your highlight term in the second input box and hit Enter on your keyboard. Check the highlighted lines with the terms.

  6. Click the Toggle Timeline icon to see log lines at a specific time of day.

Filter logs

You can filter logs by tags, sources, apps or levels.

  1. On the top bar, click Sources and select the name of the host (worker node) whose logs you want to check. This works well if you have multiple worker nodes in your cluster.
  2. To check other container or file logs, click app-log-analysis or Apps and select the checkboxes for the sources whose logs you want to see.

Create a view

Views are saved shortcuts to a specific set of filters and search queries.

As soon as you search or filter logs, you should see Unsaved View in the top bar. To save this as a view:

  1. Click Apps and select the checkbox next to app-log-analysis.
  2. Click Unsaved View > Save as new view and name the view app-log-analysis-view. Leave the Category empty.
  3. Click Save View and the new view should appear on the left pane showing logs for the app.

Visualize logs with graphs and breakdowns

In this section, you will create a board and then add a graph with a breakdown to visualize the app level data. A board is a collection of graphs and breakdowns.

  1. On the left pane, click on the board icon (above the settings icon) > click New Board.
  2. Set Name to app-log-analysis-board. Click Save.
  3. Under Select a field to graph:
    1. Select app as the field
    2. Select app-log-analysis as the field value
  4. Click Add Graph.
  5. Select Counts as your metric to see the number of lines in each interval over the last 24 hours.
  6. To add a breakdown, click on the arrow below the graph:
    • Choose Histogram as your breakdown type.
    • Choose level as your field type.
    • Click Add Breakdown to see a breakdown with all the levels you logged for the app.

Validate IBM Cloud Monitoring instance configuration and monitor your cluster

During creation of the IBM Cloud Kubernetes Service cluster, it is expected that you completed the steps to also connect to a Monitoring service. In the following, you are going to add IBM Cloud Monitoring to the application. The service regularly checks the availability and response time of the app.

  1. From the Kubernetes clusters page, click on the name of the Kubernetes cluster you just created and click Overview on the left pane.

  2. Scroll down to Integrations and open the Monitoring UI by clicking the Launch button. It may take a few minutes for the monitoring information to appear. If instead of Launch you see a Connect button, you can click it to create the integration if it was not done during the creation of the cluster. It simplifies the installation of a sysdig-agent pod on each node of your cluster. The agent captures metrics and forwards them to your monitoring instance.

  3. To check whether the sysdig-agent pods on each node of your cluster are in Running status, run the following command in a shell:

    kubectl get pods --namespace ibm-observe
    

    You should see an output similar to the one below, with one sysdig-agent running per worker node that you have deployed in your cluster.

    NAME                 READY   STATUS    RESTARTS   AGE
    sysdig-agent-m6k9w   1/1     Running   0          73s
    sysdig-agent-mp4d6   1/1     Running   0          73s
    sysdig-agent-q2s55   1/1     Running   0          73s
    

Small clusters can result in pods stuck in a Pending state because the requested resources exceed what is available. This can be adjusted by changing the daemon set:

kubectl edit daemonset/sysdig-agent -n ibm-observe

Change the values in the requests section to cpu: "200m" and memory: "200Mi", then check the pods again. A non-interactive alternative is sketched after the snippet below.

     resources:
       limits:
         cpu: "1"
         memory: 1Gi
       requests:
         cpu: "200m"
         memory: 200Mi
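
If you prefer a single non-interactive command over kubectl edit, a JSON patch along these lines should achieve the same (a sketch; it assumes the agent container is the first container in the pod template):

# Lower the CPU and memory requests of the sysdig-agent daemon set in one step
kubectl patch daemonset sysdig-agent -n ibm-observe --type=json -p='[
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "200m"},
  {"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/memory", "value": "200Mi"}]'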

Note: The agent installation as provided by the IBM Cloud script includes the enablement of the Prometheus metrics feature by default. The deployment configuration app-log-analysis.yaml used for the example Python application in this tutorial includes the appropriate annotations to scrape for Prometheus metrics.

spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8002"

The application includes the Prometheus library prometheus_client, which the sample app in this tutorial uses to generate custom metrics. Prometheus clients are available for most programming languages. See Prometheus metrics for details.
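
To inspect the raw metrics the app exposes, you can port-forward to the scrape port from the annotation above and fetch the standard /metrics endpoint (a sketch; it assumes the app runs in the default namespace):

# In one terminal: forward the metrics port (8002, from the prometheus.io/port annotation)
kubectl port-forward deployment/app-log-analysis-deployment 8002:8002
# In another terminal: fetch the metrics and filter for the tutorial's custom counters
curl -s http://localhost:8002/metrics | grep wolam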

Monitor your cluster

To check the health and performance of your app and cluster you can review the default (out-of-the-box) and/or custom application generated metrics that are captured.

Note: Change the interval to 10 M on the bottom bar of the UI.

  1. Go back to the application running at http://$MYINGRESSSUBDOMAIN/, click on the Monitoring tab, and generate several metrics.
  2. Back to the Monitoring UI, select Dashboards.
  3. Search for the predefined dashboard named Kubernetes > Workload Status & Performance.
  4. In this dashboard note the set of filters cluster in, namespace in, ... Set the workload in filter to app-log-analysis-deployment to focus on the metrics generated by the application deployed earlier.
  5. Scroll through the dashboard to discover the predefined graphs such as HTTP Requests Count per Workload, HTTP Requests Latency per Workload, and Resource usage by Pod.

The sample application that was deployed includes code to generate custom metrics. These custom metrics are provided using a Prometheus client and mock access to multiple API endpoints.

Figure 2. Dashboard showing API counter metrics

  1. Under Explore, select All workloads.
  2. Expand your cluster name on the left pane, then expand default namespace and then click on app-log-analysis-deployment.
  3. In the list of metrics, expand wolam.
  4. Select wolam_api_counter_total to monitor the calls to API endpoints.
  5. Configure the query
    1. Set Metric to sum
    2. Set Group by to sum
    3. Set label to endpoint
  6. Go back to the application running at http://$MYINGRESSSUBDOMAIN/, click on the Monitoring tab, and generate a few metrics after changing the region.
  7. To monitor the calls to a given API endpoint of the application by region, change the Group by label to region.

Create a custom dashboard

Along with the predefined dashboards, you can create your own custom dashboard to display the most useful and relevant views and metrics for the containers running your app in a single location. Each dashboard comprises a series of panels configured to display specific data in a number of different formats.

To create a dashboard with a first panel:

  1. Under Dashboards, click New Dashboard.
  2. In the New Panel:
    1. Set the Metric to sysdig_container_net_http_request_time.
    2. Set Group by to container_id.
  3. Edit the Dashboard scope, set the filter to container_image, is and icr.io/solution-tutorials/tutorial-application-log-analysis:latest.
  4. Save the dashboard.

Figure 3. New dashboard

To add another panel:

  1. Use the Add Panel button in the dashboard.
  2. Change the panel type from Timechart to Number.
  3. Click the Metrics and Grouping and set the Metric to sysdig_container_net_request_count.
  4. Set the Time Aggregation to Rate.
  5. Set the Group by to Sum.
  6. Enable Compare To and set the value to 1 Hour ago.
  7. Save the panel.

Remove resources

  • If you created them as part of this tutorial, remove the logging and monitoring instances from the Observability page.
  • Delete the cluster including worker node, app and containers. This action cannot be undone.
    ibmcloud ks cluster rm --cluster $MYCLUSTER -f --force-delete-storage
    

Depending on the resource, it might not be deleted immediately but retained (by default for 7 days). You can reclaim the resource by deleting it permanently or restore it within the retention period. See this document on how to use resource reclamation.
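
For example, you can list pending reclamations and act on them with the resource commands of the IBM Cloud CLI:

# List resources pending reclamation
ibmcloud resource reclamations
# Permanently delete a reclaimed resource, or restore it within the retention period
ibmcloud resource reclamation-delete <reclamation ID>
ibmcloud resource reclamation-restore <reclamation ID>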

Expand the tutorial

  • Use the IBM Cloud® Activity Tracker service to track how applications interact with IBM Cloud services.
  • Add alerts to your view.
  • Export logs to a local file.
  • Examine views.py in the sample application and experiment updating the application to capture additional custom metrics. Create an updated image version and update and apply app-log-analysis.yaml to redeploy your updates.

Related content