Supporting multiline logs for the Logging agent in orchestrated environments

Errors and stack traces can span several lines, with each line being sent as a separate log entry. To support the ingestion of multiline logs by IBM® Cloud Logs from applications, such as Java or Python, that run in orchestrated environments, such as Red Hat OpenShift on IBM Cloud or IBM Cloud Kubernetes Service, you must make changes to the Logging agent configuration. The changes include the parsing that is required to group related log lines into a single log record.

About multiline

In OpenShift and Kubernetes clusters, the logging system captures logs from application stdout and stderr streams. It then adds a prefix to each log line with metadata before storing the logs in files, following the Container Runtime Interface (CRI) logging format.

This Kubernetes log line prefix contains:

  • Timestamp: In ISO 8601 format.
  • Stream name: stdout or stderr.
  • Tag: F or P.

The CRI logging format uses tags to define if a log line is a single log line or a multiline log entry. Valid values for the tag are:

  • Partial (P): This tag is included in log lines that result from the runtime splitting a single log entry into multiple lines, and the log entry has not ended yet.
  • Full (F): This tag indicates that the log entry is complete. It is used for a single-line entry or to mark the last line of a multiline entry.
For example:

    2024-03-15T10:30:45.123456789Z stdout F This is a complete log line
    2024-03-15T10:30:45.123456789Z stderr P This is the first part of a
    2024-03-15T10:30:45.123456789Z stderr F multiline error message

By default, the Logging agent includes the configuration of the Tail plugin with the cri multiline parser to support concatenation of these CRI-formatted logs from stdout and stderr into a single log line.

In Kubernetes environments that use CRI-based logging, it is recommended that you keep the Multiline.parser setting (set to cri by default) so that multiline logs generated by the containers are correctly parsed and reassembled.
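
For illustration, the cri parser concatenates consecutive partial (P) lines with the final full (F) line, so the last two lines from the earlier example are delivered as one log record. The following is a conceptual sketch of that behavior, not literal agent output:

    # Input lines in CRI format:
    2024-03-15T10:30:45.123456789Z stderr P This is the first part of a
    2024-03-15T10:30:45.123456789Z stderr F multiline error message

    # Resulting single log record after the cri multiline parser:
    This is the first part of a multiline error message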

You might also have applications, such as Java or Python, where errors and stack traces span several lines, and each line is sent as a separate log entry. These applications generate multiple log lines that belong together and should be grouped into a single log record. To handle these multiline logs through the Logging agent, you must configure an additional Multiline parser.

Default multiline configuration

The default multiline configuration for CRI logs is set up and enabled when you deploy the Logging agent.

In Fluent Bit, you can configure the Multiline parser by using one of the built-in multiline parsers or a custom multiline parser.

By default, the Tail plugin that is configured with the Logging agent uses the built-in multiline cri parser. This parser processes logs that are generated by the CRI-O container engine and supports concatenation of log entries.

For example, the Logging agent has this default configuration for multiline support:

    [INPUT]
        Name              tail
        Tag               kube.*
        .....
        Buffer_Chunk_Size 32KB
        Buffer_Max_Size   256KB
        Multiline.parser  cri
        Skip_Long_Lines   On
        Refresh_Interval  10
        storage.type      filesystem
        storage.pause_on_chunks_overlimit on

Configuring additional multiline support for applications

If you have applications, such as Java or Python, where errors and stack traces can span several lines, and each line is sent as a separate log entry, you must configure the Multiline parser in the Logging agent.

Choose one of the following options to configure the Logging agent with the Multiline parser:

Adding a custom multiline parser

To create a custom multiline parser for use with the Logging agent, follow the instructions in Configurable Multiline Parsers. You define a custom regex that determines the multiline pattern.

The Logging agent configuration must also include a FILTER after the INPUT plug-in. The filter applies the pattern of the configured MULTILINE_PARSER. The Name value of the MULTILINE_PARSER must match the Multiline.parser value in the FILTER.

filter-multiline.conf: |
    [FILTER]
        Name              multiline
        Match             *
        Multiline.parser  INSERT_CUSTOM_PARSER_NAME
        Multiline.key_content log
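
For reference, a matching [MULTILINE_PARSER] entry in parsers.conf could look like the following sketch. The parser name and regex rules are placeholders that you replace with your own values, and the Name must match the Multiline.parser value that is used in the FILTER:

parsers.conf: |
    [MULTILINE_PARSER]
        Name            INSERT_CUSTOM_PARSER_NAME
        Type            regex
        Flush_timeout   500
        Rule            "start_state"     "/^YOUR_START_PATTERN/"             "cont"
        Rule            "cont"            "/^YOUR_CONTINUATION_PATTERN/"      "cont"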

Adding multiline support for apps

If you have the Logging agent deployed and you have apps such as Java or Python, where errors and stack traces can span several lines, and each line is sent as a separate log entry, you can update your agent configuration and configure the Multiline parser.

Complete the following steps to add multiline support in the Logging agent:

  1. Log in to the cluster. For more information, see Access your cluster.

  2. In the Logging agent config map (inputs.conf), add the Multiline parser.

    The Logging agent configuration must have the @INCLUDE for the multiline filter right after the input plug-in.

    fluent-bit.conf: |
    [SERVICE]
      Flush                   1
      Log_Level               info
      Daemon                  off
      Parsers_File            parsers.conf
      Plugins_File            plugins.conf
    
      ...
    
    
    @INCLUDE input-kubernetes.conf
    @INCLUDE filter-multiline.conf
    
    ...
    
    input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Tag               kube.*
        .....
        Buffer_Chunk_Size 32KB
        Buffer_Max_Size   256KB
        Multiline.parser  cri
        Skip_Long_Lines   On
        Refresh_Interval  10
        storage.type      filesystem
        storage.pause_on_chunks_overlimit on
    
    filter-multiline.conf: |
    [FILTER]
        Name              multiline
        Match             *
        Multiline.parser  multiline-java-example
        Multiline.key_content log
    
    parsers.conf: |
    ...
    [MULTILINE_PARSER]
        Name            multiline-java-example
        Type            regex
        Flush_timeout   500
        Rule            "start_state"     "/^(\d+-\d+-\d+ \d+:\d+:\d+\.\d+)(.*)$/"     "cont"
        Rule            "cont"            "/^(?!\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}).*$/"     "cont"
    ...
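
    With this example parser, a multiline Java error such as the following hypothetical application output is grouped into a single log record: only the first line matches the start_state timestamp pattern, and the remaining lines match the cont rule.

    2024-03-15 10:30:45.123 ERROR Failed to process request
    java.lang.NullPointerException: Cannot invoke method on null object
        at com.example.Handler.process(Handler.java:42)
        at com.example.Server.handle(Server.java:17)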
    
  3. Restart the agent pods.

    For Kubernetes clusters, run:

    kubectl -n ibm-observe rollout restart ds/logs-agent
    

    For OpenShift clusters, run:

    oc -n ibm-observe rollout restart ds/logs-agent
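
    Optionally, wait for the rollout to complete before verifying logs. Assuming the same namespace and DaemonSet name as in the restart commands, run one of the following:

    kubectl -n ibm-observe rollout status ds/logs-agent

    oc -n ibm-observe rollout status ds/logs-agent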
    

Configuring multiline support for clusters with multiple runtimes and parsing requirements

If you have clusters that run applications in multiple languages or runtimes (for example, Java, Go, and Python), you might need to handle multiline logs from various sources. If you already have a custom multiline parser for Java, you can combine it with built-in parsers for other runtimes such as Go and Python. This ensures that all logs are parsed and forwarded to IBM® Cloud Logs as correctly grouped entries.

When you specify multiple parsers (built-in and custom) in a comma-separated list, the Logging agent tries each parser in sequence until a match is found.
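
For example, if you manage the agent configuration by editing the config map directly rather than through Helm, the multiline filter can reference several parsers in one comma-separated list. The parser names shown here are illustrative; use the built-in and custom parsers that apply to your workloads:

filter-multiline.conf: |
    [FILTER]
        Name              multiline
        Match             *
        Multiline.parser  go, python, multiline-java-example
        Multiline.key_content log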

Configuring multiple parsers using Helm

If you are using Helm to configure your orchestrated environment, update the multilinePreprocessor section to reference both built-in parsers (for example, go and python) and your custom parsers in a comma-separated list.

For example:

multilinePreprocessor:
  - name: multiline
    multiline.parser: go, python, nodejs, ruby, multiline-java-example, multiline-nodejs-winston
    multiline.key_content: log

If you have installed a previous version of the Logging agent and have updated the agent configuration by modifying the config map directly in the cluster, make a copy of your config map from the cluster before running the helm upgrade command. When the Logging agent is updated, any changes made to the config map will be overwritten.

After updating the values.yaml file, run helm upgrade to apply the changes. Verify your configuration by checking logs from different runtimes in IBM Cloud Logs to ensure multiline grouping works across all configurations.
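
The exact helm upgrade command depends on how you installed the agent. A generic sketch follows; substitute your own release name, chart reference, and namespace:

helm upgrade <RELEASE_NAME> <CHART_REFERENCE> --values values.yaml --namespace <NAMESPACE>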

More information and examples

For more information and tutorials with example scenarios for configuring multiline processing, see the following topics.

Additional resources for the Logging agent multiline processing

For information about                                                   See
Configuring multiline support for the Logging agent in Linux           Topic
Configuring multiline support for the Logging agent in Windows         Topic
Multiline parsing for Java applications with Log4j                     Tutorial
Multiline parsing using Helm for Java applications with Log4j          Tutorial
Multiline parsing for Node.js applications using Winston               Tutorial
Multiline parsing using Helm for Node.js applications using Winston    Tutorial