Monitoring a Linux bare metal server

You can monitor a bare metal server with IBM Cloud Monitoring by configuring a monitoring agent on the server. The monitoring agent acts as a data collector: it automatically collects metrics and uses an access key (token) to authenticate with the IBM Cloud Monitoring instance. You view the metrics through the web-based user interface. You can monitor bare metal servers in IBM Cloud, on-premises, and in other clouds.

By default, this agent collects core infrastructure and network time series that you can use to monitor the host. For a list of collected metrics, see Metrics Available for non-orchestrated environments.

The Monitoring agent automatically collects the following types of system metrics per host:

  • System host metrics provide information about CPU, memory, and storage usage that you can use to analyze the performance and resource utilization of all your processes.

  • File and file system metrics provide information about files and file systems that you can use to analyze file interactions that occur in your system. For example, you can find information about your open files, bytes going in and out, or the percentage of usage of a given file system.

  • Process metrics provide information about the processes that run in your servers. For example, you can use these metrics to explore the number of processes, or get client or server information.

  • Network metrics provide information about the network. They offer insight into the connections that are established between your applications, containers, and servers. For example, you can find information about the bytes that are being sent or received, or the number of HTTP requests, connections, and latency. In addition, for SQL or MongoDB, the agent collects additional information when it is configured in troubleshooting mode.

Through the Monitoring UI, you can analyze data in the Advisor tab, the Explore tab, and in the Dashboard tab. You monitor the data through metric views and dashboards.

Consider the following information when monitoring your data:

  • In the Explore tab, you can monitor individual metrics.

  • In the Advisor tab, you can monitor Red Hat OpenShift or host level metrics.

    This tab is only available for users that belong to a team that has access to monitor Red Hat OpenShift or host level metrics.

  • In the Dashboard tab, you can monitor through predefined or custom dashboards and get specialized insight into network data, application data, topology, services, hosts, and containers. A panel displays a metric or group of metrics in a dashboard.

For each metric view and dashboard, you can define the scope of the data, how to aggregate data, and what time and group filters to apply to the data. For more information, see Managing panels.

You can configure a dashboard as the default entry point for a team, unifying a team's experience, and allowing users to focus their immediate attention on the most relevant information for them.

For more information, see Viewing metrics.

Before you begin

  1. Read about IBM Cloud Monitoring.

  2. Install the IBM Cloud CLI. For more information, see Installing the IBM Cloud CLI.

  3. Provision an IBM Cloud Monitoring instance from the catalog.

  4. Provision a bare metal server.

    To complete the steps in this topic, ensure that the bare metal server has internet access. Internet access is required to configure the monitoring agent.

  5. Configure a VPN connection between your terminal and the bare metal server.

    Virtual Private Networking (VPN) access enables users to manage all servers remotely and securely over the IBM Cloud® private network. A VPN connection from your location to the private network allows out-of-band management and server rescue through an encrypted VPN tunnel. You can initiate VPN tunnels to any IBM Cloud data center or point of presence (PoP), which gives you geographic redundancy.

    Complete the following steps to configure a VPN connection between your terminal and the bare metal server:

    1. Enable VPN access on each account that needs VPN access.

    2. Depending on your operating system, download the latest MotionPro 32-bit or 64-bit files from the Array Networks Clients and Tools download site. Learn more.

    3. Configure a standalone SSL VPN client and open a connection:

    For example, if you use the MotionPro Plus client for macOS, to add a profile, click Add.

    In the Basic section, enter a Title. Enter a Gateway, for example, for a bare metal in Dallas 10, enter vpn.dal10.softlayer.com. Enter your VPN user name. Check that the Port is set to 443. Then, click OK.

    To open a secure connection, click Login.

  6. Connect to the bare metal server by using SSH.

    Depending on your security setup and the SSH configuration on the bare metal host, you might require a VPN connection to access your system.

    Connect to the host over SSH by using your credentials, or the root credentials that are available from the IBM Cloud console.

    You need root permissions to install the monitoring agent.

    For example, you can complete the following steps to get the bare metal information that you need to ssh into the server:

    1. Log in to your IBM Cloud account.

    2. Click the Menu icon > Classic Infrastructure > Device List.

    3. Identify the bare metal server that you want to monitor. Copy the Public IP.

    4. Click the bare metal server device name.

    5. Select Passwords. Copy the password for the root user.

      Then, from a terminal, run the following command:

      ssh <USER_ID>@<IP_ADDRESS>
      

      Where:

      <USER_ID> is the user ID that you use to log in to the bare metal server. For example, use root.

      <IP_ADDRESS> is the public IP address of the bare metal server.

      For example: ssh root@45.123.122.12

Configure a monitoring agent to collect metrics from the bare metal server

You must install a monitoring agent to collect and forward metrics from a bare metal server to an IBM Cloud Monitoring instance.

Complete the following steps from the command line to install a monitoring agent:

  1. Obtain the access key. For more information, see Getting the access key through the IBM Cloud UI.

  2. Obtain the ingestion URL. For more information, see collector endpoints.

  3. Deploy the monitoring agent. Run the following command:

    curl -sL https://ibm.biz/install-sysdig-agent | sudo bash -s -- --access_key ACCESS_KEY --collector COLLECTOR_ENDPOINT --collector_port 6443 --secure true --tags TAG_DATA --additional_conf 'sysdig_capture_enabled: false'
    

    Where:

    • ACCESS_KEY is the ingestion key for the instance.

    • COLLECTOR_ENDPOINT is the ingestion URL for the region where the monitoring instance is available.

    • TAG_DATA are comma-separated tags that are formatted as TAG_NAME:TAG_VALUE. You can associate one or more tags to your monitoring agent. For example, role:serviceX,location:us-south. Later on, you can use these tags to identify metrics from the environment where the agent is running.

    • The --secure flag must be set to true to use a secure SSL/TLS connection to send metrics to the collector.

    • Set sysdig_capture_enabled to false to disable the capture feature. By default, it is set to true. For more information, see Working with captures.

    If cURL is not available, you must install it. For example, on an Ubuntu bare metal server, run the following command: sudo apt-get update. Then, run the install command: sudo apt-get install curl.

    For example, see the following sample command to install a monitoring agent that forwards metrics to a monitoring instance in US South (Dallas):

    curl -sL https://ibm.biz/install-sysdig-agent | sudo bash -s -- -a xxxxxxxxxxxxx -c ingest.us-south.monitoring.cloud.ibm.com --collector_port 6443 --secure true -ac "sysdig_capture_enabled: false" --tags sourceType:baremetal,location:dallas
    
  4. Configure the agent for non-orchestrated environments.

    Open the dragent.yaml file that is located in /opt/draios/etc/.

    Add the following configuration parameter:

    feature:
      mode: monitor_light
    

    Restart the agent. Run the following command:

    service dragent restart
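
Before you run the install command, it can help to check that the tag string follows the TAG_NAME:TAG_VALUE format that is described in step 3. The following is a minimal sketch, not part of the agent installer: the function name validate_tags is hypothetical, and it assumes that tag names and values contain no commas or colons.

```shell
#!/bin/sh
# Hypothetical helper (not part of the agent installer): succeeds only when
# the argument is a comma-separated list of TAG_NAME:TAG_VALUE pairs.
# Assumption: names and values contain no "," or ":" characters.
validate_tags() {
  printf '%s\n' "$1" | grep -Eq '^[^,:]+:[^,:]+(,[^,:]+:[^,:]+)*$'
}

# Example value taken from the sample install command in this topic.
TAGS='sourceType:baremetal,location:dallas'

if validate_tags "$TAGS"; then
  echo "Tags look well formed: $TAGS"
else
  echo "Tags are malformed: $TAGS" >&2
fi

# The install command downloads the script with cURL, so check for it first.
if ! command -v curl >/dev/null 2>&1; then
  echo "curl is not installed" >&2
fi
```

If the check passes, you can pass the same value to the --tags option of the install command.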
    

Launch the monitoring UI to verify that you are getting data to monitor the bare metal server

Complete the following steps to launch the web UI:

  1. Log in to your IBM Cloud account.

    After you log in with your user ID and password, the IBM Cloud console opens.

  2. Click the Menu icon > Observability.

  3. Select Monitoring.

    The list of instances that are available on IBM Cloud is displayed.

  4. Select your instance. Then, click Open dashboard.

It might take some time before you see the bare metal entry while the information is initially collected and processed by the monitoring agent.

You can monitor only one instance per browser. However, you can open multiple tabs for the same instance.

Monitor your bare metal

In the Advisor tab, you can monitor and troubleshoot the health, risk, and capacity of hosts and Kubernetes clusters.

Advisor tab

  • Data is refreshed every 10 minutes.
  • Metrics are prioritized by event count and severity.
  • For more information, see Advisor.

In the Advisor section, choose to monitor by host. Check out the predefined dashboards that you can use to monitor the health of your resources.

When you choose to monitor by host, you can choose any of the following dashboards:

  • Host Resource Usage
  • File System Usage & Performance
  • Memory Usage
  • Network
  • Sysdig Agent Health & Status

[Optional] Configure the Prometheus IPMI Exporter to monitor sensor metrics

In addition to the set of metrics that are automatically collected by the monitoring agent, you might want to collect other metrics such as sensor metrics. You can use the Prometheus IPMI Exporter to collect Intelligent Platform Management Interface (IPMI) device sensor metrics from the bare metal server.

  • The Prometheus IPMI Exporter supports local IPMI devices and remote devices that can be accessed by using Remote Management Control Protocol (RMCP).
  • When you use RMCP to access remote devices, you can use an IPMI exporter to monitor multiple IPMI devices. You identify each device by passing the target hostname as a parameter.
  • The IPMI exporter relies on tools from the FreeIPMI suite.

You can collect the following metrics when you configure the IPMI exporter in a bare metal server:

  • IPMI admin metrics

    The metric ipmi_up{collector="<NAME>"} reports 1 when the data for the named collector is retrieved successfully from the host. It reports 0 when the collection fails.

    The metric ipmi_scrape_duration_seconds reports the amount of time that it takes the collector to retrieve the data.

  • IPMI System event log (SEL) metrics

    The metric ipmi_sel_entries_count reports the number of entries in the system event log.

    The metric ipmi_sel_free_space_bytes reports the number of free bytes for new system event log entries.

  • IPMI sensor data

    The IPMI exporter collects 2 metrics per sensor type: state and value. For the state metric, a value of 0 reports a normal state, a value of 1 reports a warning state, and a value of 2 reports a critical state. A value of NaN reports that the information is not available. For example, see the metrics for different sensors:

    Temperature sensor metrics: ipmi_temperature_celsius, ipmi_temperature_state

    Fan speed sensor metrics: ipmi_fan_speed_rpm, ipmi_fan_speed_state

    Voltage sensor metrics: ipmi_voltage_state, ipmi_voltage_volts

  • IPMI chassis power state of the machine

    The metric ipmi_chassis_power_state reports the current power state of the machine's chassis. It has a value of 1 when the power is on. It has a value of 0 when the power is off.

  • DCMI data

    The metric ipmi_dcmi_power_consumption_current_watts reports the live power consumption of the machine in watts.

  • BMC details

    The metric ipmi_bmc_info includes information about the firmware revision and manufacturer in labels and has a value of 1.

For more information, see Prometheus IPMI Exporter.
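
The state values that are described for the sensor metrics can be turned into readable labels with a few lines of shell. The following sketch is illustrative only: the function name ipmi_state_label and the two sample metric lines are assumptions, and the labels that a real exporter reports depend on your hardware.

```shell
#!/bin/sh
# Hypothetical helper: maps the value of a *_state metric to the meaning
# that is described in this topic.
ipmi_state_label() {
  case $1 in
    0)   echo "normal" ;;
    1)   echo "warning" ;;
    2)   echo "critical" ;;
    NaN) echo "not available" ;;
    *)   echo "unknown" ;;
  esac
}

# Illustrative sample in Prometheus text format (invented sensor names).
sample='ipmi_temperature_state{id="1",name="CPU Temp"} 0
ipmi_fan_speed_state{id="2",name="FAN1"} 2'

# Print each metric name with its interpreted state.
printf '%s\n' "$sample" | while read -r line; do
  value=${line##* }      # text after the last space is the sample value
  metric=${line%%\{*}    # text before the first "{" is the metric name
  echo "$metric: $(ipmi_state_label "$value")"
done
```

With the sample lines, the loop reports the temperature sensor as normal and the fan sensor as critical.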

Complete the following steps to configure the Prometheus IPMI Exporter:

Install the Prometheus IPMI exporter

Complete the following steps:

  1. From a local terminal, download the Prometheus IPMI exporter.

  2. In the bare metal server, from the SSH session, create the directory /usr/monitor. Run the following commands:

    cd /usr
    
    mkdir monitor
    
  3. Copy the file to the bare metal server. From your local terminal, in the directory where the file is available, run the following command:

    scp ipmi_exporter-v1.2.0.linux-amd64.tar.gz root@<IP_ADDRESS>:/usr/monitor/
    

    Where <IP_ADDRESS> is the public IP address of the bare metal server.

    If the command fails, check that your VPN connection is still open.

  4. In the bare metal server, from the SSH session, uncompress the file. Run the following commands:

    cd /usr/monitor/
    
    tar -xvf ipmi_exporter-v1.2.0.linux-amd64.tar.gz
    
  5. In the bare metal server, from the SSH session, install the FreeIPMI suite. For example, on Ubuntu, run the following commands:

    sudo apt-get update
    
    sudo apt-get install freeipmi
    
  6. In the bare metal server, from the SSH session, check the ipmi_local.yml file. Optionally, you can update the file to exclude sensors that you do not want to monitor.

    Change to the directory where you have extracted the IPMI exporter:

    cd ipmi_exporter-v1.2.0.linux-amd64/
    

    Check the configuration file. Run the command more ipmi_local.yml. You should see content similar to the following:

    # Configuration file for ipmi_exporter
    
    # This is an example config for scraping the local host.
    # In most cases, this should work without using a config file at all.
    modules:
            default:
                    # Available collectors are bmc, ipmi, chassis, dcmi, and sel
                    collectors:
                    - bmc
                    - ipmi
                    - dcmi
                    - chassis
                    - sel
                    # Got any sensors you don't care about? Add them here.
                    exclude_sensor_ids:
                    # - 2
    
  7. In the bare metal server, from the SSH session, run the IPMI exporter.

    ./ipmi_exporter --config.file=ipmi_local.yml &
    
  8. Check that the IPMI exporter is running. Run the command:

    ps -aux | grep ipmi
    

    You should see the IPMI exporter running.
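
In addition to checking the process list, you might confirm that the exporter actually serves IPMI metrics. The following sketch counts the ipmi_* sample lines in the exporter output. The function name count_ipmi_samples is hypothetical, and the URL in the comment assumes port 9290 and the /metrics path, which are the values used in the Prometheus scrape configuration in this topic.

```shell
#!/bin/sh
# Hypothetical helper: reads Prometheus text-format output on stdin and
# prints how many sample lines belong to ipmi_* metrics. Comment lines
# (# HELP, # TYPE) start with "#" and are not counted.
# Note: grep -c prints 0 and returns a non-zero status when nothing matches.
count_ipmi_samples() {
  grep -c '^ipmi_'
}

# On the bare metal server you might pipe the exporter output into it
# (port and path are assumptions based on the scrape target in this topic):
#   curl -s http://localhost:9290/metrics | count_ipmi_samples
```

A count of 0 suggests that the exporter is not running or that no collector returned data.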

Install the Prometheus exporter

The monitoring agent automatically collects metrics from Prometheus exporters. Therefore, to collect metrics from your IPMI exporter, you must also configure the Prometheus exporter.

Complete the following steps to run the Prometheus exporter:

  1. From a local terminal, download the Prometheus exporter.

  2. In the bare metal server, from the SSH session, change to the directory /usr/monitor/. Run the following command:

    cd /usr/monitor/
    
  3. Copy the file to the bare metal server. From your local terminal, in the directory where the file is available, run the following command:

    scp prometheus-2.18.1.linux-amd64.tar.gz root@<IP_ADDRESS>:/usr/monitor/
    

    Where <IP_ADDRESS> is the public IP address of the bare metal server.

    If the command fails, check that your VPN connection is still open.

  4. In the bare metal server, from the SSH session, uncompress the file. Run the following commands:

    cd /usr/monitor/
    
    tar -xvf prometheus-2.18.1.linux-amd64.tar.gz
    
  5. Modify the prometheus.yml file to include information about the scrape_configuration for the IPMI exporter.

    Change to the Prometheus directory:

    cd prometheus-2.18.1.linux-amd64/
    

    Edit the prometheus.yml file and add the section scrape_configs:

    # my global config
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).
    
    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          # - alertmanager:9093
    
    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"
    
    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's the IPMI exporter.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: ipmi
    
        metrics_path: '/metrics'
        scheme: http
    
        static_configs:
        - targets: ['localhost:9290']
          labels:
            instance: baremetal01
            region: us-south
    
  6. Run the Prometheus exporter:

    ./prometheus &
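
Whenever you edit prometheus.yml, you might validate the file before you start or restart Prometheus. The Prometheus archive also contains the promtool binary, which provides a check config command. The wrapper below is a sketch: the function name check_prometheus_config and its return code are hypothetical, and the directory is the one that is extracted earlier in this topic.

```shell
#!/bin/sh
# Hypothetical wrapper: validates prometheus.yml with promtool when the
# binary is present in the given directory. Returns 2 when it is missing.
check_prometheus_config() {
  dir=$1
  if [ -x "$dir/promtool" ]; then
    "$dir/promtool" check config "$dir/prometheus.yml"
  else
    echo "promtool not found in $dir" >&2
    return 2
  fi
}

# Directory name as extracted earlier in this topic.
check_prometheus_config /usr/monitor/prometheus-2.18.1.linux-amd64 \
  || echo "Configuration was not validated" >&2
```

A YAML indentation mistake in the scrape_configs section is the kind of error that this check is meant to surface.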
    

Configure network settings

If you want to collect metrics from remote servers, complete the following steps:

  1. Configure the firewall to allow access to the ipmi_exporter port (9290 in this topic).

  2. [Optional] Update the VPC rules

    If you use private endpoints, add an inbound rule to the security group for port 9290 with source type = Security Group and choose the security group for the bare metal server.

Update the monitoring agent that is running in the bare metal server

Complete the following steps:

  1. In the bare metal server, from the SSH session, change to the directory /opt/draios/etc/. Run the following command:

    cd /opt/draios/etc/
    
  2. Update the /opt/draios/etc/dragent.yaml file.

    Append the following section to the dragent.yaml file:

    prometheus:
     enabled: true
     interval: 30
     log_errors: true
     max_metrics: 3000
     max_metrics_per_process: 3000
     max_tags_per_metric: 20
     process_filter:
       - include:
           port: 9090
           conf:
             port: 9090
             path: "/metrics"
       - include:
           port: 9290
           conf:
             port: 9290
             path: "/metrics"
    
  3. Restart the monitoring agent. Run the following command:

    service dragent restart
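
After you edit the file, you might confirm that both ports from the process_filter section are present. The following sketch is a plain text check, not a YAML parser, and the function name has_port_entry is hypothetical. It only looks for a line of the form port: PORT anywhere in the file.

```shell
#!/bin/sh
# Hypothetical helper: text-based check (not a YAML parser) that FILE has a
# line of the form "port: PORT". Usage: has_port_entry FILE PORT
has_port_entry() {
  grep -Eq "^[[:space:]]*port:[[:space:]]*$2[[:space:]]*\$" "$1"
}

# Path and ports as used in this topic.
CONF=/opt/draios/etc/dragent.yaml
if [ -f "$CONF" ]; then
  for port in 9090 9290; do
    has_port_entry "$CONF" "$port" || echo "port $port is missing in $CONF" >&2
  done
fi
```

Because the check ignores YAML structure, treat a passing result as a quick sanity test only.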
    

Verify that you can see the Prometheus IPMI metrics

Complete the following steps:

  1. Click the Menu icon > Observability.

  2. Select Monitoring.

  3. Identify the monitoring instance that you created. Then, click Open dashboard.

  4. In the Explore view, select Hosts and Containers. Then, select the bare metal server that you want to monitor.

    Hosts and Containers view

  5. Open the option to select more Dashboards and Metrics. Then, enter ipmi in the search bar. The list of IPMI metrics is displayed.

    IPMI metrics

Configure a dashboard to analyze the IPMI status of your bare metal server

To create a dashboard to monitor the IPMI metrics, complete the following steps:

  1. Select the ipmi_up metric.

    ipmi_up metrics

  2. Select the 3 dots icon. Then, select Copy to dashboard.

    Copy dashboard

  3. Enter the name [Bare Metal] IPMI monitoring. Then, click Copy and Open.

    Copy and open a dashboard

    The dashboard opens.

    IPMI custom dashboard

  4. Add more IPMI metrics to the [Bare Metal] IPMI monitoring custom dashboard. Repeat the steps for each of the IPMI metrics that you want to monitor.

  5. Drag and drop, and resize panels to get the dashboard layout that you want. Save the layout.

Next steps