Centralized Logging and Monitoring with Elastic Stack

Elasticsearch Feb 13, 2022

Recent versions of the Elastic Stack come with very handy features for collecting logs and monitoring hosts or Docker containers. We can run Elasticsearch and Kibana with the following docker-compose file:

version: '2.2'
services:
  es01:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.17.0
    container_name: es01
    environment:
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - xpack.security.enabled=true
      - xpack.security.authc.api_key.enabled=true
      - xpack.security.audit.enabled=true
      - ELASTIC_PASSWORD=somethingsecret
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - data01:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - elastic

  kib01:
    image: docker.elastic.co/kibana/kibana:7.17.0
    container_name: kib01
    ports:
      - 5601:5601
    environment:
      ELASTICSEARCH_URL: http://es01:9200
      ELASTICSEARCH_HOSTS: '["http://es01:9200"]'
      ELASTICSEARCH_USERNAME: elastic # assumption: authenticate as the built-in elastic user
      ELASTICSEARCH_PASSWORD: somethingsecret
      SERVER_PUBLICBASEURL: http://localhost:5601
      XPACK_ENCRYPTEDSAVEDOBJECTS_ENCRYPTIONKEY: "something_at_least_32_characters"
      XPACK_REPORTING_ENCRYPTIONKEY: "something_at_least_32_characters"
      XPACK_SECURITY_ENCRYPTIONKEY: "something_at_least_32_characters"
    depends_on:
      - es01
    networks:
      - elastic

volumes:
  data01:
    driver: local

networks:
  elastic:
    driver: bridge
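
To bring the stack up and check that Elasticsearch is responding (using the password set in the compose file above):

docker-compose up -d
curl -u elastic:somethingsecret http://localhost:9200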

Reference on configuring the kibana security:

Configure security in Kibana | Kibana Guide [8.0] | Elastic
A list of the supported authentication mechanisms in Kibana.

Collecting Metrics with Metricbeat

We can run the following docker container in each host that we want to collect metrics from.

version: '3.8'

services:
  metricbeat:
    image: docker.elastic.co/beats/metricbeat:7.17.0
    user: root
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./metricbeat.docker.yml:/usr/share/metricbeat/metricbeat.yml:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /proc:/hostfs/proc:ro
      - /:/hostfs:ro
      - /var/log:/var/log:rw
      - /var/lib/docker/containers:/var/lib/docker/containers:rw
    network_mode: "host"

The Metricbeat configuration is provided with the metricbeat.docker.yml file:

metricbeat.max_start_delay: 10s
# setup.ilm.enabled: true
setup.dashboards.enabled: true
setup.dashboards.beat: metricbeat

#==========================  Modules configuration =============================

metricbeat.modules:

#-------------------------------- System Module --------------------------------
- module: system
  metricsets:
    - cpu             # CPU usage
    - load            # CPU load averages
    - memory          # Memory usage
    - network         # Network IO
    - process         # Per process metrics
    - process_summary # Process summary
    - uptime          # System Uptime
    - socket_summary  # Socket summary
    #- core           # Per CPU core usage
    #- diskio         # Disk IO
    #- filesystem     # File system usage for each mountpoint
    #- fsstat         # File system summary metrics
    #- raid           # Raid
    #- socket         # Sockets and connection info (linux only)
    #- service        # systemd service information
  enabled: true
  period: 10s
  processes: ['.*']

  # Configure the mount point of the host’s filesystem for use in monitoring a host from within a container
  #system.hostfs: "/hostfs"

  # Configure the metric types that are included by these metricsets.
  cpu.metrics:  ["percentages","normalized_percentages"]  # The other available option is ticks.
  core.metrics: ["percentages"]  # The other available option is ticks.

#-------------------------------- Docker Module --------------------------------
- module: docker
  metricsets:
    - "container"
    - "cpu"
    - "diskio"
    - "event"
    - "healthcheck"
    - "info"
    #- "image"
    - "memory"
    - "network"
    #- "network_summary"
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "somethingsecret"

setup.kibana:
  host: "localhost:5601"
  username: "elastic"
  password: "somethingsecret"
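
Once Metricbeat is running, we can quickly verify that data is arriving by listing its indices (Metricbeat writes to metricbeat-* indices by default):

curl -u elastic:somethingsecret 'http://localhost:9200/_cat/indices/metricbeat-*?v'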

We use the system module to collect host metrics and the docker module to collect stats via the Docker engine. Metricbeat supports many more modules for collecting metrics from Docker, Kubernetes, various databases, message queues such as RabbitMQ and Kafka, and many other applications. You can find the modules and their configuration options here:

Modules | Metricbeat Reference [8.0] | Elastic

These modules are very easy to configure for collecting metrics from various systems. For example, we can collect metrics from a RabbitMQ server with the following configuration:

#----------------------------- RabbitMQ Module ---------------------------------

- module: rabbitmq
  metricsets: ["node", "queue", "connection", "exchange"]
  enabled: true
  period: 10s
  hosts: ["rabbithost:15672"]
  username: admin
  password: rabbit-password

As we enabled the setup.dashboards.enabled option, Metricbeat loads its prebuilt dashboards into Kibana. We can customise them or create new views and dashboards. Some of the ready-made dashboards look as follows:

From Observability -> Inventory we can view hosts and/or containers along with their metrics.

We can check the logs (if we are collecting them) or the metrics of any host or container by clicking on the views here.

As mentioned before, we can also collect metrics from databases such as PostgreSQL and view them in the same way.
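
As a sketch, a PostgreSQL module configuration would look like the following (the host and credentials are placeholders):

#---------------------------- PostgreSQL Module --------------------------------

- module: postgresql
  metricsets: ["database", "bgwriter", "activity"]
  enabled: true
  period: 10s
  hosts: ["postgres://pghost:5432"]
  username: postgres
  password: pg-password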

Index Lifecycle Management

One of the nice features of Elasticsearch in recent versions is ILM (Index Lifecycle Management), which provides an out-of-the-box way to clean up old documents and indices from storage. As far as I know, in older versions people had to run a separate process (Curator) to clean up old records.

At the moment Elasticsearch provides the following phases for ILM:

  • Hot: The index is actively being updated and queried.
  • Warm: The index is no longer being updated but is still being queried.
  • Cold: The index is no longer being updated and is queried infrequently. The information still needs to be searchable, but it’s okay if those queries are slower.
  • Frozen: The index is no longer being updated and is queried rarely. The information still needs to be searchable, but it’s okay if those queries are extremely slow.
  • Delete: The index is no longer needed and can safely be removed.

ILM moves indices through the lifecycle according to their age. You can set how many days the records should stay in a phase and then it will go to the next phase and finally will be deleted (if configured so).

As we configured Metricbeat to collect metrics every 10 seconds, disk usage grows quickly and can reach several GBs per day. To get rid of the older records we can clean them up via ILM policies. For further details please check the documentation:

Index lifecycle | Elasticsearch Guide [8.0] | Elastic
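
As a minimal sketch, a policy that rolls over daily and deletes week-old data could look like this (the policy name and thresholds are illustrative):

PUT _ilm/policy/metricbeat-cleanup
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "10gb" }
        }
      },
      "delete": {
        "min_age": "7d",
        "actions": { "delete": {} }
      }
    }
  }
}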

Setting Alerts For Metrics

To create an alert on metrics, we can go to Stack Management -> Alerts and Insights -> Rules and Connectors -> Create Rule. There are multiple rule types we can use, but I would like to show the Metric Threshold type. We need to set a name for the rule and an interval that defines how often to check, and for the notify setting we can select to be notified only on status changes. We also need to define which metric to evaluate, for example: system.cpu.total.norm.pct average above 80%. We can additionally group by a field.
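
The same rule can also be created programmatically through Kibana's alerting API. The following is only a rough sketch; I have not verified the exact params schema for the Metric Threshold rule type, so treat those field names as assumptions:

curl -X POST -u elastic:somethingsecret \
  -H 'kbn-xsrf: true' -H 'Content-Type: application/json' \
  http://localhost:5601/api/alerting/rule \
  -d '{
    "name": "high-cpu",
    "rule_type_id": "metrics.alert.threshold",
    "consumer": "infrastructure",
    "schedule": { "interval": "1m" },
    "notify_when": "onActionGroupChange",
    "params": {
      "criteria": [{
        "aggType": "avg",
        "metric": "system.cpu.total.norm.pct",
        "comparator": ">",
        "threshold": [0.8],
        "timeSize": 5,
        "timeUnit": "m"
      }]
    },
    "actions": []
  }'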

The actions in the free version are limited to Index and Server Log. In the paid subscription tiers there are many more options such as Email, Slack, Jira, Webhook, etc. As I am using the free version for now, I will select the Index action, which writes the alert details to an index. When we select the index action we need to define which fields from the alert event we want to record as a JSON document, for example (the mustache variables refer to fields of the alert event):

    "actionGroup": "{{alert.actionGroup}}",

We created a new index (alert-logs) to write the alerts to and specified it in an index connector as follows:

If we check the alert-logs index, we can see that triggered alerts are saved to it.
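
For instance, we can inspect the recorded alerts directly:

curl -u elastic:somethingsecret 'http://localhost:9200/alert-logs/_search?pretty'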

If you are using the free license and want to receive email alerts, you can use the following tool that I prepared for triggering email alerts.

GitHub - turkogluc/elastic-email-alerts: Elasticsearch alerts email action

There are a few environment variables to configure before running it. It is a very lightweight tool and can be used to send email alerts from indexes. An example email looks as follows:

Collecting Logs

Metricbeat is used to collect metrics; to collect logs we can use Filebeat as follows:

version: '3.8'

services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:7.17.0
    user: root
    volumes:
      - ./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/log/custom:/var/log/custom:ro

We can provide the configuration filebeat.docker.yml as follows:

setup.dashboards.enabled: true
setup.dashboards.beat: filebeat

#=========================== Filebeat inputs =============================


filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/custom/*.json

  json.keys_under_root: true
  json.overwrite_keys: true
  json.add_error_key: true
  json.expand_keys: true

  # Decode JSON options. Enable this if your logs are structured in JSON.
  # JSON key on which to apply the line filtering and multiline settings. This key
  # must be top level and its value must be string, otherwise it is ignored. If
  # no text key is defined, the line filtering and multiline features cannot be used.

  # By default, the decoded JSON is placed under a "json" key in the output document.
  # If you enable this setting, the keys are copied top level in the output document.
  #json.keys_under_root: false

  # If keys_under_root and this setting are enabled, then the values from the decoded
  # JSON object overwrite the fields that Filebeat normally adds (type, source, offset, etc.)
  # in case of conflicts.
  #json.overwrite_keys: false

  # If this setting is enabled, then keys in the decoded JSON object will be recursively
  # de-dotted, and expanded into a hierarchical object structure.
  # For example, `{"a.b.c": 123}` would be expanded into `{"a":{"b":{"c":123}}}`.
  #json.expand_keys: false

  # If this setting is enabled, Filebeat adds a "error.message" and "error.key: json" key in case of JSON
  # unmarshaling errors or when a text key is defined in the configuration but cannot
  # be used.
  #json.add_error_key: false

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  enabled: true
  hosts: ["localhost:9200"]
  username: "elastic"
  password: "somethingsecret"

setup.kibana:
  host: "localhost:5601"
  username: "elastic"
  password: "somethingsecret"

We can run this on every host whose logs we want to collect, and it will ship them to Elasticsearch. Note that logs are collected only from the /var/log/custom folder and only from JSON files, so applications should write their logs into this folder.
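
For example, a log line like the following in /var/log/custom/app.log.json (a hypothetical file) would be decoded, and thanks to json.keys_under_root its fields would end up at the top level of the event:

{"@timestamp":"2022-02-13T10:15:30.000Z","log.level":"INFO","message":"user logged in","service.name":"demo-app"}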

Application Level Logging for Java Application

For Spring or plain Java applications, we can configure the application logging to be compatible with log collectors and Elasticsearch using ECS logging.

Get started | ECS Logging Java Reference [1.x] | Elastic

We need to add the following dependency:

implementation 'co.elastic.logging:logback-ecs-encoder:1.3.2'
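
Or the equivalent Maven dependency:

<dependency>
    <groupId>co.elastic.logging</groupId>
    <artifactId>logback-ecs-encoder</artifactId>
    <version>1.3.2</version>
</dependency>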

And add the following logback.xml configuration:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property name="LOG_FILE" value="${LOG_FILE:-spring.log}"/>
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>
    <include resource="org/springframework/boot/logging/logback/file-appender.xml"/>
    <include resource="co/elastic/logging/logback/boot/ecs-file-appender.xml"/>
    <root level="INFO">
        <appender-ref ref="CONSOLE"/>
        <appender-ref ref="ECS_JSON_FILE"/>
        <appender-ref ref="FILE"/>
    </root>
</configuration>

The LOG_FILE environment variable can be used to specify the path and file name. For example export $LOG_FILE=/var/log/custom/app.log and then there will be app.log.json file will be created and written as well. Therefore filebeat can collect the logs from the specific folder.