Why should we learn how to process application logs with Elastic Stack? After all, the default logging mechanism in Spring Boot allows us to start working on our POC in no time. However, we must be aware that inadequate logging makes debugging and monitoring difficult in a production environment.
What we are going to build
In this example we are going to work with the project described in the Spring Boot Log4j 2 advanced configuration #2 – add a Rollover Strategy for log files post and available in the spring-boot-log4j-2-scaffolding repository. To enhance the project with the Elastic Stack we’re going to add:
- Filebeat to read from a log file and pass entries to Logstash;
- Logstash to parse and send logs to Elasticsearch;
- Elasticsearch to keep indexed logs accessible to Kibana;
- ElasticHQ to monitor Elasticsearch.
As a result, we will be able to process Spring Boot logs with Elastic Stack.
Process logs in Elastic Stack run with Docker
All services are configured in the docker-compose.yml file which is attached to the project. Meanwhile, you can clone the repository and run $ docker-compose up on your machine to verify results. Remember to start the Spring Boot app first, so that there are logs for Elastic Stack to process.
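For reference, a typical sequence on my machine could look like this (just a sketch – it assumes the scaffolding project ships the Maven wrapper and writes its log file under ./logs, as configured in the previous post):

# start the Spring Boot app so that the log file is being written (assumes the Maven wrapper is present)
$ ./mvnw spring-boot:run
# in a second terminal, start the Elastic Stack services
$ docker-compose up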
The example configuration is based on the documentation. I store all sensitive and configurable properties as environment variables. In the same directory where the docker-compose.yml file resides, create the .env file that contains the default values for the environment:
# .env
COMPOSE_PROJECT_NAME=spring-boot-elastic_stack
ELASTIC_STACK_VERSION=7.7.0
ELASTIC_USER=elastic
ELASTIC_PASSWORD=test
ELASTIC_HOST=elasticsearch:9200
JAVA_OPTS=-Xmx256m -Xms256m
Using the COMPOSE_PROJECT_NAME variable is totally up to you – I just wanted to shorten the service names in the command line output. Browse image tags to see other available versions.
Run Elasticsearch with Docker
The container config is shown in the following snippet from the docker-compose.yml file:
# docker-compose.yml
version: "3.3"
services:
  elasticsearch:
    image: elasticsearch:$ELASTIC_STACK_VERSION
    volumes:
      - elasticsearch:/usr/share/elasticsearch/data
    environment:
      ES_JAVA_OPTS: $JAVA_OPTS
      ELASTIC_USER: $ELASTIC_USER
      ELASTIC_PASSWORD: $ELASTIC_PASSWORD
      xpack.security.enabled: "true"
      # Change discovery type to enable the production mode and bootstrap checks
      discovery.type: single-node
    ports:
      - "9200:9200"
    networks:
      - internal

networks:
  internal:

volumes:
  elasticsearch:
Volume
To keep data between container restarts I set up a named volume on my machine. I mounted the content of the /usr/share/elasticsearch/data directory (recommended in the docs and in this issue) to my elasticsearch volume.
Environment variables
You can read about the details concerning the ES_JAVA_OPTS in the Setting JVM options for an ElasticSearch service run in a Docker container post. Let’s explore the rest in the following sections.
Security
By default, the security features are disabled. We want to secure communication between the services. Therefore, we set the xpack.security.enabled property to true and provide the credentials.
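Once the container is up, you can make a quick sanity check that authentication is really required – for example with curl, assuming the port mapping and the credentials from the .env file above:

# without credentials the request should be rejected with 401
$ curl -i http://localhost:9200
# with the credentials from .env the cluster should answer
$ curl -u elastic:test 'http://localhost:9200/_cluster/health?pretty'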
Production and development mode
When an Elasticsearch node uses the single-node discovery, it can’t form a cluster with another machine via a non-loopback address. Configuring the internal communication in this way means that the node is in the development mode. We want to work in this mode in order to avoid enforcing the bootstrap checks. In the development mode any failed check is logged as a warning, while in the production mode it prevents the node from starting.
These bootstrap checks inspect a variety of Elasticsearch and system settings and compare them to values that are safe for the operation of Elasticsearch.
https://www.elastic.co/guide/en/elasticsearch/reference/current/bootstrap-checks.html
Ports
Elasticsearch uses the http and transport ports. The former supports incoming HTTP requests and the latter is used for communication between nodes. We’re going to run only one elasticsearch container, therefore we’ll expose only the http port – 9200 – to allow communication with Logstash and Kibana (to expose the APIs over HTTP). Check out the documentation on configuring the transport modules if you need to set up communication between nodes.
Networks
I’m going to keep all services running in the example project within one network – internal. Feel free to configure networking according to your needs.
Run ElasticHQ with Docker
For monitoring Elasticsearch nodes we’re going to use ElasticHQ. It’s an open-source application that we can run using its Docker image. This tool provides a REST API for managing clusters, available on the http://localhost:5000/api URL. To run the service with Docker I updated the docker-compose.yml file as shown below:
# docker-compose.yml
…
services:
  …
  elastichq:
    image: elastichq/elasticsearch-hq:latest
    environment:
      - HQ_DEFAULT_URL=http://$ELASTIC_USER:$ELASTIC_PASSWORD@$ELASTIC_HOST
    ports:
      - "5000:5000"
    networks:
      - internal
    depends_on:
      - elasticsearch
…
To make sure that the elasticsearch service starts before elastichq, we use the depends_on property.
Connecting to Elasticsearch
After starting the container we can verify the results by visiting the default address http://localhost:5000. You can see the page on the screenshot below:
The default URL visible in the input takes the value from the HQ_DEFAULT_URL environment variable. It uses the ElasticHQ format for Basic Auth, which requires adding the credentials for Elasticsearch. To make the default port (5000) available, it is exposed in the docker-compose.yml file. After successfully connecting to the elasticsearch node we can see the following view:
You can also connect over SSL, change the logging setup or externalize the configuration.
Run Logstash with Docker
Furthermore, to ensure that we process logs properly within our Elastic Stack, we are going to transfer data through a Logstash pipeline.
Pipeline
Create the logstash.conf file in which we’re going to specify and configure plugins for each pipeline section:
# logstash/pipeline/logstash.conf
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{JAVACLASS:exception}:\s%{GREEDYDATA:stacktrace}" }
    add_tag => ["stacktrace"]
  }
  grok {
    patterns_dir => ["/usr/share/logstash/pipeline/patterns"]
    match => { "message" => "%{TIMESTAMP_ISO8601:log_timestamp}\s*%{LOGLEVEL:log_level}\s*%{POSINT:logged_for_pid}\s*--- \[+%{NOTSPACE:logged_for_thread}+\]\s*%{JAVACLASS:logger}%{GREEDYDATA:loggercd}\s*%{MSG:log_message}" }
    add_tag => ["spring_boot_log"]
  }
  if [loggercd] {
    mutate {
      replace => { "logger" => "%{logger}%{loggercd}" }
      strip => ["logger"]
      remove_field => ["loggercd"]
    }
  }
  if "stacktrace" in [tags] or "spring_boot_log" in [tags] {
    mutate {
      remove_tag => ["_grokparsefailure"]
    }
  }
}
output {
  elasticsearch {
    hosts => ["${ELASTIC_HOSTS}"]
    user => "${ELASTIC_USER}"
    password => "${ELASTIC_PASSWORD}"
    index => "spring-boot-app-logs-%{+YYYY.MM.dd}"
  }
}
The Logstash documentation contains other example configurations to illustrate how you can create a more advanced setup. Let’s take a look at the structure of our config file.
input
Logstash will expect incoming Beats connections on the 5044 port. We have to remember this when configuring the Filebeat output.
filter
Applying filters allows us to parse and customise unstructured log data. With the Grok filter plugin we can combine syntax and semantic elements to pull out useful fields from a log entry. Feel free to browse the available patterns or create custom patterns if needed.
match
Every log entry is going to be matched against regular expressions and mapped according to the parts we want to extract. You don’t need to start your application to verify whether the pattern will work. Visit the Grok Debugger, paste an example log line and your pattern to see the matches. As you can see, I defined two matches – one for Java exceptions and one for Spring Boot logs. You can learn more about it in the How to parse exceptions and normal logs with Grok filters post.
Custom grok patterns
In the match section you can see that I declared the path to the file containing my custom grok patterns:
# logstash/pipeline/patterns/custom.txt
MSG \s:.*
In my project the patterns directory is located in the same place as the logstash.conf file. Later, in the Docker configuration, you’ll see that I mount the ./logstash/pipeline directory to the /usr/share/logstash/pipeline location in the container. Therefore, in the patterns_dir option I put the resulting path to this directory.
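If the custom pattern doesn’t seem to be picked up, you can check that the directory is visible inside the container – a quick sketch, assuming the logstash service configured later in this post is already running:

$ docker-compose exec logstash ls /usr/share/logstash/pipeline/patterns
# should list custom.txt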
Conditions
In my example I want to impose control over how events are processed by the filter:
- I merge two parts of the logger field as described in the Parsing logs with Grok #1 What to do when part of one field got caught in a different pattern post.
- I remove the grok failure tag when an entry was successfully processed by one of my matches, as described in the Parsing logs with Grok #2 How to parse exceptions alongside regular logs post.
output
This is the final step in our pipeline. We can declare multiple outputs to push data to different destinations, and we have a wide range of available plugins to assist us in this task. In our example I’m going to use the elasticsearch plugin. You can explore the documentation to learn about all available options for this plugin. I decided to set the output options with environment variables.
hosts
We use this parameter to reference either data or client nodes in Elasticsearch. I could pass an array of hosts to distribute requests across them, but for the sake of simplicity let’s just use one – elasticsearch:9200.
user, password
We enabled the Elasticsearch security in the container config. Therefore, my logstash service has to use the username and password set for elasticsearch.
index
Logstash will write logs under this index. I decided to make it dynamic by combining the spring-boot-app-logs prefix with the event timestamp formatted according to the Joda format.
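With this format the index name resolves to something like spring-boot-app-logs-2020.05.12. You can list the created indices with curl – assuming the port and credentials used throughout this example:

$ curl -u elastic:test 'http://localhost:9200/_cat/indices/spring-boot-app-logs-*?v'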
Debugging in console
You can add another output to see Logstash logs in the console:
# logstash/pipeline/logstash.conf
…
output {
  stdout {
    codec => rubydebug
  }
  …
}
If you use IntelliJ with Docker support you will be able to see the parsed log entries alongside regular Logstash logs:
Docker container
To run this service we’re going to add the following lines to our docker-compose.yml file:
# docker-compose.yml
…
services:
  …
  logstash:
    image: logstash:$ELASTIC_STACK_VERSION
    ports:
      - "5044:5044"
      - "9600:9600"
    environment:
      LS_JAVA_OPTS: $JAVA_OPTS
      ELASTIC_USER: $ELASTIC_USER
      ELASTIC_PASSWORD: $ELASTIC_PASSWORD
      ELASTIC_HOSTS: $ELASTIC_HOST
      XPACK_MONITORING_ENABLED: "false"
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    networks:
      - internal
    depends_on:
      - elasticsearch
…
Applying pipeline config to the docker service using a volume
This service has to use the pipeline in the logstash.conf file and our custom grok patterns. We’re going to mount the ./logstash/pipeline directory as a read-only volume to the /usr/share/logstash/pipeline location in the container. Thanks to that, our container will be able to use our configuration file and custom patterns.
Ports
We need to expose the default Logstash monitoring API port – 9600 – as well as the 5044 port we already defined in the logstash.conf file as the input port for data sent by Filebeat.
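A quick way to confirm that Logstash is up is to query the exposed monitoring API – assuming the port mapping from the compose file above:

$ curl 'http://localhost:9600/?pretty'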
Environment variables
We’re going to set the heap size with the same JAVA_OPTS variable as in the Elasticsearch container. Furthermore, we have to set the credentials and define the hosts that will be applied in the logstash.conf file. I decided to disable X-Pack Monitoring to keep this example as simple as possible. If you leave it enabled but not configured, you will get the Unable to retrieve license information from license server error.
Run Filebeat with Docker
Filebeat can read and forward log lines reliably even when it’s interrupted. Once everything works again, it starts from where it was when the failure occurred. Therefore, we can be sure that the Elastic Stack will process all logs.
In this example it will read log entries from the all.log file. Remember that Filebeat doesn’t read the last line in a file if there is no newline character after it. You can explore all configuration options in the filebeat.reference.yml file and learn more about this tool in the How Filebeat works docs.
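Before suspecting the pipeline when an entry seems to be missing, it can help to confirm that the file really ends with a newline – a small shell sketch, assuming the log file sits under ./logs on the host:

# prints "ok" when the last byte of the file is a newline
$ test -z "$(tail -c 1 logs/all.log)" && echo ok || echo "missing trailing newline"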
Configuration
Firstly, we need to configure our filebeat service to:
- read log entries from the all.log file,
- concatenate the lines of a stack trace into one entry,
- send entries to our logstash service.
You can see the full setup in the snippet below:
# filebeat/filebeat.yml
filebeat:
  inputs:
    - type: log
      paths:
        - /logs/all.log
      multiline:
        pattern: '^[[:space:]]+(at|\.{3})[[:space:]]+\b|^Caused by:'
        match: after
output:
  logstash:
    hosts: ["logstash:5044"]
inputs
In this section we tell Filebeat how it should locate and process data. We are enabling the log input that will read from the file we specified in the paths option. Additionally, we’re concatenating the lines of a Java stack trace into one entry by using the multiline option. Check out the Reading from rotating logs and Log rotation results in lost or duplicate events articles if you want to configure Filebeat to read from rotating log files.
output
We’re going to configure Filebeat to use Logstash. For the sake of simplicity I specified only one entry in the hosts option and used the default 5044 port.
Set config file permissions properly
We can read in the documentation that:

The owner of the configuration files must be either root or the user who is executing the Beat process. The permissions on each file must disallow writes by anyone other than the owner.
https://www.elastic.co/guide/en/beats/libbeat/current/config-file-permissions.html#config-file-permissions
To comply with these requirements for our filebeat.yml file, we are going to customize the image configuration. Create the Dockerfile file with the following content:
# filebeat/Dockerfile
ARG ELASTIC_STACK_VERSION
FROM docker.elastic.co/beats/filebeat:${ELASTIC_STACK_VERSION}
COPY filebeat.yml /usr/share/filebeat/filebeat.yml
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml
RUN chmod go-w /usr/share/filebeat/filebeat.yml
USER filebeat
- Our configuration will be copied to the /usr/share/filebeat/filebeat.yml location in the container.
- We switch temporarily to root to change the ownership of this file: the user ownership is changed to root and the group ownership is changed to filebeat.
- We want to remove the write privilege for anyone except the owner.
- In the end, we can switch back to the filebeat user.
Don’t disable the strict permission check and don’t run the container as root to fix the ownership issue. Use the custom image instead.
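Once the stack is up, you can double-check the result inside the running container, for example:

$ docker-compose exec filebeat ls -l /usr/share/filebeat/filebeat.yml
# expect the owner to be root, the group to be filebeat, and no group/other write bits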
Configure the docker container
The filebeat service is the last one we’re going to set up in the docker-compose.yml file in this article.
# docker-compose.yml
…
services:
  …
  filebeat:
    build:
      context: ./filebeat
      args:
        ELASTIC_STACK_VERSION: $ELASTIC_STACK_VERSION
    volumes:
      - ./logs/all.log:/logs/all.log:ro
    networks:
      - internal
    depends_on:
      - logstash
…
In the filebeat.yml file we specified the path to our log file: /logs/all.log. In the container config we have to mount this file to the given path. Thanks to this volume, Filebeat can access the logs, as you can see in the screenshot below:
Verify that the Elastic Stack processes application logs
Let’s assume that we have the all.log file with the following content:
2020-05-12 08:29:20.290 ERROR 10197 --- [main] s.SpringBootLog4j2ScaffoldingApplication : Logger exception message
java.lang.IllegalArgumentException: Exception message
	at in.keepgrowing.springbootlog4j2scaffolding.SpringBootLog4j2ScaffoldingApplication.main(SpringBootLog4j2ScaffoldingApplication.java:14) [classes/:?]
2020-05-12 08:31:26.530 INFO 10197 --- [SpringContextShutdownHook] o.s.s.c.ThreadPoolTaskExecutor : Shutting down ExecutorService 'applicationTaskExecutor'
I’m going to start all services with the following command, run in the directory where my docker-compose.yml file is located:
$ docker-compose up
When all services are running I have to do the following:
- visit the ElasticHQ app running on http://localhost:5000/,
- connect with the http://elastic:test@elasticsearch:9200 URL in the form input,
- choose the Query option from the menu (top right corner),
- select the spring-boot-app-logs-YYYY.MM.dd index and execute the default query.
The results look like the screenshot below:
We can verify that filtering, parsing and mutating log entries work correctly. As you can see, we configured the Elastic Stack services to process our logs.
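If you prefer the command line over ElasticHQ, a similar check can be done with curl – assuming the credentials and port used throughout this post:

$ curl -u elastic:test 'http://localhost:9200/spring-boot-app-logs-*/_search?pretty&size=1'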
In addition, you can find the code that enables us to process logs with Elastic Stack in the commit 2a8068c2209c00605f2f24470c75b83ab712267c.
Learn more on how to take care of logs and process them with Elastic Stack
- Logging guide from OWASP
- Logging cheat sheet from OWASP
- Insufficient logging and monitoring impacts
- Using the grok debugger (YouTube video)
- Getting started with the Elastic Stack
- Getting started with Filebeat
- disabling Logstash monitoring: How to fix logstash error Unable to retrieve license information from license server
- enabling Logstash monitoring: Monitoring Logstash with X-Pack, Configuring Credentials for Logstash Monitoring
Photo by Alex Knight on StockSnap