Why should we learn how to process application logs with Elastic Stack? After all, the default logging mechanism in Spring Boot allows us to start working on our POC in no time. However, we must be aware that inadequate logging makes debugging and monitoring difficult in a production environment.
What we are going to build
In this example we are going to work with the project described in the Spring Boot Log4j 2 advanced configuration #2 – add a Rollover Strategy for log files post and available in the spring-boot-log4j-2-scaffolding repository. To enhance the project with the Elastic Stack we’re going to add:
- Filebeat to read from a log file and pass entries to Logstash;
- Logstash to parse and send logs to Elasticsearch;
- Elasticsearch to keep indexed logs accessible to Kibana;
- ElasticHQ to monitor Elasticsearch.
As a result, we will be able to process Spring Boot logs with Elastic Stack.
Process logs with the Elastic Stack running in Docker
All the services are configured in the docker-compose.yml file attached to the project. You can clone the repository and run $ docker-compose up on your machine to verify the results. Remember to start the Spring Boot app first, so that there are logs for the Elastic Stack to process.
The example configuration is based on the documentation. I store all sensitive and configurable properties as environment variables. In the same directory where the docker-compose.yaml file resides, create the file that contains the default values for the environment.
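The exact contents depend on your setup; a minimal sketch could look like the one below. The variable names and the version are illustrative rather than taken from the original repository, so align them with whatever your docker-compose.yaml actually references:

```
# .env – illustrative defaults (names and values are assumptions)
ELASTIC_VERSION=7.9.2
ELASTIC_USER=elastic
ELASTIC_PASSWORD=test
ES_JAVA_OPTS=-Xms512m -Xmx512m
LS_JAVA_OPTS=-Xms256m -Xmx256m
```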
Run Elasticsearch with Docker
The container config is shown in the following snippet from the docker-compose.yaml file.
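A sketch of what this service definition could look like, assuming a named volume called elasticsearch-data and the variables from the .env file above (the exact values in the project may differ):

```yaml
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:${ELASTIC_VERSION}
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=true
      - ELASTIC_PASSWORD=${ELASTIC_PASSWORD}
      - ES_JAVA_OPTS=${ES_JAVA_OPTS}
    volumes:
      # named volume that keeps the index data between container restarts
      - elasticsearch-data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"   # http port; the transport port stays internal
    networks:
      - internal

volumes:
  elasticsearch-data:

networks:
  internal:
```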
To keep data between container restarts I set up a named volume on my machine and mounted it to the /usr/share/elasticsearch/data path (recommended in the docs and in this issue).
You can read about the details concerning the ES_JAVA_OPTS in the Setting JVM options for an ElasticSearch service run in a Docker container post. Let’s explore the rest in the following sections.
By default the security features are disabled. We want to run secured communication between the services, therefore we set the xpack.security.enabled property to true and provide the credentials.
Production and development mode
When an Elasticsearch node uses single-node discovery, it can’t form a cluster with another machine via a non-loopback address. Configuring the internal communication in this way means that the node is in development mode. We want to work in this mode in order to disable the bootstrap checks: in development mode any failed check is logged as a warning, while in production mode it prevents the application from starting.
"These bootstrap checks inspect a variety of Elasticsearch and system settings and compare them to values that are safe for the operation of Elasticsearch." (https://www.elastic.co/guide/en/elasticsearch/reference/current/bootstrap-checks.html)
Elasticsearch uses the http and transport ports. The former handles incoming HTTP requests, while the latter serves communication between nodes. We’re going to run only one elasticsearch container, therefore we’ll expose only the http port – 9200 – to allow communication with Logstash and Kibana (to expose the APIs over HTTP). Check out the documentation on configuring the transport module if you need to set up communication between nodes.
I’m going to keep all the services in the example project within one network – internal. Feel free to configure networking according to your needs.
Run Elastichq with Docker
For monitoring Elasticsearch nodes we’re going to use ElasticHQ. It’s an open-source application that we can run using its Docker image. The tool provides a REST API for managing clusters at the http://localhost:5000/api url. To run the service with Docker I updated the docker-compose.yaml file as shown below.
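A sketch of that service entry, assuming the elastichq/elasticsearch-hq image and the credentials defined earlier (the exact values may differ in the project):

```yaml
  elastichq:
    image: elastichq/elasticsearch-hq
    environment:
      # default url shown in the connection input, including the Basic Auth credentials
      - HQ_DEFAULT_URL=http://${ELASTIC_USER}:${ELASTIC_PASSWORD}@elasticsearch:9200
    ports:
      - "5000:5000"
    depends_on:
      - elasticsearch
    networks:
      - internal
```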
To make sure that the elasticsearch service starts before elastichq, we use the depends_on property.
Connecting to Elasticsearch
After starting the container we can verify the results by visiting the default address http://localhost:5000. You can see the page in the screenshot below:
The default url visible in the input takes its value from the HQ_DEFAULT_URL environment variable. The ElasticHQ format for Basic Auth requires adding the credentials for Elasticsearch to this url. To make the app reachable on the default port (5000), that port is exposed in the docker-compose.yaml file. After successfully connecting with the elasticsearch node we can see the following view:
Run Logstash with Docker
Furthermore, to ensure that we process logs properly within our Elastic Stack, we are going to transfer data through a Logstash pipeline. The pipeline is defined in the logstash.conf file, in which we’re going to specify and configure plugins for each pipeline section.
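The overall shape of such a file is sketched below; the concrete grok filter and the elasticsearch output are discussed in the following sections:

```
input {
  beats {
    # Logstash listens here for connections from Filebeat
    port => 5044
  }
}

filter {
  # grok and mutate plugins go here – see the filter sections below
}

output {
  # elasticsearch (and optionally stdout) outputs go here – see the output section below
}
```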
Logstash will expect incoming Beats connections on port 5044. We have to remember this when configuring the Filebeat output.
Applying filters allows us to parse and customize unstructured log data. With the Grok filter plugin we can define the syntax and semantics used to pull useful fields out of a log entry. Feel free to browse the available patterns or create custom patterns if needed.
Every log entry is going to be matched against regular expressions and mapped according to the parts we want to extract. You don’t need to start your application to verify whether a pattern will work. Visit the Grok Debugger, paste an example log line and your pattern to see the matches. As you can see, I defined two matches – one for Java exceptions and one for Spring Boot logs. You can learn more about it in the How to parse exceptions and normal logs with Grok filters post.
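For illustration, a match for the default Spring Boot log layout could look roughly like the sketch below; the field names and the pattern itself are my own approximation rather than the exact expressions used in the project:

```
filter {
  grok {
    # Illustrative pattern for the default Spring Boot log layout;
    # a second pattern for Java exception entries would sit alongside it.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}\s+%{LOGLEVEL:level}\s+%{NUMBER:pid}\s+---\s+\[\s*%{DATA:thread}\]\s+%{DATA:logger}\s*:\s+%{GREEDYDATA:log_message}" }
  }
}
```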
Custom grok patterns
In the match section you can see that I declared the path to the file containing my custom grok patterns.
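For example, with the container layout described below, the option could be set like this (the MY_LOGGER pattern name is hypothetical and stands in for whatever the patterns file defines):

```
grok {
  # directory with custom pattern files, mounted into the container
  patterns_dir => ["/usr/share/logstash/pipeline/patterns"]
  match => { "message" => "%{MY_LOGGER:logger}\s*:\s+%{GREEDYDATA:log_message}" }
}
```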
In my project the patterns directory is located in the same place as the logstash.conf file. Later, in the Docker configuration, you’ll see that I mount the ./logstash/pipeline directory to the /usr/share/logstash/pipeline location in the container. Therefore, in the patterns_dir option I put the resulting path to this file.
In my example I want to impose control over how events are processed by the filter (a sketch follows after this list):
- I merge two parts of the logger field, as described in the Parsing logs with Grok #1 What to do when part of one field got caught in a different pattern post.
- I remove the grok failure tag when an entry was successfully processed by one of my matches, as described in the Parsing logs with Grok #2 How to parse exceptions alongside regular logs post.
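A rough sketch of these two tweaks; the field names are hypothetical and stand in for the ones produced by my grok matches:

```
filter {
  mutate {
    # merge two parts of the logger field into one (hypothetical field names)
    replace      => { "logger" => "%{logger_part1}%{logger_part2}" }
    remove_field => [ "logger_part1", "logger_part2" ]
  }
  if "_grokparsefailure" in [tags] and [level] {
    mutate {
      # the entry was matched after all, so drop the failure tag
      remove_tag => [ "_grokparsefailure" ]
    }
  }
}
```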
This is the final step in our pipeline. We can declare multiple outputs to push data to different destinations, and we have a wide range of available plugins to assist us in this task. In our example I’m going to use the elasticsearch plugin. You can explore the documentation to learn about all the available options for this plugin. I decided to set the output options with environment variables.
We use the hosts parameter to reference either data or client nodes in Elasticsearch. I could pass an array of hosts to distribute requests across them, but for the sake of simplicity let’s just use one.
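An output along these lines would produce the daily index queried at the end of this post; the environment variable names are assumptions:

```
output {
  elasticsearch {
    hosts    => ["${ELASTICSEARCH_HOSTS}"]      # e.g. elasticsearch:9200
    user     => "${ELASTICSEARCH_USER}"
    password => "${ELASTICSEARCH_PASSWORD}"
    index    => "spring-boot-app-logs-%{+YYYY.MM.dd}"
  }
}
```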
Debugging in console
You can add another output to see Logstash logs in the console.
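The stdout plugin with the rubydebug codec prints every processed event in a readable form:

```
output {
  stdout {
    # pretty-print each event, useful for debugging the pipeline
    codec => rubydebug
  }
}
```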
If you use IntelliJ with Docker support you will be able to see the parsed log entries alongside regular Logstash logs:
To run this service we’re going to add the following lines to our docker-compose.yml file.
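A sketch of that service, assuming the variable names used earlier in this post (the exact entries in the project may differ):

```yaml
  logstash:
    image: docker.elastic.co/logstash/logstash:${ELASTIC_VERSION}
    environment:
      - LS_JAVA_OPTS=${LS_JAVA_OPTS}
      - ELASTICSEARCH_HOSTS=elasticsearch:9200
      - ELASTICSEARCH_USER=${ELASTIC_USER}
      - ELASTICSEARCH_PASSWORD=${ELASTIC_PASSWORD}
      # see the note below about disabling X-Pack monitoring
      - XPACK_MONITORING_ENABLED=false
    volumes:
      # read-only mount with logstash.conf and the custom grok patterns
      - ./logstash/pipeline:/usr/share/logstash/pipeline:ro
    ports:
      - "9600:9600"
      - "5044:5044"
    depends_on:
      - elasticsearch
    networks:
      - internal
```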
Applying pipeline config to the docker service using a volume
This service has to use the pipeline defined in the logstash.conf file and our custom grok patterns. We’re going to mount the ./logstash/pipeline directory as a read-only volume to the /usr/share/logstash/pipeline location in the container. Thanks to that, our container will be able to use our configuration file and custom patterns.
We need to expose the default Logstash HTTP API port – 9600 – as well as the 5044 port we already defined in the logstash.conf file as the input port for data sent by Filebeat.
We’re going to set the heap size with the same kind of JAVA_OPTS variable as in the Elasticsearch container. Furthermore, we have to set the credentials and define the hosts that will be applied in the logstash.conf file. I decided to disable X-Pack Monitoring to keep this example as simple as possible. If you leave it enabled but not configured you will get the Unable to retrieve license information from license server error.
Run Filebeat with Docker
Filebeat can read and forward log lines reliably even when it’s interrupted. Once everything works again, it picks up from where it was when the failure occurred. Therefore, we can be sure that the Elastic Stack will process all logs.
In this example it will read log entries from the all.log file. Remember that Filebeat doesn’t read the last line in a file if there is no newline after it. You can explore all configuration options in the filebeat.reference.yml file and learn more about this tool in the How Filebeat works docs.
Firstly, we need to configure our filebeat service to:
- read log entries from the all.log file,
- concatenate the lines of a stack trace into one entry,
- send entries to our logstash service.
You can see the full setup in the snippet below.
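What follows is a sketch assembled from the description in this section rather than the exact file from the repository; the multiline settings in particular are assumptions:

```yaml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /logs/all.log
    # treat lines that do not start with a timestamp as part of the previous entry
    multiline.pattern: '^\d{4}-\d{2}-\d{2}'
    multiline.negate: true
    multiline.match: after

output.logstash:
  hosts: ["logstash:5044"]
```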
In this section we tell Filebeat how it should locate and process data. We enable the log input that will read from the file specified in the paths option. Additionally, we concatenate Java stack traces into single entries by using the multiline options. Check out the Reading from rotating logs and Log rotation results in lost or duplicate events articles if you want to configure Filebeat to read from rotating log files.
Set config file permissions properly
We can read in the documentation that:
"The owner of the configuration files must be either root or the user who is executing the Beat process. The permissions on each file must disallow writes by anyone other than the owner." (https://www.elastic.co/guide/en/beats/libbeat/current/config-file-permissions.html#config-file-permissions)
To comply with these requirements for our filebeat.yml file, we are going to customize the image configuration and create a dedicated Dockerfile.
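A sketch of what such a Dockerfile could look like, assuming the official Filebeat base image; the tag and the user and group names are my assumptions rather than values from the original commit:

```dockerfile
FROM docker.elastic.co/beats/filebeat:7.9.2

# copy our configuration into the expected location
COPY filebeat.yml /usr/share/filebeat/filebeat.yml

# switch to root temporarily to adjust ownership and permissions
USER root
RUN chown root:filebeat /usr/share/filebeat/filebeat.yml \
    && chmod go-w /usr/share/filebeat/filebeat.yml

# switch back to the default non-root user
USER filebeat
```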
- Our configuration is copied to the /usr/share/filebeat/filebeat.yml location in the container.
- We switch temporarily to root in order to change the user and group ownership of this file.
- We remove the write privilege for anyone except the owner.
- In the end, we switch back to the default non-root user.
Configure the docker container
The filebeat service is the last one we’re going to set up in the docker-compose.yml file in this article.
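A sketch of the service definition, assuming the Dockerfile above lives in a ./filebeat directory next to docker-compose.yml and that the application writes to ./logs/all.log (both paths are assumptions):

```yaml
  filebeat:
    build: ./filebeat
    volumes:
      # the log file produced by the Spring Boot app, mounted read-only
      - ./logs/all.log:/logs/all.log:ro
    depends_on:
      - logstash
    networks:
      - internal
```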
In the filebeat.yml file we specified the path to our log file: /logs/all.log. In the container config we have to mount the log file to that path. Thanks to this volume, Filebeat can access the logs, as you can see in the screenshot below:
Verify that the Elastic Stack processes application logs
Let’s assume that we have the all.log file with some entries to process.
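For illustration, the content could look like the lines below; they are made up in the default Spring Boot log format and are not taken from the repository:

```
2021-01-10 10:15:00.123  INFO 1 --- [           main] c.e.demo.DemoApplication                 : Started DemoApplication in 3.25 seconds
2021-01-10 10:15:05.456 ERROR 1 --- [nio-8080-exec-1] c.e.demo.web.UserController              : Unexpected error while handling the request
java.lang.IllegalStateException: example exception
	at com.example.demo.web.UserController.hello(UserController.java:42)
```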
I’m going to start all the services with the following command, run in the directory where my docker-compose.yml file is located:
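```
$ docker-compose up
```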
When all services are running I have to do the following:
- visit the Elastichq app running on http://localhost:5000/,
- connect with the http://elastic:test@elasticsearch:9200 url in the form input,
- select the Query option from the menu (top right corner),
- select the spring-boot-app-logs-YYYY.MM.dd index and execute the default query.
The results look like the screenshot below:
We can verify that filtering, parsing and mutating log entries work correctly. As you can see, we configured the Elastic Stack services to process our logs.
In addition, you can find the code that enables us to process logs with Elastic Stack in the commit 2a8068c2209c00605f2f24470c75b83ab712267c.
Learn more on how to take care of logs and process them with Elastic Stack
- Logging guide from OWASP
- Logging cheat sheet from OWASP
- Insufficient logging and monitoring impacts
- Using the grok debugger (YouTube video)
- Getting started with the Elastic Stack
- Getting started with Filebeat
- disabling Logstash monitoring: How to fix logstash error Unable to retrieve license information from license server
- enabling Logstash monitoring: Monitoring Logstash with X-Pack, Configuring Credentials for Logstash Monitoring