This article covers best practices for logging in applications that run in containerized environments.
Containers have been getting a lot of attention from IT companies, and for good reasons. They make it easy to deploy, manage, and scale applications, and they give us reliability when moving applications from one environment to another. Containers are even handier in a microservices setup, where each service can have its own container and be deployed and scaled independently.
However, when migrating to containers, one needs to revisit certain aspects of an application or service, such as logging, monitoring, and security.
In this article, we will discuss the following approaches for publishing logs from containerized applications to a centralized logging system.
Container Logging Practices for Centralized Logging
While there are other approaches to centralized logging in containers, we will focus on the following four approaches.
Application based Logs Publishing
This approach is platform agnostic and is not directly affected by the containerized environment. The application uses a logging framework to publish log events to a centralized system.
For example, if you use a centralized system based on Kafka and ELK, and the Logback logging framework in your Java applications, you can use the Logback Kafka Appender.
However, this approach comes with the following challenges:
- If your centralized system is down, your application will either log to the file system (as with Logback Kafka Appender) or, much less likely, drop messages. If your centralized system is not reliable enough, you will need to consider other approaches.
- It adds overhead to your containerized application, because an in-memory queue and a background thread run inside it to publish logs to Apache Kafka.
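The two points above (a fallback when the centralized system is down, and an in-memory queue drained by a background thread) can be sketched with Python's standard logging module. This is only an illustration of the pattern, not how Logback Kafka Appender is implemented: `CentralizedHandler`, `build_logger`, and the `publish` callable are hypothetical names, and `publish` stands in for whatever client sends a message to your centralized system.

```python
import logging
import logging.handlers
import queue


class CentralizedHandler(logging.Handler):
    """Sends each record through `publish`; uses `fallback` if that fails."""

    def __init__(self, publish, fallback):
        super().__init__()
        self.publish = publish    # hypothetical: callable taking one string
        self.fallback = fallback  # any logging.Handler, e.g. a FileHandler

    def emit(self, record):
        try:
            self.publish(self.format(record))
        except Exception:
            # Centralized system unreachable: keep the event locally
            # instead of dropping it.
            self.fallback.handle(record)


def build_logger(name, publish, fallback):
    """Return (logger, listener); call listener.stop() on shutdown."""
    # Unbounded for brevity; a bounded queue would cap memory use.
    log_queue = queue.Queue()
    # QueueListener runs a background thread that drains the queue.
    listener = logging.handlers.QueueListener(
        log_queue, CentralizedHandler(publish, fallback)
    )
    listener.start()

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # The application thread only enqueues; it never waits on the network.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.propagate = False
    return logger, listener
```

The queue and the extra thread are exactly the overhead mentioned above: they live inside the application process, so their memory and CPU cost is paid by the application container.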
Mounted Data Volumes
Since containers are short-lived and transient by nature, all data saved inside a container is lost when the container is removed. To get persistent storage, logs can be saved to data volumes mounted from the host machine.
This ensures that our logs are not removed along with the containers. A log streaming service such as Loggly or Logstash can then be configured to read from these mounted data volumes and publish the logs to a centralized system.
However, this approach depends on the features of your container service, which may or may not provide equivalent functionality (although most do). It also requires you to manage an additional moving part, the data volumes, if you are not already using them for other purposes.
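On the application side, this approach mostly comes down to writing log files into the directory where the volume is mounted. A minimal sketch in Python, assuming the volume is mounted at `/var/log/app` or at a path given in a `LOG_DIR` environment variable (both are assumptions for illustration, as is the function name `configure_volume_logging`):

```python
import logging
import logging.handlers
import os
from pathlib import Path


def configure_volume_logging(service_name, log_dir=None):
    """Write rotating log files into a directory backed by a mounted volume."""
    # Assumed convention: LOG_DIR points at the mount point inside the container.
    log_dir = Path(log_dir or os.environ.get("LOG_DIR", "/var/log/app"))
    log_dir.mkdir(parents=True, exist_ok=True)

    # Rotate so a long-running container cannot fill up the host volume.
    handler = logging.handlers.RotatingFileHandler(
        log_dir / f"{service_name}.log",
        maxBytes=10 * 1024 * 1024,
        backupCount=5,
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )

    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

With Docker, for instance, the directory could be backed by the host through the `-v host_path:container_path` option of `docker run`; the log streaming service on the host then watches the host path.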
Dedicated Container per Host
This approach recommends running a dedicated container on each host for log collection and streaming. It couples your log streaming setup to a container service, because it relies on features specific to that container service to listen for log events from all other containers on the host.
However, this approach reduces the dependency on the host and ensures that all log processing is handled within the containerized environment itself.
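As one illustration of such a container-service-specific feature, Docker's default `json-file` logging driver writes each container's output to a `<id>-json.log` file in a per-container directory, one JSON object per line with `log`, `stream`, and `time` keys. A collector container that has this directory mounted read-only could parse those files roughly as follows. This is a one-pass sketch under that assumption; the function names are hypothetical, and a real collector would keep following the files and forward the events to the centralized system.

```python
import json
from pathlib import Path


def parse_json_file_line(container_id, line):
    """Turn one json-file driver line into a flat event dictionary."""
    entry = json.loads(line)
    return {
        "container": container_id,
        "stream": entry["stream"],  # "stdout" or "stderr"
        "time": entry["time"],
        "message": entry["log"].rstrip("\n"),
    }


def collect_host_logs(containers_root):
    """Yield events from every <id>/<id>-json.log file under containers_root."""
    for log_file in sorted(Path(containers_root).glob("*/*-json.log")):
        container_id = log_file.parent.name
        with log_file.open(encoding="utf-8") as handle:
            for line in handle:
                if line.strip():
                    yield parse_json_file_line(container_id, line)
```

The coupling described above is visible here: the file layout and the line format belong to one logging driver of one container service, so switching either means rewriting the collector.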
Logs Handling Sidecar
Before we talk about log handling, let's understand what a sidecar is. I personally like this simple yet elegant definition of a sidecar, quoted from another article:
A sidecar application is deployed alongside each Microservice that you have
developed and deployed to a server/hosting instance. It is conceptually attached
to the "parent" service in the same manner a motorcycle sidecar is attached to
the motorcycle – hence the name.
A sidecar runs alongside your service as a second process and provides
'platform infrastructure features' exposed via a homogeneous interface such as
a REST-like API over HTTP.
The "platform infrastructure features" mentioned in the definition above map to log handling in our context. Hence this approach calls for running a separate log handling container alongside each application container. The application container saves the log events, and the log handling container publishes them to a centralized logging system.
However, this approach is only suitable for large deployments and makes sense when you have several concerns to address in the sidecar container, such as logging, monitoring, and inter-service communication. Moreover, setting up and maintaining sidecar containers alongside application containers is complex.
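To make the division of work concrete, here is a minimal sketch of the sidecar's side of it, assuming the application container appends log lines to a file that both containers can see (for example through a shared volume). The function name `ship_new_lines` and the `publish` callable are hypothetical; `publish` stands in for the client of your centralized logging system.

```python
def ship_new_lines(log_path, offset, publish):
    """Publish complete lines appended after `offset`; return the new offset."""
    with open(log_path, "rb") as handle:
        handle.seek(offset)
        while True:
            line = handle.readline()
            if not line.endswith(b"\n"):
                # End of file, or a line the application is still writing:
                # stop here and pick it up again on the next poll.
                break
            publish(line.decode("utf-8").rstrip("\n"))
            offset = handle.tell()
    return offset
```

The sidecar would call this function in a loop, sleeping between polls and keeping the returned offset, so that each line is published once even though the application keeps appending to the file.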
While there is no single right approach to logging in containerized applications, we can categorize recommendations based on the nature of the application.
If you would like more control on the application side and have a fairly stable centralized system, you can go with the Application based Logs Publishing approach. Likewise, if you would like to decouple log publishing from your application and don't mind a dependency on the host, you can go with the Mounted Data Volumes approach.
On the other hand, if you need to remove the dependency on the host and do not need very specialized local log processing, you can go with the Dedicated Container per Host approach.
Lastly, if you have a rather large deployment and need specialized local log processing, you can go with the Logs Handling Sidecar approach.
Thank you for reading through this article. If you have any feedback, questions, or concerns, please share them in the comments and we will get back to you as soon as possible.