Category: Kubernetes

Everything You Need To Know About Monitoring Kubernetes

It can be rather challenging for legacy monitoring tools to monitor ephemeral and fast moving environments like Kubernetes. The good news is, there are many new solutions that can help you with this.

When you are carrying out a monitoring operation for Kubernetes, it is crucial to make sure that all the components are covered, including pod, container, cluster, and node. Also, there should be processes in place to combine the results of the monitoring and creating reports in order to take the correct measures.

Before moving forward, your DevOps team has to understand that monitoring a distributed system like Kubernetes is completely different from monitoring a simple client-server network. That is because monitoring a network address or a server gives much less relevant information as compared to microservices.

Since Kubernetes is not self-monitoring, you need to come up with a monitoring strategy even before you choose the tool that can help you execute the strategy.

An ideal strategy will have highly tuned Kubernetes that can self-heal and protect itself against any downtime. The system will be able to use monitoring to identify critical issues before they even arise and then resort to self-healing.

Choosing Monitoring Tools And When To Monitor

Every monitoring tool available for Kubernetes has its own set of pros and cons. That is why there is no one right answer for all types of requirements. In fact, most DevOps teams prefer to use a combination of monitoring tool sets to be able to monitor everything simultaneously.

Another thing to note is that many DevOps teams often think about monitoring requirements much too late in the entire development process, which proves to be a disadvantage. To implement a healthy DevOps culture, it is best to start monitoring early in the development process and there are other factors that need to be considered as well.

Monitoring during development – All the features being developed should include monitoring, and it should be handled just like the other activities in the development phase.

Monitoring non-functionals – Non-functionals like requests per second and response times should also be monitored which can help you identify small issues before they become big problems.

There are two levels to monitor the container environment of Kubernetes.

  • Application process monitoring (APM) – This scans your custom code to find and locate any errors or bottlenecks
  • Infrastructure monitoring – This helps collect metrics related to the container or the load like available memory, CPU load, and network I/O

Monitoring Tools Available For Kubernetes

1. Grafana-Alertmanager-Prometheus

The Grafana-Alertmanager-Prometheus (GAP) is a combination of three open source monitoring tools, which is flexible and powerful at the same time. You can use this to monitor your infrastructure and at the same time create alerts as well.

Part of the Cloud Native Computing Foundation (CNCF), Prometheus is a time series database. It is able to provide finely grained metrics by scraping data from the data points available on hosts and storing that data in its time series database. Though the large amount of data it captures also becomes one of its downsides since you cannot use it alone for long-term reporting or capacity management. That is where Grafana steps in since it allows you to front the data which is being scraped by Prometheus.

To connect to Grafana, all you have to do is add the Prometheus URL as a data source and then import the databases from it. Once that integration is done, you can connect Alertmanager with it to get timely monitoring alerts.

2. Sysdig

Sysdig is an open source monitoring tool which provides troubleshooting features but it is not a full-service monitoring tool and neither can it store data to create trends. It is supported by a community, which makes Sysdig an affordable option for small teams.

Sysdig can implement role based access control RBAC into tooling, it has an agent-based solution for non-container platforms, and it also supports containerized implementation. Since it already has a service level agreement implementation, it allows you to check and get alerts according to response times, memory, and CPU. There are canned dashboards that you can use.

The only downside to using Sysdig is that to use it you need to install kernel headers. Though, the process has now been made much simpler with Sysdig able to rebuild a new module everytime a new kernel is present.

While Sysdig is the free and open source, there is premium support available for enterprise customers through Sysdig Monitor, and there are two types of solutions available- SaaS full-service monitoring and on-premise.

3. DataDog

DataDog is a SaaS-only monitoring tool which integrates APM into its services. It provides flexibility with alert monitoring, and you also get access to dashboards through a UI. It can also provide APM for Python, Go, Ruby, and there will soon be a Java support as well.

You can connect DataDog to cloud provider environments, and it can consume data from sources like New Relic and Nagios. With its dashboards, you can overlay graphs, and it can process other APIs which expose data.

There is a service discovery option which allows DataDog to monitor dockerized containers across environments and hosts continuously.

In conclusion

As we mentioned above, monitoring Kubernetes is not an option but a crucial requirement, and it should be implemented right from the development phase.

Why Companies Like Netflix, Uber, And Amazon Are Moving Towards Microservices

In case you have been living under a rock, let us break the news for you- Monolith is out, and now most internet companies have moved towards microservice architecture including Netflix, Uber, Amazon, and Apple.

There were many reasons for the shift, but the most important one was that monolithic is a big autonomous unit and handling it becomes more difficult by the day as the monolith grows bigger when you add new functionalities. Even the smallest change or bug fix requires a complete re-coding and re-deploying of a new version of the application. With microservices, all the processes get simplified, scalable, and streamlined as all the functionalities are divided into independent units.

When we focus on the industry disrupting companies like Airbnb, Uber, and Netflix, we see organizations that are continuously building custom software to get a competitive edge. In fact, many of the companies are not even core technology companies. Instead they use software to provide unique offerings. The results, as we know, drive great revenue for the companies.

Why Microservices Is A Better Option For Breaking Down Monoliths

Even with all the various open source tools and products, maintaining and deploying applications on the cloud is still difficult and time-consuming. Since most of these companies were launched over six to seven years ago, they had no other option but to create their own cloud platform on raw infrastructure.

There was a need for management layers between the applications and cloud infrastructure that they were being created. But it still proved to be better than the monolith architecture, since with microservices they can manage all the different operations of the application separately. So, even if a part of the application is down or needs bug fixes, the rest of the application will still be up and running with no downtime.

Let’s discover how companies like Netflix, Uber, and Amazon are moving towards microservices:

Netflix

In 2009, when Netflix started to migrate a monolithic infrastructure to a microservices one, the term ‘microservices’ didn’t even exist anywhere. Working on a monolithic architecture was proving to be difficult for the company with every passing day and the service would have outages whenever Amazon’s servers went down. After moving to microservices, Netflix’s engineers could deploy thousands of different code sections every day.

Forced to write their own entire platform on the cloud, the company has been pretty open about what they learned with the move, and they have even managed to open source many of the components and tools to help the community. Though Netflix hasn’t put up their entire platform code on Github, which could also help new companies. Overall, moving to microservices was incredibly beneficial for Netflix, and it has led to decrease their application’s downtime to a large extent.

Amazon

Back when Amazon was operating on a monolithic architecture, it was difficult for the company to predict and manage the fluctuating website traffic. In fact, the company was losing a lot of money as most of the server capacity was being wasted in downtime. Back in 2001, Amazon’s application was one big monolith.

Even though it was divided into different tiers and those tiers had different components, they were tightly coupled with each other, and they behaved like a monolith. The main focus of the developers was to simplify the entire process, and for that, they pulled out functional units from the code and wrapped them in a web service interface. For instance, there was a separate microservice was calculated the total tax at check out.

The company’s move to Amazon Web Services (AWS) cloud for microservices helped them scale up or down according to the traffic, handle outages better, and save costs as well. Since microservices allows to deploy code continuously, engineers at Amazon now deploy code every 11.7 seconds.

Uber

Just like any other startup, Uber too started with a monolithic architecture for their application. At that point, it seemed cleaner to go with a monolithic core since the company was just operating in San Francisco and only offered the UberBLACK option to users.

But as the ride-sharing startup grew multi-fold, they decided to follow the path of other companies like Amazon, Netflix, and Twitter and moved to microservices. The biggest advantage of migration was, of course, the fact that each microservice can have its own language and framework.

Now, with more than 1300 microservices, Uber focuses on applying microservices patterns that can improve scalability and reliability of the application. With so many microservices, a big focus is also on identifying the ones that are old and not in use anymore. That is why the team always ensures to decommission the old ones regularly.

In conclusion

While its natural for new companies to take the monolithic-first approach because its quick and you can deploy quickly as well, over time, as the monolith gets bigger, breaking it down into microservices becomes the most convenient solution.