Kubernetes and cost management - what do you need to know?

By Javier Martinez, DevOps Content Engineer, Sysdig.

Tuesday, 4th April 2023

Containerisation has become increasingly popular for software deployment. With Docker containers now more than a decade old, and Gartner predicting that 90% of global organizations will be running containerized applications in production by 2026, running containers is now firmly mainstream. The firm also estimates that 20% of enterprise applications will run in containers by 2026. However, for such a popular approach, there are still many best practices that teams are failing to adopt.

Because containers are designed to run independently of the underlying hardware or cloud, they let developers run applications wherever the company wants. They also make it easier to scale an application up: you can add more container instances, or more nodes to a cluster, to increase capacity. However, what is simple to describe is incredibly complex underneath, and an orchestration tool like Kubernetes is necessary to take full advantage of containers and manage them automatically.

In the rush to get all this working, common problems can quickly crop up. These problems rarely stop anything from running; instead, they lead to you spending far more on your deployments than you need to.

What do containers cost to run?

When running containers, you can set the Limits and Requests that a container will have. Simply put, Requests specify the guaranteed amount of a computing resource for a container, while Limits specify the maximum it may use. Beyond governing each container individually, these values also signal your intentions for the workload as a whole: the combination of Requests and Limits determines the Quality of Service (QoS) class of the Pods running those containers, which in turn affects the order in which Pods are evicted when a node comes under resource pressure.
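As a minimal sketch (the workload name, image and values below are illustrative assumptions, not recommendations), Requests and Limits are declared per container in the Pod spec:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-api                                # hypothetical workload name
spec:
  containers:
    - name: api
      image: registry.example.com/demo-api:1.0 # illustrative image
      resources:
        requests:            # guaranteed amount, used by the scheduler
          cpu: "250m"        # a quarter of a CPU core
          memory: "256Mi"
        limits:              # hard ceiling enforced at runtime
          cpu: "500m"
          memory: "512Mi"
```

Because the Requests here are lower than the Limits, this Pod would land in the Burstable QoS class; setting them equal would make it Guaranteed, which is the last class to be evicted under node pressure.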

Setting these values makes a difference to how your application performs, but it also affects how much it costs to run a set of application containers. Yet according to a report the company I work for put out, based on container deployments by our customers, 49% of containers have no memory limits set and 59% of containers have no CPU limits set. In practice, setting CPU limits helps prevent a single container with a drastic spike in CPU consumption from starving other processes.

Some developers may choose to avoid setting a CPU limit because it can potentially lead to throttling. However, our analysis shows that, on average, 69% of the purchased CPU goes unused, suggesting that no capacity planning was in place. Similarly, containers will normally have a set amount of memory allocated to them, and according to our data, 18% of that memory is never used.

This is where a big cost overhead can build up compared to what you actually need. With so much capacity effectively unused, you end up spending far more per container than necessary, and with hundreds or thousands of containers deployed, that extra cost per container rapidly mounts up.

Alongside the issue for individual containers, this affects Kubernetes Pods as well. If your Pods are too big, you may spend extra effort debugging scheduling issues, because the larger a Pod's combined resource requests, the harder it is for the Kubernetes scheduler to place it on a node in line with your priorities.

Avoiding problems around resource allocation

It’s also possible to overspend based on how whole applications are implemented. If you don’t understand your capacity requirements, you can easily inflate your costs; equally, over-allocating resources simply to avoid saturation can lead to astronomical bills when that headroom is never needed.

Why does this occur? One reason is the urge to scale the application quickly, which leads to more container instances being spun up in response to demand when they are not actually needed. Similarly, teams may not have monitoring or observability solutions in place for their applications, which makes it harder to compare resource consumption against allocations. Another problem in this area is scaling across multiple tenants, where it can be harder to know how many containers are running for each tenant compared to the number actually required.

Solving problems around container resources

To mitigate all of these potential cost areas, working on your capacity planning skills is essential, and it can provide a very quick return on investment by reducing the cost of each container you request and of your overall application infrastructure. Setting Limits and Requests by hand can be useful to restrict usage, but it is cumbersome and, if done badly, can lead to Pod eviction or over-commitment. Alongside this, a LimitRange is a useful tool to automatically assign default and maximum values for Limits and Requests to all containers within a namespace, as shown below. By setting values at roughly 85 to 115 percent of your typical usage, you can size your containers at the right level for both resources and cost.
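As a sketch (the namespace name and figures are illustrative assumptions, not prescriptions), a LimitRange like the one below gives every container in a namespace sensible defaults and an upper bound:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: team-a          # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container declares no requests
        cpu: "250m"
        memory: "256Mi"
      default:               # applied when a container declares no limits
        cpu: "500m"
        memory: "512Mi"
      max:                   # ceiling any single container may be given
        cpu: "1"
        memory: "1Gi"
```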

To make your life easier, you can automate scaling for your applications. This includes vertical autoscaling, where you increase the resources assigned to a workload on demand, and horizontal autoscaling, where you increase or decrease the number of Pods based on utilization. As an example, the Kubernetes HorizontalPodAutoscaler (HPA) can dynamically adapt your deployment to what your application currently needs, based on performance information aggregated using Kubernetes metadata across deployments, services and pods.
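As one illustration (assuming a hypothetical Deployment called demo-api; the replica counts and threshold are examples only), an HPA that scales on CPU utilization looks like this:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-api-hpa
spec:
  scaleTargetRef:            # the workload being scaled
    apiVersion: apps/v1
    kind: Deployment
    name: demo-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # add Pods when average CPU exceeds 70% of Requests
```

Note that CPU utilization here is measured against each container's CPU Request, which is another reason those Requests need to be realistic.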

Lastly, if you want to run across multiple locations and providers with a multi-tenant approach, you will have to adapt accordingly. For instance, you may find that some of your projects are more demanding than others in terms of resources, and assigning the same resources to every tenant can mean overspending on one tenant while another genuinely needs that level of capacity. To manage this, ResourceQuotas provide a simple way to set a maximum total amount of a resource that all workloads in a namespace can consume.
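As a sketch of this (the namespace and ceilings below are assumed for illustration), a ResourceQuota caps the aggregate Requests and Limits for one tenant's namespace, so each tenant can be sized to what it actually needs:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a          # hypothetical tenant namespace
spec:
  hard:
    requests.cpu: "4"        # total CPU all Pods in the namespace may request
    requests.memory: 8Gi
    limits.cpu: "8"          # total CPU limits across the namespace
    limits.memory: 16Gi
    pods: "20"               # optional cap on the number of Pods
```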

Once you have looked at these approaches, you can then check the effect of the changes you have made on your infrastructure and applications, as well as on the cost you incur to run those systems. Comparing your current CPU usage against the values from one week before helps you assess the impact of your optimizations and check that you are getting the right balance of work completed against your spend.

With so much demand for containerised applications, developers and IT operations staff alike have to learn how to get the most out of this approach in practice. Improving capacity levels and performance is not just a case of adding more and more resources; checking what your applications actually need can make your approach more efficient and more cost-effective over time.