Kubernetes instance calculator

        TL;DR: You can use the calculator to explore the best instance types for your cluster based on your workloads.

        Not all CPU and memory in your Kubernetes nodes can be used to run Pods.

        The node has to run processes such as the Kubelet, daemons such as Fluentd or kube-proxy, and the operating system.

        Kubernetes also reserves memory for the eviction threshold, so that it can evict workloads when the node is running out of memory.

        Allocation of resources in a Kubernetes node

        But how much memory and CPU are reserved exactly?

        It depends, as every cloud provider has its own rules.

        This calculator consolidates all of the settings for Azure, Google Cloud Platform and Amazon Web Services so that you can explore the resources available to your Pods.

        Searching for the best Kubernetes node type

        The calculator lets you explore the best instance type based on your workloads.

        First, order the list of instances by Cost per Pod or Efficiency.

        Selecting Cost per Pod in the Kubernetes instance calculator

        Then, adjust the memory and CPU requests for your pod.

        Selecting the best instance type based on the current Pod request

        The best instance is the first on the list.

        What's the maximum number of Pods in AKS?

        Pods are limited to 250 per node in AKS.

        If you have 10 nodes, you can have up to 2500 Pods.

        Please note that the standard Pod limit in Kubernetes is 110 Pods.

        Why is that?

        Azure allocates a subnet with 254 hosts for each node (i.e. /24).

        Since an IP address is only recycled once its Pod is fully deleted, Azure keeps 4 spare IP addresses that new Pods can use while the old addresses are still in use.

        For example, if four Pods are Terminating, four new Pods can be created on the same node (254 addresses in total: 246 Pods Running, 4 Terminating and 4 Creating).

        But you can't have 5 pods terminating and 5 creating.
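
        The address arithmetic above can be checked with a short sketch. It assumes a simplified model that is not taken from AKS itself: every Pod that is Running, Creating or Terminating holds one address from the node's /24, and Running plus Creating Pods cannot exceed the 250 Pod limit. The function name `fits_on_node` and the CIDR `10.244.1.0/24` are made up for illustration.

```python
import ipaddress

# Hypothetical per-node subnet; any /24 gives the same result.
NODE_SUBNET = ipaddress.ip_network("10.244.1.0/24")
USABLE_IPS = NODE_SUBNET.num_addresses - 2  # drop network and broadcast: 254
POD_LIMIT = 250  # AKS per-node Pod limit quoted in the text


def fits_on_node(running: int, terminating: int, creating: int) -> bool:
    """Return True if this mix of Pods fits in the node's address space.

    Simplified model (an assumption): each Pod holds one IP until it is
    fully deleted, and only Running + Creating Pods count towards the limit.
    """
    within_pod_limit = running + creating <= POD_LIMIT
    within_ip_space = running + terminating + creating <= USABLE_IPS
    return within_pod_limit and within_ip_space


print(USABLE_IPS)               # 254
print(fits_on_node(246, 4, 4))  # True
print(fits_on_node(245, 5, 5))  # False
```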

        You can explore the limits for AKS and the rest of the Kubernetes managed services in this comparison.

        What's the maximum number of Pods in GKE?

        GKE nodes are limited to 110 Pods each.

        If you have 10 nodes, you can have up to 1100 Pods.

        Please note that this is also the default Pod limit in Kubernetes.

        Why does GKE allow only 110 pods per node?

        GKE allocates a subnet with 254 hosts for each node (i.e. /24).

        Since IP addresses might be recycled when a Pod is deleted, GKE has more than half of the IP addresses available while that happens.

        In GKE, you can have 110 pods in Terminating and 110 in Creating simultaneously (a total of 220 IP addresses).

        You can explore the limits for GKE and the rest of the Kubernetes managed services in this comparison.

        What's the maximum number of Pods in EKS?

        It depends.

        If you are not using the AWS-CNI, or you are using a version older than 1.9.0, the maximum is 110 Pods.

        If the AWS-CNI is 1.9.0 or greater, the number of Pods on an instance is dictated by the number of ENIs assigned to the instance.

        For example, an m5.xlarge can have up to 58 Pods.

        However, an m5.16xlarge can have up to 737 pods.

        You can find the list of ENIs for each EC2 instance here.
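
        For the ENI-based limit, AWS documents the formula `ENIs * (IPv4 addresses per ENI - 1) + 2`. The sketch below applies it. The ENI and address counts for the two instance types are assumptions taken from the EC2 documentation, so check them against the current list before relying on them; the function name `eni_max_pods` is made up for illustration.

```python
def eni_max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Max Pods when each Pod takes a secondary IP address from an ENI.

    One address per ENI is the ENI's own primary address, and 2 is added
    for Pods that use host networking (such as kube-proxy and the CNI Pod).
    """
    return enis * (ipv4_per_eni - 1) + 2


# (ENIs, IPv4 addresses per ENI): assumed values, verify against the EC2 docs.
INSTANCE_ENIS = {
    "m5.xlarge": (4, 15),
    "m5.16xlarge": (15, 50),
}

for name, (enis, ips) in INSTANCE_ENIS.items():
    print(name, eni_max_pods(enis, ips))
# m5.xlarge 58
# m5.16xlarge 737
```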

        You can explore the limits for EKS and the rest of the Kubernetes managed services in this comparison.

        Please note that the standard Pod limit in Kubernetes is 110 Pods.

        How do you compute overcommitment for a node?

        When you define your pods, you are encouraged to set requests and limits.

        1. The Kubernetes scheduler assigns (or not) a Pod to a Node based on its memory and CPU requests.
        2. The Kubelet uses the limits to terminate or throttle the Pod when it passes the threshold.

        While you can assign the same value for requests and limits, having requests lower than the actual limits is common.

        As a consequence, a process can grow its memory and CPU usage until it reaches the limit.

        Requests and limits constraints

        You are meant to assign CPU and memory requests that are close to the application usage.

        If the application occasionally bursts into higher CPU or memory usage, that's fine.

        But what happens when all Pods use all resources to their limits?

        This scenario could lead to resource starvation.

        When the memory and CPU usage outgrows the resources on the node, it leads to:

        1. CPU throttling for the Pods and for processes on the node such as the kubelet.
        2. Memory usage on the node passing the eviction threshold, at which point the kubelet starts evicting Pods.

        While this is designed to happen, it is something that you want to minimise to avoid disruptions.

        So what should you do?

        You can keep limits and requests in check and make sure that the ratio between limits and requests is not too high.

        The calculator uses the CPU and memory limits to calculate the memory and CPU usage in the worst possible case (i.e. what if all Pods use the max memory and CPU assigned?):

        OVERCOMMITMENT_CPU = (MAX_PODS * POD_LIMIT_CPU) / TOTAL_CPU_INSTANCE
        OVERCOMMITMENT_MEMORY = (MAX_PODS * POD_LIMIT_MEMORY) / TOTAL_MEMORY_INSTANCE
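
        As a rough illustration, the two formulas can be written as one small function. The node size, Pod count and limits below are invented for the example and are not taken from the calculator's dataset.

```python
def overcommitment(max_pods: int, pod_limit: float, instance_total: float) -> float:
    """Worst-case usage (every Pod at its limit) as a fraction of the instance."""
    return (max_pods * pod_limit) / instance_total


# Hypothetical node: 4 vCPU (4000m) and 16 GiB (16384 MiB), running 20 Pods
# that each have a limit of 500m CPU and 1024 MiB of memory.
cpu = overcommitment(max_pods=20, pod_limit=500, instance_total=4000)
mem = overcommitment(max_pods=20, pod_limit=1024, instance_total=16384)
print(f"CPU: {cpu:.0%}, memory: {mem:.0%}")  # CPU: 250%, memory: 125%
```

        Any value above 100% means the node cannot satisfy every Pod if they all reach their limits at the same time.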

        Assigning proper requests and limits for your workloads is essential to control overcommitment.

        You can learn how to identify the correct values in this article.

        How do you compute efficiency for a node?

        The efficiency is computed as the total resources used by the Pods over the memory and CPU left after removing the resources reserved for the kubelet, operating system, etc.

        EFFICIENCY_CPU = TOTAL_CPU_PODs / (TOTAL_CPU - TOTAL_RESERVED_CPU)
        EFFICIENCY_MEMORY = TOTAL_MEMORY_PODs / (TOTAL_MEMORY - TOTAL_RESERVED_MEMORY)

        Efficiency is a convenient measure of how effectively you are using memory and CPU.

        For example, with an efficiency of 50%, you use only 50% of the resources on your instance but still pay the total price.
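
        A minimal sketch of the efficiency calculation follows. It assumes that the resources used by the Pods are the sum of their requests; the instance size and reserved amount are invented so that the result matches the 50% case.

```python
def efficiency(pods_total: float, instance_total: float, reserved_total: float) -> float:
    """Fraction of the non-reserved resources that the Pods actually take."""
    available_to_pods = instance_total - reserved_total
    return pods_total / available_to_pods


# Hypothetical instance: 16384 MiB in total, 2384 MiB reserved,
# and 7 Pods requesting 1000 MiB each.
mem_efficiency = efficiency(pods_total=7 * 1000,
                            instance_total=16384,
                            reserved_total=2384)
print(f"{mem_efficiency:.0%}")  # 50%
```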

        How do you compute the costs for Pods, kubelet and unused resources?

        The calculator tries to schedule as many Pods as possible in the current instance after reserving memory and CPU for DaemonSets, kubelet, operating system, etc.

        The cost for a single Pod is computed as the cost of the instance over the total number of schedulable Pods.

        POD_COST_$ = INSTANCE_COST_$ / MAX_SCHEDULABLE_PODs

        For example, if the instance costs $10 and you can schedule up to 5 pods, each pod costs $2 to run.

        The Kubelet cost is computed from the sum of all resources reserved for the operating system, kubelet, eviction threshold and agents in the current instance.

        Memory OS
        Memory Kubelet             +
        Memory Eviction threshold  +
        Memory Daemonsets          +
        =============================
        Total reserved memory      =
        
        CPU OS
        CPU Kubelet             +
        CPU Daemonsets          +
        =============================
        Total reserved CPU      =
        
        KUBELET_CPU_% = (KUBELET_RESERVED_CPU / TOTAL_CPU)
        KUBELET_MEM_% = (KUBELET_RESERVED_MEMORY / TOTAL_MEMORY)
        
        KUBELET_COST = (KUBELET_CPU_% + KUBELET_MEM_%) / 2 * INSTANCE_COST
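
        The same calculation, sketched as a function. The instance size, reserved totals and hourly price are invented figures, and the name `kubelet_cost` is only illustrative.

```python
def kubelet_cost(reserved_cpu: float, total_cpu: float,
                 reserved_mem: float, total_mem: float,
                 instance_cost: float) -> float:
    """Average the reserved CPU and memory shares, then price that share."""
    cpu_share = reserved_cpu / total_cpu
    mem_share = reserved_mem / total_mem
    return (cpu_share + mem_share) / 2 * instance_cost


# Hypothetical instance: 4000m CPU, 16384 MiB memory, $0.20 per hour.
# The reserved totals stand for the sums shown above (OS + kubelet +
# eviction threshold + DaemonSets); the numbers themselves are made up.
cost = kubelet_cost(reserved_cpu=400, total_cpu=4000,
                    reserved_mem=4096, total_mem=16384,
                    instance_cost=0.20)
print(f"${cost:.3f} per hour")  # $0.035 per hour
```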

        The Unused cost tracks the cost of unutilised resources (i.e. resources that can't be used to run pods).

        The following formula explains how the number is derived:

        UNUSED_MEM = MEMORY_AVAILABLE_PODS - (MAX_SCHEDULABLE_PODs * POD_REQUEST_MEM)
        UNUSED_CPU = CPU_AVAILABLE_PODS - (MAX_SCHEDULABLE_PODs * POD_REQUEST_CPU)
        UNUSED_MEM_% = UNUSED_MEM / TOTAL_MEMORY
        UNUSED_CPU_% = UNUSED_CPU / TOTAL_CPU
        UNUSED_COST = (UNUSED_MEM_% + UNUSED_CPU_%) / 2 * INSTANCE_COST
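
        Here is a sketch of the unused cost, taking the unused resources as what is available to Pods minus what the scheduled Pods request (an assumption about the intended sign, so that the result is not negative). All figures are invented.

```python
def unused_cost(max_pods: int,
                pod_request_cpu: float, pod_request_mem: float,
                cpu_available: float, mem_available: float,
                total_cpu: float, total_mem: float,
                instance_cost: float) -> float:
    """Price the resources that are left over once no more Pods fit."""
    unused_cpu = cpu_available - max_pods * pod_request_cpu
    unused_mem = mem_available - max_pods * pod_request_mem
    unused_cpu_share = unused_cpu / total_cpu
    unused_mem_share = unused_mem / total_mem
    return (unused_cpu_share + unused_mem_share) / 2 * instance_cost


# Hypothetical instance: 4000m / 16384 MiB in total, 3600m / 12288 MiB
# available to Pods, Pods requesting 500m / 1024 MiB, so 7 Pods fit.
cost = unused_cost(max_pods=7,
                   pod_request_cpu=500, pod_request_mem=1024,
                   cpu_available=3600, mem_available=12288,
                   total_cpu=4000, total_mem=16384,
                   instance_cost=0.20)
print(f"${cost:.5f} per hour")  # $0.03375 per hour
```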

        How do you compute the max number of Pods per node?

        First, from the instance memory and CPU, we remove all memory and CPU reserved for the kubelet, DaemonSets, operating system, etc.

        OS
        Kubelet                +
        Eviction threshold     +
        Daemonsets             +
        =========================
        Used resources         =
        
        Instance
        Used resources         -
        =========================
        Available to Pods      =

        That should leave us with the memory and CPU available to pods.

        At this point, you can calculate the max number of pods as follows:

        MAX_PODS_CPU = CPU_AVAILABLE_PODS / POD_REQUEST_CPU
        MAX_PODS_MEMORY = MEMORY_AVAILABLE_PODS / POD_REQUEST_MEM

        Of those two numbers, you can only pick the lower.

        You also need to make sure that the value does not exceed the maximum number of Pods that the instance can run.

        MAX_PODS = MIN(MAX_PODS_CPU, MAX_PODS_MEMORY, INSTANCE_LIMIT_POD)

        Let's have a look at an example.

        Let's imagine that an instance type has enough memory available to pods to run 34 pods.

        If you consider the CPU available to the Pods, you can only have 13 pods.

        Finally, the instance can run at most 22 pods in total.

        The max number of schedulable pods is 13 in this case — i.e. MIN(34, 13, 22).
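
        The example can be reproduced with a short sketch. It assumes that the per-resource counts are rounded down, since a Pod cannot be partially scheduled; the available resources and requests are invented so that they yield 34 and 13.

```python
def max_schedulable_pods(cpu_available: int, mem_available: int,
                         pod_request_cpu: int, pod_request_mem: int,
                         instance_pod_limit: int) -> int:
    """Lowest of the CPU bound, the memory bound and the instance Pod limit."""
    max_pods_cpu = cpu_available // pod_request_cpu      # whole Pods only
    max_pods_memory = mem_available // pod_request_mem   # whole Pods only
    return min(max_pods_cpu, max_pods_memory, instance_pod_limit)


# Hypothetical figures: 34816 MiB / 1024 MiB = 34 Pods by memory,
# 3300m / 250m = 13 Pods by CPU, and an instance limit of 22 Pods.
print(max_schedulable_pods(cpu_available=3300, mem_available=34816,
                           pod_request_cpu=250, pod_request_mem=1024,
                           instance_pod_limit=22))  # 13
```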

        How can I add more types of Pods?

        At the moment, you cannot.

        Adding more Pod types would force you to choose how many Pods of each type should be scheduled in the instance.

        Perhaps you want to have 3 pods of type A and 1 pod of type B.

        So why not approximate the above scenario with a single pod that has those constraints combined?

        As a workaround, you could have a dummy pod with the resources of Pod A combined with 1/3 resources of Pod B.
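
        A small sketch of that workaround, with invented requests for Pod A and Pod B. The helper name `blended_pod` and the dictionary keys are hypothetical.

```python
def blended_pod(pod_a: dict, pod_b: dict, a_per_b: int) -> dict:
    """Requests of one Pod A plus a 1/a_per_b share of one Pod B."""
    return {key: pod_a[key] + pod_b[key] / a_per_b for key in pod_a}


# Hypothetical requests, in millicores and MiB.
pod_a = {"cpu_m": 200, "memory_mib": 256}
pod_b = {"cpu_m": 300, "memory_mib": 768}

# 3 Pods of type A for every Pod of type B.
print(blended_pod(pod_a, pod_b, a_per_b=3))
# {'cpu_m': 300.0, 'memory_mib': 512.0}
```

        You would enter the blended values as the Pod requests in the calculator. Each dummy Pod then stands for one Pod A, and every three dummy Pods stand for one Pod B.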

        The tool is meant to help you explore the trade-offs in selecting an instance size for your Kubernetes cluster.

        It's meant as an approximation — a production cluster might have more variables than what you can control in this calculator.

        Should I use only one type of node in the cluster?

        The tool helps you explore the trade-offs for different instance sizes, but it does not dictate how many other node pools you should have in your cluster.

        You could explore different instance types and decide to have node pools for each of them.

        How does a node affect scaling in Kubernetes?

        Choosing the right instance type and assigning proper requests and limits to your Pods directly impacts the scalability of your cluster.

        The Cluster Autoscaler doesn't look at memory or CPU available when it triggers the autoscaling.

        Instead, the Cluster Autoscaler reacts to events and checks for any unschedulable Pods every 10 seconds.

        When a Pod is unschedulable, the Cluster Autoscaler triggers the creation of a new node.

        Triggering the autoscaler too often is usually an issue, so you want to:

        1. Provision nodes that are large enough for your workloads.
        2. Assign the correct requests to your workloads. Assigning less or more will eventually lead to scaling too soon or too late.

        You can explore how the node instance size and requests affect scaling in this article.

        Is there anything else I should consider when selecting a node type for Kubernetes?

        This calculator focuses on efficiency and costs.

        However, there are other features that you might need to consider, such as:

        1. If a node is lost, what's the impact on your workloads?
        2. What's the smallest increment when scaling up or down? What's the lead time?
        3. What's the availability for my application with fewer and larger nodes? What about a lot of smaller nodes?

        You can explore the answer to the above questions in this article on selecting a worker node size.

        Is there anything else I should know about AKS nodes?

        AKS does not recommend the following instances for an AKS cluster:

        How can I select the right instance type for my on-premise cluster?

        At the moment, this tool does not have a way to define custom instance sizes or prices.

        Is the data open-source?

        Of course, you can find the script that populates the dataset here.

        Contributing

        If you have an idea on how to improve or change the calculator, we'd be happy to hear it.

        Send an email to hello@learnk8s.io or chat with us on Telegram!

        Many thanks to everyone who contributed to the calculator!