Mastering the Challenges of Using ALB Ingress in Kubernetes

Imprint Blog
7 min read · May 24, 2023

In this blog post, we will discuss the reasons behind migrating from Istio to the Application Load Balancer (ALB) as the ingress controller in Kubernetes. We will highlight two challenges that arose during the transition:

  1. Limited communication between ALB and Kubernetes
  2. Kubernetes Service’s insufficient knowledge of pods

We will also explore the solutions we implemented to overcome these challenges and achieve a more efficient and maintainable architecture.

Why We Transitioned from Istio to ALB Ingress

Our previous cluster design involved a complex topology with Istio, which introduced significant complexity and numerous configurations. The lack of transparency in certain components made it difficult to maintain. As a result, we decided to move away from Istio and adopt native support from Kubernetes and AWS.

Old cluster design: Topologically complicated and configuration heavy

We chose the AWS Application Load Balancer (ALB) as our new solution. The new architecture is simpler, more controllable, and offers greater extensibility. With this design, we can easily establish communication and extend our clusters as needed.

New cluster design: Simple and extensible

However, during our setup with ALB as the ingress, we encountered two main challenges. In the following sections, we will delve into each of them and discuss the solutions we employed to address them.

Background: ALB and Kubernetes network abstraction

Before we get into the challenges, let’s revisit how the Kubernetes network handles traffic. The following diagram demonstrates the flow of traffic through the different components in Kubernetes:

Kubernetes network abstraction

Traffic first enters the Kubernetes cluster through an Ingress controller, which manages external access to the services within the cluster. The Ingress controller then routes the traffic to the appropriate Kubernetes Service using Layer 7 protocols, which operate at the application layer. A Kubernetes Service is an abstraction layer that defines a set of Pods and provides a stable IP address and DNS name, making it easier to manage traffic distribution to the associated workloads. Finally, the traffic is directed to the Endpoint using Layer 4 protocols, which operate at the transport layer. The Endpoint represents one or more Pods in the Kubernetes cluster responsible for processing incoming requests.
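As a minimal sketch of that Service-to-Endpoint relationship (the orders name and ports here are hypothetical), a Service selects Pods by label and exposes a stable virtual IP in front of them:

apiVersion: v1
kind: Service
metadata:
  name: orders              # hypothetical service name
spec:
  selector:
    app: orders             # selects the Pods backing this Service
  ports:
    - port: 80              # stable port exposed by the Service
      targetPort: 8080      # container port on each Pod

The Pod IPs currently backing this abstraction can be listed with kubectl get endpoints orders.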

Considering this abstraction model, let’s examine how the AWS ALB architecture integrates with Kubernetes. The ALB acts as an Ingress controller at the application layer, distributing incoming traffic to the corresponding Target Groups. Each Target Group subscribes to a Kubernetes Service, so that Endpoint events caused by pod scaling activity are picked up. Finally, each Kubernetes Pod associated with a Service is represented as an IP target within the respective Target Group, allowing for efficient load balancing and traffic management within the cluster. To stay in sync, the Target Groups and IP targets are updated in response to Kubernetes events, enabling dynamic adjustments as the Kubernetes environment changes.
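As a hedged sketch of this wiring (the resource names are illustrative), an Ingress handled by the AWS Load Balancer Controller with target-type: ip causes the controller to create a Target Group for the referenced Service and register each Pod IP directly as a target:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orders-ingress
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip   # register Pod IPs directly as targets
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: orders
                port:
                  number: 80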

AWS ALB implementation of the network abstraction

Challenge 1: ALB and Kubernetes Communication

It is evident that the ALB components, i.e. the load balancer, target groups, and IP targets, are updated passively, giving rise to our first challenge: constrained communication between the ALB and Kubernetes. The distinct and independent management of target lifecycles by both systems complicates the alignment of lifecycle events, such as registration and deregistration. To tackle this challenge, we devised and implemented solutions that specifically address the unique issues arising at different stages of the process.

Registration Lifecycle Alignment

In Kubernetes, a pod may not be immediately ready to serve requests upon startup due to various reasons, such as initial configuration setup or startup tasks. To determine when a pod is ready to accept traffic, readiness probes are employed. A readiness probe is a user-defined check executed periodically by Kubernetes, often utilizing the application’s health check endpoint. Once the readiness probe succeeds, indicating that the pod is prepared to accept traffic, Kubernetes registers the pod to the corresponding service.
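A typical readiness probe against the application’s health check endpoint might look like this fragment of a container spec (the path, port, and timings are illustrative):

readinessProbe:
  httpGet:
    path: /healthz          # application's health check endpoint
    port: 8080
  initialDelaySeconds: 5    # give the pod time to finish startup tasks
  periodSeconds: 10         # re-check every 10 seconds
  failureThreshold: 3       # mark the pod not-ready after 3 consecutive failures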

Similarly, in the ALB architecture, ALB periodically probes each target’s registered health check endpoint. Once the health check passes, the target is registered under the target group.
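The endpoint and timing of that ALB health check can be tuned with annotations on the Ingress (placed under metadata.annotations; the path and values below are illustrative):

alb.ingress.kubernetes.io/healthcheck-path: /healthz
alb.ingress.kubernetes.io/healthcheck-interval-seconds: "10"
alb.ingress.kubernetes.io/healthy-threshold-count: "2"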

With the ALB Controller, pod startup events trigger target registration events in the ALB. However, it’s possible that ALB may take longer to register a target than Kubernetes, which can be problematic during rolling deployments. Consider the following simplified scenario:

  • There is only a single pod in the deployment;
  • The new pod is ready in Kubernetes but remains unregistered in the ALB;
  • Kubernetes shuts down the old pod because the new pod is reported as ready.

In this situation, there will be no available pods in the ALB’s target group.

To circumvent this issue, the ALB controller takes advantage of a feature in Kubernetes called Readiness Gate. By using a readiness gate, the ALB controller can link the end of the target registration process with the readiness probe. This ensures that the pod is ready to handle traffic before ALB starts routing traffic to it, preventing service unavailability during rollout deployment.

Illustration of aligned Pod and Target registration

Setting up the readinessGate injection for the ALB controller is quite straightforward. You just need to apply a label to the namespace hosting the ingress as follows:

kubectl label namespace your_namespace elbv2.k8s.aws/pod-readiness-gate-inject=enabled 

For more information on configuring readinessGate with the ALB controller, you can refer to the ALB controller documentation.
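Once the namespace is labeled, the controller injects a readinessGates entry into new pods, and the gate’s status shows up next to the ordinary pod conditions. A couple of quick checks (the pod and namespace names are placeholders):

kubectl get pod your_pod -n your_namespace -o yaml | grep -A 3 readinessGates
kubectl get pods -n your_namespace -o wide    # the READINESS GATES column reads 0/1 until the ALB target is healthy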

Deregistration Lifecycle Alignment

During the deregistration process, we encountered a different alignment issue that seems to be an unresolved “bug.”

According to ALB documentation, the deregistration process in ALB begins by marking the target for deregistration and initiating the deregistration delay. This delay allows the target to complete processing any in-flight requests before it is ultimately removed. During this delay, the target should not receive any new traffic from the ALB.
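The length of this delay is configurable per Ingress through a target group attribute annotation on the ALB controller; the ALB default is 300 seconds, and the 30-second value below is only illustrative:

alb.ingress.kubernetes.io/target-group-attributes: deregistration_delay.timeout_seconds=30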

However, our experience and an unresolved discussion in the ALB controller’s GitHub issues indicate that this is not the case: the ALB still sends traffic even when a target is in the deregistration state. We are still uncertain about what is happening inside the ALB, but this behavior has been consistently reproducible to date.

To ensure a seamless shutdown, we employed a preStop hook to delay pod termination until after the ALB deregisters the target. Based on our observations, the time between a target entering the deregistration state and traffic to it actually stopping varies, but it generally falls within 5 to 10 seconds. As a result, we set our preStop hook to sleep for 15 seconds to account for this delay:

lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 15"]
Aligned Pod and Target deregistration
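One caveat worth noting: the preStop sleep runs inside the pod’s termination grace period, so terminationGracePeriodSeconds must be large enough to cover both the sleep and the application’s own graceful shutdown. A hedged sketch of the relevant pod spec fragment (the 30-second value and container name are illustrative):

spec:
  terminationGracePeriodSeconds: 30    # must cover the 15s preStop sleep plus the app's shutdown
  containers:
    - name: app                        # hypothetical container name
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 15"]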

Challenge 2: Kubernetes Service and Pod Knowledge

The second challenge we faced was insufficient load balancing by Kubernetes Services. A Kubernetes Service claims to provide round-robin load balancing across its underlying pods. However, in our observation it was unable to distribute traffic efficiently across back-end services and their endpoints, leading to sticky connections and uneven traffic distribution. To address this issue, we introduced ALB ingress across our entire back-end service architecture. Below is a simplified example of how our architecture treated requests before and after the change.

Before: ALB to entry services, Kubernetes Service to downstream services
After: ALB to all back-end services

With the help of ALB ingress, it was straightforward to add additional ingress rules to expose downstream services and their Target Groups, which directly manage their own IP targets. As a result, we were able to achieve an even traffic distribution to each endpoint and manage traffic across different pods.
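For illustration, exposing a downstream service through the ALB is just another rule in the Ingress spec, which gives that service its own Target Group and IP targets (the host and service names here are hypothetical):

spec:
  ingressClassName: alb
  rules:
    - host: payments.internal.example.com    # hypothetical internal hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: payments               # downstream service, now fronted by its own Target Group
                port:
                  number: 80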

Conclusion

Migrating from Istio to ALB as our ingress controller in Kubernetes allowed us to simplify our architecture, improve control, and enhance extensibility. By addressing the challenges of limited communication between ALB and Kubernetes and insufficient pod knowledge in Kubernetes Service, we successfully implemented a more efficient and maintainable system. The introduction of ALB ingress provided better load balancing and traffic distribution across our back-end services, ultimately leading to a more reliable and robust infrastructure.

If you are interested in building on Kubernetes and seeing your work used by hundreds of engineers, thousands of merchants, and millions of users, we are the right place for you to unleash your potential. You can find us at talent@imprint.co or imprint.co/careers.
