Horizontal Scaling on Kubernetes Clusters Based on AWS CloudWatch Metrics with KEDA

Ramazan Taplamacı
9 min read · Dec 31, 2022

Hi, you can reach the Turkish version of this article here. Have fun!!

Today, almost all product teams run, or plan to run, their product workloads on Kubernetes. Why do teams choose Kubernetes? It simplifies management and integrations for workloads while providing fast and flexible action, easy scalability, flexibility and reliability. In short, it greatly reduces the effort required for your management and maintenance activities.

In this article, regarding the scalability provided by Kubernetes, we will examine KEDA, a tool that gives Kubernetes the ability to horizontally scale the workloads running in its clusters based on metrics accessed via AWS CloudWatch.

What is KEDA?

Image-1 KEDA

KEDA stands for "Kubernetes Event-driven Autoscaling". It is basically an open source tool that gives Kubernetes clusters the ability to auto-scale horizontally based on metrics from external sources. Those who currently use HPA (Horizontal Pod Autoscaler) to provide horizontal scalability in K8s clusters will find this tool quite familiar, because KEDA basically creates and manages the HPA for you. Unlike HPA alone, however, KEDA can drive the scaling process based on the metrics of external sources as well as metrics specific to the cluster.
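As a quick illustration of this relationship: once a ScaledObject is deployed (as we will do later in this article), you can list the HPA that KEDA creates and manages on its behalf. A minimal check, assuming KEDA's default naming convention, where the HPA name carries a "keda-hpa-" prefix:

# The HPA created by KEDA for a ScaledObject is typically named "keda-hpa-<scaledobject-name>"
$ kubectl get hpa -n <namespace>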

KEDA was also accepted as a CNCF Incubation project as of March 12, 2020 and is currently at the Incubation maturity level.

What are KEDA Components and How Does It Work?

Image-2 KEDA Components and Architecture

KEDA basically consists of two components, called "keda-operator" and "keda-operator-metrics-apiserver". In addition to these, it includes K8s CRDs (Custom Resource Definitions) such as "ScaledObjects", "ScaledJobs" and "TriggerAuthentication / ClusterTriggerAuthentication". You can see the KEDA architecture in Image-2. Let's examine the relevant components:

"keda-operator": It is simply responsible for dynamically scaling the target workload.

"keda-operator-metrics-apiserver": Responsible for reading metrics from external sources and exposing them inside the cluster, much like the K8s "metrics-server". "keda-operator" scales the workload it tracks based on these metrics.

"ScaledObjects": Allows the external metric or event source to be matched with the object to be scaled dynamically. This object can be a Deployment, StatefulSet or CRD.

"ScaledJobs": Allows a K8s Job to be matched with an external metric or event source for dynamic scaling.

"TriggerAuthentication / ClusterTriggerAuthentication": This CRD contains the authentication configurations, or references the K8s "Secret", that should be used to access the external metric or event source.
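Once KEDA is deployed (see the deployment steps below), one way to see that "keda-operator-metrics-apiserver" is registered as an external metrics provider is to query the corresponding APIService; the resource name below follows the standard Kubernetes external metrics API group:

$ kubectl get apiservice v1beta1.external.metrics.k8s.io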

Event or Metric Sources Supported by KEDA

KEDA can use various tools from major cloud providers such as Amazon Web Services (a.k.a. AWS), Microsoft Azure and Google Cloud Platform (a.k.a. GCP) as event or metric sources, as well as other popular tools such as Prometheus, Redis, RabbitMQ, PostgreSQL, MongoDB and MySQL. You can find all currently supported sources in the KEDA documentation (https://keda.sh/docs/2.8/scalers/).

Example Scenario and PoC

As mentioned earlier in this article, scaling will be done based on metrics in AWS CloudWatch. To be more specific, by using the AWS CloudWatch service (the service where you can monitor almost all services running on AWS) as a metric source, we will see that many scenarios become applicable. In this PoC, we will dynamically scale our consumer workloads on an Amazon EKS K8s cluster based on the number of items in an Amazon SQS queue. Below you can find the requirements and the steps to be implemented (a short sketch for creating the PoC queue follows the lists):

Requirements:

  • Basic knowledge of AWS IAM,
  • AWS account and Amazon SQS,
  • K8s cluster (≥ Kubernetes 1.20),
  • "aws-cli" (for Amazon EKS users), "kubectl" and "Helm 3" installed and configured in the local environment

Steps to be taken:

  1. Making authentication configurations for AWS CloudWatch Metrics Server.
  2. Deployment of KEDA to Amazon EKS K8s cluster.
  3. Configuring the “ScaledObject” and deploying it to the “namespace” where the Consumers are running.
  4. Testing the created infrastructure.
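Before starting, you may want to create the PoC queue. A minimal sketch using "aws-cli", assuming the queue name and region that the "ScaledObject" example below uses:

$ aws sqs create-queue --queue-name keda-poc-rtaplamaci-sqs --region us-east-1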

AWS CloudWatch Metrics Server Access Configurations

Since I use Amazon EKS as the K8s cluster and Amazon EC2 instances as worker nodes in this PoC, I will provide access to read metrics from AWS CloudWatch through the AWS IAM Role that is assigned to the relevant instances. For this, I create the following AWS IAM Policy and add it to the mentioned AWS IAM Role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData"
      ],
      "Resource": "*"
    }
  ]
}
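If you prefer to do this from the command line, a minimal "aws-cli" sketch could look like the following; the policy name, role name and account ID placeholder are hypothetical:

$ aws iam create-policy \
    --policy-name keda-cloudwatch-read \
    --policy-document file://cloudwatch-read-policy.json
$ aws iam attach-role-policy \
    --role-name <eks-node-instance-role> \
    --policy-arn arn:aws:iam::<account-id>:policy/keda-cloudwatch-read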

If you are going to do this work on a K8s cluster running in an on-prem environment, you can handle the authentication and access processes with an AWS IAM User AccessKey on the "ScaledObject" or with the "TriggerAuthentication / ClusterTriggerAuthentication" CRD, as sketched below.
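A minimal sketch of such a "TriggerAuthentication", assuming a hypothetical K8s Secret named "aws-credentials" that holds the access key pair; the parameter names follow the KEDA documentation for AWS scalers:

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-aws-credentials
  namespace: rtaplamaci
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID      # read from the hypothetical Secret "aws-credentials"
      name: aws-credentials
      key: AWS_ACCESS_KEY_ID
    - parameter: awsSecretAccessKey
      name: aws-credentials
      key: AWS_SECRET_ACCESS_KEY

The trigger in the "ScaledObject" would then reference this object through an "authenticationRef" entry (authenticationRef with name: keda-aws-credentials) instead of relying on "identityOwner: operator".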

Deployment of KEDA to K8s Cluster

Before the deployment process, we create the K8s "namespace" that KEDA will run in. You can do this with the following command:

$ kubectl create ns keda
Image-3 Creating namespace

After creating the relevant "namespace", we deploy KEDA to the cluster. I used the Helm chart provided by KEDA for the deployment; you can perform it with the following commands:

$ helm repo add kedacore https://kedacore.github.io/charts
$ helm repo update
$ helm install keda kedacore/keda --namespace keda
Image-4 Deploy KEDA to K8s cluster

If you don’t want to use Helm, you can find the YAML configurations here (https://keda.sh/docs/2.8/deploy/#yaml).

With the command below, you can see the deployed “keda-operator” and “keda-operator-metrics-apiserver” “pods”.

$ kubectl get pod -n keda
Image-5 “keda-operator” and “keda-operator-metrics-apiserver” pods
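Before configuring the "ScaledObject", here is a minimal, hypothetical consumer Deployment that it will target; the image and its SQS-consuming logic are placeholders, not part of the original PoC code:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: keda-poc-consumer
  namespace: rtaplamaci
spec:
  replicas: 1
  selector:
    matchLabels:
      app: keda-poc-consumer
  template:
    metadata:
      labels:
        app: keda-poc-consumer
    spec:
      containers:
        - name: consumer
          image: <your-sqs-consumer-image>  # placeholder: any app that polls the PoC queue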

Configuring and Deploying ScaledObject

As mentioned earlier, a "ScaledObject" provides the operator with the matching between the K8s object to be dynamically scaled and the external metric or event source, the HPA configuration, and the definitions for authentication if necessary. Briefly, it is the CRD where customizations such as the object to be scaled, the metric or event source, threshold values, scaling size and period are made. Below you can see a sample "ScaledObject" configuration.

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: aws-cloudwatch-keda-scaledobject
  namespace: rtaplamaci
spec:
  scaleTargetRef:
    name: keda-poc-consumer
  minReplicaCount: 0   # We don't want pods if the queue is empty
  maxReplicaCount: 3   # We don't want to have more than 3 replicas
  pollingInterval: 30  # How frequently we should go for metrics (in seconds)
  cooldownPeriod: 10   # How many seconds should we wait for downscale
  triggers:
    - type: aws-cloudwatch
      metadata:
        # Required: namespace
        namespace: AWS/SQS
        # Optional: Dimension Name
        dimensionName: QueueName
        # Optional: Dimension Value
        dimensionValue: keda-poc-rtaplamaci-sqs
        # Optional: Expression query
        expression: SELECT MAX(ApproximateNumberOfMessagesVisible) FROM "AWS/SQS" WHERE QueueName = 'keda-poc-rtaplamaci-sqs'
        metricName: ApproximateNumberOfMessagesVisible
        targetMetricValue: "10"
        minMetricValue: "2"
        # Required: region
        awsRegion: "us-east-1"
        identityOwner: operator # Optional. Default: pod
        # Optional: Collection Time
        metricCollectionTime: "60" # default 300
        # Optional: Metric Statistic
        metricStat: "Maximum" # default "Average"
        # Optional: Metric Statistic Period
        metricStatPeriod: "60" # default 300
        # Optional: Metric Unit
        metricUnit: "Count" # default ""
        # Optional: Metric EndTime Offset
        metricEndTimeOffset: "60" # default 0

You can find what the variables in this configuration do, and their constraints, in the annotated configuration below:

apiVersion: keda.sh/v1alpha1
kind: "Type of CRD (ScaledObject)"
metadata:
  name: "The name to be given to the corresponding ScaledObject"
  namespace: "The K8s namespace where the ScaledObject and the dynamically scaled object run"
spec:
  scaleTargetRef: "Object reference to scale dynamically"
    apiVersion: "API version used by the object to be scaled dynamically (Optional, Default: apps/v1)"
    kind: "Type of object to scale dynamically (Optional, Default: Deployment)"
    name: "Name of the object to be scaled dynamically (Required, must be in the same namespace as the ScaledObject)"
    envSourceContainerName: "Container from which environment variables to be used in scaling are taken (Optional, Default: .spec.template.spec.containers[0])"
  minReplicaCount: "Minimum number of pods that HPA can scale to (Optional, Default: 0)"
  maxReplicaCount: "Maximum number of pods that HPA can scale to (Optional, Default: 100)"
  pollingInterval: "Period to check the metric/event source to trigger the scaling action (Optional, Default: 30 seconds)"
  cooldownPeriod: "The period HPA will wait before reducing the pod count to 0, measured from the moment the ScaledObject was last reported as active by the metric/event source (Optional, Default: 300 seconds)"
  triggers: "Metric/event source"
    - type: "Type of trigger"
      metadata:
        namespace: "AWS CloudWatch namespace (Required)"
        dimensionName: "Key of the metric (source) identifier (Optional, mandatory if Expression is not specified)"
        dimensionValue: "Value of the metric (source) identifier (Optional, mandatory if Expression is not specified)"
        expression: "Metric query (Optional, required if dimensionName and/or dimensionValue are not specified)"
        metricName: "Metric name"
        targetMetricValue: "Threshold for the metric (Optional, Default: 0, can be Decimal)"
        minMetricValue: "Value to be used in case the metric response is empty or null (Optional, Default: 0, can be Decimal)"
        awsRegion: "Region (Required)"
        identityOwner: "Identity source to be used for authentication operations (Optional, Default: pod, Possible Values: pod or operator)"
        metricCollectionTime: "Period that the scaler will check retrospectively (must be greater than metricStatPeriod; setting it to a multiple of metricStatPeriod will improve performance)"
        metricStat: "Statistic type to be used (Optional, Default: Average, Some of the Possible Values: SampleCount, Sum, Average, Minimum, Maximum)"
        metricStatPeriod: "Frequency to be used for the relevant metric (Optional, Default: 300, Possible Values: 1, 5, 10, 30, 60 and multiples of 60)"
        metricUnit: "Metric unit (Optional, Default: none, Possible Values include Bytes, Seconds, Count and Percent)"
        metricEndTimeOffset: "Offset for the most recent data point to be used in the specified period (Optional, Default: 0)"

After preparing the "ScaledObject" file, you can deploy the related object to the K8s cluster with the following command:

$ kubectl apply -f ScaledObject.yaml
Image-6 Deploying ScaledObject

After the deployment process is completed, you can check the status of the object with the following command:

$ kubectl get scaledobject -n <namespace>
Image-7 Checking the ScaledObject

Performing Scaling Tests

I pushed some messages to the Amazon SQS queue used as the event source for the scaling tests, and observed the behavior of the ScaledObject and the target object after the push.

Image-8 Amazon SQS Message count graph
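For reference, test messages can be pushed with "aws-cli"; a minimal sketch, with the queue URL written as a hypothetical placeholder:

$ aws sqs send-message \
    --queue-url https://sqs.us-east-1.amazonaws.com/<account-id>/keda-poc-rtaplamaci-sqs \
    --message-body "keda-poc-test-message"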

In my observations, when the metric passed the specified threshold, the HPA object created by the ScaledObject became "Active" and scaled the target object, increasing its replica count step by step over the specified period. You can see the status of the ScaledObject and the HPA using the following commands:

$ kubectl get hpa -n <namespace>
$ kubectl get scaledobject -n <namespace>
Image-9 Checking ScaledObject and HPA behaviors

After the messages in the related Amazon SQS queue were consumed and the "cooldownPeriod" had elapsed, the number of replicas decreased to the minimum value defined in the HPA, as expected. You can also use the following command to see the details of the related ScaledObject:

$ kubectl describe scaledobject <scaledobject-name> -n <namespace>
Image-10 ScaledObject details

Cleaning the Environment

You can delete the resources created within the scope of this PoC, without any issues, by following the order below:

  1. You can delete the created ScaledObject with one of the following commands:

$ kubectl delete scaledobject <scaledobject-name> -n <namespace>

or

$ kubectl delete -f ScaledObject.yaml

  2. Then you can delete KEDA and its components from the K8s cluster with the following command:

$ helm delete keda -n keda

  3. Finally, you can delete the namespace created for KEDA with the following command:

$ kubectl delete ns keda
Image-11 Cleaning the Environment

Conclusion

Within the scope of this article, we have seen that any workload running in a K8s cluster can be scaled with the help of KEDA, using the AWS CloudWatch service (the service where you can monitor almost all services running on AWS) as the metric/event source. In this PoC, we dynamically scaled our consumer workloads on an Amazon EKS K8s cluster based on the number of items in an Amazon SQS queue.

I had a lot of fun writing this article and doing the PoC. KEDA has also been very useful for scaling workloads with the correct metrics in the production environment. I hope you had fun reading it and found it useful; see you in the next article.

This is the Way…


Ramazan Taplamacı

Computer Engineer | Cloud Native Engineer | DevOps Engineer | AWS Certified Solutions Architect Professional | @rtaplamaci