Ephemeral Kubernetes Environments: A Cost-Effective Solution for Streamlining Minor Environments and Multi-Tenancy

May 2, 2023

Introduction

In today’s fast-paced software development world, minor environments such as feature, development, and testing environments play a crucial role in ensuring that software is developed, tested, and released quickly and efficiently. However, managing these environments can be a challenge, especially when it comes to scaling, security, and isolation. Ephemeral Kubernetes environments address these challenges by allowing developers and testers to create and destroy environments on demand, with little overhead. In this article, we’ll explore how ephemeral Kubernetes environments can simplify minor environments and improve the efficiency of software development and testing, and we’ll walk through a demo that automates the creation of a minor environment using GitHub Actions.

Table of Contents

· Why Use Ephemeral Kubernetes Clusters for Minor Environments?
· How to create Ephemeral Kubernetes Environment?
· Working with Development Environments
· Pull Request Flow with Ephemeral Environments
· Demo
· Summary

Why Use Ephemeral Kubernetes Clusters for Minor Environments?

It is a common strategy to have permanent Kubernetes clusters for minor environments such as PreProd, feature, and development environments. However, this approach has some drawbacks:

Each environment has its own dedicated infrastructure and Kubernetes cluster, which can be time-consuming and costly to set up.
If we are not using a managed cloud service such as EKS and instead build and provision our own Kubernetes clusters in our data centre, managing minor environments adds operational work that is resource-intensive and hard to maintain.

There are different types of minor environments:

PreProd Environments

PreProd environments are typically used to perform integration tests before going into production. They should have production-like databases to test data integrity. However, PreProd environments are not always in use, and running them all the time can be costly. There are only specific use cases where they need to be running all the time:

To maintain the state of the deployed version on the production environment and test the migration to the new version. This is mainly because not all workloads are stateless.
When third parties integrate with our APIs and need to test the newer versions before going into production. These parties require the environment to be up and running all the time.

We should be able to create the PreProd environment on demand, only when it is needed before a production release. With Infrastructure as Code (IaC) and manifest files, we can automate this process.

Feature Environments

Feature environments are used to test a specific new feature before it makes its way to a production release. Keeping these environments permanent is problematic either way:

If we have only one or a few feature environments, we will most likely run into dependency issues: one feature often requires another feature to be deployed, and a third may need both. With a large team, developers also end up overwriting each other’s deployments when testing their features.
On the other hand, if we have many permanent feature Kubernetes environments, these environments can be expensive and contribute to the issue of cluster sprawl.

One of the best ways to handle feature environments is to create ephemeral Kubernetes clusters on demand. For example, we can create an ephemeral Kubernetes cluster when a developer opens a pull request. After the developer finishes testing the new feature and closes the pull request, we destroy the cluster. This gives us a fully isolated environment that solves the dependency issue and prevents developers from overwriting each other’s deployments, which increases developer productivity. This approach also saves money because these environments are up for only a few hours, maybe days, but not weeks.

Development Environments

Dev environments are mainly used to test ongoing tasks or features that developers are working on.

If we have a single Kubernetes cluster for our development environment, our development team will likely experience productivity issues due to dependencies and conflicts. Moreover, a single developer might be working on different features on two different components, and it can be challenging for that developer to work on both features in parallel in the same Kubernetes environment.

Once again, the solution to such a problem is to create ephemeral Kubernetes clusters for dev environments on demand. A developer should be able to create an ephemeral environment for a branch, and that environment should be destroyed after finishing the development work. These environments should be up and running only for a few minutes or hours.

How to create Ephemeral Kubernetes Environment?

So far we have established that ephemeral Kubernetes clusters can serve all minor environments. Now let’s discuss how to create those ephemeral environments. There are four ways to do that:

Kubernetes cluster per environment.
Namespace per environment.
Capsule Tenant per environment.
Virtual Cluster per environment.

Kubernetes cluster per environment

We can create a whole Kubernetes cluster for each environment. Cluster sprawl is less of a concern here because the clusters are created and destroyed in an automated way. This approach provides full isolation for all environments, but it has some drawbacks. To create the clusters, we have two options:

Creating clusters in our own data centres keeps our virtual machines busy most of the time, adds operational work, and the clusters take some time to be up and ready to host temporary workloads.
Creating clusters using a managed cloud service such as EKS takes around 15 to 25 minutes, which is not ideal for ephemeral environments, and we still pay for the cluster nodes.

Kubernetes namespace per environment

We can create a single permanent Kubernetes cluster for minor environments and isolate the ephemeral environments from each other using Kubernetes namespaces. This gives us some logical isolation, but it has drawbacks as well: namespaces alone don’t allow us to implement advanced multi-tenancy, and the setup can easily become complicated.
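As a rough illustration of the namespace approach, the sketch below renders a namespace together with a ResourceQuota for one environment. The `env-` prefix, the `ephemeral` label, and the quota values are assumptions chosen for the example, not something prescribed by this article.

```shell
#!/bin/sh
# Sketch: render the manifests for one namespace-scoped ephemeral environment.
# The "env-" prefix, the label and the quota values are illustrative assumptions.

namespace_env_manifest() {
  # $1 = environment identifier, for example "pr-42"
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: env-$1
  labels:
    ephemeral: "true"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: env-quota
  namespace: env-$1
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    pods: "20"
EOF
}

# Print the manifests for a hypothetical environment called "pr-42".
namespace_env_manifest "pr-42"

# Against a real cluster (requires kubectl and a valid context):
#   namespace_env_manifest "pr-42" | kubectl apply -f -
#   kubectl delete namespace env-pr-42   # removes the environment and everything in it
```

Deleting the namespace is what makes the environment ephemeral: every namespaced resource inside it goes away with it, while cluster-scoped resources are unaffected.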

Capsule Tenant per environment

According to the Kubernetes documentation, there is no single definition of a tenant because it varies with the use case. In our use case, we need a tenant that provides full isolation and admission control so that we can set resource limits, and that can be removed together with all its resources with a few commands or even a single one. Most importantly, these tenants share the same Kubernetes cluster.

One well-known Kubernetes project for this is Capsule. Capsule takes a different approach from plain Kubernetes namespaces: it groups namespaces into tenants and provides a policy engine that keeps the tenants isolated from each other. It lets users operate their environment autonomously, far more so than with plain namespaces, but within some permission limits.

The main issue with this approach is that Capsule doesn’t provide cluster-level isolation. For example, if our deployment needs CRDs or other cluster-level resources, Capsule is not the best tool.
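To make the tenant idea more concrete, here is a minimal sketch that renders a Capsule Tenant manifest. It assumes Capsule is already installed in the cluster and serves the `capsule.clastix.io/v1beta2` API; the tenant name, the owner, and the namespace quota are hypothetical values.

```shell
#!/bin/sh
# Sketch: render a Capsule Tenant for one ephemeral environment.
# Assumes the Capsule operator is installed; names and the quota are made up.

capsule_tenant_manifest() {
  # $1 = tenant name, $2 = user who owns the tenant
  cat <<EOF
apiVersion: capsule.clastix.io/v1beta2
kind: Tenant
metadata:
  name: $1
spec:
  owners:
    - name: $2
      kind: User
  namespaceOptions:
    quota: 3
EOF
}

# Print the manifest for a hypothetical tenant owned by the user "alice".
capsule_tenant_manifest "feature-login" "alice"

# Against a real cluster (requires kubectl and Capsule):
#   capsule_tenant_manifest "feature-login" "alice" | kubectl apply -f -
#   kubectl delete tenant feature-login   # tears the tenant down again
```

The tenant owner can then create namespaces inside the tenant, up to the quota, without needing cluster-wide permissions.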

Virtual Cluster per Environment

As mentioned before, the ideal scenario is a fully dedicated Kubernetes cluster for each developer or each pull request, but that is hard to manage, involves a lot of operational work, and is very expensive. What if we could create multiple Kubernetes clusters inside a single Kubernetes cluster, create them easily on demand, and destroy them just as easily once our testing is finished?

This is where Kubernetes virtual clusters come in. A virtual cluster is a Kubernetes cluster that runs on top of an actual Kubernetes cluster. With vcluster, the tool used in the demo below, the virtual cluster is built on K3s by default at the time of writing. K3s is a lightweight Kubernetes distribution designed for production workloads.

This makes it easy to manage a single (relatively big) Kubernetes cluster and partition it into smaller Kubernetes clusters at low cost and with little operational overhead.

Virtual clusters make it possible to deploy and use cluster-wide resources such as Kubernetes CRDs, and they give the developer (the tenant owner) the luxury of owning a real cluster without restrictions at the permissions level.
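The lifecycle of such a virtual cluster can be sketched with the vcluster CLI as follows. The cluster and namespace names are placeholders, and the script defaults to a dry run that only prints the commands, so it can be tried without a cluster.

```shell
#!/bin/sh
# Sketch of a virtual cluster lifecycle with the vcluster CLI.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_env() {
  # $1 = virtual cluster name, $2 = namespace on the host cluster
  run vcluster create "$1" --namespace "$2" --connect=false
}

check_env() {
  # Run a single command against the virtual cluster.
  run vcluster connect "$1" --namespace "$2" -- kubectl get namespaces
}

destroy_env() {
  run vcluster delete "$1" --namespace "$2"
}

# "demo-env" and "demo-ns" are hypothetical names.
create_env demo-env demo-ns
check_env demo-env demo-ns
destroy_env demo-env demo-ns
```

Setting `DRY_RUN=0` runs the same commands for real, which requires the vcluster and kubectl binaries and access to a host cluster.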

Working with Development Environments

Let’s take the example of a developer who is building a new feature in a microservice. For simplicity, let’s assume that this microservice depends on two other microservices, needs a database for storage, and uses a queuing service to publish or consume messages. After writing or fixing some code, the developer needs to test the feature locally, which requires all those dependencies to be up and running locally. These are heavy workloads to run on a developer machine, so it is much better to have an environment where the developer can deploy them.

To deploy the new feature to a remote Kubernetes cluster, however, the developer first needs to build an image and then deploy the workloads. What happens when the developer makes further changes? The traditional answer is to push the changes, build a new image, and redeploy the newer version. This loop hurts productivity, so we need another solution.

We need a solution that lets us apply our local changes to the remote workloads directly.

We need a way to synchronize our local changes to remote Kubernetes pods, and for that we can use tools like DevSpace. DevSpace is an open-source command-line tool for developing and deploying cloud-native software. It supports hot reloading, updating our running containers without a rebuild or even a restart, and it communicates with Kubernetes through the local Kubernetes context we provide.
Combining ephemeral Kubernetes environments with tools like DevSpace increases developer productivity and improves the developer experience.
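A development session against an ephemeral environment could look roughly like the sketch below. It assumes the project already contains a devspace.yaml (for example one generated by `devspace init`) and that the kube context of the ephemeral environment already exists. The context and namespace names are hypothetical, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: point DevSpace at an ephemeral environment and start a dev session.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

dev_session() {
  # $1 = kube context of the ephemeral environment, $2 = namespace to develop in
  run devspace use context "$1"
  run devspace use namespace "$2"
  # Deploys the project and starts file sync as configured in devspace.yaml.
  run devspace dev
}

# "my-ephemeral-context" and "gateway-dev" are hypothetical names.
dev_session my-ephemeral-context gateway-dev
```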

Pull Request Flow with Ephemeral Environments

Now that the development process is done, we have pushed our changes to Git and created a pull request for the feature. We need to test those changes before they go to the PreProd or production environment. There are different ways to do that. We could deploy the new feature to a staging environment, which is fine, but if we work with a team of other developers, another pull request may overwrite our deployment. It is better to have a fully isolated ephemeral environment just for testing this feature. Once testing is finished and the pull request is merged, a pipeline triggered by that event can destroy the whole environment.

So we will use the CI/CD pipeline to create this testing or preview environment, in combination with vcluster or Capsule, to get an isolated Kubernetes environment that is easy to destroy when the pull request is merged. The diagram below shows the flow in more detail.
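The decision logic behind this flow is small enough to sketch as a shell function that maps a pull request event action to an environment operation. The action names are the ones GitHub sends for pull request events (a merge arrives as `closed`), while the `pr-<number>` naming is only an assumption for illustration.

```shell
#!/bin/sh
# Sketch: map a pull request event action to an environment operation.
# The "pr-<number>" naming is an illustrative assumption.

pr_env_action() {
  # $1 = pull request event action, $2 = pull request number
  case "$1" in
    opened|reopened|synchronize) echo "create pr-$2" ;;
    closed) echo "destroy pr-$2" ;;   # merged pull requests also arrive as "closed"
    *) echo "ignore" ;;
  esac
}

pr_env_action opened 42    # prints: create pr-42
pr_env_action closed 42    # prints: destroy pr-42
pr_env_action labeled 42   # prints: ignore
```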

Demo

You can find the GitHub repository for this demo here: github.com/AmrAlaaYassen/ephemeral-kubernetes-environments

In this demo, we have a system that consists of two microservices and a single MongoDB database for storage. We will open a pull request on one of the microservices, and opening this PR should create a new environment on Kubernetes. We will use vcluster to create and destroy the ephemeral environment. After finishing our testing, we either close or merge the pull request, which triggers another pipeline that destroys the ephemeral environment.

I have deployed these workloads on a local Kubernetes cluster. The pipelines are GitHub Actions workflows, and kustomize is used to deploy the whole system easily.

I’m going to use the GitHub repository below, as it provides a sample microservices system for the demo: github.com/ashleydavis/nodejs-microservices-example (an example of a monorepo with multiple Node.js microservices, using Docker, Docker-Compose and Kubernetes).

For development environments, we need pipelines that automate the creation of the ephemeral environment for each microservice. Our pipeline should have the following steps:

checkout the code
build docker image
push the new docker image
update our Kubernetes manifests
deploy a new ephemeral environment
deploy the whole system on this environment
provide access to this environment

This is not the ultimate pipeline for your development workflow; these steps are just for demonstration.

We will use vcluster to create fully isolated Kubernetes environments. See the GitHub Actions workflow file below:

name: Development Environment gateway
on:
  workflow_dispatch:
    
jobs:
  dev-deploy:
    runs-on: self-hosted
    timeout-minutes: 10

    env:
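      # Shell expansions such as ${GITHUB_REF##*/} are not evaluated in env:, so use the ref_name context instead.
      VERSION: ${{ github.ref_name }}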
      CONTAINER_REGISTRY: ${{ secrets.CONTAINER_REGISTRY }}
      REGISTRY_UN: ${{ secrets.DOCKER_USERNAME }}
      REGISTRY_PW: ${{ secrets.DOCKER_PASSWORD  }}
    steps:
      - uses: actions/checkout@v2
        name: checkout
        with:
          persist-credentials: false

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Build
        run: ./scripts/build-image.sh gateway ${{github.sha}}

      - name: Publish
        run: ./scripts/push-image.sh gateway ${{github.sha}}

    
      - name: install kubectl
        run: |
          sudo curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl";
          sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
          sudo rm -rf kubectl
          kubectl get pods
      - name: install vcluster
        run: |
          curl -L -o vcluster "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64" && sudo install -c -m 0755 vcluster /usr/local/bin && sudo rm -f vcluster
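      - name: create vcluster for preview env
        run: |
          # Create the virtual cluster in its own host namespace. The command is backgrounded,
          # and the fixed sleep is a crude wait for it to come up.
          vcluster create gateway-dev-${{ github.sha }} --namespace preview-${{github.sha}} &
          sleep 20
          # Connect so that the following kubectl calls target the virtual cluster.
          vcluster connect gateway-dev-${{ github.sha }} --namespace preview-${{github.sha}} &

          sleep 20
          # Namespace inside the virtual cluster that will hold this user's workloads.
          kubectl create ns gateway-dev-${{ github.actor_id }}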

      - name: install YQ
        run: sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && sudo chmod +x /usr/bin/yq

      - name: update image
        run: |
          yq -e -i '.spec.template.spec.containers[0].image="${{ secrets.CONTAINER_REGISTRY }}/ephemeral-envs-gateway:${{ github.sha }}"' ./scripts/kubernetes/gateway.yaml
          yq -e -i '.spec.template.spec.containers[0].env[1].value="mongodb://db:27017"' ./scripts/kubernetes/worker.yaml

      - name: Update changes
        run: |
          git config --global user.email "${{secrets.ORG}}"
          git config --global user.name "${{secrets.USERNAME}}"
          git add .
          git commit -m "push image changes for ${{github.sha}}"
      - name: Push to Git
        uses: ad-m/github-push-action@master
        with:
          github_token: ${{ secrets.TOKEN }}
          repository: ${{ secrets.USERNAME }}/ephemeral-kubernetes-environments

      - name: Deploy
        run: |
          sudo curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | sudo bash && \
          ./kustomize build ./scripts/kubernetes | kubectl apply -f - -n gateway-dev-${{ github.actor_id }}

      - name: Upload KUBECONFIG file as artifact
        uses: actions/upload-artifact@v3
        with:
          name: 'dev-${{github.actor}}'
          path: ~/.kube/config

The steps in this pipeline build an image, push it to our Docker registry, and commit the updated image tag to GitHub. After that, the pipeline deploys a virtual cluster and provides access to it through a normal KUBECONFIG file.

You can handle access to the cluster differently, depending on how you authorize your users.

Now, by running the vcluster list command, we can see the newly created ephemeral cluster. The cluster name is derived from the GitHub commit SHA, and the namespace inside it from the GitHub user ID, to avoid creating clusters with the same name.

The newly created cluster
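Since these names are generated, it can be worth checking them against the Kubernetes naming rules before creating anything. The sketch below checks whether a name is a valid DNS label (at most 63 characters, lowercase alphanumerics and dashes, starting and ending with an alphanumeric), which is what namespace names must satisfy. The SHA value is a made-up placeholder.

```shell
#!/bin/sh
# Sketch: validate generated environment names against the DNS label rules
# that Kubernetes applies to namespace names.

is_dns_label() {
  # Reject empty names, leading or trailing dashes, and any character
  # outside lowercase letters, digits and dashes.
  case "$1" in
    ''|-*|*-|*[!a-z0-9-]*) return 1 ;;
  esac
  # Names may be at most 63 characters long.
  [ "${#1}" -le 63 ]
}

# Hypothetical 40-character commit SHA.
sha="0123456789abcdef0123456789abcdef01234567"

for name in "gateway-dev-$sha" "preview-$sha" "Preview_Env"; do
  if is_dns_label "$name"; then
    echo "ok: $name"
  else
    echo "invalid: $name"
  fi
done
```

With a full 40-character SHA, both generated names stay under the 63-character limit, but a longer prefix would not leave much room.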

Now let’s check the deployed workloads.

It’s clear that the full set of workloads is deployed to the new environment, and only to that environment.

Because development environments are created on demand whenever a developer needs one, it is better to have an automated job that destroys them every few hours, to avoid keeping our main cluster’s resources busy all the time.
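One possible shape for such a cleanup job is sketched below. It is not part of the demo repository. The function reads lines of the form "name creationTimestamp" and prints the namespaces with a given prefix that are older than a cutoff. The `preview-` prefix matches the development workflow above, while the three-hour window in the usage comment is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch: pick the host namespaces of expired ephemeral environments.
# Input on stdin: one "name creationTimestamp" pair per line, with timestamps
# in the UTC ISO 8601 form that Kubernetes uses (they sort as plain strings).

expired_envs() {
  # $1 = namespace prefix, $2 = cutoff timestamp (older than this is expired)
  awk -v prefix="$1" -v cutoff="$2" \
    'index($1, prefix) == 1 && ($2 "") < (cutoff "") { print $1 }'
}

# Example with made-up data and a fixed cutoff.
expired_envs preview- 2023-05-02T09:00:00Z <<EOF
preview-aaa 2023-05-02T08:00:00Z
preview-bbb 2023-05-02T11:30:00Z
kube-system 2023-04-01T00:00:00Z
EOF
# prints: preview-aaa

# Against a real cluster (assumes kubectl, GNU date and GNU xargs):
#   kubectl get namespaces \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.metadata.creationTimestamp}{"\n"}{end}' \
#     | expired_envs preview- "$(date -u -d '3 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
#     | xargs -r kubectl delete namespace
```

Deleting the host namespace removes the virtual cluster that lives in it, so the job could run on a schedule, for example from a scheduled workflow or a cron job.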

Now that we have finished development, we are ready to open a pull request to merge the new changes into the main branch. On PR creation, a new environment is created solely to test the new changes. The QA and testing teams should use this environment to accept the new changes or features.

The PR GitHub workflow is almost the same as the development environment workflow, but we don’t push the new image tag to GitHub; we deploy it directly to the newly created ephemeral environment.

name: gateway PR
on:
  pull_request:
    # Sequence of patterns matched against refs/heads
    branches:
      - main
    paths:
      - 'gateway/**'
jobs:
  pr-deploy:
    runs-on: self-hosted
    timeout-minutes: 10

    env:
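      # Shell expansions such as ${GITHUB_REF##*/} are not evaluated in env:, so reuse the image tag built below.
      VERSION: pr-${{ github.event.pull_request.number }}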
      CONTAINER_REGISTRY: ${{ secrets.CONTAINER_REGISTRY }}
      REGISTRY_UN: ${{ secrets.DOCKER_USERNAME }}
      REGISTRY_PW: ${{ secrets.DOCKER_PASSWORD  }}
    steps:
      - uses: actions/checkout@v2
        name: checkout
        with:
          persist-credentials: false

      - name: Login to Docker Hub
        uses: docker/login-action@v2
        with:
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      - name: Build
        run: ./scripts/build-image.sh gateway pr-${{ github.event.pull_request.number }}

      - name: Publish
        run: ./scripts/push-image.sh gateway pr-${{ github.event.pull_request.number }}


      - name: install kubectl
        run: |
          sudo curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl";
          sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
          sudo rm -rf kubectl
          kubectl get pods
      - name: install vcluster
        run: |
          curl -L -o vcluster "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64" && sudo install -c -m 0755 vcluster /usr/local/bin && sudo rm -f vcluster
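      - name: create vcluster for PR env
        run: |
          # Create the virtual cluster in a host namespace named after the pull request.
          # The command is backgrounded, and the fixed sleep is a crude wait for it to come up.
          vcluster create gateway-pr-${{ github.event.pull_request.number }} --namespace pr-${{ github.event.pull_request.number }} &
          sleep 20
          # Connect so that the following kubectl calls target the virtual cluster.
          vcluster connect gateway-pr-${{ github.event.pull_request.number }} --namespace pr-${{ github.event.pull_request.number }} &

          sleep 20
          # Namespace inside the virtual cluster that will hold the workloads under test.
          kubectl create ns gateway-pr-${{ github.actor_id }}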

      - name: install YQ
        run: sudo wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/bin/yq && sudo chmod +x /usr/bin/yq

      - name: update image
        run: |
          yq -e -i '.spec.template.spec.containers[0].image="${{ secrets.CONTAINER_REGISTRY }}/ephemeral-envs-gateway:pr-${{ github.event.pull_request.number }}"' ./scripts/kubernetes/gateway.yaml
          yq -e -i '.spec.template.spec.containers[0].env[1].value="mongodb://db:27017"' ./scripts/kubernetes/worker.yaml


      - name: Deploy
        run: |
          sudo curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh"  | sudo bash && \
          ./kustomize build ./scripts/kubernetes | kubectl apply -f - -n gateway-pr-${{ github.actor_id }}

      - name: Upload KUBECONFIG file as artifact
        uses: actions/upload-artifact@v3
        with:
          name: 'pr-${{github.actor}}'
          path: ~/.kube/config

We also need a workflow that destroys the created environment when we merge or close the pull request.

name: pr-destroy-gateway
on:
  pull_request:
    types: [closed]
    paths:
      - 'gateway/**'
jobs:
  pr-destroy:
    runs-on: self-hosted
    timeout-minutes: 8
   
    steps:
      - uses: actions/checkout@v2
        name: checkout
        with:
          persist-credentials: false

      - name: install kubectl
        run: |
          sudo curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl";
          sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
          sudo rm -rf kubectl
          kubectl get pods
    
      - name: install vcluster
        run: |
          curl -L -o vcluster "https://github.com/loft-sh/vcluster/releases/latest/download/vcluster-linux-amd64" && sudo install -c -m 0755 vcluster /usr/local/bin && sudo rm -f vcluster
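      - name: delete vcluster for PR env
        run: |
          # Run in the foreground so the step waits for the deletion.
          # The namespace matches the one used when the virtual cluster was created.
          vcluster delete gateway-pr-${{ github.event.pull_request.number }} --namespace pr-${{ github.event.pull_request.number }}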

Newly created ephemeral environment

Now let’s merge this PR to see the destroy workflow running.

Destroying the environment on PR close/merge

Summary

In conclusion, ephemeral environments are a good fit if you have a large team, or multiple teams, working in the same environment. They avoid the high cost of creating separate clusters and boost your team’s productivity.

A side note on data management: for the PreProd environment to validate data integrity, you need to keep its data production-like, and for that you need a way to automate the creation of that data.

Ephemeral environments may not suit you if you have a tiny team of developers who work on independent features.

Overall, ephemeral Kubernetes clusters offer a flexible and powerful tool for achieving more efficient and effective testing and development processes. By adopting this approach, organizations can ensure their minor environments are fully optimized, enabling faster and more agile development cycles.

Contact

https://www.linkedin.com/in/amr-alaa-yassen-609785108/