Daniele Polencic

Templating YAML in Kubernetes with real code

May 2020



TL;DR: You should use tools such as yq and kustomize to template YAML resources instead of relying on tools that interpolate strings such as Helm.

If you're working on large-scale projects, you should consider using real code. You can find hands-on examples on how to programmatically generate Kubernetes resources in Java, Go, JavaScript, C# and Python in this repository.


Introduction: managing YAML files

When you have multiple Kubernetes clusters, it's common to have resources that can be applied to all environments but with small modifications. As an example, when an app runs in the staging environment, it should connect to the staging database.

However, in production, it should connect to the production database.

In Kubernetes, you can use an environment variable to inject the correct database URL.

The following Pod is an example:

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://db_url:5432

Since the value postgres://db_url:5432 is hardcoded in the YAML definition, there's no easy way to deploy the same Pod in multiple environments such as dev, staging and production.

You could create a Pod YAML definition for each of the environments you plan to deploy to.

bash

tree .
kube/
├── deployment-prod.yaml
├── deployment-staging.yaml
└── deployment-dev.yaml

Unfortunately, having several copies of the same file with minor modifications has its challenges.

If you update the name of the image or the version, you have to amend all of the remaining files.

Using templates with search and replace

A better strategy is to have a placeholder and replace it with the real value before the YAML is submitted to the cluster.

Search and replace.

If you're familiar with bash, you can implement search and replace with a few lines of sed.

Your Pod should contain a placeholder like this:

pod_template.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: ENV
      value: %ENV_NAME%

And you could run the following command to replace the value of the environment on the fly.

bash

sed 's/%ENV_NAME%/production/g' \
  pod_template.yaml > pod_production.yaml
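
The same search and replace can be sketched in JavaScript to show what happens as the number of placeholders grows. This is only a minimal sketch: renderTemplate is a hypothetical helper, and the %NAME% placeholder syntax is assumed from the template above.

```javascript
// Hypothetical helper: replaces every %NAME% placeholder with a value from `values`.
// It treats the template as a plain string and knows nothing about YAML.
function renderTemplate(template, values) {
  return template.replace(/%([A-Z_]+)%/g, (match, name) => {
    if (!(name in values)) {
      throw new Error(`No value provided for placeholder ${match}`)
    }
    return values[name]
  })
}

const template = [
  'apiVersion: v1',
  'kind: Pod',
  'metadata:',
  '  name: test-pod',
  'spec:',
  '  containers:',
  '  - name: test-container',
  '    image: registry.k8s.io/busybox',
  '    env:',
  '    - name: ENV',
  '      value: %ENV_NAME%',
].join('\n')

const rendered = renderTemplate(template, { ENV_NAME: 'production' })
console.log(rendered)
```

Every value that should change still needs its own placeholder in the template and its own entry in the values object.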

However, tagging all the templates and injecting placeholders is hard work.

What if you want to change a value that isn't a placeholder?

And if you have many placeholders, what does the sed command look like?

If you have only a handful of variables that you wish to change, you might want to install yq, a command-line tool designed to transform YAML.

yq is similar to another more popular tool called jq that focuses on JSON instead of YAML.

You can install yq on macOS with:

bash

brew install yq

On Linux with:

bash

sudo add-apt-repository ppa:rmescandon/yq
sudo apt-get install yq

In case you don't have the add-apt-repository command installed, you can install it with apt-get install software-properties-common.

If you're on Windows, you can download the executable from GitHub.

Templating with yq

yq takes a YAML file as input and can:

  1. read values from the file
  2. add new values
  3. update existing values
  4. generate new YAML files
  5. convert YAML into JSON
  6. merge two or more YAML files

Let's have a look at the same Pod definition:

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://db_url:5432

You could read the value for the environment variable DB_URL with:

bash

yq r pod.yaml "spec.containers[0].env[0].value"
postgres://db_url:5432

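The command works as follows: r is the read command, pod.yaml is the input file, and the last argument is the path to the value you want to read.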

What if you want to change the value instead?

Perhaps you want to deploy to the production environment and change the URL to the production database.

You can use the following command:

bash

yq w pod.yaml "spec.containers[0].env[0].value" "postgres://prod:5432"
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://prod:5432

You should notice that yq printed the result on the standard output.

If you prefer to edit the YAML in place, you should add the -i flag.

The difference between yq and sed is that the former understands the YAML format and can navigate and mangle the structured markup.

On the other hand, sed treats files as strings and it doesn't mind if the file isn't valid YAML.
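
The difference can be sketched in JavaScript by working on the parsed structure instead of the text. The helpers below (parsePath, readValue, writeValue) are hypothetical and only mimic the idea of reading and writing a value by path; they are not how yq is implemented.

```javascript
// Turns "spec.containers[0].env[0].value" into
// ['spec', 'containers', 0, 'env', 0, 'value'].
function parsePath(path) {
  return path
    .replace(/\[(\d+)\]/g, '.$1')
    .split('.')
    .map((key) => (/^\d+$/.test(key) ? Number(key) : key))
}

// Walks the structure and returns the value at `path` (undefined if it is missing).
function readValue(resource, path) {
  return parsePath(path).reduce(
    (node, key) => (node == null ? undefined : node[key]),
    resource
  )
}

// Returns a copy of `resource` with the value at `path` replaced.
// Assumes that every parent of the last key already exists.
function writeValue(resource, path, value) {
  const copy = JSON.parse(JSON.stringify(resource)) // leave the input untouched
  const keys = parsePath(path)
  const last = keys.pop()
  const parent = keys.reduce((node, key) => node[key], copy)
  parent[last] = value
  return copy
}

const pod = {
  apiVersion: 'v1',
  kind: 'Pod',
  metadata: { name: 'test-pod' },
  spec: {
    containers: [
      {
        name: 'test-container',
        image: 'registry.k8s.io/busybox',
        env: [{ name: 'DB_URL', value: 'postgres://db_url:5432' }],
      },
    ],
  },
}

const path = 'spec.containers[0].env[0].value'
const prodPod = writeValue(pod, path, 'postgres://prod:5432')

console.log(readValue(pod, path))
console.log(readValue(prodPod, path))
```

Because the path points at a node in the structure, no placeholder is needed and the result cannot end up with broken indentation.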

Since yq understands YAML, let's explore a few more complex scenarios.

Merging YAML files

Let's assume that you want to inject an extra container into all the Pods submitted to the cluster.

But instead of using an Admission Webhook, you decide to add an extra command in your deployment script.

You could save the YAML configuration for the extra container as a YAML file:

envoy-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: envoy-pod
spec:
  containers:
  - name: proxy-container
    image: envoyproxy/envoy:v1.12.2
    ports:
      - containerPort: 80

Assuming that you have a Pod like this:

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://db_url:5432

You can execute the following command and merge the two YAMLs:

bash

yq m -a append pod.yaml envoy-pod.yaml

Please notice the -a append flag that is necessary to append values to an array. You can find more details in the official documentation.

The output should have a proxy named Envoy as an additional container:

container-snippet.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://db_url:5432
  - name: proxy-container
    image: envoyproxy/envoy:v1.12.2
    ports:
      - containerPort: 80

Please note that yq sorts the YAML fields in the output alphabetically, so the order of fields in your output could be different from the above listing.

In other words, the two YAML files are merged into one.

While the example shows a useful strategy to compose complex YAML from basic files, it also shows some of the limitations of yq:

  1. The two YAML files are merged at the top level. For instance, you cannot add a chunk of YAML file under .spec.containers[].
  2. The order of the files matters. If you invert the order, yq keeps envoy-pod for the Pod's name in metadata.name.
  3. You have to tell yq explicitly when to append and overwrite values. Since those are flags that apply to the whole document, it's hard to get the granularity right.
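
The second and third limitations can be illustrated with a simplified merge written in JavaScript. This is a sketch under stated assumptions, not yq's actual algorithm: mergeAppend is a hypothetical function in which every array is appended, and the value from the first document wins for any other conflict.

```javascript
const isObject = (value) =>
  value !== null && typeof value === 'object' && !Array.isArray(value)

// Simplified model: arrays are concatenated, objects are merged key by key,
// and for any other conflict the value from `first` wins.
function mergeAppend(first, second) {
  if (Array.isArray(first) && Array.isArray(second)) {
    return [...first, ...second]
  }
  if (isObject(first) && isObject(second)) {
    const result = { ...first }
    for (const [key, value] of Object.entries(second)) {
      result[key] = key in first ? mergeAppend(first[key], value) : value
    }
    return result
  }
  return first
}

const pod = {
  metadata: { name: 'test-pod' },
  spec: { containers: [{ name: 'test-container' }] },
}

const envoyPod = {
  metadata: { name: 'envoy-pod' },
  spec: { containers: [{ name: 'proxy-container' }] },
}

// The order of the arguments decides which metadata.name survives.
const merged = mergeAppend(pod, envoyPod)
const inverted = mergeAppend(envoyPod, pod)

console.log(merged.metadata.name, merged.spec.containers.map((c) => c.name))
console.log(inverted.metadata.name, inverted.spec.containers.map((c) => c.name))
```

The append rule in this sketch applies to every array in the document, which is why it is hard to append in one place and overwrite in another.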

However, if you plan to use yq for small projects, you can probably go quite far with it.

There's another tool similar to yq, but explicitly focused on Kubernetes YAML resources: kustomize.

While yq understands and transforms YAML, kustomize can understand and transform Kubernetes YAML.

That's a subtle but essential difference so you will explore that next.

Templating with Kustomize

Kustomize is a command-line tool that can create and transform YAML files — just like yq.

However, instead of using only the command line, kustomize uses a file called kustomization.yaml to decide how to template the YAML.

Let's have a look at how it works.

All the files should be created in a separate folder:

bash

mkdir prod
cd prod

You will use the same Pod as before:

pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: DB_URL
      value: postgres://db_url:5432

You can save the file as pod.yaml in the prod directory.

In the same directory, you should also create a kustomization.yaml file:

kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - pod.yaml

You can then execute the kustomize command by passing it the directory where your kustomization.yaml file resides as an argument:

Please note that the . (dot) in the next command is the prod directory.

bash

kubectl kustomize .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - env:
    - name: DB_URL
      value: postgres://db_url:5432
    image: registry.k8s.io/busybox
    name: test-container

There's no change in the output since you haven't applied any kustomization yet.

You can define a patch that should be applied to the Pod.

The patch is defined like this:

pod-patch.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: proxy-container
    image: envoyproxy/envoy:v1.12.2
    ports:
      - containerPort: 80

You should save the file as pod-patch.yaml in the same directory.

Please notice that the name of the resource in the patch (metadata.name) has to match the metadata.name in pod.yaml.

And you should update your kustomization.yaml to include the following lines:

kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - pod.yaml
patchesStrategicMerge:
  - pod-patch.yaml

If you rerun the command, the output should be a Pod with two containers:

bash

kubectl kustomize .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - image: envoyproxy/envoy:v1.12.2
    name: proxy-container
    ports:
    - containerPort: 80
  - env:
    - name: DB_URL
      value: postgres://db_url:5432
    image: registry.k8s.io/busybox
    name: test-container

The kustomize patch functionality works similarly to yq merge, but the setup for kustomize is more tedious.

Also, kustomize merges the two YAML only when metadata.name is the same in both files.
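
The matching rule can be sketched in JavaScript as follows. This is a simplified model and not kustomize's implementation: applyPatch and mergeContainers are hypothetical functions, only spec.containers is merged, and the order of the containers may differ from the kustomize output above.

```javascript
// Merges two lists of containers, pairing entries that share the same `name`.
// The merge is shallow: a nested list such as `env` is replaced as a whole,
// which is a simplification for this sketch.
function mergeContainers(original, patches) {
  const merged = original.map((container) => ({ ...container }))
  for (const patch of patches) {
    const existing = merged.find((container) => container.name === patch.name)
    if (existing) {
      Object.assign(existing, patch) // same name: fields from the patch override
    } else {
      merged.push({ ...patch }) // new name: the container is added
    }
  }
  return merged
}

// Applies `patch` only to resources with the same kind and metadata.name.
function applyPatch(resources, patch) {
  return resources.map((resource) => {
    const sameTarget =
      resource.kind === patch.kind &&
      resource.metadata.name === patch.metadata.name
    if (!sameTarget) {
      return resource
    }
    return {
      ...resource,
      spec: {
        ...resource.spec,
        containers: mergeContainers(resource.spec.containers, patch.spec.containers),
      },
    }
  })
}

const resources = [
  {
    kind: 'Pod',
    metadata: { name: 'test-pod' },
    spec: { containers: [{ name: 'test-container', image: 'registry.k8s.io/busybox' }] },
  },
  {
    kind: 'Pod',
    metadata: { name: 'other-pod' }, // hypothetical second Pod, left untouched
    spec: { containers: [{ name: 'other-container', image: 'registry.k8s.io/busybox' }] },
  },
]

const patch = {
  kind: 'Pod',
  metadata: { name: 'test-pod' },
  spec: { containers: [{ name: 'proxy-container', image: 'envoyproxy/envoy:v1.12.2' }] },
}

const patched = applyPatch(resources, patch)
console.log(JSON.stringify(patched, null, 2))
```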

It's safer, but is it enough to justify using kustomize instead of yq?

Kustomize is designed to map changes and resources in code.

You should create another folder at the same level as the previous one:

bash

cd ..
mkdir dev
cd dev
tree ..
├── dev
└── prod
    ├── kustomization.yaml
    ├── pod-patch.yaml
    └── pod.yaml

You can create another kustomization.yaml:

kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
  - ../prod
patchesStrategicMerge:
  - pod-patch-env.yaml

The new configuration extends the base configuration in the prod directory, so there's no need to recreate all patches in the dev directory.

Instead, you can create a single patch in the dev directory that changes the value of the environment variable:

pod-patch-env.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      env:
      - name: DB_URL
        value: postgres://dev:5432

The final directory structure is the following:

bash

tree .
├── dev
│   ├── kustomization.yaml
│   └── pod-patch-env.yaml
└── prod
    ├── kustomization.yaml
    ├── pod-patch.yaml
    └── pod.yaml

If you run kustomize this time, the output will be different:

bash

kubectl kustomize dev
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - image: envoyproxy/envoy:v1.12.2
    name: proxy-container
    ports:
    - containerPort: 80
  - env:
    - name: DB_URL
      value: postgres://dev:5432
    image: registry.k8s.io/busybox
    name: test-container

Please notice how:

  1. you still have the extra container which is patched in the base kustomization.yaml file
  2. the environment variable DB_URL was changed to postgres://dev:5432
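
One way to picture how the dev configuration builds on the prod one is to treat each patch as a function and each directory as an ordered list of patches. The JavaScript below is only an analogy under that assumption; addContainer, setEnv and render are hypothetical names and not part of kustomize.

```javascript
// Each "patch" is a function that takes a Pod and returns a modified copy.
const addContainer = (container) => (pod) => ({
  ...pod,
  spec: { ...pod.spec, containers: [...pod.spec.containers, container] },
})

// Updates an existing environment variable in the named container.
// Variables that don't exist yet are not added in this sketch.
const setEnv = (containerName, name, value) => (pod) => ({
  ...pod,
  spec: {
    ...pod.spec,
    containers: pod.spec.containers.map((container) =>
      container.name !== containerName
        ? container
        : {
            ...container,
            env: (container.env || []).map((entry) =>
              entry.name === name ? { ...entry, value } : entry
            ),
          }
    ),
  },
})

// Applies the patches from left to right.
const render = (resource, patches) =>
  patches.reduce((current, patch) => patch(current), resource)

const pod = {
  apiVersion: 'v1',
  kind: 'Pod',
  metadata: { name: 'test-pod' },
  spec: {
    containers: [
      {
        name: 'test-container',
        image: 'registry.k8s.io/busybox',
        env: [{ name: 'DB_URL', value: 'postgres://db_url:5432' }],
      },
    ],
  },
}

const prodPatches = [
  addContainer({ name: 'proxy-container', image: 'envoyproxy/envoy:v1.12.2' }),
]

// The dev list reuses the prod patches and adds its own on top.
const devPatches = [
  ...prodPatches,
  setEnv('test-container', 'DB_URL', 'postgres://dev:5432'),
]

const prod = render(pod, prodPatches)
const dev = render(pod, devPatches)

console.log(JSON.stringify(dev, null, 2))
```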

You can imagine that you can have more folders and more kustomization.yaml to match your clusters and environments.

Can you also edit fields without having to create a folder, a kustomization.yaml and the extra YAML?

Kustomize's support for inline editing fields is limited and discouraged.

Kustomize is designed on purpose to make it hard to change values through the command line.

You have to write YAML files to change even a single value — unlike yq.

Is there an alternative to Kustomize and yq that is flexible and structured?

Generating resource manifests with code

You could generate YAML programmatically with code.

And you don't even need to start from scratch.

Kubernetes has several official libraries where you can create objects such as Deployments and Pods with code.

As an example, let's have a look at how you can use JavaScript to generate a Kubernetes Pod definition.

You will find more examples in Go, Java, Python and C# later on.

pod.js

const { Pod, Container } = require('kubernetes-models/v1')

const pod = new Pod({
  metadata: {
    name: 'test-pod',
  },
  spec: {
    containers: [
      new Container({
        name: 'test-container',
        image: 'registry.k8s.io/busybox',
        env: [{ name: 'ENV', value: 'production' }],
      }),
    ],
  },
})

// Any valid JSON is also valid YAML
const json = JSON.stringify(pod, null, 2)

console.log(json)

You can execute the script with the node binary:

bash

node pod.js
{
  "metadata": {
    "name": "test-pod"
  },
  "spec": {
    "containers": [
      {
        "name": "test-container",
        "image": "registry.k8s.io/busybox",
        "env": [
          {
            "name": "ENV",
            "value": "production"
          }
        ]
      }
    ]
  },
  "apiVersion": "v1",
  "kind": "Pod"
}

The output is a Pod definition in JSON.

That shouldn't be a problem because:

  1. YAML is a superset of JSON. Any JSON file is also a valid YAML file
  2. The Kubernetes API receives the resource in JSON even if you write YAML files. kubectl serialises the YAML into JSON

What if you want to change the environment variable?

Since this is just code, you can use native constructs:

pod.js

const { Pod, Container } = require('kubernetes-models/v1')

function createPod(environment = 'production') {
  return new Pod({
    metadata: {
      name: 'test-pod',
    },
    spec: {
      containers: [
        new Container({
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'ENV', value: environment }],
        }),
      ],
    },
  })
}

const pod = createPod('dev')

// Any valid JSON is also valid YAML
const json = JSON.stringify(pod, null, 2)

console.log(json)

The code above uses a function and an argument to customise the environment variables.

You can execute the script again with:

bash

node pod.js
{
  "metadata": {
    "name": "test-pod"
  },
  "spec": {
    "containers": [
      {
        "name": "test-container",
        "image": "registry.k8s.io/busybox",
        "env": [
          {
            "name": "ENV",
            "value": "dev"
          }
        ]
      }
    ]
  },
  "apiVersion": "v1",
  "kind": "Pod"
}

You could save the above output in a file named pod.json and then create the Pod in the cluster with kubectl:

bash

kubectl apply -f pod.json

It works!

You could also skip kubectl altogether and submit the JSON to your cluster directly.

Using the official JavaScript library, you could have the following code:

pod.js

const { Pod, Container } = require('kubernetes-models/v1')
const k8s = require('@kubernetes/client-node')
const kc = new k8s.KubeConfig()

// Using the default credentials for kubectl
kc.loadFromDefault()
const k8sApi = kc.makeApiClient(k8s.CoreV1Api)

function createPod(environment = 'production') {
  return new Pod({
    metadata: {
      name: 'test-pod',
    },
    spec: {
      containers: [
        new Container({
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'ENV', value: environment }],
        }),
      ],
    },
  })
}

const pod = createPod('dev')

k8sApi.createNamespacedPod('default', pod).then(() => console.log('success'))

Writing resource definitions for objects such as Deployments, Services, StatefulSets, etc. with code is convenient.

  1. You don't need to come up with a way to replace values.
  2. You don't need to learn YAML.
  3. You can leverage functions, string concatenations and many other features that are already available as part of the language.
  4. If your language of choice supports types, you can use IntelliSense to create resources.
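
As a sketch of the first and third points, the snippet below generates one Pod per environment with an ordinary function and a loop. It uses plain objects instead of a client library so that it runs without dependencies; the databases mapping and its URLs are placeholder values made up for this example.

```javascript
// Hypothetical mapping of environments to database URLs (placeholder values).
const databases = {
  dev: 'postgres://dev:5432',
  staging: 'postgres://staging:5432',
  prod: 'postgres://prod:5432',
}

// Builds a Pod as a plain object.
function createPod(environment, dbUrl) {
  return {
    apiVersion: 'v1',
    kind: 'Pod',
    metadata: {
      name: 'test-pod',
      labels: { environment },
    },
    spec: {
      containers: [
        {
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'DB_URL', value: dbUrl }],
        },
      ],
    },
  }
}

// One manifest per environment, keyed by the environment name.
const manifests = Object.fromEntries(
  Object.entries(databases).map(([environment, dbUrl]) => [
    environment,
    createPod(environment, dbUrl),
  ])
)

console.log(JSON.stringify(manifests.dev, null, 2))
```

Each manifest could then be saved to its own file and applied to the matching cluster with kubectl apply -f.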

However, despite these advantages, generating resources with code is not as common as templating YAML.

You can find the above example translated into Java, Go, Python and C# in this repository.

Why not Helm?

Helm is a package manager, release manager and a templating engine.

So you could use Helm to template the same Pod.

pod-template.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: ENV
      value: {{ .Values.environment_name }}

The template cannot live in isolation and should be placed in a directory that has a specific structure — a Helm chart.

bash

tree
.
├── Chart.yaml
├── templates
│   └── pod-template.yaml
└── values.yaml

The values.yaml file contains all the customisable fields:

values.yaml

environment_name: production

You can render the template with:

bash

helm template .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    env:
    - name: ENV
      value: production

Please note that, unless a parameter is listed in the values.yaml, it cannot be changed.

In the example above, you can't customise the name of the container or the name of the Pod.

If you want to do so, you should introduce more variables such as {{ .Values.image_name }} and {{ .Values.pod_name }} and add them to the values.yaml.

The bottom line is unless it's wrapped into {{ }}, you cannot change any value.

Also, Helm doesn't really understand YAML.

Helm uses the Go templating engine which only replaces values.

Hence, you could generate invalid YAML with Helm.
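
The failure mode can be reproduced in a few lines of JavaScript. The sketch below uses JSON as a stand-in for YAML, because Node.js ships with a JSON parser and any valid JSON is also valid YAML; it illustrates string interpolation in general and is not Helm's code. The value of environment is made up for the example.

```javascript
// A value that contains characters with a special meaning in the output format.
const environment = 'prod "eu-west"'

// 1. String interpolation: the value is pasted into the text without escaping.
const interpolated = `{"name": "ENV", "value": "${environment}"}`

// 2. Structured generation: the serialiser escapes the value.
const serialised = JSON.stringify({ name: 'ENV', value: environment })

// Returns true when the document can be parsed.
function isValid(document) {
  try {
    JSON.parse(document)
    return true
  } catch (error) {
    return false
  }
}

console.log(isValid(interpolated)) // false
console.log(isValid(serialised)) // true
```

A template engine that only replaces strings has no way to know that the quotes in the value need escaping, whereas a serialiser that works on the structure handles it for you.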

Helm is usually a popular choice because you can share and discover charts — a collection of Kubernetes resources.

But, when it comes to templating, it's a poor choice.

Other configuration tools

Many other tools are designed to augment or replace YAML in Kubernetes.

The following list has some of the more interesting approaches:

  1. Cue is a configuration language that doesn't limit itself to Kubernetes. Instead, it can generate the configuration for APIs, database schemas, etc.
  2. jk is a data templating tool designed to help writing structured configuration files.
  3. jsonnet is a data templating language similar to Cue.
  4. Dhall is a programmable configuration language.
  5. Skycfg is an extension library for the Starlark language that adds support for constructing Protocol Buffer messages (and hence Kubernetes resources).

The bottom line is that all of the above tools require you to learn one more language (or DSL) to handle configuration.

If you have to introduce a new language, why not use a real language that perhaps you already use?

Summary

When you manage multiple environments and multiple teams, it's natural to look for strategies to parameterise your deployments.

Templating your Kubernetes definitions is the next logical choice to avoid repeating yourself and to standardise your practices.

There are several options to template YAML, and some of them treat it as a string.

You should avoid tools that don't understand YAML because they require extra care on things such as indentation, escaping, etc.

Instead, you should look for tools that can mangle YAML such as yq or kustomize.

The other option at your disposal is to use your programming language of choice to create the objects and then serialise them into YAML or JSON.

You can find a few examples on how to create Kubernetes YAML in Java, Go, Python and C# in this repository.

That's all folks!
