Templating YAML in Kubernetes with real code
May 2020
TL;DR: You should use tools such as yq and kustomize to template YAML resources instead of relying on tools that interpolate strings such as Helm.
If you're working on large-scale projects, you should consider using real code. You can find hands-on examples on how to programmatically generate Kubernetes resources in Java, Go, JavaScript, C# and Python in this repository.
Contents:
- Introduction: managing YAML files
- Search and replace
- Templating with yq
- Templating with Kustomize
- Generating resource manifests with code
- Why not Helm?
- Other configuration tools
- Summary
Introduction: managing YAML files
When you have multiple Kubernetes clusters, it's common to have resources that can be applied to all environments but with small modifications. As an example, when an app runs in the staging environment, it should connect to the staging database.
However, in production, it should connect to the production database.
In Kubernetes, you can use an environment variable to inject the correct database URL.
The following Pod is an example:
pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://db_url:5432
Since the value postgres://db_url:5432 is hardcoded in the YAML definition, there's no easy way to deploy the same Pod in multiple environments such as dev, staging and production.
You could create a Pod YAML definition for each of the environments you plan to deploy to.
bash
tree .
kube/
├── deployment-prod.yaml
├── deployment-staging.yaml
└── deployment-dev.yaml
Unfortunately, having several copies of the same file with minor modifications has its challenges.
If you update the name of the image or the version, you have to amend all of the remaining files.
Using templates with search and replace
A better strategy is to have a placeholder and replace it with the real value before the YAML is submitted to the cluster: search and replace.
If you're familiar with bash, you can implement it with a few lines of sed.
Your Pod should contain a placeholder like this:
pod_template.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: ENV
          value: %ENV_NAME%
And you could run the following command to replace the value of the environment on the fly.
bash
sed 's/%ENV_NAME%/production/g' \
  pod_template.yaml > pod_production.yaml
However, tagging all the templates and injecting placeholders is hard work.
What if you want to change a value that isn't a placeholder?
What if you have too many placeholders? What does the sed command look like then?
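When the number of placeholders grows, the sed one-liner turns into a long chain of expressions. The following is a minimal sketch of the same search and replace in JavaScript, assuming the placeholders follow the %NAME% convention used above. The renderTemplate function is a hypothetical helper written for this illustration, not part of any tool mentioned in this article.

```javascript
// Hypothetical helper: replaces every %NAME% placeholder with values[NAME].
// It treats the template as a plain string, exactly like sed does.
function renderTemplate(template, values) {
  return template.replace(/%([A-Z_]+)%/g, (match, name) => {
    if (!(name in values)) {
      // Fail loudly instead of leaving a placeholder in the output.
      throw new Error(`No value provided for placeholder ${match}`)
    }
    return values[name]
  })
}

// A fragment of the Pod template, kept short for the example.
const template = [
  'env:',
  '  - name: ENV',
  '    value: %ENV_NAME%',
].join('\n')

const rendered = renderTemplate(template, { ENV_NAME: 'production' })
console.log(rendered)
```

The sketch has the same weakness as sed: it only knows about strings, so it cannot tell whether the result is still valid YAML.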
If you have only a handful of variables that you wish to change, you might want to install yq, a command-line tool designed to transform YAML.
yq is similar to another, more popular tool called jq that focuses on JSON instead of YAML.
You can install yq on macOS with:
bash
brew install yq
On Linux with:
bash
sudo add-apt-repository ppa:rmescandon/yq
sudo apt-get install yq
In case you don't have the add-apt-repository command installed, you can install it with apt-get install software-properties-common.
If you're on Windows, you can download the executable from GitHub.
Templating with yq
yq takes a YAML file as input and can:
- read values from the file
- add new values
- update existing values
- generate new YAML files
- convert YAML into JSON
- merge two or more YAML files
Let's have a look at the same Pod definition:
pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://db_url:5432
You could read the value for the environment variable DB_URL with:
bash
yq r pod.yaml "spec.containers[0].env[0].value"
postgres://db_url:5432
The command works as follows:
- yq r is the command to read a value from the YAML file.
- pod.yaml is the file path of the YAML that you want to read.
- spec.containers[0].env[0].value is the query path.
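To make the idea of a query path more concrete, here is a minimal sketch of how such a path could be resolved against a parsed document. It assumes the YAML has already been parsed into a plain JavaScript object. The readPath function is a hypothetical helper for this illustration and says nothing about how yq is implemented.

```javascript
// Hypothetical helper: resolves a path such as "spec.containers[0].env[0].value"
// against a plain JavaScript object. Returns undefined when a segment is missing.
function readPath(object, path) {
  // Split "a.b[0].c" into ["a", "b", "0", "c"].
  const segments = path.split(/[.\[\]]/).filter((segment) => segment !== '')
  return segments.reduce(
    (current, segment) => (current == null ? undefined : current[segment]),
    object,
  )
}

// The Pod from pod.yaml, written as a plain object.
const pod = {
  apiVersion: 'v1',
  kind: 'Pod',
  metadata: { name: 'test-pod' },
  spec: {
    containers: [
      {
        name: 'test-container',
        image: 'registry.k8s.io/busybox',
        env: [{ name: 'DB_URL', value: 'postgres://db_url:5432' }],
      },
    ],
  },
}

console.log(readPath(pod, 'spec.containers[0].env[0].value'))
// prints "postgres://db_url:5432"
```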
What if you want to change the value instead?
Perhaps you want to deploy to the production environment and change the URL to the production database.
You can use the following command:
bash
yq w pod.yaml "spec.containers[0].env[0].value" "postgres://prod:5432"
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://prod:5432
You should notice that yq printed the result on the standard output.
If you prefer to edit the YAML in place, you should add the -i flag.
The difference between yq and sed is that the former understands the YAML format and can navigate and manipulate the structured markup.
On the other hand, sed treats files as strings and doesn't mind if the file isn't valid YAML.
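The difference between a structured edit and a string replacement can be sketched in a few lines of JavaScript. The example below assumes the YAML has been parsed into a plain object. The writePath function and the note annotation are hypothetical and exist only for this illustration.

```javascript
// Hypothetical helper: returns a copy of `object` with the value at `path` replaced.
// Throws a TypeError if an intermediate segment of the path does not exist.
function writePath(object, path, value) {
  const segments = path.split(/[.\[\]]/).filter((segment) => segment !== '')
  // Deep copy through JSON so the input is left untouched (fine for plain data).
  const copy = JSON.parse(JSON.stringify(object))
  const last = segments.pop()
  const parent = segments.reduce((current, segment) => current[segment], copy)
  parent[last] = value
  return copy
}

const pod = {
  metadata: {
    name: 'test-pod',
    // Assumed annotation that happens to contain the same URL as the env variable.
    annotations: { note: 'default database is postgres://db_url:5432' },
  },
  spec: {
    containers: [
      {
        name: 'test-container',
        env: [{ name: 'DB_URL', value: 'postgres://db_url:5432' }],
      },
    ],
  },
}

const production = writePath(
  pod,
  'spec.containers[0].env[0].value',
  'postgres://prod:5432',
)

console.log(production.spec.containers[0].env[0].value) // prints "postgres://prod:5432"
console.log(production.metadata.annotations.note) // the annotation is left as it was
```

A global string replacement of db_url would have changed the annotation as well, because it has no notion of which field it is editing.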
Since yq understands YAML, let's explore a few more complex scenarios.
Merging YAML files
Let's assume that you want to inject an extra container into all the Pods submitted to the cluster.
But instead of using an Admission Webhook, you decide to add an extra command in your deployment script.
You could save the YAML configuration for the extra container as a YAML file:
envoy-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: envoy-pod
spec:
  containers:
    - name: proxy-container
      image: envoyproxy/envoy:v1.12.2
      ports:
        - containerPort: 80
Assuming that you have a Pod like this:
pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://db_url:5432
You can execute the following command and merge the two YAMLs:
bash
yq m -a append pod.yaml envoy-pod.yaml
Please notice the -a append flag that is necessary to append values to an array. You can find more details in the official documentation.
The output should have a proxy named Envoy as an additional container:
container-snippet.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://db_url:5432
    - name: proxy-container
      image: envoyproxy/envoy:v1.12.2
      ports:
        - containerPort: 80
Please note that yq sorts the YAML fields in the output alphabetically, so the order of fields in your output could be different from the above listing.
In other words, the two YAML files are merged into one.
While the example shows a useful strategy to compose complex YAML from basic files, it also shows some of the limitations of yq:
- The two YAML files are merged at the top level. For instance, you cannot add a chunk of YAML under .spec.containers[].
- The order of the files matters. If you invert the order, yq keeps envoy-pod for the Pod's name in metadata.name.
- You have to tell yq explicitly when to append and when to overwrite values. Since those flags apply to the whole document, it's hard to get the granularity right.
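The limitations above are easier to see in a small sketch of a document-wide merge. The mergeAppend function below is a hypothetical helper, not yq's implementation. It assumes one global rule for the whole document: arrays are always appended, and for scalar values the first document wins.

```javascript
// Hypothetical helper: recursively merges `source` into a copy of `target`.
// Arrays are concatenated, and for scalars the value already present in
// `target` wins, so the order of the arguments matters.
function mergeAppend(target, source) {
  if (Array.isArray(target) && Array.isArray(source)) {
    return [...target, ...source]
  }
  if (isPlainObject(target) && isPlainObject(source)) {
    const result = { ...target }
    for (const [key, value] of Object.entries(source)) {
      result[key] = key in target ? mergeAppend(target[key], value) : value
    }
    return result
  }
  return target
}

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value)
}

const pod = {
  metadata: { name: 'test-pod' },
  spec: { containers: [{ name: 'test-container' }] },
}
const envoyPod = {
  metadata: { name: 'envoy-pod' },
  spec: { containers: [{ name: 'proxy-container' }] },
}

const merged = mergeAppend(pod, envoyPod)
const inverted = mergeAppend(envoyPod, pod)

console.log(merged.metadata.name) // prints "test-pod"
console.log(inverted.metadata.name) // prints "envoy-pod"
console.log(merged.spec.containers.length) // prints 2
```

Because the append rule applies to every array in the document, there is no way in this sketch to append containers but overwrite another list, which is the granularity problem described above.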
However, if you plan to use yq for small projects, you can probably go quite far with it.
There's another tool similar to yq, but explicitly focused on Kubernetes YAML resources: kustomize.
While yq understands and transforms YAML, kustomize can understand and transform Kubernetes YAML.
That's a subtle but essential difference, so you will explore it next.
Templating with Kustomize
Kustomize is a command-line tool that can create and transform YAML files, just like yq.
However, instead of using only the command line, kustomize uses a file called kustomization.yaml to decide how to template the YAML.
Let's have a look at how it works.
All the files should be created in a separate folder:
bash
mkdir prod
cd prod
You will use the same Pod as before:
pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: DB_URL
          value: postgres://db_url:5432
You can save the file as pod.yaml in the prod directory.
In the same directory, you should also create a kustomization.yaml file:
kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - pod.yaml
You can then execute the kustomize command by passing it the directory where your kustomization.yaml file resides as an argument.
Please note that the . (dot) in the next command is the prod directory.
bash
kubectl kustomize .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - env:
    - name: DB_URL
      value: postgres://db_url:5432
    image: registry.k8s.io/busybox
    name: test-container
There's no change in the output since you haven't applied any kustomization yet.
You can define a patch that should be applied to the Pod.
The patch is defined like this:
pod-patch.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: proxy-container
      image: envoyproxy/envoy:v1.12.2
      ports:
        - containerPort: 80
You should save the file in the same directory as pod-patch.yaml.
Please notice that the name of the resource has to match the metadata.name in pod.yaml.
And you should update your kustomization.yaml to include the following lines:
kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - pod.yaml
patchesStrategicMerge:
  - pod-patch.yaml
If you rerun the command, the output should be a Pod with two containers:
bash
kubectl kustomize .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - image: envoyproxy/envoy:v1.12.2
    name: proxy-container
    ports:
    - containerPort: 80
  - env:
    - name: DB_URL
      value: postgres://db_url:5432
    image: registry.k8s.io/busybox
    name: test-container
The kustomize patch functionality works similarly to yq merge, but the setup for kustomize is more tedious.
Also, kustomize merges the two YAML files only when metadata.name is the same in both files.
It's safer, but is it enough to justify using kustomize in favour of yq?
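The matching rule is what makes the patch safer than a blind merge. As a rough sketch of the idea, assuming resources and patches are plain objects, a patch is only paired with a resource of the same kind and name. The findPatchTarget function is hypothetical and is not how kustomize is implemented.

```javascript
// Hypothetical helper: finds the resource that a patch targets.
// A patch only applies when kind and metadata.name both match.
function findPatchTarget(resources, patch) {
  return resources.find(
    (resource) =>
      resource.kind === patch.kind &&
      resource.metadata.name === patch.metadata.name,
  )
}

const resources = [{ kind: 'Pod', metadata: { name: 'test-pod' } }]

const matchingPatch = { kind: 'Pod', metadata: { name: 'test-pod' } }
const strayPatch = { kind: 'Pod', metadata: { name: 'envoy-pod' } }

console.log(findPatchTarget(resources, matchingPatch) !== undefined) // prints true
console.log(findPatchTarget(resources, strayPatch) !== undefined) // prints false
```

With yq merge, the second file in the earlier example was merged even though its name was envoy-pod. In this sketch, the same file would find no target.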
Kustomize is designed to map changes and resources in code.
You should create another folder at the same level as the previous one:
bash
cd ..
mkdir dev
cd dev
tree ..
├── dev
└── prod
    ├── kustomization.yaml
    ├── pod-patch.yaml
    └── pod.yaml
You can create another kustomization.yaml:
kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
  - ../prod
patchesStrategicMerge:
  - pod-patch-env.yaml
The new configuration extends the base configuration in the prod directory, so there's no need to recreate all patches in the dev directory.
Instead, you can create a single patch in the dev directory that changes the value of the environment variable:
pod-patch-env.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      env:
        - name: DB_URL
          value: postgres://dev:5432
The final directory structure is the following:
bash
tree .
├── dev
│   ├── kustomization.yaml
│   └── pod-patch-env.yaml
└── prod
    ├── kustomization.yaml
    ├── pod-patch.yaml
    └── pod.yaml
If you run kustomize this time, the output will be different:
bash
kubectl kustomize dev
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - image: envoyproxy/envoy:v1.12.2
    name: proxy-container
    ports:
    - containerPort: 80
  - env:
    - name: DB_URL
      value: postgres://dev:5432
    image: registry.k8s.io/busybox
    name: test-container
Please notice how:
- you still have the extra container, which is patched in the base kustomization.yaml file
- the environment variable DB_URL was changed to postgres://dev:5432
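The result above suggests that list items are matched by their name rather than appended blindly. The following is a simplified sketch of that behaviour, assuming containers and environment variables are plain objects identified by a name field. The mergeByName function is hypothetical. It illustrates the idea only and is not the algorithm that kustomize uses.

```javascript
// Hypothetical helper: merges two lists of named items (containers, env variables).
// Items with the same `name` are merged field by field; other items are appended.
function mergeByName(baseItems, patchItems) {
  const result = baseItems.map((item) => ({ ...item }))
  for (const patch of patchItems) {
    // Items without a name cannot be matched, so they are always appended.
    const existing =
      patch.name === undefined
        ? undefined
        : result.find((item) => item.name === patch.name)
    if (existing === undefined) {
      result.push({ ...patch })
      continue
    }
    for (const [key, value] of Object.entries(patch)) {
      // Nested lists of named items (such as env) are merged the same way.
      existing[key] =
        Array.isArray(value) && Array.isArray(existing[key])
          ? mergeByName(existing[key], value)
          : value
    }
  }
  return result
}

// Containers after the base patch was applied in the prod directory.
const baseContainers = [
  { name: 'proxy-container', image: 'envoyproxy/envoy:v1.12.2' },
  {
    name: 'test-container',
    image: 'registry.k8s.io/busybox',
    env: [{ name: 'DB_URL', value: 'postgres://db_url:5432' }],
  },
]

// The containers section of pod-patch-env.yaml.
const devPatch = [
  { name: 'test-container', env: [{ name: 'DB_URL', value: 'postgres://dev:5432' }] },
]

const devContainers = mergeByName(baseContainers, devPatch)
console.log(devContainers.length) // prints 2
console.log(devContainers[1].env[0].value) // prints "postgres://dev:5432"
```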
You can imagine that you can have more folders and more kustomization.yaml files to match your clusters and environments.
Can you also edit fields without having to create a folder, a kustomization.yaml and the extra YAML?
Kustomize's support for editing fields inline is limited and discouraged.
Kustomize is designed on purpose to make it hard to change values through the command line.
You have to write YAML files to change even a single value, unlike yq.
Is there an alternative to Kustomize and yq that is flexible and structured?
Generating resource manifests with code
You could generate YAML programmatically with code.
And you don't even need to start from scratch.
Kubernetes has several official libraries where you can create objects such as Deployments and Pods with code.
As an example, let's have a look at how you can use JavaScript to generate a Kubernetes Pod definition.
You will find more examples in Go, Java, Python and C# later on.
pod.js
const { Pod, Container } = require('kubernetes-models/v1')

const pod = new Pod({
  metadata: {
    name: 'test-pod',
  },
  spec: {
    containers: [
      new Container({
        name: 'test-container',
        image: 'registry.k8s.io/busybox',
        env: [{ name: 'ENV', value: 'production' }],
      }),
    ],
  },
})

// Any valid JSON is also valid YAML
const json = JSON.stringify(pod, null, 2)
console.log(json)
You can execute the script with the node binary:
bash
node pod.js
{
  "metadata": {
    "name": "test-pod"
  },
  "spec": {
    "containers": [
      {
        "name": "test-container",
        "image": "registry.k8s.io/busybox",
        "env": [
          {
            "name": "ENV",
            "value": "production"
          }
        ]
      }
    ]
  },
  "apiVersion": "v1",
  "kind": "Pod"
}
The output is a Pod definition in JSON.
That shouldn't be a problem because:
- YAML is a superset of JSON. Any JSON file is also a valid YAML file
- The Kubernetes API receives the resource in JSON even if you write YAML files. kubectl serialises the YAML into JSON
What if you want to change the environment variable?
Since this is just code, you can use native constructs:
pod.js
const { Pod, Container } = require('kubernetes-models/v1')

function createPod(environment = 'production') {
  return new Pod({
    metadata: {
      name: 'test-pod',
    },
    spec: {
      containers: [
        new Container({
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'ENV', value: environment }],
        }),
      ],
    },
  })
}

const pod = createPod('dev')

// Any valid JSON is also valid YAML
const json = JSON.stringify(pod, null, 2)
console.log(json)
The code above uses a function and an argument to customise the environment variables.
You can execute the script again with:
bash
node pod.js
{
  "metadata": {
    "name": "test-pod"
  },
  "spec": {
    "containers": [
      {
        "name": "test-container",
        "image": "registry.k8s.io/busybox",
        "env": [
          {
            "name": "ENV",
            "value": "dev"
          }
        ]
      }
    ]
  },
  "apiVersion": "v1",
  "kind": "Pod"
}
You could save the above output in a file named pod.json and then create the Pod in the cluster with kubectl:
bash
kubectl apply -f pod.json
It works!
You could also skip kubectl altogether and submit the JSON to your cluster directly.
Using the official JavaScript library, you could have the following code:
pod.js
const { Pod, Container } = require('kubernetes-models/v1')
const k8s = require('@kubernetes/client-node')

const kc = new k8s.KubeConfig()
// Using the default credentials for kubectl
kc.loadFromDefault()
const k8sApi = kc.makeApiClient(k8s.CoreV1Api)

function createPod(environment = 'production') {
  return new Pod({
    metadata: {
      name: 'test-pod',
    },
    spec: {
      containers: [
        new Container({
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'ENV', value: environment }],
        }),
      ],
    },
  })
}

const pod = createPod('dev')

// Submit the Pod to the default namespace
k8sApi.createNamespacedPod('default', pod).then(() => console.log('success'))
Writing resource definitions for objects such as Deployments, Services, StatefulSets, etc. with code is convenient:
- You don't need to come up with a way to replace values.
- You don't need to learn YAML.
- You can leverage functions, string concatenations and many other features that are already available as part of the language.
- If your language of choice supports types, you can use IntelliSense to create resources.
However, despite the advantages, this approach is not as common as templating.
You can find the above example translated into Java, Go, Python and C# in this repository.
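To show how ordinary language features replace templating, here is a minimal sketch that generates one manifest per environment with a loop. It builds the Pod as a plain object so that it runs without any library. The databaseUrls map, the staging URL, the environment label and the file names are assumptions made for this illustration.

```javascript
// Assumed, illustrative database URLs for each environment.
const databaseUrls = {
  dev: 'postgres://dev:5432',
  staging: 'postgres://staging:5432',
  production: 'postgres://prod:5432',
}

// Builds a Pod as a plain object; no library is needed for this sketch.
function createPod(environment) {
  return {
    apiVersion: 'v1',
    kind: 'Pod',
    metadata: { name: 'test-pod', labels: { environment } },
    spec: {
      containers: [
        {
          name: 'test-container',
          image: 'registry.k8s.io/busybox',
          env: [{ name: 'DB_URL', value: databaseUrls[environment] }],
        },
      ],
    },
  }
}

// One manifest per environment, generated by an ordinary map.
const manifests = Object.keys(databaseUrls).map((environment) => ({
  fileName: `pod-${environment}.json`,
  content: JSON.stringify(createPod(environment), null, 2),
}))

for (const { fileName } of manifests) {
  console.log(fileName)
}
```

Each content string could be written to disk and applied with kubectl apply -f, in the same way as pod.json earlier.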
Why not Helm?
Helm is a package manager, release manager and a templating engine.
So you could use Helm to template the same Pod.
pod-template.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: ENV
          value: {{ .Values.environment_name }}
The template cannot live in isolation and should be placed in a directory that has a specific structure — a Helm chart.
bash
tree
.
├── Chart.yaml
├── templates
│   └── pod-template.yaml
└── values.yaml
The values.yaml file contains all the customisable fields:
values.yaml
environment_name: production
You can render the template with:
bash
helm template .
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
    - name: test-container
      image: registry.k8s.io/busybox
      env:
        - name: ENV
          value: production
Please note that, unless a parameter is listed in the values.yaml, it cannot be changed.
In the example above, you can't customise the name of the container or the name of the Pod.
If you want to do so, you should introduce more variables such as {{ .Values.image_name }} and {{ .Values.pod_name }} and add them to the values.yaml.
The bottom line is that unless a value is wrapped in {{ }}, you cannot change it.
Also, Helm doesn't really understand YAML.
Helm uses the Go templating engine which only replaces values.
Hence, you could generate invalid YAML with Helm.
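The risk of replacing values without understanding the format can be sketched in a few lines. The example below mimics string interpolation in general with a JavaScript template literal and is not Helm's actual engine. The environmentName value is a contrived assumption chosen to trigger the problem.

```javascript
// A contrived value that happens to contain a line break and YAML syntax.
const environmentName = 'production\n    - name: INJECTED'

// String interpolation: the value is pasted in without any awareness of YAML.
const interpolated = `env:\n  - name: ENV\n    value: ${environmentName}`

// Structured serialisation: the value is escaped and stays a single string.
const serialised = JSON.stringify({
  env: [{ name: 'ENV', value: environmentName }],
})

console.log(interpolated.split('\n').length) // prints 4, one more line than the template has
console.log(JSON.parse(serialised).env.length) // prints 1, still a single environment variable
```

The interpolated text no longer has the structure that the template intended, while the serialised version keeps the value intact.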
Helm is a popular choice because you can share and discover charts, which are collections of Kubernetes resources.
But when it comes to templating, it's a poor choice.
Other configuration tools
Many other tools are designed to augment or replace YAML in Kubernetes.
The following list has some of the more interesting approaches:
- Cue is a configuration language that doesn't limit itself to Kubernetes. Instead, it can generate the configuration for APIs, database schemas, etc.
- jk is a data templating tool designed to help you write structured configuration files.
- jsonnet is a data templating language similar to Cue.
- Dhall is a programmable configuration language.
- Skycfg is an extension library for the Starlark language that adds support for constructing Protocol Buffer messages (and hence Kubernetes resources).
The bottom line is that all of the above tools require you to learn one more language (or DSL) to handle configuration.
If you have to introduce a new language, why not use a real language that perhaps you already use?
Summary
When you manage multiple environments and multiple teams, it's natural to look for strategies to parameterise your deployments.
Templating your Kubernetes definitions is the next logical step to avoid repeating yourself and to standardise your practices.
There are several options to template YAML, and some of them treat it as a string.
You should avoid tools that don't understand YAML because they require extra care with things such as indentation, escaping, etc.
Instead, you should look for tools that can manipulate YAML such as yq or kustomize.
The other option at your disposal is to use your programming language of choice to create the objects and then serialise them into YAML or JSON.
You can find a few examples on how to create Kubernetes YAML in Java, Go, Python and C# in this repository.
That's all folks!
Special thanks go to:
- Daniel Weibel who offered excellent feedback on this article and contributed the Go and Python translations of the example.
- Salman Iqbal who translated the snippets into C#.
- Mauricio Salaboy Salatino who translated the snippets into Java.