We save thousands every month by using incremental snapshots with our Velero backups

With a large enough number of Kubernetes workloads and volumes, the cost of your backups can easily get out of control. Enabling incremental snapshots as part of your Velero backups can have a huge impact on the cost of your backup strategy.

Daniel Jimenez Garcia
3 min read · Aug 10, 2023

We use Kubernetes in Azure’s AKS as the basis of our development platform, and we have adopted tools such as Velero in order to make it easier for teams and services to follow best practices and be production ready. In the particular case of Velero, it allows us to ensure a consistent backup strategy that even allows for optional per-service/per-cluster customization.

First some context…

Typically, we set up a minimum of 3 Velero Backup Schedules on each Kubernetes cluster:

  • a weekly one
  • a nightly one
  • a frequent one, set to a configurable frequency in minutes/hours.

There can be some customization done per cluster, but that’s roughly the baseline number of schedules. Depending on how critical a particular workload is, it will be included in some (or all) of these strategies. Any stateful workload that relies on Kubernetes Persistent Volumes (PV) will have the underlying Azure Disk backed up by Velero via Azure Disk Snapshots.
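As a rough illustration, the three baseline schedules could be declared with Velero Schedule resources like the ones below. This is a minimal sketch and not our exact setup: the schedule names, cron expressions, retention (ttl) values and the backup-* labels are all assumptions made for the example, and it assumes each workload's resources carry the label of every schedule they should be part of.

```yaml
# Hypothetical weekly schedule: Sundays at 02:00, kept for 7 days.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: weekly
  namespace: velero
spec:
  schedule: "0 2 * * 0"
  template:
    ttl: 168h0m0s
    snapshotVolumes: true
    labelSelector:
      matchLabels:
        backup-weekly: "true"     # assumed label, not a Velero built-in
---
# Hypothetical nightly schedule: every day at 01:00, kept for 7 days.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: nightly
  namespace: velero
spec:
  schedule: "0 1 * * *"
  template:
    ttl: 168h0m0s
    snapshotVolumes: true
    labelSelector:
      matchLabels:
        backup-nightly: "true"    # assumed label, not a Velero built-in
---
# Hypothetical frequent schedule: every 2 hours, kept for 2 days.
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: frequent
  namespace: velero
spec:
  schedule: "0 */2 * * *"
  template:
    ttl: 48h0m0s
    snapshotVolumes: true
    labelSelector:
      matchLabels:
        backup-frequent: "true"   # assumed label, not a Velero built-in
```

A label selector is only one way to decide which workloads go into which schedule. Selecting by namespace through includedNamespaces in the template would work just as well.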

This is costing too much!

In a cluster with a large number of critical workloads that rely on volumes, the number of disk snapshots can grow very quickly! Imagine 40 such workloads, with a couple of PVs each. Let’s say the frequent backups run every 2h and you keep them for 2 days, while you keep the weekly/nightly ones for a week. Running every 2h produces 12 frequent snapshots per day, so at any given time each individual PV has:

  • 24 frequent snapshots
  • 7 nightly snapshots
  • 1 weekly snapshot

That’s 32 snapshots per volume. Considering 40 workloads with 2 volumes each, that’s 40*2*32 = 2,560 disk snapshots!

When each one of these snapshots is a full disk snapshot, their combined daily costs can be significant:

Daily costs of the snapshots created by the backup strategies when using full disk snapshots

We are burning ~174$ per day, more than 5,200$ per month!

Let’s enable incremental snapshots

Velero allows you to use incremental snapshots in Azure when backing up the Azure disks associated with Kubernetes volumes. All you need to do is enable the incremental flag in the config of your Velero VolumeSnapshotLocation:

apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: default
  namespace: velero
spec:
  provider: velero.io/azure
  config:
    resourceGroup: my-resource-group
    # Enable incremental snapshots!
    incremental: "true"

Note that full snapshots are taken by default unless incremental is explicitly enabled.

The effect on costs is significant, since you pay only for the data that changed since the last snapshot. This was the effect of enabling incremental snapshots on the same environment whose costs we saw above:

Cost savings after enabling incremental snapshots

From a peak of ~174$ per day, costs plummeted to ~20$ per day. That’s 88.5% savings compared to the cost of the full snapshots!

As mentioned before, the savings depend on the rate of change of your data, i.e. how much data changes between each snapshot. But unless most of your data is constantly changing, you should see significant savings.
Of course, the total amount of these savings depends on the number of volumes and their size. For example, if you have 2 volumes of 4GB each, the cost of 32 full snapshots per volume is roughly 11$ per day and you might save 5-10$ per day.

To top it all off, no changes are needed other than the incremental flag! Backups and restores continued to work like before. It’s all handled behind the scenes by Velero and Azure.
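To illustrate that nothing changes on the restore side, here is a minimal sketch of a Velero Restore resource. The restore name and the backup name are hypothetical (backups created by a schedule are named after the schedule plus a timestamp), and the manifest has the same shape whether the backup used full or incremental snapshots.

```yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: restore-frequent-example       # hypothetical name
  namespace: velero
spec:
  # Hypothetical backup name, as produced by a schedule called "frequent"
  backupName: frequent-20230810120000
  # Recreate the persistent volumes from their disk snapshots
  restorePVs: true
```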

Daniel Jimenez Garcia

Principal engineer @oliverwyman. Author @DotNetCurry and @DorksKaizen. Interested in Vue, Node, Python, .NET, Kubernetes, Terraform, DevOps and cloud