Upgrading A Fleet Of Air-Gapped Openshift Clusters Using Advanced Cluster Management

Shon Paz
Nov 5, 2022 · 7 min read

Introduction

As time goes by, the need for a central management tool for Kubernetes clusters grows. Organizations are deploying more and more clusters across different availability zones, regions, clouds, and edge locations, each with its own set of configurations, versions, and workloads.

This is where managing so many clusters can become painful. Take upgrades as an example: we maintain different versions/releases in different registries, spread across different locations, and need to upgrade each of those clusters separately.

If you ask me, that's treating your clusters like pets instead of cattle.

At Red Hat, we have ACM (Advanced Cluster Management), which is based on the Stolostron upstream project and acts as a single pane of glass for the deployment, upgrades, and governance of thousands of clusters.

Today I’d like to share how you can use ACM and OSUS (Openshift Update Service) in order to distribute an upgrade path to multiple clusters in an air-gapped environment.

When working in a connected environment, our clusters point to a public API that holds this upgrade information, but in air-gapped environments that API isn't accessible, so we need to create our own update service.

Using OSUS and ACM you’ll be able to upgrade thousands of clusters, in parallel, from a single place.

Let the game begin.

Technical Abstract

In this article, we'll upgrade two Openshift clusters (version 4.9.33) to a newer version (4.10.18) based on the eus-4.10 channel.

A channel holds a set of upgradeable Openshift versions and defines a graph of supported upgrade edges, from which the shortest path toward a specific version can be computed. This information is accessible on the Openshift Update Graph website, which can help us plan our upgrade correctly.

Make sure to visit the website and plan your upgrade first!

An example of an upgrade plan from version 4.9.33 to 4.10.18, based on the eus-4.10 channel:
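
For reference, in a connected environment you can pull the same graph data straight from the public update API and inspect it on the command line. A minimal sketch, assuming curl and jq are available (the channel and architecture parameters simply match the ones used in this article):

$ curl -s -H 'Accept: application/json' \
    'https://api.openshift.com/api/upgrades_info/v1/graph?channel=eus-4.10&arch=amd64' \
    | jq '[.nodes[].version] | sort'

The nodes in the response are the versions available in the channel, and the edges describe which upgrade hops are supported.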

Prerequisites

  • Two air-gapped Openshift clusters (4.9.33)

Preparing Installation Media

To prepare all the materials needed for such an upgrade, we'll use the oc-mirror tool, which can mirror all the required images into a single directory that can be carried into the disconnected environment. Using oc-mirror we'll do the following:

  • Mirror both 4.9.33 and 4.10.18 versions
  • Mirror ACM and OSUS operators
  • Build and mirror the upgrade graph that holds the shortest path from 4.9.33 to 4.10.18
  • Generate all the needed configurations (imageContentSourcePolicy, catalogSource, UpdateService)

With a single YAML file, we're able to package our entire installation media and the needed configs into one directory!

To download the relevant oc-mirror version, you can use the following link.
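
If you prefer the command line, a hedged sketch of fetching it from the public clients mirror (the exact path and version directory are assumptions; adjust them to your architecture and release):

$ curl -LO https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.10.18/oc-mirror.tar.gz
$ tar -xzf oc-mirror.tar.gz
$ chmod +x oc-mirror && sudo mv oc-mirror /usr/local/bin/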

We’ll use the following ImageSet configuration:

$ cat > ./oc-mirror-config.yaml <<EOF
apiVersion: mirror.openshift.io/v1alpha2
kind: ImageSetConfiguration
archiveSize: 4
mirror:
  platform:
    graph: true
    channels:
    - name: eus-4.10
      minVersion: 4.9.33
      maxVersion: 4.10.18
      shortestPath: true
  operators:
  - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.9
    packages:
    - name: advanced-cluster-management
      minVersion: '2.6.2'
      maxVersion: '2.6.2'
    - name: multicluster-engine
      minVersion: '2.1.2'
      maxVersion: '2.1.2'
    - name: cincinnati-operator
      minVersion: '5.0.0'
      maxVersion: '5.0.0'
storageConfig:
  local:
    path: ./metadata
EOF

If you need any information on how to choose the right channel, or which versions in a channel are relevant to which package, make sure to visit the following article.

Before you start mirroring, make sure you have your Pull Secret located in the ~/.docker/config.json file, as this is where oc-mirror looks for it. If you don't have it, make sure to create it.
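
A minimal sketch of putting that file in place, assuming you've downloaded pull-secret.json from the Red Hat console and that registry.spaz.local:8443 is the private registry used later in this article:

$ mkdir -p ~/.docker
$ cp pull-secret.json ~/.docker/config.json
# on the air-gapped side, add your private registry credentials to the same file
$ podman login --authfile ~/.docker/config.json registry.spaz.local:8443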

Now let’s start mirroring the needed images to our directory:

$ oc-mirror --config=oc-mirror-config.yaml file://upgrade-from-4.9.33-to-4.10.18
Adding graph data
wrote mirroring manifests to upgrade-from-4.9.33-to-4.10.18/oc-mirror-workspace/operators.1667653366/manifests-redhat-operator-index
To upload local images to a registry, run: oc adm catalog mirror file://redhat/redhat-operator-index:v4.9 REGISTRY/REPOSITORY

This will create the directory, and mirror all the needed images to it.

When it’s done, you can take this directory to your air-gapped environment and mirror it to your private registry.

Mirroring Images To Your Private Registry

Now that you have your oc-mirror directory in your air-gapped environment, mirror the relevant images to your private registry (don't forget the ~/.docker/config.json file once again; this time your Pull Secret should hold the credentials for your private registry):

$ oc-mirror --from upgrade-from-4.9.33-to-4.10.18/ docker://registry.spaz.local:8443/ocp4

In this phase, oc-mirror takes all the images from the directory and mirrors them to our private registry, under the ocp4 repository.
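
If you want to sanity-check that the release images actually landed, a hedged sketch using skopeo (assuming skopeo is installed and your registry CA and credentials are already configured; the repository path matches the ocp4 target used above):

$ skopeo list-tags docker://registry.spaz.local:8443/ocp4/openshift/release-images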

When it’s done, you’ll see that you have all of the needed configurations in the oc-mirror-workspace directory:

$ ls -l oc-mirror-workspace/results-1667592362/
drwxr-xr-x 2 root root     6 Nov  4 16:06 charts
-rwxr-xr-x 1 root root   230 Nov  4 11:18 catalogSource-redhat-operator-index.yaml
-rwxr-xr-x 1 root root   419 Nov  4 16:31 imageContentSourcePolicy.yaml
-rw-r--r-- 1 root root 29114 Nov  4 16:23 mapping.txt
drwxr-xr-x 2 root root    98 Nov  4 16:22 release-signatures
-rwxr-xr-x 1 root root   349 Nov  4 16:23 updateService.yaml

Make sure to apply both imageContentSourcePolicy.yaml and catalogSource-redhat-operator-index.yaml to ALL of the clusters you wish to upgrade with OSUS!
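
For example, a hedged sketch of applying them to each cluster (the kubeconfig context names here are made up; replace them with your hub and managed cluster contexts):

$ for ctx in hub-cluster managed-cluster-1; do
    oc --context "$ctx" apply -f oc-mirror-workspace/results-1667592362/imageContentSourcePolicy.yaml
    oc --context "$ctx" apply -f oc-mirror-workspace/results-1667592362/catalogSource-redhat-operator-index.yaml
  done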

Integrating ACM With OSUS

Before we start configuring OSUS, make sure that you have both the ACM and OSUS operators installed (including the multicluster-engine operator):

Once you have those installed, create your MultiClusterHub instance and wait for it to finish installing. Then, import the clusters you want to manage into ACM using one of the options ACM provides.
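
A minimal sketch of such a MultiClusterHub instance (the open-cluster-management namespace is the operator's usual target namespace; adjust it if you installed the operator elsewhere):

cat <<EOF | oc apply -f -
apiVersion: operator.open-cluster-management.io/v1
kind: MultiClusterHub
metadata:
  name: multiclusterhub
  namespace: open-cluster-management
spec: {}
EOF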

If you switch to the Clusters tab in ACM, you'll see that there's no Upgrade available option, as you're disconnected from the internet and there's no graph API the clusters can reach.

Fortunately, we've used oc-mirror, which built and mirrored the relevant graph image to our private registry, according to our specified versions.

Now, let's create the UpdateService custom resource, which points the Update Service Operator to the graph's location in our private registry:

cat <<EOF | oc apply -f -
apiVersion: updateservice.operator.openshift.io/v1
kind: UpdateService
metadata:
  name: update-service-oc-mirror
  namespace: openshift-update-service
spec:
  graphDataImage: registry.spaz.local:8443/ocp4.10/openshift/graph-image@sha256:b305a92461ac6d2987edd98947960feac409c502b6e85a84803683c0b34c768e
  releases: registry.spaz.local:8443/ocp4.10/openshift/release-images
  replicas: 2
EOF

This custom resource tells OSUS where the graph image sits in our private registry, and where it should pull our release image from.

FYI, there was a bug that caused the UpdateService pods to be OOMKilled when both the release images and the Openshift images sat in the same repository. It was fixed in this PR and is not relevant to the oc-mirror version we use in this article.

Make sure that your OSUS pods are running and ready to be used:

$ oc get pods -n openshift-update-service
NAME                                        READY   STATUS    RESTARTS   AGE
update-service-oc-mirror-659989f887-7n5c6   2/2     Running   0          17h
update-service-oc-mirror-659989f887-zghnw   2/2     Running   0          17h
updateservice-operator-74d995c7d-mdnmm      1/1     Running   0          22h

Note: there's a possibility that the OSUS pods won't come up (only the init container starts). To fix that, make sure you update the updateservice-registry configmap with the entire certificate chain, starting from the root CA down to the registry certificate (which should be the last in the chain).
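
A hedged sketch of providing that trust, following the additional-trusted-CA approach (the configmap name my-registry-ca and the bundle file full-chain.pem are made up; updateservice-registry is the key OSUS reads, and your environment may already have an equivalent configmap to edit instead):

$ oc -n openshift-config create configmap my-registry-ca \
    --from-file=updateservice-registry=full-chain.pem
$ oc patch image.config.openshift.io/cluster --type merge \
    -p '{"spec":{"additionalTrustedCA":{"name":"my-registry-ca"}}}'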

Now that we have our OSUS in place, we need to point all of our managed clusters to its route. A route is created automatically, exposing the graph data outside the hub cluster. The graph itself sits on our hub cluster, and all managed clusters pull update data from it.

In order to get the route URL, run the following command on your hub cluster:

$ POLICY_ENGINE_GRAPH_URI="$(oc -n openshift-update-service get updateservice update-service-oc-mirror -o jsonpath='{.status.policyEngineURI}/api/upgrades_info/v1/graph')"

Now, make sure to print it:

$ echo $POLICY_ENGINE_GRAPH_URI
https://update-service-oc-mirror-route-openshift-update-service.apps.ocp.spaz.local/api/upgrades_info/v1/graph
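
Before pointing the managed clusters at it, you can verify that the endpoint really serves the mirrored graph. A hedged sketch, assuming curl and jq are available on a machine that can resolve the route (the -k flag skips certificate verification for the sake of the quick check):

$ curl -ks -H 'Accept: application/json' \
    "${POLICY_ENGINE_GRAPH_URI}?channel=eus-4.10" \
    | jq '[.nodes[].version] | sort'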

Point All Clusters To OSUS

Now, let’s configure each one of our clusters to pull the graph data from our hub OSUS (perform this on every single cluster):

$ PATCH="{\"spec\":{\"upstream\":\"${POLICY_ENGINE_GRAPH_URI}\"}}"
$ oc patch clusterversion version -p $PATCH --type merge

In the managed cluster's Openshift console, navigate to Cluster Settings and switch to your mirrored channel (in our case, it's eus-4.10). Verify that you see the custom route under Upstream Configuration and that your graph is presented:
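
If you prefer the CLI, a hedged equivalent for the same checks (the oc adm upgrade channel subcommand is available from Openshift 4.9 onward):

$ oc adm upgrade channel eus-4.10
$ oc get clusterversion version -o jsonpath='{.spec.upstream}{"\n"}'
$ oc adm upgrade

The last command should now list 4.10.18 among the available updates.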

Now, you should see the ability to upgrade clusters from ACM:

Hit the Upgrade available button in the Infrastructure → Clusters tab in ACM, and verify that the clusters can be upgraded to the wanted version:

Great! Now hit Upgrade, and grab a cup of coffee until it's done:

All set! Both of our clusters are now on the wanted version, and they were upgraded simultaneously:

Conclusion

The integration between ACM, OSUS, and oc-mirror is very powerful: you can upgrade a fleet of clusters (even thousands of them) with very little effort. Hope you've enjoyed this article, see ya next time :)
