SCS k8s-cluster-api-provider upgrade guide
This document explains the steps to upgrade the SCS Kubernetes cluster-API based cluster management solution as follows:
- from the R2 (2022-03) to the R3 (2022-09) state
- from the R3 (2022-09) to the R4 (2023-03) state
- from the R4 (2023-03) to the R5 (2023-09) state
- from the R5 (2023-09) to the R6 (2024-03) state
The document explains how the management cluster and the workload clusters can be upgraded without disruption. It is highly recommended to upgrade step by step across major releases, i.e. when going from R2 to R4, upgrade from R2 to R3 first and then to R4. Skipping releases is not recommended and could lead to undocumented issues.
The individual steps are not very complicated, but there are many of them, and it is advisable that cluster operators get some experience with this kind of cluster management before applying this to customer clusters that carry important workloads.
Note that while the detailed steps are tested and targeted to an R2 -> R3 move, R3 -> R4 move, R4 -> R5 move or R5 -> R6 move, many of the steps are a generic approach that will apply also for other upgrades, so expect a lot of similar steps when moving beyond R6.
Upgrades from cluster management prior to R2 are difficult; many changes before R2 assumed that you would redeploy the management cluster. Redeploying the management cluster can of course always be done, but it is typically disruptive to your workload clusters, unless you move your cluster management state into a new management cluster with `clusterctl move`.
Management host (cluster) vs. Workload clusters
When you initially deployed the SCS k8s-cluster-api-provider, you created a VM with a kind cluster inside and a number of templates, scripts and binaries that are then used to do the cluster management. This is your management host (or more precisely your single-host management cluster). Currently, all cluster management including upgrading etc. is done by connecting to this host via ssh and performing commands there. (You don't need root privileges to do cluster management there, the normal ubuntu user rights are sufficient; there are obviously host management tasks such as installing package updates that do require root power and the user has the sudo rights to do so.)
When you create the management host, you have the option to create your first workload cluster. This cluster is no different from other workload clusters that you create by calling commands on the management host, so you can manage it there. (The default name of this cluster is typically `testcluster`, though that has been configurable for a while, see #264.)
On the management host, you have the openstack and kubernetes tools installed and configured, so you can nicely manage all aspects of your CaaS setups as well as the underlying IaaS. The kubectl configuration is in `~/.kube/config`, while you will find the OpenStack configuration in `~/.config/openstack/clouds.yaml`. These files are automatically managed; you can add entries to the files though, and they should persist.
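As a quick sanity check that both configurations are picked up, you can run one read-only command against each API. This is just an illustration; it assumes the usual environment on the management host (e.g. that `OS_CLOUD` points at the entry in `clouds.yaml`):

```bash
# list servers via the OpenStack credentials from clouds.yaml
openstack server list
# list Cluster API cluster objects via the kind management cluster's kubeconfig
kubectl get clusters -A
```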
Updating the management host
There are two different possibilities to upgrade the management host:
- You do a component-wise in-place upgrade of it.
- You deploy a new management host and `clusterctl move` the resources over to it from the old one. (Note: config state is in `~/CLUSTER_NAME/`.)
TODO: Advice when to do what, risks, limitations
In-place upgrade
Operating system
You should keep the host up-to-date with respect to normal operating system upgrades, so perform your normal `sudo apt-get update && sudo apt-get upgrade`.
`kubectl`, `kustomize`, `yq`, `lxd` and a few other tools are installed as snaps, so you may want to upgrade these as well: `sudo snap refresh`.
From R5 on, `sudo apt-get install -y jq` is also required, as jq is used by the diskless flavors feature (#424).
The default operating system image was changed from Ubuntu 20.04 to Ubuntu 22.04 in R4.
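Taken together, a routine host refresh could look like the following sketch (the commands are the ones mentioned above; run them at a convenient time):

```bash
sudo apt-get update && sudo apt-get upgrade    # regular OS package upgrades
sudo snap refresh                              # kubectl, kustomize, yq, lxd, ...
sudo apt-get install -y jq                     # required from R5 on (diskless flavors feature, #424)
```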
k8s-cluster-api-provider git
The automation is deployed on the management host by cloning the relevant git repository into the `k8s-cluster-api-provider` directory. Note that the checked-out branch will be the one that was used when creating the management host, and you might want to change branches: from `maintained/v3.x` to `maintained/v4.x` for an R2 to R3 upgrade, to `maintained/v5.x` for an R3 to R4 upgrade, to `maintained/v6.x` for an R4 to R5 upgrade, or to `maintained/v7.x` for an R5 to R6 upgrade. Use `git branch -rl` to see the available branches in the `k8s-cluster-api-provider` repository.
You can update the scripts and templates by checking out the relevant branch (`main`, `maintained/v4.x`, `maintained/v5.x`, `maintained/v6.x` or `maintained/v7.x`) and using `git pull` to ensure the latest content is retrieved. Once you do that, the cluster-management scripts will be up-to-date. (The `~/bin` directory in your search `PATH` is symlinked to the checked-out scripts.)
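For example, switching an older management host to the R6 branch could look like this (the branch name is taken from the list above; adjust it to the release you are targeting):

```bash
cd ~/k8s-cluster-api-provider
git fetch
git branch -rl                  # list the available branches
git checkout maintained/v7.x    # R6 branch
git pull                        # make sure the checked-out branch is current
```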
Note however that the binaries and used templates are NOT automatically updated. This should not normally result in problems -- when new features are introduced in the management scripts, they are written to continue supporting older templates.
Updating cluster-API and openstack cluster-API provider
To get the latest version of cluster-API, you can download a new `clusterctl` binary from https://github.com/kubernetes-sigs/cluster-api/releases, make it executable (`chmod +x clusterctl`) and install it to `/usr/local/bin/`, possibly saving the old binary by renaming it. `clusterctl version` should now display the current version number (v1.6.2 at the time of this writing).
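A possible download-and-install sequence is sketched below; the version and the linux/amd64 asset name are assumptions, so pick the release and architecture you actually need:

```bash
curl -LO https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/clusterctl-linux-amd64
chmod +x clusterctl-linux-amd64
sudo mv /usr/local/bin/clusterctl /usr/local/bin/clusterctl.bak    # save the old binary
sudo mv clusterctl-linux-amd64 /usr/local/bin/clusterctl
clusterctl version
```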
You can now issue the command `clusterctl upgrade plan` and clusterctl will list the components in your (kind) management cluster that can be upgraded. Here's an example output:
```
ubuntu@capi-old-mgmtcluster:~ [0]$ clusterctl upgrade plan
Checking cert-manager version...
Cert-Manager is already up to date
Checking new release availability...
Latest release available for the v1beta1 API Version of Cluster API (contract):

NAME                       NAMESPACE                           TYPE                     CURRENT VERSION   NEXT VERSION
bootstrap-kubeadm          capi-kubeadm-bootstrap-system       BootstrapProvider        v1.5.1            v1.6.2
control-plane-kubeadm      capi-kubeadm-control-plane-system   ControlPlaneProvider     v1.5.1            v1.6.2
cluster-api                capi-system                         CoreProvider             v1.5.1            v1.6.2
infrastructure-openstack   capo-system                         InfrastructureProvider   v0.7.3            v0.9.0

You can now apply the upgrade by executing the following command:

clusterctl upgrade apply --contract v1beta1
```
You can then upgrade the components:

- `export CLUSTER_TOPOLOGY=true` -- this is needed for the R5 to R6 upgrade due to the ClusterClass feature (#600)
- Upgrade the components:
  - You can do them one-by-one, e.g.:

    ```bash
    clusterctl upgrade apply --infrastructure capo-system/openstack:v0.9.0 --core capi-system/cluster-api:v1.6.2 -b capi-kubeadm-bootstrap-system/kubeadm:v1.6.2 -c capi-kubeadm-control-plane-system/kubeadm:v1.6.2
    ```

  - Or simply do `clusterctl upgrade apply --contract v1beta1`
New templates
The `cluster-template.yaml` template used for the workload clusters is located in `~/k8s-cluster-api-provider/terraform/files/template/` and copied from there into `~/cluster-defaults/`. When workload clusters are created, they will also have a copy of it in `~/${CLUSTER_NAME}/`. If you have not changed it manually, you can copy the new template over the old ones. (Consider backing up the old one though.) The next `create_cluster.sh <CLUSTER_NAME>` run will then use the new template.
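A minimal sketch for copying the new template into place while keeping a backup (paths as described above; repeat the copy for each existing cluster directory, e.g. `~/testcluster/`, if those clusters should pick up the new template too):

```bash
cp ~/cluster-defaults/cluster-template.yaml ~/cluster-defaults/cluster-template.yaml.bak
cp ~/k8s-cluster-api-provider/terraform/files/template/cluster-template.yaml ~/cluster-defaults/
```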
Note that `create_cluster.sh` is idempotent -- it will not perform any changes on the cluster unless you have changed its configuration by tweaking `cluster-template.yaml` (which you almost never do!) or `clusterctl.yaml` (which you do often).
The other template file that changed is `clusterctl.yaml`; however, some opentofu logic is used to prefill it with values, so you can't simply copy it from git.
R2 to R3
For going from R2 to R3, there is just one real change that you want to apply: add the variables `CONTROL_PLANE_MACHINE_GEN: genc01` and `WORKER_MACHINE_GEN: genw01` to it. If you have copied over the new `cluster-template.yaml` as described above, then you're done. Otherwise you can use the script `update-R2-R3.sh <CLUSTER_NAME>` to tweak both `clusterctl.yaml` and `cluster-template.yaml` for the relevant cluster. (You can use `cluster-defaults` to change the templates in `~/cluster-defaults/` which get used when creating new clusters.)
R3 to R4
In the R3 to R4 upgrade, `CALICO_VERSION` was moved from `.capi-settings` to `clusterctl.yaml`. So before upgrading workload clusters, you must also add it to `~/${CLUSTER_NAME}/clusterctl.yaml`:
echo "CALICO_VERSION: v3.25.0" >> ~/cluster-defaults/clusterctl.yaml
echo "CALICO_VERSION: v3.25.0" >> ~/testcluster/clusterctl.yaml
In the R3 to R4 upgrade process, `cluster-template.yaml` changed the etcd defrag process in the kubeadm control planes and also the security group names by adding `${PREFIX}-` to them, so they have to be renamed in the OpenStack project as well, e.g. (PREFIX=capi):
```bash
openstack security group set --name capi-allow-ssh allow-ssh
openstack security group set --name capi-allow-icmp allow-icmp
```
We changed immutable fields in the Cluster API templates, so before running `create_cluster.sh` to upgrade an existing workload cluster, `CONTROL_PLANE_MACHINE_GEN` and `WORKER_MACHINE_GEN` need to be incremented in the cluster-specific `clusterctl.yaml`.
In the R3 to R4 process, `cloud.conf` also gained `enable-ingress-hostname=true` in the LoadBalancer section. This is due to `NGINX_INGRESS_PROXY` defaulting to true now. So if you want to use, or are already using, this proxy functionality, we recommend adding this line to your `cloud.conf`, e.g.:
echo "enable-ingress-hostname=true" >> ~/cluster-defaults/cloud.conf
echo "enable-ingress-hostname=true" >> ~/testcluster/cloud.conf
Then, before upgrading the workload cluster with `create_cluster.sh`, you should delete the cloud-config secret in the kube-system namespace, so it can be recreated, e.g.:
```bash
kubectl delete secret cloud-config -n kube-system --kubeconfig=testcluster/testcluster.yaml
```
Also, the default nginx-ingress version has changed, so we recommend deleting the ingress-nginx jobs before upgrading the cluster, so that new jobs with the new image can be created during the update process:
```bash
kubectl delete job ingress-nginx-admission-create -n ingress-nginx --kubeconfig=testcluster/testcluster.yaml
kubectl delete job ingress-nginx-admission-patch -n ingress-nginx --kubeconfig=testcluster/testcluster.yaml
```
R4 to R5
In R4 to R5, the `cluster-template.yaml` and `clusterctl.yaml` changed (see release notes).
You can use the script `update-R4-to-R5.sh` to update a cluster's `cluster-template.yaml` and `clusterctl.yaml` from R4 to R5. This script can update an existing cluster's configuration files as well as the `cluster-defaults` files that are used for spawning new R5 clusters.
If you want to update an existing cluster's configuration files from R4 to R5, use the script as follows:
```bash
update-R4-to-R5.sh <CLUSTER_NAME>
```
After you executed the above, you will find that e.g. the Calico version has been bumped from v3.25.0 to v3.26.1. Note that some software versions are not configurable and are not directly mentioned in the cluster configuration files, but are hardcoded in the R5 scripts (e.g. the ingress-nginx controller and metrics server); see the new-defaults section. Also note that the Kubernetes version was not updated and the default CNI is not Cilium yet. These two R5 features are out of scope for this script when it is applied to existing cluster configuration files, as they require advanced actions such as a CNI migration and a step-by-step Kubernetes upgrade across two minor releases.
If you want to update the `cluster-defaults` configuration files from R4 to R5, use the script as follows:
```bash
update-R4-to-R5.sh cluster-defaults
```
The above action updates the `cluster-defaults` configuration files, which works much like updating an existing cluster's configuration files as described above. The difference is that the Kubernetes version and the default CNI are also updated, specifically to Kubernetes v1.27.5 and to Cilium as the default CNI.
R5 to R6
In R5 to R6, the `cluster-template.yaml` and `clusterctl.yaml` changed (see release notes).
You can use the script `update-R5-to-R6.sh` to update a cluster's `cluster-template.yaml` and `clusterctl.yaml` from R5 to R6. This script can update an existing cluster's configuration files as well as the `cluster-defaults` files that are used for spawning new R6 clusters.
If you want to update an existing cluster's configuration files from R5 to R6, use the script as follows:
```bash
update-R5-to-R6.sh <CLUSTER_NAME>
```
After you executed the above, you will find that e.g. the Calico version has been bumped from v3.26.1 to v3.27.2 and the Kubernetes version from v1.27.5 to v1.28.7. Note that some software versions are not configurable and are not directly mentioned in the cluster configuration files, but are hardcoded in the R6 scripts (e.g. the ingress-nginx controller, metrics server and cilium); see the new-defaults section.
If you want to update the `cluster-defaults` configuration files from R5 to R6, use the script as follows:
```bash
update-R5-to-R6.sh cluster-defaults
```
If you are curious: in R2, doing rolling upgrades of k8s versions required edits in `cluster-template.yaml` -- this is no longer the case in R3, R4, R5 and R6. Just increase the generation counters for worker and control plane nodes if you upgrade k8s versions -- or otherwise change the worker or control plane node specs, such as using a different flavor.
New defaults
You deploy a CNI (calico or cilium), the OpenStack Cloud Controller Manager (OCCM) and the cinder CSI driver to clusters; optionally also a metrics server (default is true), an nginx ingress controller (also defaulting to true), the flux2 controller and the cert-manager. Some of these tools come with binaries that you can use for management purposes and which get installed on the management host in `/usr/local/bin/`.
The scripts that deploy these components into your workload clusters download the manifests into `~/kubernetes-manifests.d/` with a version-specific name. If you request a new version, a new download will happen; already existing versions will not be re-downloaded.
Most binaries in `/usr/local/bin/` are not stored under a version-specific name. You need to rename them to cause a re-download of a newer version. (The reason for not having version-specific names is that this would break scripts from users that assume the unversioned names; the good news is that most of these binaries have no trouble managing somewhat older deployments, so you can typically work with the latest binary tool even if you have a variety of versions deployed into various clusters.)
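For example, to force a re-download of a newer helm on the next deployment run, you could move the unversioned binary aside (the version suffix is only an example):

```bash
sudo mv /usr/local/bin/helm /usr/local/bin/helm-v3.12.3
# the next run of the deployment scripts will download the current default version again
```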
The defaults have changed as follows:
|                | R2          | R3          | R4          | R5       | R6       |
|----------------|-------------|-------------|-------------|----------|----------|
| kind           | v0.14.0     | v0.14.0     | v0.17.0     | v0.20.0  | v0.20.0  |
| capi           | v1.0.5      | v1.2.2      | v1.3.5      | v1.5.1   | v1.6.2   |
| capo           | v0.5.3      | v0.6.3      | v0.7.1      | v0.7.3   | v0.9.0   |
| helm           | v3.8.1      | v3.9.4      | v3.11.1     | v3.12.3  | v3.14.1  |
| sonobuoy       | v0.56.2     | v0.56.10    | v0.56.16    | v0.56.17 | v0.57.1  |
| k9s            | unversioned | unversioned | unversioned | v0.27.4  | v0.31.9  |
| calico         | v3.22.1     | v3.24.1     | v3.25.0     | v3.26.1  | v3.27.2  |
| calico CLI     | v3.22.1     | v3.24.1     | v3.25.0     | v3.26.1  | v3.27.2  |
| cilium         | unversioned | unversioned | v1.13.0     | v1.14.1  | v1.15.1  |
| cilium CLI     | unversioned | unversioned | v0.13.1     | v0.15.7  | v0.15.23 |
| hubble CLI     | unversioned | unversioned | v0.11.2     | v0.12.0  | v0.13.0  |
| nginx-ingress  | v1.1.2      | v1.3.0      | v1.6.4      | v1.8.1   | v1.9.6   |
| flux2          | unversioned | unversioned | v0.40.2     | v2.1.0   | v2.2.3   |
| cert-manager   | v1.7.1      | v1.9.1      | v1.11.0     | v1.12.4  | v1.14.2  |
| metrics-server | v0.6.1      | v0.6.1      | v0.6.1      | v0.6.4   | v0.7.0   |
| kubectx        | v0.9.5      |             |             |          |          |
| kube-ps1       | v0.8.0      |             |             |          |          |
The clusterctl move approach
To be written
- Create a new management host in the same project -- avoid name conflicts with a different prefix, to be tweaked later. Avoid testcluster creation.
- Ensure it's up and running ...
- Tweak prefix
- Copy over configs (and a bit of state, though that's uncritical) by using the directories
- Copy over the openstack credentials `clouds.yaml` and the kubectl config
- `clusterctl move` (see the sketch below)
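The final step is the actual move; a rough sketch, assuming you copied the new management cluster's kubeconfig to a file such as `~/newmgmt-kubeconfig.yaml` (the path is a placeholder):

```bash
# run on the old management host, pointing at the new management cluster
clusterctl move --to-kubeconfig ~/newmgmt-kubeconfig.yaml
```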
Updating workload clusters
k8s version upgrade
On R2 clusters
The old way: editing `cluster-template.yaml`. Or better, use the `update-R2-to-R3.sh` script to convert first.
On R3 and R4 clusters
Edit `~/<CLUSTER_NAME>/clusterctl.yaml` and put the wanted version into the fields `KUBERNETES_VERSION` and `OPENSTACK_IMAGE_NAME`. The node image will be downloaded from https://minio.services.osism.tech/openstack-k8s-capi-images and registered if needed. (If you have updated the k8s-cluster-api-provider repo, you can use a version v1.NN.x, where you fill in NN with the wanted k8s version, but leave a literal `.x`, which will get translated to the newest tested version.)
In the same file, increase the generation counters for `CONTROL_PLANE_MACHINE_GEN` and `WORKER_MACHINE_GEN`.
Now do the normal `create_cluster.sh <CLUSTER_NAME>` run and watch cluster-api replace your worker nodes and perform a rolling upgrade of your control plane. If you used a 3-node (or 5 or higher) control plane setup, you will have uninterrupted access not just to your workloads but also to the workload cluster's control plane. Use `clusterctl describe cluster <CLUSTER_NAME>` or simply `kubectl --context <CLUSTER_NAME>-admin@<CLUSTER_NAME> get nodes -o wide` to watch the progress.
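Put together, the loop for one cluster looks roughly like this sketch (the editor and placeholders are illustrative):

```bash
vim ~/<CLUSTER_NAME>/clusterctl.yaml    # set KUBERNETES_VERSION and OPENSTACK_IMAGE_NAME,
                                        # bump CONTROL_PLANE_MACHINE_GEN and WORKER_MACHINE_GEN
create_cluster.sh <CLUSTER_NAME>
clusterctl describe cluster <CLUSTER_NAME>
kubectl --context <CLUSTER_NAME>-admin@<CLUSTER_NAME> get nodes -o wide
```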
On R5 clusters
If you decide to migrate your existing Kubernetes cluster from R4 to R5, be aware of the following:
- R5 features such as per-cluster namespaces and Cilium as the default CNI are supported only on new clusters and will not be migrated on existing clusters
- The R4 default Kubernetes version v1.25.6 can not be directly migrated to the R5 default Kubernetes version v1.27.5, because a +2 minor Kubernetes version upgrade is not allowed. See the migration steps below if you also want to upgrade the Kubernetes version to R5

Follow the steps below if you want to migrate an existing cluster from R4 to R5:
- Access your management node
- Check out the R5 branch

  ```bash
  cd k8s-cluster-api-provider
  git pull
  git checkout maintained/v6.x
  ```

- Back up the existing cluster configuration files (recommended)

  ```bash
  cd ..
  cp -R <CLUSTER_NAME> <CLUSTER_NAME>-backup
  ```

- Update the existing cluster configuration files from R4 to R5

  ```bash
  update-R4-to-R5.sh <CLUSTER_NAME>
  ```

- Validate the updated cluster configuration files. You will find that e.g. the Calico version has been bumped from v3.25.0 to v3.26.1. Note that some software versions are not configurable and are not directly mentioned in the cluster configuration files, but are hardcoded in the R5 scripts (e.g. the ingress-nginx controller and metrics server). Hence, read the R5 release notes carefully too. Also note that the Kubernetes version was not updated and is still v1.25.6.
- Update the existing cluster (except the Kubernetes version)

  ```bash
  create_cluster.sh <CLUSTER_NAME>
  ```

- Update cluster-API and the openstack cluster-API provider, see the related section for details

  ```bash
  clusterctl upgrade plan
  clusterctl upgrade apply --contract v1beta1
  ```

- Bump the Kubernetes version by one minor release (to v1.26.8) and increase the generation counters for worker and control plane nodes

  ```bash
  sed -i 's/^KUBERNETES_VERSION: v1.25.6/KUBERNETES_VERSION: v1.26.8/' <CLUSTER_NAME>/clusterctl.yaml
  sed -i 's/^OPENSTACK_IMAGE_NAME: ubuntu-capi-image-v1.25.6/OPENSTACK_IMAGE_NAME: ubuntu-capi-image-v1.26.8/' <CLUSTER_NAME>/clusterctl.yaml
  sed -r 's/(^CONTROL_PLANE_MACHINE_GEN: genc)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  sed -r 's/(^WORKER_MACHINE_GEN: genw)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  ```

- Update the existing cluster's Kubernetes version to v1.26.8

  ```bash
  create_cluster.sh <CLUSTER_NAME>
  ```

- Bump the Kubernetes version to the R5 default v1.27.5 and increase the generation counters for worker and control plane nodes

  ```bash
  sed -i 's/^KUBERNETES_VERSION: v1.26.8/KUBERNETES_VERSION: v1.27.5/' <CLUSTER_NAME>/clusterctl.yaml
  sed -i 's/^OPENSTACK_IMAGE_NAME: ubuntu-capi-image-v1.26.8/OPENSTACK_IMAGE_NAME: ubuntu-capi-image-v1.27.5/' <CLUSTER_NAME>/clusterctl.yaml
  sed -r 's/(^CONTROL_PLANE_MACHINE_GEN: genc)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  sed -r 's/(^WORKER_MACHINE_GEN: genw)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  ```

- Update the existing cluster to the R5 Kubernetes version v1.27.5

  ```bash
  create_cluster.sh <CLUSTER_NAME>
  ```
On R6 clusters
If you decide to migrate your existing Kubernetes cluster from R5 to R6, be aware of the following:
- The Kubernetes version will be upgraded from v1.27.5 to v1.28.7
- You have to migrate from Cluster based templates to ClusterClass based templates
- The upgrade of cilium needs to be done manually (for clusters with `USE_CILIUM: true`)

Follow the steps below if you want to migrate an existing cluster from R5 to R6:
- Access your management node
- Check out the R6 branch

  ```bash
  cd k8s-cluster-api-provider
  git pull
  git checkout maintained/v7.x
  ```

- Back up the existing cluster configuration files (recommended)

  ```bash
  cd ..
  cp -R <CLUSTER_NAME> <CLUSTER_NAME>-backup
  ```

- Update the existing cluster configuration files from R5 to R6

  ```bash
  update-R5-to-R6.sh <CLUSTER_NAME>
  ```

- Validate the updated cluster configuration files. You will find that e.g. the Calico version has been bumped from v3.26.1 to v3.27.2 and the Kubernetes version from v1.27.5 to v1.28.7. Note that some software versions are not configurable and are not directly mentioned in the cluster configuration files, but are hardcoded in the R6 scripts (e.g. the ingress-nginx controller, metrics server and cilium). Hence, read the R6 release notes carefully too.
- Update cluster-API and the openstack cluster-API provider, see the related section for details

  ```bash
  clusterctl upgrade plan
  export CLUSTER_TOPOLOGY=true
  clusterctl upgrade apply --contract v1beta1
  ```

- Migrate to ClusterClass

  ```bash
  migrate-to-cluster-class.sh <CLUSTER_NAME>
  ```

- Increase the generation counters for worker and control plane nodes

  ```bash
  sed -r 's/(^CONTROL_PLANE_MACHINE_GEN: genc)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  sed -r 's/(^WORKER_MACHINE_GEN: genw)([0-9][0-9])/printf "\1%02d" $((\2+1))/ge' -i <CLUSTER_NAME>/clusterctl.yaml
  ```

- Update the existing cluster to R6

  ```bash
  create_cluster.sh <CLUSTER_NAME>
  ```

  Note: You will probably experience a double rollout of nodes because the k8s version and templates are changed concurrently here. See https://cluster-api.sigs.k8s.io/tasks/experimental-features/cluster-class/operate-cluster#effects-of-concurrent-changes

- Upgrade cilium (for clusters with `USE_CILIUM: true`)

  ```bash
  KUBECONFIG=<CLUSTER_NAME>/<CLUSTER_NAME>.yaml bash -c 'helm get values cilium -n kube-system -o yaml | cilium upgrade --version v1.15.1 -f -'
  ```
New versions for mandatory components
OCCM, CNI (calico/cilium), CSI
New versions for optional components
nginx, metrics server, cert-manager, flux
etcd leader changes
While testing clusters with >= 3 control nodes, we have observed occasional transient error messages reporting an etcd leader change that prevented a command from succeeding. This could result in a dozen random failed tests in a sonobuoy conformance run. (Retrying the commands would let them succeed.)
Too frequent etcd leader changes are detrimental to your control plane performance and can lead to transient failures. They are a sign that the infrastructure supporting your cluster is introducing too high latency.
We recommend deploying the control nodes (which run etcd) on instances with local SSD storage (which we reflect in the default flavor name), using flavors with dedicated cores, and ensuring that your network does not introduce latency through significant packet drops.
We now always use a slower heartbeat (250ms) and increased CPU and IO priority, which used to be controlled by `ETCD_PRIO_BOOST`. This is safe.
If you build multi-controller clusters and can not use a flavor with low-latency local storage (ideally SSD), you can also work around this with `ETCD_UNSAFE_FS`. `ETCD_UNSAFE_FS` uses the `barrier=0` mount option, which violates filesystem ordering guarantees. This works around storage latencies, but introduces the risk of inconsistent filesystem state and inconsistent etcd data in case of an unclean shutdown. You may be able to live with this risk in a multi-controller etcd setup.
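If you decide to accept that trade-off, the setting is a per-cluster variable; a hedged sketch, assuming it goes into the cluster-specific `clusterctl.yaml` and is not set there yet:

```bash
echo "ETCD_UNSAFE_FS: true" >> ~/<CLUSTER_NAME>/clusterctl.yaml
create_cluster.sh <CLUSTER_NAME>    # roll out the change
```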
If you don't have flavors that fulfill the requirements (low-latency storage attached), your choice is between a single-controller cluster (without `ETCD_UNSAFE_FS`) and a multi-controller cluster with `ETCD_UNSAFE_FS`. Neither option is perfect, but we deem the multi-controller cluster preferable in such a scenario.