How To Back Up and Restore a Kubernetes Cluster


Kubernetes has emerged as an orchestration platform that has changed application deployment altogether. It automates deploying, scaling, and managing containerized applications, so you can focus on developing applications while Kubernetes handles the deployment.

If you are new to Kubernetes, running it can pose some problems: service interruptions are inevitable and can disrupt your work at any time. That makes a backup an essential part of operating Kubernetes. But do you know how to back up and restore a Kubernetes cluster, and why you need to? This article covers it all.

What is a Kubernetes Cluster?

A Kubernetes cluster is a set of nodes that run containerized applications. It lets a containerized application run across multiple machines and environments: virtual machines, clouds, physical hardware, and so on.

A Kubernetes cluster has various nodes: one master node and multiple worker nodes. The master node schedules the tasks onto the worker nodes, which perform the actual work. This split makes applications easier to develop, move, and manage, but it also means that if the master node fails, the Kubernetes cluster is disrupted.
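You can see this split on any running cluster with kubectl; a quick check (the exact roles shown depend on your setup):

# List cluster nodes and their roles
kubectl get nodes -o wide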

Why Do You Need a Backup?

It is always good to take backups. Just as you back up your phone data and office data, you need to back up your Kubernetes cluster. The main reasons for backing up a Kubernetes cluster are listed below:

  • For Disaster Recovery – Disasters are always around the corner. If you or a team member accidentally deletes a namespace, a backup lets you restore the cluster to its earlier state.
  • For Migration – To migrate your Kubernetes cluster to a different environment, you need to take a backup beforehand.
  • For Replication – Before any significant upgrade, or when you want to replicate your production environment to staging, a backup is recommended.

What Do You Need to Back Up?

You are not backing up Kubernetes itself but the state of a Kubernetes cluster, so you must know exactly what to back up.

You need to back up etcd because this is where the entire cluster state lives. etcd is a distributed key-value store widely used in distributed systems, and the Kubernetes control plane keeps all of its resource definitions in it. So, to capture all the Kubernetes resources in a cluster, you need to back up the etcd state.
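Everything kubectl shows you is read from etcd. As a quick illustration, all of the following objects are part of the state an etcd backup must preserve:

# Every one of these objects is stored in etcd
kubectl get all --all-namespaces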

You also need to back up your stateful containers. Stateful containers are those whose data must survive restarts; they keep their state, typically on persistent volumes, between runs.
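That state usually lives on persistent volumes, so it helps to inventory the claims before backing up; a simple sketch (the output depends entirely on your workloads):

# List the persistent volume claims backing stateful workloads
kubectl get pvc --all-namespaces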

How to Take a Backup?

There are various tools available to back up and restore a Kubernetes cluster, like kube-backup and Heptio Ark (now Velero). If you are not using a managed Kubernetes cluster, you can take the backup yourself with the steps below.
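As an aside, if you do adopt a tool, Velero reduces this to a couple of commands; a minimal sketch, assuming Velero is already installed in the cluster and that my-backup and my-namespace are names chosen purely for illustration:

# Create a backup of one namespace, then restore from it later
velero backup create my-backup --include-namespaces my-namespace
velero restore create --from-backup my-backup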

As an example, let's take a Kubernetes cluster with 3 master nodes, each running an etcd member (3 etcd members in total).

1. Taking an etcd Backup

How you back up your etcd cluster depends on how it is set up in the Kubernetes environment; you can check which setup you have with the commands after this list. etcd can be set up in two ways:

  • Internal etcd cluster: When your etcd cluster is running inside your Kubernetes cluster as containers/pods and Kubernetes has the responsibility to manage those pods.
  • External etcd cluster: When your etcd cluster is running in the form of Linux services outside the Kubernetes cluster and only its endpoints are provided to the Kubernetes cluster to write to.
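A quick way to tell the two setups apart (the pod label is the kubeadm default, and the service name assumes a typical systemd installation):

# Internal: etcd runs as pods inside the cluster
kubectl get pods -n kube-system -l component=etcd
# External: etcd runs as a Linux service on separate hosts
systemctl status etcd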

2. Backing up Internal etcd Cluster

To back up an internal etcd cluster, we use the Kubernetes CronJob functionality, as it does not require you to install the etcdctl client on your host.

The following CronJob manifest takes an etcd snapshot every minute:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: backup
  namespace: kube-system
spec:
  # activeDeadlineSeconds: 100
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            # Same image as in /etc/kubernetes/manifests/etcd.yaml
            image: k8s.gcr.io/etcd:3.2.24
            env:
            - name: ETCDCTL_API
              value: "3"
            command: ["/bin/sh"]
            args: ["-c", "etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key snapshot save /backup/etcd-snapshot-$(date +%Y-%m-%d_%H:%M:%S_%Z).db"]
            volumeMounts:
            - mountPath: /etc/kubernetes/pki/etcd
              name: etcd-certs
              readOnly: true
            - mountPath: /backup
              name: backup
          restartPolicy: OnFailure
          hostNetwork: true
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
              type: DirectoryOrCreate
          - name: backup
            hostPath:
              path: /data/backup
              type: DirectoryOrCreate
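To use the manifest, save it to a file and apply it; snapshots should then start appearing on the host. A brief sketch (the file name etcd-backup.yaml is only illustrative):

# Apply the CronJob and verify snapshots are landing on the host
kubectl apply -f etcd-backup.yaml
kubectl get cronjob -n kube-system
ls /data/backup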

3. Backing up External etcd Cluster

If you are running an external etcd cluster as a service on a Linux host, setting up a Linux cron job will do the work for you.

The following command will allow you to take the etcd backup:

ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save /path/for/backup/snapshot.db
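Wired into cron, this might look like the following; a sketch in which the endpoint, certificate paths, and file locations are all assumptions to adjust for your hosts:

# /etc/cron.d/etcd-backup (illustrative path): hourly snapshot via a helper script
0 * * * * root /usr/local/bin/etcd-backup.sh

#!/bin/sh
# /usr/local/bin/etcd-backup.sh - endpoint and certificate paths are assumptions
ETCDCTL_API=3 etcdctl --endpoints https://10.0.1.188:2379 \
  --cacert=/etc/etcd/ca.crt --cert=/etc/etcd/client.crt --key=/etc/etcd/client.key \
  snapshot save /data/backup/etcd-snapshot-$(date +%F).db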

How to Restore a Kubernetes Cluster?

As mentioned earlier, the most important reason for taking a backup is to be able to restore the Kubernetes cluster. If the cluster fails, you can recover it from an etcd snapshot: put the backed-up certificates into the /etc/kubernetes/pki folder, start the etcd cluster, and then run kubeadm init on the master node pointing at the etcd endpoints.
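Before restoring, it is worth verifying that the snapshot file is intact; etcdctl can report its hash, revision, and size. A quick check (the path matches the backup examples above):

# Inspect the snapshot before restoring it
ETCDCTL_API=3 etcdctl snapshot status /data/backup/etcd-snapshot-2018-12-09_11:12:05_UTC.db --write-out=table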

How to Restore an Internal etcd Cluster?

You can run this command:

docker run --rm \
  -v '/data/backup:/backup' \
  -v '/var/lib/etcd:/var/lib/etcd' \
  --env ETCDCTL_API=3 \
  'k8s.gcr.io/etcd:3.2.24' \
  /bin/sh -c "etcdctl snapshot restore '/backup/etcd-snapshot-2018-12-09_11:12:05_UTC.db' ; mv /default.etcd/member/ /var/lib/etcd/"

Then re-initialize the control plane, keeping the restored etcd data directory:

kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
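If the restore worked, the member directory should be in place before kubeadm init runs, and the control plane should come back afterwards; a quick check (paths as in the command above):

# Confirm the restored data directory exists
ls /var/lib/etcd/member
# After kubeadm init completes, the control-plane pods should return
kubectl get pods -n kube-system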

How to Restore an External etcd Cluster?

You can restore the three nodes with these commands:

ETCDCTL_API=3 etcdctl snapshot restore snapshot-188.db \
  --name master-0 \
  --initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
  --initial-cluster-token my-etcd-token \
  --initial-advertise-peer-urls http://10.0.1.188:2380

ETCDCTL_API=3 etcdctl snapshot restore snapshot-136.db \
  --name master-1 \
  --initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
  --initial-cluster-token my-etcd-token \
  --initial-advertise-peer-urls http://10.0.1.136:2380

ETCDCTL_API=3 etcdctl snapshot restore snapshot-155.db \
  --name master-2 \
  --initial-cluster master-0=http://10.0.1.188:2380,master-1=http://10.0.1.136:2380,master-2=http://10.0.1.155:2380 \
  --initial-cluster-token my-etcd-token \
  --initial-advertise-peer-urls http://10.0.1.155:2380

The above three commands will give you three restored folders, one per node, named master-0.etcd, master-1.etcd, and master-2.etcd.

After executing the commands, stop the etcd service on every node, replace each node's data directory with its restored folder, and start the etcd service again. If you have multiple worker nodes, you may find only one node in the Ready state while the others are not; you need the ca.crt file to join them back as well, as described next.
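On each node, that sequence might look like this; a sketch assuming etcd runs as a systemd service with /var/lib/etcd as its data directory:

# Run on each master node, substituting that node's restored folder
systemctl stop etcd
rm -rf /var/lib/etcd
mv master-0.etcd /var/lib/etcd
systemctl start etcd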

You need to run the following command on your master node:

kubeadm token create --print-join-command

It will bring the other nodes into the Ready state.
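The command prints a ready-made join command to run on each node that is not yet Ready. Its output looks roughly like this (the address, token, and hash below are placeholders):

# Example of the printed join command; all values are placeholders
kubeadm join 10.0.1.188:6443 --token abcdef.0123456789abcdef \
  --discovery-token-ca-cert-hash sha256:<hash-printed-by-the-command>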

So, this is all you need to back up and restore a Kubernetes cluster.

Conclusion

This article discussed the procedure for backing up and restoring a Kubernetes cluster with 3 master nodes. A multi-master setup is an effective arrangement, since a master failure in a single-master cluster can be disruptive. But even that is not foolproof, and that is where this backup and restore guide will come in handy.
