Published on October 7, 2019
Author Henrik Hoegh

Tutorial: Snapshotting Persistent Volume Claims in Kubernetes

A sneak peek at the CSI volume snapshotting Alpha feature

In this blog post I will show you how to create snapshots of Persistent Volumes in Kubernetes clusters and restore them again, only talking to the API server. This can be useful for backups, or when scaling stateful applications that need “startup data”.

The snapshot feature was introduced as Alpha in Kubernetes v1.12. For this to work, you need to enable the VolumeSnapshotDataSource feature gate on your Kubernetes cluster's API server:

--feature-gates=VolumeSnapshotDataSource=true
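How you set this flag depends on how your cluster is deployed. As a sketch, on a kubeadm-based cluster you would add it to the kube-apiserver static Pod manifest (the path below is the kubeadm default and may differ in your setup):

# /etc/kubernetes/manifests/kube-apiserver.yaml (kubeadm default location)
spec:
  containers:
  - command:
    - kube-apiserver
    - --feature-gates=VolumeSnapshotDataSource=true
    # ...keep all your other existing flags as they are

The kubelet watches this manifest and restarts the API server automatically when it changes.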

I will be using Rook to provision my storage, as it supports the layering image feature and ships a CSI driver.

I assume you have an application up and running in your cluster. In my case, I have Jira Software running in Data Center mode, provisioned with ASK (Atlassian Software in Kubernetes), with one active node.

In order to scale horizontally, I need a copy of Node0's home folder before I can start Node1. We start by defining some objects in Kubernetes.

Creating the StorageClass

When you create your StorageClass for Rook, you need to add the imageFeatures parameter and set it to layering, as shown below:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  # clusterID is the namespace where the Rook cluster is running
  clusterID: rook-ceph
  # Ceph pool into which the RBD image shall be created
  pool: replicapool

  # RBD image format. Defaults to "2".
  imageFormat: "2"

  # RBD image features. Available for imageFormat: "2". CSI RBD currently supports only the `layering` feature.
  imageFeatures: layering

  # The secrets contain Ceph admin credentials.
  csi.storage.k8s.io/provisioner-secret-name: rook-ceph-csi
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-ceph-csi
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph

  # Specify the filesystem type of the volume. If not specified, csi-provisioner
  # will set the default to `ext4`.
  csi.storage.k8s.io/fstype: xfs

# Delete the RBD volume when the PVC is deleted
reclaimPolicy: Delete
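Assuming you saved the manifest above as storageclass.yaml (the filename is arbitrary), apply it and verify that the pool and StorageClass exist:

$ kubectl apply -f storageclass.yaml
$ kubectl get cephblockpool -n rook-ceph
$ kubectl get storageclass rook-ceph-block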

When we deploy Jira with ASK, we simply use this StorageClass, and Rook creates the storage when needed.

So, now we have a PVC for the home folder and one for the Data Center volume.

The Data Center volume is out of scope for this blog post, as it is not block storage but a shared filesystem (ReadWriteMany) in Rook.
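For context, the home folder PVCs come from a volumeClaimTemplate in the Jira StatefulSet. A minimal sketch of what that template could look like (the actual ASK manifests contain more than this):

volumeClaimTemplates:
- metadata:
    name: jira-persistent-storage
  spec:
    accessModes:
    - ReadWriteOnce
    storageClassName: rook-ceph-block
    resources:
      requests:
        storage: 5Gi

The StatefulSet controller names the resulting claims <template-name>-<pod-name>, which is where jira-persistent-storage-jira-0 comes from.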

Creating the VolumeSnapshotClass and your first Snapshot

Now we define a VolumeSnapshotClass to handle our snapshots:

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
snapshotter: rook-ceph.rbd.csi.ceph.com
parameters:
  # Specify a string that identifies your cluster. Ceph CSI supports any
  # unique string. When Ceph CSI is deployed by Rook use the Rook namespace,
  # for example "rook-ceph".
  clusterID: rook-ceph
  csi.storage.k8s.io/snapshotter-secret-name: rook-ceph-csi
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
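VolumeSnapshotClass is a cluster-scoped resource, so assuming you saved it as snapshotclass.yaml, it is applied without a namespace:

$ kubectl apply -f snapshotclass.yaml
$ kubectl get volumesnapshotclass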

Then we are ready to create a snapshot of the source PVC, in this case jira-persistent-storage-jira-0:

apiVersion: snapshot.storage.k8s.io/v1alpha1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
spec:
  snapshotClassName: csi-rbdplugin-snapclass
  source:
    name: jira-persistent-storage-jira-0
    kind: PersistentVolumeClaim
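The VolumeSnapshot must be created in the same namespace as the source PVC, so assuming the manifest is saved as snapshot.yaml:

$ kubectl apply -f snapshot.yaml -n jira-production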

This gives us a VolumeSnapshot, as seen here:


$ kubectl get volumesnapshots -n jira-production
NAME               AGE
rbd-pvc-snapshot   57m
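Before restoring from the snapshot, it is worth checking that it is actually ready; in the v1alpha1 API the VolumeSnapshot status carries a readyToUse flag, which should report true once the CSI driver has cut the snapshot:

$ kubectl get volumesnapshot rbd-pvc-snapshot -n jira-production \
    -o jsonpath='{.status.readyToUse}'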

Creating a new PVC from our snapshot

Now, if we want to create a new PVC based on this VolumeSnapshot, we define it like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jira-persistent-storage-jira-1
spec:
  storageClassName: rook-ceph-block
  dataSource:
    name: rbd-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
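Note that the requested storage must be at least the restore size of the snapshot (5Gi here). Assuming the manifest is saved as pvc-restore.yaml, we create it in the same namespace as the snapshot:

$ kubectl apply -f pvc-restore.yaml -n jira-production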

Now we have a second PVC called jira-persistent-storage-jira-1, based on the PVC jira-persistent-storage-jira-0 with all its data from the moment of the snapshot. Now we can scale our Jira StatefulSet (see the scale command after the listing below), and the new Jira Node1 will use this PVC, which is a copy of Node0's.

$ kubectl get pvc -n jira-production
NAME                             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS          AGE
jira-datacenter-pvc              Bound    pvc-a286df18-f9a3-4c52-b0a0-377a193f04de   5Gi        RWX            atlassian-dc-cephfs   93m
jira-persistent-storage-jira-0   Bound    pvc-b73b96d6-c3f7-4448-9f12-d9956efe2989   5Gi        RWO            rook-ceph-block       93m
jira-persistent-storage-jira-1   Bound    pvc-1984c9de-d13e-435e-b59c-28731d8f30bc   5Gi        RWO            rook-ceph-block       60m
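Because jira-persistent-storage-jira-1 is exactly the name the StatefulSet's volumeClaimTemplate generates for ordinal 1, the controller adopts our pre-populated claim instead of provisioning an empty one when we scale up. The StatefulSet is named jira, matching the Pod names below:

$ kubectl scale statefulset jira -n jira-production --replicas=2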

Verification

We can verify it by looking at the mount point inside the containers once they have started up. The reason cluster.properties has a different timestamp is that our entrypoint script makes changes to it before starting Jira.

$ kubectl exec -ti jira-0 -n jira-production -- ls -l /var/atlassian/application-data/jira/
total 12
drwxrws---. 4 jira jira   46 Aug 22 13:43 caches
-rw-rw-r--. 1 jira jira  633 Aug 22 13:42 cluster.properties
-rw-rw----. 1 jira jira 1102 Aug 22 13:30 dbconfig.xml
drwxr-s---. 2 jira jira 4096 Aug 22 13:58 localq
drwxrws---. 2 jira jira  132 Aug 22 14:01 log
drwxrws---. 2 jira jira   76 Aug 22 13:32 monitor
drwxrws---. 6 jira jira  100 Aug 22 13:31 plugins
drwxrws---. 3 jira jira   26 Aug 22 13:24 tmp

$ kubectl exec -ti jira-1 -n jira-production -- ls -l /var/atlassian/application-data/jira/
total 12
drwxrws---. 4 jira jira   46 Aug 22 13:43 caches
-rw-rw-r--. 1 jira jira  633 Aug 22 13:57 cluster.properties
-rw-rw----. 1 jira jira 1102 Aug 22 13:30 dbconfig.xml
drwxr-s---. 2 jira jira 4096 Aug 22 13:58 localq
drwxrws---. 2 jira jira  100 Aug 22 13:32 log
drwxrws---. 2 jira jira   76 Aug 22 13:32 monitor
drwxrws---. 6 jira jira  100 Aug 22 13:31 plugins
drwxrws---. 3 jira jira   26 Aug 22 13:24 tmp

We can also see that we now have a VolumeSnapshotContent object in our cluster:

$ kubectl get VolumeSnapshotContent
NAME                                               AGE
snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11   73m

$ kubectl describe VolumeSnapshotContent snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
Name:         snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
Namespace:
Labels:       <none>
Annotations:  <none>
API Version:  snapshot.storage.k8s.io/v1alpha1
Kind:         VolumeSnapshotContent
Metadata:
  Creation Timestamp:  2019-08-22T11:56:35Z
  Finalizers:
    snapshot.storage.kubernetes.io/volumesnapshotcontent-protection
  Generation:        1
  Resource Version:  176903
  Self Link:         /apis/snapshot.storage.k8s.io/v1alpha1/volumesnapshotcontents/snapcontent-05166c28-cdf9-4504-89c8-29c67ee23c11
  UID:               0a6afd6d-032d-4bf6-841d-a37146daf799
Spec:
  Csi Volume Snapshot Source:
    Creation Time:    1566474995000000000
    Driver:           rook-ceph.rbd.csi.ceph.com
    Restore Size:     5368709120
    Snapshot Handle:  0001-0009-rook-ceph-0000000000000003-e322e4c4-c4d3-11e9-afc8-0a580a2a0033
  Deletion Policy:    Delete
  Persistent Volume Ref:
    API Version:        v1
    Kind:               PersistentVolume
    Name:               pvc-b73b96d6-c3f7-4448-9f12-d9956efe2989
    Resource Version:   171532
    UID:                b8eed866-4e73-4a6a-bf74-d8fba8c9a8f5
  Snapshot Class Name:  csi-rbdplugin-snapclass
  Volume Snapshot Ref:
    API Version:       snapshot.storage.k8s.io/v1alpha1
    Kind:              VolumeSnapshot
    Name:              rbd-pvc-snapshot
    Namespace:         jira-production
    Resource Version:  176889
    UID:               05166c28-cdf9-4504-89c8-29c67ee23c11
Events:                <none>

Get Kubernetes to do it for you

So, what is this all good for, you ask? Until now, we had to help Kubernetes each time we scaled our Jira, Confluence, or Bitbucket Data Center installation, as we needed to copy the data around ourselves. This could be automated with scripts, but now we can get Kubernetes to do it for us.

This is still Alpha and, as of writing this blog post, only supported for block storage in Rook, but the developers told us that they are working on supporting shared filesystems as well.

Also, we can now create snapshots as backups of our running applications. If we want, we can then start a backup Pod that mounts a PVC restored from the snapshot and copies the data outside the cluster to some cold backup location.
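As a sketch of that idea, a throwaway backup Pod could mount a PVC restored from the snapshot and copy its contents to an external location. Everything below (the image, the rclone remote, and the claim name jira-backup-pvc) is illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: jira-backup
  namespace: jira-production
spec:
  restartPolicy: Never
  containers:
  - name: backup
    image: rclone/rclone
    # Copy the mounted data to a preconfigured rclone remote; the remote's
    # credentials would come from a mounted rclone.conf (not shown here).
    args: ["copy", "/backup", "cold-storage:jira-backups"]
    volumeMounts:
    - name: backup-data
      mountPath: /backup
      readOnly: true
  volumes:
  - name: backup-data
    persistentVolumeClaim:
      claimName: jira-backup-pvc  # a PVC restored from the snapshot, as shown earlier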
