Operations Lead
Chronicles of a Kubernetes Storage Adventure
This post chronicles our adventure of implementing (and migrating) our Kubernetes storage provisioning from FlexVolumes to Third Party Resources to Storage Classes.
Some quick notes before we begin:
- We are running on AWS in the Asia Pacific region.
- We have a high-level API called “Skipper” which we use as a “man in the middle” to make the cluster easier for developers to use.
- We are an Ops team of 3.
FlexVolumes
When we first started hosting our sites on Kubernetes we did not have access to ideal “managed storage” options. The only options available to us were:
- EBS - Could only be used for a single Pod deployment, locking us into a solution without high availability.
- S3 - This is OK if you have 1 or 2 applications per dev team, but the effort required to migrate a large number of applications was daunting.
- Roll our own - We are a small team and didn’t want to have to maintain our own solution.
Around this time FlexVolumes were added to Kubernetes.
FlexVolumes are a very low level entrypoint which allows for a large amount of control over the storage which is mounted on a node.
The options that FlexVolumes opened up for us were very exciting, so we decided to go with a sneaky fourth option: FUSE filesystems.
We had a choice between 3 FUSE filesystems:
- S3FS - Stable, but very slow for our needs.
- RioFS - Faster, less support.
- Goofys - Fastest, but at the time it was experimental.
We chose RioFS. At the time it seemed like a good compromise.
The FlexVolume we implemented was awesome! We wrote it to auto provision S3 buckets and then shell out to RioFS to mount the volume.
Given we run our high-level "Skipper" API over the top of Kubernetes, this made rolling out the storage very easy.
Here is an example of a Deployment definition using our FlexVolume.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  namespace: test-project
  name: production
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: web
          image: nginx:latest
          volumeMounts:
            - mountPath: /var/www/files
              name: files
      volumes:
        - name: files
          flexVolume:
            driver: "pnx/volume"
            fsType: "fuse"
            options:
              name: "previousnext-test-project-files"
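For readers who have not written a FlexVolume driver, here is a rough Python sketch of the general shape of one. Kubernetes runs the driver as an executable, passes the operation (such as init, mount or unmount) as the first argument, and reads a JSON status from standard output. This is not the code from our driver: the `handle` function, the `run` hook and the exact `riofs <bucket> <mountpoint>` invocation are assumptions made for illustration, and bucket provisioning is left out entirely.

```python
import json


def handle(argv, run=None):
    """Dispatch one FlexVolume call and return the response as a dict.

    argv is the argument list without the program name, for example
    ["mount", "/mnt/files", '{"name": "my-bucket"}'].
    run is a hypothetical hook that receives the command to execute as a
    list; a real driver would create the bucket and shell out here.
    """
    if not argv:
        return {"status": "Failure", "message": "no command given"}

    command = argv[0]

    if command == "init":
        return {"status": "Success"}

    if command == "mount":
        if len(argv) < 3:
            return {"status": "Failure", "message": "usage: mount <dir> <json options>"}
        mount_dir = argv[1]
        options = json.loads(argv[2])
        bucket = options.get("name")
        if not bucket:
            return {"status": "Failure", "message": "option 'name' is required"}
        # Assumed RioFS invocation: riofs <bucket> <mountpoint>.
        if run is not None:
            run(["riofs", bucket, mount_dir])
        return {"status": "Success"}

    if command == "unmount":
        if len(argv) < 2:
            return {"status": "Failure", "message": "usage: unmount <dir>"}
        if run is not None:
            run(["umount", argv[1]])
        return {"status": "Success"}

    # Operations this sketch does not implement (attach, detach, ...).
    return {"status": "Not supported"}
```

A real driver would print `json.dumps(handle(sys.argv[1:]))` and pass something like `subprocess.check_call` as `run`. Keeping the shell-out behind a hook makes the dispatch logic easy to test without touching a node.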
Over time we found that this solution was not viable. We ran into multiple random issues with replication across nodes, and the dev team started to lose faith in the solution.
While I really liked the idea of a FUSE-mounted filesystem, we only ever saw this as a means to an end: we were waiting for AWS EFS to launch in Australia.
If you are interested in the FlexVolume code, you can find it here:
https://github.com/previousnext/flexvolume-riofs
Third Party Resources
I now refer to my “Kubernetes life” as “before” and “after” the AWS EFS launch in Sydney.
The first implementation we wrote for provisioning EFS volumes was done via a ThirdPartyResource and was made up of 3 components:
- Provisioner - Daemon to provision new AWS EFS resources.
- Status - Daemon to monitor the AWS EFS API for changes to the state of the volumes.
- CLI - Simple command-line client for listing all the EFS Third Party Resources (cluster admin only).
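As an illustration of the Status component, the sketch below shows one polling pass that compares the states recorded on the cluster with the states reported by the storage API and writes back only what changed. The function names, the `fetch_states` and `update_status` hooks and the dictionary layout are hypothetical and are not taken from our implementation; "creating" and "available" are used in the usage note because they are lifecycle states that EFS reports.

```python
def diff_states(known, observed):
    """Return {volume_id: (old_state, new_state)} for volumes whose state changed.

    known    -- states recorded on the cluster resources, keyed by volume id
    observed -- states just reported by the storage API, keyed by volume id
    """
    changes = {}
    for volume_id, new_state in observed.items():
        old_state = known.get(volume_id)
        if old_state != new_state:
            changes[volume_id] = (old_state, new_state)
    return changes


def sync_once(known, fetch_states, update_status):
    """Run one polling pass: fetch, diff, and write back changed statuses."""
    observed = fetch_states()
    changes = diff_states(known, observed)
    for volume_id, (_, new_state) in changes.items():
        update_status(volume_id, new_state)
        known[volume_id] = new_state
    return changes
```

A daemon would call `sync_once` on a timer, with `fetch_states` wrapping the cloud API and `update_status` patching the matching resource. For example, a volume recorded as "creating" that the API now reports as "available" would produce a single status update.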
With these components installed, admins were able to provision new AWS EFS resources with the following definition:
apiVersion: skpr.io/v1
kind: Efs
metadata:
  name: public-files
  namespace: test-project
spec:
  performance: "generalPurpose"
  region: ap-southeast-2
  securityGroup: sg-xxxxxxxx
  subnets:
    - subnet-xxxxxxxx
    - subnet-xxxxxxxx
We then integrated this ThirdPartyResource into our Skipper API, allowing our developers to automatically have EFS backed deployments.
At the time of writing this post we are managing 100 AWS EFS volumes with this implementation.
While this has worked well for us, we acknowledged that this approach would not be ideal for the Kubernetes community, as developers would need to know details of the infrastructure the cluster is running on, such as:
- Security Group
- VPC Subnets
- Region
We have now marked this implementation as v1.
Storage Classes
After reading through the Kubernetes Blog post, Dynamic Provisioning and Storage Classes in Kubernetes, we knew that this architecture was for us.
This is how I think of Storage Classes:
- Developer submits a PersistentVolumeClaim which contains a StorageClass reference
- StorageClass references a Provisioner
- Provisioner creates our SourceVolumes and returns PersistentVolumeSource information (how to mount)
- Developer references PersistentVolumeClaim in Pod definition
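The steps above boil down to two lookups, which the following Python sketch tries to make concrete. It is a mental model only, not how Kubernetes implements binding: the `resolve_claim` function, the dictionary shapes and the `example.invalid` server names are all made up for illustration, while the class and provisioner names are the ones used in this post.

```python
def resolve_claim(claim, storage_classes, provisioners):
    """Follow claim -> StorageClass -> provisioner and return mount details."""
    class_name = claim["storageClass"]
    if class_name not in storage_classes:
        raise KeyError(f"unknown StorageClass: {class_name}")
    provisioner_name = storage_classes[class_name]
    if provisioner_name not in provisioners:
        raise KeyError(f"no provisioner registered for: {provisioner_name}")
    # The provisioner decides how the volume is created and mounted.
    return provisioners[provisioner_name](claim)


# StorageClass name -> provisioner name.
storage_classes = {
    "general": "efs.aws.skpr.io/generalPurpose",
    "fast": "efs.aws.skpr.io/maxIO",
}

# Provisioner name -> hypothetical function returning how to mount the volume.
provisioners = {
    "efs.aws.skpr.io/generalPurpose": lambda claim: {
        "nfs": {"server": "general.example.invalid", "path": "/"}
    },
    "efs.aws.skpr.io/maxIO": lambda claim: {
        "nfs": {"server": "fast.example.invalid", "path": "/"}
    },
}

# The developer only names the class; the mount details come back from the provisioner.
claim = {"name": "files", "storageClass": "general"}
source = resolve_claim(claim, storage_classes, provisioners)
```

The developer-facing part is only the claim. Pointing "general" at a different provisioner would change where the storage comes from without touching the claim or the Pod definition.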
Not only has this approach allowed us to decouple our applications from the storage layer, but it also allowed us to move away from our ThirdPartyResource definition and command-line client, meaning less code to maintain!
So, how do you use our AWS EFS Storage Class?
First, we declare our Storage Classes. I think of these as aliases for our storage, allowing us to decouple the hosting provider from the deployment.
For example, a Storage Class named “general” could be provisioned by “EFS: General Purpose” on AWS or “Azure File Storage” on Microsoft Azure.
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: general
provisioner: efs.aws.skpr.io/generalPurpose
---
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: fast
provisioner: efs.aws.skpr.io/maxIO
Now that we have our Storage Classes declared, we need provisioners to do the work.
The provisioner was very easy to implement with the help of the Kubernetes incubator project External Storage.
To implement a provisioner with this library all you need to do is satisfy the interface functions Provision() and Delete().
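To show the shape of that two-function contract, here is a toy Python analogue. The real library is written in Go and its types and signatures differ, so treat this purely as a sketch: the `InMemoryProvisioner` class, its dictionary return value and the generated `fs-` ids are invented for illustration, and the server name simply follows the EFS DNS naming pattern of file system id, `efs`, region.

```python
import itertools


class InMemoryProvisioner:
    """Toy provisioner that creates and deletes fake file systems in a dict."""

    def __init__(self, performance, region):
        self.performance = performance
        self.region = region
        self.volumes = {}
        self._ids = itertools.count(1)

    def provision(self, claim_name):
        """Create a volume for the claim and return how to mount it."""
        volume_id = f"fs-{next(self._ids):08d}"
        self.volumes[volume_id] = {
            "claim": claim_name,
            "performance": self.performance,
        }
        return {
            "name": volume_id,
            "nfs": {
                "server": f"{volume_id}.efs.{self.region}.amazonaws.com",
                "path": "/",
            },
        }

    def delete(self, volume):
        """Remove the backing volume of a previously provisioned volume."""
        volume_id = volume["name"]
        if volume_id not in self.volumes:
            raise KeyError(f"unknown volume: {volume_id}")
        del self.volumes[volume_id]
```

Calling `provision("files")` returns the mount details that end up in the PersistentVolume, and passing that result to `delete` cleans up again. A real provisioner would call the cloud API in both methods instead of updating a dictionary.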
The examples I used to bootstrap our provisioner can be found here:
- https://github.com/kubernetes-incubator/external-storage/tree/master/docs/demo/hostpath-provisioner
- https://github.com/kubernetes-incubator/external-storage/tree/master/nfs
Our implementation is linked at the end of this post.
To deploy the provisioners we used the following manifest file:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: aws-efs-gp
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: aws-efs-gp
    spec:
      containers:
        - name: provisioner
          image: previousnext/k8s-aws-efs:2.0.0
          env:
            - name: EFS_PERFORMANCE
              value: "generalPurpose"
            - name: AWS_REGION
              value: "ap-southeast-2"
            - name: AWS_SECURITY_GROUP
              value: "sg-xxxxxxxxx"
            - name: AWS_SUBNETS
              value: "subnet-xxxxxx,subnet-xxxxxx"
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: aws-efs-max-io
  namespace: kube-system
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: aws-efs-max-io
    spec:
      containers:
        - name: provisioner
          image: previousnext/k8s-aws-efs:2.0.0
          env:
            - name: EFS_PERFORMANCE
              value: "maxIO"
            - name: AWS_REGION
              value: "ap-southeast-2"
            - name: AWS_SECURITY_GROUP
              value: "sg-xxxxxxxxx"
            - name: AWS_SUBNETS
              value: "subnet-xxxxxx,subnet-xxxxxx"
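The manifests above move all of the infrastructure details into environment variables on the provisioner, so developers never see them. As a rough sketch of how a provisioner process might read and validate that configuration, the Python below parses the same four variables. The `load_config` function and its return shape are hypothetical and are not taken from our provisioner; the two accepted performance values are the ones used in the manifests.

```python
import os


def load_config(environ=None):
    """Read provisioner settings from the environment and validate them."""
    env = os.environ if environ is None else environ

    required = ["EFS_PERFORMANCE", "AWS_REGION", "AWS_SECURITY_GROUP", "AWS_SUBNETS"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))

    performance = env["EFS_PERFORMANCE"]
    if performance not in ("generalPurpose", "maxIO"):
        raise ValueError(f"unsupported EFS_PERFORMANCE: {performance}")

    # AWS_SUBNETS is a comma-separated list; tolerate stray whitespace.
    subnets = [s.strip() for s in env["AWS_SUBNETS"].split(",") if s.strip()]
    if not subnets:
        raise ValueError("AWS_SUBNETS must list at least one subnet")

    return {
        "performance": performance,
        "region": env["AWS_REGION"],
        "security_group": env["AWS_SECURITY_GROUP"],
        "subnets": subnets,
    }
```

Failing fast on a missing or misspelled variable means a misconfigured Deployment shows up as a crashing Pod in kube-system rather than as claims that stay pending.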
Now we can provision some storage!
In the following definition, we are requesting one of each Storage Class type.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: files
  namespace: test-project
  annotations:
    volume.beta.kubernetes.io/storage-class: "general"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: files-fast
  namespace: test-project
  annotations:
    volume.beta.kubernetes.io/storage-class: "fast"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
We can inspect the status of these PersistentVolumeClaim objects with the following command:
$ kubectl -n test-project get pvc
NAME         STATUS   VOLUME        CAPACITY   ACCESSMODES   STORAGECLASS   AGE
files        Bound    fs-xxxxxxxx   8E         RWX           general        5m
files-fast   Bound    fs-xxxxxxxx   8E         RWX           fast           5m
Consuming a PersistentVolumeClaim is super easy. We now only need to reference what storage we want, not how we mount it (e.g. NFS mount details).
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  namespace: test-project
  name: production
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: web
          image: nginx:latest
          volumeMounts:
            - mountPath: /var/www/files
              name: files
      volumes:
        - name: files
          persistentVolumeClaim:
            claimName: files
Conclusion
Each of these APIs serves a different use case:
- FlexVolumes are about how a volume is mounted. In the EFS case we did not need one, as we already had access to the NFS volume mount in Kubernetes.
- Storage Classes are about what you want, e.g. "give me some storage please, with X speed and X size".
- Third Party Resources allowed us to prototype early and we still use this for other custom API definitions on our clusters.
I am very grateful for the contributors working on these APIs. What we have been able to achieve with them is a testament to their excellent design.
Any feedback or contributions on the AWS EFS project are most welcome.
https://github.com/previousnext/k8s-aws-efs
Further discussion is also welcome on this site and on this Hacker News discussion.