What is the preferred Kubernetes StorageClass for a PersistentVolume used by a PostgreSQL database? Which factors should I consider when choosing the StorageClass, given the choice between S3 (MinIO), NFS and hostPath?
- Hi mxcd, how is your k8s cluster set up (cloud provider / on-premises)? – mozello, Mar 14, 2022 at 17:02
- @mozello I am currently planning a setup with a K3s cluster on premises. – mxcd, Mar 15, 2022 at 10:41
3 Answers
When you choose a storage option for PostgreSQL in Kubernetes, you should take the following into account:
- NFS / MinIO is not the preferred storage for databases if your application is latency-sensitive. A more common use case for it is a download folder or a logging/backup folder. It does, however, give you flexibility in designing the k8s cluster and the ability to move easily to a cloud-based solution in the future (AWS EFS or S3, for example).
- hostPath is a better option for databases. But, to quote the Kubernetes documentation:
Kubernetes supports hostPath for development and testing on a single-node cluster. A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage.
In a production cluster, you would not use hostPath. Instead a cluster administrator would provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume. Cluster administrators can also use StorageClasses to set up dynamic provisioning.
- As you mentioned, Longhorn is quite a good option for non-cloud k8s clusters:
Longhorn is a lightweight, reliable, and powerful distributed block storage system for Kubernetes.
Longhorn implements distributed block storage using containers and microservices. Longhorn creates a dedicated storage controller for each block device volume and synchronously replicates the volume across multiple replicas stored on multiple nodes. The storage controller and replicas are themselves orchestrated using Kubernetes.
- Also, check the Bitnami PostgreSQL Helm chart:
It offers a PostgreSQL Helm chart that comes pre-configured for security, scalability and data replication. It's a great combination: all the open source goodness of PostgreSQL (foreign keys, joins, views, triggers, stored procedures…) together with the consistency, portability and self-healing features of Kubernetes.
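To make the dynamic provisioning point above concrete, here is a minimal Python sketch that builds a PersistentVolumeClaim for the database and prints it as JSON, which `kubectl apply` accepts alongside YAML. The claim name `postgres-data`, the size `20Gi` and the StorageClass name `longhorn` are assumptions for illustration; use whatever StorageClass `kubectl get storageclass` lists in your cluster.

```python
import json


def postgres_pvc(name: str, storage_class: str, size: str) -> dict:
    """Build a PersistentVolumeClaim manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Block-backed volumes are normally mounted by one node at a time.
            "accessModes": ["ReadWriteOnce"],
            # Assumed name; the provisioner behind this class creates the volume.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical values, not taken from the question.
    print(json.dumps(postgres_pvc("postgres-data", "longhorn", "20Gi"), indent=2))
```

Saved as, say, `pvc.py`, the output can be piped straight into the cluster with `python pvc.py | kubectl apply -f -`. The point of the design is that the claim only names a class and a size; which node or disk backs it is left to the provisioner.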
Comments
You should make sure you get dynamic block storage.
hostPath is kind of what you want, but it's not dynamic, meaning the volume can't move between nodes. So if your node goes down, you have a problem.
If the cluster is managed by a cloud vendor, there should be a premade storage class that covers this, e.g. Azure Disk.
NFS and S3 don't make sense for database data. You are not dealing with files/objects in that sense.
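As a rough illustration of why node-bound storage can't move, the Python sketch below builds a statically provisioned PersistentVolume and prints it as JSON. It uses a `local` volume rather than `hostPath`, because a `local` volume has to spell out its node pinning in `nodeAffinity`, which makes the limitation visible. The volume name, path, node name and the `local-storage` class name are all hypothetical.

```python
import json


def node_bound_pv(name: str, path: str, node: str, size: str) -> dict:
    """Build a `local` PersistentVolume that only one node can serve."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "local-storage",  # assumed class name
            "local": {"path": path},
            # The data lives on this node's disk, so any pod using the
            # volume must be scheduled onto this node.
            "nodeAffinity": {
                "required": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "kubernetes.io/hostname",
                                    "operator": "In",
                                    "values": [node],
                                }
                            ]
                        }
                    ]
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    pv = node_bound_pv("pg-local-pv", "/mnt/disks/pg", "worker-1", "20Gi")
    print(json.dumps(pv, indent=2))
```

If `worker-1` goes down, nothing else in the cluster can satisfy that affinity, so the database pod stays pending until the node comes back.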
4 Comments
The whole point of running a DB on k8s is to get auto scaling, high availability and fault tolerance. hostPath or local storage ties your DB to a specific host, which is no better than running it on a bare-metal server: if that node goes down, the DB is gone. When a host goes down, the DB should automatically fail over to another node, and for that you want shared storage. These are your options:
- Rook-Ceph: uses local storage but provides replication and clustering. Too complicated, and a big beast to maintain.
- Longhorn: the easiest k8s storage provider; uses local storage. Backed by SUSE.
- Robin: a middle ground between the two above.
- NFS provisioner: points to an external NFS server. Pretty easy to set up, but not recommended for heavy DB load.
We are on Azure AKS and use the pre-built Azure Disk.
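For the Longhorn option, a replicated StorageClass might look like the Python sketch below, which prints the manifest as JSON for `kubectl apply`. The provisioner `driver.longhorn.io` and the `numberOfReplicas` parameter are what Longhorn's own example StorageClass uses, as far as I know; the class name `longhorn-postgres`, the replica count and the `Retain` policy are assumptions, so check them against your installed Longhorn version.

```python
import json


def longhorn_storage_class(name: str, replicas: int) -> dict:
    """Build a StorageClass that asks Longhorn for replicated volumes."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "driver.longhorn.io",
        "allowVolumeExpansion": True,
        # Keep the volume when the claim is deleted (assumed preference).
        "reclaimPolicy": "Retain",
        "parameters": {
            # StorageClass parameters are strings, so the count is converted.
            "numberOfReplicas": str(replicas),
        },
    }


if __name__ == "__main__":
    # Hypothetical class name and replica count.
    print(json.dumps(longhorn_storage_class("longhorn-postgres", 3), indent=2))
```

With replicas on several nodes, a rescheduled PostgreSQL pod can attach to the same volume from another node, which is the failover behaviour that hostPath can't give you.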