Kadalu Storage Config
Storage configuration is the core of the kadalu.io project. The config file provides details of which devices provide PVs, and of what type (Replica1/Replica3 or other). The kadalu operator reads this config file and starts the relevant gluster server pods, exporting the storage provided in the config.
The features of the kadalu project come from the options in the storage-config file. This document describes each option available in the config file.
Config file format
Let us look at the sample config file below:
```yaml
---
apiVersion: kadalu-operator.storage/v1alpha1
kind: KadaluStorage
metadata:
  # This will be used as name of PV Hosting Volume
  name: storage1
spec:
  tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
  type: Replica1
  storage:
    - node: kube1
      device: /dev/vdc
```
For now,

* `apiVersion` will always have the value `kadalu-operator.storage/v1alpha1`.
* `kind` will always have the value `KadaluStorage`. The kadalu operator only understands config of this kind.
* `metadata` is required, and should have an entry for `name`, which is used as the gluster volume name and in pod names.
* `spec` is required, and should have the `type` field.

These fields are common to the config files of all the different modes.
Tolerations

- For `.spec.tolerations`, refer to the Kubernetes Toleration spec.
- These tolerations are added to the `server` and `kadalu-csi-nodeplugin` pods. A typical use case is creating kadalu storage from tainted nodes.
- Please decide on all the tolerations to be used before applying the Pool CR, as modifications would typically restart the `server` and `kadalu-csi-nodeplugin` pods.
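For example, if a node is tainted as below, the toleration in the sample config above allows the kadalu pods to be scheduled on it (a minimal sketch; `node1` is a hypothetical node name):

```
# taint a node; only pods tolerating key1=value1:NoSchedule get scheduled on it
$ kubectl taint nodes node1 key1=value1:NoSchedule
```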
Native mode
In this mode, the storage is provided from inside the kubernetes cluster. Let's look at how native mode can be configured with each of these options.
Type
Native mode is available with 5 `type` options: `Replica1`, `Replica2`, `Replica3`, `Disperse` and `Arbiter`.
- Replica1: In this mode, Gluster is started without high availability, i.e., without the replicate module. It uses just one storage unit, from which the RWX and RWO PVs are carved out. This storage is exposed by a gluster server pod, which is spawned by the kadalu operator.

  ```
  $ kubectl kadalu storage-add storage-pool1 --type=Replica1 \
      --device kube-nodename1:/dev/vdc
  ```

  To create a PVC from this storage, you need to provide the `storageClassName` option as `kadalu.replica1` (see the PVC sketch after this list).
- Replica2: In this mode, Gluster is started with high availability, using the 'tie-breaker' (thin-arbiter, in gluster speak) option. It uses 2 storage units, from which the RWX and RWO PVs are carved out. This storage is exposed by gluster server pods, which are spawned by the kadalu operator. The `tiebreaker` node has to be managed externally, and if the tiebreaker option is not provided, Kadalu creates the Replica 2 volume without a tiebreaker.

  ```yaml
  ...
  spec:
    type: Replica2
    storage:
      - ...
      - ...
    tiebreaker:
      node: ...
      path: ...
  ...
  ```

  To create a PVC from this storage, you need to provide the `storageClassName` option as `kadalu.replica2`.
- Replica3: In this mode, Gluster is started with high availability, i.e., with the replicate module. It requires 3 storage units to be provided, and each storage unit is served by its own gluster server pod.

  ```
  $ kubectl kadalu storage-add storage-pool1 --type=Replica3 \
      --device kube-nodename1:/dev/vdc \
      --device kube-nodename2:/dev/vdc \
      --device kube-nodename3:/dev/vdc
  ```

  To create a PVC from this storage, you need to provide the `storageClassName` option as `kadalu.replica3`.
- Disperse: Dispersed volumes are based on erasure codes. They stripe the encoded data of files, with some redundancy added, across multiple bricks in the volume. You can use dispersed volumes to have a configurable level of reliability with minimum space waste. This mode requires at least 3 storage units; increase the number of storage units based on the requirement. Refer to the GlusterFS documentation to know more about Disperse volumes.

  ```
  $ kubectl kadalu storage-add storage-pool1 --type=Disperse \
      --data 2 --redundancy 1 \
      --device kube-nodename1:/dev/vdc \
      --device kube-nodename2:/dev/vdc \
      --device kube-nodename3:/dev/vdc
  ```

  To create a PVC from this storage, you need to provide the `storageClassName` option as `kadalu.disperse`.
- Arbiter: The Arbiter volume type is a subset of the Replica volume. It aims to reduce split-brain and inconsistencies amongst the storage units (bricks). The arbiter storage unit holds only the directory structure and the files with their metadata, without the content, providing Replica3-level consistency without requiring 3 full copies of the data. This mode requires at least 3 storage units, or a multiple of 3 if more storage is needed. Refer to the GlusterFS documentation to know more about Arbiter volumes.

  ```
  $ kubectl kadalu storage-add storage-pool1 --type=Arbiter \
      --device kube-nodename1:/dev/vdc \
      --device kube-nodename2:/dev/vdc \
      --device kube-nodename3:/dev/vdc
  ```

  The above command creates a storage pool of type `Arbiter`, where the 1st and 2nd bricks are replicated data bricks and the 3rd brick is the arbiter brick, which holds no file contents. To create a PVC from this storage, you need to provide the `storageClassName` option as `kadalu.arbiter`.
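As referenced in the items above, a minimal PVC using one of these storage classes would look like below (a sketch; the PVC name and requested size are hypothetical):

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pv1                           # hypothetical PVC name
spec:
  storageClassName: kadalu.replica1   # matches the Replica1 type above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi                    # hypothetical size
```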
Storage
The native mode storage can be provided in many different ways in kadalu. Let's look into each of those, with samples.
Storage from Device
In this case, a device (`/dev/sd*` or similar) attached to a node in the k8s cluster is provided as storage. The device is exported into the gluster server pod, and is formatted and mounted at a specific brick path.
The sample command looks like below:
```
$ kubectl kadalu storage-add storage-pool1 \
    --device kube-nodename:/dev/vdc
```
Note that both `node` and `device` fields are required. Device info without a node doesn't give the kadalu operator all the information required to start the brick process.
In our view, this is the most common way of providing storage to kadalu.
Also, if the `device` option is given a file instead of a device, the same file is formatted and used as the device. This is particularly helpful as a testing option, where multiple devices need to be carved out of the same backend. Our CI/CD tests use this approach.
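For example, a sparse file can be created on the node and added as a device (a sketch; the file path, size and pool name are hypothetical):

```
# on node kube-nodename1, create a 10GiB sparse file to act as a device
$ truncate -s 10G /mnt/data/disk1.img

# add it as a device; kadalu formats and uses the file as storage
$ kubectl kadalu storage-add test-pool1 \
    --device kube-nodename1:/mnt/data/disk1.img
```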
Storage from path
In this case, a directory path (`/mnt/mount/path` or similar) on a node in the cluster is exported as a brick, as storage for the gluster server process. This is particularly useful when a larger device is mounted and shared with other applications too.
Note that the `path` option is valid only if the filesystem on the given path is xfs. The `path` option is helpful for those who want to try kadalu in an existing setup. When the `path` option is provided, the kadalu operator doesn't try to format and mount, but uses the path as the export path for the kadalu storage volume.
The sample command looks like below:
```
$ kubectl kadalu storage-add storage-pool1 \
    --path kube-nodename:/mnt/mount/export-path
```
Again, both `node` and `path` are required fields. The kadalu operator won't have all the information required to start the gluster server pods without these two fields.
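The equivalent storage-config snippet would look like below (a sketch following the device sample above, with `path` in place of `device`):

```yaml
spec:
  type: Replica1
  storage:
    - node: kube1
      path: /mnt/mount/export-path
```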
Storage from another PVC
This is an interesting option, and makes sense specifically in a cloud environment, where a virtual storage device is available as a PVC in the k8s cluster. kadalu can use a PVC which is not bound to any node as the storage, and provide multiple smaller PVCs through the kadalu storage class.
In this case, the PVC is exposed to kadalu's server pod as storage through the `volumes` option of the pod config. With the given PVC exposed in the server pod, the storage is exported through gluster.
The sample command looks like below:

```
$ kubectl kadalu storage-add storage-pool1 \
    --pvc pvc-name-in-namespace
```
Note that this PVC should be available in the 'kadalu' namespace. Also, there is no need to mention the `node` field for this storage; k8s itself starts the pod on a relevant node in the cluster.
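The equivalent storage-config snippet would look like below (a sketch following the samples above, with `pvc` in place of `node`/`device`):

```yaml
spec:
  type: Replica1
  storage:
    - pvc: pvc-name-in-namespace
```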
External mode
In this mode, storage is provided by gluster servers not managed by the kadalu operator. Note that in this case, the gluster server can be running inside or outside the k8s cluster.
The external mode can be specified with `type` as `External`. When the type is External, the field it expects is `details`. Let's look at a sample, and then describe each of the options it takes.
```
$ kubectl kadalu storage-add external-pool \
    --external gluster_host:/gluster_volname
```
Above,

- `gluster_host`: This option takes one hostname or IP address which is accessible from the k8s cluster.
- `gluster_volname`: The Gluster volume name to be used as the kadalu host storage volume. We prefer it to be a new volume created for kadalu.
Notice that to create a PVC from an External storage config, you need to provide the `storageClassName` option as `kadalu.{{ config-name }}`. In the above case, it becomes `kadalu.external-pool`.
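In storage-config form, an External pool would look roughly like below (a sketch based on the fields described above; the hostname and volume name are placeholders):

```yaml
apiVersion: kadalu-operator.storage/v1alpha1
kind: KadaluStorage
metadata:
  name: external-pool
spec:
  type: External
  details:
    gluster_host: gluster1.example.com   # placeholder host
    gluster_volname: gluster_volname     # placeholder volume name
```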
How it works?
The kadalu operator doesn't start any storage pods when the 'External' type is used, but creates a `StorageClass` particular to this config, so when a PVC is created, the information is passed to the CSI drivers. The host volume is mounted as below:
```
mount -t glusterfs {{ gluster_host }}:/{{ gluster_volname }} -o {{ gluster_options }} /mount/point
```
Other than this, the CSI volume's behavior is the same in both Native mode and External mode.
Single PV per Pool
This option is provided in kadalu to access a gluster volume as a whole, as a single PV. This is particularly useful if one wants to use an already existing Gluster volume as a PV (for example, a gluster volume created by heketi). We don't recommend this for normal usage, as this mode has scale limitations and also adds more k8s resources, like StorageClass.
The example config file added for CI/CD gives an idea about the options. Note that the options provided here look the same as what's given in the storage config, but the kadalu operator creates the StorageClass based on the value supplied to the field `single_pv_per_pool`, in order to decide whether to allow multiple PVs per pool. Refer to the external-storage document for more information on this mode.
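A sketch of how the field would appear in a config, assuming it sits under `spec` alongside `type` and that `true` means the whole pool is served as one PV (the host and volume names are placeholders):

```yaml
spec:
  type: External
  single_pv_per_pool: true   # serve the whole pool as a single PV
  details:
    gluster_host: gluster1.example.com
    gluster_volname: existing-volume
```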
Archiving Persistent Volume Claims
Archiving a PVC retains its data when `delete` is called on it. This option can be enabled by specifying `pvReclaimPolicy`, either through the `StorageClass` or the Kadalu CLI. The `pvReclaimPolicy` takes either `delete`, `archive` or `retain`. The default value is `delete`, which deletes the PVC along with its data. The `archive` option retains the data by renaming `pvc-123` to `archived-pvc-123`. The `retain` option retains the data in place.
- Adding 'pvReclaimPolicy' through the config file:

  ```yaml
  apiVersion: kadalu-operator.storage/v1alpha1
  kind: KadaluStorage
  metadata:
    # This will be used as name of PV Hosting Volume
    name: storage1
  spec:
    type: Replica1
    pvReclaimPolicy: archive
    storage:
      - node: kube1
        device: /dev/vdc
  ```
- Adding 'pvReclaimPolicy' through the Kadalu CLI:

  ```
  $ kubectl kadalu storage-add storage-pool1 \
      --device kube1:/dev/vdc --pv-reclaim-policy=archive
  ```
When PVCs are archived, the data stays intact, so 'storage-list' might still show the consumption. One can free this archived data manually or through the Kadalu CLI.
Note: while using the option `--pvc`, only pass PVCs which are archived.
- Removing archived pvc(s) through the Kadalu CLI:

  ```
  $ kubectl kadalu remove-archived-pv storage-pool-1
  $ kubectl kadalu remove-archived-pv storage-pool-1 --pvc=pvc-e91ab8c8-4a48-48ad-ab5e-b207399565bc
  ```
- Removing archived pvc(s) through the Kadalu CSI provisioner:

  Exec into the Kadalu Provisioner pod and run the script 'remove_archived_pv.py' with arguments similar to the Kadalu CLI:

  ```
  $ cd /kadalu
  $ python remove_archived_pv.py storage-pool-1 --pvc=archived-pvc-123-456-789
  ```
Recreation of storage-pool with existing Volume ID
Every storage pool created with device/path/pvc/external volumes is associated with a unique Volume ID. If the Kadalu namespace is cleaned up and the same existing volume is referenced again, an error is thrown, since the volume has already been attached to a unique ID. To avoid this and recreate the storage pool, specify `volume_id` in the storage-config or `--volume-id` in the Kadalu CLI.
- Adding 'volume_id' through the config file:

  ```yaml
  apiVersion: kadalu-operator.storage/v1alpha1
  kind: KadaluStorage
  metadata:
    # This will be used as name of PV Hosting Volume
    name: storage1
  spec:
    type: Replica1
    volume_id: example-1234-volumeid-7890
    storage:
      - node: kube1
        device: /dev/vdc
  ```
- Adding 'volume_id' through the Kadalu CLI:

  ```
  $ kubectl kadalu storage-add storage-pool1 \
      --device kube1:/dev/vdc --volume-id=example-1234-volumeid-7890
  ```
Configuring Storage Pool Options
The behaviour of a Storage Pool can be modified by configuring the Storage Pool Options available in Kadalu. Storage Pool Options can be configured either through the storage-config file or using the kubectl kadalu CLI.
List of supported Storage Pool Options for Kadalu:
| Option | Value |
|---|---|
| performance.client-io-threads | on/off |
| performance.stat-prefetch | on/off |
| performance.quick-read | on/off |
| performance.open-behind | on/off |
| performance.read-ahead | on/off |
| performance.io-cache | on/off |
| performance.readdir-ahead | on/off |
- Configuring 'Storage Pool Options' through the config file:

  ```yaml
  apiVersion: kadalu-operator.storage/v1alpha1
  kind: KadaluStorage
  metadata:
    # This will be used as name of PV Hosting Volume
    name: storage1
  spec:
    type: Replica1
    options:
      - key: "performance.quick-read"
        value: "off"
      - key: "performance.write-behind"
        value: "on"
    storage:
      - node: kube1
        device: /dev/vdc
  ```
If any further Storage Pool Options are to be configured, simply apply the modified storage-config file again; the Kadalu Operator notices the change, and it is handled appropriately by the Kadalu Server pods.
- Configuring 'Storage Pool Options' through the Kadalu CLI:

  To add/set Storage Pool Options, the 'option-set' subcommand can be used. It expects the name of the Storage Pool and option key(s) and value(s).

  ```
  $ kubectl kadalu option-set storage-pool1 \
      performance.quick-read off performance.write-behind on
  ```
  To remove/reset Storage Pool Options, the 'option-reset' subcommand can be used. It expects the name of the Storage Pool and option key(s) to be removed.

  ```
  $ kubectl kadalu option-reset storage-pool1 \
      performance.quick-read performance.write-behind
  ```
  Additionally, all configured options for a Storage Pool can be removed with the '--all' flag.

  ```
  $ kubectl kadalu option-reset storage-pool1 --all
  ```