Cinder CSI Driver


The Cinder CSI Driver, like most CSI Drivers in OpenShift, is managed by the Cinder CSI Driver Operator. This document contains notes on both Cinder CSI Driver and Cinder CSI Driver Operator.

Deployment

As noted above, the Cinder CSI Driver is managed by the Cinder CSI Driver Operator. The source for this operator can be found in the openshift/csi-operator repository, though before OCP 4.18 it could be found in the openshift/openstack-cinder-csi-driver-operator repository. This operator is responsible for deploying the controller pods onto control plane nodes and the driver pods onto worker nodes. It does this using a deployment and a daemonset, respectively. It is also responsible for creating necessary service accounts, roles, and role bindings, along with default StorageClass and VolumeSnapshotClass implementations. In a standalone deployment, you can see most of these created in the openshift-cluster-csi-drivers namespace:

❯ oc get -n openshift-cluster-csi-drivers all
NAME                                                          READY   STATUS    RESTARTS        AGE
pod/manila-csi-driver-operator-6b9645777-hn4fj                1/1     Running   1 (3d18h ago)   3d18h
pod/openstack-cinder-csi-driver-controller-8554666ccc-ltjqm   10/10   Running   0               3d18h
pod/openstack-cinder-csi-driver-controller-8554666ccc-zglhh   10/10   Running   4 (3d18h ago)   3d18h
pod/openstack-cinder-csi-driver-node-454fz                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-node-b4r6v                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-node-cvqt9                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-node-l52qd                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-node-rl6gx                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-node-sssgp                    3/3     Running   0               3d18h
pod/openstack-cinder-csi-driver-operator-7f4cf55b98-56smx     1/1     Running   1 (3d18h ago)   3d18h

NAME                                                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                           AGE
service/openstack-cinder-csi-driver-controller-metrics   ClusterIP   172.30.188.107   <none>        443/TCP,444/TCP,445/TCP,446/TCP   3d18h

NAME                                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/openstack-cinder-csi-driver-node   6         6         6       6            6           kubernetes.io/os=linux   3d18h

NAME                                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/manila-csi-driver-operator               1/1     1            1           3d18h
deployment.apps/openstack-cinder-csi-driver-controller   2/2     2            2           3d18h
deployment.apps/openstack-cinder-csi-driver-operator     1/1     1            1           3d18h

NAME                                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/manila-csi-driver-operator-6b9645777                1         1         1       3d18h
replicaset.apps/openstack-cinder-csi-driver-controller-8554666ccc   2         2         2       3d18h
replicaset.apps/openstack-cinder-csi-driver-operator-7f4cf55b98     1         1         1       3d18h

❯ oc get -n openshift-cluster-csi-drivers cm
NAME                                            DATA   AGE
cloud-conf                                      2      3d18h
cloud-provider-config                           1      3d18h
kube-root-ca.crt                                1      3d18h
openshift-service-ca.crt                        1      3d18h
openstack-cinder-csi-driver-trusted-ca-bundle   1      3d18h

❯ oc get -n openshift-cluster-csi-drivers secret
NAME                                                          TYPE                      DATA   AGE
builder-dockercfg-lzknz                                       kubernetes.io/dockercfg   1      3d18h
default-dockercfg-n2q5s                                       kubernetes.io/dockercfg   1      3d18h
deployer-dockercfg-q4ftf                                      kubernetes.io/dockercfg   1      3d18h
manila-cloud-credentials                                      Opaque                    1      3d18h
manila-csi-driver-operator-dockercfg-79l7t                    kubernetes.io/dockercfg   1      3d18h
openstack-cinder-csi-driver-controller-metrics-serving-cert   kubernetes.io/tls         2      3d18h
openstack-cinder-csi-driver-controller-sa-dockercfg-klpwf     kubernetes.io/dockercfg   1      3d18h
openstack-cinder-csi-driver-node-sa-dockercfg-zst79           kubernetes.io/dockercfg   1      3d18h
openstack-cinder-csi-driver-operator-dockercfg-g659j          kubernetes.io/dockercfg   1      3d18h
openstack-cloud-credentials                                   Opaque                    1      3d18h

The Cinder CSI Driver Operator is itself managed by another operator, the Cluster Storage Operator. The source for this operator can be found in the openshift/cluster-storage-operator repository. The Cluster Storage Operator is responsible for deploying the pods that run the Cinder CSI Driver Operator binary (via a deployment) alongside the necessary service account, roles, and role bindings. It also creates an appropriate ClusterCSIDriver CR, which allows the Cinder CSI Driver Operator to report status back up to the Cluster Storage Operator.
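You can inspect this CR to see the status reported by the driver operator. For example, assuming the standard driver name, cinder.csi.openstack.org:

```
❯ oc get clustercsidriver cinder.csi.openstack.org -o yaml | yq '.status.conditions'
```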

Finally, completing the trifecta, the Cluster Storage Operator is managed by the Cluster Version Operator (CVO). This is relatively common: the CVO's job is to ensure that all operators are running at the correct version expected for the given OCP release (so you don't end up with e.g. a 4.18 version of the Cluster Storage Operator running on a 4.16 cluster, or vice versa).

Credentials

Most credentials in OpenShift are managed by the Cloud Credential Operator. The Cinder CSI Driver is no different. The root cloud credentials are stored in a secret at kube-system / openstack-credentials:

❯ oc get -n kube-system secret openstack-credentials -o yaml | yq '.data'
clouds.conf: <redacted>
clouds.yaml: <redacted>

These are then rolled out to secrets in other namespaces based on CredentialsRequest CRs. There are many CredentialsRequest CRs in a standard OpenShift Installer-provisioned cluster but the CredentialsRequest CR for the Cinder CSI Driver Operator is stored at openshift-cloud-credential-operator / openshift-cluster-csi-drivers:

❯ oc get -n openshift-cloud-credential-operator CredentialsRequest openshift-cluster-csi-drivers -o yaml | yq '.spec'
providerSpec:
  apiVersion: cloudcredential.openshift.io/v1
  kind: OpenStackProviderSpec
secretRef:
  name: openstack-cloud-credentials
  namespace: openshift-cluster-csi-drivers

Based on this CR, we can expect the resulting secret to be found in openshift-cluster-csi-drivers / openstack-cloud-credentials. And indeed, we can see the secret in a deployment:

❯ oc get -n openshift-cluster-csi-drivers secret openstack-cloud-credentials -o yaml | yq '.data'
clouds.yaml: <redacted>

Thus, to rotate the credentials for the Cinder CSI Driver Operator, and in turn the Cinder CSI Driver, you should update the secret at kube-system / openstack-credentials. The Cloud Credential Operator will then roll this out and the Operator will restart all affected pods (meaning the openstack-cinder-csi-driver-controller-* and openstack-cinder-csi-driver-node-* pods). The Operator accomplishes this by way of hooks that calculate hashes derived from the various secrets and config maps that we care about. These hashes are added as annotations on the deployment (for the controller pods) and daemonset (for the node pods), and the operator relies on the fact that any change to a deployment or daemonset will result in new pods being rolled out. You can see these by inspecting the resources. For example:

❯ oc get -n openshift-cluster-csi-drivers deployment openstack-cinder-csi-driver-controller -o yaml | yq '.metadata.annotations | with_entries(select(.key == "operator.openshift.io/dep-*"))'
operator.openshift.io/dep-2a78a60e2cfb360d9fe72d1859d93b0fdd3bc: Dpyxpw==
operator.openshift.io/dep-4b51488ec5b742a098d092dfe49449df0986e: J1mZaA==
operator.openshift.io/dep-12129a17758a4339b718cd8f746bebe59b0e2: PDfUoQ==
operator.openshift.io/dep-28104e6bcb20f4d44a571c8e2f3f0a1f5a880: ElMHxA==

Credential rotation

The script below is an example of how to modify the credentials. In this case we are setting the volume_api_version attribute for the cloud entry; however, there's no reason you couldn't instead replace the credentials with wholly new ones.
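A minimal sketch of such a script is below. The clouds.yaml contents and the cloud name (openstack) are placeholders, and the oc commands for fetching and re-applying the file are shown as comments alongside the local edit:

```shell
# Fetch the current clouds.yaml from the root secret. Shown here as a
# placeholder file; on a real cluster you would instead run:
#   oc get -n kube-system secret openstack-credentials \
#     -o jsonpath='{.data.clouds\.yaml}' | base64 -d > clouds.yaml
cat > clouds.yaml <<'EOF'
clouds:
  openstack:
    auth:
      auth_url: https://keystone.example.com:5000/v3
      username: demo
      password: secret
      project_name: demo
EOF

# Set volume_api_version on the cloud entry. A plain append works here
# because "openstack" is the last (and only) cloud entry and the 4-space
# indentation places the key under it:
printf '    volume_api_version: "3"\n' >> clouds.yaml

# Push the modified file back into the root secret; the Cloud Credential
# Operator then propagates it to openstack-cloud-credentials and the
# affected driver pods are restarted:
#   oc set data -n kube-system secret/openstack-credentials \
#     --from-file=clouds.yaml=clouds.yaml
```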

Development

As discussed above, in a standalone deployment, the management of the Cinder CSI Driver effectively occurs through three operators: the Cinder CSI Driver Operator, the Cluster Storage Operator, and the Cluster Version Operator. As a result, in order to work on the Cinder CSI Driver or its operator - at least without building and deploying a custom release image - you will need to scale down both the Cluster Version Operator and the Cluster Storage Operator, in that order:

❯ oc scale -n openshift-cluster-version --replicas 0 deployments/cluster-version-operator
❯ oc scale -n openshift-cluster-storage-operator --replicas 0 deployment/cluster-storage-operator

If you are in a Hypershift cluster, the above is mostly still true. However, as Hypershift doesn't have a Cluster Version Operator, you instead need to scale down two other operators before the Cluster Storage Operator: the Hypershift Operator and the Control Plane Operator. Once again, this needs to be done in the order shown:

❯ oc scale -n hypershift --replicas 0 deployments/operator
❯ oc scale -n clusters-foo --replicas 0 deployments/control-plane-operator
❯ oc scale -n clusters-foo --replicas 0 deployments/cluster-storage-operator

where foo is the name given to your guest cluster.

Failure to scale down these operators will result in any changes you make to Cluster Storage Operator-managed resources (such as the Cinder CSI Driver Operator deployment) being overridden almost immediately.

Once you’ve scaled these down, you’re free to manipulate the Cinder CSI Driver Operator as you see fit. For example, if you are iterating on changes to the Operator itself, you can build and publish an image with your changes included and then modify the deployment using e.g. oc edit to reference this new image:

❯ podman build -t ghcr.io/<username>/csi-operator-openstack-cinder:latest -f Dockerfile.openstack-cinder
❯ podman push ghcr.io/<username>/csi-operator-openstack-cinder:latest

❯ oc edit -n openshift-cluster-csi-drivers deployment openstack-cinder-csi-driver-operator
# ... or for hypershift...
❯ oc edit -n clusters-foo deployment openstack-cinder-csi-driver-operator

If you are iterating on changes to the Cinder CSI Driver itself (or one of the sidecar containers), then you can either set the appropriate image environment variable in the deployment (e.g. DRIVER_IMAGE) or also scale down the CSI Driver Operator and manipulate the deployments and daemonsets as necessary.
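For instance, setting the driver image via the operator deployment's environment might look like this (the image reference is a placeholder):

```
❯ oc set env -n openshift-cluster-csi-drivers \
    deployment/openstack-cinder-csi-driver-operator \
    DRIVER_IMAGE=ghcr.io/<username>/cinder-csi-plugin:latest
```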