Availability Zones in OpenStack and OpenShift (Part 1)

After seeing a few too many availability zone-related issues popping up in OpenShift clusters of late, I’ve decided it might make sense to document the situation with OpenStack AZs on OpenShift (and, by extension, Kubernetes). This is the first of two parts. This part provides some background on what AZs are and how you can configure them, while the second part examines how AZs affect OpenShift and Kubernetes components such as the OpenStack Machine API Provider, the OpenStack Cluster API Provider, and the Cinder and Manila CSI drivers.

Background

Both the Compute (Nova) and Block Storage (Cinder) services in OpenStack support the concept of Availability Zones (AZs), and the envisioned use case is very similar for both. Quoting from the Nova documentation:

Availability Zones are an end-user visible logical abstraction for partitioning a cloud without knowing the physical infrastructure. They can be used to partition a cloud on arbitrary factors, such as location (country, datacenter, rack), network layout and/or power source.

The Nova documentation then goes on to specifically note that the AZ feature provides no HA benefit in and of itself: any benefits are entirely down to how the deployment is designed, so an AZ is really just a way to signal something you’ve done in your physical deployment. All of this is equally true of both Nova and Cinder, and in my experience I’ve seen AZs used to demarcate both compute and block storage nodes located on different racks or in different datacenters.

Implementation

As you might expect, Nova’s AZs are an attribute of the compute hosts (hosts running the nova-compute service), while Cinder’s are an attribute of the block storage hosts (hosts running the cinder-volume service). In a Hyperconverged Infrastructure (HCI) deployment, where compute and block storage services run side-by-side on hyperconverged hosts, the compute hosts are the block storage hosts and the distinction is meaningless. In a non-HCI deployment this is unlikely to be the case, but that hasn’t prevented people and applications from frequently munging the two types of AZ, as we will see later. Because this conflation of different AZ types can happen, the general expectation we would have is that one of the following is true (the example after this list shows how to check which case applies to your cloud):

  1. There is only a single compute AZ and a single block storage AZ, and they have the same name. This is the default configuration if you use “stock” OpenStack: Nova’s default AZ is nova and Cinder helpfully defaults to the same value.

  2. There are multiple compute and block storage AZs, but there is the same number of both and they share the same name. For example, both the compute and block storage services have the following AZs defined: AZ0, AZ1, and AZ2. In this case, users and applications which incorrectly use compute host AZ information to configure the AZ of volumes and related block storage resources will “just work”.

  3. There are multiple compute and block storage AZs, and there is either a different number of each or they have different names. For example, the compute services have the compute-az0 and compute-az1 AZs defined while the block storage services have the volume-az0 and volume-az1 AZs defined. In this case, users and applications must be very careful to explicitly specify a correct AZ when creating volumes and related block storage resources, and must ensure Nova is configured to allow attaching volumes in other AZs (more on this later too).
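
To see which of these cases describes your cloud, you can list the AZs exposed by each service using OpenStackClient (the output will of course vary from cloud to cloud):

openstack availability zone list --compute
openstack availability zone list --volume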

Finally, a compute AZ can be specified when creating an instance (or “server”, in OpenStackClient parlance), while a block storage AZ can be requested when creating volumes, volume backups, volume groups, and volume groups’ deprecated predecessor, consistency groups. An instance that is not explicitly assigned an AZ during creation (e.g. via server create) will be assigned the AZ of the host that it is eventually scheduled to. The same is true when creating volumes and related resources. However, while Nova will always reject requests with invalid AZs, Cinder allows you to configure a fallback to a default value.
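
You can also check the AZ that an existing resource was eventually assigned. For example, assuming a server named my-server and a volume named my-volume (hypothetical names):

openstack server show my-server -c OS-EXT-AZ:availability_zone
openstack volume show my-volume -c availability_zone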

Configuration

Since this feature exists across two services, there are two sets of configuration options to be concerned with.

Nova

As of the 2023.1 (Antelope) release, Nova has three relevant configuration options:

  • [DEFAULT] default_availability_zone defines the default AZ of each compute host, which can be changed by adding the host to a host aggregate and setting the special availability_zone metadata property as described in the nova docs (see the example after this list). This option defaults to nova and, as noted in the nova docs, users should never explicitly request this default AZ when creating new instances, since doing so prevents migration of instances between hosts in different AZs (which is allowed by default if the AZ was unset during initial creation) as well as identification of hosts that are missing AZ information. You have been warned.

  • [DEFAULT] default_schedule_zone defines the default AZ that should be assigned to an instance on creation. If this is unset, the instance will be assigned an implicit AZ of the host it lands on. You might want to use this if you want the majority of instances to go into a “generic” AZ while special instances go into specific AZs.

  • [cinder] cross_az_attach determines whether volumes are allowed to be attached to an instance if the instance host’s compute AZ differs from that of the volume’s block storage AZ. It also determines whether volumes created when creating a boot-from-volume server have an explicit AZ associated with them or not. This defaults to true and with good reason, given the aforementioned caveats around munging of compute and block storage AZs and the need for them to be identical.
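
As noted above, individual compute hosts are mapped into AZs via host aggregate metadata rather than via configuration. A minimal sketch, assuming a hypothetical compute host compute-0.example.com and an AZ named az0:

# Create an aggregate whose member hosts will advertise the AZ "az0"
openstack aggregate create --zone az0 agg-az0

# Map the compute host into the AZ by adding it to the aggregate
openstack aggregate add host agg-az0 compute-0.example.com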

There is also the [DEFAULT] internal_service_availability_zone configuration option, but this has no real impact for end-users.
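
Putting this all together, the relevant parts of a nova.conf might look something like the following (the AZ names are illustrative):

[DEFAULT]
# Default AZ for compute hosts not mapped into an AZ via a host aggregate
default_availability_zone = nova
# AZ assigned to new instances when the user doesn't request one; leave this
# unset to have instances take the implicit AZ of whatever host they land on
default_schedule_zone = general

[cinder]
# Allow attaching a volume to an instance even when the volume's block
# storage AZ doesn't match the instance host's compute AZ
cross_az_attach = true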

Cinder

As of the 2023.1 (Antelope) release, Cinder has four relevant configuration options:

  • [DEFAULT] storage_availability_zone defines the default AZ of the block storage host. This defaults to nova and can be overridden on a per-backend basis using [foo] backend_availability_zone. Speaking of which…

  • [foo] backend_availability_zone defines the default AZ for a specific backend of the block storage host. foo should be the name of the volume backend, as defined in [DEFAULT] enabled_backends.

  • [DEFAULT] default_availability_zone defines the default AZ that should be assigned to a volume on creation. If this is unset, the volume will be assigned the AZ of the host it lands on (which in turn defaults to [DEFAULT] storage_availability_zone, per above).

  • [DEFAULT] allow_availability_zone_fallback allows you to ignore a request for an invalid block storage AZ and instead fall back to the default AZ defined in [DEFAULT] default_availability_zone. This defaults to false, though to be honest true is probably a sensible value for configurations where e.g. there are multiple compute AZs and a single block storage AZ (a combined cinder.conf sketch follows this list).
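
Again, the relevant parts of a cinder.conf might look something like the following (the backend name foo matches the placeholder used above, and the AZ names are illustrative):

[DEFAULT]
enabled_backends = foo
# Default AZ of the block storage host
storage_availability_zone = volume-az0
# AZ assigned to new volumes when the user doesn't request one
default_availability_zone = volume-az0
# Fall back to default_availability_zone rather than rejecting requests
# for invalid AZs
allow_availability_zone_fallback = true

[foo]
# Per-backend override of storage_availability_zone
backend_availability_zone = volume-az1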

Usage

This isn’t really the point of this article, but you can use AZs when creating resources using OpenStackClient. For example, when creating an instance (or “server”):

openstack server create --availability-zone compute-az1 ...

Likewise, when creating a volume:

openstack volume create --availability-zone volume-az1 ...

Or when creating a volume backup:

openstack volume backup create --availability-zone volume-az2 ...

Other API libraries like gophercloud also expose these attributes and allow them to be configured, but we won’t go into that here.

Next Steps

In the next post, I’ll detail how all of these pieces play into various OpenStack-specific components of OpenShift and Kubernetes.
