By default, the control plane of a workload cluster created by CAPA will span multiple availability zones (AZs) (also referred to as “failure domains”) when using multiple control plane nodes. This is because CAPA will, by default, create public and private subnets in all the AZs of a region (up to a maximum of 3 AZs by default). If a region has more than 3 AZs then CAPA will pick 3 AZs to use.
The Cluster API controller will look at the `FailureDomains` status field and will set the `failureDomain` field in a `Machine` if a value hasn't already been explicitly set. It will try to ensure that the machines are spread across all the failure domains.
The AWSMachine controller looks for a failure domain (i.e. an Availability Zone) first in the `Machine` before checking the network specification of the `AWSMachine`. This failure domain is then used when provisioning the `AWSMachine`.
To control the placement of an `AWSMachine` into a failure domain (i.e. an Availability Zone), we can explicitly state the failure domain in the `Machine`. The best way is to specify this using the `failureDomain` field within the `Machine` specification:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: my-cluster
    cluster.x-k8s.io/control-plane: "true"
  name: controlplane-0
  namespace: default
spec:
  version: "v1.22.1"
  clusterName: my-cluster
  failureDomain: "1"
  bootstrap:
    configRef:
      apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
      kind: KubeadmConfigTemplate
      name: my-cluster-md-0
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSMachineTemplate
    name: my-cluster-md-0
```
IMPORTANT WARNING: All the replicas within a `MachineDeployment` will reside in the same Availability Zone.
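Because of this, spreading worker nodes across AZs typically means creating one `MachineDeployment` per AZ and pinning each to its failure domain via the machine template. The sketch below assumes this approach; the names (`my-cluster-md-us-west-2a`, `my-cluster-md-0`) and AZ value are illustrative:

```yaml
# One MachineDeployment pinned to a single AZ; create a similar
# MachineDeployment for each AZ you want workers in.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-md-us-west-2a   # illustrative name
  namespace: default
spec:
  clusterName: my-cluster
  replicas: 2
  selector:
    matchLabels: {}
  template:
    spec:
      clusterName: my-cluster
      version: "v1.22.1"
      failureDomain: us-west-2a    # all replicas of this deployment land here
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-cluster-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: my-cluster-md-0
```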
Another way to explicitly instruct CAPA to create resources in specific AZs (rather than letting it choose) is to add a `network` object to the `AWSCluster` specification. Here is an example `network` that creates resources across three AZs in the "us-west-2" region:
```yaml
spec:
  network:
    vpc:
      cidrBlock: 10.50.0.0/16
    subnets:
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.0.0/20
      isPublic: true
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.16.0/20
    - availabilityZone: us-west-2b
      cidrBlock: 10.50.32.0/20
      isPublic: true
    - availabilityZone: us-west-2b
      cidrBlock: 10.50.48.0/20
    - availabilityZone: us-west-2c
      cidrBlock: 10.50.64.0/20
      isPublic: true
    - availabilityZone: us-west-2c
      cidrBlock: 10.50.80.0/20
```
Note: This method can be used for worker nodes as well.
Specifying the CIDR block alone for the VPC is not enough; users must also supply a list of subnets that provides the desired AZ, the CIDR for the subnet, and whether the subnet is public (has a route to an Internet gateway) or is private (does not have a route to an Internet gateway).
Note that CAPA requires a public subnet (and an associated Internet gateway) to exist, even if no public load balancer is requested for the control plane. Therefore, for every AZ where a control plane node should be placed, the `network` object must define both a public and a private subnet.
Once CAPA is provided with a `network` that spans multiple AZs, the KubeadmControlPlane controller will automatically distribute control plane nodes across those AZs. No further configuration from the user is required.
Note: This method can also be used if you do not want to split your EC2 instances across multiple AZs.
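For instance, to keep all instances in a single AZ, the `network` object can list subnets for just one AZ. This is a minimal sketch; the AZ and CIDRs are illustrative:

```yaml
# All subnets sit in one AZ, so every node is placed there.
spec:
  network:
    vpc:
      cidrBlock: 10.50.0.0/16
    subnets:
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.0.0/20
      isPublic: true
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.16.0/20
```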
When CAPA creates the default subnets, a maximum of 3 AZs will be used. If you are creating a cluster in a region that has more than 3 AZs, then 3 AZs will be picked from that region in alphabetical order.
If this default maximum number of AZs or the ordered selection method doesn't suit your requirements, you can use the following fields to change the behavior:
- `availabilityZoneUsageLimit` - specifies the maximum number of availability zones (AZs) that should be used in a region when automatically creating subnets.
- `availabilityZoneSelection` - specifies how AZs should be selected if there are more AZs in a region than specified by `availabilityZoneUsageLimit`. There are 2 selection schemes:
  - `Ordered` - selects AZs based on alphabetical order
  - `Random` - selects AZs randomly from the region
For example, if you wanted a maximum of 2 AZs using the random selection scheme:
```yaml
spec:
  network:
    vpc:
      availabilityZoneUsageLimit: 2
      availabilityZoneSelection: Random
```
Deploying control plane nodes across multiple AZs is not a panacea to cure all availability concerns. The sizing and overall utilization of the cluster will greatly affect the behavior of the cluster and the workloads hosted there in the event of an AZ failure. Careful planning is needed to maximize the availability of the cluster even in the face of an AZ failure. There are also other considerations, like cross-AZ traffic charges, that should be taken into account.