
Kubernetes-native declarative infrastructure for AWS.

What is the Cluster API Provider AWS?

The Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.

The API itself is shared across multiple cloud providers allowing for true AWS hybrid deployments of Kubernetes. It is built atop the lessons learned from previous cluster managers such as kops and kubicorn.

Documentation

Please see our book for in-depth documentation.

Launching a Kubernetes cluster on AWS

Check out the Cluster API Quick Start for launching a cluster on AWS.

Features

  • Native Kubernetes manifests and API
  • Manages the bootstrapping of VPCs, gateways, security groups and instances.
  • Choice of Linux distribution between Amazon Linux 2, CentOS 7 and Ubuntu 18.04, using pre-baked AMIs.
  • Deploys Kubernetes control planes into private subnets with a separate bastion server.
  • Doesn’t use SSH for bootstrapping nodes.
  • Installs only the minimal components to bootstrap a control plane and workers.
  • Supports control planes on EC2 instances.
  • Experimental EKS support

Compatibility with Cluster API and Kubernetes Versions

This provider’s versions are compatible with the following versions of Cluster API:

  • AWS Provider v1alpha1 (v0.2): Cluster API v1alpha1 (v0.1)
  • AWS Provider v1alpha1 (v0.3): Cluster API v1alpha1 (v0.1)
  • AWS Provider v1alpha2 (v0.4): Cluster API v1alpha2 (v0.2)
  • AWS Provider v1alpha3 (v0.5): Cluster API v1alpha3 (v0.3)
  • AWS Provider v1alpha3 (v0.6): Cluster API v1alpha3 (v0.3)

This provider’s versions are able to install and manage Kubernetes versions 1.13 through 1.20, with the exact range depending on the provider version:

  • AWS Provider v1alpha1 (v0.2)
  • AWS Provider v1alpha1 (v0.3)
  • AWS Provider v1alpha2 (v0.4)
  • AWS Provider v1alpha3 (v0.5)
  • AWS Provider v1alpha3 (v0.6)

Each version of Cluster API for AWS will attempt to support two Kubernetes versions; e.g., Cluster API for AWS v0.2 may support Kubernetes 1.13 and Kubernetes 1.14.

NOTE: As the versioning for this project is tied to the versioning of Cluster API, future modifications to this policy may be made to more closely align with other providers in the Cluster API ecosystem.


Kubernetes versions with published AMIs

Note: These AMIs are not updated for security fixes and it is recommended to always use the latest patch version for the Kubernetes version you wish to run. For production-like environments, it is highly recommended to build and use your own custom images.
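
For reference, a specific (for example, custom-built) AMI can be pinned for a node group through the AWSMachineTemplate. The snippet below is only a sketch: the template name is borrowed from the quick start later in this document, and the AMI ID is a placeholder for an image you built yourself.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachineTemplate
metadata:
  name: capi-quickstart-md-0
spec:
  template:
    spec:
      instanceType: t3.large
      sshKeyName: default
      ami:
        # Placeholder: replace with the ID of your own image (e.g. built with image-builder)
        id: ami-0123456789abcdef0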

Kubernetes minor version: Kubernetes full versions

  • v1.16: v1.16.0, v1.16.1, v1.16.2, v1.16.3, v1.16.4, v1.16.5, v1.16.6, v1.16.7, v1.16.8, v1.16.9, v1.16.14, v1.16.15
  • v1.17: v1.17.0, v1.17.1, v1.17.2, v1.17.3, v1.17.4, v1.17.5, v1.17.11, v1.17.12, v1.17.13, v1.17.14, v1.17.15, v1.17.16, v1.17.17
  • v1.18: v1.18.0, v1.18.1, v1.18.2, v1.18.8, v1.18.9, v1.18.10, v1.18.12, v1.18.13, v1.18.14, v1.18.15, v1.18.16
  • v1.19: v1.19.0, v1.19.1, v1.19.2, v1.19.3, v1.19.4, v1.19.5, v1.19.6, v1.19.7, v1.19.8
  • v1.20: v1.20.1, v1.20.2, v1.20.4

Getting involved and contributing

Are you interested in contributing to cluster-api-provider-aws? We, the maintainers and community, would love your suggestions, contributions, and help! Also, the maintainers can be contacted at any time to learn more about how to get involved.

In the interest of getting more new people involved we tag issues with good first issue. These are typically issues that have smaller scope but are good ways to start to get acquainted with the codebase.

We also encourage ALL active community participants to act as if they are maintainers, even if you don’t have “official” write permissions. This is a community effort, we are here to serve the Kubernetes community. If you have an active interest and you want to get involved, you have real power! Don’t assume that the only people who can get things done around here are the “maintainers”.

We also would love to add more “official” maintainers, so show us what you can do!

This repository uses the Kubernetes bots. See a full list of the commands here.
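
For reference, a few commonly used bot commands look like this (a partial sketch; the full list linked above is authoritative):

/assign @reviewer-handle - request a specific reviewer
/retest - re-run failed CI jobs
/lgtm - signal that the change looks good
/hold - prevent automatic merging
/hold cancel - release the hold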

Build the images locally

If you want to just build the CAPA containers locally, run

	REGISTRY=docker.io/my-reg make docker-build

Tilt-based development environment

We have support for using Tilt for rapid iterative development. Please visit the Cluster API documentation on Tilt for information on how to set up your development environment. Additionally, you must also include your base64 encoded AWS credentials in your tilt-settings.json file or you will not be able to deploy this provider.

  1. make clusterawsadm
  2. export AWS_REGION=<your desired region>
  3. ./bin/clusterawsadm alpha bootstrap encode-aws-credentials
  4. Copy the output containing the base64 encoded credentials and add it to your tilt-settings.json file like this:
{
  "allowed_contexts": ["kind-kind"],
  "default_registry": "your registry here",
  "provider_repos": ["../cluster-api-provider-aws"],
  "enable_providers": ["aws"],
  "kustomize_substitutions": {
    "AWS_B64ENCODED_CREDENTIALS": "put your encoded credentials here"
  }
}

Implementer office hours

Maintainers hold office hours every two weeks, with sessions open to all developers working on this project.

Office hours are hosted on a zoom video chat every other Monday at 10:00 (Pacific) / 13:00 (Eastern) / 18:00 (Europe/London), and are published on the Kubernetes community meetings calendar.

Other ways to communicate with the contributors

Please check in with us in the #cluster-api-aws channel on Slack.

Github issues

Bugs

If you think you have found a bug please follow the instructions below.

  • Please spend a small amount of time doing due diligence on the issue tracker. Your issue might be a duplicate.
  • Get the logs from the cluster controllers and paste them into your issue (see the example command after this list).
  • Open a new issue.
  • Remember that users might be searching for your issue in the future, so please give it a meaningful title to help others.
  • Feel free to reach out to the cluster-api community on the kubernetes slack.
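
For example, the controller logs can usually be captured with commands like these (a sketch assuming the default namespaces and deployment names created by clusterctl init):

# Logs from the AWS provider controller
kubectl logs deployment/capa-controller-manager -n capa-system --all-containers=true > capa-controller-manager.log

# Logs from the core Cluster API controller are often useful as well
kubectl logs deployment/capi-controller-manager -n capi-system --all-containers=true > capi-controller-manager.log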

Tracking new features

We also use the issue tracker to track features. If you have an idea for a feature, or think you can help cluster-api-provider-aws become even more awesome, follow the steps below.

  • Open a new issue.
  • Remember that users might be searching for your issue in the future, so please give it a meaningful title to help others.
  • Clearly define the use case, using concrete examples, e.g. “I type this and cluster-api-provider-aws does that.”
  • Some of our larger features will require some design. If you would like to include a technical design for your feature please include it in the issue.
  • After the new feature is well understood, and the design agreed upon, we can start coding the feature. We would love for you to code it. So please open up a WIP (work in progress) pull request, and happy coding.

Amazon Web Services, AWS, and the “Powered by AWS” logo materials are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries.

Getting Started

Quick Start

In this tutorial we’ll cover the basics of how to use Cluster API to create one or more Kubernetes clusters.

Installation

Common Prerequisites

Install and/or configure a kubernetes cluster

Cluster API requires an existing Kubernetes cluster accessible via kubectl; during the installation process the Kubernetes cluster will be transformed into a management cluster by installing the Cluster API provider components, so it is recommended to keep it separated from any application workload.

It is a common practice to create a temporary, local bootstrap cluster which is then used to provision a target management cluster on the selected infrastructure provider.

Choose one of the options below:

  1. Existing Management Cluster

For production use-cases a “real” Kubernetes cluster should be used with appropriate backup and DR policies and procedures in place. The Kubernetes cluster must be at least v1.19.1.

export KUBECONFIG=<...>

  2. Kind

kind can be used for creating a local Kubernetes cluster for development environments or for the creation of a temporary bootstrap cluster used to provision a target management cluster on the selected infrastructure provider.

The installation procedure depends on the version of kind; if you are planning to use the Docker infrastructure provider, please follow the additional instructions below:

Create the kind cluster:

kind create cluster

Test to ensure the local kind cluster is ready:

kubectl cluster-info

Run the following command to create a kind config file for allowing the Docker provider to access Docker on the host:

cat > kind-cluster-with-extramounts.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    - hostPath: /var/run/docker.sock
      containerPath: /var/run/docker.sock
EOF

Then, following the instructions for your kind version, create the management cluster using the above file:

kind create cluster --config kind-cluster-with-extramounts.yaml

Install clusterctl

The clusterctl CLI tool handles the lifecycle of a Cluster API management cluster.

Install clusterctl binary with curl on linux

Download the latest release; for example, to download version v0.3.14 on Linux, type:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.3.14/clusterctl-linux-amd64 -o clusterctl

Make the clusterctl binary executable.

chmod +x ./clusterctl

Move the binary into your PATH.

sudo mv ./clusterctl /usr/local/bin/clusterctl

Test to ensure the version you installed is up-to-date:

clusterctl version
Install clusterctl binary with curl on macOS

Download the latest release; for example, to download version v0.3.14 on macOS, type:

curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.3.14/clusterctl-darwin-amd64 -o clusterctl

Make the clusterctl binary executable.

chmod +x ./clusterctl

Move the binary into your PATH.

sudo mv ./clusterctl /usr/local/bin/clusterctl

Test to ensure the version you installed is up-to-date:

clusterctl version

Initialize the management cluster

Now that we’ve got clusterctl installed and all the prerequisites in place, let’s transform the Kubernetes cluster into a management cluster by using clusterctl init.

The command accepts as input a list of providers to install; when executed for the first time, clusterctl init automatically adds to the list the cluster-api core provider, and if unspecified, it also adds the kubeadm bootstrap and kubeadm control-plane providers.
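
For illustration, the defaults described above are equivalent to listing every provider explicitly; the sketch below assumes the AWS infrastructure provider used throughout this guide:

# Same effect as running "clusterctl init --infrastructure aws" on a fresh cluster
clusterctl init \
  --core cluster-api \
  --bootstrap kubeadm \
  --control-plane kubeadm \
  --infrastructure aws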

Initialization for common providers

Depending on the infrastructure provider you are planning to use, some additional prerequisites should be satisfied before getting started with Cluster API. See below for the expected settings for common providers.

Download the latest binary of clusterawsadm from the AWS provider releases and make sure to place it in your PATH. You need at least version v0.5.5 for these instructions. Instructions for older versions of clusterawsadm are available on GitHub.

The clusterawsadm command line utility assists with identity and access management (IAM) for Cluster API Provider AWS.

export AWS_REGION=us-east-1 # This is used to help encode your environment variables
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<session-token> # If you are using Multi-Factor Auth.

# The clusterawsadm utility takes the credentials that you set as environment
# variables and uses them to create a CloudFormation stack in your AWS account
# with the correct IAM resources.
clusterawsadm bootstrap iam create-cloudformation-stack

# Create the base64 encoded credentials using clusterawsadm.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile)

# Finally, initialize the management cluster
clusterctl init --infrastructure aws

See the AWS provider prerequisites document for more details.

For more information about authorization, AAD, or requirements for Azure, visit the Azure provider prerequisites document.

export AZURE_SUBSCRIPTION_ID="<SubscriptionId>"

# Create an Azure Service Principal and paste the output here
export AZURE_TENANT_ID="<Tenant>"
export AZURE_CLIENT_ID="<AppId>"
export AZURE_CLIENT_SECRET="<Password>"

# Azure cloud settings
# To use the default public cloud, otherwise set to AzureChinaCloud|AzureGermanCloud|AzureUSGovernmentCloud
export AZURE_ENVIRONMENT="AzurePublicCloud"

export AZURE_SUBSCRIPTION_ID_B64="$(echo -n "$AZURE_SUBSCRIPTION_ID" | base64 | tr -d '\n')"
export AZURE_TENANT_ID_B64="$(echo -n "$AZURE_TENANT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_ID_B64="$(echo -n "$AZURE_CLIENT_ID" | base64 | tr -d '\n')"
export AZURE_CLIENT_SECRET_B64="$(echo -n "$AZURE_CLIENT_SECRET" | base64 | tr -d '\n')"

# Finally, initialize the management cluster
clusterctl init --infrastructure azure

export DIGITALOCEAN_ACCESS_TOKEN=<your-access-token>
export DO_B64ENCODED_CREDENTIALS="$(echo -n "${DIGITALOCEAN_ACCESS_TOKEN}" | base64 | tr -d '\n')"

# Initialize the management cluster
clusterctl init --infrastructure digitalocean

The docker provider does not require additional prerequisites. You can run

clusterctl init --infrastructure docker

# Create the base64 encoded credentials by catting your credentials json.
# This command uses your environment variables and encodes
# them in a value to be stored in a Kubernetes Secret.
export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp-credentials.json | base64 | tr -d '\n' )

# Finally, initialize the management cluster
clusterctl init --infrastructure gcp

# The username used to access the remote vSphere endpoint
export VSPHERE_USERNAME="vi-admin@vsphere.local"
# The password used to access the remote vSphere endpoint
# You may want to set this in ~/.cluster-api/clusterctl.yaml so your password is not in
# bash history
export VSPHERE_PASSWORD="admin!23"

# Finally, initialize the management cluster
clusterctl init --infrastructure vsphere

For more information about prerequisites, credentials management, or permissions for vSphere, see the vSphere project.

# Initialize the management cluster
clusterctl init --infrastructure openstack

Please visit the Metal3 project.

In order to initialize the Packet Provider you have to expose the environment variable PACKET_API_KEY. This variable is used to authorize the infrastructure provider manager against the Packet API. You can retrieve your token directly from the Packet Portal.

export PACKET_API_KEY="34ts3g4s5g45gd45dhdh"

clusterctl init --infrastructure packet

The output of clusterctl init is similar to this:

Fetching providers
Installing cert-manager
Waiting for cert-manager to be available...
Installing Provider="cluster-api" Version="v0.3.0" TargetNamespace="capi-system"
Installing Provider="bootstrap-kubeadm" Version="v0.3.0" TargetNamespace="capi-kubeadm-bootstrap-system"
Installing Provider="control-plane-kubeadm" Version="v0.3.0" TargetNamespace="capi-kubeadm-control-plane-system"
Installing Provider="infrastructure-aws" Version="v0.5.0" TargetNamespace="capa-system"

Your management cluster has been initialized successfully!

You can now create your first workload cluster by running the following:

  clusterctl config cluster [name] --kubernetes-version [version] | kubectl apply -f -

Create your first workload cluster

Once the management cluster is ready, you can create your first workload cluster.

Preparing the workload cluster configuration

The clusterctl config cluster command returns a YAML template for creating a workload cluster.

Required configuration for common providers

Depending on the infrastructure provider you are planning to use, some additional prerequisites should be satisfied before configuring a cluster with Cluster API. Instructions are provided for common providers below.

Otherwise, you can look at the clusterctl config cluster command documentation for details about how to discover the list of variables required by a cluster template.

export AWS_REGION=us-east-1
export AWS_SSH_KEY_NAME=default
# Select instance types
export AWS_CONTROL_PLANE_MACHINE_TYPE=t3.large
export AWS_NODE_MACHINE_TYPE=t3.large

See the AWS provider prerequisites document for more details.

# Name of the Azure datacenter location. Change this value to your desired location.
export AZURE_LOCATION="centralus" 

# Select VM types.
export AZURE_CONTROL_PLANE_MACHINE_TYPE="Standard_D2s_v3"
export AZURE_NODE_MACHINE_TYPE="Standard_D2s_v3"

A ClusterAPI compatible image must be available in your DigitalOcean account. For instructions on how to build a compatible image see image-builder.

export DO_REGION=nyc1
export DO_SSH_KEY_FINGERPRINT=<your-ssh-key-fingerprint>
export DO_CONTROL_PLANE_MACHINE_TYPE=s-2vcpu-2gb
export DO_CONTROL_PLANE_MACHINE_IMAGE=<your-capi-image-id>
export DO_NODE_MACHINE_TYPE=s-2vcpu-2gb
export DO_NODE_MACHINE_IMAGE=<your-capi-image-id>

The docker provider does not require additional configurations for cluster templates.

However, if you require special network settings you can set the following environment variables:

# The list of service CIDR, default ["10.128.0.0/12"]
export SERVICE_CIDR=["10.96.0.0/12"]

# The list of pod CIDR, default ["192.168.0.0/16"]
export POD_CIDR=["192.168.0.0/16"]

# The service domain, default "cluster.local"
export SERVICE_DOMAIN="k8s.test"

See the GCP provider for more information.

It is required to use official CAPV machine images for your vSphere VM templates. See uploading CAPV machine images for instructions on how to do this.

# The vCenter server IP or FQDN
export VSPHERE_SERVER="10.0.0.1"
# The vSphere datacenter to deploy the management cluster on
export VSPHERE_DATACENTER="SDDC-Datacenter"
# The vSphere datastore to deploy the management cluster on
export VSPHERE_DATASTORE="vsanDatastore"
# The VM network to deploy the management cluster on
export VSPHERE_NETWORK="VM Network"
# The vSphere resource pool for your VMs
export VSPHERE_RESOURCE_POOL="*/Resources"
# The VM folder for your VMs. Set to "" to use the root vSphere folder
export VSPHERE_FOLDER="vm"
# The VM template to use for your VMs
export VSPHERE_TEMPLATE="ubuntu-1804-kube-v1.17.3"
# The VM template to use for the HAProxy load balancer of the management cluster
export VSPHERE_HAPROXY_TEMPLATE="capv-haproxy-v0.6.0-rc.2"
# The public ssh authorized key on all machines
export VSPHERE_SSH_AUTHORIZED_KEY="ssh-rsa AAAAB3N..."

clusterctl init --infrastructure vsphere

For more information about prerequisites, credentials management, or permissions for vSphere, see the vSphere getting started guide.

A ClusterAPI compatible image must be available in your OpenStack. For instructions on how to build a compatible image see image-builder. Depending on your OpenStack and underlying hypervisor, additional image-build options may be of interest.

To see all required OpenStack environment variables execute:

clusterctl config cluster --infrastructure openstack --list-variables capi-quickstart

The following script can be used to export some of them:

wget https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-openstack/master/templates/env.rc -O /tmp/env.rc
source /tmp/env.rc <path/to/clouds.yaml> <cloud>

Apart from the script, the following OpenStack environment variables are required.

# The list of nameservers for OpenStack Subnet being created.
# Set this value when you need to create a new network/subnet and access through DNS is required.
export OPENSTACK_DNS_NAMESERVERS=<dns nameserver>
# FailureDomain is the failure domain the machine will be created in.
export OPENSTACK_FAILURE_DOMAIN=<availability zone name>
# The flavor for the control plane machine instances.
export OPENSTACK_CONTROL_PLANE_MACHINE_FLAVOR=<flavor>
# The flavor for the worker node machine instances.
export OPENSTACK_NODE_MACHINE_FLAVOR=<flavor>
# The name of the image to use for your server instances. If RootVolume is specified, this is ignored and rootVolume is used directly.
export OPENSTACK_IMAGE_NAME=<image name>
# The SSH key pair name
export OPENSTACK_SSH_KEY_NAME=<ssh key pair name>

A full configuration reference can be found in configuration.md.

# The URL of the kernel to deploy.
export DEPLOY_KERNEL_URL="http://172.22.0.1:6180/images/ironic-python-agent.kernel"
# The URL of the ramdisk to deploy.
export DEPLOY_RAMDISK_URL="http://172.22.0.1:6180/images/ironic-python-agent.initramfs"
# The URL of the Ironic endpoint.
export IRONIC_URL="http://172.22.0.1:6385/v1/"
# The URL of the Ironic inspector endpoint.
export IRONIC_INSPECTOR_URL="http://172.22.0.1:5050/v1/"
# Do not use a dedicated CA certificate for Ironic API. Any value provided in this variable disables additional CA certificate validation.
# To provide a CA certificate, leave this variable unset. If unset, then IRONIC_CA_CERT_B64 must be set.
export IRONIC_NO_CA_CERT=true
# Disables basic authentication for Ironic API. Any value provided in this variable disables authentication.
# To enable authentication, leave this variable unset. If unset, then IRONIC_USERNAME and IRONIC_PASSWORD must be set.
export IRONIC_NO_BASIC_AUTH=true
# Disables basic authentication for Ironic inspector API. Any value provided in this variable disables authentication.
# To enable authentication, leave this variable unset. If unset, then IRONIC_INSPECTOR_USERNAME and IRONIC_INSPECTOR_PASSWORD must be set.
export IRONIC_INSPECTOR_NO_BASIC_AUTH=true

Please visit the Metal3 getting started guide for more details.

There are a couple of required environment variables that you have to expose in order to get a well-tuned and functional workload cluster; they are all listed here:

# The project where your cluster will be placed.
# You have to get one from the Packet Portal if you do not have one already.
export PROJECT_ID="5yd4thd-5h35-5hwk-1111-125gjej40930"
# The facility where you want your cluster to be provisioned
export FACILITY="ewr1"
# The operating system used to provision the device
export NODE_OS="ubuntu_18_04"
# The ssh key name you loaded in Packet Portal
export SSH_KEY="my-ssh"
export POD_CIDR="192.168.0.0/16"
export SERVICE_CIDR="172.26.0.0/16"
export CONTROLPLANE_NODE_TYPE="t1.small"
export WORKER_NODE_TYPE="t1.small"

Generating the cluster configuration

For the purpose of this tutorial, we’ll name our cluster capi-quickstart.

clusterctl config cluster capi-quickstart \
  --kubernetes-version v1.19.7 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart.yaml

If you are using the Docker provider, use the development flavor instead:

clusterctl config cluster capi-quickstart --flavor development \
  --kubernetes-version v1.19.7 \
  --control-plane-machine-count=3 \
  --worker-machine-count=3 \
  > capi-quickstart.yaml

This creates a YAML file named capi-quickstart.yaml with a predefined list of Cluster API objects; Cluster, Machines, Machine Deployments, etc.

The file can then be modified using your editor of choice.

See clusterctl config cluster for more details.
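
If you want to see which variables a template expects before generating it, clusterctl can list them; a sketch for the AWS provider used in this guide:

clusterctl config cluster capi-quickstart --infrastructure aws --list-variables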

Apply the workload cluster

When ready, run the following command to apply the cluster manifest.

kubectl apply -f capi-quickstart.yaml

The output is similar to this:

cluster.cluster.x-k8s.io/capi-quickstart created
awscluster.infrastructure.cluster.x-k8s.io/capi-quickstart created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/capi-quickstart-control-plane created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-control-plane created
machinedeployment.cluster.x-k8s.io/capi-quickstart-md-0 created
awsmachinetemplate.infrastructure.cluster.x-k8s.io/capi-quickstart-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/capi-quickstart-md-0 created

Accessing the workload cluster

The cluster will now start provisioning. You can check status with:

kubectl get cluster --all-namespaces

You can also get an “at a glance” view of the cluster and its resources by running:

clusterctl describe cluster capi-quickstart

To verify the first control plane is up:

kubectl get kubeadmcontrolplane --all-namespaces

You should see output similar to this:

NAME                            INITIALIZED   API SERVER AVAILABLE   VERSION   REPLICAS   READY   UPDATED   UNAVAILABLE
capi-quickstart-control-plane   true                                 v1.19.7   3                  3         3

After the first control plane node is up and running, we can retrieve the workload cluster Kubeconfig:

clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig

Deploy a CNI solution

Calico is used here as an example.

kubectl --kubeconfig=./capi-quickstart.kubeconfig \
  apply -f https://docs.projectcalico.org/v3.15/manifests/calico.yaml

After a short while, our nodes should be running and in the Ready state. Let’s check the status using kubectl get nodes:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes

Azure does not currently support Calico networking. As a workaround, it is recommended that Azure clusters use the Calico spec below that uses VXLAN.

kubectl --kubeconfig=./capi-quickstart.kubeconfig \
  apply -f https://raw.githubusercontent.com/kubernetes-sigs/cluster-api-provider-azure/master/templates/addons/calico.yaml

After a short while, our nodes should be running and in the Ready state. Let’s check the status using kubectl get nodes:

kubectl --kubeconfig=./capi-quickstart.kubeconfig get nodes

Clean Up

Delete workload cluster.

kubectl delete cluster capi-quickstart

Delete management cluster

kind delete cluster

Next steps

See the clusterctl documentation for more detail about clusterctl supported actions.

Latest pre-built Kubernetes AMIs

Kubernetes Version v1.20.4

Amazon Linux 2

Region: AMI
ap-northeast-1: ami-0def44a691434d7c3
ap-northeast-2: ami-0604ebf2cebc52995
ap-south-1: ami-05e601064d7033e76
ap-southeast-1: ami-0bfcac54fb085f809
ap-southeast-2: ami-01ba92352cf9075f4
ca-central-1: ami-0ccf0c8d0c59d4a27
eu-central-1: ami-049d79fa0cf66257f
eu-west-1: ami-08e8244d19cb62cac
eu-west-2: ami-06e5dcc2d5278d682
eu-west-3: ami-0cb08059dd2040f64
sa-east-1: ami-0f0124b2800f53418
us-east-1: ami-08905f968631c22b3
us-east-2: ami-02baa4365d6a5bc66
us-west-1: ami-0b19284186e2291c0
us-west-2: ami-0f11ee437161fe76d

CentOS 7

Region: AMI
ap-northeast-1: ami-0616bf1ef2dc5b197
ap-northeast-2: ami-00fa527176737283f
ap-south-1: ami-0b30ce5687cbcb96f
ap-southeast-1: ami-0c8e6c81f1a3a7ac7
ap-southeast-2: ami-0ff9497b75a538ba8
ca-central-1: ami-091295205e8b4aed7
eu-central-1: ami-07ceef034e6fb62df
eu-west-1: ami-0812bd756c4fb1e28
eu-west-2: ami-06bdf28632d513c2e
eu-west-3: ami-06eeee21777522915
sa-east-1: ami-036abb7c171299d8a
us-east-1: ami-016e443642be37121
us-east-2: ami-0e81eef1402a132f1
us-west-1: ami-0b05378134bc1d66d
us-west-2: ami-08a827f35e91ca2d1

Ubuntu 20.04 (Focal)

Region: AMI
ap-northeast-1: ami-05acd50fded8a7c2a
ap-northeast-2: ami-04115fa9e4a9ba123
ap-south-1: ami-06a3cc1a9527db17e
ap-southeast-1: ami-08a20be4d8b2a034e
ap-southeast-2: ami-04c4c73158a0f9ce5
ca-central-1: ami-0fa3f97b0146d06f0
eu-central-1: ami-0120656d38c206057
eu-west-1: ami-02f08cd0f9a522739
eu-west-2: ami-0b51e520f5e410e28
eu-west-3: ami-01e1943a123743403
sa-east-1: ami-0cb790ab627374bce
us-east-1: ami-03c9c415c7c96958a
us-east-2: ami-00acf6686522b85ba
us-west-1: ami-0955cbe19b4cea0ce
us-west-2: ami-07efc50f02876718b

Ubuntu 18.04 (Bionic)

Region: AMI
ap-northeast-1: ami-02242facc5a54b077
ap-northeast-2: ami-0b2419e34eaa5389a
ap-south-1: ami-06112ca538b75cc75
ap-southeast-1: ami-0ada1b518dea7fdb4
ap-southeast-2: ami-01a384dd057152631
ca-central-1: ami-0e6539b37e7ff4bdb
eu-central-1: ami-0f495e340d45c48ff
eu-west-1: ami-097454f2cd527b447
eu-west-2: ami-026133fbcfc7b0db4
eu-west-3: ami-06321b8c8c1f0d7be
sa-east-1: ami-0be5ad36fb7fe3743
us-east-1: ami-080e26cb951e723e7
us-east-2: ami-054f48401fa61bd52
us-west-1: ami-01059c7e7df11e9b3
us-west-2: ami-09709369c53539cd8

Kubernetes Version v1.19.8

Amazon Linux 2

Region: AMI
ap-northeast-1: ami-02cf26d2686f8c79f
ap-northeast-2: ami-0d516d2ceb6a26a04
ap-south-1: ami-0058d9a144cc247a3
ap-southeast-1: ami-07ed0ea92d776999b
ap-southeast-2: ami-0c4c3c408ddf1947c
ca-central-1: ami-08d8c1f91a44e7dd9
eu-central-1: ami-0e9d647207cf50c07
eu-west-1: ami-0647b1a7a657b2e18
eu-west-2: ami-0d27638fb02353644
eu-west-3: ami-02ccf1c17a8d2dd61
sa-east-1: ami-0473a5c340edadd9f
us-east-1: ami-000724a0096f5690d
us-east-2: ami-0dbc5c3fad65d4026
us-west-1: ami-04df9335b4177f77f
us-west-2: ami-03b938bf470e071f5

CentOS 7

Region: AMI
ap-northeast-1: ami-0602f7c15d452b084
ap-northeast-2: ami-06f0203ab4985a061
ap-south-1: ami-056b6be53e1673f00
ap-southeast-1: ami-0f2abc390f3cc37c3
ap-southeast-2: ami-03b76c5cff8c9543b
ca-central-1: ami-0539f69a258083787
eu-central-1: ami-04ddd2b272e754813
eu-west-1: ami-0737cf945d555a2c2
eu-west-2: ami-0af619a6c72f31705
eu-west-3: ami-00aea2da691a90a89
sa-east-1: ami-08e43c849bbd2fe1b
us-east-1: ami-0f10d5ca925f07915
us-east-2: ami-0a24cbaeddca203f9
us-west-1: ami-08a8409d27cd43317
us-west-2: ami-01efcad1dc2345d87

Ubuntu 20.04 (Focal)

Region: AMI
ap-northeast-1: ami-061719150e5ce6e0c
ap-northeast-2: ami-075fcefa4013e9ec6
ap-south-1: ami-0bb69bc8736652c7a
ap-southeast-1: ami-06b8a983c34593b40
ap-southeast-2: ami-00cffe0e16ab49d51
ca-central-1: ami-02f6ff996b8bd19e3
eu-central-1: ami-0aa4c3858be181d12
eu-west-1: ami-0f8d83c665566da5c
eu-west-2: ami-02ec375fddfeefe3a
eu-west-3: ami-0d7fd58339eaa6083
sa-east-1: ami-0a5bf7346193b1744
us-east-1: ami-04c8aee56d8153e4c
us-east-2: ami-00cc08d58fa188ba0
us-west-1: ami-0b3a158087e24843e
us-west-2: ami-0bde915edf19035a0

Ubuntu 18.04 (Bionic)

Region: AMI
ap-northeast-1: ami-0568dc35ab202a316
ap-northeast-2: ami-0efac49b4dbd61159
ap-south-1: ami-038cf93dfdad687e0
ap-southeast-1: ami-0bcb23cba283c4f6f
ap-southeast-2: ami-042683a65be8ef0dc
ca-central-1: ami-08d734ecd2e077b5f
eu-central-1: ami-0fb796a4d4da77ad0
eu-west-1: ami-03c1953f8361ff293
eu-west-2: ami-061cb7063d48a11bf
eu-west-3: ami-03aa887d1292b6897
sa-east-1: ami-0f5364b94c99eb021
us-east-1: ami-09aa9b2163b593b8e
us-east-2: ami-08c15ab3b5395d2bc
us-west-1: ami-03c5e8f32d9089deb
us-west-2: ami-0fa190928c0648a62

Kubernetes Version v1.18.16

Amazon Linux 2

Region: AMI
ap-northeast-1: ami-0d7a5c0808f26dd0f
ap-northeast-2: ami-0345ba93b4cd610a4
ap-south-1: ami-0f2239705635a6d8b
ap-southeast-1: ami-060b86574ad80ac04
ap-southeast-2: ami-023dcb7fbf7e402e7
ca-central-1: ami-0885faacc312a00ce
eu-central-1: ami-070a86cf4fe41873b
eu-west-1: ami-02fd28242bf5fe2ec
eu-west-2: ami-08437730ef5e8f919
eu-west-3: ami-0fd3be437259d34ac
sa-east-1: ami-08913d36238896c64
us-east-1: ami-080c60635607c7664
us-east-2: ami-012b7bf19c4fcf5c8
us-west-1: ami-0ac23854530e7a8c3
us-west-2: ami-0b8cec069bfae329f

CentOS 7

Region: AMI
ap-northeast-1: ami-0ab1d2feb3c042753
ap-northeast-2: ami-00392f7971bd3d387
ap-south-1: ami-037584b12bbae73d1
ap-southeast-1: ami-0f0813b217c1639ab
ap-southeast-2: ami-039f65a7090072cdb
ca-central-1: ami-0baf640154b0f128f
eu-central-1: ami-042e441b57f901c25
eu-west-1: ami-07650d3235dfa2909
eu-west-2: ami-0a48a16a85ba93326
eu-west-3: ami-00db0b58274413972
sa-east-1: ami-0cdb785e3779576bb
us-east-1: ami-0f5939990b35510e8
us-east-2: ami-0e38219a892e50e1b
us-west-1: ami-091742ea32cc34851
us-west-2: ami-04c68e7d7e0154c85

Ubuntu 20.04 (Focal)

Region: AMI
ap-northeast-1: ami-098148c4fcbcd3213
ap-northeast-2: ami-056cd7a7c1220e729
ap-south-1: ami-0a1d14f7184a7130f
ap-southeast-1: ami-0e5187c87c6444c56
ap-southeast-2: ami-03509c7e5cd98ede8
ca-central-1: ami-0cd5b1b01196e966a
eu-central-1: ami-01f62626af7d7e468
eu-west-1: ami-0aacd1397416dc0ac
eu-west-2: ami-060820b781091cf23
eu-west-3: ami-00d64e998f42f2a34
sa-east-1: ami-098501bae6a5178b6
us-east-1: ami-0e09941ec40c8f63b
us-east-2: ami-0af2e5b94595b5ef8
us-west-1: ami-043259f5acf92db63
us-west-2: ami-0063c64fcdfec1e28

Ubuntu 18.04 (Bionic)

Region: AMI
ap-northeast-1: ami-09adde9fc807af422
ap-northeast-2: ami-071523d99a8c9ceee
ap-south-1: ami-0deaf555aa94b4f97
ap-southeast-1: ami-09d3b16312940ed24
ap-southeast-2: ami-0ff66cc0d33edef54
ca-central-1: ami-07a46798ee2903537
eu-central-1: ami-0176bc4abb71c5ecd
eu-west-1: ami-01226b4dbe0586f72
eu-west-2: ami-0ea2b4a4f4fa989fc
eu-west-3: ami-07c3c62775b9af703
sa-east-1: ami-080f5ca70c4d75d41
us-east-1: ami-07addade84fbd7ed9
us-east-2: ami-0bd86aaf2f5e30b8e
us-west-1: ami-045c3d251eccf8a99
us-west-2: ami-0e8f053559e14e00d

Kubernetes Version v1.17.17

Amazon Linux 2

Region: AMI
ap-northeast-1: ami-0e3fbdba7cec45dff
ap-northeast-2: ami-0d695ebf09a71df92
ap-south-1: ami-077a0fce723f359a8
ap-southeast-1: ami-05dd874bce2f83e53
ap-southeast-2: ami-0f8c75ac803fc626b
ca-central-1: ami-02e137c33666ca8ca
eu-central-1: ami-0a6919a059d9ded92
eu-west-1: ami-07ea4a66ad4399df2
eu-west-2: ami-024f4fbb3076eb49b
eu-west-3: ami-0fac0f4c0512a2a3e
sa-east-1: ami-0de58cc202d7e9d17
us-east-1: ami-031af266ccc130930
us-east-2: ami-00f8cf5a27b695f37
us-west-1: ami-04a7564950a2428b1
us-west-2: ami-0a813ef7e14f1b935

CentOS 7

Region: AMI
ap-northeast-1: ami-0e1562b121e320890
ap-northeast-2: ami-048b0d6dbb5b8ca59
ap-south-1: ami-032b822555b256a8c
ap-southeast-1: ami-0fb72ff3997c8230e
ap-southeast-2: ami-0e6d3127fbd7d831e
ca-central-1: ami-01454c116dae88214
eu-central-1: ami-07ee31f8ca5b42863
eu-west-1: ami-041d71c00024886a7
eu-west-2: ami-04abd1f37e8e224da
eu-west-3: ami-0e9b0a58d83b9d208
sa-east-1: ami-02308901ed8d001e8
us-east-1: ami-05c2398beaf33aa62
us-east-2: ami-00d3c6b34f27d453e
us-west-1: ami-0729248c323398659
us-west-2: ami-05b01a5380daf31f2

Ubuntu 20.04 (Focal)

Region: AMI
ap-northeast-1: ami-09b75dd5008e2917b
ap-northeast-2: ami-0be2848ce73d18b6b
ap-south-1: ami-0231b4074465ac01c
ap-southeast-1: ami-00008b47fc30163ae
ap-southeast-2: ami-014aa6fd56948ce00
ca-central-1: ami-0acbdea18f9b3c188
eu-central-1: ami-093852c891470f525
eu-west-1: ami-0782bf1383e67b4a8
eu-west-2: ami-06b02324e96e1af50
eu-west-3: ami-0fa68d082765945e0
sa-east-1: ami-0d036f35ed0e6f37d
us-east-1: ami-0e440ffd0fe3351ea
us-east-2: ami-02c4ef6e5c805757b
us-west-1: ami-0d15c23018d576791
us-west-2: ami-0259aac1a60636b2f

Ubuntu 18.04 (Bionic)

Region: AMI
ap-northeast-1: ami-038683dcad6632ce8
ap-northeast-2: ami-05c9b21ce0684bc95
ap-south-1: ami-0feb6b56dc573ab26
ap-southeast-1: ami-0f34a91ba35aa1407
ap-southeast-2: ami-061fddc1a1b582364
ca-central-1: ami-00fe729dda74c66e9
eu-central-1: ami-0163c0d6d6a79f4e5
eu-west-1: ami-01477ccbb189b5ca4
eu-west-2: ami-056328349e02485a9
eu-west-3: ami-014eefaef56a19184
sa-east-1: ami-0b12b5f398eb0ca90
us-east-1: ami-0a76db8d461a104fe
us-east-2: ami-05cc942e982c6b109
us-west-1: ami-047fb570fac2c266d
us-west-2: ami-026fbe0dd7751428c

Topics

Using clusterawsadm to fulfill prerequisites

Requirements

  • Linux or macOS (Windows isn’t supported at the moment).
  • AWS credentials.
  • AWS CLI
  • jq

IAM resources

With clusterawsadm

Get the latest clusterawsadm and place it in your path.

Cluster API Provider AWS ships with clusterawsadm, a utility to help you manage IAM objects for this project.

In order to use clusterawsadm you must have an administrative user in an AWS account. Once you have that administrator user you need to set your environment variables:

  • AWS_REGION
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN (if you are using Multi-factor authentication)

After these are set run this command to get you up and running:

clusterawsadm bootstrap iam create-cloudformation-stack

Additional policies can be added by creating a configuration file

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  controlPlane:
    ExtraPolicyAttachments:
      - arn:aws:iam::<AWS_ACCOUNT>:policy/my-policy
      - arn:aws:iam::aws:policy/AmazonEC2FullAccess
  nodes:
    ExtraPolicyAttachments:
      - arn:aws:iam::<AWS_ACCOUNT>:policy/my-other-policy

and passing it to clusterawsadm as follows

clusterawsadm bootstrap iam create-cloudformation-stack --config bootstrap-config.yaml

These will be added to the control plane and node roles respectively when they are created.

Note: If you used the now deprecated clusterawsadm alpha bootstrap 0.5.4 or earlier to create IAM objects for the Cluster API Provider for AWS, using clusterawsadm bootstrap iam 0.5.5 or later will, by default, remove the bootstrap user and group. Anything using those credentials to authenticate will start experiencing authentication failures. If you rely on the bootstrap user and group credentials, specify bootstrapUser.enable = true in the configuration file, like this:

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  bootstrapUser:
    enable: true

With EKS Support

If you want to use the EKS support in the provider, then you will need to enable these features via the configuration file. For example:

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  eks:
    enable: true
    iamRoleCreation: false # Set to true if you plan to use the EKSEnableIAM feature flag to enable automatic creation of IAM roles
    defaultControlPlaneRole:
      disable: false # Set to false to enable creation of the default control plane role
    managedMachinePool:
      disable: false # Set to false to enable creation of the default node role for managed machine pools

and then use that configuration file:

clusterawsadm bootstrap iam create-cloudformation-stack --config bootstrap-config.yaml

Enabling EventBridge Events

To enable EventBridge instance state events, additional permissions must be granted along with enabling the feature-flag. Additional permissions for events and queue management can be enabled through the configuration file as follows:

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  ...
  eventBridge:
    enable: true
  ...

Without clusterawsadm

This is not a recommended route as the policies are very specific and will change with new features.

If you do not wish to use the clusterawsadm tool, then you will need to understand exactly which IAM policies and groups we are expecting. There are several policies, roles and users that need to be created. Please see our controller policy file to understand the permissions that are necessary.

You can use clusterawsadm to print out the needed IAM policies, e.g.

clusterawsadm bootstrap iam print-policy --document AWSIAMManagedPolicyControllers --config bootstrap-config.yaml

SSH Key pair

If you plan to use SSH to access the instances created by Cluster API Provider AWS then you will need to specify the name of an existing SSH key pair within the region you plan on using. If you don’t have one yet, a new one needs to be created.

Create a new key pair

# Save the output to a secure location
aws ec2 create-key-pair --key-name default | jq .KeyMaterial -r
-----BEGIN RSA PRIVATE KEY-----
[... contents omitted ...]
-----END RSA PRIVATE KEY-----

If you want to save the private key directly into AWS Systems Manager Parameter Store with KMS encryption for security, you can use the following command:

aws ssm put-parameter --name "/sigs.k8s.io/cluster-api-provider-aws/ssh-key" \
  --type SecureString \
  --value "$(aws ec2 create-key-pair --key-name default | jq .KeyMaterial -r)"

Adding an existing public key to AWS

# Replace with your own public key
aws ec2 import-key-pair \
  --key-name default \
  --public-key-material "$(cat ~/.ssh/id_rsa.pub)"

NB: Only RSA keys are supported by AWS.

Setting up the environment

The current iteration of the Cluster API Provider AWS relies on credentials being present in your environment. These then get written into the cluster manifests for use by the controllers.

E.g.

export AWS_REGION=us-east-1 # This is used to help encode your environment variables
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_SESSION_TOKEN=<session-token> # If you are using Multi-Factor Auth.

Note: The credentials used must have the appropriate permissions for use by the controllers. You can get the required policy statement by using the following command:

clusterawsadm bootstrap iam print-policy --document AWSIAMManagedPolicyControllers --config bootstrap-config.yaml

To save credentials securely in your environment, aws-vault uses the OS keystore as permanent storage, and offers shell features to securely expose and set up local AWS environments.
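
For example (a sketch assuming an aws-vault profile named capa has already been configured), credentials can be exposed to a single command without exporting them into your shell:

# Run clusterawsadm with temporary credentials pulled from the OS keystore
aws-vault exec capa -- clusterawsadm bootstrap credentials encode-as-profile

# Or open a subshell with the credentials exported
aws-vault exec capa -- bash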

Accessing cluster instances

Overview

After running clusterctl config cluster to generate the configuration for a new workload cluster (and then redirecting that output to a file for use with kubectl apply, or piping it directly to kubectl apply), the new workload cluster will be deployed. This document explains how to access the new workload cluster’s nodes.

Prerequisites

  1. clusterctl config cluster was successfully executed to generate the configuration for a new workload cluster
  2. The configuration for the new workload cluster was applied to the management cluster using kubectl apply and the cluster is up and running in an AWS environment.
  3. The SSH key referenced by clusterctl in step 1 exists in AWS and is stored in the correct location locally for use by SSH (on macOS/Linux systems, this is typically $HOME/.ssh). This document will refer to this key as cluster-api-provider-aws.sigs.k8s.io.
  4. (If using AWS Session Manager) The AWS CLI and the Session Manager plugin have been installed and configured.

Methods for accessing nodes

There are two ways to access cluster nodes once the workload cluster is up and running:

  • via SSH
  • via AWS Session Manager

Accessing nodes via SSH

By default, workload clusters created in AWS will not support access via SSH apart from AWS Session Manager (see the section titled “Accessing nodes via AWS Session Manager”). However, the manifest for a workload cluster can be modified to include an SSH bastion host, created and managed by the management cluster, to enable SSH access to cluster nodes. The bastion node is created in a public subnet and provides SSH access from the world. It runs the official Ubuntu Linux image.

Enabling the bastion host

To configure the Cluster API Provider for AWS to create an SSH bastion host, add this line to the AWSCluster spec:

spec:
  bastion:
    enabled: true
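
In context, a minimal AWSCluster manifest with the bastion enabled might look like the sketch below; the name, region, and key name are placeholders borrowed from earlier examples:

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster
metadata:
  name: capi-quickstart
spec:
  region: us-east-1
  sshKeyName: default
  bastion:
    enabled: true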

Obtain public IP address of the bastion node

Once the workload cluster is up and running after being configured for an SSH bastion host, you can use the kubectl get awscluster command to look up the public IP address of the bastion host (make sure the kubectl context is set to the management cluster). The output will look something like this:

NAME   CLUSTER   READY   VPC                     BASTION IP
test   test      true    vpc-1739285ed052be7ad   1.2.3.4

Setting up the SSH key path

Assuming that the cluster-api-provider-aws.sigs.k8s.io SSH key is stored in $HOME/.ssh/cluster-api-provider-aws, use this command to set up an environment variable for use in a later command:

export CLUSTER_SSH_KEY=$HOME/.ssh/cluster-api-provider-aws

Get private IP addresses of nodes in the cluster

To get the private IP addresses of nodes in the cluster (nodes may be control plane nodes or worker nodes), use this kubectl command with the context set to the management cluster:

kubectl get nodes -o custom-columns=NAME:.metadata.name,\
IP:"{.status.addresses[?(@.type=='InternalIP')].address}"

This will produce output that looks like this:

NAME                                         IP
ip-10-0-0-16.us-west-2.compute.internal   10.0.0.16
ip-10-0-0-68.us-west-2.compute.internal   10.0.0.68

The above command returns IP addresses of the nodes in the cluster. In this case, the values returned are 10.0.0.16 and 10.0.0.68.

Connecting to the nodes via SSH

To access one of the nodes (either a control plane node or a worker node) via the SSH bastion host, use this command if you are using a non-EKS cluster:

ssh -i ${CLUSTER_SSH_KEY} ubuntu@<NODE_IP> \
	-o "ProxyCommand ssh -W %h:%p -i ${CLUSTER_SSH_KEY} ubuntu@${BASTION_HOST}"

And use this command if you are using an EKS-based cluster:

ssh -i ${CLUSTER_SSH_KEY} ec2-user@<NODE_IP> \
	-o "ProxyCommand ssh -W %h:%p -i ${CLUSTER_SSH_KEY} ubuntu@${BASTION_HOST}"

If the whole document is followed, the value of <NODE_IP> will be either 10.0.0.16 or 10.0.0.68.

Alternately, users can add a configuration stanza to their SSH configuration file (typically found on macOS/Linux systems as $HOME/.ssh/config):

Host 10.0.*
  User ubuntu
  IdentityFile <CLUSTER_SSH_KEY>
  ProxyCommand ssh -W %h:%p ubuntu@<BASTION_HOST>

Accessing nodes via AWS Session Manager

All CAPA-published AMIs based on Ubuntu have the AWS SSM Agent pre-installed (as a Snap package; this was added in June 2018 to the base Ubuntu Server image for all 16.04 and later AMIs). This allows users to access cluster nodes directly, without the need for an SSH bastion host, using the AWS CLI and the Session Manager plugin.

To access a cluster node (control plane node or worker node), you’ll need the instance ID. You can retrieve the instance ID using this kubectl command with the context set to the management cluster:

kubectl get awsmachines -o custom-columns=NAME:.metadata.name,INSTANCEID:.spec.providerID

This will produce output similar to this:

NAME                      INSTANCEID
test-controlplane-52fhh   aws:////i-112bac41a19da1819
test-controlplane-lc5xz   aws:////i-99aaef2381ada9228

Users can then use the instance ID (everything after the aws://// prefix) to connect to the cluster node with this command:

aws ssm start-session --target <INSTANCE_ID>

This will log you into the cluster node as the ssm-user user ID.

Additional Notes

Using the AWS CLI instead of kubectl

It is also possible to use AWS CLI commands instead of kubectl to gather information about the cluster nodes.

For example, to use the AWS CLI to get the public IP address of the SSH bastion host, use this AWS CLI command:

export BASTION_HOST=$(aws ec2 describe-instances --filter='Name=tag:Name,Values=<CLUSTER_NAME>-bastion' \
	| jq '.Reservations[].Instances[].PublicIpAddress' -r)

You should substitute the correct cluster name for <CLUSTER_NAME> in the above command. (NOTE: If make manifests was used to generate manifests, by default the <CLUSTER_NAME> is set to test1.)

Similarly, to obtain the list of private IP addresses of the cluster nodes, use this AWS CLI command:

for type in control-plane node
do
	aws ec2 describe-instances \
    --filter="Name=tag:sigs.k8s.io/cluster-api-provider-aws/role,\
    Values=${type}" \
		| jq '.Reservations[].Instances[].PrivateIpAddress' -r
done
10.0.0.16
10.0.0.68

Finally, to obtain AWS instance IDs for cluster nodes, you can use this AWS CLI command:

for type in control-plane node
do
	aws ec2 describe-instances \
    --filter="Name=tag:sigs.k8s.io/cluster-api-provider-aws/role,\
    Values=${type}" \
		| jq '.Reservations[].Instances[].InstanceId' -r
done
i-112bac41a19da1819
i-99aaef2381ada9228

Note that your AWS CLI must be configured with credentials that enable you to query the AWS EC2 API.

MachinePools

  • Feature status: Experimental
  • Feature gate: MachinePool=true

MachinePool allows users to manage many machines as a single entity. Infrastructure providers implement a separate CRD that handles the infrastructure side of the feature.

AWSMachinePool

Cluster API Provider AWS (CAPA) has experimental support for MachinePool through the infrastructure type AWSMachinePool. An AWSMachinePool corresponds to an AWS Auto Scaling Group, which provides the cloud-provider-specific resource for orchestrating a group of EC2 machines.

The AWSMachinePool controller creates and manages an AWS AutoScaling Group using launch templates so users don’t have to manage individual machines. You can use Autoscaling health checks for replacing instances and it will maintain the number of instances specified.

Using clusterctl to deploy

To deploy a MachinePool / AWSMachinePool via clusterctl config, there is a flavor for that.

Make sure to set up your AWS environment as described here.

export EXP_MACHINE_POOL=true
clusterctl init --infrastructure aws
clusterctl config cluster my-cluster --kubernetes-version v1.16.8 --flavor machinepool > my-cluster.yaml

The template used for this flavor is located here.

AWSManagedMachinePool

Cluster API Provider AWS (CAPA) has experimental support for EKS Managed Node Groups using MachinePool through the infrastructure type AWSManagedMachinePool. An AWSManagedMachinePool corresponds to an AWS Auto Scaling Group that is used for an EKS managed node group.

The AWSManagedMachinePool controller creates and manages an EKS managed node group, which in turn manages an AWS Auto Scaling Group of managed EC2 instances.

To use managed machine pools, certain IAM permissions are needed. The easiest way to ensure the required IAM permissions are in place is to use clusterawsadm to create them. To do this, follow the EKS instructions in using clusterawsadm to fulfill prerequisites.

Using clusterctl to deploy

To deploy an EKS managed node group using AWSManagedMachinePool via clusterctl config you can use a flavor.

Make sure to set up your AWS environment as described here.

export EXP_MACHINE_POOL=true
clusterctl init --infrastructure aws
clusterctl config cluster my-cluster --kubernetes-version v1.16.8 --flavor eks-managedmachinepool > my-cluster.yaml

The template used for this flavor is located here.

Examples

Example MachinePool, AWSMachinePool and KubeadmConfig Resources

Below is an example of the resources needed to create a pool of EC2 machines orchestrated with an AWS Auto Scaling Group.

---
apiVersion: exp.cluster.x-k8s.io/v1alpha3
kind: MachinePool
metadata:
  name: capa-mp-0
spec:
  clusterName: capa
  replicas: 2
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfig
          name: capa-mp-0
      clusterName: capa
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: AWSMachinePool
        name: capa-mp-0
      version: v1.16.8
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSMachinePool
metadata:
  name: capa-mp-0
spec:
  minSize: 1
  maxSize: 10
  availabilityZones:
    - us-east-1
  awsLaunchTemplate:
    instanceType: "${AWS_CONTROL_PLANE_MACHINE_TYPE}"
    sshKeyName: "${AWS_SSH_KEY_NAME}"
  subnets:
  - ${AWS_SUBNET}
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfig
metadata:
  name: capa-mp-0
  namespace: default
spec:
  joinConfiguration:
    nodeRegistration:
      name: '{{ ds.meta_data.local_hostname }}'
      kubeletExtraArgs:
        cloud-provider: aws

EKS Support in the AWS Provider

  • Feature status: Experimental
  • Feature gate (required): EKS=true
  • Feature gate (optional): EKSEnableIAM=true,EKSAllowAddRoles=true

Overview

Experimental support for EKS has been introduced in the AWS provider. The implementation introduces the following new CRD kinds:

  • AWSManagedControlPlane - specifies the EKS cluster in AWS and is used by the Cluster API AWS Managed Control Plane (MACP)
  • AWSManagedCluster - holds details of the EKS cluster for use by CAPI
  • AWSManagedMachinePool - defines the managed node pool for the cluster
  • EKSConfig - used by Cluster API bootstrap provider EKS (CABPE)

And a number of new templates are available in the templates folder for creating a managed workload cluster.

Prerequisites

To use EKS you must give the controller the required permissions. The easiest way to do this is by using clusterawsadm. For instructions on how to do this see the prerequisites.

When using clusterawsadm and enabling EKS support a new IAM role will be created for you called eks-controlplane.cluster-api-provider-aws.sigs.k8s.io. This role is the IAM role that will be used for the EKS control plane if you don’t specify your own role and if EKSEnableIAM isn’t enabled.

Additionally using clusterawsadm will add permissions to the controllers.cluster-api-provider-aws.sigs.k8s.io policy for EKS to function properly.

Enabling EKS Support

You must explicitly enable the EKS support in the provider by doing the following:

  • Enabling support in the infrastructure manager (capa-controller-manager) by enabling the EKS feature flags (see below)
  • Add the EKS Control Plane Provider (aws-eks)
  • Add the EKS Bootstrap Provider (aws-eks)

Enabling the EKS features

Enabling the EKS functionality is done using the following feature flags:

  • EKS - this enables the core EKS functionality and is required for the other EKS feature flags
  • EKSEnableIAM - by enabling this, the controllers will create any IAM roles required by EKS, and the roles will be cluster-specific. If this isn’t enabled, you can manually create a role and specify the role name in the AWSManagedControlPlane spec; otherwise the default role name will be used.
  • EKSAllowAddRoles - by enabling this you can add additional roles to the control plane role that is created. This has no effect unless used with EKSEnableIAM.

Enabling the feature flags can be done using clusterctl by setting the following environment variables to true (they all default to false):

  • EXP_EKS - this is used to set the value of the EKS feature flag
  • EXP_EKS_IAM - this is used to set the value of the EKSEnableIAM feature flag
  • EXP_EKS_ADD_ROLES - this is used to set the value of the EKSAllowAddRoles feature flag

As an example:

export EXP_EKS=true
export EXP_EKS_IAM=true
export EXP_EKS_ADD_ROLES=true

clusterctl init --infrastructure=aws --control-plane aws-eks --bootstrap aws-eks

Creating an EKS cluster

New “eks” cluster templates have been created that you can use with clusterctl to create an EKS cluster. To create an EKS cluster with self-managed nodes (a.k.a. machines):

clusterctl config cluster capi-eks-quickstart --flavor eks --kubernetes-version v1.17.3 --worker-machine-count=3 > capi-eks-quickstart.yaml

To create an EKS cluster with a managed node group (a.k.a. managed machine pool):

clusterctl config cluster capi-eks-quickstart --flavor eks-managedmachinepool --kubernetes-version v1.17.3 --worker-machine-count=3 > capi-eks-quickstart.yaml

NOTE: When creating an EKS cluster only the MAJOR.MINOR of the --kubernetes-version is taken into consideration.

Kubeconfig

When creating an EKS cluster, 2 kubeconfigs are generated and stored as secrets in the management cluster. This is different from when you create a non-managed cluster using the AWS provider.

User kubeconfig

This should be used by users that want to connect to the newly created EKS cluster. The name of the secret that contains the kubeconfig will be [cluster-name]-user-kubeconfig where you need to replace [cluster-name] with the name of your cluster. The -user-kubeconfig in the name indicates that the kubeconfig is for user use.

To get the user kubeconfig for a cluster named managed-test you can run a command similar to:

kubectl --namespace=default get secret managed-test-user-kubeconfig \
   -o jsonpath={.data.value} | base64 --decode \
   > managed-test.kubeconfig

Cluster API (CAPI) kubeconfig

This kubeconfig is used internally by CAPI and shouldn’t be used outside of the management server. It is used by CAPI to perform operations, such as draining a node. The name of the secret that contains the kubeconfig will be [cluster-name]-kubeconfig where you need to replace [cluster-name] with the name of your cluster. Note that there is NO -user in the name.

The kubeconfig is regenerated every sync-period as the token that is embedded in the kubeconfig is only valid for a short period of time. When EKS support is enabled the maximum sync period is 10 minutes. If you try to set --sync-period to greater than 10 minutes then an error will be raised.
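
The sync period is a flag on the provider's controller manager rather than something set per cluster; as a sketch (assuming the default deployment name and namespace created by clusterctl init), you can inspect the flags currently in effect with:

kubectl -n capa-system get deployment capa-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[*].args}'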

EKS Console

To use the Amazon EKS Console to view workloads running in an EKS cluster created using the AWS provider (CAPA) you can do the following:

  1. Create a new policy with the required IAM permissions for the console. This example can be used. For example, a policy called EKSViewNodesAndWorkloads.

  2. Assign the policy created in step 1) to an IAM user or role for the users of your EKS cluster

  3. Map the IAM user or role from step 2) to a Kubernetes user that has the RBAC permissions to view the Kubernetes resources. This needs to be done via the aws-auth configmap (used by aws-iam-authenticator) which is generated by the AWS provider. This mapping can be specified in the AWSManagedControlPlane, for example:

kind: AWSManagedControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
metadata:
  name: "capi-managed-test-control-plane"
spec:
  region: "eu-west-2"
  sshKeyName: "capi-management"
  version: "v1.18.0"
  iamAuthenticatorConfig:
    mapRoles:
    - username: "kubernetes-admin"
      rolearn: "arn:aws:iam::1234567890:role/AdministratorAccess"
      groups:
      - "system:masters"

In the sample above the arn:aws:iam::1234567890:role/AdministratorAccess IAM role has the EKSViewNodesAndWorkloads policy attached (created in step 1.)

EKS Addons

EKS Addons can be used with EKS clusters created using Cluster API Provider AWS.

Addons are supported in EKS clusters using Kubernetes v1.18 or greater.

Installing addons

To install an addon you need to declare it in the AWSManagedControlPlane by specifying its name, version and, optionally, how conflicts should be resolved. For example:

kind: AWSManagedControlPlane
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
metadata:
  name: "capi-managed-test-control-plane"
spec:
  region: "eu-west-2"
  sshKeyName: "capi-management"
  version: "v1.18.0"
  addons:
    - name: "vpc-cni"
      version: "v1.6.3-eksbuild.1"
      conflictResolution: "overwrite"

Additionally, there is a cluster flavor called eks-managedmachinepool-vpccni that you can use with clusterctl:

clusterctl config cluster my-cluster --kubernetes-version v1.18.0 --flavor eks-managedmachinepool-vpccni > my-cluster.yaml

Updating Addons

To update the version of an addon you need to edit the AWSManagedControlPlane instance and update the version of the addon you want to update. Using the example from the previous section we would do:

...
  addons:
    - name: "vpc-cni"
      version: "v1.7.5-eksbuild.1"
      conflictResolution: "overwrite"
...

Deleting Addons

To delete an addon from a cluster you need to edit the AWSManagedControlPlane instance and remove the entry for the addon you want to delete.
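For example, removing the vpc-cni entry declared earlier would leave the spec looking like this sketch (the remaining fields are unchanged):

spec:
  region: "eu-west-2"
  sshKeyName: "capi-management"
  version: "v1.18.0"
  addons: []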

Viewing installed addons

You can see what addons are installed on your EKS cluster by looking in the Status of the AWSManagedControlPlane instance.
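For example, with kubectl (the control plane name is taken from the earlier examples):

kubectl describe awsmanagedcontrolplane capi-managed-test-control-plane

and inspect the Status section of the output.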

Additionally you can run the following command:

clusterawsadm eks addons list-installed -n <<eksclustername>>

Viewing available addons

You can see what addons are available to your EKS cluster by running the following command:

clusterawsadm eks addons list-available -n <<eksclustername>>

EKS Cluster Upgrades

Control Plane Upgrade

Upgrading the Kubernetes version of the control plane is supported by the provider. To perform an upgrade you need to update the version in the spec of the AWSManagedControlPlane. Once the version has changed the provider will handle the upgrade for you.

You can only upgrade an EKS cluster by 1 minor version at a time. If you attempt to upgrade the version by more than 1 minor version, the provider will ensure the upgrade is done in multiple steps of 1 minor version. For example, upgrading from v1.15 to v1.17 would result in your cluster being upgraded v1.15 -> v1.16 first, and then v1.16 -> v1.17.
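For example, assuming the control plane object used in earlier sections, the version could be bumped in place (a sketch; remember that only MAJOR.MINOR is considered):

kubectl patch awsmanagedcontrolplane capi-managed-test-control-plane \
  --type=merge -p '{"spec":{"version":"v1.19.0"}}'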

Consuming Existing AWS Infrastructure

Normally, Cluster API will create infrastructure on AWS when standing up a new workload cluster. However, it is possible to have Cluster API re-use existing AWS infrastructure instead of creating its own infrastructure. Follow the instructions below to configure Cluster API to consume existing AWS infrastructure.

Prerequisites

In order to have Cluster API consume existing AWS infrastructure, you will need to have already created the following resources:

  • A VPC
  • One or more private subnets (subnets that do not have a route to an Internet gateway)
  • A public subnet in the same Availability Zone (AZ) for each private subnet (this is required for NAT gateways to function properly)
  • A NAT gateway for each private subnet, along with associated Elastic IP addresses
  • An Internet gateway for all public subnets
  • Route table associations that provide connectivity to the Internet through a NAT gateway (for private subnets) or the Internet gateway (for public subnets)

Note that a public subnet (and an associated Internet gateway) is required even if the control plane of the workload cluster is set to use an internal load balancer.

You will need the ID of the VPC and subnet IDs that Cluster API should use. This information is available via the AWS Management Console or the AWS CLI.
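For example, with the AWS CLI (the VPC ID shown is illustrative):

aws ec2 describe-vpcs --query 'Vpcs[].{ID:VpcId,CIDR:CidrBlock}' --output table
aws ec2 describe-subnets --filters Name=vpc-id,Values=vpc-0425c335226437144 \
  --query 'Subnets[].{ID:SubnetId,AZ:AvailabilityZone,CIDR:CidrBlock}' --output table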

Note that there is no need to create an Elastic Load Balancer (ELB), security groups, or EC2 instances; Cluster API will take care of these items.

If you want to use existing security groups, these can be specified and new ones will not be created.

Tagging AWS Resources

Cluster API itself does tag AWS resources it creates. The sigs.k8s.io/cluster-api-provider-aws/cluster/<cluster-name> (where <cluster-name> matches the metadata.name field of the Cluster object) tag, with a value of owned, tells Cluster API that it has ownership of the resource. In this case, Cluster API will modify and manage the lifecycle of the resource.

When consuming existing AWS infrastructure, the Cluster API AWS provider does not require any tags to be present. The absence of the tags on an AWS resource indicates to Cluster API that it should not modify the resource or attempt to manage the lifecycle of the resource.

However, the built-in Kubernetes AWS cloud provider does require certain tags in order to function properly. Specifically, all subnets where Kubernetes nodes reside should have the kubernetes.io/cluster/<cluster-name> tag present. Private subnets should also have the kubernetes.io/role/internal-elb tag with a value of 1, and public subnets should have the kubernetes.io/role/elb tag with a value of 1. These latter two tags help the cloud provider understand which subnets to use when creating load balancers.

Finally, if the controller manager isn’t started with the --configure-cloud-routes: "false" parameter, the route table(s) will also need the kubernetes.io/cluster/<cluster-name> tag. (This parameter can be added by customizing the KubeadmConfigSpec object of the KubeadmControlPlane object.)
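For example, the subnet tags could be applied with the AWS CLI (the subnet ID is reused from the example below and the cluster name is illustrative; the value shared for the cluster tag is an assumption, since only the presence of that tag is required here):

aws ec2 create-tags --resources subnet-0261219d564bb0dc5 \
  --tags Key=kubernetes.io/cluster/my-cluster,Value=shared \
         Key=kubernetes.io/role/internal-elb,Value=1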

Configuring the AWSCluster Specification

Specifying existing infrastructure for Cluster API to use takes place in the specification for the AWSCluster object. Specifically, you will need to add an entry with the VPC ID and the IDs of all applicable subnets into the networkSpec field. Here is an example:

For EC2

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster

For EKS

apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: AWSManagedControlPlane

In both cases the networkSpec section has the same shape, for example:

spec:
  networkSpec:
    vpc:
      id: vpc-0425c335226437144
    subnets:
    - id: subnet-0261219d564bb0dc5
    - id: subnet-0fdcccba78668e013

When you use kubectl apply to apply the Cluster and AWSCluster specifications to the management cluster, Cluster API will use the specified VPC ID, will discover the associated subnet IDs, and will not create a new VPC, new subnets, or other associated resources. It will, however, create a new ELB and new security groups.

Placing EC2 Instances in Specific AZs

To distribute EC2 instances across multiple AZs, you can add information to the Machine specification. This is optional and only necessary if control over AZ placement is desired.

To tell Cluster API that an EC2 instance should be placed in a particular AZ but allow Cluster API to select which subnet in that AZ can be used, add this to the Machine specification:

spec:
  failureDomain: "us-west-2a"

If using a MachineDeployment, specify AZ placement like so:

spec:
  template:
    spec:
      failureDomain: "us-west-2b"

Note that all replicas within a MachineDeployment will reside in the same AZ.

Placing EC2 Instances in Specific Subnets

To specify that an EC2 instance should be placed in a specific subnet, add this to the AWSMachine specification:

spec:
  subnet:
    id: subnet-0a3507a5ad2c5c8c3

When using MachineDeployments, users can control subnet selection by adding information to the AWSMachineTemplate associated with that MachineDeployment, like this:

spec:
  template:
    spec:
      subnet:
        id: subnet-0a3507a5ad2c5c8c3

Users may either specify failureDomain on the Machine or MachineDeployment objects, or users may explicitly specify subnet IDs on the AWSMachine or AWSMachineTemplate objects. If both are specified, the subnet ID is used and the failureDomain is ignored.

Security Groups

To use existing security groups for instances for a cluster, add this to the AWSCluster specification:

spec:
  networkSpec:
    securityGroupOverrides:
      bastion: sg-0350a3507a5ad2c5c8c3
      controlplane: sg-0350a3507a5ad2c5c8c3
      apiserver-lb: sg-0200a3507a5ad2c5c8c3
      node: sg-04e870a3507a5ad2c5c8c3
      lb: sg-00a3507a5ad2c5c8c3

Any additional security groups specified in an AWSMachineTemplate will be applied in addition to these overridden security groups.
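For reference, a sketch of how an extra security group might be declared on an AWSMachineTemplate, assuming the additionalSecurityGroups field of the AWSMachine spec:

spec:
  template:
    spec:
      additionalSecurityGroups:
      - id: sg-0350a3507a5ad2c5c8c3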

To specify additional security groups for the control plane load balancer for a cluster, add this to the AWSCluster specification:

spec:
  controlPlaneLoadBalancer:
    additionalSecurityGroups:
    - sg-0200a3507a5ad2c5c8c3
    - ...   

Caveats/Notes

  • When both public and private subnets are available in an AZ, CAPI will choose the private subnet in the AZ over the public subnet for placing EC2 instances.
  • If you configure CAPI to use existing infrastructure as outlined above, CAPI will not create an SSH bastion host. Combined with the previous bullet, this means you must make sure you have established some form of connectivity to the instances that CAPI will create.

Specifying the IAM Role to use for Management Components

Prerequisites

To be able to specify the IAM role that the management components should run as, your cluster must be set up with the ability to assume IAM roles using one of the solutions described below (IAM Roles for Service Accounts, or Kiam / kube2iam).

Setting IAM Role

Set the AWS_CONTROLLER_IAM_ROLE environment variable to the ARN of the IAM role to use when performing the clusterctl init command.

For example:

export AWS_CONTROLLER_IAM_ROLE=arn:aws:iam::1234567890:role/capa-management-components
clusterctl init --infrastructure=aws

IAM Role Trust Policy

IAM Roles for Service Accounts

When creating the IAM role the following trust policy will need to be used with the AWS_ACCOUNT_ID, AWS_REGION and OIDC_PROVIDER_ID environment variables replaced.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/oidc.eks.${AWS_REGION}.amazonaws.com/id/${OIDC_PROVIDER_ID}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "oidc.eks.${AWS_REGION}.amazonaws.com/id/${OIDC_PROVIDER_ID}:sub": [
            "system:serviceaccount:capa-system:capa-controller-manager",
            "system:serviceaccount:capi-system:capi-controller-manager",
            "system:serviceaccount:capa-eks-control-plane-system:capa-eks-control-plane-controller-manager",
            "system:serviceaccount:capa-eks-bootstrap-system:capa-eks-bootstrap-controller-manager",
          ]
        }
      }
    }
  ]
}

If you plan to use the controllers.cluster-api-provider-aws.sigs.k8s.io role created by clusterawsadm then you’ll need to add the following to your AWSIAMConfiguration:

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  clusterAPIControllers:
    disabled: false
    trustStatements:
    - Action:
      - "sts:AssumeRoleWithWebIdentity"
      Effect: "Allow"
      Principal:
        Federated:
        - "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/oidc.eks.${AWS_REGION}.amazonaws.com/id/${OIDC_PROVIDER_ID}"
      Condition:
        "ForAnyValue:StringEquals":
          "oidc.eks.${AWS_REGION}.amazonaws.com/id/${OIDC_PROVIDER_ID}:sub":
            - system:serviceaccount:capa-system:capa-controller-manager
            - system:serviceaccount:capa-eks-control-plane-system:capa-eks-control-plane-controller-manager # Include if also using EKS

With this you can then set AWS_CONTROLLER_IAM_ROLE to arn:aws:iam::${AWS_ACCOUNT_ID}:role/controllers.cluster-api-provider-aws.sigs.k8s.io

Kiam / kube2iam

When creating the IAM role you will need to apply the "kubernetes.io/cluster/${CLUSTER_NAME}/role": "enabled" tag to the role and use the following trust policy, with the AWS_ACCOUNT_ID and CLUSTER_NAME environment variables correctly replaced.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}.worker-node-role"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

If you plan to use the controllers.cluster-api-provider-aws.sigs.k8s.io role created by clusterawsadm then you’ll need to add the following to your AWSIAMConfiguration:

apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  clusterAPIControllers:
    disabled: false
    trustStatements:
      - Action:
        - "sts:AssumeRole"
        Effect: "Allow"
        Principal:
          Service:
          - "ec2.amazonaws.com"
      - Action:
        - "sts:AssumeRole"
        Effect: "Allow"
        Principal:
          AWS:
          - "arn:aws:iam::${AWS_ACCOUNT_ID}:role/${CLUSTER_NAME}.worker-node-role"

With this you can then set AWS_CONTROLLER_IAM_ROLE to arn:aws:iam::${AWS_ACCOUNT_ID}:role/controllers.cluster-api-provider-aws.sigs.k8s.io

Multi-AZ Control Plane

Overview

By default, the control plane of a workload cluster created by CAPA will span multiple availability zones (AZs) (also referred to as “failure domains”) when using multiple control plane nodes. This is because CAPA will, by default, create public and private subnets in all the AZs of a region (up to a maximum of 3 AZs by default). If a region has more than 3 AZs then CAPA will pick 3 AZs to use.

Configuring CAPA to Use Specific AZs

To explicitly instruct CAPA to create resources in specific AZs (and not at random), users can add a networkSpec object to the AWSCluster specification. Here is an example networkSpec that creates resources across three AZs in the “us-west-2” region:

spec:
  networkSpec:
    vpc:
      cidrBlock: 10.50.0.0/16
    subnets:
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.0.0/20
      isPublic: true
    - availabilityZone: us-west-2a
      cidrBlock: 10.50.16.0/20
    - availabilityZone: us-west-2b
      cidrBlock: 10.50.32.0/20
      isPublic: true
    - availabilityZone: us-west-2b
      cidrBlock: 10.50.48.0/20
    - availabilityZone: us-west-2c
      cidrBlock: 10.50.64.0/20
      isPublic: true
    - availabilityZone: us-west-2c
      cidrBlock: 10.50.80.0/20

Specifying the CIDR block alone for the VPC is not enough; users must also supply a list of subnets that provides the desired AZ, the CIDR for the subnet, and whether the subnet is public (has a route to an Internet gateway) or is private (does not have a route to an Internet gateway).

Note that CAPA insists that there must be a public subnet (and associated Internet gateway), even if no public load balancer is requested for the control plane. Therefore, for every AZ where a control plane node should be placed, the networkSpec object must define both a public and private subnet.

Once CAPA is provided with a networkSpec that spans multiple AZs, the KubeadmControlPlane controller will automatically distribute control plane nodes across multiple AZs. No further configuration from the user is required.

Note: this method can also be used if you do not want to split your EC2 instances across multiple AZs.

Changing AZ defaults

When creating the default subnets, a maximum of 3 AZs will be used by default. If you are creating a cluster in a region that has more than 3 AZs, then 3 AZs will be picked in alphabetical order from that region.

If this default behavior for the maximum number of AZs and the ordered selection method doesn’t suit your requirements, you can use the following to change the behavior:

  • availabilityZoneUsageLimit - specifies the maximum number of availability zones (AZ) that should be used in a region when automatically creating subnets.
  • availabilityZoneSelection - specifies how AZs should be selected if there are more AZs in a region than specified by availabilityZoneUsageLimit. There are 2 selection schemes:
    • Ordered - selects based on alphabetical order
    • Random - selects AZs randomly in a region

For example, if you wanted to have a maximum of 2 AZs using a random selection scheme:

spec:
  networkSpec:
    vpc:
      availabilityZoneUsageLimit: 2
      availabilityZoneSelection: Random

Caveats

Deploying control plane nodes across multiple AZs is not a panacea to cure all availability concerns. The sizing and overall utilization of the cluster will greatly affect the behavior of the cluster and the workloads hosted there in the event of an AZ failure. Careful planning is needed to maximize the availability of the cluster even in the face of an AZ failure. There are also other considerations, like cross-AZ traffic charges, that should be taken into account.

Restricting Cluster API to certain namespaces

Cluster-api-provider-aws controllers, by default, reconcile cluster-api objects across all namespaces in the cluster. However, it is possible to restrict reconciliation to a single namespace, and this document tells you how.


Use cases

  • Grouping clusters into a namespace based on the AWS account will allow managing clusters across multiple AWS accounts. This requires each cluster-api-provider-aws controller to have credentials for its respective AWS account. These credentials can be created as a Kubernetes secret and mounted in the pod at /home/.aws or exposed as environment variables.
  • Grouping clusters into a namespace based on their environment (test, qualification, canary, production) will allow a phased rollout of cluster-api-provider-aws releases.
  • Grouping clusters into a namespace based on the infrastructure provider will allow running multiple cluster-api provider implementations side-by-side and manage clusters across infrastructure providers.

Configuring cluster-api-provider-aws controllers

  • Create the namespace that cluster-api-provider-aws controller will watch for cluster-api objects
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: my-pet-clusters #edit if necessary
EOF
  • Deploy/edit aws-provider-controller-manager controller statefulset

Specifically, edit the container spec for cluster-api-aws-controller, in the aws-provider-controller-manager statefulset, to pass a value to the namespace CLI flag.

        - -namespace=my-pet-clusters # edit this if necessary

Once the aws-provider-controller-manager-0 pod restarts, cluster-api-provider-aws controllers will only reconcile the cluster-api objects in the my-pet-clusters namespace.
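For example, the statefulset can be edited in place (assuming it runs in the aws-provider-system namespace referenced later in this document):

kubectl -n aws-provider-system edit statefulset aws-provider-controller-manager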

For v1alpha1 please refer to https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/release-0.3/docs/roleassumption.md

Creating clusters using cross account role assumption using KIAM

This document outlines the list of steps to create the target cluster via cross account role assumption using KIAM. KIAM lets the controller pod(s) assume an AWS role that enables them to create the AWS resources necessary to create an operational cluster. This way we don’t have to mount any AWS credentials or load environment variables to supply AWS credentials to the CAPA controller; this is automatically taken care of by the KIAM components. Note: If you don’t want to use KIAM and would rather mount the credentials as secrets, you may still achieve cross account role assumption by using multiple profiles.

Glossary

  • Management cluster - The cluster that runs in AWS and is used to create target clusters in different AWS accounts
  • Target account - The account where the target cluster is created
  • Source account - The AWS account where the CAPA controllers for the management cluster run.

Goals

  1. The CAPA controllers are running in an AWS account and you want to create the target cluster in another AWS account.
  2. This assumes that you start with no existing clusters.

High level steps

  1. Creating a management cluster in AWS - This can be done by running the phases in clusterctl
    • Uses the existing provider components yaml
  2. Setting up cross account IAM roles
  3. Deploying the KIAM server/agent
  4. Create the target cluster (through KIAM)
    • Uses different provider components with no secrets and annotation to indicate the IAM Role to assume.

1. Creating the management cluster in AWS

Using the clusterctl command we can create a new cluster in AWS which in turn will act as the management cluster to create the target cluster (in a different AWS account). This can be achieved by using the phases in clusterctl to perform all the steps except the pivoting. This will provide us with a bare-bones functioning cluster that we can use as a management cluster. To begin with, follow the steps in this getting started guide to set up the environment, except for creating the actual cluster. Instead, follow the steps below to create the cluster.

Create a new cluster using kind for bootstrapping purposes by running:

kind create cluster --name <CLUSTER_NAME>

and get its kube config path by running

export KIND_KUBECONFIG=`kind get kubeconfig-path`

Use the following commands to create the new management cluster in AWS.

clusterctl alpha phases apply-cluster-api-components --provider-components examples/out/provider-components.yaml \
--kubeconfig $KIND_KUBECONFIG

clusterctl alpha phases apply-cluster --cluster examples/out/cluster.yaml --kubeconfig $KIND_KUBECONFIG

We only need to create the control plane on the cluster running in the AWS source account. Since the example includes a definition for a worker node, you may delete it.

clusterctl alpha phases apply-machines --machines examples/out/machines.yaml --kubeconfig $KIND_KUBECONFIG

clusterctl alpha phases get-kubeconfig --provider aws --cluster-name <CLUSTER_NAME> --kubeconfig $KIND_KUBECONFIG

export AWS_KUBECONFIG=$PWD/kubeconfig

kubectl apply -f examples/out/addons.yaml --kubeconfig $AWS_KUBECONFIG

Verify that all the pods in the kube-system namespace are running smoothly. You may also remove the additional node in the machines example yaml, since we are only interested in running the controllers that run on the control plane node (although it’s not required to make any changes there). You can destroy your local kind cluster by running:

kind delete cluster --name <CLUSTER_NAME>

2. Setting up cross account roles

In this step we will create new roles/policies across 2 different AWS accounts. Let us start by creating the roles in the account where the AWS controller runs. Following the directions posted in the KIAM repo, create a “kiam_server” role in AWS that only has a single managed policy with a single permission, “sts:AssumeRole”. Also add a trust policy on the “kiam_server” role to include the role attached to the control plane instance as a trusted entity. This looks something like this:

In “kiam_server” role (Source AWS account):

{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Effect": "Allow",
     "Principal": {
       "AWS": "arn:aws:iam::<AWS_ACCOUNT_NUMBER>:role/control-plane.cluster-api-provider-aws.sigs.k8s.io"
     },
     "Action": "sts:AssumeRole"
   }
 ]
}

Next we must establish a link between this “kiam_server” role in the source AWS account and the role in the target AWS account that has the permissions to create a new cluster. Begin by running the clusterawsadm CLI to create a new stack in the target account where the target cluster is created. Make sure you use the credentials for the target AWS account before creating the stack.

clusterawsadm alpha bootstrap create-stack

Then sign in to the target AWS account to establish the link as mentioned above. Create a new role with the permission policy set to “controllers.cluster-api-provider-aws.sigs.k8s.io”. Let’s name this role “cluster-api” for future reference. Add a new trust relationship to include the “kiam_server” role from the source account as a trusted entity. This is shown below:

In the “controllers.cluster-api-provider-aws.sigs.k8s.io” role (target AWS account):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<SOURCE_AWS_ACCOUNT_NUMBER>:role/kserver"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

3. Deploying the KIAM server & agent

By now, the cluster you created in AWS in step 1 must be up & running, as it will host the KIAM components. Make sure your KUBECONFIG is pointing to that cluster.

Create new secrets using the steps outlined here. Then apply the manifests shown below, making sure you update the argument to include your source AWS account: --assume-role-arn=arn:aws:iam::<SOURCE_AWS_ACCOUNT>:role/kiam_server

server.yaml

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-server
spec:
  selector:
    matchLabels:
      app: kiam
      role: server
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: server
    spec:
      serviceAccountName: kiam-server
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - operator: "Exists"
      volumes:
        - name: ssl-certs
          hostPath:
            # for AWS linux or RHEL distros
            # path: /etc/pki/ca-trust/extracted/pem/
            path: /etc/ssl/certs/
        - name: tls
          secret:
            secretName: kiam-server-tls
      hostNetwork: true
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:v3.2
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - server
            - --json-log
            - --level=warn
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/server.pem
            - --key=/etc/kiam/tls/server-key.pem
            - --ca=/etc/kiam/tls/ca.pem
            - --role-base-arn-autodetect
            - --assume-role-arn=arn:aws:iam::<SOURCE_AWS_ACCOUNT>:role/kiam_server
            - --sync=1m
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/server.pem
              - --key=/etc/kiam/tls/server-key.pem
              - --ca=/etc/kiam/tls/ca.pem
              - --server-address=127.0.0.1:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/server.pem
              - --key=/etc/kiam/tls/server-key.pem
              - --ca=/etc/kiam/tls/ca.pem
              - --server-address=127.0.0.1:443
              - --gateway-timeout-creation=1s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: kiam-server
  namespace: kube-system
spec:
  clusterIP: None
  selector:
    app: kiam
    role: server
  ports:
  - name: grpclb
    port: 443
    targetPort: 443
    protocol: TCP

agent.yaml

apiVersion: apps/v1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-agent
spec:
  selector:
    matchLabels:
      app: kiam
      role: agent
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: agent
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      tolerations:
        - operator: "Exists"
      volumes:
        - name: ssl-certs
          hostPath:
            # for AWS linux or RHEL distros
            #path: /etc/pki/ca-trust/extracted/pem/
            path: /etc/ssl/certs/
        - name: tls
          secret:
            secretName: kiam-agent-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]
          image: quay.io/uswitch/kiam:v3.2
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=cali+
            - --json-log
            - --port=8181
            - --cert=/etc/kiam/tls/agent.pem
            - --key=/etc/kiam/tls/agent-key.pem
            - --ca=/etc/kiam/tls/ca.pem
            - --server-address=kiam-server:443
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
            - --gateway-timeout-creation=1s
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3

server-rbac.yaml

---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: kiam-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kiam-read
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  verbs:
  - watch
  - get
  - list
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kiam-read
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-read
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kiam-write
rules:
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - create
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kiam-write
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiam-write
subjects:
- kind: ServiceAccount
  name: kiam-server
  namespace: kube-system

After deploying the above components, make sure that the kiam-server and kiam-agent pods are up and running.

4. Create the target cluster

Make sure you create a copy of the examples/out directory called out2. To create the target cluster we must update the provider-components.yaml generated in the out2 directory as shown below (to be run from the repository root directory):

cp -r examples/out examples/out2
vi examples/out2/provider-components.yaml
  1. Remove the credentials secret added at the bottom of provider-components.yaml and do not mount the secret.
  2. Add the following annotation to the template of the aws-provider-controller-manager stateful set to specify the new role that was created in the target account:
      annotations:
        iam.amazonaws.com/role: arn:aws:iam::<TARGET_AWS_ACCOUNT>:role/cluster-api
  3. Also add the annotation below to the “aws-provider-system” namespace:
  annotations:
    iam.amazonaws.com/permitted: ".*"

Create a new cluster using steps similar to the ones used to create the source cluster. They are as follows:

export SOURCE_KUBECONFIG=<PATH_TO_SOURCE_CLUSTER_KUBECONFIG>

clusterctl alpha phases apply-cluster-api-components --provider-components examples/out2/provider-components.yaml \
--kubeconfig $SOURCE_KUBECONFIG

kubectl apply -f examples/out2/cluster.yaml --kubeconfig $SOURCE_KUBECONFIG

kubectl apply -f examples/out2/machines.yaml --kubeconfig $SOURCE_KUBECONFIG

clusterctl alpha phases get-kubeconfig --provider aws --cluster-name <TARGET_CLUSTER_NAME> --kubeconfig $SOURCE_KUBECONFIG
export KUBECONFIG=$PWD/kubeconfig

kubectl apply -f examples/out2/addons.yaml

This creates the new cluster in the target AWS account.

Userdata Privacy

Cluster API Provider AWS bootstraps EC2 instances to create and join Kubernetes clusters using instance user data. Because Kubernetes clusters are secured using TLS with multiple Certificate Authorities, these CA materials are generated by Cluster API and injected into the user data. It is important to note that, without configuring host firewalls, processes on an instance can retrieve the instance userdata from the instance metadata service at http://169.254.169.254/latest/user-data

Requirements

  • An AMI that includes the AWS CLI
  • AMIs using CloudInit
  • A working /bin/bash shell
  • LFS directory layout (i.e. /etc exists and is readable by CloudInit)

Listed AMIs on 1.16 and up should include the AWS CLI.

How Cluster API secures TLS secrets

In 0.5.x/v1alpha3, by default, Cluster API Provider AWS will use AWS Secrets Manager as a limited-time secret store, storing the userdata using KMS encryption at rest in AWS. The EC2 IMDS userdata will contain a boot script to download the encrypted userdata secret using instance profile permissions, then immediately delete it from AWS Secrets Manager, and then execute it.
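Conceptually, the boothook behaves like the following sketch (this is illustrative only, not the generated script; the secret ID is a placeholder and the real implementation downloads the userdata as multiple gzipped fragments):

#!/bin/bash
# fetch the encrypted userdata using the instance profile credentials
aws secretsmanager get-secret-value --secret-id "<secret-id>" \
  --query SecretBinary --output text | base64 -d > /etc/secret-userdata.txt
# delete the secret immediately so it does not linger in the account
aws secretsmanager delete-secret --secret-id "<secret-id>" --force-delete-without-recovery
# cloud-init then picks up /etc/secret-userdata.txt and executes it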

To avoid guessable keys in the AWS Secrets Manager key-value store and to prevent collisions, the key is an encoding of the Kubernetes namespace, cluster name and instance name, with a random string appended, providing ~256 bits of entropy.

Cluster API Provider AWS also stores the secret ARN in the AWSMachine spec, and will delete the secret if it isn’t already deleted and the machine has registered successfully against the workload cluster API server as a node. Cluster API Provider AWS will also attempt deletion of the secret if the AWSMachine is otherwise deleted or the EC2 instance is terminated or failed.

This method is only compatible with operating systems and distributions using cloud-init. If you are using a different bootstrap process, you will need to co-ordinate this externally and set the following in the specification of the AWSMachine types to disable the use of a cloud-init boothook:

cloudInit:
  insecureSkipSecretsManager: true

Troubleshooting

Script errors

cloud-init does not print boothook script errors to the systemd journal. Logs for the script, if it errored, can be found in /var/log/cloud-init-output.log.
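For example:

tail -n 50 /var/log/cloud-init-output.log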

Warning messages

Because cloud-init will attempt to read the final file at start, cloud-init will always print a /etc/secret-userdata.txt cannot be found message. This can be safely ignored.

Secrets manager console

The AWS Secrets Manager console should show secrets being created and deleted, with a lifetime of around a minute. No plaintext secret data will appear in the console, as Cluster API Provider AWS stores the userdata as fragments of a gzipped data stream.
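If you prefer the CLI over the console, a hedged example to watch the secrets appear and disappear:

aws secretsmanager list-secrets --query 'SecretList[].[Name,CreatedDate]' --output table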

Troubleshooting

Resources aren’t being created

TODO

Target cluster’s control plane machine is up but target cluster’s apiserver not working as expected

If the aws-provider-controller-manager-0 logs did not help, you might want to look into the cloud-init logs, /var/log/cloud-init-output.log, on the controller host. Verifying the kubelet status and logs may also provide hints:

journalctl -u kubelet.service
systemctl status kubelet

For reaching controller host from your local machine:

 ssh -i <private-key> -o "ProxyCommand ssh -W %h:%p -i <private-key> ubuntu@<bastion-IP>" ubuntu@<controller-host-IP>

private-key is the private key from the key-pair discussed in the ssh key pair section above.

kubelet on the control plane host failing with error: NoCredentialProviders

failed to run Kubelet: could not init cloud provider "aws": error finding instance i-0c276f2a1f1c617b2: "error listing AWS instances: \"NoCredentialProviders: no valid providers in chain. Deprecated.\\n\\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\""

This error can occur if the CloudFormation stack is not created properly and the IAM instance profile is missing appropriate roles. Run the following command to inspect the IAM instance profile:

$ aws iam get-instance-profile --instance-profile-name control-plane.cluster-api-provider-aws.sigs.k8s.io --output json
{
    "InstanceProfile": {
        "InstanceProfileId": "AIPAJQABLZS4A3QDU576Q",
        "Roles": [
            {
                "AssumeRolePolicyDocument": {
                    "Version": "2012-10-17",
                    "Statement": [
                        {
                            "Action": "sts:AssumeRole",
                            "Effect": "Allow",
                            "Principal": {
                                "Service": "ec2.amazonaws.com"
                            }
                        }
                    ]
                },
                "RoleId": "AROAJQABLZS4A3QDU576Q",
                "CreateDate": "2019-05-13T16:45:12Z",
                "RoleName": "control-plane.cluster-api-provider-aws.sigs.k8s.io",
                "Path": "/",
                "Arn": "arn:aws:iam::123456789012:role/control-plane.cluster-api-provider-aws.sigs.k8s.io"
            }
        ],
        "CreateDate": "2019-05-13T16:45:28Z",
        "InstanceProfileName": "control-plane.cluster-api-provider-aws.sigs.k8s.io",
        "Path": "/",
        "Arn": "arn:aws:iam::123456789012:instance-profile/control-plane.cluster-api-provider-aws.sigs.k8s.io"
    }
}

If the instance profile does not look as expected, you may try recreating the CloudFormation stack using clusterawsadm as explained in the above sections.
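For example (the exact subcommand depends on your clusterawsadm version; both forms appear elsewhere in this document):

clusterawsadm alpha bootstrap create-stack
# or, on newer versions:
clusterawsadm bootstrap iam create-cloudformation-stack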

Setting up Development Environment for Cluster API Provider AWS

In this guide we will dive into how to set up a development environment for, and contribute to, Cluster API Provider AWS.

Setup Development Environment for EKS Control Plane

Log in to GitHub and fork both:

  • https://github.com/kubernetes-sigs/cluster-api
  • https://github.com/kubernetes-sigs/cluster-api-provider-aws

Install Prerequisites

  • Golang Version 1.13 or higher
  • direnv
  • envsubst
  • kubectl
  • Working Development Environment

install direnv

brew install direnv

install envsubst

curl -L https://github.com/a8m/envsubst/releases/download/v1.2.0/envsubst-`uname -s`-`uname -m` -o envsubst
chmod +x envsubst
sudo mv envsubst $HOME/go/bin/

# or use go get
export GOPATH=~/go
go get -v github.com/a8m/envsubst/cmd/envsubst

Setup Repos and GOPATH

set this up in ~/go/src/sigs.k8s.io and export GOPATH=~/go

$ export GOPATH=<HOME ABS PATH>/go

$ mkdir ~/go/src/sigs.k8s.io/

$ cd ~/go/src/sigs.k8s.io/

$ git clone git@github.com:<GITHUB USERNAME>/cluster-api.git

$ git clone git@github.com:<GITHUB USERNAME>/cluster-api-provider-aws.git

and add upstream for both repos

$ cd cluster-api 

$ git remote add upstream git@github.com:kubernetes-sigs/cluster-api.git

$ git fetch upstream

$ cd ..

$ cd cluster-api-provider-aws

$ git remote add upstream git@github.com:kubernetes-sigs/cluster-api-provider-aws.git

$ git fetch upstream

Build clusterawsadm and setup your AWS Environment

build the clusterawsadm in cluster-api-provider-aws

$ make clusterawsadm

create bootstrap file and bootstrap IAM roles and policies using clusterawsadm

$ cat config-bootstrap.yaml
apiVersion: bootstrap.aws.infrastructure.cluster.x-k8s.io/v1alpha1
kind: AWSIAMConfiguration
spec:
  bootstrapUser:
    enable: true
  eks:
    enable: true
    iamRoleCreation: false # Set to true if you plan to use the EKSEnableIAM feature flag to enable automatic creation of IAM roles
    defaultControlPlaneRole:
      disable: false # Set to false to enable creation of the default control plane role
    managedMachinePool:
      disable: false # Set to false to enable creation of the default node role for managed machine pools

create IAM Resources that will be needed for bootstrapping EKS

$ ./bin/clusterawsadm bootstrap iam create-cloudformation-stack --config=config-bootstrap.yaml
Attempting to create AWS CloudFormation stack cluster-api-provider-aws-sigs-k8s-io

this will create a CloudFormation stack for those IAM resources

The following resources are in the stack:

Resource                  |Type                                                                                |Status
AWS::IAM::Group           |cluster-api-provider-aws-s-AWSIAMGroupBootstrapper-ME9XZVCO2491                     |CREATE_COMPLETE
AWS::IAM::InstanceProfile |control-plane.cluster-api-provider-aws.sigs.k8s.io                                  |CREATE_COMPLETE
AWS::IAM::InstanceProfile |controllers.cluster-api-provider-aws.sigs.k8s.io                                    |CREATE_COMPLETE
AWS::IAM::InstanceProfile |nodes.cluster-api-provider-aws.sigs.k8s.io                                          |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::xxx:policy/control-plane.cluster-api-provider-aws.sigs.k8s.io |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::xxx:policy/nodes.cluster-api-provider-aws.sigs.k8s.io         |CREATE_COMPLETE
AWS::IAM::ManagedPolicy   |arn:aws:iam::xxx:policy/controllers.cluster-api-provider-aws.sigs.k8s.io   |CREATE_COMPLETE
AWS::IAM::Role            |control-plane.cluster-api-provider-aws.sigs.k8s.io                                  |CREATE_COMPLETE
AWS::IAM::Role            |controllers.cluster-api-provider-aws.sigs.k8s.io                                    |CREATE_COMPLETE
AWS::IAM::Role            |eks-controlplane.cluster-api-provider-aws.sigs.k8s.io                               |CREATE_COMPLETE
AWS::IAM::Role            |eks-nodegroup.cluster-api-provider-aws.sigs.k8s.io                                  |CREATE_COMPLETE
AWS::IAM::Role            |nodes.cluster-api-provider-aws.sigs.k8s.io                                          |CREATE_COMPLETE
AWS::IAM::User            |bootstrapper.cluster-api-provider-aws.sigs.k8s.io                                   |CREATE_COMPLETE

create security credentials in the bootstrapper.cluster-api-provider-aws.sigs.k8s.io IAM user and copy the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

$ brew install direnv
$ touch .envrc #add the aws key

unset AWS_SESSION_TOKEN
unset AWS_SECURITY_TOKEN
export AWS_ACCESS_KEY_ID=AKIATEST
export AWS_SECRET_ACCESS_KEY=TESTTEST
export AWS_REGION=eu-west-1

then run direnv allow after each change to .envrc

create a kind cluster where we can deploy the Kubernetes manifests. This kind cluster will be a temporary cluster which we will use to create the EKS cluster.

$ kind create cluster

setup cluster-api with the same .envrc file and then allow direnv

cp .envrc ../cluster-api
cd ../cluster-api
direnv allow

create tilt-settings.json and change the value of AWS_B64ENCODED_CREDENTIALS

{
    "default_registry": "gcr.io/<GITHUB USERNAME>",
    "provider_repos": ["../cluster-api-provider-aws"],
    "enable_providers": ["eks-bootstrap", "eks-controlplane", "kubeadm-bootstrap", "kubeadm-control-plane", "aws"],
    "kustomize_substitutions": {
        "AWS_B64ENCODED_CREDENTIALS": "W2RlZmFZSZnRg==",
        "EXP_EKS": "true",
        "EXP_EKS_IAM": "true",
        "EXP_MACHINE_POOL": "true"
    },
    "extra_args": {
        "aws": ["--v=2"],
        "eks-bootstrap": ["--v=2"],
        "eks-controlplane": ["--v=2"]
    }
}

run the dev env using tilt; let it run and press space once it’s running to open the web browser

tilt up

cd into cluster-api-provider-aws and edit .envrc to add:

export AWS_EKS_ROLE_ARN=arn:aws:iam::<accountid>:role/aws-service-role/eks.amazonaws.com/AWSServiceRoleForAmazonEKS
export AWS_SSH_KEY_NAME=<sshkeypair>
export KUBERNETES_VERSION=v1.15.2
export EKS_KUBERNETES_VERSION=v1.15
export CLUSTER_NAME=capi-<test-clustename>
export CONTROL_PLANE_MACHINE_COUNT=1
export AWS_CONTROL_PLANE_MACHINE_TYPE=t3.large
export WORKER_MACHINE_COUNT=1
export AWS_NODE_MACHINE_TYPE=t3.large

check and pipe output of template into a file

cat templates/cluster-template-eks.yaml 
cat templates/cluster-template-eks.yaml | $HOME/go/bin/envsubst > test-cluster.yaml

apply the generated test-cluster.yaml file in the kind cluster

kubectx
kubectx kind-kind
kubectl apply -f test-cluster.yaml

Check the tilt logs and wait for the EKS Cluster to be created

Retry if there’s an error when creating the test-cluster

To retry, delete the test cluster and the kind cluster and recreate them:

tilt up (ctrl-c)
press space to see the logs

kubectl delete -f test-cluster.yaml

kind delete cluster

kind create cluster

try again

tilt up

Clean up

To clean up, make sure you delete the Kubernetes resources first, before deleting the kind cluster

Troubleshooting

  • Make sure you have quota available for at least three additional Elastic IPs (EIPs) and NAT Gateways to be created
  • If envsubst starts throwing this error
flag provided but not defined: -variables
Usage: envsubst [options...] <input>

you might need to reinstall the system envsubst

brew install gettext
# or
brew reinstall gettext

Make sure you specify which envsubst you are using

Cluster API Provider AWS Roadmap

This roadmap is a constant work in progress, subject to frequent revision. Dates are approximations.

Ongoing

  • Documentation improvements

v0.5.x (v1alpha3+) - June/July 2020

v0.6 (v1alpha4) ~ Q4 2020

TBD

  • Implement MachinePools - Autoscaling groups and instances
  • Spot instance support for MachinePool ASG/Instance implementation
  • MachinePool implementation backed by Spot Fleet
  • Dual stack IPv4/IPv6 support
  • Windows nodes
  • Support for GPU instances and Elastic GPU
  • FIPS/NIST/STIG compliance