Creating an Autoscaling EKS Cluster using AWS Spot Instances
...that also pulls secrets automatically from AWS Parameter Store or Secrets Manager, creates ALBs and registers DNS entries on its own, and, last but not least, uses IRSA to grant pods access to AWS. This post uses Terraform and Helm, and assumes a working knowledge of AWS, Kubernetes, Terraform, and Helm principles. Whew. Let's get started.
The VPC
First, you're going to need a VPC to put the EKS cluster and everything you build in. We're going to use the terraform-aws-modules/vpc/aws module to help us stand up the VPC, because it applies sane/safe defaults and there's no reason to repeat code that has already been written and battle tested. I've chosen to allocate the largest IP address space a VPC allows (a /16) because it costs nothing extra and lets our cluster grow as much as we need it to.
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 2"
name = terraform.workspace
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
# "10.0.0.0/18", "10.0.64.0/18", "10.0.128.0/18" are the CIDR blocks dedicated to each AZ
# Unallocated CIDR blocks are spares.
# Each subnet is a divided piece of the AZs allocated CIDR block
private_subnets = ["10.0.0.0/21", "10.0.64.0/21", "10.0.128.0/21"]
public_subnets = ["10.0.8.0/21", "10.0.72.0/21", "10.0.136.0/21"]
database_subnets = ["10.0.16.0/21", "10.0.80.0/21", "10.0.144.0/21"]
elasticache_subnets = ["10.0.24.0/21", "10.0.88.0/21", "10.0.152.0/21"]
redshift_subnets = ["10.0.32.0/21", "10.0.96.0/21", "10.0.160.0/21"]
intra_subnets = ["10.0.40.0/21", "10.0.104.0/21", "10.0.168.0/21"]
enable_nat_gateway = true
single_nat_gateway = true
enable_dns_hostnames = true
public_subnet_tags = {
"kubernetes.io/cluster/${terraform.workspace}" = "shared"
"kubernetes.io/role/elb" = "1"
}
private_subnet_tags = {
"kubernetes.io/cluster/${terraform.workspace}" = "shared"
"kubernetes.io/role/internal-elb" = "1"
}
tags = {
env = terraform.workspace
cost_center = "devops"
"kubernetes.io/cluster/${terraform.workspace}" = "shared"
}
}
The EKS Cluster
Now that we have our VPC, let's create an EKS cluster inside it, again using a public Terraform module, terraform-aws-modules/eks/aws, to help us apply sane defaults.
module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = terraform.workspace
vpc_id = module.vpc.vpc_id
subnets = concat(
module.vpc.private_subnets,
module.vpc.public_subnets,
module.vpc.database_subnets,
module.vpc.elasticache_subnets,
module.vpc.redshift_subnets,
module.vpc.intra_subnets
)
worker_groups_launch_template = [
{
name = "eks-spot-${terraform.workspace}"
override_instance_types = ["t3.medium", "t3.large"]
spot_instance_pools = 2 // how many spot pools per az, len matches instances types len
asg_max_size = 5
kubelet_extra_args = "--node-labels=kubernetes.io/lifecycle=spot"
public_ip = true
autoscaling_enabled = true
protect_from_scale_in = true
},
]
map_accounts = []
map_roles = []
map_users = []
tags = {
env = terraform.workspace
cost_center = "devops"
}
}
This will create an EKS cluster that uses t3.medium and t3.large spot instances to populate the node pool, so that if AWS raises the spot price for one instance type or reclaims a node, the cluster can use the other to cover the load. In production, I'd recommend using three or more instance types of either the c5 or m5 class.
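As a sketch, a production-leaning worker group could look something like the following; the instance types, pool count, and max size here are illustrative choices, not values used above:
worker_groups_launch_template = [
  {
    name                    = "eks-spot-${terraform.workspace}"
    # Spread across several similarly sized instance types so a spot price
    # spike or reclamation in one pool doesn't drain the whole node group.
    override_instance_types = ["m5.large", "m5a.large", "c5.large"]
    spot_instance_pools     = 3
    asg_max_size            = 10
    kubelet_extra_args      = "--node-labels=kubernetes.io/lifecycle=spot"
    autoscaling_enabled     = true
    protect_from_scale_in   = true
  },
]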
I mentioned above that we would be using IRSA (IAM Roles for Service Accounts) to grant our pods access to AWS resources. To do so, we need to create an IAM OpenID Connect (OIDC) identity provider that uses our cluster's OIDC issuer URL to identify the service accounts making calls to AWS. This might not make sense now, but it will later. Here's the Terraform we're going to use to set up the OIDC connection:
data "external" "thumbprint" {
program = ["bash", "${path.module}/helpers/thumbprint.sh", data.aws_region.current.name]
}
resource "aws_iam_openid_connect_provider" "eks" {
url = module.eks.cluster_oidc_issuer_url
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [data.external.thumbprint.result.thumbprint]
}
You might have noticed that we're using a bash script to fetch the thumbprint of the EKS OIDC server for our AWS region; unfortunately, that's the cleanest way to make this Terraform friendly and to pick up any changes to the server's certificate. Here's the script being run:
#!/bin/bash
# Sourced from https://github.com/terraform-providers/terraform-provider-aws/issues/10104
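# Note: tail -r is BSD/macOS-specific; on GNU/Linux systems, swap it for tac.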
THUMBPRINT=$(echo | openssl s_client -servername oidc.eks.$1.amazonaws.com -showcerts -connect oidc.eks.$1.amazonaws.com:443 2>&- | tail -r | sed -n '/-----END CERTIFICATE-----/,/-----BEGIN CERTIFICATE-----/p; /-----BEGIN CERTIFICATE-----/q' | tail -r | openssl x509 -fingerprint -noout | sed 's/://g' | awk -F= '{print tolower($2)}')
THUMBPRINT_JSON="{\"thumbprint\": \"${THUMBPRINT}\"}"
echo $THUMBPRINT_JSON
This uses openssl to connect to the OIDC server for your region and fingerprint the certificate it presents, which is required for creating the OpenID Connect provider in AWS.
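You can also run the helper by hand from the Terraform module directory to see what Terraform will receive; the region argument here is just an example:
bash helpers/thumbprint.sh us-east-1
# => {"thumbprint": "<40-character lowercase sha1 fingerprint>"}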
At this point, you should have a VPC with an EKS cluster that is running using AWS Spot Instances.
Cluster Autoscaling
When running an EKS cluster, it's very popular to also run the cluster-autoscaler service within that cluster. This service automatically detects and shuts down underutilized nodes to save cost, and when you have Pending pods it adds nodes to the cluster so that all of your Pending pods can be scheduled. We're going to use the Helm chart for cluster-autoscaler, which can be found here. It's pretty easy to set up: the deployment is configured with a values.yaml file (sketched below) and then deployed into the devops namespace of the EKS cluster above with a single helm command.
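A minimal sketch of that values.yaml, assuming the stable/cluster-autoscaler chart with ASG auto-discovery; the region and cluster name are placeholders for your own values, and keys can differ between chart versions:
cloudProvider: aws
awsRegion: us-east-1          # placeholder: the region the cluster runs in
autoDiscovery:
  clusterName: my-cluster     # placeholder: should match terraform.workspace
rbac:
  create: true
extraArgs:
  balance-similar-node-groups: true
With that in place, the deploy itself is a single command: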
helm upgrade -i -n devops -f values.yaml cluster-autoscaler stable/cluster-autoscaler
Roles and permissions for these actions are automatically handled by the Terraform EKS module we used above, which is detailed here. When using cluster-autoscaler, it is very important to require all pods to declare resource requests and limits in order to prevent resource starvation in your cluster. Without requests and limits, the scheduler and cluster-autoscaler have no way of knowing how much capacity your pods actually need, so they can't make sensible scaling and placement decisions.
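For example, a container spec for a hypothetical app would carry a resources block like this:
resources:
  requests:          # what the scheduler and cluster-autoscaler plan capacity around
    cpu: 100m
    memory: 128Mi
  limits:            # hard caps enforced at runtime
    cpu: 500m
    memory: 256Mi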
ALB Ingress Controller
The ALB Ingress Controller allows ALBs to be automatically created and pointed at Kubernetes Ingresses. We're going to use a Helm chart for this as well, which can be found here. In order for the alb-ingress-controller service to work, we are going to turn to Terraform to create a new IAM role that can be attached to the service account created by the Helm chart. Here's that (lengthy) Terraform:
data "aws_iam_policy_document" "alb_ingress_controller" {
statement {
sid = "AllowACMGets"
effect = "Allow"
actions = [
"acm:DescribeCertificate",
"acm:ListCertificates",
"acm:GetCertificate"
]
resources = ["*"]
}
statement {
sid = "AllowEC2"
effect = "Allow"
actions = [
"ec2:AuthorizeSecurityGroupIngress",
"ec2:CreateSecurityGroup",
"ec2:CreateTags",
"ec2:DeleteTags",
"ec2:DeleteSecurityGroup",
"ec2:DescribeAccountAttributes",
"ec2:DescribeAddresses",
"ec2:DescribeInstances",
"ec2:DescribeInstanceStatus",
"ec2:DescribeInternetGateways",
"ec2:DescribeNetworkInterfaces",
"ec2:DescribeSecurityGroups",
"ec2:DescribeSubnets",
"ec2:DescribeTags",
"ec2:DescribeVpcs",
"ec2:ModifyInstanceAttribute",
"ec2:ModifyNetworkInterfaceAttribute",
"ec2:RevokeSecurityGroupIngress"
]
resources = ["*"]
}
statement {
sid = "AllowELB"
effect = "Allow"
actions = [
"elasticloadbalancing:*",
]
resources = ["*"]
}
statement {
sid = "AllowIAM"
effect = "Allow"
actions = [
"iam:CreateServiceLinkedRole",
"iam:GetServerCertificate",
"iam:ListServerCertificates"
]
resources = ["*"]
}
statement {
sid = "AllowCognito"
effect = "Allow"
actions = [
"cognito-idp:DescribeUserPoolClient"
]
resources = ["*"]
}
statement {
sid = "AllowWAF"
effect = "Allow"
actions = [
"waf-regional:GetWebACLForResource",
"waf-regional:GetWebACL",
"waf-regional:AssociateWebACL",
"waf-regional:DisassociateWebACL",
"waf:GetWebACL"
]
resources = ["*"]
}
statement {
sid = "AllowTag"
effect = "Allow"
actions = [
"tag:GetResources",
"tag:TagResources"
]
resources = ["*"]
}
}
resource "aws_iam_policy" "alb_ingress_controller" {
name = "alb-ingress-controller-ps-${terraform.workspace}"
policy = data.aws_iam_policy_document.alb_ingress_controller.json
}
resource "aws_iam_role" "alb_ingress_controller" {
name = "alb-ingress-controller-${terraform.workspace}"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "${aws_iam_openid_connect_provider.eks.arn}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${aws_iam_openid_connect_provider.eks.url}:sub": "system:serviceaccount:devops:alb-ingress-controller-aws-alb-ingress-controller"
}
}
}
]
}
EOF
}
resource "aws_iam_role_policy_attachment" "alb_ingress_controller" {
role = "${aws_iam_role.alb_ingress_controller.name}"
policy_arn = "${aws_iam_policy.alb_ingress_controller.arn}"
}
Now that we have the IAM role created, let's grab its ARN and put it into our values.yaml file for alb-ingress-controller before we add the service to our cluster using Helm.
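A rough sketch of that values.yaml; the cluster name, region, and VPC ID are placeholders, and the exact location of the service-account annotation key varies between versions of the incubator chart, so double-check it against the chart version you install:
clusterName: my-cluster                  # placeholder: the EKS cluster name (terraform.workspace)
awsRegion: us-east-1                     # placeholder
awsVpcID: vpc-0123456789abcdef0          # placeholder: module.vpc.vpc_id
rbac:
  create: true
  serviceAccountAnnotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/alb-ingress-controller-<WORKSPACE>
The eks.amazonaws.com/role-arn annotation is what ties the chart's service account to the IAM role we just created. With the values in place, add the incubator repo and install the chart: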
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm upgrade -i -n devops -f values.yaml alb-ingress-controller incubator/aws-alb-ingress-controller
We can now create ingresses and annotate them as follows to have this ALB Ingress Controller automatically create ALBs that point to our services.
annotations:
  kubernetes.io/ingress.class: "alb"
  alb.ingress.kubernetes.io/scheme: internet-facing
  alb.ingress.kubernetes.io/tags: env=prod,cost_center=api
  alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:<ACCOUNT_ID>:certificate/<CERT_ID>
  alb.ingress.kubernetes.io/healthcheck-path: /v1/health
External Secrets
GoDaddy has released a Kubernetes controller and custom resource called external-secrets that can be used to sync secret values from multiple secret providers like AWS Secrets Manager, AWS Parameter Store, HashiCorp Vault, and more. You can find documentation for it here. We're going to integrate it with AWS Parameter Store today, and use IRSA to grant it permissions. What is amazing about this setup is that zero secrets need to be put into Terraform or your Helm values.yaml files, and you can use Kubernetes RBAC to prevent users from reading or writing Kubernetes secrets.
This Terraform code creates an IAM policy and role; annotating a service account with the role grants those permissions to any pod that uses that service account.
data "aws_iam_policy_document" "external_secrets" {
statement {
sid = "AllowParameterStoreGets"
effect = "Allow"
actions = [
"ssm:GetParameter",
]
# We should restrict to something like "/${terrform.workspace}/*" ideally
# and if using per-team namespaces the namespace could be added as well
# to ensure applications can only access secrets they are supposed to
resources = ["*"]
}
}
resource "aws_iam_policy" "external_secrets" {
name = "external-secrets-ps-${terraform.workspace}"
policy = data.aws_iam_policy_document.external_secrets.json
}
resource "aws_iam_role" "external_secrets" {
name = "external-secrets-${terraform.workspace}"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "${aws_iam_openid_connect_provider.eks.arn}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${aws_iam_openid_connect_provider.eks.url}:sub": "system:serviceaccount:devops:external-secrets"
}
}
}
]
}
EOF
}
resource "aws_iam_role_policy_attachment" "external_secrets_ecr" {
role = "${aws_iam_role.external_secrets.name}"
policy_arn = "${aws_iam_policy.external_secrets.arn}"
}
Now that we have our role created, you'll need your AWS Account ID to put into the values.yaml file; the Account ID is not considered a sensitive value by AWS.
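A minimal sketch of that values.yaml for the kubernetes-external-secrets chart (keys can differ slightly between chart versions; the region, account ID, and workspace below are placeholders):
env:
  AWS_REGION: us-east-1                  # placeholder: the region your parameters live in
securityContext:
  fsGroup: 65534                         # lets the non-root pod read the projected IRSA token
serviceAccount:
  name: external-secrets                 # must match the sub in the IAM role's trust policy
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/external-secrets-<WORKSPACE>
Then add the chart repo and install it: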
helm repo add external-secrets https://godaddy.github.io/kubernetes-external-secrets/
helm upgrade -i -n devops -f values.yaml external-secrets external-secrets/kubernetes-external-secrets
Now that external-secrets is running, we can start creating services that use it to fetch credentials from AWS Parameter Store.
External DNS
Now we're going to tie the alb-ingress-controller and external-secrets services together using External DNS, which automatically creates DNS records for the ALBs that are created for our ingresses. We're going to use Cloudflare for our DNS, but external-dns works with all major DNS providers that offer an API for managing records. The first order of business is to put our Cloudflare API Token into AWS Parameter Store under the path /cloudflare/api_token. Once this manual step is done, we're going to use external-secrets to create a Kubernetes secret containing the API Token.
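Storing the token can be done with the AWS CLI; it's shown here as a SecureString, and depending on the parameter type and KMS key you choose, the IAM policy above may also need KMS permissions:
aws ssm put-parameter \
  --name /cloudflare/api_token \
  --type SecureString \
  --value "<YOUR_CLOUDFLARE_API_TOKEN>"
With the parameter in place, the following ExternalSecret maps it into a Kubernetes secret: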
apiVersion: 'kubernetes-client.io/v1'
kind: ExternalSecret
metadata:
  name: external-dns
  namespace: devops
secretDescriptor:
  backendType: systemManager
  data:
    - key: /cloudflare/api_token
      name: cloudflare_api_token
You'll want to apply this YAML using kubectl, after which external-secrets will very quickly create a secret with the values specified above. Now that the secret exists, we can install the external-dns Helm chart linked above.
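Here's a sketch of the values.yaml used in the command below, assuming a chart version that can read the Cloudflare token from an existing secret; key names vary between external-dns chart versions, and the domain and owner ID are placeholders:
provider: cloudflare
cloudflare:
  secretName: external-dns      # the secret created by the ExternalSecret above
  proxied: false
domainFilters:
  - example.com                 # placeholder: only manage records in this zone
policy: upsert-only             # create and update records, but never delete them
txtOwnerId: my-cluster          # placeholder: marks the records this cluster owns
Then install the chart: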
helm upgrade -i -n devops -f values.yaml external-dns stable/external-dns
The external-dns service will look at all of your ingresses and automatically register DNS records for every host that is specified. A full example ingress would look like this:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "alb"
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/tags: env=prod,cost_center=api
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:<ACCOUNT_ID>:certificate/<CERT_ID>
    alb.ingress.kubernetes.io/healthcheck-path: /v1/health
  labels:
    app: "api-web"
  name: "api-web"
  namespace: "apps"
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - backend:
              serviceName: api-web
              servicePort: 80
            path: /*
Closing
If you've made it this far, thanks for reading and I hope this was helpful. You should now have an EKS cluster running in a VPC that uses cluster-autoscaler, alb-ingress-controller, external-secrets, and external-dns to create a self-service Kubernetes cluster for you and/or the engineers in your company.