terraform-aws-eks's Introduction

AWS EKS Terraform module

Terraform module which creates AWS EKS (Kubernetes) resources

External Documentation

Please note that we strive to provide a comprehensive suite of documentation for configuring and utilizing the module(s) defined here. Documentation regarding EKS itself (including EKS managed node groups, self-managed node groups, and Fargate profiles) and Kubernetes features, usage, etc. is better left to their respective sources:

Reference Architecture

The examples provided under examples/ offer a comprehensive suite of configurations that demonstrate nearly all of the settings that can be used with this module. However, these examples are not representative of clusters you would typically run for production workloads. For reference architectures that utilize this module, please see the following:

Usage

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "my-cluster"
  cluster_version = "1.29"

  cluster_endpoint_public_access  = true

  cluster_addons = {
    coredns = {
      most_recent = true
    }
    kube-proxy = {
      most_recent = true
    }
    vpc-cni = {
      most_recent = true
    }
  }

  vpc_id                   = "vpc-1234556abcdef"
  subnet_ids               = ["subnet-abcde012", "subnet-bcde012a", "subnet-fghi345a"]
  control_plane_subnet_ids = ["subnet-xyzde987", "subnet-slkjf456", "subnet-qeiru789"]

  # EKS Managed Node Group(s)
  eks_managed_node_group_defaults = {
    instance_types = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]
  }

  eks_managed_node_groups = {
    example = {
      min_size     = 1
      max_size     = 10
      desired_size = 1

      instance_types = ["t3.large"]
      capacity_type  = "SPOT"
    }
  }

  # Cluster access entry
  # To add the current caller identity as an administrator
  enable_cluster_creator_admin_permissions = true

  access_entries = {
    # One access entry with a policy associated
    example = {
      kubernetes_groups = []
      principal_arn     = "arn:aws:iam::123456789012:role/something"

      policy_associations = {
        example = {
          policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy"
          access_scope = {
            namespaces = ["default"]
            type       = "namespace"
          }
        }
      }
    }
  }

  tags = {
    Environment = "dev"
    Terraform   = "true"
  }
}

Cluster Access Entry

When enabling authentication_mode = "API_AND_CONFIG_MAP", EKS will automatically create an access entry for the IAM role(s) used by managed node group(s) and Fargate profile(s); no additional action is required by users. For self-managed node groups and the Karpenter sub-module, this project adds the access entry on behalf of users, so again no additional action is required.

On clusters that were created prior to cluster access management (CAM) support, there will be an existing access entry for the cluster creator. This entry was previously not visible when using the aws-auth ConfigMap, but it becomes visible once access entries are enabled.

Bootstrap Cluster Creator Admin Permissions

Setting bootstrap_cluster_creator_admin_permissions is a one-time operation performed when the cluster is created; it cannot be modified later through the EKS API. This project hardcodes it to false. Users who want the same functionality can achieve it through an access entry, which can be enabled or disabled at any time via the variable enable_cluster_creator_admin_permissions.

Enabling EFA Support

When enabling EFA support via enable_efa_support = true, there are two locations this can be specified - one at the cluster level, and one at the nodegroup level. Enabling at the cluster level will add the EFA required ingress/egress rules to the shared security group created for the nodegroup(s). Enabling at the nodegroup level will do the following (per nodegroup where enabled):

  1. All EFA interfaces supported by the instance will be exposed on the launch template used by the nodegroup
  2. A placement group with strategy = "cluster", per the EFA requirements, is created and passed to the launch template used by the nodegroup
  3. Data sources will reverse lookup the availability zones that support the instance type selected based on the subnets provided, ensuring that only the associated subnets are passed to the launch template and therefore used by the placement group. This avoids the placement group being created in an availability zone that does not support the instance type selected.

Tip

Use the aws-efa-k8s-device-plugin Helm chart to expose the EFA interfaces on the nodes as an extended resource, and allow pods to request the interfaces be mounted to their containers.

The EKS AL2 GPU AMI comes with the necessary EFA components pre-installed - you just need to expose the EFA devices on the nodes via their launch templates, ensure the required EFA security group rules are in place, and deploy the aws-efa-k8s-device-plugin in order to start utilizing EFA within your cluster. Your application container will need to have the necessary libraries and runtime in order to utilize communication over the EFA interfaces (NCCL, aws-ofi-nccl, hwloc, libfabric, aws-neuronx-collectives, CUDA, etc.).
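
For reference, here is a minimal Terraform sketch of installing the device plugin via the Helm provider; the repository URL and chart name are assumptions based on the AWS eks-charts repository and should be verified against the chart's own documentation:

# Sketch only: installs the EFA device plugin through the Terraform Helm provider.
# The repository URL and chart name below are assumptions - verify before use.
resource "helm_release" "aws_efa_k8s_device_plugin" {
  name       = "aws-efa-k8s-device-plugin"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-efa-k8s-device-plugin"
  namespace  = "kube-system"
}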

If you disable the creation and use of the managed nodegroup custom launch template (create_launch_template = false and/or use_custom_launch_template = false), this will interfere with the EFA functionality provided. In addition, if you do not supply an instance_type for self-managed nodegroup(s), or instance_types for the managed nodegroup(s), this will also interfere with the functionality. In order to support the EFA functionality provided by enable_efa_support = true, you must utilize the custom launch template created/provided by this module, and supply an instance_type/instance_types for the respective nodegroup.

The logic behind supporting EFA uses a data source to look up the instance type and retrieve the number of interfaces it supports in order to enumerate and expose those interfaces on the launch template created. For managed nodegroups where a list of instance types is supported, the first instance type in the list is used to calculate the number of EFA interfaces supported. Mixing instance types with varying numbers of interfaces is not recommended for EFA (and in some cases, mixing instance types is not supported at all - e.g. p5.48xlarge and p4d.24xlarge). In addition to exposing the EFA interfaces and updating the security group rules, a placement group is created per the EFA requirements, and only the availability zones that support the selected instance type are used from the subnets provided to the nodegroup.

In order to enable EFA support, you will have to specify enable_efa_support = true on both the cluster and each nodegroup that you wish to enable EFA support for:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  # Truncated for brevity ...

  # Adds the EFA required security group rules to the shared
  # security group created for the nodegroup(s)
  enable_efa_support = true

  eks_managed_node_groups = {
    example = {
      instance_types = ["p5.48xlarge"]

      # Exposes all EFA interfaces on the launch template created by the nodegroup(s)
      # This would expose all 32 EFA interfaces for the p5.48xlarge instance type
      enable_efa_support = true

      pre_bootstrap_user_data = <<-EOT
        # Mount NVME instance store volumes since they are typically
        # available on instance types that support EFA
        setup-local-disks raid0
      EOT

      # EFA should only be enabled when connecting 2 or more nodes
      # Do not use EFA on a single node workload
      min_size     = 2
      max_size     = 10
      desired_size = 2
    }
  }
}

Examples

Contributing

We are grateful to the community for contributing bugfixes and improvements! Please see below to learn how you can take part.

Requirements

Name Version
terraform >= 1.3.2
aws >= 5.40
time >= 0.9
tls >= 3.0

Providers

Name Version
aws >= 5.40
time >= 0.9
tls >= 3.0

Modules

Name Source Version
eks_managed_node_group ./modules/eks-managed-node-group n/a
fargate_profile ./modules/fargate-profile n/a
kms terraform-aws-modules/kms/aws 2.1.0
self_managed_node_group ./modules/self-managed-node-group n/a

Resources

Name Type
aws_cloudwatch_log_group.this resource
aws_ec2_tag.cluster_primary_security_group resource
aws_eks_access_entry.this resource
aws_eks_access_policy_association.this resource
aws_eks_addon.before_compute resource
aws_eks_addon.this resource
aws_eks_cluster.this resource
aws_eks_identity_provider_config.this resource
aws_iam_openid_connect_provider.oidc_provider resource
aws_iam_policy.cluster_encryption resource
aws_iam_policy.cni_ipv6_policy resource
aws_iam_role.this resource
aws_iam_role_policy_attachment.additional resource
aws_iam_role_policy_attachment.cluster_encryption resource
aws_iam_role_policy_attachment.this resource
aws_security_group.cluster resource
aws_security_group.node resource
aws_security_group_rule.cluster resource
aws_security_group_rule.node resource
time_sleep.this resource
aws_caller_identity.current data source
aws_eks_addon_version.this data source
aws_iam_policy_document.assume_role_policy data source
aws_iam_policy_document.cni_ipv6_policy data source
aws_iam_session_context.current data source
aws_partition.current data source
tls_certificate.this data source

Inputs

Name Description Type Default Required
access_entries Map of access entries to add to the cluster any {} no
attach_cluster_encryption_policy Indicates whether or not to attach an additional policy for the cluster IAM role to utilize the encryption key provided bool true no
authentication_mode The authentication mode for the cluster. Valid values are CONFIG_MAP, API or API_AND_CONFIG_MAP string "API_AND_CONFIG_MAP" no
cloudwatch_log_group_class Specifies the log class of the log group. Possible values are: STANDARD or INFREQUENT_ACCESS string null no
cloudwatch_log_group_kms_key_id If a KMS Key ARN is set, this key will be used to encrypt the corresponding log group. Please be sure that the KMS Key has an appropriate key policy (https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/encrypt-log-data-kms.html) string null no
cloudwatch_log_group_retention_in_days Number of days to retain log events. Default retention - 90 days number 90 no
cloudwatch_log_group_tags A map of additional tags to add to the cloudwatch log group created map(string) {} no
cluster_additional_security_group_ids List of additional, externally created security group IDs to attach to the cluster control plane list(string) [] no
cluster_addons Map of cluster addon configurations to enable for the cluster. Addon name can be the map keys or set with name any {} no
cluster_addons_timeouts Create, update, and delete timeout configurations for the cluster addons map(string) {} no
cluster_enabled_log_types A list of the desired control plane logs to enable. For more information, see Amazon EKS Control Plane Logging documentation (https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html) list(string) ["audit", "api", "authenticator"] no
cluster_encryption_config Configuration block with encryption configuration for the cluster. To disable secret encryption, set this value to {} any {"resources": ["secrets"]} no
cluster_encryption_policy_description Description of the cluster encryption policy created string "Cluster encryption policy to allow cluster role to utilize CMK provided" no
cluster_encryption_policy_name Name to use on cluster encryption policy created string null no
cluster_encryption_policy_path Cluster encryption policy path string null no
cluster_encryption_policy_tags A map of additional tags to add to the cluster encryption policy created map(string) {} no
cluster_encryption_policy_use_name_prefix Determines whether cluster encryption policy name (cluster_encryption_policy_name) is used as a prefix bool true no
cluster_endpoint_private_access Indicates whether or not the Amazon EKS private API server endpoint is enabled bool true no
cluster_endpoint_public_access Indicates whether or not the Amazon EKS public API server endpoint is enabled bool false no
cluster_endpoint_public_access_cidrs List of CIDR blocks which can access the Amazon EKS public API server endpoint list(string) ["0.0.0.0/0"] no
cluster_identity_providers Map of cluster identity provider configurations to enable for the cluster. Note - this is different/separate from IRSA any {} no
cluster_ip_family The IP family used to assign Kubernetes pod and service addresses. Valid values are ipv4 (default) and ipv6. You can only specify an IP family when you create a cluster, changing this value will force a new cluster to be created string "ipv4" no
cluster_name Name of the EKS cluster string "" no
cluster_security_group_additional_rules List of additional security group rules to add to the cluster security group created. Set source_node_security_group = true inside rules to set the node_security_group as source any {} no
cluster_security_group_description Description of the cluster security group created string "EKS cluster security group" no
cluster_security_group_id Existing security group ID to be attached to the cluster string "" no
cluster_security_group_name Name to use on cluster security group created string null no
cluster_security_group_tags A map of additional tags to add to the cluster security group created map(string) {} no
cluster_security_group_use_name_prefix Determines whether cluster security group name (cluster_security_group_name) is used as a prefix bool true no
cluster_service_ipv4_cidr The CIDR block to assign Kubernetes service IP addresses from. If you don't specify a block, Kubernetes assigns addresses from either the 10.100.0.0/16 or 172.20.0.0/16 CIDR blocks string null no
cluster_service_ipv6_cidr The CIDR block to assign Kubernetes pod and service IP addresses from if ipv6 was specified when the cluster was created. Kubernetes assigns service addresses from the unique local address range (fc00::/7) because you can't specify a custom IPv6 CIDR block when you create the cluster string null no
cluster_tags A map of additional tags to add to the cluster map(string) {} no
cluster_timeouts Create, update, and delete timeout configurations for the cluster map(string) {} no
cluster_version Kubernetes <major>.<minor> version to use for the EKS cluster (i.e.: 1.27) string null no
control_plane_subnet_ids A list of subnet IDs where the EKS cluster control plane (ENIs) will be provisioned. Used for expanding the pool of subnets used by nodes/node groups without replacing the EKS control plane list(string) [] no
create Controls if resources should be created (affects nearly all resources) bool true no
create_cloudwatch_log_group Determines whether a log group is created by this module for the cluster logs. If not, AWS will automatically create one if logging is enabled bool true no
create_cluster_primary_security_group_tags Indicates whether or not to tag the cluster's primary security group. This security group is created by the EKS service, not the module, and therefore tagging is handled after cluster creation bool true no
create_cluster_security_group Determines if a security group is created for the cluster. Note: the EKS service creates a primary security group for the cluster by default bool true no
create_cni_ipv6_iam_policy Determines whether to create an AmazonEKS_CNI_IPv6_Policy bool false no
create_iam_role Determines whether an IAM role is created or to use an existing IAM role bool true no
create_kms_key Controls if a KMS key for cluster encryption should be created bool true no
create_node_security_group Determines whether to create a security group for the node groups or use the existing node_security_group_id bool true no
custom_oidc_thumbprints Additional list of server certificate thumbprints for the OpenID Connect (OIDC) identity provider's server certificate(s) list(string) [] no
dataplane_wait_duration Duration to wait after the EKS cluster has become active before creating the dataplane components (EKS managed nodegroup(s), self-managed nodegroup(s), Fargate profile(s)) string "30s" no
eks_managed_node_group_defaults Map of EKS managed node group default configurations any {} no
eks_managed_node_groups Map of EKS managed node group definitions to create any {} no
enable_cluster_creator_admin_permissions Indicates whether or not to add the cluster creator (the identity used by Terraform) as an administrator via access entry bool false no
enable_efa_support Determines whether to enable Elastic Fabric Adapter (EFA) support bool false no
enable_irsa Determines whether to create an OpenID Connect Provider for EKS to enable IRSA bool true no
enable_kms_key_rotation Specifies whether key rotation is enabled bool true no
fargate_profile_defaults Map of Fargate Profile default configurations any {} no
fargate_profiles Map of Fargate Profile definitions to create any {} no
iam_role_additional_policies Additional policies to be added to the IAM role map(string) {} no
iam_role_arn Existing IAM role ARN for the cluster. Required if create_iam_role is set to false string null no
iam_role_description Description of the role string null no
iam_role_name Name to use on IAM role created string null no
iam_role_path Cluster IAM role path string null no
iam_role_permissions_boundary ARN of the policy that is used to set the permissions boundary for the IAM role string null no
iam_role_tags A map of additional tags to add to the IAM role created map(string) {} no
iam_role_use_name_prefix Determines whether the IAM role name (iam_role_name) is used as a prefix bool true no
include_oidc_root_ca_thumbprint Determines whether to include the root CA thumbprint in the OpenID Connect (OIDC) identity provider's server certificate(s) bool true no
kms_key_administrators A list of IAM ARNs for key administrators. If no value is provided, the current caller identity is used to ensure at least one key admin is available list(string) [] no
kms_key_aliases A list of aliases to create. Note - due to the use of toset(), values must be static strings and not computed values list(string) [] no
kms_key_deletion_window_in_days The waiting period, specified in number of days. After the waiting period ends, AWS KMS deletes the KMS key. If you specify a value, it must be between 7 and 30, inclusive. If you do not specify a value, it defaults to 30 number null no
kms_key_description The description of the key as viewed in AWS console string null no
kms_key_enable_default_policy Specifies whether to enable the default key policy bool true no
kms_key_override_policy_documents List of IAM policy documents that are merged together into the exported document. In merging, statements with non-blank sids will override statements with the same sid list(string) [] no
kms_key_owners A list of IAM ARNs for those who will have full key permissions (kms:*) list(string) [] no
kms_key_service_users A list of IAM ARNs for key service users list(string) [] no
kms_key_source_policy_documents List of IAM policy documents that are merged together into the exported document. Statements must have unique sids list(string) [] no
kms_key_users A list of IAM ARNs for key users list(string) [] no
node_security_group_additional_rules List of additional security group rules to add to the node security group created. Set source_cluster_security_group = true inside rules to set the cluster_security_group as source any {} no
node_security_group_description Description of the node security group created string "EKS node shared security group" no
node_security_group_enable_recommended_rules Determines whether to enable recommended security group rules for the node security group created. This includes node-to-node TCP ingress on ephemeral ports and allows all egress traffic bool true no
node_security_group_id ID of an existing security group to attach to the node groups created string "" no
node_security_group_name Name to use on node security group created string null no
node_security_group_tags A map of additional tags to add to the node security group created map(string) {} no
node_security_group_use_name_prefix Determines whether node security group name (node_security_group_name) is used as a prefix bool true no
openid_connect_audiences List of OpenID Connect audience client IDs to add to the IRSA provider list(string) [] no
outpost_config Configuration for the AWS Outpost to provision the cluster on any {} no
prefix_separator The separator to use between the prefix and the generated timestamp for resource names string "-" no
putin_khuylo Do you agree that Putin doesn't respect Ukrainian sovereignty and territorial integrity? More info: https://en.wikipedia.org/wiki/Putin_khuylo! bool true no
self_managed_node_group_defaults Map of self-managed node group default configurations any {} no
self_managed_node_groups Map of self-managed node group definitions to create any {} no
subnet_ids A list of subnet IDs where the nodes/node groups will be provisioned. If control_plane_subnet_ids is not provided, the EKS cluster control plane (ENIs) will be provisioned in these subnets list(string) [] no
tags A map of tags to add to all resources map(string) {} no
vpc_id ID of the VPC where the cluster security group will be provisioned string null no

Outputs

Name Description
access_entries Map of access entries created and their attributes
access_policy_associations Map of EKS cluster access policy associations created and their attributes
cloudwatch_log_group_arn ARN of the CloudWatch log group created
cloudwatch_log_group_name Name of the CloudWatch log group created
cluster_addons Map of attribute maps for all EKS cluster addons enabled
cluster_arn The Amazon Resource Name (ARN) of the cluster
cluster_certificate_authority_data Base64 encoded certificate data required to communicate with the cluster
cluster_endpoint Endpoint for your Kubernetes API server
cluster_iam_role_arn IAM role ARN of the EKS cluster
cluster_iam_role_name IAM role name of the EKS cluster
cluster_iam_role_unique_id Stable and unique string identifying the IAM role
cluster_id The ID of the EKS cluster. Note: currently a value is returned only for local EKS clusters created on Outposts
cluster_identity_providers Map of attribute maps for all EKS identity providers enabled
cluster_ip_family The IP family used by the cluster (e.g. ipv4 or ipv6)
cluster_name The name of the EKS cluster
cluster_oidc_issuer_url The URL on the EKS cluster for the OpenID Connect identity provider
cluster_platform_version Platform version for the cluster
cluster_primary_security_group_id Cluster security group that was created by Amazon EKS for the cluster. Managed node groups use this security group for control-plane-to-data-plane communication. Referred to as 'Cluster security group' in the EKS console
cluster_security_group_arn Amazon Resource Name (ARN) of the cluster security group
cluster_security_group_id ID of the cluster security group
cluster_service_cidr The CIDR block where Kubernetes pod and service IP addresses are assigned from
cluster_status Status of the EKS cluster. One of CREATING, ACTIVE, DELETING, FAILED
cluster_tls_certificate_sha1_fingerprint The SHA1 fingerprint of the public key of the cluster's certificate
cluster_version The Kubernetes version for the cluster
eks_managed_node_groups Map of attribute maps for all EKS managed node groups created
eks_managed_node_groups_autoscaling_group_names List of the autoscaling group names created by EKS managed node groups
fargate_profiles Map of attribute maps for all EKS Fargate Profiles created
kms_key_arn The Amazon Resource Name (ARN) of the key
kms_key_id The globally unique identifier for the key
kms_key_policy The IAM resource policy set on the key
node_security_group_arn Amazon Resource Name (ARN) of the node shared security group
node_security_group_id ID of the node shared security group
oidc_provider The OpenID Connect identity provider (issuer URL without leading https://)
oidc_provider_arn The ARN of the OIDC Provider if enable_irsa = true
self_managed_node_groups Map of attribute maps for all self managed node groups created
self_managed_node_groups_autoscaling_group_names List of the autoscaling group names created by self-managed node groups

License

Apache 2 Licensed. See LICENSE for full details.

Additional information for users from Russia and Belarus

terraform-aws-eks's People

Contributors

aliartiza75 antonbabenko archifleks barryib betajobot brandonjbjelland bryantbiggs bshelton229 chenrui333 daroga0002 dpiddockcmp erks huddy jimbeck laverya max-rocket-internet nauxliu ozbillwang rothandrew sc250024 sdavids13 semantic-release-bot shanmugakarna sidprak sppwf stefansedich stevehipwell stijndehaes tculp yutachaos

terraform-aws-eks's Issues

root_block_device missed

I have issues

After running the EKS cluster for a week, I hit a disk space issue; I need a feature to control the root volume and extend it to a larger size.

I'm submitting a...

  • [X ] feature request

What is the current behavior?

There is no control over the root block device, so disk space runs out quickly

What's the expected behavior?

The root_block_device mapping supports the following:

volume_type - (Optional) The type of volume. Can be "standard", "gp2", or "io1". (Default: "standard").
volume_size - (Optional) The size of the volume in gigabytes.
iops - (Optional) The amount of provisioned IOPS. This must be set with a volume_type of "io1".
delete_on_termination - (Optional) Whether the volume should be destroyed on instance termination (Default: true).

Currently, only delete_on_termination is enabled
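
To make the request concrete, the sketch below shows what exposing the root block device could look like in the worker_groups input; the root_volume_size and root_volume_type keys are proposed/assumed here, mirroring usage that appears elsewhere on this page, and are not confirmed inputs of the affected version:

# Sketch only: fragment of the eks module inputs. root_volume_size and
# root_volume_type are proposed/assumed keys, not confirmed module inputs.
worker_groups = [
  {
    name             = "workers"
    instance_type    = "m4.large"
    root_volume_size = "100"  # size in GiB
    root_volume_type = "gp2"
  },
]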

Are you able to fix this problem and submit a PR? Link here if you have already.

Yes, I will

Environment details

  • Affected module version: v1.1.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Is there a way to modify the aws-k8s-cni yaml before creating the worker groups?

I have issues

The AWS CNI by default pre-allocates the max number of IPs per node which results in unnecessary depletion of my IP pool. As of CNI 1.1, you can fix this by setting WARM_IP_TARGET in the aws-k8s-cni.yaml but this needs to be applied before the EC2 instances are created.

Is there a way I can have Terraform apply a k8s config between creating the cluster and creating the worker groups? My current workaround is specifying 0 nodes in the Terraform module, applying my custom aws-k8s-cni.yaml, then changing the worker node count to my actual desired number.

Thanks!

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

The cluster is created using the release aws-k8s-cni.yaml which does not have WARM_IP_TARGET set.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: 1.3.0
  • OS: MacOS
  • Terraform version: v0.11.7

Any other relevant info

Feature Request: Key Pair for Worker Nodes

I'm submitting a

  • feature request

What is the current behavior

I want to access the worker nodes via SSH using a key pair (private/public key). However, there is no option that allows me to specify a key pair to be installed on the worker nodes.

What's the expected behavior

Provide an option to specify a key pair (local or already in AWS) and install it on the worker nodes. It should be possible to set the key name via a variable. For inspiration, have a look at the Terraform-DC/OS module: https://github.com/dcos/terraform-dcos/tree/master/aws#configure-aws-ssh-keys
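
For illustration, a sketch of the requested behaviour under these assumptions: the aws_key_pair resource is standard Terraform, while the key_name worker group key mirrors usage shown in another issue on this page:

# Sketch only: create an EC2 key pair and reference it per worker group.
resource "aws_key_pair" "workers" {
  key_name   = "eks-workers"
  public_key = "${file("~/.ssh/id_rsa.pub")}"
}

# Inside the eks module block (key_name is an assumed worker group key):
# worker_groups = [
#   {
#     name     = "workers"
#     key_name = "${aws_key_pair.workers.key_name}"
#   },
# ]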

aws_auth config fails to apply while getting started

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Terraform apply fails with

* module.eks.null_resource.update_config_map_aws_auth: Error running command 'kubectl apply -f ./config-map-aws-auth_beam-eks.yaml --kubeconfig ./kubeconfig_beam-eks': exit status 1. Output: error: unable to recognize "./config-map-aws-auth_beam-eks.yaml": Unauthorized

If this is a bug, how to reproduce? Please include a code sample if relevant.

This is my configuration for the eks module.

I have a really basic vpc created via terraform-aws-modules/vpc/aws.

module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "beam-eks"
  subnets      = "${module.vpc.public_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"
}

What's the expected behavior?

Apply succeeds

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version: "1.4.0"
  • OS: MacOS 10.13.3 (17D47)
    Terraform v0.11.8
  • provider.aws v1.33.0
  • provider.http v1.0.1
  • provider.local v1.1.0
  • provider.null v1.0.0
  • provider.template v1.0.0

Generated config not saved to correct output path

I have issues with where the generated config map is output when a folder is specified

I'm submitting a...

[ * ] bug report

What is the current behavior?

The generated config-map-aws-auth***.yaml and kubeconfig files are saved to the root folder; the value of the config_output_path variable is concatenated onto the file name instead of being used as the output directory

If this is a bug, how to reproduce? Please include a code sample if relevant.

Create a cluster with a custom config_output_path

What's the expected behavior?

The config-map-aws-auth***.yaml file should be saved to the config_output_path

Are you able to fix this problem and submit a PR? Link here if you have already.

aws_auth.tf
line 3 - missing /
filename = "${var.config_output_path}/config-map-aws-auth_${var.cluster_name}.yaml"

line 9 - missing /
command = "kubectl apply -f ${var.config_output_path}/config-map-aws-auth_${var.cluster_name}.yaml --kubeconfig ${var.config_output_path}/kubeconfig_${var.cluster_name}"

kubectl.tf
line 3 - missing /
filename = "${var.config_output_path}/kubeconfig_${var.cluster_name}"

Feature Request: Worker Configuration

I'm submitting a

  • feature request

What is the current behavior

Worker node number and instance type cannot be configured.

What's the expected behavior

Configuration options for worker node number and instance type can be specified in module inputs.
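
A sketch of what such inputs could look like, using the worker_groups list-of-maps convention that appears elsewhere on this page (the exact key names are assumptions):

# Sketch only: fragment of the eks module inputs; key names are assumed.
worker_groups = [
  {
    name                 = "default"
    instance_type        = "m4.large"
    asg_min_size         = 1
    asg_desired_capacity = 3
    asg_max_size         = 5
  },
]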

Error running command to update_config_map_aws_auth

I have issues

Please Help! I ran everything with defaults other then setting the VPC and Subnet

I'm submitting a...

  • bug report
  • feature request
  • [X ] support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

I get this error running the Terraform apply
Error: Error applying plan:

1 error(s) occurred:

  • null_resource.update_config_map_aws_auth: Error running command 'kubectl apply -f .//config-map-aws-auth_EKSClusterTest.yaml --kubeconfig .//kubec
    onfig_EKSClusterTest': exit status 1. Output: Unable to connect to the server: getting token: exec: exec: "aws-iam-authenticator": executable file n
    ot found in %PATH%

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Set the config on the instance

Are you able to fix this problem and submit a PR? Link here if you have already.

No can I please have help?

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

First time setting this up!

ASG workers on spot instances

I have issues

It is great that with this module I can use more than one ASG worker pool. It would be nice to also be able to use spot instances, e.g. for background jobs or any application that can recover quickly from a replaced node.

I'm submitting a...

  • feature request

What is the current behavior?

Spot instances cannot be used as worker nodes (or at least I do not know how)

What's the expected behavior?

I could define that one (or all) of my worker nodes ASG are using spot instances.
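
For illustration, a sketch of what a spot-backed worker group could look like; the spot_price key is an assumption and may not exist in the module version referenced in this issue:

# Sketch only: fragment of the eks module inputs for a spot-backed worker group.
# The spot_price key is an assumption, not a confirmed module input.
worker_groups = [
  {
    name                 = "spot-workers"
    instance_type        = "m4.large"
    spot_price           = "0.05"  # maximum bid in USD per hour
    asg_desired_capacity = 2
    asg_max_size         = 10
  },
]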

Cluster and worker security group specification doesn't work

I have issues

I am creating an EKS cluster while providing a cluster_security_group_id and worker_security_group_id.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

When specifying a security group, i.e. cluster_security_group_id = "sg-123" or worker_security_group_id = "sg-123", I get:

Error: Error running plan: 1 error(s) occurred:
module.eks.local.cluster_security_group_id: local.cluster_security_group_id: Resource 'aws_security_group.cluster' not found for variable 'aws_security_group.cluster.id'

If this is a bug, how to reproduce? Please include a code sample

Create an eks cluster with a cluster_security_group_id or worker_security_group_id specified.

Terraform does not support short-circuit evaluation in its ternary operator. The fix for this issue is specified here.
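
To make the failure mode concrete, here is a simplified sketch of the kind of expression involved (names differ from the actual module code): because Terraform 0.11 evaluates both branches of the conditional, the reference to aws_security_group.cluster.id fails whenever that resource is created with count = 0.

# Simplified sketch of the problematic pattern; both branches of the conditional
# are evaluated, so the resource reference errors when the security group
# resource has count = 0 (i.e. when an external security group ID is supplied).
locals {
  cluster_security_group_id = "${var.cluster_security_group_id == "" ? aws_security_group.cluster.id : var.cluster_security_group_id}"
}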

What's the expected behavior

We should be able to create an EKS cluster while specifying cluster sg or worker sg as the documentation currently specifies.

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

AMI eks-worker-* query returned no results

I have issues

I'm submitting a...

  • bug report

What is the current behavior?

If region is set to us-west-1:

Error: Error refreshing state: 1 error(s) occurred:

  • module.eks.data.aws_ami.eks_worker: 1 error(s) occurred:

  • module.eks.data.aws_ami.eks_worker: data.aws_ami.eks_worker: Your query returned no results. Please change your search criteria and try again.

If this is a bug, how to reproduce? Please include a code sample if relevant.

module "eks" {
source = "terraform-aws-modules/eks/aws"
cluster_name = "test-eks-cluster"
vpc_id = "${module.vpc.default_vpc_id}"
subnets = "${module.vpc.public_subnets}"

tags = {
Environment = "test"
Terraform = "true"
}
}

What's the expected behavior?

An available AWS AMI ID.

Are you able to fix this problem and submit a PR? Link here if you have already.

Not sure how.

Environment details

  • Affected module version: 1.3.0
  • OS: Linux
  • Terraform version: 0.11.7

Any other relevant info

Worker ASG names should be exposed

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Worker ASG ARNs are exposed, but not names. ASG names are used in aws_autoscaling_attachment among other things.
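
For context, a sketch of the kind of usage that needs the name rather than the ARN; the output name workers_asg_names and the aws_elb.ingress resource are hypothetical:

# Sketch only: attach an existing classic ELB to a worker ASG by name.
# module.eks.workers_asg_names and aws_elb.ingress are hypothetical names.
resource "aws_autoscaling_attachment" "workers" {
  autoscaling_group_name = "${element(module.eks.workers_asg_names, 0)}"
  elb                    = "${aws_elb.ingress.name}"
}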

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Both are exposed

Are you able to fix this problem and submit a PR? Link here if you have already.

PR: #77

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

No logs exported to CloudWatch

I have issues

I have difficulty troubleshooting EKS node issues, for example an OutOfDisk issue. When I go to CloudWatch, there are no instance logs or /var/log/messages logs.

OutOfDisk Unknown Fri, 13 Jul 2018 00:11:01 +0000 Fri, 13 Jul 2018 00:11:43 +0000 NodeStatusUnknown Kubelet stopped posting node status.

Currently there is no key pair set, and I can't log in to the EKS nodes to do further checks.

I'm submitting a...

  • feature request
  • support request

What is the current behavior?

No logs are available from the EKS cluster.

What's the expected behavior?

I need a way to review the logs when something happens.

Are you able to fix this problem and submit a PR? Link here if you have already.

not sure how to fix this issue, need help.

Environment details

  • Affected module version: v1.1.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Experience with blue/green using this module?

I have issues

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Currently, in order to achieve blue/green deployment with worker groups (i.e. updating to a new AMI), I have to add a new worker group with the updated AMI, let them spin up, drain the old nodes so pods transition, then scale down the old worker group (set min/max/desired to 0).

This is not a terrible way of doing it but the problem is that the old ASG (and related resources) sticks around forever and there doesn't seem to be a way to clean up the old stuff without major surgery. If I change the AMI 3 times, I now have 3 worker groups - 2 inactive and scaled to 0 and one active.

Is there a better way of doing this with this module? There's a distinct possibility I'm missing some fundamental terraform concepts but this seems like a complex issue to me. My code ends up looking like this after a new worker group is fully deployed and the old is scaled down (you can see how even semi-frequent deployments would make this list long and leave a lot of trailing garbage):

                  map(
                      "name", "k8s-worker-179fc16f",
                      "ami_id", "ami-179fc16f",
                      "asg_desired_capacity", "0",
                      "asg_max_size", "0",
                      "asg_min_size", "0",
                      ),
                  map(
                      "name", "k8s-worker-67a0841f",
                      "ami_id", "ami-67a0841f",
                      "asg_desired_capacity", "5",
                      "asg_max_size", "8",
                      "asg_min_size", "5",
                      "instance_type","${lookup(var.worker_sizes, "${terraform.workspace}")}",
                      "key_name", "${aws_key_pair.infra-deployer.key_name}",
                      "root_volume_size", "48"
                      )

Environment details

  • Affected module version: latest with customizations
  • OS: OSX and AL2
  • Terraform version: 0.11.7

Any other relevant info

I have seen other code around the internet that does blue/green ASGs but those are for much simpler use-cases IMO - a create_before_destroy and letting it rip would bring a K8s cluster down. I have no qualms with multiple apply steps - its the cleanup part that I'm after.

It seems the cluster it is running with Authorization enabled (like RBAC) and there is no permissions for the ingress controller. Please check the configuration

I have issues when deploy alb-ingress-controller

It seems the cluster it is running with Authorization enabled (like RBAC) and there is no permissions for the ingress controller. Please check the configuration

I'm submitting a

  • bug report
  • support request

What is the current behavior

I can't deploy alb-ingress-controller with the above error.

If this is a bug, how to reproduce? Please include a code sample

I am not 100% sure it is a bug.

After creating the EKS cluster with this module, I followed the steps up to step 4 and got this error.

This step has additional help, but I am not sure how to apply it to an EKS cluster created with this module.

Deploy the modified alb-ingress-controller.

$ kubectl apply -f alb-ingress-controller.yaml

The manifest above will deploy the controller to the kube-system namespace. If you deploy it outside of kube-system and are using RBAC, you may need to adjust RBAC roles and bindings.

What's the expected behavior

Should work without error.

Environment

  • Affected module version: latest (a80c6e6)
  • OS: AWS EKS
  • Terraform version: 0.11.7

Other relevant info

Avoid using hardcoded value for max pod per node

Right now in the user-data script we have

sed -i s,MAX_PODS,20,g /etc/systemd/system/kubelet.service

The value 20 is hardcoded right now. Since AWS released the numbers in their CloudFormation template, I think we can extract the value and use a lookup function to get the proper value.

A proposal:

locals {
  # Mapping from the node type that we selected and the max number of pods that it can run
  # Taken from https://amazon-eks.s3-us-west-2.amazonaws.com/1.10.3/2018-06-05/amazon-eks-nodegroup.yaml
  max_pod_per_node = {
    # Keys are quoted because HCL treats unquoted dotted keys as nested maps
    "c4.large"    = 29
    "c4.xlarge"   = 58
    "c4.2xlarge"  = 58
    "c4.4xlarge"  = 234
    "c4.8xlarge"  = 234
    "c5.large"    = 29
    "c5.xlarge"   = 58
    "c5.2xlarge"  = 58
    "c5.4xlarge"  = 234
    "c5.9xlarge"  = 234
    "c5.18xlarge" = 737
    "i3.large"    = 29
    "i3.xlarge"   = 58
    "i3.2xlarge"  = 58
    "i3.4xlarge"  = 234
    "i3.8xlarge"  = 234
    "i3.16xlarge" = 737
    "m3.medium"   = 12
    "m3.large"    = 29
    "m3.xlarge"   = 58
    "m3.2xlarge"  = 118
    "m4.large"    = 20
    "m4.xlarge"   = 58
    "m4.2xlarge"  = 58
    "m4.4xlarge"  = 234
    "m4.10xlarge" = 234
    "m5.large"    = 29
    "m5.xlarge"   = 58
    "m5.2xlarge"  = 58
    "m5.4xlarge"  = 234
    "m5.12xlarge" = 234
    "m5.24xlarge" = 737
    "p2.xlarge"   = 58
    "p2.8xlarge"  = 234
    "p2.16xlarge" = 234
    "p3.2xlarge"  = 58
    "p3.8xlarge"  = 234
    "p3.16xlarge" = 234
    "r3.xlarge"   = 58
    "r3.2xlarge"  = 58
    "r3.4xlarge"  = 234
    "r3.8xlarge"  = 234
    "r4.large"    = 29
    "r4.xlarge"   = 58
    "r4.2xlarge"  = 58
    "r4.4xlarge"  = 234
    "r4.8xlarge"  = 234
    "r4.16xlarge" = 737
    "t2.small"    = 8
    "t2.medium"   = 17
    "t2.large"    = 35
    "t2.xlarge"   = 44
    "t2.2xlarge"  = 44
    "x1.16xlarge" = 234
    "x1.32xlarge" = 234
  }

  workers_userdata = <<USERDATA
#!/bin/bash -xe
CA_CERTIFICATE_DIRECTORY=/etc/kubernetes/pki
CA_CERTIFICATE_FILE_PATH=$CA_CERTIFICATE_DIRECTORY/ca.crt
mkdir -p $CA_CERTIFICATE_DIRECTORY
echo "${aws_eks_cluster.this.certificate_authority.0.data}" | base64 -d >  $CA_CERTIFICATE_FILE_PATH
INTERNAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.this.endpoint},g /var/lib/kubelet/kubeconfig
sed -i s,CLUSTER_NAME,${var.cluster_name},g /var/lib/kubelet/kubeconfig
sed -i s,REGION,${data.aws_region.current.name},g /etc/systemd/system/kubelet.service
sed -i s,MAX_PODS,${lookup(local.max_pod_per_node, var.workers_instance_type)},g /etc/systemd/system/kubelet.service
sed -i s,MASTER_ENDPOINT,${aws_eks_cluster.this.endpoint},g /etc/systemd/system/kubelet.service
sed -i s,INTERNAL_IP,$INTERNAL_IP,g /etc/systemd/system/kubelet.service
DNS_CLUSTER_IP=10.100.0.10
if [[ $INTERNAL_IP == 10.* ]] ; then DNS_CLUSTER_IP=172.20.0.10; fi
sed -i s,DNS_CLUSTER_IP,$DNS_CLUSTER_IP,g /etc/systemd/system/kubelet.service
sed -i s,CERTIFICATE_AUTHORITY_FILE,$CA_CERTIFICATE_FILE_PATH,g /var/lib/kubelet/kubeconfig
sed -i s,CLIENT_CA_FILE,$CA_CERTIFICATE_FILE_PATH,g  /etc/systemd/system/kubelet.service
systemctl daemon-reload
systemctl restart kubelet kube-proxy
USERDATA
}

@brandoconnor Please let me know if this is OK, I'll create a fork and a pull request later

Bug - EKS can not create load balancers after module provisioned in new AWS account

I have issues

Provisioning an EKS cluster in a new AWS account will result in an error when attempting to provision a load balancer, if no load balancers of any kind have been provisioned in the account before.

I'm submitting a...

  • bug report

What is the current behavior?

No previous load balancers exist (i.e. the service-linked role AWSServiceRoleForElasticLoadBalancing doesn't exist):

AccessDenied: User: <MODULE-PROVISIONED-ROLE> is not authorized to perform: iam:CreateServiceLinkedRole on resource: arn:aws:iam::<ACCOUNT-ID>:role/aws-service-role/elasticloadbalancing.amazonaws.com/AWSServiceRoleForElasticLoadBalancing

because EKS attempts to create the ELB service-linked role for you, and the roles created by the module lack iam:CreateServiceLinkedRole.

If this is a bug, how to reproduce? Please include a code sample if relevant.

  • Provision an EKS cluster using the module into a new account (or ensure the service-linked role AWSServiceRoleForElasticLoadBalancing doesn't exist)
  • Attempt to provision a Service of type LoadBalancer via Kubernetes
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.2
        ports:
        - containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: nginxservice
spec:
  type : LoadBalancer
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

What's the expected behavior?

EKS should provision load balancer.

The module should optionally provision (via a flag) an aws_iam_service_linked_role resource, or include updated IAM policies (iam:CreateServiceLinkedRole) to allow the EKS cluster to provision the required service-linked role. Alternatively, if this is deemed not the responsibility of the module, the "Assumptions" section in README.md should note the issue.
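
For reference, a minimal sketch of the optional resource mentioned above; the create_elb_service_linked_role flag name is hypothetical:

# Sketch only: optionally create the ELB service-linked role so EKS can
# provision load balancers in a brand new account. The flag name is hypothetical.
variable "create_elb_service_linked_role" {
  default = false
}

resource "aws_iam_service_linked_role" "elb" {
  count            = "${var.create_elb_service_linked_role ? 1 : 0}"
  aws_service_name = "elasticloadbalancing.amazonaws.com"
}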

Are you able to fix this problem and submit a PR? Link here if you have already.

Possibly, depending on the choice of solution (implementation change, documentation update)

Environment details

  • Affected module version: All

Any other relevant info

AWS Service Link FAQ:
https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/elb-service-linked-roles.html#create-service-linked-role

Security group "workers_ingress_cluster" is very limiting

Currently, in workers.tf, we have this security group:

resource "aws_security_group_rule" "workers_ingress_cluster" {
  description              = "Allow workers Kubelets and pods to receive communication from the cluster control plane."
  protocol                 = "tcp"
  security_group_id        = "${aws_security_group.workers.id}"
  source_security_group_id = "${local.cluster_security_group_id}"
  from_port                = 1025
  to_port                  = 65535
  type                     = "ingress"
  count                    = "${var.worker_security_group_id == "" ? 1 : 0}"
}

Basically, this setting makes it impossible for Kubernetes services to access pods that have containerPort set to anything below 1025, which is a huge issue since so many of them use port 80 (e.g. nginx). So, from_port should be set to 0, not 1025.

I realize this is copied from CloudFormation in the official EKS guide, so I'll also submit an issue there.

Allow worker nodes to be created in private subnets if eks cluster has both private and public subnets

I have issues

I'm submitting a

  • bug report
  • [x ] feature request
  • support request

What is the current behavior

Based on this guide from AWS, it is recommended that you specify both public and private subnets when creating your EKS cluster, but that you only create your worker nodes in your private subnets. The current behaviour of this module uses the same subnets for creating the EKS cluster as for placing the worker nodes.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I believe it would be a good feature to add an additional (optional) list variable to the module called worker_subnets that will be used to create the worker nodes within. This means you can add private and public subnets to the subnets variable, but only add private subnets to the worker_subnets variable.
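
To make the proposal concrete, here is a sketch of how the suggested worker_subnets variable might be used; worker_subnets does not exist in the module today, it is the feature being requested:

# Sketch only: worker_subnets is the hypothetical variable proposed in this issue.
module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "example"
  vpc_id       = "${module.vpc.vpc_id}"

  # The control plane can use both public and private subnets
  subnets = "${concat(module.vpc.public_subnets, module.vpc.private_subnets)}"

  # Proposed: place worker nodes only in the private subnets
  worker_subnets = "${module.vpc.private_subnets}"
}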

Environment

  • Affected module version:
  • OS:
  • Terraform version:

Other relevant info

I have a branch with this feature on a fork, I will add a PR to be looked at.

Support for the new amazon-eks-node-* AMI with bootstrap script

I have issues

The new amazon-eks-node-* AMI with bootstrap script has been released. However, it's not backward compatible with the old AMI and doesn't work with this module.

https://aws.amazon.com/blogs/opensource/improvements-eks-worker-node-provisioning/

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

This module only works with the eks-worker-* AMIs.

If this is a bug, how to reproduce? Please include a code sample if relevant.

N/A

What's the expected behavior?

This module should also work with the new amazon-eks-node-* AMI. The entire userdata.sh.tpl can be reduced to something like this:

# Allow user supplied pre userdata code
${pre_userdata}

# Bootstrap and join the cluster
/etc/eks/bootstrap.sh --b64-cluster-ca '${cluster_auth_base64}' --apiserver-endpoint '${endpoint}' --kubelet-extra-args '${kubelet_extra_args}' '${cluster_name}'

# Allow user supplied userdata code
${additional_userdata}

Are you able to fix this problem and submit a PR? Link here if you have already.

I can contribute, but would like to discuss on how we want to approach backward compatibility first.

Environment details

  • Affected module version: 1.4.0
  • OS: all
  • Terraform version: all

Any other relevant info

See:

Fix for AWS EKS "is not authorized to perform: iam:CreateServiceLinkedRole"

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

After deploying EKS via this TF module in a brand new AWS account, the internet-facing k8s service I created could not create a load balancer. It turns out this is because no ELB has ever been created in this brand new account, and the AWS user guide (as well as this module) assumes that AWSServiceRoleForElasticLoadBalancing already exists.

https://stackoverflow.com/questions/51597410/aws-eks-is-not-authorized-to-perform-iamcreateservicelinkedrole

Recommend adding

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "iam:CreateServiceLinkedRole",
                "Resource": "arn:aws:iam::*:role/aws-service-role/*"
            },
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeAccountAttributes"
                ],
                "Resource": "*"
            }
        ]
    }

To the cluster role policy.

asg size changes should be ignored.

I have issues

asg size changes should be ignored.

I'm submitting a

  • feature request

What is the current behavior

After updating the ASG size post-deploy, terraform apply detects the changes, which should be ignored.

At least changes in desired_capacity should be ignored.

  ~ module.eks.aws_autoscaling_group.workers
      desired_capacity:         "2" => "1"
      max_size:                 "5" => "3"
      min_size:                 "2" => "1"

What's the expected behaviour

Ignore the changes, since we don't want the running system to be re-sized.
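
One way the module could implement this, sketched with the standard lifecycle block (resource names and arguments are simplified relative to the module):

# Sketch only: ignore desired_capacity drift so terraform apply does not
# resize a running cluster. Arguments are simplified relative to the module.
resource "aws_autoscaling_group" "workers" {
  name_prefix          = "eks-workers-"
  launch_configuration = "${aws_launch_configuration.workers.id}"
  vpc_zone_identifier  = ["${var.subnets}"]
  min_size             = 1
  max_size             = 3
  desired_capacity     = 2

  lifecycle {
    ignore_changes = ["desired_capacity"]
  }
}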

Environment

  • Affected module version: 1.0.0
  • OS: ubuntu
  • Terraform version: 0.11.7

Other relevant info

If you are fine to ignore change in desired_capacity, I can raise PR for this feature, please confirm.

kube-proxy doesn't exist in the latest AWS worker node AMI

I have issues

I'm submitting a

  • bug report

What is the current behavior

kube-proxy doesn't exist in the latest AWS worker node AMI, but the userdata template tries to restart it, which results in the error below:

Failed to restart kube-proxy.service: Unit not found.

What's the expected behavior

Remove kube-proxy from the restart step.

Automatic deployment of Cluster Autoscaler

I have issues

Although worker nodes are deployed as an autoscaling group, when EKS cannot schedule more pods because of missing resources (e.g. CPU), additional nodes are not started by the ASG. It would be nice to deploy the Cluster Autoscaler automatically (or at least document in the README how to do this) so we can benefit from the ASG.

I'm submitting a...

  • feature request

What is the current behavior?

New ASG workers are not started, even when EKS cannot schedule more pods because of missing CPU and we are still below the maximum size of the workers ASG.

What's the expected behaviour?

The expected behaviour would be:

  1. Deploy Cluster Autoscaler: https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/cloudprovider/aws/README.md
     (these permissions need to be added to the EC2 EKS IAM role; see the sketch after this list:
     "autoscaling:DescribeAutoScalingGroups",
     "autoscaling:DescribeAutoScalingInstances",
     "autoscaling:SetDesiredCapacity",
     "autoscaling:TerminateInstanceInAutoScalingGroup")
  2. Scale a sample application so it needs more CPU than a single VM provides.
  3. See that the Autoscaler adds more nodes.
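
A sketch of attaching those permissions with Terraform; the module output name worker_iam_role_name is an assumption and may differ between module versions:

# Sketch only: attaches the Cluster Autoscaler permissions listed above to the
# worker node IAM role. module.eks.worker_iam_role_name is an assumed output name.
resource "aws_iam_role_policy" "cluster_autoscaler" {
  name = "cluster-autoscaler"
  role = "${module.eks.worker_iam_role_name}"

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}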

Environment details

EKS in us-east-1

  • Terraform version:
    v0.11.7

AWS Profile in kubeconfig template

I'm submitting a

  • feature request

For my current delivery, the customer has credentials with multiple profiles and not only needs to specify different profiles per cluster, but has no default profile.

It would be great if the kubeconfig.tpl could be modified:

...
users:
- name: aws
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: heptio-authenticator-aws
      args:
        - "token"
        - "-i"
        - "${cluster_name}"
      env:
        - name: AWS_PROFILE
          value: ${aws_profile}

where the default Terraform value used to populate the template would be "default", to ensure no regression.

I wanted to start a discussion before a PR to ensure best path forward on this. Thanks!

Bring your own security group

I have issues...

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

Currently the module only supports creation of the security groups for the cluster and workers from within the module itself. Some of the rules are 100% necessary and others are just commonplace and therefore useful. The rules rely on dynamic values but could be applied just the same to a security group passed to the module instead of created within the module. This would give flexibility to the module consumer to provide their own security group and a predefined set of rules that might be tighter than what the module currently prescribes.

If this is a bug, how to reproduce? Please include a code sample

NA

What's the expected behavior

The module should be able to accept a security group ID as input for both the cluster and workers with rules defined outside the module.

Environment

  • Affected module version: current (0.2.0)
  • OS: All
  • Terraform version: 0.11.x

Assign public IPs to EKS workers in private subnets.

I have issues

I created an EKS cluster in private subnets. We have also discussed this topic in several tickets and agreed to create EKS workers in private subnets only.

Now it's time to decide: should we keep the feature that assigns public IPs to EKS workers?

If it is not required any more, I will raise a PR to remove this line directly. Otherwise, I will have to add a condition. Which way would you prefer?

resource "aws_launch_configuration" "workers" {
  name_prefix                 = "${var.cluster_name}-${lookup(var.worker_groups[count.index], "name", count.index)}"
-  associate_public_ip_address = "${lookup(var.worker_groups[count.index], "public_ip", lookup(var.workers_group_defaults, "public_ip"))}"
  security_groups             = ["${local.worker_security_group_id}"]
  iam_instance_profile        = "${aws_iam_instance_profile.workers.id}"
  image_id                    = "${lookup(var.worker_groups[count.index], "ami_id", data.aws_ami.eks_worker.id)}"
  instance_type               = "${lookup(var.worker_groups[count.index], "instance_type", lookup(var.workers_group_defaults, "instance_type"))}"

I'm submitting a...

  • bug report
  • feature request

What is the current behavior?

When creating workers in private subnets, public IPs are assigned to those workers.

Are you able to fix this problem and submit a PR? Link here if you have already.

Yes, I will

Environment details

  • Affected module version: v1.1.0
  • OS: ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Specify multiple cluster/worker security groups

I have issues

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The EKS module currently only supports passing in a single cluster security group and a single worker security group by ID.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I think it would make sense to support specifying an array of security group ids.

Environment

  • Affected module version:
  • OS:
  • Terraform version:

Other relevant info

We have a use case where we need to attach multiple security groups, some of which are predefined.

Be able to define per-ASG tags

I have issues

Tags are currently too prescriptive. I have a use case where I need to tag different ASGs with different tags: I'm using the ability to push these tags down to node labels and taints to drive different workloads on my Kubernetes cluster. At the moment it seems I can only define tags once on the top-level EKS module, and those tags are used globally throughout. I would like to be able to define tags per ASG. A sensible place to provide these seems to be the list of worker_groups maps.

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

Tags are defined once in var.tags and used throughout both to tag the cluster resources itself, as well as, all ASGs that are created.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

It should be possible to provide tags in the list of worker_groups maps, and those should be used to tag the corresponding ASG created for each respective worker group. If tags are not set for a group, they can default to the existing top-level global tag variable.
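A rough sketch of the proposed shape (the tags key below is a proposal, not an existing module input; since Terraform 0.11 treats these entries as maps of strings, the exact encoding, e.g. a delimited string or a parallel variable, would need to be settled in the PR):

worker_groups = [
  {
    name          = "apps"
    instance_type = "m4.large"
    # proposed: tags applied only to this group's ASG and propagated to its instances
    tags          = "Workload=apps"
  },
  {
    name          = "monitoring"
    instance_type = "m4.xlarge"
    tags          = "Workload=monitoring"
  },
]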

Are you able to fix this problem and submit a PR? Link here if you have already.

I can submit a PR if this is reasonable.

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

How to launch worker in private subnet

In the getting started example

module "vpc" {
  source             = "terraform-aws-modules/vpc/aws"
  version            = "1.14.0"
  name               = "test-vpc"
  cidr               = "10.0.0.0/16"
  azs                = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]
  private_subnets    = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets     = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]
  enable_nat_gateway = true
  single_nat_gateway = true
  tags               = "${merge(local.tags, map("kubernetes.io/cluster/${local.cluster_name}", "shared"))}"
}

module "eks" {
  source             = "../.."
  cluster_name       = "${local.cluster_name}"
  subnets            = ["${module.vpc.public_subnets}", "${module.vpc.private_subnets}"]
  tags               = "${local.tags}"
  vpc_id             = "${module.vpc.vpc_id}"
  worker_groups      = "${local.worker_groups}"
  worker_group_count = "1"
  map_roles          = "${var.map_roles}"
  map_users          = "${var.map_users}"
  map_accounts       = "${var.map_accounts}"
}

Both private and public subnets are passed to the EKS module as a single variable. How does the module determine which subnets are public and which are private, so that it launches workers into the private subnets only?
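If your module version supports the per-worker-group subnets key (it appears later in this document in a v1.3.0 example), one possible approach is to give the control plane all subnets while pinning the worker group to the private ones:

module "eks" {
  source       = "../.."
  cluster_name = "${local.cluster_name}"
  vpc_id       = "${module.vpc.vpc_id}"

  # the control plane can see all subnets, as in the getting started example...
  subnets = ["${module.vpc.public_subnets}", "${module.vpc.private_subnets}"]

  worker_groups = [
    {
      name = "private-workers"
      # ...while this worker group is restricted to the private subnets only
      subnets = "${join(",", module.vpc.private_subnets)}"
    },
  ]
}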

investigate adding create_before_destroy to worker asg to prevent downtime when recreating

I have issues changing the instance type.

I'm submitting a

  • bug report

What is the current behavior

* module.eks.aws_launch_configuration.workers: 1 error(s) occurred:

* aws_launch_configuration.workers: Error creating launch configuration: AlreadyExists: Launch Configuration by this name already exists - A launch configuration already exists with the name eks-path-prod-0
	status code: 400, request id: xxx-xxx-xxx-xxx-xxx

If this is a bug, how to reproduce? Please include a code sample

Deploy an EKS cluster, then change the instance type and apply again.

What's the expected behavior

The launch configuration should be replaced without error.
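For reference, a common pattern that avoids the AlreadyExists error combines name_prefix with create_before_destroy, so the replacement launch configuration gets a fresh generated name before the old one is destroyed. A minimal sketch of the general technique (not the module's exact resource; var.instance_type is illustrative):

resource "aws_launch_configuration" "workers" {
  # a prefix (rather than a fixed name) lets Terraform generate a new name
  # for each replacement launch configuration
  name_prefix          = "${var.cluster_name}-workers-"
  image_id             = "${data.aws_ami.eks_worker.id}"
  instance_type        = "${var.instance_type}"
  iam_instance_profile = "${aws_iam_instance_profile.workers.id}"

  lifecycle {
    # create the replacement before destroying the old one, so the ASG is
    # never left pointing at a deleted launch configuration
    create_before_destroy = true
  }
}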

Environment

  • Affected module version: 1.0.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Other relevant info

Assumption Missing: Install Kubectl

I have issues

I'm submitting a

  • bug report

What is the current behavior

There is no mention of the requirement to have kubectl installed before running the script. The module will fail while applying the plan.

Error: Error applying plan:

1 error(s) occurred:

* module.eks.null_resource.configure_kubectl: Error running command 'kubectl apply -f .//config-map-aws-auth.yaml --kubeconfig .//kubeconfig': exit status 127. Output: /bin/sh: 1: kubectl: not found

What's the expected behavior

Have a note and a link to the install instructions in the Assumptions section of the README.md.

Cluster DNS does not function

I have issues

I'm submitting a...

  • support request

DNS

How is cluster DNS supposed to work? I have not been able to get pods to resolve any cluster addresses (including kubernetes.default) using EKS. I suspect it's a function of how the AWS VPC CNI works (or doesn't), and figured other people using this module must be running into the same problem; however, I can't seem to find much on the internet about this in EKS.

locals {
  worker_groups = "${list(
                  map(
                      "name", "k8s-worker",
                      "ami_id", "ami-73a6e20b",
                      "asg_desired_capacity", "5",
                      "asg_max_size", "8",
                      "asg_min_size", "5",
                      "instance_type","m4.large",
                      "key_name", "${aws_key_pair.infra-deployer.key_name}"
                      ),
  )}"
  tags = "${map("Environment", "${terraform.workspace}")}"
}

data "aws_vpc" "vpc" {
  filter {
    name   = "tag:env"
    values = ["${terraform.workspace}"]
  }

  filter {
    name   = "tag:Name"
    values = ["${terraform.workspace}-us-west-2"]
  }
}

data "aws_subnet_ids" "eks_subnets" {
  vpc_id = "${data.aws_vpc.vpc.id}"

  tags {
    env  = "${terraform.workspace}"
    Name = "${terraform.workspace}-eks*"
  }
}

module "eks" {
  source                = "terraform-aws-modules/eks/aws"
  cluster_name          = "${terraform.workspace}"
  subnets               = "${data.aws_subnet_ids.eks_subnets.ids}"
  vpc_id                = "${data.aws_vpc.vpc.id}"
  kubeconfig_aws_authenticator_env_variables = "${map("AWS_PROFILE", "infra-deployer" )}"
  map_accounts          = ["${lookup(var.aws_account_ids, "prod")}"]
  worker_groups         = "${local.worker_groups}"
  tags                  = "${local.tags}"
}

Trying DNS on a brand new cluster:

$ kubectl exec -ti busybox -- nslookup kubernetes.default
Server:		172.20.0.10
Address:	172.20.0.10:53

** server can't find kubernetes.default: NXDOMAIN

*** Can't find kubernetes.default: No answer
$ kubectl exec -ti busybox -- cat /etc/resolv.conf
nameserver 172.20.0.10
search default.svc.cluster.local svc.cluster.local cluster.local staging.thinklumo.com us-west-2.compute.internal
options ndots:5

I've tried too many things to list here, and at this point I suspect it's an issue with EKS itself, so I'm hoping someone has been down this path already.

Are you able to fix this problem and submit a PR? Link here if you have already.

N/A

Environment details

  • Affected module version: latest
  • OS: AL2
  • Terraform version:
Terraform v0.11.7
+ provider.aws v1.25.0

Any other relevant info

Using computed values in worker group parameters results in `value of 'count' cannot be computed` error

I have issues

I'm submitting a...

  • bug report

What is the current behavior?

terraform plan produces this output when any worker group parameters are computed values:

laverya:~/dev$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

data.http.workstation_external_ip: Refreshing state...
data.aws_region.current: Refreshing state...
data.aws_availability_zones.available: Refreshing state...
data.aws_iam_policy_document.cluster_assume_role_policy: Refreshing state...
data.aws_iam_policy_document.workers_assume_role_policy: Refreshing state...
data.aws_ami.eks_worker: Refreshing state...

Error: Error refreshing state: 1 error(s) occurred:

* module.eks.data.template_file.userdata: data.template_file.userdata: value of 'count' cannot be computed

If this is a bug, how to reproduce? Please include a code sample if relevant.

provider "aws" {
  version = "~> 1.27"
  region  = "us-east-1"
}

data "aws_availability_zones" "available" {}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "1.37.0"
  name    = "eks-vpc"
  cidr    = "10.0.0.0/16"
  azs     = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]

  private_subnets    = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets     = ["10.0.4.0/24", "10.0.5.0/24", "10.0.6.0/24"]

  tags = "${map("kubernetes.io/cluster/terraform-eks", "shared")}"
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "1.3.0"
  cluster_name = "terraform-eks"
  subnets = ["${module.vpc.private_subnets}", "${module.vpc.public_subnets}"]
  tags    = "${map("Environment", "test")}"
  vpc_id = "${module.vpc.vpc_id}"

  worker_groups = [
    {
      name          = "default-m5-large"
      instance_type = "m5.large"

      subnets = ""
      # subnets = "${join(",", module.vpc.private_subnets)}"
    },
  ]
}

Uncomment subnets = "${join(",", module.vpc.private_subnets)}" to replace subnets = "" in the worker_groups config and run terraform plan.

What's the expected behavior?

terraform plan completes and a plan is produced.

Are you able to fix this problem and submit a PR? Link here if you have already.

I have not yet identified the root cause.

Environment details

  • Affected module version: 1.3.0
  • OS: Ubuntu 16.04
  • Terraform version: Terraform v0.11.7

Any other relevant info

This makes it rather difficult to assign subnets to worker groups.

Should we manage k8s resources with this module?

This is a general question about the direction of this module.

We get requests that would require this module to manage or create Kubernetes resources. Some examples:

  • Modifying CNI configuration before worker ASG creation: #96
  • Deploying cluster autoscaler: #71
  • Manage add-ons: #19

I think we should have a clear position on these types of issues.

Include autoscaling-related IAM policies for workers for the cluster-autoscaler

Currently we have to add the policy outside this module, but I think 90% of people will use the cluster-autoscaler, so it would be cool to have it included in this module, perhaps enabled with a variable.
kops currently has this by default here.

The policy would look something like this:

data "aws_iam_policy_document" "eks_node_autoscaling" {
  statement {
    sid    = "eksDemoNodeAll"
    effect = "Allow"

    actions = [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "autoscaling:DescribeLaunchConfigurations",
      "autoscaling:DescribeTags",
      "autoscaling:GetAsgForInstance",
    ]

    resources = ["*"]
  }

  statement {
    sid    = "eksDemoNodeOwn"
    effect = "Allow"

    actions = [
      "autoscaling:SetDesiredCapacity",
      "autoscaling:TerminateInstanceInAutoScalingGroup",
      "autoscaling:UpdateAutoScalingGroup",
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/Name"
      values   = ["xxxx-eks_asg"]
    }
  }
}

This would allow the cluster-autoscaler the access it needs to run correctly.
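For completeness, a sketch of how the policy document above could be attached to the worker role outside the module today (the worker_iam_role_name reference is illustrative; substitute whatever your configuration or module version actually exposes):

resource "aws_iam_policy" "eks_node_autoscaling" {
  name_prefix = "eks-node-autoscaling-"
  policy      = "${data.aws_iam_policy_document.eks_node_autoscaling.json}"
}

resource "aws_iam_role_policy_attachment" "eks_node_autoscaling" {
  # illustrative reference; use the worker IAM role your setup actually creates
  role       = "${module.eks.worker_iam_role_name}"
  policy_arn = "${aws_iam_policy.eks_node_autoscaling.arn}"
}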

What do you think?

Allow adding new users, roles, and accounts to the configmap/aws-auth

I have issues

Amazon's EKS access control is managed via the aws-auth ConfigMap, which allows multiple IAM users and roles (cross-account capable) to be granted group membership. The current implementation only allows worker node access; this should be configurable so that additional access control rules can be specified per the documentation: https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The current implementation only allows worker node access.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

Ability to specify role/user/account mappings for group membership.
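As a sketch of what the mapping inputs might look like (the key names have varied between module versions, so treat role_arn/user_arn/group here as illustrative):

map_roles = [
  {
    role_arn = "arn:aws:iam::111122223333:role/ops-admins"
    username = "ops-admins"
    group    = "system:masters"
  },
]

map_users = [
  {
    user_arn = "arn:aws:iam::111122223333:user/alice"
    username = "alice"
    group    = "system:masters"
  },
]

map_accounts = ["444455556666"]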

Environment

  • Affected module version: 1.1.0
  • OS: Linux
  • Terraform version: 0.11.7

Other relevant info

We should ignore changes to node ASG desired_capacity

I have issues

I'm submitting a

  • feature request

The reason is that after cluster creation, almost everyone will run the Kubernetes cluster autoscaler. The autoscaler changes desired_capacity to suit the resources required by the cluster, so when the cluster has autoscaled and Terraform is run again later, you see something like this:

  ~ module.cluster_1.aws_autoscaling_group.workers
      desired_capacity:   "5" => "3"

You can just add a lifecycle statement to resource aws_autoscaling_group.workers:

  lifecycle {
    ignore_changes = [ "desired_capacity" ]
  }

Workstation cidr possibly not doing what's intended?

I'm submitting a...

  • bug report
  • feature request
  • support request
  • kudos, thank you, warm fuzzy

What is the current behavior?

If this is a bug, how to reproduce? Please include a code sample if relevant.

This is much more a question than an issue. I see the workstation CIDR being allowed to access port 443 in the security group attached to the EKS cluster, and I see the same thing in the Terraform EKS getting-started post. My assumption was that this would limit access to the Kubernetes API (control plane) to that CIDR. It doesn't do that: the control plane endpoint is fully accessible on the internet. Was the intention to allow only that CIDR to access the control plane? I really wish that were possible. I'm likely completely missing the reason for allowing that ingress.

What's the expected behavior?

I expected access to the control plane to be limited to the IP addresses in the CIDR. My expectation may be completely wrong, in which case a different variable description might help.

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

  • Affected module version:
  • OS:
  • Terraform version:

Any other relevant info

How to define nodeSelector with autoscaling?

I have issues

with defining a nodeSelector in an autoscaling environment.

I'm submitting a

  • support request

What is the current behavior

Pods can currently be scheduled onto any node.

What's the expected behavior

I can manually label several nodes and set a matching nodeSelector, but how do I achieve this in an autoscaling environment?

I found the worker_groups code, but I'm not sure how to use it for labelling.

https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_test_fixture/main.tf#L19-L34

  # the commented out worker group list below shows an example of how to define
  # multiple worker groups of differing configurations
  # worker_groups = "${list(
  #                   map("asg_desired_capacity", "2",
  #                       "asg_max_size", "10",
  #                       "asg_min_size", "2",
  #                       "instance_type", "m4.xlarge",
  #                       "name", "worker_group_a",
  #                   ),
  #                   map("asg_desired_capacity", "1",
  #                       "asg_max_size", "5",
  #                       "asg_min_size", "1",
  #                       "instance_type", "m4.2xlarge",
  #                       "name", "worker_group_b",
  #                   ),
  # )}"
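One way this is commonly handled (a sketch that assumes your module version supports a kubelet_extra_args key in the worker group maps; check workers_group_defaults for your release) is to label nodes at boot via kubelet flags:

  worker_groups = "${list(
                    map(
                        "name", "apps",
                        "instance_type", "m4.xlarge",
                        "kubelet_extra_args", "--node-labels=nodegroup=apps"
                    ),
                    map(
                        "name", "monitoring",
                        "instance_type", "m4.large",
                        "kubelet_extra_args", "--node-labels=nodegroup=monitoring"
                    ),
  )}"

Pods can then target a group with a nodeSelector such as nodegroup: monitoring, and the cluster-autoscaler scales whichever group the pending pods select.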

Environment

  • Affected module version: 1.0.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Other relevant info

Name tags are too prescriptive; allow them more flexibility but provide sensible defaults

I have issues

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

It's not possible to define exactly what the Name tag of any resource will be. This should be user-definable.

Other relevant info

I think another variable map containing tag_defaults, or a local variable (since computation is needed), will come in handy here. I will explore this in next week's cycles.

It works with multiple worker groups in one EKS cluster, thanks.

I want to say thanks for the hidden feature that lets me manage multiple worker groups in one EKS cluster; the version tested is v1.3.0.

So the code below works properly.
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_test_fixture/main.tf#L19-L34

My use case is that I need to manage two groups of nodes, one for applications and one for the monitoring service only. Later I will add more node groups (labelled for nodeSelector) for different purposes.

I'm submitting a...

  • kudos, thank you, warm fuzzy

Any other relevant info

Should we uncomment these lines or add another test case?

Use of name_prefix

Currently we have:

resource "aws_iam_role" "workers" {
  name_prefix        = "${aws_eks_cluster.this.name}"
  assume_role_policy = "${data.aws_iam_policy_document.workers_assume_role_policy.json}"
}

resource "aws_iam_instance_profile" "workers" {
  name_prefix = "${aws_eks_cluster.this.name}"
  role        = "${aws_iam_role.workers.name}"
}

Is there a reason to use name_prefix instead of just name? I ask because the resultant names are things like my-cluster-20180808095045107900000005.

We have to create cross-account IAM policies for things like ECR, and it would be nice to have a predictable and consistent name for the roles 🙂

Not Clear on EC2PrivateDNSName

I have issues

  • support request

Should I be exporting my bastion's EC2 private DNS name before I run terraform apply? I used a bastion host to provision the EKS cluster, and I am not clear on what the EC2PrivateDNSName variable refers to.

can't update launch configuration.

I have issues

Recently I upgraded from release 1.1.0 to 1.3.0 and made some changes to the launch configuration, such as the key name.

I'm submitting a...

  • bug report

What is the current behavior?

* aws_launch_configuration.workers (deposed #0): ResourceInUse: Cannot delete launch configuration project-prod-020180630105107074900000001 because it is attached to AutoScalingGroup project-prod-monitoring
	status code: 400, request id: 46f32656-8661-11e8-9e77-51ef4818b760

If this is a bug, how to reproduce? Please include a code sample if relevant.

Change the AMI ID, add/remove the key pair name, or make any other change that requires creating a new launch configuration.

What's the expected behavior?

The launch configuration should be updated smoothly.

Are you able to fix this problem and submit a PR? Link here if you have already.

I am still investigating this issue; if I can fix it, I will raise a PR.

Environment details

  • Affected module version: v1.1.0 -> v1.3.0
  • OS: Ubuntu
  • Terraform version: 0.11.7

Any other relevant info

Here is the fix someone mentioned:

hashicorp/terraform#532 (comment)

Allow pre-userdata script on worker launch config

I have issues

I want to be able to run additional user data before the module's own user data on the worker launch configuration.
I am behind a proxy and need to configure the proxy settings before anything else happens.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

The module only provides a way to specify additional user data that runs after its own user data.
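A sketch of what this could look like as a worker group option (pre_userdata below is a proposed, illustrative key that the module would render before its own bootstrap user data):

worker_groups = [
  {
    name          = "proxied-workers"
    instance_type = "m4.large"

    # proposed key: shell rendered before the module's bootstrap user data,
    # e.g. to set proxy configuration before anything else runs
    pre_userdata = "echo 'proxy=http://proxy.internal:3128' >> /etc/yum.conf"
  },
]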

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

Override the default ingress rule that allows communication with the EKS cluster API.

I have issues

I would prefer to use the default security groups created for the cluster, but do not want the default API/32 to be used.

I'm submitting a

  • bug report
  • feature request
  • support request

What is the current behavior

Currently, if you use the default security groups, the module creates a security group rule that allows communication with the EKS cluster API from the current /32 CIDR.

If this is a bug, how to reproduce? Please include a code sample

What's the expected behavior

I want to override the default /32 CIDR and specify my own.
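As an interim illustration of managing such a rule outside the module (assuming the cluster security group ID is exposed as an output; the CIDR below is a placeholder):

resource "aws_security_group_rule" "cluster_api_access" {
  description       = "Allow the office network to reach the EKS cluster API"
  type              = "ingress"
  protocol          = "tcp"
  from_port         = 443
  to_port           = 443
  cidr_blocks       = ["203.0.113.0/24"]
  security_group_id = "${module.eks.cluster_security_group_id}"
}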

Environment

  • Affected module version: 1.1.0
  • OS:
  • Terraform version: 0.11.7

Other relevant info

Better support for multiple clusters

I'm submitting a

  • feature request

A couple of changes would make it easier to work with multiple clusters.

  1. Include the cluster name in the file name here by default. This way other clusters won't overwrite the same file.
  2. Include the cluster name in the configuration here. This will make some keys in here unique, which makes it easier to merge the configuration without manual adjustments.
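A sketch of a possible workaround in the meantime, assuming a config_output_path variable controls where the files are written (treat the variable name and paths as illustrative):

module "cluster_a" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "cluster-a"
  subnets      = "${module.vpc.private_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"

  # give each cluster its own output directory so the generated kubeconfig
  # files cannot overwrite each other
  config_output_path = "./kubeconfig/cluster-a/"
}

module "cluster_b" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = "cluster-b"
  subnets      = "${module.vpc.private_subnets}"
  vpc_id       = "${module.vpc.vpc_id}"

  config_output_path = "./kubeconfig/cluster-b/"
}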

Bug: Module ignores custom AMI ID

I have issues

I want to use the (custom) Ubuntu EKS AMI, but my settings are ignored by the module.

I'm submitting a...

  • bug report

What is the current behaviour?

A custom AMI ID is ignored by the module. I tried to set ami_id in the workers_group_defaults section as follows:

workers_group_defaults = {
      ...
      ami_id               = "ami-39397a46"  # Ubuntu Image
      ...
}

The specified AMI is the Ubuntu EKS image for us-east-1. Ubuntu released a statement that they will support and update an image specifically for EKS. The AMI IDs can be found here: https://cloud-images.ubuntu.com/aws-eks/?_ga=2.56651242.1343651116.1533683680-508754220.1533683680

I would like to use the Ubuntu image instead of the Amazon AMI to make my environment more portable.

If this is a bug, how to reproduce? Please include a code sample if relevant.

Try to set ami_id in workers_group_defaults to the Ubuntu EKS AMI ID ami-39397a46 (us-east-1) or ami-6d622015 (us-west-2).

What's the expected behaviour?

Custom AMI IDs can be used for worker nodes; specifically, the Ubuntu EKS image can be used with this module.

Are you able to fix this problem and submit a PR? Link here if you have already.

Maybe this is just a misunderstanding, or it is super easy to fix. If not, let me know.

Environment details

  • Affected module version: 1.4.0
  • OS: Ubuntu 18.04 LTS (Container)
  • Terraform version: v0.11.7
