terraform-aws-mwaa's Introduction

Amazon Managed Workflows for Apache Airflow (MWAA) Module

This Terraform module deploys an Amazon Managed Workflows for Apache Airflow (MWAA) environment.

✅ Deployment examples can be found under the examples folder.

✅ See the Amazon MWAA documentation for more details about Amazon MWAA.

✅ Amazon MWAA for Analytics Workshop

Amazon MWAA Architecture

[Figure: Amazon MWAA architecture for an example public deployment]

Usage

The example below builds an Amazon MWAA environment in an existing VPC with private subnets. By default, this module creates the supporting resources that Amazon MWAA needs: an S3 bucket, an IAM role, and a security group. You can also bring your own S3 bucket, IAM role, and security group.

module "mwaa" {
  source = "aws-ia/mwaa/aws"

  name                 = "basic-mwaa"
  airflow_version      = "2.2.2"
  environment_class    = "mw1.medium"

  vpc_id                = "<ENTER_VPC_ID>"
  private_subnet_ids    = ["<ENTER_SUBNET_ID1>","<ENTER_SUBNET_ID2>"]

  min_workers           = 1
  max_workers           = 25
  webserver_access_mode = "PUBLIC_ONLY" # Default PRIVATE_ONLY for production environments

  iam_role_additional_policies = {
    "additional-policy-1" = "<ENTER_POLICY_ARN1>"
    "additional-policy-2" = "<ENTER_POLICY_ARN2>"
  }

  logging_configuration = {
    dag_processing_logs = {
      enabled   = true
      log_level = "INFO"
    }

    scheduler_logs = {
      enabled   = true
      log_level = "INFO"
    }

    task_logs = {
      enabled   = true
      log_level = "INFO"
    }

    webserver_logs = {
      enabled   = true
      log_level = "INFO"
    }

    worker_logs = {
      enabled   = true
      log_level = "INFO"
    }
  }

  airflow_configuration_options = {
    "core.load_default_connections" = "false"
    "core.load_examples"            = "false"
    "webserver.dag_default_view"    = "tree"
    "webserver.dag_orientation"     = "TB"
    "logging.logging_level"         = "INFO"
  }
}
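
The bring-your-own option mentioned above can be sketched with the create_* flags listed in the Inputs table. This is a minimal sketch, not a tested configuration: the environment name "byo-mwaa" is hypothetical, and the placeholder ARNs and IDs must be replaced with values for resources that already exist in your account.

```hcl
module "mwaa" {
  source = "aws-ia/mwaa/aws"

  name               = "byo-mwaa" # hypothetical environment name
  private_subnet_ids = ["<ENTER_SUBNET_ID1>", "<ENTER_SUBNET_ID2>"]

  # Reuse an existing S3 bucket instead of letting the module create one.
  create_s3_bucket  = false
  source_bucket_arn = "<ENTER_BUCKET_ARN>"

  # Reuse an existing execution role (mandatory when create_iam_role = false).
  create_iam_role    = false
  execution_role_arn = "<ENTER_ROLE_ARN>"

  # Reuse existing security groups instead of creating a new one.
  create_security_group = false
  security_group_ids    = ["<ENTER_SECURITY_GROUP_ID>"]
}
```

According to the Inputs table, vpc_id and source_cidr are only used when create_security_group=true, so they are omitted here.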

Security

See CONTRIBUTING for more information.

License

Apache-2.0 Licensed. See LICENSE.

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.0.0 |
| aws | >= 4.63.0 |

Providers

| Name | Version |
|------|---------|
| aws | >= 4.63.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| aws_iam_role.mwaa | resource |
| aws_iam_role_policy.mwaa | resource |
| aws_iam_role_policy_attachment.mwaa | resource |
| aws_mwaa_environment.mwaa | resource |
| aws_s3_bucket.mwaa | resource |
| aws_s3_bucket_public_access_block.mwaa | resource |
| aws_s3_bucket_server_side_encryption_configuration.mwaa | resource |
| aws_s3_bucket_versioning.mwaa | resource |
| aws_security_group.mwaa | resource |
| aws_security_group_rule.mwaa_sg_inbound | resource |
| aws_security_group_rule.mwaa_sg_inbound_vpn | resource |
| aws_security_group_rule.mwaa_sg_outbound | resource |
| aws_caller_identity.current | data source |
| aws_iam_policy_document.mwaa | data source |
| aws_iam_policy_document.mwaa_assume | data source |
| aws_partition.current | data source |
| aws_region.current | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| airflow_configuration_options | (Optional) The airflow_configuration_options parameter specifies airflow override options. | any | null | no |
| airflow_version | (Optional) Airflow version of your environment, will be set by default to the latest version that MWAA supports. | string | null | no |
| create_iam_role | Create IAM role for MWAA | bool | true | no |
| create_s3_bucket | Create new S3 bucket for MWAA. | string | true | no |
| create_security_group | Create security group for MWAA | bool | true | no |
| dag_s3_path | (Required) The relative path to the DAG folder on your Amazon S3 storage bucket. For example, dags. | string | "dags" | no |
| environment_class | (Optional) Environment class for the cluster. Possible options are mw1.small, mw1.medium, mw1.large, mw1.xlarge, mw1.2xlarge. Will be set by default to mw1.small. Please check the AWS Pricing for more information about the environment classes. | string | "mw1.small" | no |
| execution_role_arn | (Required) The Amazon Resource Name (ARN) of the task execution role that the Amazon MWAA and its environment can assume. Mandatory if create_iam_role=false | string | null | no |
| force_detach_policies | IAM role Force detach policies | bool | false | no |
| iam_role_additional_policies | Additional policies to be added to the IAM role | map(string) | {} | no |
| iam_role_name | IAM Role Name to be created if execution_role_arn is null | string | null | no |
| iam_role_path | IAM role path | string | "/" | no |
| iam_role_permissions_boundary | IAM role Permission boundary | string | null | no |
| kms_key | (Optional) The Amazon Resource Name (ARN) of your KMS key that you want to use for encryption. Will be set to the ARN of the managed KMS key aws/airflow by default. | string | null | no |
| logging_configuration | (Optional) The Apache Airflow logs which will be sent to Amazon CloudWatch Logs. | any | null | no |
| max_workers | (Optional) The maximum number of workers that can be automatically scaled up. Value needs to be between 1 and 25. Will be 10 by default | number | 10 | no |
| min_workers | (Optional) The minimum number of workers that you want to run in your environment. Will be 1 by default. | number | 1 | no |
| name | (Required) The name of the Apache Airflow MWAA Environment | string | n/a | yes |
| plugins_s3_object_version | (Optional) The plugins.zip file version you want to use. | string | null | no |
| plugins_s3_path | (Optional) The relative path to the plugins.zip file on your Amazon S3 storage bucket. For example, plugins.zip. If a relative path is provided in the request, then plugins_s3_object_version is required. | string | null | no |
| private_subnet_ids | (Required) The private subnet IDs in which the environment should be created. MWAA requires two subnets. | list(string) | n/a | yes |
| requirements_s3_object_version | (Optional) The requirements.txt file version you want to use. | string | null | no |
| requirements_s3_path | (Optional) The relative path to the requirements.txt file on your Amazon S3 storage bucket. For example, requirements.txt. If a relative path is provided in the request, then requirements_s3_object_version is required. | string | null | no |
| schedulers | (Optional) The number of schedulers that you want to run in your environment. | string | null | no |
| security_group_ids | Security group IDs for MWAA | list(string) | [] | no |
| source_bucket_arn | (Required) The Amazon Resource Name (ARN) of your Amazon S3 storage bucket. For example, arn:aws:s3:::airflow-mybucketname | string | null | no |
| source_bucket_name | New bucket will be created with the given name for MWAA when create_s3_bucket=true. If set to null, then the default bucket name prefix will be set, irrespective of the value of var.use_source_bucket_name_as_prefix | string | null | no |
| source_cidr | (Required) Source CIDR block which will be allowed on MWAA SG to access Airflow UI. Used only if create_security_group=true | list(string) | [] | no |
| startup_script_s3_object_version | (Optional) The version of the startup shell script you want to use. You must specify the version ID that Amazon S3 assigns to the file every time you update the script. | string | null | no |
| startup_script_s3_path | (Optional) The relative path to the script hosted in your bucket. The script runs as your environment starts before starting the Apache Airflow process. Use this script to install dependencies, modify configuration options, and set environment variables. | string | null | no |
| tags | (Optional) A map of resource tags to associate with the resource | map(string) | {} | no |
| use_source_bucket_name_as_prefix | Whether or not to use the var.source_bucket_name as the S3 bucket name prefix | bool | true | no |
| vpc_id | (Required) VPC ID to deploy the MWAA Environment. Mandatory if create_security_group=true | string | "" | no |
| webserver_access_mode | (Optional) Specifies whether the webserver should be accessible over the internet or via your specified VPC. Possible options: PRIVATE_ONLY (default) and PUBLIC_ONLY | string | "PRIVATE_ONLY" | no |
| weekly_maintenance_window_start | (Optional) Specifies the start date for the weekly maintenance window | string | null | no |

Outputs

| Name | Description |
|------|-------------|
| aws_s3_bucket_name | S3 bucket Name of the MWAA Environment |
| mwaa_arn | The ARN of the MWAA Environment |
| mwaa_role_arn | IAM Role ARN of the MWAA Environment |
| mwaa_role_name | IAM role name of the MWAA Environment |
| mwaa_security_group_id | Security group id of the MWAA Environment |
| mwaa_service_role_arn | The Service Role ARN of the Amazon MWAA Environment |
| mwaa_status | The status of the Amazon MWAA Environment |
| mwaa_webserver_url | The webserver URL of the MWAA Environment |
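
As a minimal sketch of how these outputs might be consumed, the root configuration can re-export them. It assumes the module is instantiated as module "mwaa", as in the Usage section; the output names airflow_webserver_url and airflow_dag_bucket are hypothetical.

```hcl
# Re-export selected module outputs from the root module.
output "airflow_webserver_url" {
  description = "URL of the Airflow web UI"
  value       = module.mwaa.mwaa_webserver_url
}

output "airflow_dag_bucket" {
  description = "Name of the S3 bucket that holds DAGs, plugins, and requirements"
  value       = module.mwaa.aws_s3_bucket_name
}
```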

terraform-aws-mwaa's People

Contributors

almenon, bjones325, chili-man, dafresh, drewmullen, harshvardhan-j, maiconrocha, mkirlin, oscarmendoza123, ricsue-aws, samuzad, vara-bonthu

terraform-aws-mwaa's Issues

What is the intended deployment pattern for changing requirements and plugins?

I see that the lifecycle settings ignore changes to both the requirements and plugins objects

  lifecycle {
    ignore_changes = [
      plugins_s3_object_version,
      requirements_s3_object_version
    ]
  }

Is there some kind of bad or unintended behavior if these changes are not ignored? Or is this an intentional decision to encourage making these updates via the console only?

var.source_bucket_name should be respected as the S3 bucket name and not the prefix

When setting var.source_bucket_name, it turns out that it's only setting the bucket name prefix (see here) and not the actual name of the bucket. I'm not sure why the behavior is different given the variable's name and description. The other name variables, var.name and var.iam_role_name, are respected and passed through as the names for those resources, so I'm not sure why it's inconsistent for the s3 bucket. Is this something that y'all would be open to? I am more than happy to contribute this change.

I understand that this proposal will be a breaking change, so we could have another variable var.source_bucket_name_prefix that preserves the original behavior.
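
The Inputs table lists a use_source_bucket_name_as_prefix flag. Assuming it behaves as its description suggests, the sketch below would use the given value as the exact bucket name rather than as a prefix. The bucket name "my-mwaa-dags" is a hypothetical placeholder, and this behavior has not been verified against the module source.

```hcl
module "mwaa" {
  source = "aws-ia/mwaa/aws"

  name               = "basic-mwaa"
  vpc_id             = "<ENTER_VPC_ID>"
  private_subnet_ids = ["<ENTER_SUBNET_ID1>", "<ENTER_SUBNET_ID2>"]

  # Assumption: with the prefix flag disabled, the bucket is created
  # with exactly this name instead of a generated name that starts with it.
  source_bucket_name               = "my-mwaa-dags" # hypothetical bucket name
  use_source_bucket_name_as_prefix = false
}
```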

Configuring tags don't apply to the MWAA assets created

In the variables.tf, when configuring tags like the following

variable "tags" {
  description = "Default tags"
  default     = {"env": "test", "service": "MWAA Apache AirFlow"}
  type        = map(string)
}

These will get applied to the VPC resources configured, but not the MWAA resources.
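
For reference, the module exposes a tags input (see the Inputs table), so a root-level variable like the one above only reaches the module if it is passed through explicitly. A minimal sketch, assuming that missing pass-through is part of the problem; the text does not establish whether this resolves the reported behavior.

```hcl
variable "tags" {
  description = "Default tags"
  type        = map(string)
  default = {
    env     = "test"
    service = "MWAA Apache AirFlow"
  }
}

module "mwaa" {
  source = "aws-ia/mwaa/aws"

  name               = "basic-mwaa"
  vpc_id             = "<ENTER_VPC_ID>"
  private_subnet_ids = ["<ENTER_SUBNET_ID1>", "<ENTER_SUBNET_ID2>"]

  # Forward the root-level tags to the module's own tags input.
  tags = var.tags
}
```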

aws_mwaa_environment gets destroyed and recreated unnecessarily

I'm seeing a behavior in aws_mwaa_environment that I was not seeing previously. With hashicorp/aws v4.64.0, every time I deploy, it destroys and recreates the aws_mwaa_environment with this message shown in the terraform apply:

  ~ network_configuration {
      ~ security_group_ids = [
          - "sg-xxxxxxxxx",
        ] -> (known after apply)
      ~ subnet_ids         = [ # forces replacement
          - "subnet-xxxxxxxxx",
          - "subnet-yyyyyyyyy",
        ] -> (known after apply)
    }
}

Even though the subnet_ids are exactly the same every single time, it indicates that they have changed and therefore that is forcing replacement.

Originally these 2 subnet_ids were coming from an aws_subnet data block, with a filter that resulted in those 2 exact same subnet_ids every time. Just as a test, I simply hardcoded the 2 subnet_ids in the aws_mwaa_environment resource block:

subnet_ids = ["subnet-xxxxxxxxx", "subnet-yyyyyyyyy"]

But even hardcoded, I get the same behavior.

iam_role_additional_policies and external IAM Roles

When bringing an external IAM role with the config below

  create_iam_role       = false
  execution_role_arn    = data.aws_iam_role.mwaa.arn
  iam_role_additional_policies = []

Terraform throws the error below

│ Error: Invalid object key
│ 
│   on .terraform/modules/mwaa/locals.tf line 14, in locals:
│   14:   iam_role_additional_policies = { for k, v in toset(concat([var.iam_role_additional_policies])) : k => v if var.execution_role_arn != null }
│ 

Upon verification, terraform-aws-eks uses a similar pattern, but with different variable types.

The iam_role_additional_policies variable should be map(string) rather than list(string).

Also, the if condition should check create_iam_role rather than whether an external role ARN is set.

The concat call should not wrap var.iam_role_additional_policies in []. See the console output below for details.


> { for k, v in toset(concat([[]])) : k => v if "asdf" != null }
╷
│ Error: Invalid object key
│ 
│   on <console-input> line 1:
│   (source code not available)
│ 
│ The key expression produced an invalid result: string required.
╵


> { for k, v in toset(concat([[]])) : k => v if null != null }
{}


> { for k, v in toset(concat([])) : k => v if "asdf" != null }
{}

If needed, we can discuss the details through AWS internal channels.
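
The proposal in this issue could be sketched as follows. This is an illustration of the suggested change, not the module's actual code: it assumes the variable is typed map(string) and that the filter keys off create_iam_role, as the issue recommends.

```hcl
variable "create_iam_role" {
  type    = bool
  default = true
}

variable "iam_role_additional_policies" {
  description = "Additional policies to be added to the IAM role"
  type        = map(string)
  default     = {}
}

locals {
  # Iterate over the map directly (no concat or toset wrapper), and keep
  # the entries only when the module is creating the role itself.
  iam_role_additional_policies = {
    for k, v in var.iam_role_additional_policies : k => v if var.create_iam_role
  }
}
```

With this shape, a caller that brings an external role would pass an empty map ({}) rather than an empty list ([]).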

S3 Bucket Arn is mandatory argument when providing S3 Bucket Name

When we run the MWAA artefact, we get the error message below:

│ Error: Missing required argument

│ with aws_mwaa_environment.mwaa,
│ on main.tf line 23, in resource "aws_mwaa_environment" "mwaa":
│ 23: source_bucket_arn = local.source_bucket_arn

│ The argument "source_bucket_arn" is required, but no definition was found.

I have a PR which resolves this dependency. #36

Add support for startup scripts

The mwaa_environment resource now supports startup_script_s3_path and startup_script_s3_object_version.

This would be a nice addition to the provider :)

(note: this issue was created with the intention of making a PR for it myself, I will edit this comment if I cannot do so for some reason)
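
The Inputs table now lists both startup script variables, so a configuration using them might look like the sketch below. The script path "startup.sh" and the version placeholder are assumptions; the version must be the version ID that Amazon S3 assigned to the uploaded script.

```hcl
module "mwaa" {
  source = "aws-ia/mwaa/aws"

  name               = "basic-mwaa"
  vpc_id             = "<ENTER_VPC_ID>"
  private_subnet_ids = ["<ENTER_SUBNET_ID1>", "<ENTER_SUBNET_ID2>"]

  # Path is relative to the root of the environment's S3 bucket.
  startup_script_s3_path = "startup.sh" # hypothetical script name

  # S3 object version ID of the script; update it each time the script changes.
  startup_script_s3_object_version = "<ENTER_OBJECT_VERSION_ID>"
}
```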

How do you set log retention?

What's the easiest way to configure the log retention policy on MWAA cloudwatch logs? This module allows users to toggle the logs and configure the level but it doesn't seem to allow us to configure the log retention period, nor does the module output the log group arn.

Feature Request: Allow passthrough arns for cloudwatch log groups

Currently, the terraform resource for MWAA has the capability to create cloudwatch logs using the logging_configuration block. Our organization would like to be able to output logs to an existing cloudwatch log group by specifying an arn. This would allow us to get access to all of the features available when creating a log group using the aws_cloudwatch_log_group resource (such as log retention).

Let me know if this is the wrong location for such a feature request - I'd be happy to open it in the appropriate location
