
terraform-aws-datadog


This module configures the AWS / Datadog integration.

There are two main components:

  1. Datadog core integration, enabling Datadog's AWS integration
  2. Datadog logs_monitoring forwarder, enabling log shipping from watched S3 buckets:
  • Forward CloudWatch, ELB, S3, CloudTrail, VPC and CloudFront logs to Datadog
  • Forward S3 events to Datadog
  • Forward Kinesis data stream events to Datadog (only CloudWatch logs are supported)
  • Forward custom metrics from AWS Lambda functions via CloudWatch logs
  • Forward traces from AWS Lambda functions via CloudWatch logs
  • Generate and submit enhanced Lambda metrics (aws.lambda.enhanced.*) parsed from the AWS REPORT log: duration, billed_duration, max_memory_used, and estimated_cost

Usage

Set up all supported AWS / Datadog integrations

module "datadog" {
  source                = "scribd/datadog/aws"
  version               = "~>3"
  aws_account_id        = data.aws_caller_identity.current.account_id
  datadog_api_key       = var.datadog_api_key
  env                   = "prod"
  namespace             = "team_foo"

  cloudtrail_bucket_id  = aws_s3_bucket.org-cloudtrail-bucket.id
  cloudtrail_bucket_arn = aws_s3_bucket.org-cloudtrail-bucket.arn

  cloudwatch_log_groups = ["cloudwatch_log_group_1", "cloudwatch_log_group_2"]

  account_specific_namespace_rules = {
    elasticache = true
    network_elb = true
    lambda      = true
  }
}

Note: The full integration setup should only be done within one Terraform stack per account, since some of the resources it creates are global to the account. Creating this module in multiple Terraform stacks will cause conflicts.

Limit to CloudWatch log sync only

module "datadog" {
  source                         = "scribd/datadog/aws"
  version                        = "~>3"
  datadog_api_key                = var.datadog_api_key
  create_elb_logs_bucket         = false
  enable_datadog_aws_integration = false
  env                            = "prod"
  namespace                      = "project_foo"

  cloudwatch_log_groups = ["cloudwatch_log_group_1", "cloudwatch_log_group_2"]
}

Note: It is safe to create multiple CloudWatch-only modules across different Terraform stacks within a single AWS account, since all resources used for CloudWatch log sync are namespaced by module.

Be certain to use unique namespace/env combinations to avoid conflicts with other instances of this module.

Module Versions

Version 3.x.x and greater requires Terraform > 0.13.x and AWS provider 4.x.
Version 2.x.x requires Terraform > 0.13.x and AWS provider < 4.0.0.
Version 1.x.x is the latest version that supports Terraform 0.12.x and AWS provider < 4.0.0.
When using this module, please be sure to pin to a compatible version.
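
For example, a caller pinning to the 3.x line might combine these constraints (values mirror the compatibility notes above):

terraform {
  required_version = ">= 0.13"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.0"
    }
  }
}

module "datadog" {
  source          = "scribd/datadog/aws"
  version         = "~> 3.0" # pin to the 3.x line
  datadog_api_key = var.datadog_api_key
  env             = "prod"
  namespace       = "team_foo"
}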

Examples

Development

Releases are cut using semantic-release.

Please write commit messages following the Angular commit guidelines.

Release flow

Semantic-release is configured with the default branch workflow.

For this project, releases will be cut from master as features and bug fixes are developed.

In the commit message summary, use feat: to cut a new minor version and fix: to cut a new patch version. Include BREAKING CHANGE: in the body of the commit message to cut a new major version.
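
For example (commit messages are illustrative):

feat: forward CloudFront access logs to Datadog
(cuts a new minor version)

fix: correct the forwarder IAM policy
(cuts a new patch version)

feat: refactor S3 resources for AWS provider 4.x

BREAKING CHANGE: AWS provider >= 4.0.0 is now required
(cuts a new major version)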

Maintainers

Troubleshooting

If you encounter an error like:

  Datadog is not authorized to perform action sts:AssumeRole
  Accounts affected: 1234567890, 1234567891
  Regions affected: every region
  Errors began reporting 18m ago, last seen 5m ago

then the external ID may have changed. Execute ./terraform taint module.datadog.datadog_integration_aws.core[0] in the root module of the account repo to force a refresh.


Contributors

andrew-wiggins, bcha, breakeverything, flaaming-sideburns, frozensolid, ghettoburger, jim80net, jjkoh95, kuntalkumarbasu, libc, miend, mukta-puri, nbaec, rustychain, saikiranburle, semantic-release-bot, zbstof


terraform-aws-datadog's Issues

Use aws_secretsmanager_secret prefix instead of hard name to prevent conflicts

Secrets Manager takes 30 days to delete a secret, which doesn't fit into a Terraform workflow whenever a secret needs to be deleted and recreated.

Use a name prefix for aws_secretsmanager_secret instead to prevent the following error:

module.datadog.aws_secretsmanager_secret.datadog_api_key: Creating...
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [10s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [20s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [30s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [40s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [50s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m0s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m10s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m20s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m30s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m40s elapsed]
module.datadog.aws_secretsmanager_secret.datadog_api_key: Still creating... [1m50s elapsed]

Error: error creating Secrets Manager Secret: InvalidRequestException: You can't create this secret because a secret with this name is already scheduled for deletion.

  on .terraform/modules/datadog/terraform-aws-datadog-1.0.0/logs_monitoring.tf line 19, in resource "aws_secretsmanager_secret" "datadog_api_key":
  19: resource aws_secretsmanager_secret "datadog_api_key" {
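
A minimal sketch of the proposed fix (name_prefix replaces the fixed name; the module's other arguments on this resource are omitted):

resource "aws_secretsmanager_secret" "datadog_api_key" {
  # name_prefix lets Terraform generate a unique suffix, so recreating the
  # secret does not collide with one still scheduled for deletion
  name_prefix = "datadog_api_key-"
}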

Add IAM role permissions for states:ListStateMachines and elasticfilesystem:DescribeAccessPoints

Datadog Lambda Forwarder 3.17.0 with module version v1.3.0 is causing the following errors:

  1. User:
    arn:aws:sts::XXX:assumed-role/datadog-integration-role/vault-app3.eu1.prod.dog-datadog-delancie-crawler
    is not authorized to perform: states:ListStateMachines on resource:
    arn:aws:states:XXX:XXX:stateMachine:*

  2. User:
    arn:aws:sts::XXX:assumed-role/datadog-integration-role/vault-app3.eu1.prod.dog-datadog-delancie-crawler
    is not authorized to perform: elasticfilesystem:DescribeAccessPoints on
    the specified resource

This could be resolved by adding the following actions to the datadog-core IAM policy in main.tf:

  elasticfilesystem:DescribeAccessPoints
  states:ListStateMachines

Error: Invalid count argument (Due to S3 resources not yet created at apply time)

The count logic here is relying on resources pre-existing with known outputs:

count = var.cloudtrail_bucket_id != "" ? 1 : 0

count = var.cloudtrail_bucket_id != "" ? 1 : 0

For example, if I comment out the module instantiation, create my S3 bucket, and then uncomment the module, I can deploy (as the S3 bucket outputs are known beforehand).

If I attempt a DR deployment with the bucket and module instantiation in the same workspace, I get the errors below:

Error: Invalid count argument

  on .terraform/modules/datadog/logs_monitoring_cloudtrail.tf line 3, in resource "aws_lambda_permission" "allow-ctbucket-trigger":
   3:   count         = var.cloudtrail_bucket_id != "" ? 1 : 0

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.


Error: Invalid count argument

  on .terraform/modules/datadog/logs_monitoring_cloudtrail.tf line 13, in resource "aws_s3_bucket_notification" "ctbucket-notification-dd-log":
  13:   count  = var.cloudtrail_bucket_id != "" ? 1 : 0

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.

Switching to for_each would be a solution, but it would of course lose backwards compatibility.
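
A minimal sketch of the -target workaround suggested by the error message, assuming the bucket address from the usage example above:

# first create only the bucket that the module's count expressions depend on
terraform apply -target=aws_s3_bucket.org-cloudtrail-bucket

# then apply the rest of the configuration, including the datadog module
terraform apply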

extra_policy_arns: Document module variable in README.md

Added in an August, 2021 change, the extra_policy_arns module variable offers one way to support the optional Datadog / AWS CloudTrail integration.

The implementation's use of for_each may, however, lead to errors of the form:

│ Error: Invalid for_each argument
[...]
│  NNN:   for_each   = toset(var.extra_policy_arns)
│     ├────────────────
│     │ var.extra_policy_arns is list of string with 1 element
│ 
│ The "for_each" value depends on resource attributes that cannot be determined
│ until apply, so Terraform cannot predict how many instances will be created.
│ To work around this, use the -target argument to first apply only the
│ resources that the for_each depends on.

which I think we should alert users to.
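
A minimal usage sketch, assuming the variable accepts a list of IAM policy ARNs (the ARN shown is illustrative):

module "datadog" {
  source  = "scribd/datadog/aws"
  version = "~>3"
  # ... other arguments as in the usage examples above ...

  # A statically known ARN is safe here; an ARN from a resource created in the
  # same apply can trigger the "Invalid for_each argument" error shown above.
  extra_policy_arns = ["arn:aws:iam::123456789012:policy/cloudtrail-extras"]
}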

TF 1.0.0

Any plans to upgrade the module to TF 1.0.0 or 0.15.0?

ReservedConcurrency setting for forwarder lambda

When deploying the stack, I'm running into an issue where the ReservedConcurrency value of 100 defined in the YAML template for the stack causes a Terraform deployment failure, because it violates the minimum unreserved concurrency required across all functions.

In our case, we have a significant number of lambdas in the same account. In order to successfully forward logs for them, we have to create multiple log forwarder stacks (no problem there). We decided to break them up by microservice for the time being. That leaves us with around 10 log forwarder stacks. Given that the ReservedConcurrency default is 100, those 10 log forwarders quickly burn through the ReservedConcurrency in an account.

I'll submit a PR for an update that would expose a new variable to allow overriding that value while maintaining the 100 default. If this isn't the desired path forward, how have others gotten around this if they have run into it?
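
A hypothetical sketch of what the proposed override might look like on the caller side (the variable name reserved_concurrency is illustrative, not a released module interface):

module "datadog" {
  source  = "scribd/datadog/aws"
  version = "~>3"
  # ... other arguments as in the usage examples above ...

  # Hypothetical variable: lower the forwarder's reserved concurrency so that
  # several forwarder stacks do not exhaust the account's unreserved pool.
  reserved_concurrency = 10
}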

Refactor aws s3 bucket to be compatible with Aws provider 4.0

resource "aws_s3_bucket" "elb_logs" {

terraform-provider-aws 4.0.0 was released, introducing a breaking change to how s3 buckets are defined.

This resource should be refactored to comply with the 4.0 standard. Since doing so breaks compatibility with <4.0.0, this should also increment the major version and create a new branch.

A migration guide should also be published for users who are on terraform-provider-aws 3.x and desire to upgrade to 4.x
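
For context, a rough sketch of the 4.0-style layout, assuming the bucket previously set its ACL inline (other inline arguments would move to their own resources in the same way; the bucket name is illustrative):

resource "aws_s3_bucket" "elb_logs" {
  bucket = "example-elb-logs"
}

# In AWS provider 4.x, inline bucket arguments such as acl move to standalone resources
resource "aws_s3_bucket_acl" "elb_logs" {
  bucket = aws_s3_bucket.elb_logs.id
  acl    = "private"
}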

fix: only enforce minimum version constraint, not maximum

Hey @jim80net, it's pretty uncommon for modules to specify a cap on the required Terraform version; normally they only enforce a minimum. (Any of the AWS-managed Terraform modules should serve as an example.)

Specifying a cap blocks users from upgrading to newer versions and requires that these small PRs are opened and reviewed every time a new version of Terraform is released. Would you guys consider only enforcing a minimum?

Thanks to @dtw45 for the suggestion.
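
A minimal sketch of a minimum-only constraint in versions.tf (the exact lower bound is illustrative):

terraform {
  # enforce only a lower bound so users on newer Terraform releases are not blocked
  required_version = ">= 0.13"
}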

Enable supplying arbitrary tags

I want to supply a tags parameter with an arbitrary map of tags, potentially eliminating the hard-coded Namespace and Env parameters, so that I'm not limited to only env and namespace.

Given there's already an expectation to support these parameters, perhaps the implementation can simply add tags and deprecate the namespace and env parameters until users have converged on the new mechanism.
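
A hypothetical caller-side sketch of the proposed parameter (tags is not an existing module variable):

module "datadog" {
  source  = "scribd/datadog/aws"
  version = "~>3"
  # ... other arguments as in the usage examples above ...

  # Hypothetical variable: arbitrary tags applied to module-managed resources,
  # eventually replacing the hard-coded Namespace and Env tags.
  tags = {
    Namespace = "team_foo"
    Env       = "prod"
    Owner     = "observability"
  }
}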

CloudTrail option side effect: aws_s3_bucket_notification is overwritten

The module's CloudTrail option depends on references to an externally-created S3 bucket. If the bucket already has a notification configuration, the module overwrites it. Conversely, adding a notification configuration outside the module, for some other CloudTrail consumer, overwrites the one created by the module.

A bucket can have only one s3_bucket_notification configuration. Terraform gives no warning at the time a conflicting configuration is introduced. The old configuration is overwritten in AWS, but both the old and new ones end up in Terraform state.

Though there can only be one configuration, it may point to multiple destinations. So, if we went beyond the first step of documenting the side effect, we might be able to accept configuration contents as an optional input, and append. Or, we could eliminate the side effect by removing the s3_bucket_notification resource from the module and leaving it to the user to create or modify their CloudTrail bucket's notification configuration.
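
For illustration only, a single notification resource fanning out to multiple consumers might look like this (resource names and the forwarder reference are illustrative):

resource "aws_s3_bucket_notification" "cloudtrail" {
  bucket = aws_s3_bucket.org-cloudtrail-bucket.id

  # Both consumers must live in the same notification configuration,
  # because a bucket supports only one.
  lambda_function {
    lambda_function_arn = aws_lambda_function.datadog_forwarder.arn
    events              = ["s3:ObjectCreated:*"]
  }

  lambda_function {
    lambda_function_arn = aws_lambda_function.other_consumer.arn
    events              = ["s3:ObjectCreated:*"]
  }
}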

Incompatible changes introduced in v 1.3.4

Error: Unsuitable value type
On .terraform/modules/datadog/versions.tf line 5: Unsuitable value: string required

Release 1.3.4 should be skipped and its changes published to v2.

Multiple uses require the `namespace`/`env` parameters to be set to unique values, but this isn't clear.

Actual Behavior

When invoking this module more than once (for example, by defining distinct CloudWatch log groups to forward), the module must generate unique Lambda function names to avoid collisions of the Lambda function resource.

The unique names must be generated from outside the module by supplying unique namespace/env combinations. The namespace and env parameters are used to generate the local stack_prefix.

The documentation isn't clear about this.

If possible, it would be ideal to make this simpler, perhaps by generating the lambda function name some other way.

Expected Behavior

Clear documentation, or, no need to worry about supplying unique namespace/env combinations.

Steps to Reproduce the Problem

  1. use the module once to initialize an aws account.
  2. use the module again to add cloudwatch log groups, using the same namespace/env combination
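
For contrast, a sketch of a layout that keeps the namespace/env combination unique per instantiation (values are illustrative):

module "datadog_core" {
  source                = "scribd/datadog/aws"
  version               = "~>3"
  datadog_api_key       = var.datadog_api_key
  env                   = "prod"
  namespace             = "platform"
  cloudwatch_log_groups = ["cloudwatch_log_group_1"]
}

module "datadog_team_foo" {
  source                         = "scribd/datadog/aws"
  version                        = "~>3"
  datadog_api_key                = var.datadog_api_key
  create_elb_logs_bucket         = false
  enable_datadog_aws_integration = false
  env                            = "prod"
  namespace                      = "team_foo" # must differ from the first instance's namespace/env
  cloudwatch_log_groups          = ["cloudwatch_log_group_2"]
}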

Allow using Datadog provider 3.x

Currently, version.tf does not allow using Datadog provider 3.x versions:

  required_providers {
    datadog = {
      source  = "DataDog/datadog"
      version = ">= 2.10, < 3"
    }
  }

This is unfortunate because it forces everything else in the same deployment to use a 2.x version of the provider as well, while all the new features come only to 3.x. It looks to me like there is no reason to hold back the version here, but I might have missed something.

Would it be possible to relax the requirement to allow 3.x versions?
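
A minimal sketch of a relaxed constraint, assuming the module is verified to work with the 3.x provider:

  required_providers {
    datadog = {
      source  = "DataDog/datadog"
      # allow 3.x while keeping the existing lower bound
      version = ">= 2.10"
    }
  }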
