
datadog-lambda-extension


Note: This repository contains release notes, issues, instructions, and scripts related to the Datadog Lambda Extension. The extension is a special build of the Datadog Agent. The source code can be found here.

The Datadog Lambda Extension is an AWS Lambda Extension that supports submitting custom metrics, traces, and logs asynchronously while your AWS Lambda function executes.

Installation

Follow the installation instructions, and view your function's enhanced metrics, traces and logs in Datadog.

Upgrading

To upgrade, update the Datadog Extension version in your Lambda layer configurations or Dockerfile (for Lambda functions deployed as container images). View the latest releases and corresponding changelogs before upgrading.
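For illustration, a hedged sketch of pinning a new Extension version with the AWS CLI (the function name and version number are placeholders, and the layer ARN pattern follows the examples that appear elsewhere in this document):

# Illustrative only: --layers replaces the function's whole layer list, so include any
# other layers (such as a Datadog runtime layer) in the same call.
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:45

# For container images, bump the tag in your Dockerfile instead, for example:
# COPY --from=public.ecr.aws/datadog/lambda-extension:45 /opt/extensions/ /opt/extensions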

Configurations

Follow the configuration instructions to tag your telemetry, capture request/response payloads, filter or scrub sensitive information from logs or traces, and more.
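As a minimal sketch of tagging via environment variables (the function name and tag values are placeholders; DD_ENV, DD_SERVICE, and DD_CAPTURE_LAMBDA_PAYLOAD appear elsewhere in this document, and DD_VERSION is assumed to follow the same pattern):

# Illustrative only: --environment replaces the function's existing environment
# variables, so merge in the ones you already rely on (DD_API_KEY, etc.).
aws lambda update-function-configuration \
  --function-name my-function \
  --environment '{"Variables":{"DD_ENV":"staging","DD_SERVICE":"querying","DD_VERSION":"1.2.3","DD_CAPTURE_LAMBDA_PAYLOAD":"true"}}'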

Overhead

The Datadog Lambda Extension introduces a small amount of overhead to your Lambda function's cold starts (that is, a higher init duration), as the Extension needs to initialize. Datadog continuously optimizes the Lambda Extension's performance and recommends always using the latest release.

You may notice an increase in your Lambda function's reported duration (aws.lambda.duration or aws.lambda.enhanced.duration). This is because the Datadog Lambda Extension needs to flush data back to the Datadog API. Although the time the extension spends flushing data is reported as part of the duration, the flush happens after AWS returns your function's response to the client. In other words, the added duration does not slow down your Lambda function. See this AWS blog post for more technical information. To monitor your function's actual performance and exclude the duration added by the Datadog extension, use the metric aws.lambda.enhanced.runtime_duration.

By default, the Extension flushes data back to Datadog at the end of each invocation (for example, cold starts always trigger flushing). This avoids delays of data arrival for sporadic invocations from low-traffic applications, cron jobs, and manual tests. Once the Extension detects a steady and frequent invocation pattern (more than once per minute), it batches data from multiple invocations and flushes periodically at the beginning of the invocation when it's due. This means that the busier your function is, the lower the average duration overhead per invocation. In other words, for low-traffic applications, the duration overhead would be noticeable while the associated cost overhead is typically negligible; for high-traffic applications, the duration overhead would be barely noticeable. To understand the duration overhead that is used by the Datadog extension to flush data, use the metric aws.lambda.post_runtime_extensions_duration or aws.lambda.enhanced.post_runtime_duration.

A Lambda function deployed in a region that is far from the Datadog site (for example, a function in eu-west-1 reporting data to the US1 Datadog site) can observe a higher duration (and therefore cost) overhead due to network latency. To reduce the overhead, configure the extension to flush data less often, such as every minute: DD_SERVERLESS_FLUSH_STRATEGY=periodically,60000.
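A hedged example of applying that setting with the AWS CLI (the function name is a placeholder, and --environment replaces the function's existing environment variables, so merge in the ones you already set, such as DD_API_KEY or DD_API_KEY_SECRET_ARN):

aws lambda update-function-configuration \
  --function-name my-function \
  --environment '{"Variables":{"DD_SERVERLESS_FLUSH_STRATEGY":"periodically,60000"}}'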

In some rare cases where a very short timeout is configured (what counts as short depends on the runtime), the Lambda handler code may not run on subsequent invocations. This can happen when the first invocation times out, requiring the INIT phase to start again on the next invocation. If that subsequent invocation also times out before the INIT phase completes, Lambda terminates the function and the handler code is not run. You can identify these failures using INIT_REPORT logs. Datadog recommends increasing the timeout for any Lambda function where this has been identified.
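A minimal sketch of searching for these log lines (the log group follows the standard /aws/lambda/<function-name> naming, and the function name is a placeholder):

aws logs filter-log-events \
  --log-group-name /aws/lambda/my-function \
  --filter-pattern "INIT_REPORT"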

Opening Issues

If you encounter a bug with this package, we want to hear about it. Before opening a new issue, search the existing issues to avoid duplicates.

When opening an issue, include the Extension version, and stack trace if available. In addition, include the steps to reproduce when appropriate.

You can also open an issue for a feature request.

Contributing

If you find an issue with this package and have a fix, please feel free to open a pull request following the procedures.

Testing

To test a change to the Datadog Serverless-Init in Google Cloud Run:

  1. Clone this repo and the Datadog Agent repo into the same parent directory.
  2. Run VERSION=0 SERVERLESS_INIT=true ./scripts/build_binary_and_layer_dockerized.sh in this repo to build the serverless-init binary.
  3. Create a "Hello World" serverless application as described here.
  4. Follow the public instructions to add the Serverless-Init to your serverless application.
  5. Copy the binary file that you built to the same location as your Dockerfile:
cp datadog-lambda-extension/.layers/datadog_extension-amd64/extensions/datadog-agent ~/hello-world-app/datadog-init
  6. In your Dockerfile, replace
COPY --from=datadog/serverless-init:1 /datadog-init /app/datadog-init

with

COPY datadog-init /app/datadog-init

Deploy your serverless application, and it will run with a version of the Serverless-Init that includes your changes to the code.
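For example, if the sample application was created as a Cloud Run source deployment, redeploying could look like the following (the service name and region are placeholders):

gcloud run deploy hello-world-app --source . --region us-central1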

Community

For product feedback and questions, join the #serverless channel in the Datadog community on Slack.

License

Unless explicitly stated otherwise, all files in this repository are licensed under the Apache License, Version 2.0.

This product includes software developed at Datadog (https://www.datadoghq.com/). Copyright 2021 Datadog, Inc.


datadog-lambda-extension's Issues

Tags concatenated

I'm using the extension in a Java 11 environment.
Lambda gradle part is: implementation 'com.datadoghq:datadog-lambda-java:1.4.3'
Here is the code I use to create metrics:

    final Map<String, Object> tags = Map.of(REQUEST_TYPE_TAG, queryType.getAlias(), CUSTOMER_TAG, customerUUID);
    this.datadogAgentSupplier.get().metric(METRIC_NAME, 1, tags);
    
    final Map<String, Object> tags = Map.of(CUSTOMER_TAG, customerUUID);
    this.datadogAgentSupplier.get().metric(METRIC_NAME, 1, tags);

I'm using Lambda with Docker. The Dockerfile part is:

FROM datadog/lambda-extension:21 as datadog-extension

# Building runtime container
FROM amazon/aws-lambda-java:11

COPY --from=datadog-extension /opt /opt

The lambda env variables are:

    DD_API_KEY : data.aws_ssm_parameter.datadog_secret.value
    DD_LOGS_INJECTION : "false"
    DD_JMXFETCH_ENABLED : "false"
    DD_TRACE_ENABLED : "false"
    DD_SERVERLESS_LOGS_ENABLED: "false"
    DD_ENV : var.environment_id
    DD_SERVICE : "querying"

The issue I have is that tags are concatenated. For example, if I submit both customer and query-type, I receive query-type:incremental_querycustomer:x5ncbrfd54k8jka3hwij6ur. I've tried a couple of extension versions (17, 19, 21); unfortunately, they all give the same result.

I've also tried to change the extension log level, here is what I see when it's set to debug:

"tags": [
                "env:staging-iad",
                "service:querying",
                "query-type:incremental_querycustomer:x5ncbrfd54k8jka3hwij6ur"

The debugger shows that {"m":"dap.request","v":1.0,"t":["query_type:incremental_query","customer:x5ncbrfd54k8jka3hwij6ur"],"e":1648642299} is written to stdout by datadog-lambda-java. I've also seen this log in production:
2022-03-30T17:16:58.399+03:00 | {"level":"DEBUG","message":"datadog: Setting the writer to extension"}
which means the extension is being used.
I also tried simple tag values like "1" and "2". Example:

    final Map<String, Object> tags = Map.of(REQUEST_TYPE_TAG, "1", CUSTOMER_TAG, "2");
    this.datadogAgentSupplier.get().metric(METRIC_NAME, 1, tags);

Unfortunately, the tags are still concatenated (grouping by the customer tag yields the value 2query-type:1).

UAE region layer doesn't exist

Hey everyone,

I was trying to create Lambda functions with the Datadog extension, but I was not able to do so in the UAE region (me-central-1).

I checked through the AWS CLI and the layer does not exist there. Your code seems to create releases in all regions, but I guess the AWS CLI version used was not up to date.

➜ aws lambda get-layer-version --layer-name arn:aws:lambda:me-south-1:464622532012:layer:Datadog-Extension --version-number 31 --region me-south-1
{
    "Content": {
        "Location": "...,
        "CodeSize": 9561912,
        "SigningProfileVersionArn": "arn:aws:signer:us-east-1:464622532012:/signing-profiles/DatadogLambdaSigningProfile/9vMI9ZAGLc",
        "SigningJobArn": "arn:aws:signer:us-east-1:464622532012:/signing-jobs/d5e3fbc9-cbd7-45b8-a74c-dfa147b0e0d0"
    },
    "LayerArn": "arn:aws:lambda:me-south-1:464622532012:layer:Datadog-Extension",
    "LayerVersionArn": "arn:aws:lambda:me-south-1:464622532012:layer:Datadog-Extension:31",
    "Description": "Datadog Lambda Extension",
    "CreatedDate": "2022-10-20T22:33:20.790+0000",
    "Version": 31
}
➜ aws lambda get-layer-version --layer-name arn:aws:lambda:me-central-1:464622532012:layer:Datadog-Extension --version-number 31 --region me-central-1

An error occurred (AccessDeniedException) when calling the GetLayerVersion operation: User: arn:aws:sts::...:assumed-role/... is not authorized to perform: lambda:GetLayerVersion on resource: arn:aws:lambda:me-central-1:464622532012:layer:Datadog-Extension:31 because no resource-based policy allows the lambda:GetLayerVersion action

It would be great if you could release it to this new region.

Is the issue of logs not being flushed until the Lambda function spins down still a problem?

Hello!

More than a year ago there was an issue: #29

Let me quote the main problem that the user had:

In my testing I have seen that logs take ~10 minutes to appear in the Datadog Log Explorer from my Lambda function. The reason seems to be that the logs are not flushed until the function is spun down by AWS.

A developer on the datadog-lambda-extension side answered:

We are in touch with AWS about possible improvements to resolve this issue.

My question: is it still a problem a year later? Are there any workarounds?

How to integrate lambda extension with proxy

Hello,

I'm trying to integrate Lambda extension log forwarding with datadoghq.eu, and according to the documentation I need a proxy to do so, but the linked page doesn't specify how this should be done with Lambdas. Does the proxy instance have to be in the same VPC as the Lambdas, or can I use a peering connection? How should the proxy be configured? Do I need to run an Agent in between, or can I treat the Lambda extension as the Agent?

Best,
Rafal Rabenda
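One possible starting point, offered only as a hedged sketch: because the extension is a build of the Datadog Agent, it may honor the Agent's standard proxy environment variables, although that is an assumption rather than documented extension behavior. The proxy host and function name below are placeholders:

# Assumption: DD_PROXY_HTTPS is the Agent's standard proxy setting; whether the
# Lambda extension honors it is not confirmed here.
aws lambda update-function-configuration \
  --function-name my-function \
  --environment '{"Variables":{"DD_PROXY_HTTPS":"http://my-proxy.internal:3128"}}'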

Wrapper overrides JAVA_TOOL_OPTIONS

export JAVA_TOOL_OPTIONS="-javaagent:$DD_Agent_Jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1"

On startup, the Datadog wrapper overrides any JAVA_TOOL_OPTIONS the user may have set. This causes those configurations to be ignored despite the environment variable being set on the Lambda. I recommend appending the required config instead of overriding it:

export JAVA_TOOL_OPTIONS="-javaagent:$DD_Agent_Jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1 $JAVA_TOOL_OPTIONS"

However, this could cause issues if an option is set twice, so some logic to merge the two values may be required.

DD_EXTENSION | ERROR | SyncForwarder.sendHTTPTransactions i/o timeout

We are using version 21 of the datadog lambda extension layer on the AL2 lambda runtime. We are frequently seeing errors like this for each concurrent lambda running:

DD_EXTENSION | ERROR | SyncForwarder.sendHTTPTransactions final attempt: error while sending transaction, rescheduling it: Post "https://6-0-0-app.agent.datadoghq.eu/api/beta/sketches?api_key=***************************8e464": dial tcp 34.107.172.23:443: i/o timeout

This seems to really slow down the execution of our Lambda, resulting in timeouts when the 15-second Lambda timeout is exceeded.

We tried both datadoghq.com and datadoghq.eu for DD_SITE but experienced similar behavior.

How to bundle when `addLayers: false`

The docs say:

When false, you must include the Datadog Lambda library in your functions’ deployment packages.

But there are no instructions on how to actually do this. I tried adding import "datadog-lambda-js"; to the top of my handlers, so the datadog libs would get bundled (and they did), but that didn't work.


Then after browsing through your repos, I came upon this:

export const jsHandler = "node_modules/datadog-lambda-js/dist/handler.handler";

So I assumed I'd have to add the entire folder at that specific location, but that didn't work either, as it then started throwing errors about other missing libraries (Error: Cannot find module 'hot-shots').

So, how are we supposed to "include the Datadog Lambda library in your functions' deployment packages"?

why?

Having to exclude these libraries from the bundler makes our functions break when trying to run them locally with sam local start-api, so we'd always have to comment out the exclude rule just to run them locally.
Also, I thought that by using our bundler (esbuild), we should be able to get better minification and reduce the overall package size.
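One hedged interpretation of "include the library in the deployment package" is simply to install it (and the tracer it pairs with) from npm so it ships with your bundle instead of coming from the layer; whether this is the officially supported path is an assumption:

# Illustrative only: pulls in datadog-lambda-js and its dependencies (such as hot-shots)
# so they end up in the package rather than being excluded for the layer.
npm install --save datadog-lambda-js dd-trace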

XRAY TraceId value is missing when using the lambda extension but present when using forwarder

When using the Lambda Forwarder with tracing active through AWS X-Ray, the information is sent as expected. However, when using the Lambda extension layer (set up using https://github.com/DataDog/datadog-cdk-constructs), the trace ID is not sent.

Screenshot from a log that is forwarded by the forwarder that includes the XRAY value:


Screenshot from a log that is sent by the lambda extension that does not include the XRAY value:


Is this something that is handled by the lambda extension?

Thanks!

Filtering Ignored on Tracing

We try to filter out spans using the env variable DD_TRACE_DISABLED_PLUGINS=dns, but this does not seem to work. After further investigation, it appears that the port StatsD uses (8125) has been entirely missed by the filtering checks. The original work was done here: DataDog/datadog-agent#11687.

External/Auto instrumented services don't appear to be tagged as a separate service?

Hi all,

We have a Lambda which is using the datadog-python layer (3.54) and log forwarding (3.42). We just added this layer (v21) to the Lambda, but all of our RDS (Postgres) spans immediately stopped being shown as a separate service. Is there an obvious fix to this problem?

Misleading information when DD_API_KEY_SECRET_ARN points to non-existing secret

Let's say I made a mistake when typing the ARN of a Secrets Manager secret into DD_API_KEY_SECRET_ARN.

The chain of information in the DEBUG log is a little misleading:

  1. Retrieving - ok
  2. Found - that's not possible
  3. Secrets Manager read error: AccessDeniedException - I assume this is from AWS API...?
  4. No API key configured - actually it is, but not found :)
2023-06-23T10:42:03.581+02:00	2023-06-23 08:42:03 UTC | DD_EXTENSION | DEBUG | Retrieving DD_API_KEY_SECRET_ARN=arn:aws:secretsmanager:us-east-2:012345678901:secret:secret-datacat from secrets manager
2023-06-23T10:42:03.581+02:00	2023-06-23 08:42:03 UTC | DD_EXTENSION | DEBUG | Found arn:aws:secretsmanager:us-east-2:012345678901:secret:secret-datacat value, trying to use it.
2023-06-23T10:42:03.701+02:00	2023-06-23 08:42:03 UTC | DD_EXTENSION | DEBUG | Couldn't read API key from Secrets Manager: Secrets Manager read error: AccessDeniedException: User: arn:aws:sts::012345678901:assumed-role/role-my-lambda-use2/my-lambda-use2 is not authorized to perform: secretsmanager:GetSecretValue on resource: arn:aws:secretsmanager:us-east-2:012345678901:secret:secret-datacat because no identity-based policy allows the secretsmanager:GetSecretValue action
2023-06-23T10:42:03.701+02:00	status code: 400, request id: b507ebca-3b2d-4f6d-bf68-cb5fe57e42ad
2023-06-23T10:42:03.701+02:00	2023-06-23 08:42:03 UTC | DD_EXTENSION | ERROR | No API key configured

Duplicated javaagent when using the version v31 with datadog forwarder

In this PR #86 the behavior of the extension was changed.

was

export JAVA_TOOL_OPTIONS="-javaagent:$DD_Agent_Jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1"

and now it's

export JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -javaagent:$DD_Agent_Jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1"

The documentation suggests setting the JAVA_TOOL_OPTIONS env variable:

JAVA_TOOL_OPTIONS: -javaagent:"/opt/java/lib/dd-java-agent.jar" -XX:+TieredCompilation -XX:TieredStopAtLevel=1

But when we do that, we get JAVA_TOOL_OPTIONS with a duplicated javaagent:

-javaagent:/opt/java/lib/dd-java-agent.jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -javaagent:/opt/java/lib/dd-java-agent.jar -XX:+TieredCompilation -XX:TieredStopAtLevel=1

Am I doing something wrong? Is there a new way to set the javaagent?

`service` tag is not correctly applied to the lambda function logs (and traces)

Reproduction Steps:

  1. Create a lambda function with the DD extension, some logs, and a service tag.
  2. Wait approximately 15m for the AWS integration to pick up the tags of the new lambda function
  3. Invoke the lambda

Pull up a log of the invocation from Datadog and export it as JSON. The log will contain content.tags.service with the value you provided as a service tag, but it will also contain content.service and content.attributes.service with the name of the function.

The value of 'service' that DataDog associates with the log is the name of the function, not the one explicitly set as a service tag.

v39 causes docker-based lambda to time out

Context

  • Using docker image based lambda.
  • Building said image by copying DataDog lambda extensions layer contents into an image provided by AWS the way AWS Blogs and DataDog documentation suggest.
Not necessarily related, but it can contribute to the issue: in some circumstances the API key is intentionally not provided. Before v39 this caused no problems and worked as expected.

Problem

In the AWS environment, the problem manifests itself as a Lambda timeout with the v39 extension. The v38 extension works just fine.

Reproduction

Dockerfile v38

FROM public.ecr.aws/lambda/python:3.8
COPY --from=public.ecr.aws/datadog/lambda-extension:38 /opt/extensions/ /opt/extensions
docker run -p 9000:8080 -it tmp datadog_lambda.handler.handler
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'

Lines in the log that are similar in both v38 and v39 are skipped for brevity.

Traceback (most recent call last): Unable to import module 'datadog_lambda.handler': No module named 'datadog_lambda'
^C
02 Mar 2023 06:28:13,954 [WARNING] (rapid) AwaitAgentsReady() = errResetReceived
02 Mar 2023 06:28:13,954 [WARNING] (rapid) Reset initiated: SandboxTerminated
END RequestId: a1a359e3-c551-439a-8fc8-5cf776ec331d
REPORT RequestId: a1a359e3-c551-439a-8fc8-5cf776ec331d  Init Duration: 0.89 ms  Duration: 144255.73 ms  Billed Duration: 144256 ms      Memory Size: 3008 MB    Max Memory Used: 3008 MB
02 Mar 2023 06:28:13,962 [INFO] (rapid) runtime exited
02 Mar 2023 06:28:15,963 [WARNING] (rapid) Killing agent datadog-agent (e552523e-77ad-4423-99e9-6e7ec981c632) which failed to shutdown

Dockerfile v39

The difference here is that it uses v39 image instead of v38.

FROM public.ecr.aws/lambda/python:3.8
COPY --from=public.ecr.aws/datadog/lambda-extension:39 /opt/extensions/ /opt/extensions
docker run -p 9000:8080 -it tmp datadog_lambda.handler.handler

Lines in the log that are similar in both v38 and v39 are skipped for brevity.

2023-03-02 06:30:35 UTC | DD_EXTENSION | ERROR | Unexpected nil instance of the trace-agent
^C
02 Mar 2023 06:31:43,773 [ERROR] (rapid) Init failed error=errResetReceived InvokeID=
02 Mar 2023 06:31:43,773 [WARNING] (rapid) Reset initiated: SandboxTerminated
02 Mar 2023 06:31:43,779 [INFO] (rapid) runtime exited
02 Mar 2023 06:31:45,780 [WARNING] (rapid) datadog-agent (eaaa4547-4642-43c9-8c75-30e0307101ff) failed to transition to ShutdownFailed: State transition is not allowed (current state: Registered)
02 Mar 2023 06:31:45,781 [WARNING] (rapid) Killing agent datadog-agent (eaaa4547-4642-43c9-8c75-30e0307101ff) which failed to shutdown

Notable differences in logs

  • v39 does not attempt to invoke the handler and does not output "handler not found" error.
  • v39 does not output usual END RequestId: and REPORT RequestId: log lines denoting lambda execution completion.

Additional notes

Any insight or ideas as to how to work around the issue are appreciated.

Lambda Initialization Failure Log Not Sent By Extension

We recently started using the Lambda extension v40 instead of the Forwarder. In cases where the Lambda fails to boot, resulting in an uncaught exception, the log doesn't show up in Datadog despite being in CloudWatch.


It appears that this is because the datadog-agent is not yet ready at the time that log is produced.

Is there a recommendation for capturing initialization-time errors via the extension?

Question about source maps and stack traces

Is it possible to upload source maps using datadog-ci and have Datadog use them to turn a stack trace from minified gobbledygook into something human-readable?

eg if I uploaded a source map like this

yarn datadog-ci sourcemaps upload ./dist/lambda \
  --service stack-name-lambdafunction \
  --release-version=$BUILD_NUMBER \
  --minified-path-prefix=/

and then tagged my services using DD_VERSION and DD_SERVICE values that match the source map upload?

NOTE: I have tried the above and it isn't working, but I wonder if I'm missing something additional or if it just isn't currently possible.

Lambda tags are not passed on to datadog

Tags added to my Lambdas are not passed on to Datadog. I have set up env, version, and service tags on all my Lambdas, but logs don't show them (or any others I might have added on top). Setting the env var (i.e. DD_ENV) works, but I'd rather see my tags in the logs.

Is there a parameter I have missed?

I am using version 21 of the extension.

Cheers.

Edit: I use a self-made pipeline built with Pulumi to build and deploy the Lambdas, add layers, tags, etc.

datadog extension delays lambda execution

context:

Vpc: None
Runtime: Node.js 14.x
Handler: /opt/nodejs/node_modules/datadog-lambda-js/handler.handler
Layer 1: Datadog-Node14-x - version 76 - arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Node14-x:76
Layer 2: Datadog-Extension - 21 - arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension:21

Cdk Version: 2.17.0 (build f9cd009)
datadog-cdk-constructs-v2: ^0.2.0

problem: the lambda execution is being delayed by 3 seconds.

Note the delay between these two sequential logs I got from CloudWatch:

2022-05-20T11:10:11.639-10:00	2022-05-20T21:10:11.638Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Didn't attempt to find parent for aws.lambda span","mergeDatadogXrayTraces":false,"traceSource":"xray"}
2022-05-20T11:10:13.338-10:00	2022-05-20 21:10:13 UTC | DD_EXTENSION | DEBUG | Hit on the serverless.Hello route.

cdk:

createDatadogIntegration(
    lambdaFunction: lambda.Function,
    datadogApiKeySecretArnFragment: string,
    gitHash: string
) {
    const apiKeySecretArn = `arn:aws:secretsmanager:${this.region}:${this.account}:secret:${datadogApiKeySecretArnFragment}`
    const datadog = new Datadog(this, "DatadogLogging", {
        nodeLayerVersion: 76,
        extensionLayerVersion: 21,
        apiKeySecretArn: apiKeySecretArn,
    })
    datadog.addLambdaFunctions([lambdaFunction])
    datadog.addGitCommitMetadata([lambdaFunction], gitHash)

    lambdaFunction.addToRolePolicy(
        new iam.PolicyStatement({
            effect: Effect.ALLOW,
            actions: ["secretsmanager:GetSecretValue"],
            resources: [apiKeySecretArn],
        })
    )
}
Full lambda logs (datadog debug):
2022-05-20T11:10:10.125-10:00	START RequestId: 9625d679-ae4d-4cb7-a81a-b708d1c7e81a Version: $LATEST
2022-05-20T11:10:10.125-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Datadog extension version : 21|Datadog environment variables: DD_API_KEY_SECRET_ARN=***|DD_CAPTURE_LAMBDA_PAYLOAD=false|DD_FLUSH_TO_LOG=false|DD_LAMBDA_HANDLER=index.main|DD_LOGS_INJECTION=true|DD_LOG_LEVEL=debug|DD_SERVERLESS_LOGS_ENABLED=true|DD_SITE=datadoghq.com|DD_TAGS=git.commit.sha:6376f01|DD_TRACE_ENABLED=true|
2022-05-20T11:10:10.126-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Starting daemon to receive messages from runtime...
2022-05-20T11:10:10.129-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Unable to restore the state from file
2022-05-20T11:10:10.233-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Found DD_API_KEY_SECRET_ARN value, trying to use it.
2022-05-20T11:10:10.233-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | Using API key set in Secrets Manager.
2022-05-20T11:10:10.233-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Using a SyncForwarder with a 5s timeout
2022-05-20T11:10:10.235-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | Retry queue storage on disk is disabled
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | 'telemetry.dogstatsd.aggregator_channel_latency_buckets' is empty, falling back to default values
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | 'telemetry.dogstatsd.listeners_latency_buckets' is empty, falling back to default values
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Forwarder started
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | 'telemetry.dogstatsd.listeners_channel_latency_buckets' is empty, falling back to default values
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | dogstatsd-udp: 127.0.0.1:8125 successfully initialized
2022-05-20T11:10:10.236-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | dogstatsd-udp: starting to listen on 127.0.0.1:8125
2022-05-20T11:10:10.237-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Enabling logs collection HTTP route
2022-05-20T11:10:10.237-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | WARN | Error loading config: open /var/task/datadog.yaml: no such file or directory
2022-05-20T11:10:10.237-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Subscribing to Logs for types: [platform function extension]
2022-05-20T11:10:10.237-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | Features detected from environment:
2022-05-20T11:10:10.238-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Could not get hostname via gRPC: grpc client disabled via cmd_port: -1. Falling back to other methods.
2022-05-20T11:10:10.239-10:00	LOGS Name: datadog-agent State: Subscribed Types: [platform,function,extension]
2022-05-20T11:10:10.239-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Acquired hostname from OS: "169.254.152.77". Core agent was unreachable at "/opt/datadog-agent/bin/agent/agent": fork/exec /opt/datadog-agent/bin/agent/agent: no such file or directory.
2022-05-20T11:10:10.239-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Trace writer initialized (climit=200 qsize=78)
2022-05-20T11:10:10.240-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Stats writer initialized (climit=20 qsize=83)
2022-05-20T11:10:10.240-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | Listening for traces at http://localhost:8126
2022-05-20T11:10:10.240-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Starting concentrator
2022-05-20T11:10:10.241-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | Starting a serverless logs-agent...
2022-05-20T11:10:10.241-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | INFO | logs-agent started
2022-05-20T11:10:10.241-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | Adding AWS Logs collection source
2022-05-20T11:10:11.342-10:00	2022-05-20 21:10:10 UTC | DD_EXTENSION | DEBUG | serverless agent ready in 128.086103ms
2022-05-20T11:10:11.362-10:00	2022-05-20T21:10:11.338Z undefined ERROR {"status":"error","message":"datadog:api key not configured, see https://dtdg.co/sls-node-metrics"}
2022-05-20T11:10:11.362-10:00	EXTENSION Name: datadog-agent State: Ready Events: [INVOKE,SHUTDOWN]
2022-05-20T11:10:11.518-10:00	2022-05-20T21:10:11.479Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG {"status":"debug","message":"datadog:Patched console output with trace context"}
2022-05-20T11:10:11.538-10:00	2022-05-20T21:10:11.518Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG {"status":"debug","message":"datadog:Not patching HTTP libraries","autoPatchHTTP":true,"tracerInitialized":true}
2022-05-20T11:10:11.558-10:00	2022-05-20T21:10:11.539Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG {"status":"debug","message":"datadog:Reading trace context from env var Root=1-628803b1-724437cb121730d910154f56;Parent=38e64147639a5bed;Sampled=0"}
2022-05-20T11:10:11.580-10:00	2022-05-20T21:10:11.580Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG {"status":"debug","message":"datadog:extracted trace context from xray context","trace":{"parentID":"4100036285636959213","sampleMode":-1,"source":"xray","traceID":"1303564325982916438"},"header":"Root=1-628803b1-724437cb121730d910154f56;Parent=38e64147639a5bed;Sampled=0"}
2022-05-20T11:10:11.639-10:00	2022-05-20T21:10:11.638Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Didn't attempt to find parent for aws.lambda span","mergeDatadogXrayTraces":false,"traceSource":"xray"}
2022-05-20T11:10:13.338-10:00	2022-05-20 21:10:13 UTC | DD_EXTENSION | DEBUG | Hit on the serverless.Hello route.
2022-05-20T11:10:13.560-10:00	2022-05-20T21:10:13.560Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Extension present: true"}
2022-05-20T11:10:13.599-10:00	2022-05-20T21:10:13.560Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Using StatsD client"}
2022-05-20T11:10:13.698-10:00	2022-05-20T21:10:13.678Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Creating the aws.lambda span"}
2022-05-20T11:10:13.738-10:00	2022-05-20T21:10:13.719Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a INFO [dd.trace_id=2590960962667438476 dd.span_id=2590960962667438476] asdf
2022-05-20T11:10:13.898-10:00	2022-05-20T21:10:13.898Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a INFO [dd.trace_id=2590960962667438476 dd.span_id=2590960962667438476] { key1: 'value1', key2: 'value2', key3: 'value3' } { callbackWaitsForEmptyEventLoop: [Getter/Setter], succeed: [Function: modifiedLegacySucceedCallback], fail: [Function: modifiedLegacyFailCallback], done: [Function: modifiedLegacyDoneCallback], functionVersion: '$LATEST', functionName: 'HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY', memoryLimitInMB: '128', logGroupName: '/aws/lambda/HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY', logStreamName: '2022/05/20/[$LATEST]a3718c0678974465ba1ed1e723a0e50a', clientContext: undefined, identity: undefined, invokedFunctionArn: 'arn:aws:lambda:us-east-1:<account_id_redacted>:function:HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY', awsRequestId: '9625d679-ae4d-4cb7-a81a-b708d1c7e81a', getRemainingTimeInMillis: [Function: getRemainingTimeInMillis] }
2022-05-20T11:10:13.959-10:00	{"level":"info","timestamp":1653081013959,"git_hash":"6376f01","logger":{"name":"handler"},"query_dict":{},"headers_dict":{},"inputs_parsed":{"slug":"","originEndpoint":"","offsetMs":0,"videoService":"","segmentsEndpoint":"","useDynamoDb":false,"flags":{"modifyMasterQuery":false}},"lambda":{"request_id":"9625d679-ae4d-4cb7-a81a-b708d1c7e81a","version":"$LATEST"},"service":"hls-markup-handler","env":"dev","message":"inbound request: undefined"}
2022-05-20T11:10:13.998-10:00	2022-05-20 21:10:13 UTC | DD_EXTENSION | DEBUG | Received invocation event...
2022-05-20T11:10:14.360-10:00	2022-05-20 21:10:14 UTC | DD_EXTENSION | DEBUG | The flush strategy end has decided to not flush at moment: starting
2022-05-20T11:10:19.878-10:00	{"level":"error","timestamp":1653081014058,"git_hash":"6376f01","logger":{"name":"handler"},"error":{"kind":"TypeError","message":"Cannot read property 'replace' of undefined","stack":"TypeError: Cannot read property 'replace' of undefined\n at getUrl (/var/asset-input/src/markup.handler.ts:167:35)\n at fetchManifest (/var/asset-input/src/markup.handler.ts:143:23)\n at main (/var/asset-input/src/markup.handler.ts:43:32)\n at /opt/nodejs/node_modules/datadog-lambda-js/utils/handler.js:156:25\n at /opt/nodejs/node_modules/datadog-lambda-js/index.js:175:62\n at step (/opt/nodejs/node_modules/datadog-lambda-js/index.js:44:23)\n at Object.next (/opt/nodejs/node_modules/datadog-lambda-js/index.js:25:53)\n at /opt/nodejs/node_modules/datadog-lambda-js/index.js:19:71\n at new Promise (<anonymous>)\n at __awaiter (/opt/nodejs/node_modules/datadog-lambda-js/index.js:15:12)"},"lambda":{"request_id":"9625d679-ae4d-4cb7-a81a-b708d1c7e81a","version":"$LATEST"},"service":"hls-markup-handler","env":"dev","message":"Cannot read property 'replace' of undefined"}
2022-05-20T11:10:20.258-10:00	2022-05-20T21:10:20.217Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Flushing statsD"}
2022-05-20T11:10:20.818-10:00	2022-05-20T21:10:20.798Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Flushing Extension"}
2022-05-20T11:10:20.958-10:00	2022-05-20T21:10:20.958Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:sending payload with body {}"}
2022-05-20T11:10:23.380-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Hit on the serverless.Flush route.
2022-05-20T11:10:23.518-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | INFO | [lang:nodejs lang_version:v14.19.1 interpreter:v8 tracer_version:2.4.0 endpoint_version:v0.4] -> traces received: 1, traces filtered: 0, traces amount: 587 bytes, events extracted: 0, events sampled: 0
2022-05-20T11:10:23.578-10:00	2022-05-20T21:10:23.558Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:Reading trace context from env var Root=1-628803b1-724437cb121730d910154f56;Parent=38e64147639a5bed;Sampled=0"}
2022-05-20T11:10:23.598-10:00	2022-05-20T21:10:23.578Z 9625d679-ae4d-4cb7-a81a-b708d1c7e81a DEBUG [dd.trace_id=1303564325982916438 dd.span_id=4100036285636959213] {"status":"debug","message":"datadog:discarding xray metadata subsegment due to sampling"}
2022-05-20T11:10:23.885-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Impossible to compute aws.lambda.enhanced.runtime_duration due to an invalid interval
2022-05-20T11:10:23.885-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Received a runtimeDone log message for the current invocation 9625d679-ae4d-4cb7-a81a-b708d1c7e81a
2022-05-20T11:10:23.885-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | The flush strategy end has decided to flush at moment: stopping
2022-05-20T11:10:23.885-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Beginning metrics flush at time 1653081023
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Received a Flush trigger
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Beginning traces flush at time 1653081023
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Beginning logs flush at time 1653081023
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | INFO | Triggering a flush in the logs-agent
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Flush in the logs-agent done.
2022-05-20T11:10:23.898-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Finished logs flush that was started at time 1653081023
2022-05-20T11:10:24.078-10:00	2022-05-20 21:10:23 UTC | DD_EXTENSION | DEBUG | Serializing 3 tracer payloads.
2022-05-20T11:10:24.238-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Sending trace payload : {HostName: Env:none TracerPayloads:[languageName:"nodejs" languageVersion:"v14.19.1" tracerVersion:"2.4.0" chunks:<priority:1 origin:"lambda" spans:<service:"hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty-http-client" name:"http.request" resource:"GET" traceID:8601525117646379897 spanID:8601525117646379897 start:1653081011738045184 duration:380723633 error:1 meta:<key:"_dd.origin" value:"lambda" > meta:<key:"git.commit.sha" value:"6376f01" > meta:<key:"http.method" value:"GET" > meta:<key:"http.url" value:"http://127.0.0.1:8124/lambda/hello" > meta:<key:"runtime-id" value:"5d8b4890-a44e-4176-a7b2-d9571caaf20c" > meta:<key:"service" value:"HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY" > meta:<key:"span.kind" value:"client" > metrics:<key:"_dd.agent_psr" value:1 > metrics:<key:"_dd.measured" value:1 > metrics:<key:"_sampling_priority_v1" value:1 > metrics:<key:"_top_level" value:1 > type:"http" > > languageName:"nodejs" languageVersion:"v14.19.1" tracerVersion:"2.4.0" chunks:<priority:1 origin:"lambda" spans:<service:"hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty-http-client" name:"http.request" resource:"POST" traceID:48843491771455817 spanID:48843491771455817 start:1653081020959006464 duration:818924072 error:1 meta:<key:"_dd.compute_stats" value:"1" > meta:<key:"_dd.origin" value:"lambda" > meta:<key:"account_id" value:"<account_id_redacted>" > meta:<key:"architecture" value:"x86_64" > meta:<key:"aws_account" value:"<account_id_redacted>" > meta:<key:"dd_extension_version" value:"21" > meta:<key:"function_arn" value:"arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty" > meta:<key:"functionname" value:"hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty" > meta:<key:"git.commit.sha" value:"6376f01" > meta:<key:"http.method" value:"POST" > meta:<key:"http.url" value:"http://127.0.0.1:8124/lambda/flush" > meta:<key:"memorysize" value:"128" > meta:<key:"region" value:"us-east-1" > meta:<key:"runtime" value:"nodejs14.x" > meta:<key:"runtime-id" value:"5d8b4890-a44e-4176-a7b2-d9571caaf20c" > meta:<key:"service" value:"HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY" > meta:<key:"span.kind" value:"client" > metrics:<key:"_dd.agent_psr" value:1 > metrics:<key:"_dd.measured" value:1 > metrics:<key:"_sampling_priority_v1" value:1 > metrics:<key:"_top_level" value:1 > type:"http" > > languageName:"nodejs" languageVersion:"v14.19.1" tracerVersion:"2.4.0" chunks:<priority:1 origin:"lambda" spans:<service:"aws.lambda" name:"aws.lambda" resource:"HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY" traceID:2590960962667438476 spanID:2590960962667438476 start:1653081013718006784 duration:6239594238 meta:<key:"_dd.compute_stats" value:"1" > meta:<key:"_dd.origin" value:"lambda" > meta:<key:"account_id" value:"<account_id_redacted>" > meta:<key:"architecture" value:"x86_64" > meta:<key:"aws_account" value:"<account_id_redacted>" > meta:<key:"datadog_lambda" value:"5.76.0" > meta:<key:"dd_extension_version" value:"21" > meta:<key:"function_arn" value:"arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty" > meta:<key:"function_version" value:"$LATEST" > meta:<key:"functionname" value:"hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty" > meta:<key:"git.commit.sha" 
value:"6376f01" > meta:<key:"memorysize" value:"128" > meta:<key:"region" value:"us-east-1" > meta:<key:"request_id" value:"9625d679-ae4d-4cb7-a81a-b708d1c7e81a" > meta:<key:"resource_names" value:"HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY" > meta:<key:"runtime" value:"nodejs14.x" > meta:<key:"runtime-id" value:"5d8b4890-a44e-4176-a7b2-d9571caaf20c" > meta:<key:"service" value:"HlsMarkupService-dev-wbend-HlsMarkupLambda1FD8C381-5MNIydJtsUTY" > metrics:<key:"_dd.agent_psr" value:1 > metrics:<key:"_dd.measured" value:1 > metrics:<key:"_sampling_priority_v1" value:1 > metrics:<key:"_top_level" value:1 > metrics:<key:"cold_start" value:1 > type:"serverless" > > ] Tags:map[] AgentVersion:6.0.0 TargetTPS:10 ErrorTPS:10}
2022-05-20T11:10:24.320-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Flushing 1 sketches to the forwarder
2022-05-20T11:10:24.379-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Sending sketches payload : {"sketches":[{"metric":"aws.lambda.enhanced.invocations","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:true"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":1,"Max":1,"Sum":1,"Avg":1,"Cnt":1}},"ts":1653081010}]}]}
2022-05-20T11:10:24.438-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:10:24.518-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | INFO | Successfully posted payload to "https://6-0-0-app.agent.datadoghq.com/api/beta/sketches?api_key=***************************7b865", the agent will only log transaction success every 500 transactions
2022-05-20T11:10:24.538-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:10:24.538-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Flushing 2 series to the forwarder
2022-05-20T11:10:24.538-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:10:24.541-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:10:24.541-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Flushing 1 service checks to the forwarder
2022-05-20T11:10:24.558-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:10:24.561-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:10:24.561-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Finished metrics flush that was started at time 1653081023
2022-05-20T11:10:24.600-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Flushed traces to the API; time: 200.890177ms, bytes: 1003
2022-05-20T11:10:24.610-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Finished traces flush that was started at time 1653081023
2022-05-20T11:10:24.610-10:00	2022-05-20 21:10:24 UTC | DD_EXTENSION | DEBUG | Finished flushing
2022-05-20T11:10:24.613-10:00	END RequestId: 9625d679-ae4d-4cb7-a81a-b708d1c7e81a
2022-05-20T11:10:24.613-10:00	REPORT RequestId: 9625d679-ae4d-4cb7-a81a-b708d1c7e81a Duration: 13246.87 ms Billed Duration: 13247 ms Memory Size: 128 MB Max Memory Used: 128 MB Init Duration: 1502.96 ms
2022-05-20T11:16:10.160-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Flushing 2 series to the forwarder
2022-05-20T11:16:10.160-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:10.160-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Flushing 1 service checks to the forwarder
2022-05-20T11:16:10.160-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:10.501-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | INFO | [lang:nodejs lang_version:v14.19.1 interpreter:v8 tracer_version:2.4.0 endpoint_version:v0.4] -> traces received: 2, traces filtered: 0, traces amount: 1505 bytes, events extracted: 0, events sampled: 0
2022-05-20T11:16:10.562-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Enhanced metrics: {durationMs:13246.87 billedDurationMs:13247 memorySizeMB:128 maxMemoryUsedMB:128 initDurationMs:1502.96}
2022-05-20T11:16:10.660-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Flushing 1 service checks to the forwarder
2022-05-20T11:16:10.660-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:10.722-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Received shutdown event. Reason: spindown
2022-05-20T11:16:10.740-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Waiting to shut down HTTP server
2022-05-20T11:16:10.761-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | flushing bucket 1653081010000000000
2022-05-20T11:16:10.761-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Flushing 2 series to the forwarder
2022-05-20T11:16:10.761-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:10.761-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:10.801-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:10.802-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:10.840-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:10.861-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | flushing bucket 1653081020000000000
2022-05-20T11:16:10.861-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | update oldestTs to 1653081360000000000
2022-05-20T11:16:10.861-10:00	2022-05-20 21:16:10 UTC | DD_EXTENSION | DEBUG | Flushing 3 entries (buckets=2 client_payloads=1)
2022-05-20T11:16:11.660-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Shutting down HTTP server
2022-05-20T11:16:11.662-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Beginning metrics flush at time 1653081371
2022-05-20T11:16:11.680-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Received a Flush trigger
2022-05-20T11:16:11.680-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Beginning traces flush at time 1653081371
2022-05-20T11:16:11.682-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Flushing 6 sketches to the forwarder
2022-05-20T11:16:11.682-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Sending sketches payload : {"sketches":[{"metric":"aws.lambda.enhanced.max_memory_used","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":128,"Max":128,"Sum":128,"Avg":128,"Cnt":1}},"ts":1653081020}]},{"metric":"aws.lambda.enhanced.memorysize","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":128,"Max":128,"Sum":128,"Avg":128,"Cnt":1}},"ts":1653081020}]},{"metric":"aws.lambda.enhanced.billed_duration","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":13.247,"Max":13.247,"Sum":13.247,"Avg":13.247,"Cnt":1}},"ts":1653081020}]},{"metric":"aws.lambda.enhanced.duration","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":13.246870000000001,"Max":13.246870000000001,"Sum":13.246870000000001,"Avg":13.246870000000001,"Cnt":1}},"ts":1653081020}]},{"metric":"aws.lambda.enhanced.estimated_cost","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"s
ummary":{"Min":0.0000277979719,"Max":0.0000277979719,"Sum":0.0000277979719,"Avg":0.0000277979719,"Cnt":1}},"ts":1653081020}]},{"metric":"aws.lambda.enhanced.init_duration","tags":["function_arn:arn:aws:lambda:us-east-1:<account_id_redacted>:function:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","functionname:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","git.commit.sha:6376f01","account_id:<account_id_redacted>","aws_account:<account_id_redacted>","memorysize:128","dd_extension_version:21","region:us-east-1","resource:hlsmarkupservice-dev-wbend-hlsmarkuplambda1fd8c381-5mniydjtsuty","architecture:x86_64","runtime:nodejs14.x","cold_start:false"],"host":"","interval":10,"points":[{"sketch":{"summary":{"Min":1.50296,"Max":1.50296,"Sum":1.50296,"Avg":1.50296,"Cnt":1}},"ts":1653081020}]}]}
2022-05-20T11:16:11.682-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Beginning logs flush at time 1653081371
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | INFO | Triggering a flush in the logs-agent
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Flush in the logs-agent done.
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Finished logs flush that was started at time 1653081371
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:11.720-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Flushing 2 series to the forwarder
2022-05-20T11:16:11.740-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:11.741-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:11.741-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Flushing 1 service checks to the forwarder
2022-05-20T11:16:11.741-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | The payload was not too big, returning the full payload
2022-05-20T11:16:11.760-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | SyncForwarder has flushed 1 transactions
2022-05-20T11:16:11.760-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Finished metrics flush that was started at time 1653081371
2022-05-20T11:16:11.786-10:00	2022-05-20 21:16:11 UTC | DD_EXTENSION | DEBUG | Flushed stats to the API; time: 65.808446ms, bytes: 560

Add Support For Alpine Linux

Per a discussion in Slack with Maxime David, this extension does not currently support Alpine Linux.

I'm not sure why this is, but let me know if there is anything I can do to help get this working.

Seeing `runtime error: invalid memory address or nil pointer dereference` with extension v39 and v40

Problem

Runtimes: java, dotnet, go
Extension: v39 and v40
Universal instrumentation: enabled, either by default (java/dotnet) or via DD_UNIVERSAL_INSTRUMENTATION=true (go)
Appsec: disabled (ie DD_SERVERLESS_APPSEC_ENABLED=false)

After the lambda function has returned its response, the Datadog Lambda Extension panics with the following error.

2023-03-14 18:06:21 UTC | DD_EXTENSION | DEBUG | [lifecycle] onInvokeEnd --------
2023-03-14 18:06:21 UTC | DD_EXTENSION | DEBUG | [lifecycle] Invocation has finished at: 2023-03-14 18:06:21.701256499 +0000 UTC m=+5.080367241
2023-03-14 18:06:21 UTC | DD_EXTENSION | DEBUG | [lifecycle] Invocation isError is: false
2023-03-14 18:06:21 UTC | DD_EXTENSION | DEBUG | [lifecycle] ---------------------------------------
2023-03-14 18:06:21 UTC | DD_EXTENSION | DEBUG | [lifecycle] No http status code found in the response payload
2023/03/14 18:06:21 http: panic serving 127.0.0.1:38412: runtime error: invalid memory address or nil pointer dereference
goroutine 263 [running]:
net/http.(*conn).serve.func1()
/usr/local/go/src/net/http/server.go:1850 +0xbf
panic({0x163b3a0, 0x26d6fa0})
/usr/local/go/src/runtime/panic.go:890 +0x262
github.com/DataDog/datadog-agent/pkg/serverless/appsec/httpsec.(*InvocationSubProcessor).OnInvokeEnd(0x0, 0xc0000a17c0, 0xc000489bc0)
/tmp/dd/datadog-agent/pkg/serverless/appsec/httpsec/http.go:136 +0x778
github.com/DataDog/datadog-agent/pkg/serverless/invocationlifecycle.(*LifecycleProcessor).OnInvokeEnd(0xc00048acd0, 0xc0000a17c0)
/tmp/dd/datadog-agent/pkg/serverless/invocationlifecycle/lifecycle.go:206 +0x2d7
github.com/DataDog/datadog-agent/pkg/serverless/daemon.(*EndInvocation).ServeHTTP(0xc0000bf700, {0x1c1d8a0, 0xc00001c540}, 0xc00012f300)
/tmp/dd/datadog-agent/pkg/serverless/daemon/routes.go:111 +0x34c
net/http.(*ServeMux).ServeHTTP(0x0?, {0x1c1d8a0, 0xc00001c540}, 0xc00012f300)
/usr/local/go/src/net/http/server.go:2487 +0x149
net/http.serverHandler.ServeHTTP({0x1c16fd0?}, {0x1c1d8a0, 0xc00001c540}, 0xc00012f300)
/usr/local/go/src/net/http/server.go:2947 +0x30c
net/http.(*conn).serve(0xc0001439a0, {0x1c1e8c8, 0xc000662b70})
/usr/local/go/src/net/http/server.go:1991 +0x607
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:3102 +0x4db

Impact

The customer function completes as expected. However, the following data may not be collected and sent by the extension.

  • aws.lambda.enhanced.errors metric
  • aws.lambda invocation span
  • Any inferred spans normally created by the extension

Mitigation

At this point, if you need the missing data listed above, the best solution is to roll back your extension to version v38.

A fix to address the underlying issue is currently in review (see DataDog/datadog-agent#16054). However, due to the incident the week of March 7, we have chosen to hold off on releasing it for now, as we are prioritizing product stability.
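For Lambda functions deployed as container images, rolling back amounts to pinning the extension version copied into the image. A minimal sketch following the COPY pattern used in the installation examples elsewhere on this page (the base image here is only an example):

# Pin the Datadog extension to v38 until the fix is released
FROM public.ecr.aws/lambda/python:3.9
COPY --from=public.ecr.aws/datadog/lambda-extension:38 /opt/extensions/ /opt/extensions

For layer-based deployments, the equivalent rollback is pointing the Datadog-Extension layer ARN at version :38.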

Possible to deploy in AWS China regions?

Is it possible to build this manually in a China AWS account? The published ARNs and support cover the US and GovCloud partitions, but what is the possibility of deploying this extension yourself in China, using your own ARN rather than Datadog's?

Sending logs, metrics or traces to Vector

We can easily configure the Datadog Agent to send logs, metrics, or traces to a Vector aggregator, but this doesn't seem to work for the Lambda extension.

I used both of the configurations below, but the Datadog extension keeps sending logs directly to the Datadog intake.

vector:
  logs:
    enabled: true
    url: "http://<OPW_HOST>:8282"
DD_VECTOR_LOGS_ENABLED="true"
DD_VECTOR_LOGS_URL="http://<OPW_HOST>:8282"

Do you know if there is a plan to fix this, or can anyone point me to where/how to fix it?

Thanks
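For reference, a hedged sketch of the Agent-level intake overrides that are sometimes used to point logs at a Vector / Observability Pipelines host. Whether the Lambda extension build honors these settings is exactly what this issue is asking, so treat them as an assumption to verify rather than a confirmed workaround:

DD_LOGS_CONFIG_USE_HTTP=true
DD_LOGS_CONFIG_LOGS_NO_SSL=true
DD_LOGS_CONFIG_LOGS_DD_URL=<OPW_HOST>:8282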

Version 8 not accessible in eu-central-1

It looks like version 8 of the Lambda extension layer is not accessible in the eu-central-1 region.

$ aws lambda get-layer-version --layer-name arn:aws:lambda:eu-central-1:464622532012:layer:Datadog-Extension --version-number 8

An error occurred (AccessDeniedException) when calling the GetLayerVersion operation: User: <REDACTED> is not authorized to perform: lambda:GetLayerVersion on resource: arn:aws:lambda:eu-central-1:464622532012:layer:Datadog-Extension:8

Extension size causing lambda to exceed size limit

We are trying to use this extension for our Lambdas, but we are getting a size limit error. The extension appears to include the folder listed below, which, when I pull the code down locally, shows as 163 MB. The entire extracted code base is only 164 MB, so this folder seems to account for the bulk of the size. Going only by the name of the folder, should it be included in the extension?

..\datadog-lambda-extension-main\datadog-lambda-extension-main\scripts\go-binsize-viz\example-data

We are currently using versions 43 and 44 and seeing the same results in both.

It looks like this might have been reported before and closed in issue #79, but there was no resolution.
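One way to check whether the published layer itself contains that folder (as opposed to a checkout of the GitHub repository) is to download the layer artifact and list it. A rough sketch reusing the get-layer-version call shown elsewhere on this page; the region and version number are examples:

aws lambda get-layer-version \
  --layer-name arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Extension \
  --version-number 44 \
  --query Content.Location --output text | xargs curl -s -o layer.zip
unzip -l layer.zip | tail -n 20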

Increased lambda duration during march 8th incident

During the March 8th, 2023 incident, our Lambdas' average execution duration increased very significantly (about 2-3x), causing an increase in concurrency and additional load across the board. These Lambdas are executed thousands of times per minute and take ~90ms on average to complete. We ended up disabling the Lambda extension to restore our normal duration.

Looking at https://github.com/DataDog/datadog-lambda-extension#overhead, it seems we shouldn't have seen this. My interpretation is that roughly one invocation per minute would be expected to have a larger-than-normal duration as it flushes the buffered metrics/spans, but most Lambda invocations should have kept working at the same speed.

I wanted to check on that interpretation, see whether there are any other reasons that could have caused such an increase in average duration, and ask whether there is anything that could be done to prevent this in the future in light of today's incident.

This is our current configuration for the extension:

  enableDDTracing: false
  # logs are forwarded from CW to DD
  enableDDLogs: false
  subscribeToAccessLogs: false
  # as tracing is disabled, do not add DD context to logs
  injectLogContext: false

Thank you, and #hugops as I bet this one was a hard one!
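For completeness, the flush cadence itself is configurable; a minimal sketch of the override also discussed under "Extension adds significant execution overhead" later on this page, assuming you want the extension to batch and flush on a fixed schedule rather than around individual invocations:

DD_SERVERLESS_FLUSH_STRATEGY=periodically,60000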

Reduce size

Hi guys,
First, I would like to thank you for the extension.
It's really easy to integrate into a Lambda function, but we face some issues with the final size of a Lambda, as all layers count against the 250 MB limit.
The size of the extension is about 174 MB, so we have just 72 MB left for our other layers.

Looks like most of the size is coming from go-binsize-viz/example-data/

du -hs ~/Downloads/datadog-lambda-extension-29/scripts/go-binsize-viz/example-data/
164M    /Users/eliskovets_pdl/Downloads/datadog-lambda-extension-29/scripts/go-binsize-viz/example-data/

I'm not exactly sure whether it would be fine to exclude this directory from the extension, but if it can be excluded, it would really help us and would reduce the size of the extension to only about 10 MB.

Thank you!

DD_SERVICE and DD_ENV environment variables not getting created

The docs state that the DD_SERVICE and DD_ENV environment variables will be created on the Lambda function if service and env are configured on the transform. Using the latest version (v67), we are not seeing this behavior.

Environment variables are not being created, but tags are.

Here is our transform:

Transform:
  - AWS::Serverless-2016-10-31
  - Name: DatadogServerless
    Parameters:
      pythonLayerVersion: 67
      extensionLayerVersion: 36
      site: "datadoghq.com"
      apiKey: !Ref DatadogAPIKey
      env: staging
      service: my-awesome-service
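As a stopgap while the macro behavior is confirmed (a hedged sketch, not the documented fix), the two variables can also be set directly on the function in the SAM template; MyFunction and the values below are placeholders:

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Environment:
      Variables:
        DD_ENV: staging
        DD_SERVICE: my-awesome-service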

Support SSM Parameter Store secure string for the DD_API_KEY

Hey, what's the whole purpose of this project if it cannot be secured properly?

Transform:
  - AWS::Serverless-2016-10-31
  - Name: DatadogServerless
    Parameters:
      stackName: !Ref "AWS::StackName"
      apiKey: <DATADOG_API_KEY>
      pythonLayerVersion: 47
      extensionLayerVersion: 10
      service: "<SERVICE>" # Optional
      env: "<ENV>" # Optional

We have to be able to resolve the apiKey from somewhere...
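Not SSM Parameter Store, but per the container-image installation steps quoted later on this page, the extension can read the key from AWS Secrets Manager via DD_API_KEY_SECRET_ARN. A hedged sketch of wiring that up in a SAM template instead of passing apiKey in plaintext (resource names are placeholders; the function needs secretsmanager:GetSecretValue on the secret):

MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Environment:
      Variables:
        DD_API_KEY_SECRET_ARN: !Ref DatadogApiKeySecret
    Policies:
      - Statement:
          - Effect: Allow
            Action: secretsmanager:GetSecretValue
            Resource: !Ref DatadogApiKeySecret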

Extension crashing

Hello. My Lambda is deployed with the Datadog Lambda extension.

My Lambda crashes intermittently, and I believe it's because the extension is crashing.

The logs are available here:
[screenshots of the crash logs]

DD_EXTENSION warn

Hi

I'm currently monitoring some Lambdas with DD and I'm getting a lot of WARN messages from the DD_EXTENSION layer.

2023-04-12 10:38:54 UTC | DD_EXTENSION | WARN | Failed to identify cgroups version due to err: unable to detect cgroup version from detected mount points: map[]. APM data may be missing containerIDs.

Previously we had the following warning, but since I updated to the latest version it no longer appears:
DD_EXTENSION | WARN | Error loading config: open /var/task/datadog.yaml: no such file or directory

I'm using the latest versions of the layers and running Node 16:

"arn:aws:lambda:eu-west-1:464622532012:layer:Datadog-Node16-x:90"
"arn:aws:lambda:eu-west-1:464622532012:layer:Datadog-Extension:41",

and these are the variables that I'm passing to the Lambda:

DD_ENV | dev
DD_FLUSH_TO_LOG | false
DD_LAMBDA_HANDLER | index.handler
DD_LOG_LEVEL | warn
DD_LOGS_CONFIG_PROCESSING_RULES | [{"type": "exclude_at_match", "name": "exclude_start_and_end_logs", "pattern": "(START|END|REPORT) RequestId"}]
DD_MERGE_XRAY_TRACES | false
DD_SERVICE | my-analyses
DD_SITE | datadoghq.eu
DD_TRACE_ENABLED | true

Do you know why this could be happening, or is there a way to stop logging it? It's costly and not really relevant for us.
Thanks in advance
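If the message is just noise for you, one blunt option (a hedged sketch; it suppresses all extension output below ERROR, not only this warning) is to raise the log level:

DD_LOG_LEVEL=error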

Support for GraalVM

We have several functions that are compiled with GraalVM and deployed to run on Lambda's custom runtime. The Lambda zip bundle contains two files: application, which is the binary generated by GraalVM's compiler, and bootstrap, which contains the following script:

#!/bin/sh
set -euo pipefail
./application -Xmx512m -Djava.library.path=$(pwd)

The wrapper only sets the appropriate Java variables if the runtime matches *java*. Is there a way for us to get this working with GraalVM? Will APM even work with a GraalVM-compiled Java application?

Extension adds significant execution overhead

This is on AWS Lambda, Python 3.9, with extension version 83 on ARM64.

[screenshot]

According to the extension's description, it's supposed to ship logs asynchronously, but it appears to be doing so synchronously. Setting DD_SERVERLESS_FLUSH_STRATEGY=periodically,60000 only defers the latency -- most executions go down to around 50ms, but every minute or so, they spike to ~1000ms, which is not ideal.

Extension Enabled

[screenshot]

Extension Disabled, with DD_FLUSH_TO_LOG=true

[screenshot]

As you can see, the difference is an order of magnitude.

Is this expected/normal?

Missing logs and APM traces when a lambda timeout happens

During some tests, a Lambda timed out after running for more than 15 minutes, and the final logs were missing in Datadog, as were the traces for the whole execution.

I reduced the timeout to 850 seconds and enabled DD_TRACE_DEBUG=true, but I couldn't see anything in the logs.

See the screenshots of the latest Lambda execution logs in Datadog and CloudWatch (highlighted are the ones missing in CloudWatch: END, REPORT, and the task timeout):

Screenshot 2022-10-27 at 10 35 16

Screenshot 2022-10-27 at 10 18 09

Libraries in use:

datadog-lambda 4.63.0 The Datadog AWS Lambda Library
├── datadog >=0.41,<0.42
│   ├── decorator >=3.3.2 
│   └── requests >=2.6.0 
│       ├── certifi >=2017.4.17 
│       ├── charset-normalizer >=2,<3 
│       ├── idna >=2.5,<4 
│       └── urllib3 >=1.21.1,<1.27 
├── ddtrace >=1.4.1,<2.0.0
│   ├── attrs >=19.2.0 
│   ├── bytecode * 
│   ├── cattrs * 
│   │   ├── attrs >=20 (circular dependency aborted here)
│   │   └── exceptiongroup * 
│   ├── ddsketch >=2.0.1 
│   │   ├── protobuf >=3.0.0 
│   │   └── six * 
│   ├── envier * 
│   ├── jsonschema * 
│   │   ├── attrs >=17.4.0 (circular dependency aborted here)
│   │   └── pyrsistent >=0.14.0,<0.17.0 || >0.17.0,<0.17.1 || >0.17.1,<0.17.2 || >0.17.2 
│   ├── packaging >=17.1 
│   │   └── pyparsing >=2.0.2,<3.0.5 || >3.0.5 
│   ├── protobuf >=3 (circular dependency aborted here)
│   ├── six >=1.12.0 (circular dependency aborted here)
│   ├── tenacity >=5 
│   ├── typing-extensions * 
│   └── xmltodict >=0.12 
└── wrapt >=1.11.2,<2.0.0

"InvalidSignatureException: SignatureExpired" due to invoke delay

Every so often, our Lambda behind an API Gateway fails with the error "InvalidSignatureException: Signature expired", claiming that the 5-minute signature window has elapsed. I dug through the logs and found that there is a strange gap between the Datadog Extension layer starting and the actual invoke of our function, often on the order of about 6 minutes, hence the signature issue. This doesn't happen every time, but often enough to be worrying, and I have no idea why.

Any ideas? This is with serverless 3.32.2 and serverless-plugin-datadog 5.48.0 with the extension enabled.

Send OpenTelemetry traces to the datadog via the extension

I would like to instrument my Lambda function using OpenTelemetry. I know ADOT (AWS Distro for OpenTelemetry) exists, but it seems that I would have to sacrifice some features (appsec, enhanced metrics, log & trace correlation, etc.) if I only used that. If I used both ADOT and the Datadog extension, then I'd have to include more Lambda layers.

I would like to activate OTLP ingestion. I tried setting the DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=localhost:4318 environment variable, but when I check, the port is not in use. Is it possible to run OTLP ingestion in Lambda?

Rate exceeded (toomanyrequests) in public.ecr.aws/datadog/lambda-extension

I've had a few failures in CI building our Docker image. Could you please increase the rate limit?

Message while building our container:

Step 13/14 : COPY --from=public.ecr.aws/datadog/lambda-extension:35 /opt/extensions/ /opt/extensions
invalid from flag value public.ecr.aws/datadog/lambda-extension:35: toomanyrequests: Rate exceeded
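A common way to sidestep public-registry pull limits in CI (a hedged sketch of a generic Docker workflow, not a Datadog-specific fix) is to mirror the extension image into your own private ECR once and build from that copy; the account ID, region, and tag below are placeholders:

# assumes the private repository exists and docker is logged in to both registries
docker pull public.ecr.aws/datadog/lambda-extension:35
docker tag public.ecr.aws/datadog/lambda-extension:35 123456789012.dkr.ecr.eu-west-1.amazonaws.com/datadog/lambda-extension:35
docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/datadog/lambda-extension:35

The COPY --from in the Dockerfile then references the private copy instead of public.ecr.aws.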

DD_EXTENSION | ERROR | Cannot decode v0.4 traces payload

We are using version 20 of the Datadog Lambda extension layer on the AL2 Lambda runtime. We sometimes see errors like the one below on the Lambda:

arn:aws:lambda:eu-central-1:464622532012:layer:Datadog-Extension:21

2022-04-25 16:00:37 UTC | DD_EXTENSION | ERROR | Cannot decode v0.4 traces payload: read tcp 127.0.0.1:8126->127.0.0.1:49232: i/o timeout

We are upgrading to version 21 to see if it either solves the problem or provides more diagnostics.

Any advice would be great.

After removing the Version tag on the Lambda, getting an error retrieving the Datadog API key from Secrets Manager

Hi, I've been using the datadog_lambda_extension with a Docker image in Lambda for a while and didn't experience any problems:

COPY --from=public.ecr.aws/datadog/lambda-extension:35 /opt/extensions/ /opt/extensions

Now I'm refactoring my Terraform code; I've removed the Version tag from the Lambda and I'm getting this error:

2023-02-20 16:56:16 UTC | DD_EXTENSION | ERROR | Error while trying to read an API Key from Secrets Manager: Secrets Manager read error: RequestError: send request failed
2023-02-20T17:56:16.880+01:00 | caused by: Post "https://secretsmanager.eu-west-1.amazonaws.com/": dial tcp 52.16.54.217:443: i/o timeout
2023-02-20T17:56:16.880+01:00 | 2023-02-20 16:56:16 UTC | DD_EXTENSION | ERROR | No API key configured, exiting
2023-02-20T17:56:16.883+01:00 | TELEMETRY Name: datadog-agent State: Subscribed Types: [Platform, Function, Extension]
2023-02-20T17:56:16.883+01:00 | 2023-02-20 16:56:16 UTC | DD_EXTENSION | ERROR | Unable to load trace agent config: you must specify an API Key, either via a configuration file or the DD_API_KEY env var
2023-02-20T17:56:16.886+01:00 | EXTENSION Name: datadog-agent State: Registered Events: [INVOKE,SHUTDOWN]

I've followed https://docs.datadoghq.com/serverless/installation/python/?tab=containerimage and I've checked step 4:

4. Configure the Datadog site, API key, and tracing

Set the environment variable DD_SITE to datadoghq.com (ensure the correct SITE is selected on the right).
Set the environment variable DD_API_KEY_SECRET_ARN with the ARN of the AWS secret where your Datadog API key (https://app.datadoghq.com/organization-settings/api-keys) is securely stored. The key needs to be stored as a plaintext string (not a JSON blob). The secretsmanager:GetSecretValue permission is required. For quick testing, you can use DD_API_KEY instead and set the Datadog API key in plaintext.
Set the environment variable DD_TRACE_ENABLED to true.

I've followed this before and checked that the key is there. If I revert the commit and put the Version tag back, it works again.

Any idea why this tag is so important? I cannot find anywhere in the docs where it says the Lambda has to be tagged with Version.

Trace span longer than invocation

I don't know if this is supposed to be this way or not, but when I have my Lambdas hooked up to the v8 extension, my traces look something like this. This is with X-Ray merging in place as well. The green and yellow spans are from X-Ray, and everything in orange is from the datadog-lambda-extension.

[screenshot]

  1. Why are there varying span lengths for Invoke? (I assume this is just aws nonsense, and less of a concern than the next question)
  2. Why does the first span from datadog-lambda-extension go far beyond the invocation spans?

The lambda has the following layers:
[screenshot]

with the following environment variables (api_key not shown):
[screenshot]

v33 does not include features that it should

I installed v33 expecting it to include the new Lambda Telemetry features, but didn't see any changes. Looking at the comparison of tags, it seems like only a local testing change was included in v33? I think, at a minimum, dd-trace needs to be bumped to 1.6.0 to see the telemetry, and in the latest traces I see that it still says 1.5.3 after upgrading to v33.

Expected Features:
[screenshot]

Actual Changes:
[screenshot]

Logged exception is split per line resulting in multiple log entries in DD

Hello,

We switched our Lambda function running in AWS from a zip-based Lambda to a Docker image.
We followed the setup instructions from https://docs.datadoghq.com/serverless/installation/python/?tab=containerimage

But now, for error entries logged with exc_info=, we get multiple entries in DD, as if the log is split per line instead of arriving as a single entry as expected; see the attached image.

All those entries in the middle are actually the result of a single logger.error call.

Screenshot 2023-05-09 at 17 44 59

Any idea on what might cause this?
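One common workaround, sketched here under the assumption that the splitting happens because the multi-line traceback is written to stdout as separate lines, is to log single-line JSON so the whole exception travels as one record. This is plain standard-library Python, not part of datadog-lambda:

import json
import logging
import traceback

class JsonFormatter(logging.Formatter):
    # Render each record, including any traceback, as a single JSON line
    def format(self, record):
        payload = {"level": record.levelname, "message": record.getMessage()}
        if record.exc_info:
            payload["stack_trace"] = "".join(traceback.format_exception(*record.exc_info))
        return json.dumps(payload)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
root = logging.getLogger()
root.handlers = [handler]
root.setLevel(logging.INFO)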

Logs are not flushed until the function is spun down

In my testing, I have seen that logs from my Lambda function take ~10 minutes to appear in the Datadog Log Explorer. The reason seems to be that the logs are not flushed until the function is spun down by AWS.

I am using the Datadog Lambda Extension without the Datadog Forwarder. My function is deployed with SAM and uses the following transform to enable the Lambda Extension:

Transform:
  - AWS::Serverless-2016-10-31
  - Name: DatadogServerless
    Parameters:
      stackName: !Ref "AWS::StackName"
      apiKey: <redacted>
      site: "datadoghq.eu"
      nodeLayerVersion: 64
      extensionLayerVersion: 11
      service: <redacted>
      env: <redacted>

My Datadog environment is:

DD_API_KEY | <redacted>
DD_CAPTURE_LAMBDA_PAYLOAD | false
DD_ENHANCED_METRICS | true
DD_FLUSH_TO_LOG | false
DD_LAMBDA_HANDLER | <redacted>
DD_LOG_LEVEL | debug
DD_MERGE_XRAY_TRACES | false
DD_SERVERLESS_LOGS_ENABLED | true
DD_SITE | datadoghq.eu
DD_TRACE_ENABLED | true

Here is the full debug log from when this happens. The function was only invoked once.

datadog-debug.log

VPC Bound AWS Lambda

Hi team,

I am getting a timeout while using the Datadog extension layer in a VPC-bound Lambda. Is there any workaround to avoid the timeout?

Random logging duplication

I'm using the latest (48) version of the Datadog Lambda Extension layer with a .NET 7 AOT function to send logs to Datadog.
The issue is that I have duplicate logs with the same request_id for some requests.
CloudWatch shows the logs correctly. No Datadog Forwarder is configured to send logs to Datadog. No dd-trace layer is installed, as it is not compatible with .NET AOT.
Does anyone have an idea what the issue is?

public.ecr.aws/datadog/lambda-extension only has ARM images for versions >=11

We currently use the datadog-lambda-extension to monitor our containerized x86 Lambdas (the base image is FROM amazon/aws-lambda-python:3.8).

We have been successfully following the installation instructions to import the Lambda extension into our Docker image from public.ecr.aws/datadog/lambda-extension:

COPY --from=public.ecr.aws/datadog/lambda-extension:<TAG> /opt/extensions/ /opt/extensions

However, trying to use any lambda-extension version >= 11 currently results in the Lambda failing with the error below:
Error: fork/exec /opt/extensions/datadog-agent: exec format error Extension.LaunchError

Running docker inspect on the lambda-extension images in public.ecr.aws/datadog/lambda-extension reveals that lambda-extension versions 1-10 are amd64 (x86) while versions >=11 have changed to arm64. I'm guessing this architecture incompatibility between the extension and our Lambda is causing the error.

Is there an ECR repo where we can pull x86-compatible lambda-extension images? I'd also recommend fixing the architecture tags for the public ECR repo.
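If the newer tags are published as multi-arch manifests, one thing to try (a hedged sketch; it only helps if an amd64 variant actually exists behind the tag, and it requires BuildKit) is to pin the platform when referencing the extension image:

# Force amd64 resolution of the extension image, then copy it into the x86 base image
FROM --platform=linux/amd64 public.ecr.aws/datadog/lambda-extension:<TAG> AS datadog-extension
FROM amazon/aws-lambda-python:3.8
COPY --from=datadog-extension /opt/extensions/ /opt/extensions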

