
aws-otel-collector's Introduction


Overview

AWS Distro for OpenTelemetry Collector (ADOT Collector) is an AWS-supported version of the upstream OpenTelemetry Collector, distributed by Amazon. It supports a selected set of components from the OpenTelemetry community and is fully compatible with AWS compute platforms including EC2, ECS, and EKS. It enables users to send telemetry data to the AWS CloudWatch Metrics, Traces, and Logs backends as well as to the other supported backends.

See the AWS Distro for OpenTelemetry documentation for more information. Additionally, the ADOT Collector is now generally available for metrics.

Getting Help

Use the community resources below for getting help with the ADOT Collector.

  • Open a support ticket with AWS Support.
  • Use GitHub issues to report bugs and request features.
  • Join our GitHub Community for AWS Distro for OpenTelemetry to ask your questions, file issues, or request enhancements.
  • If you think you may have found a bug, open a bug report.
  • For contributing guidelines, refer to CONTRIBUTING.md.

Notice: ADOT Collector v0.41.0 Breaking Changes

ADOT Collector Built-in Components

The table below lists the supported components of the ADOT Collector. The highlighted components are developed in-house by AWS; the rest are the essential default components that the ADOT Collector supports.

| Receiver | Processor | Exporter | Extensions |
|---|---|---|---|
| prometheusreceiver | attributesprocessor | awsxrayexporter | healthcheckextension |
| otlpreceiver | resourceprocessor | awsemfexporter | pprofextension |
| awsecscontainermetricsreceiver | batchprocessor | prometheusremotewriteexporter | zpagesextension |
| awsxrayreceiver | memorylimiterprocessor | loggingexporter | ecsobserver |
| statsdreceiver | probabilisticsamplerprocessor | otlpexporter | awsproxy |
| zipkinreceiver | metricstransformprocessor | fileexporter | ballastextension |
| jaegerreceiver | spanprocessor | otlphttpexporter | sigv4authextension |
| awscontainerinsightreceiver | filterprocessor | prometheusexporter | filestorage |
| kafka | resourcedetectionprocessor | datadogexporter | |
| filelogreceiver | metricsgenerationprocessor | sapmexporter | |
| | cumulativetodeltaprocessor | signalfxexporter | |
| | deltatorateprocessor | logzioexporter | |
| | groupbytraceprocessor | kafka | |
| | tailsamplingprocessor | loadbalancingexporter | |
| | k8sattributesprocessor | awscloudwatchlogsexporter | |

Besides the components in the previous table, which interact directly with telemetry signals, the following confmap providers are also supported:

  • file
  • env
  • yaml
  • s3
  • http
  • https

More documentation for confmap providers can be found here.
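As a rough illustration of the env provider, a configuration file can pull individual values from environment variables; whole configuration documents can likewise be loaded through the other providers via the --config flag, using the URI formats described in the linked documentation. The snippet below is a sketch only; the variable names are examples, not defaults.

    # Sketch: values resolved by the env confmap provider at collector startup.
    # AWS_REGION and EMF_LOG_GROUP are example variable names, not defaults.
    exporters:
      awsemf:
        region: ${env:AWS_REGION}
        log_group_name: ${env:EMF_LOG_GROUP}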

Getting Started

Prerequisites

To build the ADOT Collector locally, you will need to have Golang installed. You can download and install Golang here.

ADOT Collector Configuration

The ADOT Collector ships with a default configuration and uses the same configuration syntax and design as the upstream OpenTelemetry Collector, so you can customize your configuration or port existing OpenTelemetry Collector configuration files when running the ADOT Collector. For more information on OpenTelemetry Collector configuration, please refer to the upstream documentation. See the Try out the ADOT Collector section for details on configuring the ADOT Collector.
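For orientation, the sketch below wires the OTLP receiver through the batch processor into the X-Ray and EMF exporters. It follows the upstream configuration syntax but is written for illustration only; it is not a copy of the shipped default configuration, and the region values are placeholders.

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318

    processors:
      batch:

    exporters:
      awsxray:
        region: us-west-2   # placeholder region
      awsemf:
        region: us-west-2   # placeholder region

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [awsxray]
        metrics:
          receivers: [otlp]
          processors: [batch]
          exporters: [awsemf]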

Try out the ADOT Collector

The ADOT Collector supports all AWS compute platforms as well as Docker and Kubernetes. Here are some examples of how to run the ADOT Collector to send telemetry data:
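The platform-specific walkthroughs are in the linked documentation. As a local starting point, a Docker Compose sketch along these lines can be used on a workstation; the mounted config path and the assumption that the image entrypoint accepts a --config flag mirror the docker run examples later in this page and are not authoritative defaults.

    # docker-compose.yaml -- local try-out sketch, not a supported deployment.
    version: "3"
    services:
      aws-otel-collector:
        image: amazon/aws-otel-collector:latest
        command: ["--config=/etc/otel-config.yaml"]   # assumes the entrypoint is the collector binary
        volumes:
          - ./config.yaml:/etc/otel-config.yaml       # a config like the sketch above
        environment:
          - AWS_REGION=us-west-2                      # placeholder region
          - AWS_ACCESS_KEY_ID                         # forwarded from the host shell
          - AWS_SECRET_ACCESS_KEY
        ports:
          - "4317:4317"   # OTLP gRPC
          - "4318:4318"   # OTLP HTTP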

Build Your Own Artifacts

Use the following instructions to build your own ADOT Collector artifacts:

Development

See docs/developers.

Benchmark

The latest performance report is here, and the trends by test case can be found here. Both are updated on each successful CI run. The charts use the github-action-benchmark action with a modified layout that groups the test cases. The performance test can be run by following the instructions here.

Support

Please note that, as per policy, we provide support via GitHub on a best-effort basis. However, if you have AWS Enterprise Support, you can create a ticket and we will provide direct support within the respective SLAs.

For each merged pull request, a corresponding image with the naming convention of [ADOT_COLLECTOR_VERSION]-[GITHUB_SHA] is pushed to public.ecr.aws/aws-otel-test/adot-collector-integration-test. This image is used for the integration tests. You can pull any of the images from there, however, we will not support any issues and pull requests for these test images.

Supported Versions

Each ADOT Collector release is supported until there are two newer minor releases. For example, ADOT collector v0.16.1 will be supported until v0.18.0 is released.

Security issue notifications

If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our vulnerability reporting page. Please do not create a public GitHub issue.

License

ADOT Collector is licensed under an Apache 2.0 license.

aws-otel-collector's People

Contributors

alexperez52, alolita, aneurysm9, bjrara, bryan-aguilar, dependabot[bot], erichsueh3, github-actions[bot], haojhcwa, hossain-rayhan, humivo, jefchien, johnwu20, kausik-a, khanhntd, kohrapha, mxiamxia, paurushgarg, pingleig, pxaws, rapphil, saber-w, sethamazon, shaochengwang, straussb, tigrannajaryan, vasireddy99, vastin, willarmiros, wytrivail

aws-otel-collector's Issues

Add performance numbers for New Relic exporter with real back-end

We've tested the AWS Otel Collector with New Relic's exporter reporting to New Relic's real backend.

I followed the get-performance-model instructions in the aws-otel-test-framework repo.

I've pointed the newrelic_exporter_trace_mock test suite to the New Relic staging backend at https://staging-trace-api.newrelic.com/trace/v1.
The results:

{
  "testcase": "newrelic_exporter_trace_mock",
  "instanceType": "m5.2xlarge",
  "receivers": [
    "otlp"
  ],
  "processors": [
    "batch"
  ],
  "exporters": [
    "newrelic"
  ],
  "dataType": "otlp",
  "dataMode": "trace",
  "dataRate": 100,
  "avgCpu": 4.175036859934096,
  "avgMem": 63.16352853333333,
  "commitId": "dummy_commit",
  "collectionPeriod": 10,
  "testingAmi": "soaking_linux"
}
{
  "testcase": "newrelic_exporter_trace_mock",
  "instanceType": "m5.2xlarge",
  "receivers": [
    "otlp"
  ],
  "processors": [
    "batch"
  ],
  "exporters": [
    "newrelic"
  ],
  "dataType": "otlp",
  "dataMode": "trace",
  "dataRate": 1000,
  "avgCpu": 26.070249748225024,
  "avgMem": 71.18219946666667,
  "commitId": "dummy_commit",
  "collectionPeriod": 10,
  "testingAmi": "soaking_linux"
}
{
  "testcase": "newrelic_exporter_trace_mock",
  "instanceType": "m5.2xlarge",
  "receivers": [
    "otlp"
  ],
  "processors": [
    "batch"
  ],
  "exporters": [
    "newrelic"
  ],
  "dataType": "otlp",
  "dataMode": "trace",
  "dataRate": 5000,
  "avgCpu": 105.39248211531364,
  "avgMem": 146.71653546666667,
  "commitId": "dummy_commit",
  "collectionPeriod": 10,
  "testingAmi": "soaking_linux"
}

Performance model should include throughput

Description of the problem
The performance model currently measures CPU and memory, but not requests/second.

Motivation
From my experience at Datadog, that is an important measure to take into account when comparing different solutions (e.g. single-threaded vs. multi-threaded), and it can help identify problems (e.g. mutexes causing contention).

Additional context
The implementation can be as simple as measuring the time from the beginning to the end of the test and dividing the number of requests by that duration; for example, 60,000 requests over a 600-second run works out to 100 requests/second.

Requesting Jaeger & Zipkin Receivers/Exporters

Is your feature request related to a problem? Please describe.
We would like to explore X-Ray, but our apps emit Jaeger/Zipkin spans. We are requesting that the Jaeger/Zipkin receivers and exporters be enabled.

awsxrayreceiver throws fatal errors when the input data load is intensive

I've just tested awsxrayreceiver with an intensive input data load (300 segments/s in my test); the receiver throws a fatal error and crashes the collector. I've done the same test on the OTLP receiver with an even higher data load (1000 spans/s) and everything works fine. We need to resolve this issue, otherwise we'll consider reverting awsxrayreceiver in the collector. @JohnWu20

2020-11-03T12:17:52.990-0800    WARN    [email protected]/receiver.go:131   X-Ray segment to OT traces conversion failed    {"component_kind": "receiver", "component_type": "awsxray", "component_name": "awsxray", "error": "unexpected end of JSON input"}
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsxrayreceiver.(*xrayReceiver).start
        /Users/xiami/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/[email protected]/receiver.go:131
2020-11-03 12:17:52.990843 -0800 PST m=+541.715129136 write error: can't rename log file: rename /opt/aws/aws-otel-collector/logs/aws-otel-collector.log /opt/aws/aws-otel-collector/logs/aws-otel-collector-2020-11-03T20-17-52.990.log: permission denied
panic: JSON decoder out of sync - data changing underfoot?

goroutine 49 [running]:
encoding/json.(*decodeState).value(0xc001427810, 0x0, 0x0, 0x0, 0x5, 0xf92e)
        /usr/local/go/src/encoding/json/decode.go:373 +0x234
encoding/json.(*decodeState).object(0xc001427810, 0x6851060, 0xc00084dce8, 0x199, 0xc001427838, 0x7b)
        /usr/local/go/src/encoding/json/decode.go:782 +0x12e5
encoding/json.(*decodeState).value(0xc001427810, 0x6851060, 0xc00084dce8, 0x199, 0x6851060, 0xc00084dce8)
        /usr/local/go/src/encoding/json/decode.go:387 +0x6d
encoding/json.(*decodeState).array(0xc001427810, 0x6181280, 0xc0004b0810, 0x197, 0xc001427838, 0x5b)
        /usr/local/go/src/encoding/json/decode.go:575 +0x1a7
encoding/json.(*decodeState).value(0xc001427810, 0x6181280, 0xc0004b0810, 0x197, 0x6181280, 0xc0004b0810)
        /usr/local/go/src/encoding/json/decode.go:377 +0xfd
encoding/json.(*decodeState).object(0xc001427810, 0x6344620, 0xc0004b0780, 0x16, 0xc001427838, 0xc00044dd7b)
        /usr/local/go/src/encoding/json/decode.go:782 +0x12e5
encoding/json.(*decodeState).value(0xc001427810, 0x6344620, 0xc0004b0780, 0x16, 0xc00067d9a0, 0x43892f6)
        /usr/local/go/src/encoding/json/decode.go:387 +0x6d
encoding/json.(*decodeState).unmarshal(0xc001427810, 0x6344620, 0xc0004b0780, 0xc001427838, 0x0)
        /usr/local/go/src/encoding/json/decode.go:180 +0x1f0
encoding/json.Unmarshal(0xc00085e021, 0xc49, 0xffdf, 0x6344620, 0xc0004b0780, 0x4859e77, 0x851aa80)
        /usr/local/go/src/encoding/json/decode.go:107 +0x112
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsxrayreceiver/internal/translator.ToTraces(0xc00085e021, 0xc49, 0xffdf, 0x7, 0x3, 0x0, 0x0)
        /Users/xiami/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/[email protected]/internal/translator/translator.go:39 +0x8b
github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsxrayreceiver.(*xrayReceiver).start(0xc0002c0230)
        /Users/xiami/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/[email protected]/receiver.go:129 +0x111
created by github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsxrayreceiver.(*xrayReceiver).Start.func1
        /Users/xiami/go/pkg/mod/github.com/open-telemetry/opentelemetry-collector-contrib/receiver/[email protected]/receiver.go:96 +0xd9
Process finished with exit code 2

Performance test results for Datadog exporter against real backend

Hi,

Please find the performance results for the datadog metric and trace exporter on commit b3b6b2f below. The tests were run against an internal environment following the instructions of the aws-otel-terraform repository.

Metrics

100 TPS

{
   "testcase":"datadog_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":100,
   "avgCpu":0.07166817860385208,
   "avgMem":59.631684266666674,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

1000 TPS

{
   "testcase":"datadog_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":1000,
   "avgCpu":0.0816695860714726,
   "avgMem":59.21532586666666,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

5000 TPS

{
   "testcase":"datadog_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":5000,
   "avgCpu":0.07000132449577053,
   "avgMem":59.1806464,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

Traces

100 TPS

{
   "testcase":"datadog_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":100,
   "avgCpu":4.32168270134054,
   "avgMem":70.52035413333334,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

1000 TPS

{
   "testcase":"datadog_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":1000,
   "avgCpu":24.18414959058314,
   "avgMem":95.29698986666668,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

5000 TPS

{
   "testcase":"datadog_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "datadog"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":5000,
   "avgCpu":106.21314370620614,
   "avgMem":385.15418453333336,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

StatsD Receiver and EMF Exporter Miss Metrics in CloudWatch Console and Have Empty Dimension

I am testing locally using the StatsD receiver with the EMF exporter.

Here is the OT config I use:
receivers:
  statsd:
    endpoint: "0.0.0.0:8125"
exporters:
  awsemf:
    namespace: ‘ot-receiver-test’
    region: 'us-west-2'
    log_group_name: 'aoc-emf-ecs-test-1016-new'
    log_stream_name: 'aoc-emf-ecs-test-1016-new'
  logging:
    loglevel: debug
service:
  pipelines:
    metrics:
      receivers: [statsd]
      exporters: [awsemf, logging]

These are the messages I sent:
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:1.6|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:2.8|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:12.8|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:22.8|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:42.8|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:52.8|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:62.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:72.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:52.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:82.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:32.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125
kunyuz@f8ffc23d3a03 ~ % echo "statsdTestMetric:92.9|g|#mykey:myvalue,mykey1:myvalue1" | nc -w 1 -u localhost 8125

I can see EMF logs in the log group I set (screenshot omitted); a log entry looks like this:
{
  "_aws": {
    "CloudWatchMetrics": [
      {
        "Namespace": "‘ot-receiver-test’",
        "Dimensions": [
          ["mykey", "mykey1"],
          [],
          ["mykey"],
          ["mykey1"]
        ],
        "Metrics": [
          {
            "Name": "statsdTestMetric",
            "Unit": ""
          }
        ]
      }
    ],
    "Timestamp": 1602894698208
  },
  "mykey": "myvalue",
  "mykey1": "myvalue1",
  "statsdTestMetric": 0.1431372829680947
}

I cannot see the metrics in the console (screenshot omitted; this run was with the older version):

Publish the Docker image

We can publish the Docker image for convenience and add docker run instructions to the README.

docker run -d --volume=$(pwd)/config.yaml:/etc/otel-config.yaml -p 55680:55680 -p 55681:55681 aws-observability/aws-otel-collector

EKS Collector vs Auto Instrumentation Agent port mismatch (4317 vs 55680)

I have the aws-otel-collector deployed in my EKS cluster using the amazon/aws-otel-collector:latest image and noticed it listens on port 55680 by default.

When I added the AWS Auto Instrumentation Agent per the docs at https://aws-otel.github.io/docs/getting-started/java-sdk/trace-auto-instr, the agent started up trying to connect on port 4317 (the port is documented at https://aws-otel.github.io/docs/getting-started/java-sdk/trace-auto-instr#advanced-configuration-of-the-auto-instrumentation-agent).

Which one of these is the correct port?
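For context, the port the collector listens on comes from the OTLP receiver configuration, so it can be pinned explicitly to whichever value the agent uses. A minimal sketch (4317 is the current upstream OTLP gRPC default, while 55680 was the older default):

    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317   # set this to match the port the agent exports to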

Performance test results for Splunk's SAPM exporter against real backend

Hello,

Please find the performance test result for sapm exporter. The test was run against an internal staging environment. I followed the instructions here to obtain these results.

{
   "testcase":"sapm_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "sapm"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":100,
   "avgCpu":2.4700100379242533,
   "avgMem":77.02623573333332,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}
{
   "testcase":"sapm_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "sapm"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":1000,
   "avgCpu":20.8646426006895,
   "avgMem":79.49979496296295,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}
{
   "testcase":"sapm_exporter_trace_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "sapm"
   ],
   "dataType":"otlp",
   "dataMode":"trace",
   "dataRate":5000,
   "avgCpu":97.61965412318149,
   "avgMem":88.73377185185187,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

[Datadog Exporter] Performance test anomaly

What's the issue?

In the performance test of the AWS OTel Collector, we found that the CPU usage and memory usage of the datadog testcase are above realistic levels; check the performance report.

How do we run the performance test?

We have a GitHub workflow that runs the performance test for each testcase defined in the testing framework every week.

How to reproduce the performance test?

Please check run performance test in testing framework.

What's the action?

We'd like a Datadog engineer to investigate this issue and fix it before the AWS OTel Collector GA release.

cc: @alolita @ntyrewalla @mx-psi

ECS Metrics not working on Fargate

The ECS metrics test was failing on Fargate.

The platform version is 1.4.0.

This is the testing result:

module.validator_without_sample_app[0].null_resource.validator (local-exec): validator_1  | com.amazon.aoc.exception.BaseException: metric in toBeCheckedMetricList {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]} not found in baseMetricList: [{Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.cpu.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: 
ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.reserved_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.memory.utilized_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.rx_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: 
ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.network.rate.tx_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.read_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: []}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}, {Name: ecs.service,Value: undefined}, {Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}, {Name: ecs.task-definition-version,Value: 1}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.cluster,Value: arn:aws:ecs:us-west-2:***:cluster/aoc-testing-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.service,Value: undefined}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-family,Value: taskdef-e51a5848085cf11c}]}, {Namespace: aws-otel/aws-otel-integ-test,MetricName: ecs.task.storage.write_bytes_e51a5848085cf11c,Dimensions: [{Name: ecs.task-definition-version,Value: 1}]}]

[NewRelic Exporter] Performance test anomaly

What's the issue?

In the performance test of the AWS OTel Collector, we found that the CPU usage and memory usage of the newrelic testcase are above realistic levels; check the performance report.

How do we run the performance test?

We have a GitHub workflow that runs the performance test for each testcase defined in the testing framework every week.

How to reproduce the performance test?

Please check run performance test in testing framework.

What's the action?

We'd like a New Relic engineer to investigate this issue and fix it before the AWS OTel Collector GA release.

cc: @alolita @ntyrewalla @a-feld

Docker-demo doesn't work

Hello,

https://github.com/aws-observability/aws-otel-collector/blob/main/docs/developers/docker-demo.md isn't working

Pulling ot-sample-app (mxiamxia/aws-otel-collector-java-sample-app:0.4.0)... ERROR: pull access denied for mxiamxia/aws-otel-collector-java-sample-app, repository does not exist or may require 'docker login': denied: requested access to the resource is denied

On Docker Hub I can find aottestbed/aws-otel-collector-java-sample-app; is that legit? I'm not going to leak my AWS credentials to someone with that, am I?

Performance test results for Splunk's signalfx exporter against real backend

Hi,

Please find the performance test results for signalfx metrics exporter. The test was run against an internal staging environment following the instructions here to obtain these results.

{
   "testcase":"signalfx_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "signalfx"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":100,
   "avgCpu":0.07499773424054576,
   "avgMem":61.04519111111111,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}
{
   "testcase":"signalfx_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "signalfx"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":1000,
   "avgCpu":0.08333150251368432,
   "avgMem":60.257272414814814,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}
{
   "testcase":"signalfx_exporter_metric_mock",
   "instanceType":"m5.2xlarge",
   "receivers":[
      "otlp"
   ],
   "processors":[
      "batch"
   ],
   "exporters":[
      "signalfx"
   ],
   "dataType":"otlp",
   "dataMode":"metric",
   "dataRate":5000,
   "avgCpu":0.0816654319905637,
   "avgMem":61.2560896,
   "commitId":"dummy_commit",
   "collectionPeriod":10,
   "testingAmi":"soaking_linux"
}

OTEL Client missing AwsXRayIdGenerator silently fails

When using the JavaScript OTel libraries in combination with the aws-otel-collector, it's important to override the default OTel span ID generator. While the documentation does highlight this, no errors are logged if it is misconfigured.

In troubleshooting the issue I turned log levels up to debug on both the JavaScript OTel exporter and the aws-otel-collector, and it was not obvious that an actual error was occurring; the only symptom was a lack of traces showing up in X-Ray. :-)

Ideally I would have liked to see the aws-otel-collector detect the invalid span ids and log an error.

I've included a simple example if you would like to re-create the scenario:

const api = require("@opentelemetry/api");
const { NodeTracerProvider } = require("@opentelemetry/node");
const { SimpleSpanProcessor, ConsoleSpanExporter } = require("@opentelemetry/tracing");
const { CollectorTraceExporter } = require('@opentelemetry/exporter-collector');

const { AWSXRayPropagator } = require('@aws/otel-aws-xray-propagator');
const { AwsXRayIdGenerator } = require('@aws/otel-aws-xray-id-generator');

const SERVICE_NAME = 'otel-test';
const SERVICE_VERSION = '0.1.0';
const OTEL_COLLECTOR_ENDPOINT = 'http://localhost:55681/v1/traces';

// AWS X-Ray support
const idGenerator = new AwsXRayIdGenerator();
const propagator = new AWSXRayPropagator();
api.propagation.setGlobalPropagator(propagator);

// configure tracer provider
const tracerProvider = new NodeTracerProvider({
    // IMPORTANT - Notice I've commented out the idGenerator? 
    // This will result in a silent failure!
    // idGenerator,
});

// otel exporter
const otlpExporter = new CollectorTraceExporter({
    SERVICE_NAME,
    url: OTEL_COLLECTOR_ENDPOINT,
    protocolNode: 2,
    logger: console
});
tracerProvider.addSpanProcessor(new SimpleSpanProcessor(otlpExporter));

// console exporter
const consoleExporter = new ConsoleSpanExporter();
tracerProvider.addSpanProcessor(new SimpleSpanProcessor(consoleExporter));

// register tracerProvider global reference
tracerProvider.register();



async function main() {
    const tracer = api.trace.getTracer(SERVICE_NAME, SERVICE_VERSION);

    // AWS X-Ray OTEL Collector only supports CLIENT, SERVER
    // for the root span at the moment, maybe a bug?
    const rootSpan = tracer.startSpan('main', {
        parent: null,
        kind: api.SpanKind.SERVER 
    });

    await tracer.withSpan(rootSpan, async () => {
        const childSpan = tracer.startSpan('op', { parent: tracer.getCurrentSpan() });
        await sleep(2000);
        childSpan.end();
    });

    rootSpan.end();
}

function sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
}

main().catch(console.error);

Include OTLPHTTP Exporter from upstream

Currently only the OTLP exporter is included in the distro, and it exports data over gRPC. OTLPHTTP provides another option: exporting OTLP data as JSON over HTTP.
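For reference, a minimal sketch of what enabling the upstream otlphttp exporter might look like once it is included; the endpoint is a placeholder, not a real backend:

    exporters:
      otlphttp:
        endpoint: https://collector-backend.example.com:4318   # placeholder endpoint

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [batch]
          exporters: [otlphttp]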

Fargate CFN Example Issues

I was going through the CFN Fargate example here: https://aws-otel.github.io/docs/setup/ecs

I will detail all of the issues I found:

  1. The command has --template-body file:///<CFN_File_Downloaded>, which does not work; it should be something like file://$(pwd)/<CFN_File_Downloaded>.
  2. The command tells you to set CreateIAMRoles to False, but you need to set it to True for it to work.
  3. Once you fix the above two, the sample service is created, but then it quits shortly after task launch with the following logs:
Exception in thread "main" java.lang.RuntimeException: s3region is empty
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused

It looks like the version of the sample app used is out of date, I think it should be this instead: https://hub.docker.com/r/aottestbed/aws-otel-collector-java-sample-app/tags?page=1&ordering=last_updated

ASGI request errors not propagating to root trace

It seems that the collector's AWS X-Ray exporter is mishandling the interpretation of the root trace's status for ASGI spans.

Given an instrumented FastAPI application:

import fastapi
from opentelemetry import trace
from opentelemetry.exporter.otlp.trace_exporter import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import (
    ConsoleSpanExporter,
    SimpleExportSpanProcessor,
)

from opentelemetry.sdk.extension.aws.trace import AwsXRayIdsGenerator
from opentelemetry import propagators
from opentelemetry.sdk.extension.aws.trace.propagation.aws_xray_format import AwsXRayFormat

propagators.set_global_textmap(AwsXRayFormat())

# ALT: sanity check using console span exporter
# trace.set_tracer_provider(
#     TracerProvider(ids_generator=AwsXRayIdsGenerator())
# )
# trace.get_tracer_provider().add_span_processor(
#     SimpleExportSpanProcessor(ConsoleSpanExporter())
# )

# using OTLP exporter with XRay support
otlp_exporter = OTLPSpanExporter(endpoint="localhost:55680", insecure=True)
span_processor = SimpleExportSpanProcessor(otlp_exporter)
trace.set_tracer_provider(TracerProvider(active_span_processor=span_processor, ids_generator=AwsXRayIdsGenerator()))

app = fastapi.FastAPI()

@app.get("/")
async def hello():
    return {"message": "hello world"}

FastAPIInstrumentor.instrument_app(app)

I'm running AWS OTel Collector version v0.6.0 in a Docker container locally and then invoking the application this way:

uvicorn fastapi_example_instrumented_xray:app --reload

Using the console span exporter, I see three spans for an HTTP 404: two have status UNSET and the third is ERROR:

{
    "name": "GET asgi.http.send",
    "context": {
        "trace_id": "0x600a0aa3734259d834dd1315e74cf9ea",
        "span_id": "0x6d02afce8b99ef53",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x7ab7d27035511c5b",
    "start_time": "2021-01-21T23:13:39.569816Z",
    "end_time": "2021-01-21T23:13:39.570079Z",
    "status": {
        "status_code": "ERROR"
    },
    "attributes": {
        "http.status_code": 404,
        "type": "http.response.start"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "0.17b0"
    }
}
{
    "name": "GET asgi.http.send",
    "context": {
        "trace_id": "0x600a0aa3734259d834dd1315e74cf9ea",
        "span_id": "0xb7f5be18d868f596",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": "0x7ab7d27035511c5b",
    "start_time": "2021-01-21T23:13:39.570549Z",
    "end_time": "2021-01-21T23:13:39.570620Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "type": "http.response.body"
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "0.17b0"
    }
}
{
    "name": "GET asgi",
    "context": {
        "trace_id": "0x600a0aa3734259d834dd1315e74cf9ea",
        "span_id": "0x7ab7d27035511c5b",
        "trace_state": "[]"
    },
    "kind": "SpanKind.SERVER",
    "parent_id": null,
    "start_time": "2021-01-21T23:13:39.569437Z",
    "end_time": "2021-01-21T23:13:39.570793Z",
    "status": {
        "status_code": "UNSET"
    },
    "attributes": {
        "component": "http",
        "http.scheme": "http",
        "http.host": "127.0.0.1:8000",
        "net.host.port": 8000,
        "http.flavor": "1.1",
        "http.target": "/invalid",
        "http.url": "http://127.0.0.1:8000/invalid",
        "http.method": "GET",
        "http.server_name": "localhost:8000",
        "http.user_agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0",
        "net.peer.ip": "127.0.0.1",
        "net.peer.port": 60934
    },
    "events": [],
    "links": [],
    "resource": {
        "telemetry.sdk.language": "python",
        "telemetry.sdk.name": "opentelemetry",
        "telemetry.sdk.version": "0.17b0"
    }
}

But using the collector with the X-Ray exporter, X-Ray shows the trace as a 200/OK result with fault and error set to "false":

{
    "Id": "1-600a088a-109dbb153757333fdada5f70",
    "Duration": 0.106,
    "LimitExceeded": false,
    "Segments": [
        {
            "Id": "f405f8dc0560a892",
            "Document": {
                "id": "f405f8dc0560a892",
                "name": "127.0.0.1:8000",
                "start_time": 1611270282.307393,
                "trace_id": "1-600a088a-109dbb153757333fdada5f70",
                "end_time": 1611270282.4130836,
                "fault": false,
                "error": false,
                "http": {
                    "request": {
                        "url": "http://127.0.0.1:8000/invalid",
                        "method": "GET",
                        "user_agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0",
                        "client_ip": "127.0.0.1"
                    },
                    "response": {
                        "status": 200,
                        "content_length": 0
                    }
                },
                "aws": {
                    "xray": {
                        "auto_instrumentation": false,
                        "sdk_version": "0.17b0",
                        "sdk": "opentelemetry for python"
                    }
                },
                "metadata": {
                    "default": {
                        "otel.resource.telemetry.sdk.name": "opentelemetry",
                        "http.flavor": "1.1",
                        "net.host.port": "",
                        "otel.resource.telemetry.sdk.language": "python",
                        "otel.resource.telemetry.sdk.version": "0.17b0"
                    }
                },
                "subsegments": [
                    {
                        "id": "b1bab769fcb3a95f",
                        "name": "GET asgi.http.send",
                        "start_time": 1611270282.307856,
                        "end_time": 1611270282.3082073,
                        "fault": false,
                        "error": true,
                        "http": {
                            "request": {},
                            "response": {
                                "status": 404,
                                "content_length": 0
                            }
                        },
                        "aws": {
                            "xray": {
                                "auto_instrumentation": false,
                                "sdk_version": "0.17b0",
                                "sdk": "opentelemetry for python"
                            }
                        },
                        "metadata": {
                            "default": {
                                "type": "http.response.start"
                            }
                        }
                    },
                    {
                        "id": "ba50a1de9e723e58",
                        "name": "GET asgi.http.send",
                        "start_time": 1611270282.3615057,
                        "end_time": 1611270282.3615806,
                        "fault": false,
                        "error": false,
                        "aws": {
                            "xray": {
                                "auto_instrumentation": false,
                                "sdk_version": "0.17b0",
                                "sdk": "opentelemetry for python"
                            }
                        },
                        "metadata": {
                            "default": {
                                "type": "http.response.body"
                            }
                        }
                    }
                ]
            }
        }
    ]
}

So it seems to me that the X-Ray exporter is not correctly interpreting the trace status based on the 3 ASGI spans.

XRay exporter is failing the exports to the collector

When exporting spans to the collector via aws-opentelemetry-agent-0.9.0.jar, the user sees the following error. Switching to Jaeger export on the collector resolves the problem.

WARN io.opentelemetry.exporters.otlp.OtlpGrpcSpanExporter - Failed to export spans
io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 952949744ns. [buffered_nanos=957880184, waiting_for_connection]

ecs-demo.md references missing CloudFormation templates

https://github.com/aws-observability/aws-otel-collector/blob/main/docs/developers/ecs-demo.md

Under Install AWSOTelCollector on ECS EC2 it references the curl commands below, but the .template files just result in a 404.

curl -O https://github.com/mxiamxia/aws-opentelemetry-collector/blob/master/deployment-template/ecs/aws-otel-ec2-sidecar-deployment-cfn.template

curl -O https://github.com/mxiamxia/aws-opentelemetry-collector/blob/master/deployment-template/ecs/aws-otel-fargate-sidecar-deployment-cfn.template

Add setup instructions for the AWS collector on Elastic Beanstalk

We currently provide step-by-step instructions for running the collector on ECS, EKS, and EC2. We should also provide documentation and a sample YAML file for running the collector on Elastic Beanstalk (a rough sketch is included after the caveats below). The docs linked below describe how to run custom software on the instances of a Beanstalk environment using Linux or Windows, and the structure of the YAML file we'd provide to customers.

https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html#linux-commands
https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-windows-ec2.html#windows-commands

A couple of caveats:

  1. customers do need to enable the X-Ray Daemon in their environment for the EB resource detector to collect and record metadata (except in the docker platforms where it's not available).
  2. Beanstalk supports a multi-container docker platform that uses ECS under the hood, but is configured with a dockerrun.aws.json file instead of a task definition (madness I know) so we should also include a sample for that too.
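As mentioned above, a rough .ebextensions sketch follows; the download URL, install path, and ctl script invocation are assumptions for illustration, not published artifact locations or verified commands.

    # .ebextensions/aws-otel-collector.config -- illustrative sketch only
    files:
      "/tmp/otel-config.yaml":
        mode: "000644"
        owner: root
        group: root
        content: |
          # collector pipeline configuration goes here
    commands:
      01_download_collector:
        command: curl -Lo /tmp/aws-otel-collector.rpm https://example.com/aws-otel-collector.rpm   # placeholder URL
      02_install_collector:
        command: rpm -Uvh /tmp/aws-otel-collector.rpm
      03_start_collector:
        command: /opt/aws/aws-otel-collector/bin/aws-otel-collector-ctl -c /tmp/otel-config.yaml -a start   # assumed ctl path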

Local dev environment support

$ docker run amazon/aws-otel-collector

fails with the following error, "Unable to retrieve the region from the EC2 instance"; we should relax some of these requirements to support a local dev environment.

Unable to find image 'amazon/aws-otel-collector:latest' locally
latest: Pulling from amazon/aws-otel-collector
6d2849115956: Pull complete
cb2a5307368e: Pull complete
8d5348a2246c: Pull complete
Digest: sha256:d6b798ee33476aa3d23bfafed57a1094b291328e30887cdf4d429c6dccc720a0
Status: Downloaded newer image for amazon/aws-otel-collector:latest
AWS OTel Collector version: v0.4.0
2020-12-11T20:13:49.749Z	INFO	service/service.go:397	Starting AWS OTel Collector...	{"Version": "v0.4.0", "GitHash": "be64e63fbd972170e024cbf10d41b7fad0e94394", "NumCPU": 4}
2020-12-11T20:13:49.749Z	INFO	service/service.go:241	Setting up own telemetry...
2020-12-11T20:13:49.752Z	INFO	service/telemetry.go:101	Serving Prometheus metrics	{"address": "localhost:8888", "level": 0, "service.instance.id": "31ef0e14-ae9f-4627-827e-d4c33f4e56a8"}
2020-12-11T20:13:49.752Z	INFO	service/service.go:278	Loading configuration...
2020-12-11T20:13:49.755Z	INFO	service/service.go:289	Applying configuration...
2020-12-11T20:13:49.756Z	INFO	service/service.go:310	Starting extensions...
2020-12-11T20:13:49.756Z	INFO	builder/extensions_builder.go:53	Extension is starting...	{"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-12-11T20:13:49.756Z	INFO	healthcheckextension/healthcheckextension.go:40	Starting health_check extension	{"component_kind": "extension", "component_type": "health_check", "component_name": "health_check", "config": {"TypeVal":"health_check","NameVal":"health_check","Port":13133}}
2020-12-11T20:13:49.756Z	INFO	builder/extensions_builder.go:59	Extension started.	{"component_kind": "extension", "component_type": "health_check", "component_name": "health_check"}
2020-12-11T20:13:50.106Z	ERROR	[email protected]/conn.go:71	Unable to retrieve the region from the EC2 instance	{"component_kind": "exporter", "component_type": "awsemf", "component_name": "awsemf", "error": "EC2MetadataRequestError: failed to get EC2 instance identity document\ncaused by: EC2MetadataError: failed to make EC2Metadata request\n\tstatus code: 403, request id: \ncaused by: "}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.GetAWSConfigSession
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/conn.go:71
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.New
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/emf_exporter.go:61
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.NewEmfExporter
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/emf_exporter.go:101
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.createMetricsExporter
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/factory.go:68
go.opentelemetry.io/collector/exporter/exporterhelper.(*factory).CreateMetricsExporter
	go.opentelemetry.io/[email protected]/exporter/exporterhelper/factory.go:127
go.opentelemetry.io/collector/service/builder.(*ExportersBuilder).buildExporter
	go.opentelemetry.io/[email protected]/service/builder/exporters_builder.go:266
go.opentelemetry.io/collector/service/builder.(*ExportersBuilder).Build
	go.opentelemetry.io/[email protected]/service/builder/exporters_builder.go:172
go.opentelemetry.io/collector/service.(*Application).setupPipelines
	go.opentelemetry.io/[email protected]/service/service.go:320
go.opentelemetry.io/collector/service.(*Application).setupConfigurationComponents
	go.opentelemetry.io/[email protected]/service/service.go:296
go.opentelemetry.io/collector/service.(*Application).execute
	go.opentelemetry.io/[email protected]/service/service.go:415
go.opentelemetry.io/collector/service.New.func1
	go.opentelemetry.io/[email protected]/service/service.go:154
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:850
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:958
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:895
go.opentelemetry.io/collector/service.(*Application).Run
	go.opentelemetry.io/[email protected]/service/service.go:468
main.runInteractive
	aws-observability.io/collector/cmd/awscollector/main.go:75
main.run
	aws-observability.io/collector/cmd/awscollector/main_others.go:24
main.main
	aws-observability.io/collector/cmd/awscollector/main.go:59
runtime.main
	runtime/proc.go:204
2020-12-11T20:13:50.107Z	ERROR	[email protected]/conn.go:79	Cannot fetch region variable from config file, environment variables and ec2 metadata.	{"component_kind": "exporter", "component_type": "awsemf", "component_name": "awsemf"}
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.GetAWSConfigSession
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/conn.go:79
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.New
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/emf_exporter.go:61
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.NewEmfExporter
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/emf_exporter.go:101
github.com/open-telemetry/opentelemetry-collector-contrib/exporter/awsemfexporter.createMetricsExporter
	github.com/open-telemetry/opentelemetry-collector-contrib/exporter/[email protected]/factory.go:68
go.opentelemetry.io/collector/exporter/exporterhelper.(*factory).CreateMetricsExporter
	go.opentelemetry.io/[email protected]/exporter/exporterhelper/factory.go:127
go.opentelemetry.io/collector/service/builder.(*ExportersBuilder).buildExporter
	go.opentelemetry.io/[email protected]/service/builder/exporters_builder.go:266
go.opentelemetry.io/collector/service/builder.(*ExportersBuilder).Build
	go.opentelemetry.io/[email protected]/service/builder/exporters_builder.go:172
go.opentelemetry.io/collector/service.(*Application).setupPipelines
	go.opentelemetry.io/[email protected]/service/service.go:320
go.opentelemetry.io/collector/service.(*Application).setupConfigurationComponents
	go.opentelemetry.io/[email protected]/service/service.go:296
go.opentelemetry.io/collector/service.(*Application).execute
	go.opentelemetry.io/[email protected]/service/service.go:415
go.opentelemetry.io/collector/service.New.func1
	go.opentelemetry.io/[email protected]/service/service.go:154
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/[email protected]/command.go:850
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/[email protected]/command.go:958
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/[email protected]/command.go:895
go.opentelemetry.io/collector/service.(*Application).Run
	go.opentelemetry.io/[email protected]/service/service.go:468
main.runInteractive
	aws-observability.io/collector/cmd/awscollector/main.go:75
main.run
	aws-observability.io/collector/cmd/awscollector/main_others.go:24
main.main
	aws-observability.io/collector/cmd/awscollector/main.go:59
runtime.main
	runtime/proc.go:204
Error: cannot setup pipelines: cannot build builtExporters: error creating awsemf exporter: NoAwsRegion: Cannot fetch region variable from config file, environment variables and ec2 metadata.
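The error above lists the places the region can come from (config file, environment variables, EC2 metadata). Until the requirement is relaxed, one workaround for local runs is to set it explicitly, for example in the exporter configuration (sketch with a placeholder region) or via an AWS_REGION environment variable passed to docker run:

    exporters:
      awsemf:
        region: us-west-2   # placeholder; any valid region avoids the EC2 metadata lookup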

EC2 instance type

Is there a minimum EC2 instance type the package will run on? I'm using a t2.micro to test it out and the RPM instructions are failing.

Docker example fails with "bucketName is empty" and "DEADLINE_EXCEEDED"

I'm trying to get otel working. I was following the instructions here:

https://aws-otel.github.io/docs/setup/docker-images

After running into a number of weird issues trying to instrument my own app & generate traces, I tried to simplify the problem by just running the example emitter so I could work backwards. Unfortunately, the instructions do not work for me.

I filled out the following env vars in the docker compose:

      - AWS_ACCESS_KEY_ID
      - AWS_SECRET_ACCESS_KEY
      - AWS_SESSION_TOKEN
      - AWS_REGION

the first three with output from the awsapps SSO page, and the last as us-east-2. When running, I hit two interesting issues. First, it attempts to connect to localhost:4567 for some reason that's not explained:

aws-ot-collector_1   | 2020-11-07T00:05:09.996Z INFO    service/service.go:252  Everything is ready. Begin running and processing data.
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed to connect to localhost port 4567: Connection refused
ot-metric-emitter_1  | [opentelemetry.auto.trace 2020-11-07 00:05:20:224 +0000] [main] INFO io.opentelemetry.javaagent.tooling.TracerInstaller - Installed span exporter: io.opentelemetry.exporters.otlp.OtlpGrpcSpanExporter

Don't know what that's about.

Anyway, it then hits the following two issues repeatedly:

ot-metric-emitter_1  | [opentelemetry.auto.trace 2020-11-07 00:05:22:278 +0000] [grpc-default-executor-0] WARN io.opentelemetry.exporters.otlp.OtlpGrpcMetricExporter - Failed to export metrics
ot-metric-emitter_1  | io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 974524962ns. [buffered_nanos=976275332, waiting_for_connection]
ot-metric-emitter_1  |  at io.grpc.Status.asRuntimeException(Status.java:533)
ot-metric-emitter_1  |  at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:515)
ot-metric-emitter_1  |  at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
ot-metric-emitter_1  |  at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
ot-metric-emitter_1  |  at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
ot-metric-emitter_1  |  at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
ot-metric-emitter_1  |  at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
ot-metric-emitter_1  |  at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
ot-metric-emitter_1  |  at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
ot-metric-emitter_1  |  at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:510)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:630)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:518)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:692)
ot-metric-emitter_1  |  at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:681)
ot-metric-emitter_1  |  at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
ot-metric-emitter_1  |  at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
ot-metric-emitter_1  |  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
ot-metric-emitter_1  |  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
ot-metric-emitter_1  |  at java.lang.Thread.run(Thread.java:748)

and also:

ot-metric-emitter_1  |   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
ot-metric-emitter_1  |                                  Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0Executing request GET http://localhost:4567/span1 HTTP/1.1
ot-metric-emitter_1  | Executing request GET http://localhost:4567/span2 HTTP/1.1
ot-metric-emitter_1  | emit metric with return time 1,/span2,200
ot-metric-emitter_1  | emit metric with http request size 926 byte, /span2
ot-metric-emitter_1  | ----------------------------------------
ot-metric-emitter_1  | 9472e6246b032793
ot-metric-emitter_1  | emit metric with return time 35,/span1,200
ot-metric-emitter_1  | emit metric with http request size 198 byte, /span1
ot-metric-emitter_1  | ----------------------------------------
ot-metric-emitter_1  | d01381d7026198e5,9472e6246b032793
ot-metric-emitter_1  | java.lang.RuntimeException: bucketName is empty
ot-metric-emitter_1  |  at com.amazon.aocagent.S3Service.uploadTraceData(S3Service.java:36)
ot-metric-emitter_1  |  at com.amazon.aocagent.App.lambda$main$0(App.java:81)
ot-metric-emitter_1  |  at spark.ResponseTransformerRouteImpl$1.handle(ResponseTransformerRouteImpl.java:47)
ot-metric-emitter_1  |  at spark.http.matching.Routes.execute(Routes.java:61)
ot-metric-emitter_1  |  at spark.http.matching.MatcherFilter.doFilter(MatcherFilter.java:134)
ot-metric-emitter_1  |  at spark.embeddedserver.jetty.JettyHandler.doHandle(JettyHandler.java:50)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1584)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.Server.handle(Server.java:501)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:556)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
ot-metric-emitter_1  |  at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:272)
ot-metric-emitter_1  |  at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
ot-metric-emitter_1  |  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
ot-metric-emitter_1  |  at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
ot-metric-emitter_1  |  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
ot-metric-emitter_1  |  at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
ot-metric-emitter_1  |  at java.lang.Thread.run(Thread.java:748)

Was there supposed to be a third image included in the docker-compose or something? I'm very confused. I'd have looked at the sample app source code, but I can't find it. Is it public?

UNKNOWN: NoCredentialProviders if AWS_PROFILE is specified

Hello,

I tried to set up the collector locally with the following configuration:

version: "3"
services:
  aws-ot-collector:
    image: amazon/aws-otel-collector:latest
    command: ["--config=/etc/otel-agent-config.yaml", "--log-level=DEBUG"]
    environment:
      - AWS_PROFILE=p3
      - AWS_REGION=eu-west-1
    volumes:
      - ./config-test.yaml:/etc/otel-agent-config.yaml
      - ~/.aws:/root/.aws
    ports:
      - "55680:55680" # OTLP receiver

But when submitting traces, I get:

{"stack":"Error: 2 UNKNOWN: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n at Object.exports.createStatusError

But if, instead of using AWS_PROFILE, I add AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, then it works; in that case my docker-compose.yaml looks like:

version: "3"
services:
  aws-ot-collector:
    image: amazon/aws-otel-collector:latest
    command: ["--config=/etc/otel-agent-config.yaml", "--log-level=DEBUG"]
    environment:
      - AWS_REGION=eu-west-1
      - AWS_ACCESS_KEY_ID=key
      - AWS_SECRET_ACCESS_KEY=secret
    volumes:
      - ./config-test.yaml:/etc/otel-agent-config.yaml
      - ~/.aws:/root/.aws
    ports:
      - "55680:55680" # OTLP receiver

My credentials file in ~/.aws/credentials looks like

[p3]
aws_access_key_id=key
aws_secret_access_key=secret

And my tracer.js looks like:

const { NodeTracerProvider } = require('@opentelemetry/node');
const { ConsoleSpanExporter, SimpleSpanProcessor } = require('@opentelemetry/tracing');
// const { CollectorTraceExporter } =  require('@opentelemetry/exporter-collector');
const { CollectorTraceExporter } = require('@opentelemetry/exporter-collector-grpc');
const { propagation } = require("@opentelemetry/api");

const { AwsXRayIdGenerator } = require('@aws/otel-aws-xray-id-generator');
const { AWSXRayPropagator } = require('@aws/otel-aws-xray-propagator');

propagation.setGlobalPropagator(new AWSXRayPropagator());

const provider = new NodeTracerProvider({
    idGenerator: new AwsXRayIdGenerator()
});

const otlpExporter = new CollectorTraceExporter({
    serviceName: 'service-hello',
    url: "localhost:55680"
});
const otlpProcessor = new SimpleSpanProcessor(otlpExporter);
provider.addSpanProcessor(otlpProcessor);

provider.register();

So I'm not really sure why this is not working.
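
For what it's worth, a hedged variant worth trying (not an official answer): point the SDK at the mounted credentials file explicitly. This assumes the standard AWS_SHARED_CREDENTIALS_FILE environment variable honored by the AWS SDKs, and that the file is readable at the mounted path inside the container; the profile name p3 comes from the credentials file above.

version: "3"
services:
  aws-ot-collector:
    image: amazon/aws-otel-collector:latest
    command: ["--config=/etc/otel-agent-config.yaml", "--log-level=DEBUG"]
    environment:
      - AWS_REGION=eu-west-1
      - AWS_PROFILE=p3
      # Point the SDK at the mounted credentials file explicitly
      - AWS_SHARED_CREDENTIALS_FILE=/root/.aws/credentials
    volumes:
      - ./config-test.yaml:/etc/otel-agent-config.yaml
      - ~/.aws:/root/.aws
    ports:
      - "55680:55680" # OTLP receiver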

AOC stops for long running ECS task

What's wrong
When I run AOC on ECS using the example task definition, it suddenly stops after running for a long time (6+ hours).

What I am expecting
The task should run forever.

What might be the issue
We have a sample application that emits trace data. This container emits data for about 6 hours and then stops. Because it is marked as an essential container, it forces the task to stop.

What might be possible solutions
I can think of two possible solutions:

  1. Mark the sample app container as non-essential, so that even if it stops, it won't kill the task (a sketch of this follows the list).
  2. Update the sample app container to stay alive even after it stops emitting trace data, and keep it as an essential container.
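
A hedged sketch of option 1, written as a CloudFormation container-definitions excerpt rather than raw task-definition JSON purely for illustration; the container names and images below are placeholders, not the ones from the example task definition.

# AWS::ECS::TaskDefinition (excerpt)
ContainerDefinitions:
  - Name: aws-otel-collector            # placeholder name
    Image: amazon/aws-otel-collector:latest
    Essential: true                     # the collector keeps the task alive
  - Name: sample-trace-emitter          # placeholder name/image
    Image: sample-app:latest
    Essential: false                    # option 1: if the emitter exits, the task keeps running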

Benchmarks - machine configuration

The README has some benchmarks that are useful for understanding the resources consumed by the collector; however, there doesn't seem to be any instance size listed. Could you confirm whether these are based on an m4.2xlarge EC2 instance, as defined here?

Thanks

Segfault on collector startup with awsecscontainermetrics receiver

I am getting the following segfault when starting the amazon/aws-otel-collector:latest container with AWS OTel Collector version v0.4.0:

1606326170999,AWS OTel Collector version: v0.4.0
1606326170999,"2020-11-25T17:42:50.999Z    INFO    service/service.go:397  Starting AWS OTel Collector...  {""Version"": ""v0.4.0"", ""GitHash"": ""be64e63fbd972170e024cbf10d41b7fad0e94394"", ""NumCPU"": 2}"
1606326170999,2020-11-25T17:42:50.999Z  INFO    service/service.go:241  Setting up own telemetry...
1606326171000,"2020-11-25T17:42:51.000Z    INFO    service/telemetry.go:101    Serving Prometheus metrics  {""address"": ""localhost:8888"", ""level"": 0, ""service.instance.id"": ""88ab22d1-3030-4197-9ae7-c2264727402b""}"
1606326171001,2020-11-25T17:42:51.000Z  INFO    service/service.go:278  Loading configuration...
1606326171011,2020-11-25T17:42:51.011Z  INFO    service/service.go:289  Applying configuration...
1606326171011,2020-11-25T17:42:51.011Z  INFO    service/service.go:310  Starting extensions...
1606326171011,"2020-11-25T17:42:51.011Z    INFO    builder/extensions_builder.go:53    Extension is starting...    {""component_kind"": ""extension"", ""component_type"": ""health_check"", ""component_name"": ""health_check""}"
1606326171011,"2020-11-25T17:42:51.011Z    INFO    healthcheckextension/healthcheckextension.go:40 Starting health_check extension {""component_kind"": ""extension"", ""component_type"": ""health_check"", ""component_name"": ""health_check"", ""config"": {""TypeVal"":""health_check"",""NameVal"":""health_check"",""Port"":13133}}"
1606326171011,"2020-11-25T17:42:51.011Z    INFO    builder/extensions_builder.go:59    Extension started.  {""component_kind"": ""extension"", ""component_type"": ""health_check"", ""component_name"": ""health_check""}"
1606326171011,"2020-11-25T17:42:51.011Z    DEBUG   [email protected]/conn.go:59    Using proxy address:    {""component_kind"": ""exporter"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""proxyAddr"": """"}"
1606326171011,"2020-11-25T17:42:51.011Z    DEBUG   [email protected]/conn.go:125   Fetch region from environment variables {""component_kind"": ""exporter"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""region"": ""us-west-2""}"
1606326171012,"2020-11-25T17:42:51.011Z    DEBUG   [email protected]/xray_client.go:56 Using Endpoint: %s  {""component_kind"": ""exporter"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""endpoint"": ""https://xray.us-west-2.amazonaws.com""}"
1606326171012,"2020-11-25T17:42:51.011Z    INFO    builder/exporters_builder.go:306    Exporter is enabled.    {""component_kind"": ""exporter"", ""exporter"": ""awsxray""}"
1606326171012,"2020-11-25T17:42:51.012Z    DEBUG   [email protected]/conn.go:59 Fetch region from environment variables {""component_kind"": ""exporter"", ""component_type"": ""awsemf"", ""component_name"": ""awsemf"", ""region"": ""us-west-2""}"
1606326171012,"2020-11-25T17:42:51.012Z    INFO    builder/exporters_builder.go:306    Exporter is enabled.    {""component_kind"": ""exporter"", ""exporter"": ""awsemf""}"
1606326171012,2020-11-25T17:42:51.012Z  INFO    service/service.go:325  Starting exporters...
1606326171012,"2020-11-25T17:42:51.012Z    INFO    builder/exporters_builder.go:92 Exporter is starting... {""component_kind"": ""exporter"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray""}"
1606326171012,"2020-11-25T17:42:51.012Z    INFO    builder/exporters_builder.go:97 Exporter started.   {""component_kind"": ""exporter"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/exporters_builder.go:92 Exporter is starting... {""component_kind"": ""exporter"", ""component_type"": ""awsemf"", ""component_name"": ""awsemf""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/exporters_builder.go:97 Exporter started.   {""component_kind"": ""exporter"", ""component_type"": ""awsemf"", ""component_name"": ""awsemf""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/pipelines_builder.go:207    Pipeline is enabled.    {""pipeline_name"": ""traces"", ""pipeline_datatype"": ""traces""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/pipelines_builder.go:207    Pipeline is enabled.    {""pipeline_name"": ""metrics"", ""pipeline_datatype"": ""metrics""}"
1606326171014,2020-11-25T17:42:51.012Z  INFO    service/service.go:338  Starting processors...
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/pipelines_builder.go:51 Pipeline is starting... {""pipeline_name"": ""traces"", ""pipeline_datatype"": ""traces""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/pipelines_builder.go:61 Pipeline is started.    {""pipeline_name"": ""traces"", ""pipeline_datatype"": ""traces""}"
1606326171014,"2020-11-25T17:42:51.012Z    INFO    builder/pipelines_builder.go:51 Pipeline is starting... {""pipeline_name"": ""metrics"", ""pipeline_datatype"": ""metrics""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    builder/pipelines_builder.go:61 Pipeline is started.    {""pipeline_name"": ""metrics"", ""pipeline_datatype"": ""metrics""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    builder/receivers_builder.go:235    Receiver is enabled.    {""component_kind"": ""receiver"", ""component_type"": ""awsecscontainermetrics"", ""component_name"": ""awsecscontainermetrics"", ""datatype"": ""metrics""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    builder/receivers_builder.go:235    Receiver is enabled.    {""component_kind"": ""receiver"", ""component_type"": ""otlp"", ""component_name"": ""otlp"", ""datatype"": ""traces""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    builder/receivers_builder.go:235    Receiver is enabled.    {""component_kind"": ""receiver"", ""component_type"": ""otlp"", ""component_name"": ""otlp"", ""datatype"": ""metrics""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    [email protected]/receiver.go:61    Going to listen on endpoint for X-Ray segments  {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""udp"": ""0.0.0.0:2000""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    udppoller/poller.go:105 Listening on endpoint for X-Ray segments    {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""udp"": ""0.0.0.0:2000""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    [email protected]/receiver.go:73    Listening on endpoint for X-Ray segments    {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""udp"": ""0.0.0.0:2000""}"
1606326171014,"2020-11-25T17:42:51.013Z    DEBUG   proxy/conn.go:98    Fetched region from environment variables   {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""region"": ""us-west-2""}"
1606326171014,"2020-11-25T17:42:51.013Z    INFO    builder/receivers_builder.go:235    Receiver is enabled.    {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray"", ""datatype"": ""traces""}"
1606326171014,2020-11-25T17:42:51.013Z  INFO    service/service.go:350  Starting receivers...
1606326171016,"2020-11-25T17:42:51.014Z    INFO    builder/receivers_builder.go:70 Receiver is starting... {""component_kind"": ""receiver"", ""component_type"": ""awsecscontainermetrics"", ""component_name"": ""awsecscontainermetrics""}"
1606326171016,"2020-11-25T17:42:51.014Z    INFO    builder/receivers_builder.go:75 Receiver started.   {""component_kind"": ""receiver"", ""component_type"": ""awsecscontainermetrics"", ""component_name"": ""awsecscontainermetrics""}"
1606326171016,"2020-11-25T17:42:51.014Z    INFO    builder/receivers_builder.go:70 Receiver is starting... {""component_kind"": ""receiver"", ""component_type"": ""otlp"", ""component_name"": ""otlp""}"
1606326171046,"2020-11-25T17:42:51.045Z    INFO    builder/receivers_builder.go:75 Receiver started.   {""component_kind"": ""receiver"", ""component_type"": ""otlp"", ""component_name"": ""otlp""}"
1606326171046,"2020-11-25T17:42:51.045Z    INFO    builder/receivers_builder.go:70 Receiver is starting... {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray""}"
1606326171046,"2020-11-25T17:42:51.045Z    INFO    [email protected]/receiver.go:98    X-Ray TCP proxy server started  {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray""}"
1606326171046,"2020-11-25T17:42:51.045Z    INFO    builder/receivers_builder.go:75 Receiver started.   {""component_kind"": ""receiver"", ""component_type"": ""awsxray"", ""component_name"": ""awsxray""}"
1606326171046,"2020-11-25T17:42:51.045Z    INFO    healthcheck/handler.go:128  Health Check state change   {""component_kind"": ""extension"", ""component_type"": ""health_check"", ""component_name"": ""health_check"", ""status"": ""ready""}"
1606326171046,2020-11-25T17:42:51.045Z  INFO    service/service.go:253  Everything is ready. Begin running and processing data.
1606326191021,panic: runtime error: invalid memory address or nil pointer dereference
1606326191021,[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1160c3c]
1606326191021,goroutine 98 [running]:
1606326191021,"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver/awsecscontainermetrics.getContainerMetrics(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)"
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/awsecscontainermetrics/metrics_helper.go:19 +0x5c
1606326191021,"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver/awsecscontainermetrics.(*metricDataAccumulator).getMetricsData(0xc000799d48, 0xc000532630, 0xc00053a380, 0x6a, 0xc0002ea090, 0x88, 0xc00057c360, 0x25, 0xc0001574c0, 0x2, ...)"
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/awsecscontainermetrics/accumulator.go:37 +0x1c5
1606326191021,github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver/awsecscontainermetrics.MetricsData(...)
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/awsecscontainermetrics/metrics.go:24
1606326191021,"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver.(*awsEcsContainerMetricsReceiver).collectDataFromEndpoint(0xc0007a8fc0, 0x33012a0, 0xc000382600, 0x2ed133c, 0x16, 0x1, 0x2000100000000)"
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/receiver.go:97 +0x3e5
1606326191021,"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver.(*awsEcsContainerMetricsReceiver).Start.func1(0xc0007a8fc0, 0x33012a0, 0xc000382600)"
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/receiver.go:71 +0xc5
1606326191021,created by github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver.(*awsEcsContainerMetricsReceiver).Start
1606326191021,  github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsecscontainermetricsreceiver@v0.14.1-0.20201118171217-2398a00e4656/receiver.go:64 +0xcd

Here is the config I'm using for running the container in AWS Fargate:

extensions:
  health_check:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:55680
      http:
        endpoint: localhost:55681
  awsxray:
    endpoint: 0.0.0.0:2000
    transport: udp
  awsecscontainermetrics:
processors:
  batch/traces:
    timeout: 1s
    send_batch_size: 50
  batch/metrics:
    timeout: 60s
exporters:
  awsxray:
  awsemf:
service:
  pipelines:
    traces:
      receivers: [otlp,awsxray]
      processors: [batch/traces]
      exporters: [awsxray]
    metrics:
      receivers: [otlp, awsecscontainermetrics]
      processors: [batch/metrics]
      exporters: [awsemf]
  extensions: [health_check]

Am I missing something or messing up the config in some way?

Strip down aws-otel-collector binary for AWS Lambda layer

Description:

This is a feature request. AWS Lambda has a hard limit: the total unzipped size of the function and all of its layers can't exceed the 250 MB unzipped deployment package size limit. Besides that, the memory usage of a Go program in Lambda is proportional to its binary size, so putting aws-otel-collector into Lambda directly causes high maximum memory usage. The current aws-otel-collector is therefore not fit to be packaged into a Lambda layer. We have to strip down aws-otel-collector as much as possible and create a minimized aws-otel-collector binary for Lambda.

Solution:

Build a Lambda version of the aws-otel-collector (a minimal configuration sketch of the resulting component set follows the list), which needs to:

  1. Remove all receiver plugins except the otlp receiver.
  2. Remove the file exporter and the otlp exporter.
  3. Remove all processors.
  4. Keep health_check.
  5. Stop writing the output log to /opt.
  6. Provide a public download URL for the binary.
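
As noted above, here is a minimal configuration sketch of the pipeline such a stripped-down build would still serve. It is not a recipe for trimming the binary itself, and it assumes the AWS exporters (awsxray, awsemf) remain in the build; adjust to whatever exporters the Lambda layer actually keeps.

extensions:
  health_check:              # kept per item 4
receivers:
  otlp:                      # the only receiver kept per item 1
    protocols:
      grpc:
exporters:
  awsxray:                   # assumed to remain after removing the file and otlp exporters
  awsemf:
service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [awsxray]   # no processors, per item 3
    metrics:
      receivers: [otlp]
      exporters: [awsemf]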

Include AWS Resource Detectors by Default

We should modify our vended config.yaml to include all of the AWS-specific resource detectors on the collector (a config sketch follows the list below). #32 is a first attempt at this, but it is still TBD whether we should include the env resource detector. Eventually, we should have detectors for:

  • EC2
  • ECS
  • EKS (Although the k8s detector could potentially suffice)
  • Elastic Beanstalk
  • Lambda
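
A hedged sketch of what that vended-config change could look like, using the upstream resourcedetection processor. The detector key names below (env, ec2, ecs, eks, elasticbeanstalk, lambda) follow the contrib processor's documentation, and availability of each detector depends on the collector-contrib version; treat this as an illustration, not the final vended config.

processors:
  resourcedetection:
    # Detector key names per the upstream resourcedetection processor docs;
    # 'env' is the detector still under discussion above.
    detectors: [env, ec2, ecs, eks, elasticbeanstalk, lambda]
    timeout: 2s
    override: false

The processor would then be added to the processors list of each pipeline in the vended configuration.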
