
OpenTelemetry Erlang/Elixir




OpenTelemetry distributed tracing framework for Erlang and Elixir.

These applications implement version 1.8.0 of the OpenTelemetry Specification; see the spec compliance matrix for the list of supported features.

Requirements

  • Erlang/OTP 23+ (with best-effort support for OTP 22)

If using the Elixir API:

  • Elixir 1.13+

Contacting Us

We hold weekly meetings; see details on the community page.

We use GitHub Discussions for support or general questions. Feel free to drop us a line.

We are also present in the #otel-erlang-elixir channel in the CNCF slack. Please join us for more informal discussions.

You can also find us in the #opentelemetry channel on Elixir Slack.

Getting Started

You can find a getting started guide on opentelemetry.io.

To start capturing distributed traces from your application, it first needs to be instrumented. The easiest way to do this is with an instrumentation library; there are a number of officially supported instrumentation libraries for popular Erlang and Elixir libraries and frameworks.

Design

The OpenTelemetry specification defines a language library as having two components, the API and the SDK. The API must not only define the interfaces of any implementation in that language but must also be able to function as a no-op implementation of the tracer. The SDK is the default implementation of the API and must be optional.

When instrumenting a project, your application should depend only on the OpenTelemetry API application, found in the apps/opentelemetry_api directory of this repository and published as the Hex package opentelemetry_api.

The SDK implementation, found under apps/opentelemetry and published as the Hex package opentelemetry, should be included in an OTP release along with an exporter.

Example of Release configuration in rebar.config:

{relx, [{release, {my_instrumented_release, "0.1.0"},
         [opentelemetry_exporter,
          {opentelemetry, temporary},
          my_instrumented_app]},

        ...]}.

Example configuration for mix's Release task:

def project do
  [
    releases: [
      my_instrumented_release: [
        applications: [opentelemetry_exporter: :permanent, opentelemetry: :temporary]
      ],

      ...
    ]
  ]
end

Note that you also need to add opentelemetry_exporter before your other opentelemetry dependencies in mix.exs so that it starts before opentelemetry does.

In the above example opentelemetry_exporter comes first to ensure all of its dependencies are booted before opentelemetry attempts to start the exporter. opentelemetry is set to temporary so that if the opentelemetry application crashes, or is shut down, it does not terminate the other applications in the project. opentelemetry_exporter does not need to be temporary because it does not have a startup and supervision tree. This is optional: the opentelemetry application purposely sticks to permanent for the processes started by the root supervisor, leaving it up to the end user whether a crash or shutdown of opentelemetry should be ignored or should shut down the rest of the applications in the release.
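
As a sketch of the dependency ordering described above (the version requirements here are illustrative, not prescriptive), the deps in mix.exs might look like:

```elixir
# mix.exs deps — version requirements are illustrative only
defp deps do
  [
    # listed before the SDK so its own dependencies boot first
    {:opentelemetry_exporter, "~> 1.0"},
    {:opentelemetry, "~> 1.0"},
    {:opentelemetry_api, "~> 1.0"}
  ]
end
```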

Git Dependencies

While it is recommended to use the Hex packages for the API, SDK, and OTLP exporter, there are times when depending on the Git repository is necessary. Because the OpenTelemetry OTP applications are kept in a single repository, under the directory apps, either rebar3's git_subdir (rebar3 3.14 or above is required) or mix's sparse feature must be used when using them as Git dependencies in a project. The blocks below show how the Git repository for the API and/or SDK applications can be used in rebar3 and mix.

In rebar.config:

{opentelemetry_api, {git_subdir, "http://github.com/open-telemetry/opentelemetry-erlang", {branch, "main"}, "apps/opentelemetry_api"}},
{opentelemetry, {git_subdir, "http://github.com/open-telemetry/opentelemetry-erlang", {branch, "main"}, "apps/opentelemetry"}},
{opentelemetry_exporter, {git_subdir, "http://github.com/open-telemetry/opentelemetry-erlang", {branch, "main"}, "apps/opentelemetry_exporter"}}

In mix.exs:

{:opentelemetry_api, github: "open-telemetry/opentelemetry-erlang", sparse: "apps/opentelemetry_api", override: true},
{:opentelemetry, github: "open-telemetry/opentelemetry-erlang", sparse: "apps/opentelemetry", override: true},
{:opentelemetry_exporter, github: "open-telemetry/opentelemetry-erlang", sparse: "apps/opentelemetry_exporter", override: true}

The override: true is required because the SDK application, opentelemetry, lists the API in the deps of its rebar.config as a Hex dependency; without the override, mix fails when resolving the dependencies. override: true is also needed on the SDK because opentelemetry_exporter depends on both it and the API as Hex deps, so if it is included the override is necessary.

Benchmarks

Benchmarks are run with benchee. Benchmark functions are in modules under samples/. To run them, open a rebar3 shell in the bench profile:

$ rebar3 as bench shell

> otel_benchmarks:run().

To run the benchmarks from an Elixir script instead (after running rebar3 as bench compile):

$ ERL_AFLAGS="-pa ./_build/bench/extras/samples/" ERL_LIBS=_build/bench/lib/ mix run --no-mix-exs samples/run.exs

W3C Trace Context Interop Tests

Start the interop web server in a shell:

$ rebar3 as interop shell

> w3c_trace_context_interop:start().

Then, clone the W3C Trace Context repo and run the tests:

$ cd test
$ python3 test.py http://127.0.0.1:5000/test

Contributing

Approvers (@open-telemetry/erlang-approvers):

Find out more about the approver role in the community repository.

Maintainers (@open-telemetry/erlang-maintainers):

Find out more about the maintainer role in the community repository.

Thanks to all the people who have contributed!


opentelemetry-erlang's Issues

Compile Error: vcs_vsn: Unknown vsn format: {file,"VERSION"}

Hi,
I see this error while compiling directly (or including in a project). I'm not sure if this is fixable rebar3-side, or here.
Thanks for your work on opentelemetry!

➜  opentelemetry-erlang git:(master) rebar3 compile
===> Verifying dependencies...
===> Compiling opentelemetry_api
===> vcs_vsn: Unknown vsn format: {file,"VERSION"}
➜  opentelemetry-erlang git:(master) rebar3 report "rebar3 compile"
Rebar3 report
 version 3.5.0
 generated at 2020-02-03T15:49:15+00:00
=================
Please submit this along with your issue at https://github.com/erlang/rebar3/issues (and feel free to edit out private information, if any)
-----------------
Task: rebar3
Entered as:
  rebar3 compile
-----------------
Operating System: x86_64-unknown-linux-gnu
ERTS: Erlang/OTP 22 [erts-10.5.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:0] [hipe]
Root Directory: /usr/lib/erlang
Library directory: /usr/lib/erlang/lib
-----------------
Loaded Applications:
bbmustache: 1.3.0
certifi: 2.0.0
cf: 0.2.2
common_test: 1.18
compiler: 7.4.7
crypto: 4.6.1
cth_readable: 1.3.2
dialyzer: 4.1
edoc: 0.11
erlware_commons: 1.0.4
eunit: 2.3.8
eunit_formatters: 0.5.0
getopt: 1.0.1
inets: 7.1.1
kernel: 6.5
providers: 1.7.0
public_key: 1.7
relx: 3.24.3
sasl: 3.4.1
snmp: 5.4.1
ssl_verify_fun: 1.1.3
stdlib: 3.10
syntax_tools: 2.2.1
tools: 3.2.1

-----------------
Escript path: /home/afa/Erlang/rebar3
Providers:
  app_discovery as clean compile compile cover ct deps dialyzer do edoc escriptize eunit generate get-deps help install install_deps list lock new path pkgs release relup report shell state tar tree unlock update upgrade upgrade upgrade version xref 
➜  opentelemetry-erlang git:(master) 

Building opentelemetry_exporter as a dependency

I've noticed something which might (or might not) be related to issue #140:

I can't seem to build a release when one of the apps in the release has opentelemetry_exporter as a dependency (this is while I try to explicitly fetch the opentelemetry_exporter git subdirectory in its rebar.config)

===> Compiling opentelemetry_exporter
===> Compiling _build/default/lib/opentelemetry_exporter/apps/opentelemetry_exporter/src/opentelemetry_exporter.erl failed
_build/default/lib/opentelemetry_exporter/apps/opentelemetry_exporter/src/opentelemetry_exporter.erl:246: undefined macro 'OTEL_STATUS_UNSET'

_build/default/lib/opentelemetry_exporter/apps/opentelemetry_exporter/src/opentelemetry_exporter.erl:192: function to_otlp_status/1 undefined
_build/default/lib/opentelemetry_exporter/apps/opentelemetry_exporter/src/opentelemetry_exporter.erl:245: spec for undefined function to_otlp_status/1

make: *** [Makefile:26: rel] Error 1

Rebar normally builds apps (dependencies) in _build/default/lib/my_app/ebin. For opentelemetry-erlang I see paths like _build/default/lib/opentelemetry_api/apps/opentelemetry_api/ebin.

I don't know whether this is the reason opentelemetry_exporter doesn't seem to have access to the .hrl files of the API.

Might be a super basic thing I'm missing here... thanks much for any clues.

Metrics: remove active instruments that haven't had recent recordings

Each active instrument (the combination of an instrument name and a labelset) is stored in the active instruments ets table. There is no reason to keep these active instrument rows if they are not actively being used.

The active_instrument needs to change to include a counter for how many collection cycles this record has not had an update. After a certain threshold the row is removed from the table.

This is detailed in the SDK spec https://github.com/open-telemetry/opentelemetry-specification/blob/2b75442b05fd968f197422dc18124338a955f3a2/specification/sdk-metric.md#recommended-implementation

Constants for Erlang macros like Kind and Status in Elixir lib

I noticed in opentelemetry-beam/opentelemetry_phoenix#16 that, since Elixir (as far as I know, and this seems confirmed by the code in the PR) can't use macros from Erlang headers, user code has to use explicit atoms like :SERVER and :Error for Kind and Status.

I'm not sure what the equivalent is in Elixir (I know there are module attributes with @name, but those are scoped to the module, right?) but I will find out, unless someone gets to it before me.
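
One hypothetical direction (module and function names here are invented for illustration, not part of any API): wrap the raw atoms in plain functions so user code never spells :SERVER or :Error directly. The atom values follow the examples quoted in this issue.

```elixir
# Hypothetical sketch — not the actual OpenTelemetry Elixir API.
defmodule OtelConstants do
  # Atom values mirror the examples in this issue (:SERVER, :Error),
  # i.e. what the Erlang macros expand to.
  def kind_server, do: :SERVER
  def status_error, do: :Error
end
```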

Make propagators configurable

Currently opentelemetry_app:start/2 sets up propagation with hard coded settings:

    {BaggageHttpExtractor, BaggageHttpInjector} = ot_baggage:get_http_propagators(),
    {CorrelationsHttpExtractor, CorrelationsHttpInjector} = ot_correlations:get_http_propagators(),
    {B3HttpExtractor, B3HttpInjector} = ot_tracer_default:b3_propagators(),
    opentelemetry:set_http_extractor([BaggageHttpExtractor,
                                      CorrelationsHttpExtractor,
                                      B3HttpExtractor]),
    opentelemetry:set_http_injector([BaggageHttpInjector,
                                     CorrelationsHttpInjector,
                                     B3HttpInjector]),

Maybe this should be completely removed and up to the user to write in one of their apps start/2? Or we can figure out some configuration to allow it.

Third option, that I think I like the most, is to have only a simple configuration option that lets you turn on or off the defaults for HTTP. So like:

{default_http_propagators, true | false}

Then, if that is set, the code above would be run by opentelemetry on start, except with W3C instead of B3.
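
As a sketch of the proposed option in an Elixir config file (default_http_propagators is the option name proposed in this issue, not an existing configuration key):

```elixir
# config/config.exs — `default_http_propagators` is the option *proposed*
# in this issue, not an implemented configuration key.
import Config

config :opentelemetry, default_http_propagators: true
```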

Instrumented versions of OTP behaviours

In the SIG meeting yesterday @yurishkuro asked about auto-instrumentation, like in Java and others. I realized this is similar to what @ferd and I had discussed at one point making otel_gen_* behaviours to be used in place of the standard gen_* modules but that propagate context.

Preferably they would just call the gen_* modules so no logic/features have to be duplicated, but the receiving of messages and unwrapping of the context before calling handle_* will have to be implemented.

I'm also now wondering if adding "context" to OTP so the gen_* behaviours propagate it could be an acceptable patch to OTP.

This would still leave raw send to not be automatically propagating context, but this could be a happy middle ground between always propagating some value across processes and having to always do it manually.

Opening this issue to track/discuss ideas, other issues should probably be opened to track implementation of individual behaviours.

Tracer, span and context implementations and how to configure them

I'm beginning to rethink what is currently in master, so looking for feedback.

Currently the default SDK tracer has no predefined implementation of spans or context. It relies on the modules used for spans and context to be looked up dynamically from a persistent term.

Meaning you can use the SDK tracer and set the span module to be used to one that stores spans in ets, ot_span_ets, and the context to be stored in the pdict, ot_ctx_pdict.

What concerns me is this may be too dynamic and may be confusing to the user.

One fix for the confusion, which I'll probably make a PR for (or add to my open PR #7), is to have the span and context implementations configured through the tracer. So you'd have config like:

{opentelemetry, [{default_tracer, {ot_tracer_sdk, [{span_impl, ot_span_ets}, {ctx_impl, ot_ctx_sdk}]}}]}

This is instead of letting each be defined separately, in which case you could set a separate default tracer that doesn't use the default_span_impl you defined in the config and be very confused.

The alternative is to have only the tracer be configurable and you choose the tracer based on how you want spans and context stored.

Meaning there would be a tracer ot_tracer_ets_pdict and ot_tracer_ets_seqtrace.

Open to any ideas.

ot_resource exports attributes/1 without its result types key/0 and value/0

-export([create/1,
         merge/2,
         attributes/1]).

-type key() :: io_lib:latin1_string().
-type value() :: io_lib:latin1_string() | integer() | float() | boolean().
-type resource() :: {ot_resource, [{key(), value()}]}.
-opaque t() :: resource().

-export_type([t/0]).

%% ...

-spec attributes(t()) -> [{key(), value()}].
attributes({ot_resource, Resource}) ->
    Resource.

… making it kinda hard to declare a spec for another function that takes the return value from attributes/1

Misleading callback specs in ot_batch_processor?

I read this, returned a keyword list from my init/1, and crashed out with a {:case_clause, []}:

%% behaviour for exporters to implement
-type opts() :: term().

%% Do any initialization of the exporter here and return configuration
%% that will be passed along with a list of spans to the `export' function.
-callback init(term()) -> opts().

The reason was obvious once I read the code that called it:

init_exporter(undefined) ->
    undefined;
init_exporter({ExporterModule, Config}) when is_atom(ExporterModule) ->
    case ExporterModule:init(Config) of
        {ok, ExporterConfig} ->
            {ExporterModule, ExporterConfig};
        ignore ->
            undefined
    end;
init_exporter(ExporterModule) when is_atom(ExporterModule) ->
    init_exporter({ExporterModule, []}).

Strikes me I'd not have run into strife if we'd been clearer:

%% Do any initialization of the exporter here and return configuration
%% that will be passed along with a list of spans to the `export' function.
-callback init(term()) -> {ok, opts()} | ignore.
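
For illustration, a minimal (hypothetical) exporter module that satisfies the clarified contract might look like the following; the module name and config shape are invented for the example:

```elixir
# Hypothetical no-op exporter honoring the {:ok, config} | :ignore contract.
defmodule NoopExporter do
  # Return {:ok, config}, not a bare keyword list, or init_exporter/1
  # will crash with a case_clause error as described above.
  def init(config), do: {:ok, config}

  # Receives the finished spans and the config returned from init/1.
  def export(_spans, _config), do: :ok

  def shutdown(_config), do: :ok
end
```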

Set is_recording to false in the span context

Added to the spec in open-telemetry/opentelemetry-specification#1011 which should be merged soon.

In our case it means updating the span context stored in the context when End is called, which is nice since future calls to any functions that update the span in ETS (which would fail anyway, since the span has been ended and moved to the export table) will short-circuit on the span context's is_recording being false.

Error compiling opentelemetry in elixir project

I included the dependencies like this

{:opentelemetry_api,
       git: "git@github.com:open-telemetry/opentelemetry-erlang.git",
       branch: "master",
       sparse: "apps/opentelemetry_api",
       override: true},
{:opentelemetry,
       git: "git@github.com:open-telemetry/opentelemetry-erlang.git",
       branch: "master",
       sparse: "apps/opentelemetry",
       override: true},
{:opentelemetry_exporter,
       git: "git@github.com:open-telemetry/opentelemetry-erlang.git",
       branch: "master",
       sparse: "apps/opentelemetry_exporter",
       override: true},

and I got this error when compiling

===> Compiling parse_trans
===> Compiling opentelemetry
===> Compiling src/otel_metric_exporter.erl failed
src/otel_metric_exporter.erl:30: can't find include lib "opentelemetry_api/include/opentelemetry.hrl"; Make sure opentelemetry_api is in your app file's 'applications' list

** (Mix) Could not compile dependency :opentelemetry, "/home/mrkaspa/.asdf/installs/elixir/1.9.4/.mix/rebar3 bare compile --paths="/home/mrkaspa/code/ex/fms_gateway_umbrella/_build/dev/lib/*/ebin"" command failed. You can recompile
 this dependency with "mix deps.compile opentelemetry", update it with "mix deps.update opentelemetry" or clean it with "mix deps.clean opentelemetry"

Add Elixir record structs to support tests and integrations

I think it'd help to ship Record adapters so our integration authors and end users don't have to duplicate the effort. Something like this, only with documentation and better typespecs, and for span and span_ctx and link as well as event:

defmodule OpenTelemetry.Records.Event do
  @moduledoc false
  require Record

  @fields Record.extract(:event, from_lib: "opentelemetry_api/include/opentelemetry.hrl")
  Record.defrecordp(:event, @fields)
  defstruct @fields

  @type t :: %__MODULE__{}

  @spec from(record(:event)) :: t()
  def from(rec) when Record.is_record(rec, :event) do
    fields = event(rec)
    struct!(__MODULE__, fields)
  end
end

I'm not sure about the module name.

Proposal: Use GitHub Actions for CI/CD

This issue is in reference to issue 398 posted on the community repository. In this issue, we are proposing that all OpenTelemetry repositories consider using GitHub Actions as their CI provider in order to maintain consistency across the various language repositories.

The overall proposal was discussed in the OpenTelemetry maintainers SIG meeting. @trask has been assigned as the mentor for the project.

Repository: Erlang
CI Provider: CircleCI
Automated Build and Test: [x]
Code Coverage: [x]
Automated Performance Testing: [x]
Automated Deployment: [ ]
Automated Docs Deployment: [ ]

The justification and benefits are enumerated in the issue on the community repository and are pasted here as well for convenience:

Proposal

We propose that all languages consider using the same CI provider. This would create a more consistent development process and make it easier for developers to contribute to multiple language libraries.

We suggest that provider be GitHub Actions. Here’s why:

Ease-of-Use

CircleCI will automatically run when pull requests and commits are issued against the repository. But if a contributor forks the repository, unless they set up an account with the CI provider and link it to their forked repository, CI will not be activated and tests will not be run automatically.
In contrast, GitHub Actions works out of the box on a forked repository and can be easily configured to run a test workflow each time a commit is issued. This would help individual contributors test their code and ensure code quality before submitting a pull request against the repository.

Transparency

Current CI providers such as CircleCI allow anyone to view the console output when building and running tests, but the test results cannot be seen anywhere on the GitHub repository. To view this testing output, you need to go to a different website, navigate a different user interface, and then sift through thousands of lines of console output. This is not a seamless developer experience.
In contrast, using GitHub Actions would provide all testing output directly on the repository’s GitHub page, which would help contributors to find, read, and use the test output to maintain code quality.

Control

GitHub Actions’ integration with other GitHub features means you can have finer control over the CI pipeline. For example, certain workflows can be set to only run on a new release. Workflows can even be used to close stale issues and pull requests.

Recommendation

We recommend that we consider using one consistent CI provider, GitHub Actions, which provides an integrated and seamless developer experience for all contributors.

Example

Please see this example that the C++ repository has adopted for the above reasons.

Next Steps

This issue shall serve as a place for discussion about this proposal.

Could a maintainer please assign this issue to us if approved?

cc: @Brandon-Kimberly @alolita

Populate implementation/spec compliance matrix

In order to have a better understanding of where we stand in terms of implementing OpenTelemetry specification in various language SDKs we would like to have a compliance matrix that shows which features are implemented in which languages.

This will show how much work remains and tell us how far away we are from being ready for GA.

The matrix is located here: https://github.com/open-telemetry/opentelemetry-specification/blob/master/spec-compliance-matrix.md

Please open a PR against specification repo and populate each cell in the column for "Erlang" in Traces and Exporters tables with one of the following:

  • + means the feature is supported
  • - means it is not supported
  • N/A means the feature is not applicable to the particular language. If it is not self-explanatory why the feature is not applicable, please add a numbered footnote below the table to clarify, e.g. N/A [7], then later [7] - This feature makes no sense for language... because....

It is OK if you don't have the complete information, please leave the cells blank and come back and complete them later.

Please try to keep the duration that the PR stays open short to avoid merge conflicts when other languages also make changes to the same tables.

Once the initial round of updates to the matrix is done it will be great if language maintainers can periodically update the table as they implement features.

Checking for correct Behaviour in module_info/0 in API

Exploring things a little bit, I noticed that I can't set tracers when the .beam files are stripped in a prod release, I think.
The reason is the check for the correct Behaviour here:

case lists:keyfind(behaviour, 1, Attributes) of

The module_info/0 call might look like this, giving an empty attributes list:

1> otel_tracer_default:module_info().
[{module,otel_tracer_default},
 {exports,[{start_span,3},
           {start_span,4},
           {with_span,3},
           {with_span,4},
           {b3_propagators,0},
           {w3c_propagators,0},
           {module_info,0},
           {module_info,1}]},
 {attributes,[]},
 {compile,[]},
 {native,false},
 {md5,<<229,142,94,192,207,250,193,205,25,119,87,154,29,
        226,111,166>>}]
2> 

Do you think this is a correct assessment, and is it a case that matters?

Means to force export

:timer.sleep/1 in tests makes babies cry; any chance of an exported function to trigger export so we can verify the exporter worked?

2X mentions of opencensus

I figure these typespecs won't work as intended:

src/ot_span_sweeper.erl
41:               strategy :: drop | end_span | failed_attribute_and_end_span | fun((opencensus:span()) -> ok),

src/ot_propagation_http_w3c.erl
55:-spec encode(opencensus:span_ctx()) -> iolist().

Dynamic resources

We need to send all events with specific attributes for correlation, suggesting we use OpenTelemetry resources… but for some of the attributes we need the values to be computed on the fly, e.g. for:

  • Instance uptime
  • RAM consumption and availability
  • Disk consumption and availability
  • CPU consumption and availability

We can't satisfy this need with metrics: to use our trace destination to correlate these against other measurements and metadata in our trace attributes, we need these to also be present in our span attributes.

I've got two obvious places to put this capability, both of which feel wrong:

  • A wrapper around the OpenTelemetry API to make it more useful
  • The exporter

Is there some place I could put it? A hook at span creation or completion would do the trick, or both if I wanted to do an adequate job for CPU consumption without needing to keep a log somewhere.

OTLP/HTTP JSON Protobuf Exporter

Right now the OTLP HTTP exporter only supports exporting binary Protobufs, not the Protobuf-to-JSON format, so the latter needs to be added.

Getting started guide with stdout exporter

I am discovering getting started with open telemetry is not easy. Or at least that's how I'm finding it.

Is there any getting started guide with Erlang code as the example? Is it sensible to start a new project with the stdout exporter so I don't have to set up a collector service? A guide, or even a link to a guide, that covered these points would be really helpful.

Move propagators to API

So now propagators are supposed to be in the API and this allows propagation without an SDK:

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/api.md#included-propagators

This is a bigger issue for us since currently we rely on the Application startup of the SDK to setup propagators:

%% set the global propagators for HTTP based on the application env
setup_text_map_propagators(Opts),

Having this have to be part of the API means the only way to ensure it happens before anything that depends on OpenTelemetry is to make the API have an Application start.

I didn't want the API to have any processes but I don't see a way around it now?

Tracing into NIFs and Ports

This came up on the spec call today with regards to like python and ruby. I think it will not be common in Erlang because of how we use NIFs most often as very very short function calls but there are dirty NIFs out there and others doing their own worker pools that do much more.

One example is a client built on https://github.com/edenhill/librdkafka, which manages its own worker threads.

The process dictionary and seqtrace are not available inside NIF code, so the only option is to grab the context as a record and pass it to the NIF function call, then reconstruct it as a context with https://github.com/open-telemetry/opentelemetry-cpp.

I think simply accepting that you'd be running multiple OpenTelemetry SDKs is acceptable for the very rare case someone actually wants this -- meaning the spans created in the NIF, or whatever workers the NIF communicates with, will be exported by opentelemetry-cpp and not go to the processor/exporter in Erlang.

But this means double configuration, you have to setup the C++ processors and exporter to duplicate what you have in the Erlang config.

A second option would be to create a span processor for OnEnd, or an exporter in C++, that would, instead of exporting out of the node, create the Erlang span record and send that finished span to the Erlang finished span ETS table. I think this has to be done by a process, so the span would be sent as a message to some process that then sends it to the regular finished span ETS table. So an optional process that accepts spans to finish would be added.

Anyway, not very important, though the second idea seems interesting to implement if anyone wants to delve into C/C++ :).

I might look at doing the second idea for Rust when there is an OpenTelemetry Rust available.

Understanding how sampling works

I'm trying to understand how sampling works in the SDK, especially with regard to if/how the probability samplers cascade to child spans.

I've put together some test cases here: https://github.com/marcdel/otel_sandbox/tree/master/test

These are the tests that are failing because they exported a span when I expected them not to, or vice versa.

1) test percentage when parent is not set, and children are off, parent is exported but children are not (ParentChildTest)
2) test percentage when parent is off, but children are on: parent is not exported, but children are (ParentChildTest)
3) test percentage when parent is percentage, and children are not set: parent and children use percentage (ParentChildTest)
4) test percentage when parent is on, but children are off: parent is exported, children are not (ParentChildTest)
5) test always_on/off when parent is off, and children are not set: parent and children are not exported (ParentChildTest)

A simple example: if I specify a 0.0 sampler for the parent, which isn’t exported, I would expect the children not to be exported, but they are. For what it's worth, the on/off sampler behaves the same way in this case.

@zero_percent_sampler :ot_sampler.setup(:probability, %{probability: 0.0})

OpenTelemetry.Tracer.with_span "parent", %{sampler: @zero_percent_sampler} do
  Enum.map(1..4, fn i ->
    OpenTelemetry.Tracer.with_span "child#{i}" do
    end
  end)
end

The percentage based samplers seem to behave differently than the on/off samplers (when set to 100% or 0% respectively) in a few cases, which also seemed surprising.

On slack @tsloughter pointed out that this may have changed recently. I tried pointing this test repo to the master branch to see, but I get a compile error:

▶ mix test
===> Compiling opentelemetry
===> Compiling src/otel_metric_exporter.erl failed
src/otel_metric_exporter.erl:30: can't find include lib "opentelemetry_api/include/opentelemetry.hrl"; Make sure opentelemetry_api is in your app file's 'applications' list

** (Mix) Could not compile dependency :opentelemetry, "/Users/marc/.asdf/installs/elixir/1.10.3/.mix/rebar3 bare compile --paths="/Users/marc/Code/marcdel/otel_sandbox/_build/test/lib/*/ebin"" command failed. You can recompile this dependency with "mix deps.compile opentelemetry", update it with "mix deps.update opentelemetry" or clean it with "mix deps.clean opentelemetry"

Expose `ot_tracer` record?

I can't try on_start_processors and on_end_processors in a unit test to solve #73 without being able to call :opentelemetry.set_default_tracer/1 with a #tracer{}, which I can't consume with Record.extract/2 because it's not in an include directory.

That's not entirely true. I can save the environment, replace it, stop the app, restart it, run my test, stop it again, put back the old environment, and restart it again. You'll feel awful when you review the code, though.

Normalize metric labelsets

There is no longer a "labelset" part of the API, so the SDK takes the raw label and value in a tuple list. At some point in the metrics pipeline these need to be "normalized" in order to not have duplicate metrics for the same labels.

Example:

ot_meter:record(mycounter, 5, [{key1, <<"value1">>}]),
ot_meter:record(mycounter, 8, [{<<"key1">>, <<"value1">>}]),

I think the labels should be able to be atoms, strings, or binaries to make it easier on the user, so those two recordings should go to the same metric.

The normalizing can be done either at the time of the recording in ot_meter_default.erl or at the time of "integration" in ot_metric_integrator.erl. I'm not sure which is best.
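
A minimal sketch of the normalization idea (module and function names invented for illustration; the real SDK code may differ), converting every label key to a binary and sorting so that equivalent labelsets collide:

```elixir
# Hypothetical sketch of label normalization — not the SDK's actual code.
defmodule LabelNormalizer do
  # Convert atom/charlist/binary keys to binaries and sort, so
  # [{:key1, "value1"}] and [{"key1", "value1"}] map to the same labelset.
  def normalize(labels) do
    labels
    |> Enum.map(fn {k, v} -> {to_string(k), v} end)
    |> Enum.sort()
  end
end
```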

Add pid to active span table?

As I was thinking about adding a monitor option to spans so that they can be ended if the process dies, I realized it was probably a good idea to also add the pid to the active table so the sweeper can make a judgement on it for spans that are not monitored.

The sweeper could then be configured to either ignore expired spans if the process that created it is still alive or to only care about if the process that created it is dead. Meaning, expiration of infinity but if the pid is no longer alive it will sweep it.

Thoughts?
