
Prometheus.ex


Elixir Prometheus.io client based on Prometheus.erl.

Starting from v3.0.0, prometheus.ex works with Elixir >= 1.6 and Erlang/OTP >= 20. For older versions, please use older tags.


Dashboard from Monitoring Elixir apps in 2016: Prometheus and Grafana by @skosch.

  • IRC: #elixir-lang on Freenode;
  • Slack: #prometheus channel - Browser or App (slack://elixir-lang.slack.com/messages/prometheus).

Example

defmodule ExampleInstrumenter do
  use Prometheus.Metric

  def setup do
    Histogram.new([name: :http_request_duration_milliseconds,
                   labels: [:method],
                   buckets: [100, 300, 500, 750, 1000],
                   help: "Http Request execution time"])
  end

  def instrument(%{time: time, method: method}) do
    Histogram.observe([name: :http_request_duration_milliseconds, labels: [method]], time)
  end
end

or

defmodule ExampleInstrumenter do
  use Prometheus.Metric

  @histogram [name: :http_request_duration_milliseconds,
              labels: [:method],
              buckets: [100, 300, 500, 750, 1000],
              help: "Http Request execution time"]

  def instrument(%{time: time, method: method}) do
    Histogram.observe([name: :http_request_duration_milliseconds, labels: [method]], time)
  end
end

Here the histogram will be declared in an auto-generated @on_load callback, i.e. you don't have to call setup/0 manually.
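With the first variant, setup/0 has to be called once at startup. A minimal sketch of wiring it into an application callback (MyApp.Application and the empty child list are illustrative, not part of prometheus.ex):

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Register the histogram once, before any requests are instrumented.
    ExampleInstrumenter.setup()

    children = []
    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```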

Please read how to measure durations correctly with prometheus.ex.

Integrations / Collectors / Instrumenters

Dashboards

Installation

Available in Hex, the package can be installed as:

  1. Add prometheus_ex to your list of dependencies in mix.exs:

    def deps do
      [{:prometheus_ex, "~> 3.1"}]
    end
  2. Ensure prometheus_ex is started before your application:

    def application do
      [applications: [:prometheus_ex]]
    end
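Note that on Elixir 1.4 and later, Mix infers the application list from your dependencies, so step 2 is only required on older Elixir versions; a newer mix.exs typically looks like this:

```elixir
def application do
  # :prometheus_ex is started automatically because it is listed in deps
  [extra_applications: [:logger]]
end
```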

prometheus.ex's People

Contributors

cybrox, deadtrickster, dylan-chong, feld, hauleth, iamjarvo, jarimatti, jlgeering, kianmeng, lanodan, maxdrift, parroty, progsmile, skosch, zeha


prometheus.ex's Issues

Library Architecture Patterns

I am currently using this library and I quite enjoy its use. I think, though, that for our needs a few interesting concepts may need to be addressed.

In Prometheus Client libraries, it seems that they provide 2 main bits of functionality regardless:

  • the ability to add metrics by simply importing the lib
  • the ability to start the webserver to export the metrics (they've moved towards text based only now)

This library requires the plug exporter library and I wonder if that should simply be included in this library?

Other concerns I have are around which application should "setup" the Metrics. I see that if I wish to create an instrumenter, I need to call the setup in my application boot loader. This seems like it should be handled by the prometheus client library.

I would love to hear your thoughts on this.

How to approach debugging a dead metric pull request?

We use prometheus and prometheus-plugs for pulling metrics out of our Elixir services; usually the amount of data pulled is around 80-100 MB, which we think is too much.

We set a hard timeout of 10 mins for all our endpoints requests and on a daily basis we get failed prometheus pull requests due to timeouts (1 request exceeds 10 mins), we'd like to look more into debugging what's happening inside, whether it's an issue with re-formatting the payload or with its size. Is there a debugging mode that we can enable for better insights?

prometheus_ex on server with node_exporter - best practice

Hi

Just a question, … let's say I have a server with prometheus (node_exporter) already installed and operational, without Elixir on it. Now I would like to deploy an Elixir app to this server, in which I have and would like to use the prometheus_ex package, to be able to get some Elixir/Erlang VM specifics. How is this done? What would be best practice? Keeping both node_exporter and prometheus_ex exporting under /metrics but on different ports, or what?

Compilation error with Elixir 1.14.0

== Compilation error in file lib/prometheus/buckets.ex ==
** (UndefinedFunctionError) function Kernel.Utils.defdelegate/2 is undefined or private. Did you mean:

      * defdelegate_all/3
      * defdelegate_each/2

    (elixir 1.14.0) Kernel.Utils.defdelegate({:new, [line: 18], [{:arg, [line: 18], nil}]}, [])
    lib/prometheus/buckets.ex:18: (module)

Prometheus.InvalidMetricArityError after phoenix Update

I updated Phoenix to version 1.6.
After that I get the following error in the log:

[error] Handler "telemetry_web__event_handler" has failed and has been detached. Class=:error
Reason=%Prometheus.InvalidMetricArityError{expected: 2, present: 3}
Stacktrace=[
  {:prometheus_metric, :check_mf_exists, 4,
   [file: 'src/prometheus_metric.erl', line: 149]},
  {:prometheus_histogram, :insert_placeholders, 3,
   [file: 'src/metrics/prometheus_histogram.erl', line: 443]},
  {:prometheus_histogram, :insert_metric, 5,
   [file: 'src/metrics/prometheus_histogram.erl', line: 431]},
  {:prometheus_histogram, :observe, 4,
   [file: 'src/metrics/prometheus_histogram.erl', line: 203]},
  {Prometheus.Metric.Histogram, :observe, 2,
   [file: 'lib/prometheus/metric/histogram.ex', line: 101]},
  {:telemetry, :"-execute/3-fun-0-", 4,
   [
     file: '/Users/david/projects/private/BetterTyping/deps/telemetry/src/telemetry.erl',
     line: 150
   ]},
  {:lists, :foreach, 2, [file: 'lists.erl', line: 1342]},
  {Plug.Telemetry, :"-call/2-fun-0-", 4,
   [file: 'lib/plug/telemetry.ex', line: 76]},
  {Enum, :"-reduce/3-lists^foldl/2-0-", 3, [file: 'lib/enum.ex', line: 2396]},
  {Plug.Conn, :run_before_send, 2, [file: 'lib/plug/conn.ex', line: 1690]},
  {Plug.Conn, :send_resp, 1, [file: 'lib/plug/conn.ex', line: 399]},
  {TyperacerWeb.LobbyController, :action, 2,
   [file: 'lib/typeracer_web/controllers/lobby_controller.ex', line: 1]},
  {TyperacerWeb.LobbyController, :phoenix_controller_pipeline, 2,
   [file: 'lib/typeracer_web/controllers/lobby_controller.ex', line: 1]},
  {Phoenix.Router, :__call__, 2, [file: 'lib/phoenix/router.ex', line: 355]},
  {TyperacerWeb.Endpoint, :plug_builder_call, 2,
   [file: 'lib/typeracer_web/endpoint.ex', line: 1]},
  {TyperacerWeb.Endpoint, :"call (overridable 3)", 2,
   [file: 'lib/plug/debugger.ex', line: 136]},
  {TyperacerWeb.Endpoint, :call, 2,
   [file: 'lib/typeracer_web/endpoint.ex', line: 1]},
  {Phoenix.Endpoint.Cowboy2Handler, :init, 4,
   [file: 'lib/phoenix/endpoint/cowboy2_handler.ex', line: 54]},
  {:cowboy_handler, :execute, 2,
   [
     file: '/Users/david/projects/private/BetterTyping/deps/cowboy/src/cowboy_handler.erl',
     line: 37
   ]},
  {:cowboy_stream_h, :execute, 3,
   [
     file: '/Users/david/projects/private/BetterTyping/deps/cowboy/src/cowboy_stream_h.erl',
     line: 306
   ]}
]

This is how I configure it:

application.ex:

...
 Typeracer.PhoenixInstrumenter.setup()
    Typeracer.PipelineInstrumenter.setup()
    Typeracer.RepoInstrumenter.setup()
    Prometheus.Registry.register_collector(:prometheus_process_collector)
    Typeracer.PrometheusExporter.setup()

    :ok =
      :telemetry.attach(
        "prometheus-ecto",
        [:typeracer, :repo, :query],
        &Typeracer.RepoInstrumenter.handle_event/4,
        %{}
      )

    PrometheusPhx.setup()
...

How to remove entry with certain label

Hey,

I am using the Boolean Metric to produce something like this:

test_value{foo="bar"} 0
test_value{foo="baz"} 1

Now I no longer need test_value{foo="baz"} because baz was removed from the system. How can I remove it from the output?
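As far as I can tell, each metric module in prometheus.ex exposes a remove/1 that drops the series for a given label combination; a hedged sketch for the example above:

```elixir
# Assumes Prometheus.Metric.Boolean.remove/1 accepts the same
# name/labels spec as the other metric functions.
Boolean.remove(name: :test_value, labels: ["baz"])
```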

Return value from Prometheus.Metric.Histogram.observe_duration/2

Hi, we're migrating our platform to run on Kubernetes and as part of that we're transitioning from using Graphite to Prometheus for graphing our metrics. We currently use the library Statix to push metrics to our Graphite server and specifically we use https://hexdocs.pm/statix/Statix.html#c:measure/3 to instrument functions where we measure execution time. Statix.measure/3 allows the return value of the instrumented function to be returned, but unless I'm misunderstanding the correct usage, this doesn't appear to be the case for Prometheus.Metric.Histogram.observe_duration/2?

Could you clarify whether it is possible to return the value of the function passed to Prometheus.Metric.Histogram.observe_duration/2?
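For reference, the underlying prometheus.erl implementation does seem to return the wrapped function's result from observe_duration, so usage along these lines (do_work/0 is a hypothetical workload) would behave like Statix.measure/3:

```elixir
# If observe_duration/2 passes the function's return value through,
# `result` holds whatever do_work/0 returned.
result =
  Histogram.observe_duration(
    [name: :http_request_duration_milliseconds, labels: ["GET"]],
    fn -> do_work() end
  )
```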

Startup Time issues

I recently tried to integrate this library, and found the startup time increased around 10 seconds. This 10 seconds even happens on mix test. Not sure why. Any ideas?

Compilation fails on Elixir 1.7.0-rc.1

Hi,

just to let you know: prometheus_ex (currently published version on Hex) seems to fail to compile on Elixir 1.7.0-rc.1 (OTP 20).

(screenshot of the compilation error omitted)

Might be related to elixir-lang/elixir#7309
Would love to contribute a fix but I'm not terribly familiar with what the purpose of the Macro is in prometheus_ex :-/

Using float values in Histogram?

Hi, thanks for creating this!

I am confused about how to use float values in a Histogram. I want to have my time units in seconds, so I have to keep track of float values.

Histogram.observe/2 doesn't accept Float values at all, and Histogram.dobserve/2 seems to be rounding down to zero. Am I doing something wrong?
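My understanding, stated tentatively: observe/2 takes integer values, while dobserve/2 records floats; and for metrics whose name carries a duration suffix, prometheus.erl may apply a duration-unit conversion that makes small float values appear as zero. A sketch of the two calls (metric name illustrative):

```elixir
# Integer observation.
Histogram.observe([name: :request_duration_seconds, labels: ["GET"]], 1)

# Float observation; note that a duration suffix in the name may trigger
# native-time-unit conversion in prometheus.erl.
Histogram.dobserve([name: :request_duration_seconds, labels: ["GET"]], 0.042)
```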

Invalid value "unknown" for "erlang_vm_logical_processors_available"

When I go to /metrics in my Phoenix app, I get a response containing the following snippet.

# TYPE erlang_vm_logical_processors_available gauge
# HELP erlang_vm_logical_processors_available The detected number of logical processors available to the Erlang runtime system
erlang_vm_logical_processors_available unknown

Prometheus is unable to parse unknown and yields this error message: text format parsing error in line 9: expected float as value, got "unknown". From what I hear, the correct value in this case should be NaN.

Edit: I am using version 1.0.0-alpha9.

:mnesia warning when compiling

Hi! Thanks for the lib!

I'm getting this when compiling:

==> prometheus_ex
Compiling 19 files (.ex)
warning: :mnesia.system_info/1 defined in application :mnesia is used by the current application but the current application does not depend on :mnesia. To fix this, you must do one of:

  1. If :mnesia is part of Erlang/Elixir, you must include it under :extra_applications inside "def application" in your mix.exs

  2. If :mnesia is a dependency, make sure it is listed under "def deps" in your mix.exs

  3. In case you don't want to add a requirement to :mnesia, you may optionally skip this warning by adding [xref: [exclude: [:mnesia]]] to your "def project" in mix.exs

  lib/prometheus/contrib/mnesia.ex:22: Prometheus.Contrib.Mnesia.table_disk_size/2

Here are my versions:

Erlang/OTP 24 [erts-12.3.1] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:1] [jit]

Mix 1.13.4 (compiled with Erlang/OTP 22)

To fix this, you need to add the code below to your mix.exs:

  def application do
    [
      applications: [:logger, :prometheus],
      extra_applications: [:mnesia]
    ]
  end

I can open a PR if you don't mind.

Reduce needless usage of macros

Such frivolous usage of macros causes problems for end users who want to check the project with dialyzer; instead it would be better to use functions that can be inlined later when needed.

Do not change API over Prometheus.erl

Currently, the API provided by prometheus_ex tries, for no clear reason, to use a keyword list as the first argument and does some magic parsing before calling functions from prometheus (the Erlang one). Why, though? I find the Erlang API much cleaner and easier to work with; my only gripe is the fact that it throws ErlangError instead of Elixir-like exceptions (but that isn't a big problem IMHO).

The solutions that I see are these:

  • deprecate prometheus_ex in favour of raw prometheus calls and just don't care about error wrapping
  • make prometheus_ex API 1:1 mapping of the Erlang one with graceful handling of the errors
  • make prometheus_ex addition to Erlang API instead of wrapping it, aka provide only new features instead of doing that and at the same time wrapping existing features

How to use in a cluster

Hey,

how would one use this in a cluster of nodes? I don't use it to collect operating system or vm metrics but metrics about my application state. That means every node should give me (kinda) the same metrics.

Has someone done this yet? What is the easiest approach?

Microseconds getting automatically converted to milliseconds?

Is this expected behavior ?

I declare a metric with a microseconds suffix:

@histogram [
    name: :http_check_duration_microseconds,
    labels: [:target],
    buckets: :default,
    help: "Http check execution time"
  ]

And I then feed it a microseconds value (20481):

Histogram.observe(
      [name: :http_check_duration_microseconds, labels: [target]],
      time
    )

It is then silently converted to milliseconds - but the metric name still indicates microseconds.

http_check_duration_gauge_microseconds{target="http://google.com"} 20.481
# TYPE http_check_duration_microseconds histogram
# HELP http_check_duration_microseconds Http check execution time
http_check_duration_microseconds_bucket{target="http://google.com",le="0.005"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.01"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.025"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.05"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.1"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.25"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="0.5"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="1"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="2.5"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="5"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="10"} 0
http_check_duration_microseconds_bucket{target="http://google.com",le="+Inf"} 17
http_check_duration_microseconds_count{target="http://google.com"} 17
http_check_duration_microseconds_sum{target="http://google.com"} 364.862

The code in its entirety can be seen here.
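If I understand prometheus.erl correctly, it infers a duration unit from the metric-name suffix (_microseconds here) and converts observations from Erlang native time units, which would explain the silent scaling. Declaring the metric with duration_unit: false should disable that conversion (my reading of the option, not verified against this exact version):

```elixir
@histogram [
  name: :http_check_duration_microseconds,
  labels: [:target],
  buckets: :default,
  # Assumption: disables the suffix-based native-time conversion.
  duration_unit: false,
  help: "Http check execution time"
]
```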

Prometheus.Metric breaks when `:application_controller` is busy

There can be other scenarios for why :application_controller would be busy, but the one that I've seen is where you're draining connections while shutting down using a library like https://hexdocs.pm/plug_cowboy/Plug.Cowboy.Drainer.html. Without connection draining when receiving a SIGTERM the endpoint will shut down immediately, killing any current connections. With connection draining listeners on the port are suspended, meaning no more connections are opened, but allow the existing connections to drain, and then (and only then) proceed with shutting down the endpoint.

This means :application_controller asks the application containing the endpoint to shut down and waits for it to be done. While it's waiting, it's completely blocked and can't respond to messages. Depending on how long your draining timeout is, this can be a long time. Prometheus.Metric uses Application.started_applications which sends a message to :application_controller and waits timeout (5 seconds) for a response. While draining connections this will always fail, causing Prometheus.Metric to blow up (this also means Prometheus.PlugExporter blows up when Prometheus tries to scrape). If it's helpful I can set up a repo that reproduces this.

Is it possible to avoid calling Application.started_applications? Or catching the failure?

I may also be missing something, but why is the on_load being called each time a request hits the Prometheus.PlugExporter?

Metrics declared via module attributes do not get registered

I have been running into an issue where I declare metrics via module attributes like @counter and @histogram, but when running my test suite, failures occur because metrics the code attempts to update are not registered. I have worked around this by adding a loop like so to my application startup:

for mod <- Application.spec(:my_app)[:modules] do
  if function_exported?(mod, :__declare_prometheus_metrics__, 0) do
    mod.__declare_prometheus_metrics__()
  end
end

Obviously this workaround is brittle because it relies on knowledge of code generated by the macros in Prometheus.Metric. I'd like to stop using it.

The core problem is that the generated code calls Application.started_applications(), but normal OTP startup loads all application code before starting any applications. This means that the @on_load function will be called before prometheus has started and created its ets tables, and so no metrics will be declared and the function will never run when the application has already started.

Proposed solution: The macros should generate entries to be inserted into the default_metrics env key of the prometheus application. When the application starts, the default metrics are automatically declared. This will work because the application_controller process that owns the application environment table will be started at the time each user application is loaded.

Does that solution sound reasonable? If so, I will send a PR.
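To make the proposal concrete, the configuration could look something like this (the exact shape of a default_metrics entry is an assumption on my part):

```elixir
# config/config.exs -- hypothetical shape of the proposed mechanism
config :prometheus,
  default_metrics: [
    {:histogram,
     [name: :http_request_duration_milliseconds,
      labels: [:method],
      buckets: [100, 300, 500, 750, 1000],
      help: "Http Request execution time"]}
  ]
```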

Prefer already formatted time value instead of native time

Getting Histograms to work with time values is not straightforward. I had never used or converted time values to native time until I started using this library. I think this is counterintuitive and error-prone, since it is easy to misread the documentation.

It would be much more natural to do the conversion to native time inside the library by default, rather than asking the user to do so every time. This way the user can supply an integer/float and the library does the conversion for them.

I realise this will be a breaking change, but I think it will pay off. What do you think?
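For context, this is the kind of conversion users currently have to write by hand (do_work/0 is a hypothetical workload; the System functions are standard Elixir):

```elixir
start = System.monotonic_time()
do_work()
stop = System.monotonic_time()

# Convert from native units to the unit the metric name promises.
duration_ms = System.convert_time_unit(stop - start, :native, :millisecond)
Histogram.observe([name: :http_request_duration_milliseconds, labels: ["GET"]], duration_ms)
```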

Missing blog post – move the content to this repo?

Hey Ilya & friends,

As I'm currently revamping my blog, I realized that the old links don't work anymore, including the one to my blog post about this library, which you link to in the readme file. The current link is here, but even that may change (or disappear altogether). Do you want to grab a copy of it, strip it of the fluff, and put it in a docs folder (or the wiki) in this repo as an extended example/tutorial? I'm happy to PR a Markdown file if that helps.

Documentation checklist

Checklist for modules:

  • buckets;
  • collector;
  • model;
  • registry;
  • http;
  • protobuf;
  • text;
  • counter;
  • gauge;
  • histogram;
  • summary;

Checklist for pages:

  • Overview;
  • Time intervals;
  • vm_memory_collector;
  • vm_statistics_collector;
  • vm_system_info_collector.
