Code Monkey home page Code Monkey logo

collins_exporter's Introduction

Prometheus Collins Exporter

This is a Collins exporter for Prometheus.

This exporter usually retrieves the necessary data from Collins (called a “Collins scrape”) upon each scrape of the /metrics endpoint (the “Prometheus scrape”). However, Collins can become slow at times, especially if multiple requests are overlapping. Therefore, whenever a Prometheus scrape hits the exporter while a Collins scrape is still ongoing, no new Collins scrape will be started, but all pending Prometheus scrapes will get metrics from the current Collins scrape once it has finished. (This adds jitter, but exporting from a potentially slow backend has jitter anyway. Arguably, not applying this protection against overloading Collins will make things even worse.)

Despite this precaution, a Collins scrape might still take longer than 10s for large inventories. Take that into account when configuring the scrape timeout on your Prometheus server.

Installing

You need a Go development environment. Then, run the following to get the source code and build and install the binary:

go get github.com/soundcloud/collins_exporter`

Running

A minimal invocation is simply:

./collins_exporter

Supported parameters include:

  • web.listen-address: the address/port to listen on (default: ":9136")
  • web.telemetry-path: the path under which to expose metrics (default: "/metrics")
  • collins.config: the path to your Collins config, if not in a standard location (see https://tumblr.github.io/collins/tools.html#configs)

Digging into the data

The exporter exposes three major groups of metrics, collins_asset_status, collins_asset_state, and collins_asset_details, with the Collins asset tag being used as a label for each one.

Asset info

The collins_asset_details metrics always have a value of one. But they contain useful information about each asset in their labels. A typical metric looks like this:

collins_asset_details{instance="collins.example.com:9139",ipmi_address="10.1.2.3",job="collins",nodeclass="web-server",primary_address="10.10.20.30",tag="ABCD1234"}

The information encoded in this series can be used to find assets by attributes other then the asset tag, as demonstrated in the example queries below.

Status

There is one collins_asset_status metric per asset tag and per possible Collins status. Since there are (currently) nine different Collins statuses, the query collins_asset_status{tag="ABCD1234"} will yield nine metrics, each with a different status label like "New" or "Allocated". The value of each but one of the metrics will be 0. The one metric with a value of 1 represents the status the asset is currently in.

A useful query to get started is to list the number of assets per status per nodeclass:

count(((collins_asset_status == 1) * on(tag) group_right(status) collins_asset_details)) by (job, nodeclass, status)

In fact, it's so useful you might want to use a recording rule for it:

status_nodeclass:collins_asset_status:count = count(((collins_asset_status == 1) * on(tag) group_right(status) collins_asset_details)) by (job, nodeclass, status)

Based on that, you could for example display the percentage of unallocated (available) assets per nodeclass on a dashboard:

sum(status_nodeclass:collins_asset_status:count{status="Unallocated"}) without (status) / sum(status_nodeclass:collins_asset_status:count) without (status)

Sometimes you need to get the status of a machine but don't know the asset tag yet. Here is a sample query for the asset status using only the primary IP address. It returns the asset tag, the status name, and the nodeclass as labels:

collins_asset_details{primary_address="10.1.2.3"} * on(tag) group_right(nodeclass) collins_asset_status == 1

State

Unlike the fixed number of statuses, there can be an arbitrary number of user-defined Collins states. Thus, the collins_asset_state metrics follow a different approach. There is exactly one metric per asset, and its value reflects the ID of the state. We are still looking for a good way of exposing the state by name, too.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.