metrics-server-prom's Introduction

Kubernetes Metrics Server Prometheus Adapter

Motivation | Usage | Docker Compose | Metrics transformation | License

Motivation

What is provided

A Docker image that exposes the Kubernetes metrics provided by metrics-server in a form Prometheus can scrape. The container can run anywhere Prometheus can reach it as a target, including inside Kubernetes itself.

Why is it needed

metrics-server seems to be the successor of heapster for Kubernetes monitoring. However, metrics-server currently only provides its metrics in JSON format via the Kubernetes API server.

Prometheus, on the other hand, expects its own text-based format. So in order for Prometheus to scrape those metrics, they must be transparently transformed from JSON into that format on every request.
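To make the target format concrete, here is a minimal sketch of how one metric could be rendered in the Prometheus text format. The function names are hypothetical and are not taken from transform.py; only the metric name is borrowed from the sample output in this README.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: dict, value) -> str:
    """Render one sample line: name{label="value",...} value"""
    rendered = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in labels.items()
    )
    return f"{name}{{{rendered}}} {value}"


def format_metric(name: str, help_text: str, metric_type: str, samples: list) -> str:
    """Render the HELP and TYPE comment lines followed by all sample lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    lines += [format_sample(name, labels, value) for labels, value in samples]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_metric(
        "kube_metrics_server_pod_cpu",
        "The CPU time of a pod in seconds.",
        "gauge",
        [({"pod": "etcd-server-events-abc", "namespace": "kube-system"}, 420)],
    ))
```

Each metric is a block of plain text: two comment lines describing it, then one line per label combination.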

Differences

Unlike metrics-server itself, this Docker container provides additional metrics metadata, which is retrieved via kubectl API calls and included in the Prometheus output.

How does it work

The following diagram illustrates how the format transformation is achieved:

[Diagram: workflow]

  1. Prometheus scrapes the Docker container on :9100/metrics
  2. Inside the Docker container, uwsgi proxies the request to kubectl proxy
  3. kubectl proxy reads the provided kube config file and tunnels the request into Kubernetes to metrics-server
  4. metrics-server replies with JSON-formatted metrics
  5. kubectl proxy forwards the response back to uwsgi
  6. uwsgi calls transform.py
  7. transform.py rewrites the JSON into Prometheus-readable output and hands the result back to uwsgi
  8. uwsgi sends the final response back to Prometheus
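The steps above can be sketched roughly as follows. This is an illustration of the flow, not the code in transform.py: `fetch_json` and `handle_scrape` are hypothetical names, and the proxy address is assumed from the "Starting to serve on 127.0.0.1:8080" line in the container logs quoted in the issues further down.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Assumed address of the local proxy inside the container; the path is the
# metrics.k8s.io pods endpoint shown in the sample JSON of this README.
PROXY_URL = "http://127.0.0.1:8080/apis/metrics.k8s.io/v1beta1/pods"


def fetch_json(url: str = PROXY_URL) -> dict:
    """Steps 2-5: request the JSON metrics through the local proxy."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def handle_scrape(fetch: Callable[[], dict], transform: Callable[[dict], str]) -> tuple:
    """Steps 6-8: transform the JSON and build the parts of the HTTP response."""
    body = transform(fetch())
    # Content type of the Prometheus text format.
    headers = {"Content-Type": "text/plain; version=0.0.4"}
    return 200, headers, body
```

Passing `fetch` and `transform` as parameters keeps the flow testable without a cluster, for example `handle_scrape(lambda: {"items": []}, lambda doc: "")`.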

Usage

Run metrics-server-prom

Simply run the Docker image with a kube config mounted into /etc/kube/config.

$ docker run -d \
    -p 9100:9100 \
    -v ${HOME}/.kube/config:/etc/kube/config:ro \
    cytopia/metrics-server-prom

If your kube config contains multiple contexts, you can tell metrics-server-prom which context to use to connect to the cluster.

$ docker run -d \
    -p 9100:9100 \
    -v ${HOME}/.kube/config:/etc/kube/config:ro \
    -e KUBE_CONTEXT=my-context \
    cytopia/metrics-server-prom

Configure Prometheus

prometheus.yml:

scrape_configs:
  - job_name: 'kubernetes'
    scrape_interval: '15s'
    metrics_path: '/metrics'
    static_configs:
      - targets:
        - <DOCKER_IP_ADDRESS>:9100

Docker Compose

To test this out locally, this repository ships an example Docker Compose setup with metrics-server-prom and Prometheus.

  1. Navigate to example/
  2. Copy env-example to .env
  3. Adjust KUBE_CONTEXT in .env
  4. Run docker-compose up

Metrics transformation

metrics-server provides metrics in the following format:

{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/pods"
  },
  "items": [
    {
      "metadata": {
        "name": "etcd-server-events-abc",
        "namespace": "kube-system",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods/etcd-server-events-ip-10-30-78-99.eu-central-1.compute.internal",
        "creationTimestamp": "2018-08-20T03:19:05Z"
      },
      "timestamp": "2018-08-20T03:19:00Z",
      "window": "1m0s",
      "containers": [
        {
          "name": "etcd-container",
          "usage": {
            "cpu": "7m",
            "memory": "125448Ki"
          }
        }
      ]
    },

  ]
}

metrics-server-prom transforms it to the following format:

Note: Additional metadata (node and ip) have been added.

# HELP kube_metrics_server_pod_cpu The CPU time of a pod in seconds.
# TYPE kube_metrics_server_pod_cpu gauge
kube_metrics_server_pod_cpu{node="ip-10-30-78-99.eu-central-1.compute.internal",pod="etcd-server-events-abc",ip="10.30.62.138",container="etcd-container",namespace="kube-system"} 420
# HELP kube_metrics_server_pod_mem The memory of a pod in KiloBytes.
# TYPE kube_metrics_server_pod_mem gauge
kube_metrics_server_pod_mem{node="ip-10-30-78-99.eu-central-1.compute.internal",pod="etcd-server-events-abc",ip="10.30.62.138",container="etcd-container",namespace="kube-system"} 128475136
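As a rough sketch of the first half of this transformation, the nested JSON can be flattened into one record per container. The function name is hypothetical and the code is not taken from transform.py. The node and ip labels are left out here because they come from the additional kubectl calls, not from the metrics-server JSON, and the usage values are kept as raw strings.

```python
import json

SAMPLE = """
{
  "kind": "PodMetricsList",
  "items": [
    {
      "metadata": {"name": "etcd-server-events-abc", "namespace": "kube-system"},
      "containers": [
        {"name": "etcd-container", "usage": {"cpu": "7m", "memory": "125448Ki"}}
      ]
    }
  ]
}
"""


def flatten_pod_metrics(doc: dict) -> list:
    """Return one record per container with its labels and raw usage strings."""
    records = []
    for item in doc.get("items", []):
        meta = item.get("metadata", {})
        for container in item.get("containers", []):
            usage = container.get("usage", {})
            records.append({
                "labels": {
                    "pod": meta.get("name", ""),
                    "container": container.get("name", ""),
                    "namespace": meta.get("namespace", ""),
                },
                "cpu": usage.get("cpu"),        # still a quantity string, e.g. "7m"
                "memory": usage.get("memory"),  # still a quantity string, e.g. "125448Ki"
            })
    return records


if __name__ == "__main__":
    for record in flatten_pod_metrics(json.loads(SAMPLE)):
        print(record)
```

The quantity strings still have to be converted to plain numbers before they can appear as sample values in the Prometheus output.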

License

MIT License

Copyright (c) 2018 cytopia

metrics-server-prom's People

Contributors

cytopia, czerasz-flaconi

metrics-server-prom's Issues

Errors: KeyError: '<pod_name>' and AttributeError: 'NoneType' object has no attribute 'group'

I set this up in my Kubernetes cluster:

[init] Checking for kube config ...OK
[init] KUBE_CONTEXT not set, relying on context set in /etc/kube/config
[init] Checking current kube context ...OK
[init] Current context: aws
[init] Checking cluster info ...OK
2019-03-01 01:45:18,634 CRIT Set uid to user 0
2019-03-01 01:45:18,636 INFO supervisord started with pid 1
2019-03-01 01:45:19,638 INFO spawned: 'transformer' with pid 32
2019-03-01 01:45:19,640 INFO spawned: 'kubectl-proxy' with pid 33
[uWSGI] getting INI configuration from uwsgi.ini
*** Starting uWSGI 2.0.18 (64bit) on [Fri Mar 1 01:45:19 2019] ***
compiled with version: 6.3.0 20170516 on 01 March 2019 01:10:52
os: Linux-4.14.88-88.76.amzn2.x86_64 #1 SMP Mon Jan 7 18:43:26 UTC 2019
nodename: metrics-server-2-prom-755b964746-p6rhc
machine: x86_64
clock source: unix
detected number of CPU cores: 2
current working directory: /home/prometheus/transform
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
your memory page size is 4096 bytes
detected max file descriptor number: 65536
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 0.0.0.0:9100 fd 3
Starting to serve on 127.0.0.1:8080
Python version: 3.7.2 (default, Feb 6 2019, 03:40:10) [GCC 6.3.0 20170516]
Set PythonHome to /home/prometheus/transform/transformenv
*** Python threads support is disabled. You can enable it with --enable-threads ***
Python main interpreter initialized at 0x562b2e793850
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 218712 bytes (213 KB) for 2 cores
*** Operational MODE: preforking ***
WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x562b2e793850 pid: 32 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 32)
spawned uWSGI worker 1 (pid: 39, cores: 1)
spawned uWSGI worker 2 (pid: 40, cores: 1)
2019-03-01 01:45:21,536 INFO success: transformer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-03-01 01:45:21,536 INFO success: kubectl-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[pid: 40|app: 0|req: 1/1] 10.10.113.116 () {24 vars in 287 bytes} [Fri Mar 1 02:02:56 2019] GET / => generated 297 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[2019-03-01 02:17:36,085] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "./transform.py", line 278, in metrics
'pods': trans_pod_metrics(json['pods'])
File "./transform.py", line 199, in trans_pod_metrics
more[lbl['pod']]['node'],
KeyError: 'kube-proxy-9jqrt'
[pid: 40|app: 0|req: 2/2] 10.10.115.20 () {28 vars in 466 bytes} [Fri Mar 1 02:17:35 2019] GET /metrics => generated 291 bytes in 329 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
[2019-03-01 02:17:40,894] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "./transform.py", line 278, in metrics
'pods': trans_pod_metrics(json['pods'])
File "./transform.py", line 199, in trans_pod_metrics
more[lbl['pod']]['node'],
KeyError: 'ingress-nginx-ingress-default-backend-6679dd498c-s7cxc'
[pid: 39|app: 0|req: 1/3] 10.10.115.20 () {28 vars in 466 bytes} [Fri Mar 1 02:17:40 2019] GET /metrics => generated 291 bytes in 138 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
[pid: 40|app: 0|req: 3/4] 10.10.115.20 () {28 vars in 453 bytes} [Fri Mar 1 02:17:41 2019] GET / => generated 297 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[pid: 40|app: 0|req: 4/5] 10.10.115.20 () {28 vars in 491 bytes} [Fri Mar 1 02:17:42 2019] GET /apis/k8s.metrics.io => generated 233 bytes in 2 msecs (HTTP/1.1 404) 2 headers in 72 bytes (1 switches on core 0)
[2019-03-01 02:17:45,886] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "./transform.py", line 278, in metrics
'pods': trans_pod_metrics(json['pods'])
File "./transform.py", line 199, in trans_pod_metrics
more[lbl['pod']]['node'],
KeyError: 'prometheus-prometheus-node-exporter-978th'

Also this:

2019-03-01 03:07:15,396 INFO success: transformer entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2019-03-01 03:07:15,396 INFO success: kubectl-proxy entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
[pid: 72|app: 0|req: 1/1] 172.17.0.1 () {24 vars in 255 bytes} [Fri Mar 1 03:07:27 2019] GET / => generated 297 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 80 bytes (1 switches on core 0)
[2019-03-01 03:07:37,133] ERROR in app: Exception on /metrics [GET]
Traceback (most recent call last):
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1815, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1718, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/_compat.py", line 35, in reraise
raise value
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1813, in full_dispatch_request
rv = self.dispatch_request()
File "/home/prometheus/transform/transformenv/lib/python3.7/site-packages/flask/app.py", line 1799, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "./transform.py", line 278, in metrics
'pods': trans_pod_metrics(json['pods'])
File "./transform.py", line 174, in trans_pod_metrics
more = get_pod_metrics_from_cli()
File "./transform.py", line 238, in get_pod_metrics_from_cli
'ns': line.group(1),
AttributeError: 'NoneType' object has no attribute 'group'
[pid: 72|app: 0|req: 2/2] 172.17.0.1 () {24 vars in 269 bytes} [Fri Mar 1 03:07:33 2019] GET /metrics => generated 291 bytes in 3899 msecs (HTTP/1.1 500) 2 headers in 84 bytes (1 switches on core 0)
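Both tracebacks point at lookups that assume data is always present: a regex match that can be None, and a pod name that can be missing from the metadata dictionary. The sketch below shows one defensive way to handle both cases. It is not the code from transform.py: the line pattern, the column layout of the kubectl output, and the shape of the `more` dictionary (pod name mapped to a dict with 'ns', 'ip' and 'node') are assumptions made for illustration.

```python
import re

# Hypothetical pattern for lines of the form
# "<namespace> <pod> <ready> <status> <restarts> <age> <ip> <node> ..."
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\S+\s+\S+\s+\S+\s+\S+\s+(\S+)\s+(\S+)")


def parse_pod_lines(text: str) -> dict:
    """Build {pod: {'ns', 'ip', 'node'}}, skipping lines that do not match."""
    pods = {}
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if match is None:
            # Blank lines, warnings or unexpected layouts: skip instead of
            # calling .group() on None.
            continue
        if match.group(1) == "NAMESPACE":
            continue  # header row
        pods[match.group(2)] = {
            "ns": match.group(1),
            "ip": match.group(3),
            "node": match.group(4),
        }
    return pods


def pod_extra(more: dict, pod: str, default: str = "") -> dict:
    """Look up node and ip for a pod, falling back to a default when missing."""
    extra = more.get(pod) or {}
    return {"node": extra.get("node", default), "ip": extra.get("ip", default)}
```

With guards like these, a pod that metrics-server reports but the metadata lookup does not know would get empty node and ip labels, and the scrape would not fail with HTTP 500.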

Prometheus Scraping Error

Hi,

While Prometheus scrapes the metrics page of the cytopia container, the following messages are logged by Prometheus. Could you please help fix this?

level=warn ts=2020-01-16T16:33:52.682948027Z caller=scrape.go:835 component="scrape manager" scrape_pool=k8s_usage target=http://localhost:9100/metrics msg="append failed" err="strconv.ParseFloat: parsing "5761140264n": invalid syntax"
level=warn ts=2020-01-16T16:34:02.668555018Z caller=scrape.go:835 component="scrape manager" scrape_pool=k8s_usage target=http://localhost:9100/metrics msg="append failed" err="strconv.ParseFloat: parsing "226202814n": invalid syntax"
level=warn ts=2020-01-16T16:34:12.647209977Z caller=scrape.go:835 component="scrape manager" scrape_pool=k8s_usage target=http://localhost:9100/metrics msg="append failed" err="strconv.ParseFloat: parsing "226202814n": invalid syntax"
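The values in these messages end in "n", which is the Kubernetes quantity suffix for nano units. Prometheus only accepts plain floating-point sample values, so a quantity string that reaches the output unconverted fails to parse. The following is a minimal sketch of converting such strings to numbers. `parse_quantity` is a hypothetical helper, not part of transform.py, and the suffix table covers only the common cases.

```python
from decimal import Decimal

# Common Kubernetes quantity suffixes (not exhaustive).
SUFFIXES = {
    "n": Decimal("1e-9"),   # nano
    "u": Decimal("1e-6"),   # micro
    "m": Decimal("1e-3"),   # milli
    "k": Decimal(1000),
    "M": Decimal(1000) ** 2,
    "G": Decimal(1000) ** 3,
    "T": Decimal(1000) ** 4,
    "Ki": Decimal(1024),
    "Mi": Decimal(1024) ** 2,
    "Gi": Decimal(1024) ** 3,
    "Ti": Decimal(1024) ** 4,
}


def parse_quantity(text: str) -> float:
    """Convert a quantity such as '7m', '5761140264n' or '125448Ki' to a float."""
    text = text.strip()
    # Try two-letter suffixes before one-letter ones so 'Mi' is not read as 'M'.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return float(Decimal(text[: -len(suffix)]) * SUFFIXES[suffix])
    return float(Decimal(text))


if __name__ == "__main__":
    for raw in ("5761140264n", "226202814n", "7m", "125448Ki"):
        print(raw, "->", parse_quantity(raw))
```

With this conversion, CPU values come out in cores and memory values in bytes, regardless of which suffix metrics-server chooses to report.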
