prometheus-workshop

https://docs.google.com/presentation/d/1nfosNvuHAihk2Pe-U9RA6vEA89vnBre9h60GwYlJ_44/edit?usp=sharing

Resources

Pre-requisites

Milestone 1: Installing Prometheus

docker run -d -p 9090:9090 prom/prometheus

Lab

  • Visit http://localhost:9090
  • See all the metrics scraped by Prometheus about itself.

Milestone 2: Tour De Prometheus

Lab

  • Explore each of the pages and tabs in the Prometheus UI.
  • Go through the Prometheus configuration.
  • Go through the Targets page.

Milestone 3: First taste of querying

Lab

  • Run a query to check whether Prometheus is up and running.
  • Run a query to find out how many samples Prometheus has ingested. Hint: tsdb_head_samples_ap...
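
The same queries can also be sent to the Prometheus HTTP API instead of being typed into the UI. The following is a minimal Python sketch, assuming Prometheus is reachable at http://localhost:9090 as in Milestone 1. The helper names `build_query_url` and `run_query` are made up for illustration and are not part of the workshop.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API (/api/v1/query)."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def run_query(base_url, promql):
    """Send the query and return the decoded JSON response.

    Needs a running Prometheus server; base_url is an assumption.
    """
    with urllib.request.urlopen(build_query_url(base_url, promql)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only builds the URL; call run_query(...) once Prometheus is running.
    print(build_query_url("http://localhost:9090", "up"))
```

Building the URL separately from sending it makes it easy to paste the result into a browser or curl while experimenting.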

Milestone 4: Instrumenting an HTTP service

Updated Prometheus Configuration

global:
  scrape_interval: 15s

scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape the service running on localhost:8080
  - job_name: 'api-service'
    static_configs:
      - targets: ['localhost:8080'] # On Gitpod, use the Gitpod host here and add `scheme: https`

Running Prometheus with updated configuration

docker run -d \
  -p 9090:9090 \
  -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml

Running HTTP Service

docker run -d -p 8080:8080 pierrevincent/prom-http-simulator:0.1

Lab

  • See all metrics exposed by the HTTP service.
  • Verify that Prometheus can scrape metrics from HTTP service.
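
The service exposes its metrics in the Prometheus text format, one sample per line, with `# HELP` and `# TYPE` comment lines in between. As a rough illustration of how to read that output, here is a small Python sketch. The sample text and the `parse_metrics` helper are hypothetical, and the parser assumes no trailing timestamps on sample lines.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    Simplified: skips comment lines and assumes each sample line is
    '<name>{<labels>} <value>' without an optional timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical excerpt, shaped like what a /metrics endpoint returns.
EXAMPLE = """\
# HELP http_requests_total Total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{endpoint="/login",status="200"} 1027
http_requests_total{endpoint="/login",status="500"} 3
"""

if __name__ == "__main__":
    for series, value in parse_metrics(EXAMPLE).items():
        print(series, value)
```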

Milestone 5: Monitoring an application

Lab

  • See all metrics and corresponding labels the HTTP service emits.
  • Check that the service is being scraped: up{job="api-service"}
  • Let's look at the number of requests that this service has received since it started up.
  • All requests with status 200 and endpoint /login.
  • First taste of aggregation: sum all requests with status 200 and endpoint /login.
  • How do we get data for a range instead of a single timestamp? Range vector vs. instant vector.
  • Rate of change in the number of requests.
  • Bytes allocated by the Go runtime: go_memstats_alloc_bytes
  • What's the overall request rate (with a 1 minute rolling window)?
  • How many requests per minute are errors?
  • What's the error rate (in %) of requests to the /users endpoint?
  • Top-requested endpoints by status code
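
Several of the questions above come down to the same arithmetic: take the per-second increase of a counter over a window, then divide the error rate by the total rate. The Python sketch below works through that arithmetic on made-up numbers. It is a simplification: the real `rate()` function in Prometheus also handles counter resets and extrapolates to the window boundaries, which is left out here. The function names are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase between the first and last (timestamp, value) samples.

    Simplified stand-in for PromQL rate(): no counter-reset handling and
    no extrapolation to the window edges.
    """
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


def error_percent(error_samples, total_samples):
    """Share of requests that were errors over the window, in percent."""
    return 100.0 * simple_rate(error_samples) / simple_rate(total_samples)


if __name__ == "__main__":
    # Made-up counter readings over a 60 second window.
    total = [(0, 1000), (60, 1600)]   # 600 requests in 60s
    errors = [(0, 10), (60, 40)]      # 30 errors in 60s
    print(simple_rate(total))             # requests per second
    print(error_percent(errors, total))   # error rate in percent
```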

Milestone 6: Alerting

# alert-rules.yml
groups:
  - name: http_health
    rules:
      - alert: HttpSimulatorNotRunning
        expr: absent(up{job="api-service"}) == 1
        for: 1m
        labels:
          severity: major

# prometheus.yml
scrape_configs:
  # Scrape Prometheus itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  # Scrape the service running on localhost:8080
  - job_name: 'api-service'
    static_configs:
      - targets: ['localhost:8080']
    scheme: https # Only needed on Gitpod, together with the Gitpod host as the target

rule_files:
  - "/etc/prometheus/alert-rules.yml"

docker run -d \
  -p 9090:9090 \
  -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
  -v $(pwd)/alert-rules.yml:/etc/prometheus/alert-rules.yml \
  prom/prometheus --web.enable-lifecycle --config.file=/etc/prometheus/prometheus.yml

Adding one more alert

groups:
    - name: http_health
      rules:
      - alert: HttpSimulatorNotRunning
        expr: absent(up{job="api-service"}) == 1
        for: 1m
        labels:
          severity: major
      - alert: ErrorRateHigh
        expr: sum(rate(http_requests_total{job="api-service", status="500"}[5m])) / sum(rate(http_requests_total{job="api-service"}[5m])) > 0.001
        for: 1m
        labels:
          severity: major
        annotations:
          summary: Error rate is high
          description: More than 0.1% of requests to api-service returned status 500 over the last 5 minutes
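
The `for: 1m` clause means an alert does not fire the moment its expression becomes true: it first sits in the pending state and only fires once the expression has stayed true for the whole duration. The Python sketch below is a simplified model of that behaviour, not Prometheus code. The function name and the 15 second evaluation interval are assumptions chosen for illustration.

```python
def alert_states(results, interval_s, for_s):
    """Model alert state per evaluation given a list of expression results.

    results: list of booleans, one per rule evaluation (True = expr matched).
    interval_s: seconds between evaluations (assumed value, see lead-in).
    for_s: the rule's `for` duration in seconds.
    """
    states = []
    active_since = None
    for i, is_true in enumerate(results):
        now = i * interval_s
        if not is_true:
            # Expression stopped matching: the alert resets.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        states.append("firing" if now - active_since >= for_s else "pending")
    return states


if __name__ == "__main__":
    results = [False, True, True, True, True, True, False]
    for state in alert_states(results, interval_s=15, for_s=60):
        print(state)
```

A single false evaluation resets the timer, which is why short blips do not page anyone when `for` is set.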

Reload the Prometheus configuration with the following command:

curl -X POST http://localhost:9090/-/reload
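
The reload can also be triggered from a script. This is a minimal Python sketch of the same POST request, assuming Prometheus listens on http://localhost:9090 and was started with `--web.enable-lifecycle` as shown earlier. The helper names are hypothetical.

```python
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build the POST request for the Prometheus reload endpoint."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/-/reload", method="POST")


def reload_prometheus(base_url="http://localhost:9090"):
    """Send the reload request and return the HTTP status code.

    Needs a running Prometheus started with --web.enable-lifecycle.
    """
    with urllib.request.urlopen(build_reload_request(base_url)) as resp:
        return resp.status


if __name__ == "__main__":
    req = build_reload_request()
    print(req.get_method(), req.full_url)
```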

Lab

  • Set up the alert rules.
  • Verify that the alert rules are being evaluated.
  • Add one more alert rule and reload the configuration.
  • Watch the alert rules turn green and red.

Milestone 7: See with Grafana

docker run -d -p 3000:3000 grafana/grafana-oss

Lab

  • First visit to Grafana.
  • Adding a data source.
  • Create a dashboard.
  • Add all previous queries as panels.
  • Graph of latency distribution
  • Cumulative % graph of endpoint request rate
  • Memory usage over time
  • CPU usage over time
  • Graph % of requests fulfilling the SLO of 400ms for /login endpoint

Milestone 8: Advanced Topics: Remote Write to Long-Term Storage like Levitate

Contributors

prathamesh-sonpatki
