cole's Introduction

Cole

I see dead people

Cole is a dead man switch listener. In Prometheus it is common to create a dead man switch alert that fires constantly in order to test your entire alerting pipeline. A question that comes up often is what is watching those dead man switch alerts. Who watches the watchers, effectively.

This is a basic implementation of something that watches for those dead man switch alerts and sends an alert itself if it does not receive a notification from the dead man switch within the assigned time interval.

Status

This project is in its very early stages and should not be used in production yet. It is still a work in progress (WIP): it does work, but some planned features still need to be added, and things like configuration are still evolving.

How does it work

Cole listens for HTTP requests from Prometheus Alertmanager carrying the dead man switch alert. When a message is received, a timer is started for the specified duration. If no further message arrives from the dead man switch alert within that duration, Cole fires off an alert of its own.

There is a forthcoming blog post on jpweber.io on how to leverage a dead man switch alert in your Prometheus monitoring and how something like Cole fits in, which will provide more detail on the thinking behind a tool like this.

Supported alert integrations

  • Slack
  • PagerDuty
  • MsTeams
  • Generic Webhook

How to use

  1. Start the cole server by any of the means defined below (bare binary, Docker, etc.).

  2. For each DeadManSwitch that you want to check in, you must generate an ID for that alert. Perform an HTTP GET request to /id on the cole server, for example curl http://yourcoleaddress/id. This returns the following JSON payload. The timerid will be part of the URL you hit to check in.

    {
        "timerid":"bg8obqel0s1fdr02gtvg"
    }
  3. Create a receiver in your Alertmanager config that calls a webhook when it receives a DeadManSwitch alert. The wait, group, and repeat intervals may need to be changed based on your needs.

    global:
      ...
    route:
      ...
      routes:
      - match:
          alertname: DeadMansSwitch
        receiver: 'cole'
        group_wait: 0s
        group_interval: 1m
        repeat_interval: 50s
    receivers:
    - name: 'cole'
      webhook_configs:
      - url: 'http://192.168.2.66:8080/ping/bg8obqel0s1fdr02gtvg'
        send_resolved: false

Configuration

Example using configuration file

# Example Cole configuration file

# Slack
# SenderType = "slack"
# Interval = 10
# HTTPEndpoint = "https://hooks.slack.com/services/..."
# HTTPMethod = "POST"
# SlackChannel = "#general"
# SlackUsername = "Cole - DeadManSwitch Monitor"
# SlackIcon = ":monkey_face:"


# PagerDuty
SenderType = "pagerduty"
Interval = 10
PDAPIKey = "noiD8-khbpNpgAAAAAAAAAA"
PDIntegrationKey = "5353fb993888441811111111111"

# MS Teams
# SenderType = "teams"
# Interval = 10
# HTTPEndpoint = "https://hooks.teams.com/services/..."

Flags supported as ENV Vars

  • SENDER_TYPE
  • INTERVAL
  • HTTP_ENDPOINT
  • HTTP_METHOD
  • EMAIL_ADDR
  • PD_KEY
  • SLACK_CHANNEL
  • SLACK_USERNAME
  • SLACK_ICON
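
As a rough illustration of how these variables might map onto a configuration value, here is a small Go sketch. The `Config` struct, its field names, and the `LoadConfig` function are assumptions made for this example and are not taken from the Cole source. The only validation shown is that `INTERVAL`, when set, must be an integer.

```go
package main

import (
	"fmt"
	"strconv"
)

// Config is a hypothetical container for the environment variables listed above.
type Config struct {
	SenderType    string
	Interval      int
	HTTPEndpoint  string
	HTTPMethod    string
	EmailAddr     string
	PDKey         string
	SlackChannel  string
	SlackUsername string
	SlackIcon     string
}

// LoadConfig reads each variable through getenv, which makes the lookup
// easy to replace in tests. Hypothetical helper for illustration.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		SenderType:    getenv("SENDER_TYPE"),
		HTTPEndpoint:  getenv("HTTP_ENDPOINT"),
		HTTPMethod:    getenv("HTTP_METHOD"),
		EmailAddr:     getenv("EMAIL_ADDR"),
		PDKey:         getenv("PD_KEY"),
		SlackChannel:  getenv("SLACK_CHANNEL"),
		SlackUsername: getenv("SLACK_USERNAME"),
		SlackIcon:     getenv("SLACK_ICON"),
	}
	// INTERVAL is left at zero when unset; a non-numeric value is an error.
	if raw := getenv("INTERVAL"); raw != "" {
		n, err := strconv.Atoi(raw)
		if err != nil {
			return Config{}, fmt.Errorf("INTERVAL must be an integer: %w", err)
		}
		cfg.Interval = n
	}
	return cfg, nil
}

func main() {
	// A map stands in for the process environment in this sketch.
	env := map[string]string{
		"SENDER_TYPE":   "slack",
		"INTERVAL":      "10",
		"HTTP_ENDPOINT": "https://hooks.slack.com/services/...",
	}
	cfg, err := LoadConfig(func(key string) string { return env[key] })
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

To read the real process environment, pass `os.Getenv` as the lookup function.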

Example Prometheus Alert Manager config

Run it

With docker

docker run -d \
-e SENDER_TYPE="slack" \
-e INTERVAL="10" \
-e HTTP_ENDPOINT="https://hooks.slack.com/services/..." \
-p 8080:8080 \
cole:0.2.0

Bare binary

./cole

API Endpoints

  • POST - /ping/<timerid>
  • GET - /id
  • GET - /version

Build locally

  • clone the repo
  • dep ensure -v
  • go build

That is it.

cole's People

Contributors

chrisob, danukapraneeth, jpweber

cole's Issues

Discrepancy between latest code and latest release

I'm trying to customize alerts using this payload:

curl -X POST http://127.0.0.1:8080/ping/bricvuhnrtbavrirdvg-staging -d '{"commonAnnotations":{"message":"master-2.ocp.example.com"}}'

This works fine with the binary compiled locally; the message in Slack looks like this:

Missed DeadManSwitch Alert  - master-2.ocp.example.com

But if I use the production server instead of localhost, the result is always the same:

Missed DeadManSwitch Alert  -

The release should be identical to the local build, since there have been no commits since it was published.

Any idea?

docker build fails

$ git clone https://github.com/jpweber/cole.git
$ cd cole
$ docker build -t cole .
---------- SNIP --------------
Step 6/12 : RUN CGO_ENABLED=0 go build -a -ldflags '-s' -installsuffix cgo -o app .
 ---> Running in 92fe8e2f38e9
# github.com/jpweber/cole
./main.go:126:2: invalid character U+0023 '#'
./main.go:127:7: no new variables on left side of :=
The command '/bin/sh -c CGO_ENABLED=0 go build -a -ldflags '-s' -installsuffix cgo -o app .' returned a non-zero code: 2

COLE - Kubernetes Deployment - TEAMS

Hi,

I have Alertmanager sending an alert to Cole via a webhook, and Cole then sends its own alert, but Teams never receives it.
The Teams webhook has been tested and is accessible.

Cole Logs:

time="2023-03-02T14:05:29Z" level=info msg="Starting application..."
time="2023-03-02T14:05:29Z" level=info msg="Using ENV Vars for configuration"
time="2023-03-02T14:06:07Z" level=info msg="timerID: cg0an55eo2h3hje7o380"
time="2023-03-02T14:06:07Z" level=info msg="POST - /ping/cg0an55eo2h3hje7o380"
time="2023-03-02T14:07:07Z" level=info msg="timerID: cg0an55eo2h3hje7o380"
time="2023-03-02T14:07:07Z" level=info msg="POST - /ping/cg0an55eo2h3hje7o380"
time="2023-03-02T14:08:07Z" level=info msg="Sending Alert. Missed deadman switch notification."
time="2023-03-02T14:08:07Z" level=info msg="timerID: cg0an55eo2h3hje7o380"
time="2023-03-02T14:08:07Z" level=info msg="POST - /ping/cg0an55eo2h3hje7o380"
time="2023-03-02T14:09:07Z" level=info msg="timerID: cg0an55eo2h3hje7o380"
time="2023-03-02T14:09:07Z" level=info msg="POST - /ping/cg0an55eo2h3hje7o380"
time="2023-03-02T14:10:07Z" level=info msg="Sending Alert. Missed deadman switch notification."
time="2023-03-02T14:10:07Z" level=info msg="timerID: cg0an55eo2h3hje7o380"
time="2023-03-02T14:10:07Z" level=info msg="POST - /ping/cg0an55eo2h3hje7o380"
time="2023-03-02T14:11:07Z" level=info msg="Sending Alert. Missed deadman switch notification."

Cole Config:

image: jpweber/cole:latest

env:
- name: SENDER_TYPE
  value: teams
- name: INTERVAL
  value: '60'
- name: HTTP_METHOD
  value: post
- name: HTTP_ENDPOINT
  value: >-
    'https://teams.webhook'

Am I missing something here?
