alerta / nagios-alerta
Forward nagios alerts to the alerta monitoring system
Home Page: http://alerta.io
License: MIT License
Hello
Issue Summary
When the Alerta NEB module is initialised, Livestatus messages and some errors are written to nagios.log under the "alerta" prefix.
Environment
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Livestatus messages should not be logged with "alerta" as the NEB module name.
The %s placeholders in the error messages should be filled in.
Screenshots
[1627446602] alerta: NEB callbacks for host and service checks successfully de-registered. Bye.
[1627446602] Event broker module '/usr/lib/nagios/alerta-neb.o' deinitialized successfully.
[1627446602] Event broker module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' deinitialized successfully.
[1627446602] alerta: deinitializing
[1627446602] alerta: Waiting for main to terminate...
[1627446602] alerta: Socket thread has terminated
[1627446602] alerta: error: %s
[1627446602] alerta: Waiting for client threads to terminate...
[1627446602] alerta: Logfile cache: flushing complete cache.
[1627446602] Event broker module '/usr/local/lib/mk-livestatus/livestatus128.o' deinitialized successfully.
[1627446602] Nagios 4.4.5 starting... (PID=258423)
[1627446602] Local time is Wed Jul 28 06:30:02 CEST 2021
[1627446602] LOG VERSION: 2.0
[1627446602] qh: Socket '/var/spool/nagios/cmd/nagios.qh' successfully initialized
[1627446602] qh: core query handler registered
[1627446602] qh: echo service query handler registered
[1627446602] qh: help for the query handler registered
[1627446602] wproc: Successfully registered manager as @wproc with query handler
[1627446602] wproc: Registry request: name=Core Worker 258425;pid=258425
.....
[1627446602] wproc: Registry request: name=Core Worker 258436;pid=258436
[1627446603] alerta: Initialising Nagios-Alerta Gateway module, v5.0.1
[1627446603] alerta: debug is off
[1627446603] alerta: states=Hard/Soft
[1627446603] alerta: Forward service checks, host checks and downtime to http://xxxxxxxxxxxxxxx:80/api
[1627446603] Event broker module '/usr/lib/nagios/alerta-neb.o' initialized successfully.
[1627446603] mod_gearman: initialized version 3.0.9 (libgearman 0.33)
[1627446603] Event broker module '/usr/lib64/mod_gearman/mod_gearman_nagios4.o' initialized successfully.
[1627446603] alerta: Livestatus %s by Mathias Kettner. Socket: '%s'
[1627446603] alerta: Please visit us at http://mathias-kettner.de/
[1627446603] alerta: Hint: please try out OMD - the Open Monitoring Distribution
[1627446603] alerta: Please visit OMD at http://omdistro.org
[1627446603] alerta: Finished initialization. Further log messages go to %s
[1627446603] Event broker module '/usr/local/lib/mk-livestatus/livestatus128.o' initialized successfully.
Without alerta-neb:
[1627457291] livestatus: Livestatus 1.2.8p25 by Mathias Kettner. Socket: '/usr/local/nagios/var/rw/live'
[1627457291] livestatus: Please visit us at http://mathias-kettner.de/
[1627457291] livestatus: Hint: please try out OMD - the Open Monitoring Distribution
[1627457291] livestatus: Please visit OMD at http://omdistro.org
[1627457291] livestatus: Finished initialization. Further log messages go to /u01/app/nagios/var/log/livestatus.log
Additional context
Used together with mod_gearman and Livestatus (1.2.8).
Thanks
Hi,
Last week we installed it on a check-mk server. The integration works fine, but every 24 hours the service goes down. Looking at the graphs, we saw that the application's committed memory grows progressively and is never freed.
Can you help us?
Hi,
Unfortunately Icinga2 dropped neb support for 2.x. I'm running around 5-6 Icinga2 boxes and need to pipe alerts to Alerta like I'm doing with Prometheus.
When an Icinga2 notification fires, it triggers a Python script which does some magic and sends info off to our ticketing system. Would it be possible to do the same with Icinga2, possibly to a webhook? I notice there's no generic webhook, though, so I'm not sure if that's possible? Or am I going to have to install the alerta CLI on each machine and interact that way?
Icinga is a Nagios clone, so it would be nice if Alerta could support it :-)
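One possible approach, sketched here under assumptions rather than as a supported integration: the notification script that Icinga2 already runs could POST the alert straight to the Alerta API. The payload fields below are the ones visible in the NEB module's log output elsewhere on this page. The endpoint URL, the API key, the severity mapping, and the `icinga2ServiceAlert` type name are all hypothetical placeholders.

```python
import json
import urllib.request

# Hypothetical values: replace with your own Alerta endpoint and API key.
ALERTA_URL = "http://alerta.example.com/api/alert"
API_KEY = "demo-key"

# Assumed mapping from Icinga2/Nagios service states to Alerta severities.
SEVERITY = {
    "OK": "normal",
    "WARNING": "warning",
    "CRITICAL": "critical",
    "UNKNOWN": "unknown",
}


def build_alert(host, service, state, output, environment="Production"):
    """Build an alert body using the field names seen in the NEB module's logs."""
    return {
        "origin": "icinga2/" + host,
        "resource": host,
        "event": service,
        "group": "Icinga2",
        "severity": SEVERITY.get(state, "unknown"),
        "environment": environment,
        "service": ["Platform"],
        "text": output,
        "type": "icinga2ServiceAlert",  # hypothetical type name
    }


def send_alert(alert):
    """POST the alert as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        ALERTA_URL,
        data=json.dumps(alert).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Key " + API_KEY,
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A notification command would call `send_alert(build_alert(host, service, state, output))` with values taken from the Icinga2 runtime macros, so nothing beyond Python needs to be installed on each box.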
[1440722291] [alerta] Heartbeat service check OK.
[1440722291] { "origin": "nagios/localhost", "type": "Heartbeat", "tags": ["3.2.0"] }
[1440722291] [alerta] HTTP response status=201
[1440722301] [alerta] Service check received.
[1440722301] {"origin":"nagios/vagrant-ubuntu-trusty-64","resource":"localhost","event":"Current Users","group":"Nagios","severity":"normal","environment":"Production","service":["Platform"],"tags":["check=Active"],"text":"USERS OK - 2 users currently logged in","value":"1/4 (Hard)","type":"nagiosServiceAlert","rawData":"users=2;20;50;0"}
[1440722301] [alerta] HTTP response status=201
[1440722341] [alerta] Service check received.
[1440722341] {"origin":"nagios/vagrant-ubuntu-trusty-64","resource":"localhost","event":"Disk Space","group":"Nagios","severity":"critical","environment":"Production","service":["Platform"],"tags":["check=Active"],"text":"DISK CRITICAL - free space: /vagrant 22855 MB (4% inode=100%):","value":"4/4 (Hard)","type":"nagiosServiceAlert","rawData":"/=2036MB;32227;36255;0;40284 /sys/fs/cgroup=0MB;0;0;0;0 /dev=0MB;392;441;0;491 /run=0MB;79;89;0;99 /run/lock=0MB;4;4;0;5 /run/shm=0MB;396;446;0;496 /run/user=0MB;80;90;0;100 /vagrant=453426MB;381025;428653;0;476282"}
[1440722341] [alerta] HTTP response status=201
[1440722351] [alerta] Heartbeat service check OK.
[1440722351] { "origin": "nagios/localhost", "type": "Heartbeat", "tags": ["3.2.0"] }
[1440722351] [alerta] HTTP response status=201
[1440748786] Warning: A system time change of 0d 7h 20m 33s (forwards in time) has been detected. Compensating...
[1440748786] HOST DOWNTIME ALERT: localhost;STARTED;No comment
[1440748786] Caught SIGSEGV, shutting down...
Issue Summary
Nagios with nagios-alerta crashes after reload
Environment
Steps:
make nagios4 && make install
broker_module=/usr/lib/nagios/alerta-neb.o https://alerta.server env=IT key=ALERTA_KEY hard_only=1
systemctl start nagios
Everything works fine. Alerta shows alerts from Nagios.
systemctl reload nagios
Nagios always crashes.
nagios.log with debug on:
[1678370296] alerta: Initialising Nagios-Alerta Gateway module, v5.0.1
[1678370296] alerta: debug is on
[1678370296] alerta: states=Hard/Soft
[1678370296] alerta: Forward service checks, host checks and downtime to https://alerta.xyz/api
[1678370296] Event broker module '/usr/lib/nagios/alerta-neb.o' initialized successfully.
....
[1678370296] alerta: Downtime started.
[1678370296] alerta: {"origin": "nagios/nagios.server", "resource": "xxxx", "event": ......}
[1678370296] alerta: [curl] About to connect() to alerta.xyz port 443 (#0)
[1678370296] alerta: [curl] Trying 10.x.x.y...
[1678370296] alerta: [curl] Connected to alerta.xyz (10.x.x.y) port 443 (#0)
[1678370296] alerta: [curl] Initializing NSS with certpath: sql:/etc/pki/nssdb
[1678370296] Caught SIGSEGV, shutting down...
[1678370296] Caught SIGTERM, shutting down...
Hello Nick, thank you for your work on Alerta.
The Heartbeat timeout seems to default to 24 hours.
It would be handy to have a way to set the heartbeat timeout to something like 5 or 10 minutes.
e.g.
broker_module=/usr/lib/nagios/alerta-neb.o http://alertahost:8080/api heartbeat_timeout=300
I searched the code a bit and I didn't see anything in the documentation. Sorry if this is already a feature or I've missed something.
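To illustrate the request, here is a minimal sketch of a heartbeat body that carries an explicit timeout. It reuses the heartbeat fields shown in the module's log output and adds a `timeout` field in seconds. That the Alerta server honours this field is an assumption to verify against the API documentation, and `build_heartbeat` is a hypothetical helper, not part of the module.

```python
import json


def build_heartbeat(origin, version, timeout_seconds=None):
    """Build a heartbeat body; 'timeout' is only added when a value is given."""
    heartbeat = {"origin": origin, "type": "Heartbeat", "tags": [version]}
    if timeout_seconds is not None:
        # Assumed field name for the per-heartbeat timeout, in seconds.
        heartbeat["timeout"] = int(timeout_seconds)
    return heartbeat


# A heartbeat_timeout=300 option, as proposed above, would map to this body.
body = json.dumps(build_heartbeat("nagios/localhost", "3.2.0", 300))
```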
-Ryan
Is there a way to filter out SOFT state checks, so only HARD states are shown in Alerta?
If not, this would be a great feature.
Hi All,
I have installed Alerta and Nagios on different servers.
On the first server (Alerta) I followed the steps in the official documentation for Ubuntu 16 (http://docs.alerta.io/en/latest/tutorials.html#tutorials). It runs, and I can send alarms from the command line on localhost.
On the second server I installed Nagios Core 4 and nagios-alerta from this repo.
But I can't send any alarms.
I tried several configurations, but I get different errors.
Where am I going wrong?
Thx.
With the nagios4/nagios3 build, the log shows:
[1540218035] Error: Could not load module '/usr/lib/nagios/alerta-neb.o' -> /usr/lib/nagios/alerta-neb.o: undefined symbol: write_to_all_logs
[1540218035] Error: Failed to load module '/usr/lib/nagios/alerta-neb.o'.
If the long_desc field is multiline, the HTTP request body is invalid, causing the Alerta API to return 400.
Example:
[1520768752] [alerta] Service check received.
[1520768752] {"origin":"nagios/op5","resource":"db103","event":"SAP HANA Services","group":"Nagios","severity":"normal","environment":"Production","service":["Platform"],"tags":["check=Active"],"text":"daemon:YES
nameserver:YES
preprocessor:YES
webdispatcher:YES
compileserver:YES
indexserver:YES
","value":"1/3 (Hard)","type":"nagiosServiceAlert","rawData":""}
[1520768752] [alerta] HTTP server error (status=400)
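The log above suggests that raw newlines from the plugin output end up inside the JSON string. The sketch below reproduces that failure mode and shows that a JSON encoder avoids it. It is written in Python purely for illustration and assumes the body is built by string formatting. It is not the module's code.

```python
import json

plugin_output = "daemon:YES\nnameserver:YES\npreprocessor:YES\n"

# Naive string interpolation leaves raw newlines inside the JSON string,
# which a strict JSON parser rejects as invalid control characters.
naive = '{"text":"%s"}' % plugin_output
try:
    json.loads(naive)
    naive_is_valid = True
except json.JSONDecodeError:
    naive_is_valid = False

# A JSON encoder escapes each newline as the two characters \n,
# so the body stays valid and the text survives a round trip.
encoded = json.dumps({"text": plugin_output})
round_trip = json.loads(encoded)["text"]
```

Here `naive_is_valid` ends up `False`, while `round_trip` equals the original multiline text.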
Hi,
How can I define a default User-Agent value? According to my HTTP logs it appears to be empty.
I'm currently trying to set up backend routing in Varnish (i.e. routing HTTP requests):
nagios -> haproxy -> varnish -> alerta
Thanks a lot for your help.
Oleksandr
I'm experiencing many TIME_WAIT connections with the Nagios NEB module. Nagios and Alerta are on two separate hosts. The Nagios host has thousands of TIME_WAIT connections but the Alerta host has none. Any ideas?
I am running Nagios v4.0.7 on my CentOS 5.10 x64 Linux machine. I have tried your module; it compiles and installs. Unfortunately, when I add it to Nagios and restart, Nagios fails to start and I get this error:
Aug 14 09:24:44 zira nagios: Error: Module '/usr/lib64/nagios/alerta-neb.o' is using an old or unspecified version of the event broker API. Module will be unloaded.
Aug 14 09:24:44 zira nagios: Event broker module '/usr/lib64/nagios/alerta-neb.o' deinitialized successfully.
Aug 14 09:24:44 zira nagios: Error: Failed to load module '/usr/lib64/nagios/alerta-neb.o'.
Aug 14 09:24:44 zira nagios: Error: Module loading failed. Aborting.
I suspect that this is because the alerta module was created for Nagios 3.x and not for Nagios 4.x. Is there any way this module can be modified to run on Nagios 4.x?
Thanks,
Issue Summary
My nagios process is crashing if I enable the event broker,
can you verify if nagios >= 4.4.3 should work?
Environment
OS: Linux
API version: Newest
Deployment: Docker
Database: Postgres
Hello
Issue Summary
When the Alerta NEB module is running, it logs every event (the curl response?) to /var/log/messages, even when not in debug mode.
/var/log/messages before: 10 MB, after: 700 MB!
Environment
OS: Linux
Nagios Alerta version: [5.0.1]
To Reproduce
Steps to reproduce the behavior:
stop/start nagios
look on /var/log/messages
See messages
Expected behavior
Messages should not go to /var/log/messages but to /var/log/alerta/alerta.log, according to a verbosity level.
Screenshots for one event
Jul 26 15:20:40 nagios: {
Jul 26 15:20:40 nagios: "alert": {
Jul 26 15:20:40 nagios: "attributes": {
Jul 26 15:20:40 nagios: "CI source": "xxxxxxxxxxxxxxxxxxxxxx.fr",
Jul 26 15:20:40 nagios: "Fiche_Consigne": "<a href=hxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx target="_blank">Nagios.pdf",
Jul 26 15:20:40 nagios: "Sonde_Nagios": "check_specifique_UMANIS_JOBQ_Ctrl_Alert_Lost",
Jul 26 15:20:40 nagios: "checkCommand": "check_specifique_UMANIS_JOBQ_Ctrl_Alert_Lost",
Jul 26 15:20:40 nagios: "commandLine": "/opt/nrpe/libexec/check_ACT400_liste.pl "ZZ2023CD" xxxxxxxxxxxxxxxxxxxxxxxxxx "OPT0001" "OPT0002""
Jul 26 15:20:40 nagios: },
Jul 26 15:20:40 nagios: "correlate": [],
Jul 26 15:20:40 nagios: "createTime": "2021-07-20T09:05:13.536Z",
Jul 26 15:20:40 nagios: "customer": "xxxxxxxxxxxxxxx",
Jul 26 15:20:40 nagios: "duplicateCount": 5,
Jul 26 15:20:40 nagios: "environment": "recette",
Jul 26 15:20:40 nagios: "event": "job-ctrl-alert-lost .job-ctrl-alert-lost",
Jul 26 15:20:40 nagios: "group": "Nagios",
Jul 26 15:20:40 nagios: "history": [
Jul 26 15:20:40 nagios: {
Jul 26 15:20:40 nagios: "event": "job-ctrl-alert-lost .job-ctrl-alert-lost",
Jul 26 15:20:40 nagios: "href": "/api/alert/883ab417-b890-49e7-a9bb-6bd9eb24606a",
Jul 26 15:20:40 nagios: "id": "883ab417-b890-49e7-a9bb-6bd9eb24606a",
Jul 26 15:20:40 nagios: "severity": "normal",
Jul 26 15:20:40 nagios: "status": "closed",
Jul 26 15:20:40 nagios: "text": "OK : OPT0001 - Aucune alerte non transmise \u00e0 OMi",
Jul 26 15:20:40 nagios: "timeout": 86400,
Jul 26 15:20:40 nagios: "type": "new",
Jul 26 15:20:40 nagios: "updateTime": "2021-07-20T09:05:13.536Z",
Jul 26 15:20:40 nagios: "user": "adm-randreet",
Jul 26 15:20:40 nagios: "value": "1/2 (Hard)"
Jul 26 15:20:40 nagios: }
Jul 26 15:20:40 nagios: ],
Jul 26 15:20:40 nagios: "href": "/api/alert/883ab417-b890-49e7-a9bb-6bd9eb24606a",
Jul 26 15:20:40 nagios: "id": "883ab417-b890-49e7-a9bb-6bd9eb24606a",
Jul 26 15:20:40 nagios: "lastReceiveId": "ddffe313-c72a-4eb9-ac01-a1f1798c7f59",
Jul 26 15:20:40 nagios: "lastReceiveTime": "2021-07-26T13:20:40.403Z",
Jul 26 15:20:40 nagios: "origin": "nagios/xxxxxxxxxxxxxx",
Jul 26 15:20:40 nagios: "previousSeverity": "indeterminate",
Jul 26 15:20:40 nagios: "rawData": "",
Jul 26 15:20:40 nagios: "receiveTime": "2021-07-20T09:05:13.546Z",
Jul 26 15:20:40 nagios: "repeat": true,
Jul 26 15:20:40 nagios: "resource": "xxxxxxxxxxxxxxxxxxxxxxx",
Jul 26 15:20:40 nagios: "service": [
Jul 26 15:20:40 nagios: "NAGIOS"
Jul 26 15:20:40 nagios: ],
Jul 26 15:20:40 nagios: "severity": "normal",
Jul 26 15:20:40 nagios: "status": "closed",
Jul 26 15:20:40 nagios: "tags": [
Jul 26 15:20:40 nagios: "Active"
Jul 26 15:20:40 nagios: ],
Jul 26 15:20:40 nagios: "text": "OK : OPT0001 - Aucune alerte non transmise \u00e0 OMi",
Jul 26 15:20:40 nagios: "timeout": 86400,
Jul 26 15:20:40 nagios: "trendIndication": "noChange",
Jul 26 15:20:40 nagios: "type": "nagiosServiceAlert",
Jul 26 15:20:40 nagios: "updateTime": "2021-07-20T09:05:13.546Z",
Jul 26 15:20:40 nagios: "value": "1/2 (Hard)"
Jul 26 15:20:40 nagios: },
Jul 26 15:20:40 nagios: "id": "883ab417-b890-49e7-a9bb-6bd9eb24606a",
Jul 26 15:20:40 nagios: "status": "ok"
Jul 26 15:20:40 nagios: }
Additional context
Used together with mod_gearman and Livestatus (1.2.8).
Thanks
I tried the Nagios module with Naemon. It works after some small changes to the logging, but when the module is enabled Livestatus stops working. Any idea what could be causing it? /cc @tester22
https://www.naemon.org/
https://mathias-kettner.de/checkmk_livestatus.html
Is there a way to send acknowledgement from Alerta-WebUI to NAGIOS core?
Use comma-separated fields like...
define host{
    use             generic-host    ; Name of host template to use
    host_name       localhost
    alias           localhost
    address         127.0.0.1
    _Environment    Production
    _Service        Mobile,Web,Tablet
}
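The request above amounts to splitting a custom variable on commas so that it becomes a list in the alert's `service` field. A minimal sketch of that mapping, in which the `split_custom_var` helper and the `host_vars` dictionary are hypothetical and stand in for however the module reads custom variables:

```python
def split_custom_var(value):
    """Split a comma-separated custom variable into a clean list of names."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Custom variables as they appear in the host definition above.
host_vars = {"_Environment": "Production", "_Service": "Mobile,Web,Tablet"}

alert_fields = {
    "environment": host_vars["_Environment"],
    "service": split_custom_var(host_vars["_Service"]),
}
```

Stripping whitespace and dropping empty items means a value such as `Mobile, Web,` still yields a clean two-element list.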
Is your feature request related to a problem? Please describe.
Nagios commands may contain sensitive information, so it would be useful to have an option that prevents the "Command Line" from being sent to Alerta.
Naemon can't start with alerta-neb.o.
Steps to reproduce the behavior:
apt-get install -y git-core curl gcc make libcurl4-openssl-dev libjansson-dev libglib2.0-dev pkg-config
git clone https://github.com/alerta/nagios-alerta.git
cd nagios-alerta
make naemon
sudo -s
make install
echo "broker_module=/usr/lib/nagios/alerta-neb.o" >> /etc/naemon/naemon.cfg
systemctl restart naemon
Naemon won't start. In /var/log/naemon/naemon.log you can find:
[1625560541] Error: Module '/usr/lib/nagios/alerta-neb.o' is using an old or unspecified version of the event broker API. Module will be unloaded.
[1625560541] Event broker module '/usr/lib/nagios/alerta-neb.o' deinitialized successfully.
[1625560541] Error: Failed to load module '/usr/lib/nagios/alerta-neb.o'.
Hello,
I'm having problems getting Windows disk events into Alerta. An example message is below:
Aug 19 01:13:44 swone nagios: SERVICE ALERT: EKWS-DB;D Drive Usage;WARNING;SOFT;1;d:\ - total: 150.00 Gb - used: 122.97 Gb (82%) - free 27.02 Gb (18%)
Aug 19 01:13:44 swone nagios: {"origin":"nagios3/swone.expresso.net.br","resource":"EKWS-DB","event":"D Drive Usage","group":"Nagios","severity":"warning","environment":"Production","service":["Platform"],"tags":["check=Active"],"text":"d:\ - total: 150.00 Gb - used: 122.97 Gb (82%) - free 27.02 Gb (18%)","value":"1/3 (Soft)","type":"nagiosServiceAlert","rawData":"'d:\ Used Space'=122.97Gb;120.00;142.50;0.00;150.00"}#012#015
How can I replace the ":" character to prevent a broken JSON?
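A note on the question above: a colon is legal inside a JSON string, so the more likely culprit in the example is the backslash in `d:\`, because a backslash followed by a space is not a valid JSON escape. The sketch below shows the characters a hand-built JSON string has to escape (backslash, double quote, and control characters). It is illustrative Python with a hypothetical helper name, not the module's code.

```python
import json


def escape_json_string(text):
    """Escape the characters that may not appear raw inside a JSON string."""
    out = []
    for ch in text:
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ord(ch) < 0x20:
            # Control characters (newline, carriage return, tab, ...).
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


text = "d:\\ - total: 150.00 Gb - used: 122.97 Gb (82%)"

# With escaping applied, the hand-built body parses and the text is unchanged.
body = '{"text":"' + escape_json_string(text) + '"}'
parsed = json.loads(body)
```

After parsing, `parsed["text"]` still contains both the colon and the single backslash from the original plugin output.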
Is your feature request related to a problem? Please describe.
A service or host can have the notifications_enabled=0 flag set in Nagios to skip notifications (e.g. for a non-production host or an unimportant service). Currently, the NEB module ignores the notifications_enabled=0 flag and shows alerts for all services and hosts.
Describe the solution you'd like
A flag in the NEB configuration that controls whether the notifications_enabled flag is taken into account when sending events to Alerta.
Describe alternatives you've considered
None at the moment.