Zabbix Integration Cookbook¶ ↑

Overview¶ ↑

This cookbook integrates all systems with Zabbix monitoring using Cluster Chef.

It has two kinds of recipes:

1) Recipes for integrated monitoring of common aspects of components like logs, ports, daemons, &c.

2) Recipes for standalone monitors of distributed systems like databases that Zabbix can’t talk to directly.

Usage¶ ↑

Aspect Monitoring¶ ↑

Components following the Metachef cookbook’s approach announce themselves along with their aspects. A typical example might be

# in webapp/recipes/backend.rb
announce(:webapp, :backend, {
  :logs    => { :unicorn => '/var/www/webapp/shared/logs' },
  :ports   => { :master  => 8080, :slave1 => 8081, :slave2 => 8082 },
  :daemons => { :unicorn => 'webapp_backend' }
  })

# in webapp/recipes/frontend.rb
announce(:webapp, :frontend, {
  :logs    => { :nginx => '/var/www/webapp/shared/logs' },
  :ports   => { :http  => 80 },
  :daemons => { :nginx => 'nginx' }
})

# in webapp/recipes/worker.rb
announce(:webapp, :worker, {
  :logs    => { :resque => '/var/www/webapp/shared/logs' },
  :ports   => { :admin  => 9000 },
  :daemons => { :web => 'resque_web', :worker1 => 'resque_worker_1, :worker2 -> 'resque_worker_2' }
})

These recipes may run on the same or on different machines. Each announcement decalres all the aspects in a way that allows them to be monitored or processed in a fashion decoupled from the ‘webapp’ cookbook itself.

This cookbook sets up Zabbix items and triggers to monitor and alert on each aspect a component may declare.

The following recipes are available:

zabbix_integration::logs¶ ↑

This recipe monitors log files and directories, ensuring that the files within do not grow too large.

The following attributes define the default maximum file size and alert priority for a bloated log file.

default[:zabbix_integration][:logs][:file_size][:max_size] = 200
default[:zabbix_integration][:logs][:file_size][:priority] = :warning

Declaring a log file and overriding these options occur in the same place.

Here’s a component declaring a log directory. All files immediately within the directory will be monitored with the default settings:

announce(:webapp, :worker, { :logs => "/var/www/webapp/logs" })

Here’s a component declaring a specific log file:

announce(:webapp, :worker, {
  :logs => {
    :access => "/var/www/webapp/webapp.access.log",
    :error  => "/var/www/webapp/webapp.error.log"
  }})

There are two reasons for individually specifying log files in the same directory:

1) You want to provide different handling for each file (see customization below)

2) Not all the other files in the directory are logs.

Here’s a component customizing the maximum log file size and the alert priority:

announce(:webapp, :worker, {
  :logs => {
    # high throughput access log
    :access => {
      :path     => "/var/www/webapp/webapp.access.log",

:max_size => 1024, # in MB

  :priority => :high # or: not_classified information warning average high disaster
},
# low throughput error log
:error => {
  :path     => "/var/www/webapp/webapp.error.log",

:max_size => 200, :priority => :warning

}}})

A log aspect can be declared and monitoring can be selectively disabled:

announce(:webapp, :worker, {
  :logs => {
    :access => {
      :path    => "/var/www/webapp/webapp.access.log",

:monitor => false

}}})

zabbix_integration::ports¶ ↑

This recipe monitors ports along with their availability and performance. Failures to connect to a port will cause an availability alert and poor response time from a port will cause a response time alert.

The following attributes define the default maximum failures and the default maximum average response time over chosen intervals with chosen alert priority:

default[:zabbix_integration][:ports][:availability][:window]   = 900
default[:zabbix_integration][:ports][:availability][:failures] = 5
default[:zabbix_integration][:ports][:availability][:priority] = :high

default[:zabbix_integration][:ports][:response_time][:window]   = 900
default[:zabbix_integration][:ports][:response_time][:average]  = 3.0
default[:zabbix_integration][:ports][:response_time][:priority] = :warning

Declaring a port and overriding these options occur in the same place.

Here’s a simple component with some ports:

announce(:webapp, :frontend, :ports => { :app => 8080, :admin => 8081 })

Here we dig in, specifying a different tolerance for failures and performance:

announce(:webapp, :frontend, {
  :ports => {
    :app => {
      :port 	       => 8080,

:failures => 2, :response_time => 1.0

},
:admin => {
  :port 	       => 8081,

:failures => 20, :repsonse_time => 5.0

}}})

Both the maximum number of failures and the average response time are measured over the last 15 minutes (900 s). This timeframe as well as the priority can be configured on a per-port basis:

announce(:webapp, :frontend, {
  :ports => {
    :app => {
      :port 	       	       	=> 8080

:failures => 2, :availability_window => 900, :availability_priority => :high,

:response_time => 1.0, :response_time_window => 60, :response_time_priority => :high

},
...
}}})

All port-checks are defined as “simple” Zabbix checks so they will be initiated by the server to the node – the node is not just looping back on itself to measure a local port’s performance.

A port aspect can be announced and have its monitoring disabled:

announce(:webapp, :frontend, {
  :ports => {
    :app => {
      :port    => 8080,

:monitor => false

}}})

zabbix_integration::daemons¶ ↑

This recipe monitors running daemons. If a daemon process is not found to be running, an alert will be sent.

The following attributes define the default priority for alerts triggered when daemons are found to be not running:

default[:zabbix_integration][:daemons][:running][:priority] = :average

Declaring a daemon and overriding this default option occur in the same place.

Here’s a simple component with a daemon:

announce(:webapp, :frontend, :daemons => { :unicorn => 'unicorn' })

If no process naemd ‘unicorn’ can be found on the node then this will cause an alert.

Some command line programs, typically those that are compiled, are their own ‘process name’: ‘nginx’, ‘sort’, ‘sshd’, &c. Interpreted scripts (‘/usr/bin/my-ruby-script’) also work this way.

For some daemons, the process name may not be useful or may not be enough to distinguish the daemon. In this case we can pass another option

announce(:webapp, :frontend, {
  :daemons => {
    :php_cgi => {
      :name => 'php-cgi',

:cmd => ‘webapp.ini’

}}})

Where the actual command line of the process is filtered using ‘webapp.ini’ (it’s not clear what the limits to this filtering syntax is – best not get too crazy).

Sometimes its sufficient to identify a process run by a user:

announce(:webapp, :frontend, {
  :daemons => {
    :php_cgi => {
      :name => 'php-cgi',

:user => ‘www-data’

}}})

It’s also possible to list multiple processes with specific counts:

announce(:webapp, :frontend, {
  :daemons => {
    :unicorn_master => {
      :name   => '',

:cmd => ‘unicorn master’, :number => 1

},
:unicorn_worker => {
  :name   => '',

:cmd => ‘unicorn worker’, :number => 4

}}})

A daemon aspect can be announced and have its monitoring disabled:

announce(:webapp, :frontend, {
  :daemons => {
    :nginx => {
      :name    => 'nginx',

:monitor => false

}}})

Standalone Monitors¶ ↑

Zabbix likes to monitor systems via the Zabbix agent that is installed on each monitored machine. This works well for systems that are limited to a single node and that can speak Zabbix’s protocol or be defined via UserParameters but fails horribly when trying to monitor something like a distributed database.

Happily Zabbix allows for measurements to be directly sent to it via some long-running script.

This cookbook defines several such Ruby scripts. Each:

provides performance metrics about a (possibly distributed) component
without having to be deployed on the same nodes that run the component
submits this data to a Zabbix server via a persistent FIFO set up on the node.

The recipes which set up standalone monitors include:

zabbix_integration::elasticsearch_monitor to monitor an ElasticSearch cluster
zabbix_integration::flume_monitor to monitor a Flume cluster
zabbix_integration::hbase_monitor to monitor an HBase cluster
zabbix_integration::redis_monitor to monitor Redis nodes
zabbix_integration::mongodb_monitor to monitor MongoDB nodes

strategist922 / zabbix_integration Goto Github PK

zabbix_integration's Introduction

Zabbix Integration Cookbook¶ ↑

Overview¶ ↑

Usage¶ ↑

Aspect Monitoring¶ ↑

zabbix_integration::logs¶ ↑

zabbix_integration::ports¶ ↑

zabbix_integration::daemons¶ ↑

Standalone Monitors¶ ↑

zabbix_integration's People

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent