rpc-maas's Introduction

Monitoring as a Service (MaaS) for Rackspace Private Cloud

tags

openstack, rpc, cloud, ansible, maas, rackspace

category

*nix

Deployment, setup and installation of Rackspace MaaS for Rackspace Private clouds.

RPC-MaaS Monitoring

These playbooks allow deployers to monitor clouds using Rackspace Monitoring as a Service. The playbooks can be used with OpenStack-Ansible or on their own using Ansible static inventory.

Documentation

Documentation for the project can be found under: docs/source/index.rst

To build the documentation simply execute tox -e docs.

Submitting Bugs

Please submit all bugs to the rpc-maas repository: https://jira.rax.io/browse/RPCOS

Local Testing

To test these playbooks within a local environment you will need a single server with at least 8GiB of RAM and 40GiB of storage on root. Running an m1.medium (OpenStack) flavor size is generally enough to get an environment online.

To run the local functional tests execute the run-tests.sh script out of the tests directory. This will create an OpenStack AIO and run the RPC-MaaS playbooks against the newly constructed environment.

tests/run-tests.sh

rpc-maas's People

Contributors

andymcc, bjoernt, briancurtin, cfarquhar, chrsmeca, cloudnull, corystone, creiht, evrardjp, git-harry, hmakkapati, hughsaunders, jcannava, jedsmith, major, mancdaz, mattt416, michaelrice, miguelgrinberg, npawelek, prometheanfire, racker-rick, sdhardy, seancarlisle, shahzaib-bhatia, sigmavirus24, stevelle, supermari0, tonytan4ever, xgerman

rpc-maas's Issues

cinder_service_check.py not using --host option

Backporting from master/kilo to juno from: https://github.com/rcbops/rpc-openstack/issues/200

The host argument is not used by the script leading to metrics being returned when they shouldn't.
Without host arg:
./cinder_service_check.py 172.29.236.3
status okay
metric cinder-scheduler_on_host_maas2-node3_cinder_scheduler_container-0377f639 uint32 1
metric cinder-scheduler_on_host_maas2-node1_cinder_scheduler_container-2706e6d1 uint32 1
metric cinder-scheduler_on_host_maas2-node2_cinder_scheduler_container-38784141 uint32 1
metric cinder-volume_on_host_maas2-node4 uint32 1

With host arg:
./cinder_service_check.py 172.29.236.3 --host maas2-node3_cinder_scheduler_container-0377f639
status okay
metric cinder-scheduler_status uint32 1
metric cinder-scheduler_on_host_maas2-node1_cinder_scheduler_container-2706e6d1 uint32 1
metric cinder-scheduler_on_host_maas2-node2_cinder_scheduler_container-38784141 uint32 1
metric cinder-volume_on_host_maas2-node4 uint32 1

Expected output with host arg:
./cinder_service_check.py 172.29.236.3 --host maas2-node3_cinder_scheduler_container-0377f639
status okay
metric cinder-scheduler_status uint32 1
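
The expected behaviour above can be sketched as a small filter applied before metrics are printed. This is only an illustration, not the plugin's actual code: the function names are hypothetical, and the service records are assumed to be dictionaries with `binary`, `host` and `state` keys.

```python
def filter_services(services, host=None):
    """Return only the services on `host`, or all services if host is None."""
    if host is None:
        return list(services)
    return [s for s in services if s["host"] == host]


def metric_lines(services, host=None):
    """Build plugin-style metric lines for the selected services."""
    lines = []
    for svc in filter_services(services, host):
        up = 1 if svc["state"] == "up" else 0
        if host is None:
            # No host given: report every service, qualified by its host.
            name = "%s_on_host_%s" % (svc["binary"], svc["host"])
        else:
            # Host given: report only that host's services.
            name = "%s_status" % svc["binary"]
        lines.append("metric %s uint32 %d" % (name, up))
    return lines
```

With a host supplied, only the matching service is reported, which is the output the issue asks for.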

CMD Checks need to be set up by playbooks

I am finally able to pull metrics from our customer accounts and realised that there is no CDM data. These checks can be configured via the api and I would like to get them added as part of the playbooks.

No alert for down compute node

Currently in CDC environments we are not notified if a compute node goes hard down. nova_service_check.py only checks the service locally on itself. This check needs to be performed from neighboring compute nodes and/or infra nodes to check if other nodes are up. This has already happened to a customer where they had a compute node down for 8hrs and we never alerted on it.
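
One way to approach this, sketched below, is to have an infra node inspect the full nova service list and report every compute host whose service is down, rather than each node checking only itself. The function name is hypothetical, and the record fields (`binary`, `host`, `status`, `state`) are assumed to follow the shape of a nova service listing.

```python
def down_compute_hosts(services):
    """Return sorted hosts whose enabled nova-compute service is reported down."""
    return sorted(
        s["host"]
        for s in services
        if s.get("binary") == "nova-compute"
        and s.get("status") == "enabled"   # ignore deliberately disabled nodes
        and s.get("state") == "down"
    )
```

A check built on this could return an error status whenever the list is non-empty, so a hard-down node is noticed by its neighbours.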

queries_per_second is actually just cumulative queries

When the plugin queries MySQL, the cumulative total of queries is returned, but we report it as queries per second.

https://github.com/rcbops/rpc-maas/blob/master/galera_check.py#L47

MariaDB [(none)]> show status where variable_name REGEXP '^Queries';
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| Queries       | 376807 |
+---------------+--------+

We need some logic in the plugin to make 2 calls n seconds apart; the delta divided by n is the QPS.
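
A minimal sketch of that two-sample approach follows. The function and parameter names are hypothetical; `read_queries` stands in for whatever callable returns the current value of the `Queries` status variable.

```python
import time


def queries_per_second(read_queries, interval=5.0, sleep=time.sleep):
    """Sample a cumulative counter twice and return the average rate."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    first = read_queries()
    sleep(interval)
    second = read_queries()
    # A counter that went backwards suggests a server restart between samples.
    delta = max(second - first, 0)
    return delta / float(interval)
```

The trade-off is that the check now takes at least `interval` seconds to run, so the interval has to stay well below the check timeout.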

HP monitoring plugin doesn't monitor raid controller status

Currently the plugin monitors disk/cpu/dimm status, but not the raid controller itself. This has affected customers who've had a raid controller fail.

My first assumption was that a failed raid controller means the underlying raid array is inaccessible, and so no alerts would get out anyway. On reading, however, it seems that some arrays can fail partially: they don't recognise some of the disks in the array, but do recognise enough to keep the array accessible. Therefore, it's worth throwing a controller monitor into the mix (it's really not much overhead to do so anyway).

ansible-rpc-lxc v9.0.3 — `python-neutronclient` upgrades to known broken version

Within the ansible-lxc-rpc repo, v9.0.3 branch, the specified version of python-neutronclient to install is set to 2.3.6.

However, within the rpc-maas repo, the requirements.txt file overrides this setting by specifying another version (2.3.9). This introduces a Keystone patch that ignores specified endpoint types (a known bug that's been patched in newer branches: https://bugs.launchpad.net/python-neutronclient/+bug/1368676). For our labs, this breaks client functionality due to our networking setup.

If the specific versions are removed from the requirements.txt file, then the packages can fall back to what's set in the developer's managed repositories.

rabbitmq_status.py issues

I've started to see this type of failure on newly built clusters:

root@jenk-heat-158-node1:/usr/lib/rackspace-monitoring-agent/plugins# ./rabbitmq_status.py -H 172.29.236.126 -n jenk-heat-158-node1_rabbit_mq_container-fd99ca31
status error Detected RabbitMQ connections with multiple channels. Please check RabbitMQ and all Openstack consumers
root@jenk-heat-158-node1:/usr/lib/rackspace-monitoring-agent/plugins#

It looks like #139 introduced this change -- however the issue seems to involve us alarming when channel['number'] > 1, but number here is actually the channel number which doesn't make sense to alarm on (assuming I'm reading this correctly).

I believe what @cfarquhar requested was not implemented -- we should be checking /api/connections and not /api/channels, and perhaps we can alarm if channels > channel_max? Some research will need to be done here as I'm not overly clear on the specifics of rabbitmq channels.

Mentioning @BjoernT for input.

--Matt
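
As a rough sketch of the per-connection idea, the helper below takes the already-parsed JSON list from /api/connections and returns the highest channel count on any single connection. The function name is hypothetical, and the `channels` field is assumed to hold each connection's channel count.

```python
def max_channels_per_connection(connections):
    """Return the largest channel count found on any single connection."""
    return max((c.get("channels", 0) for c in connections), default=0)
```

Reporting this as a metric leaves the threshold to the alarm criteria instead of hard-coding it in the plugin.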

rabbitmq_status.py assumes cluster size of 3 hosts

rabbitmq_status.py assumes a cluster size of 3 hosts, which is a hard-coded number. Where the RabbitMQ cluster has more than 3 nodes, we will run into monitoring issues. Please either determine the size from the cluster status or make an option available.
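
Both suggestions can be combined in a small helper, sketched here under stated assumptions: `nodes` is the parsed node list from the RabbitMQ management API, each entry is assumed to carry a boolean `running` field, and `expected_size` is a hypothetical optional override.

```python
def cluster_health(nodes, expected_size=None):
    """Compare running nodes against the discovered or configured cluster size."""
    running = sum(1 for n in nodes if n.get("running"))
    # Fall back to the number of known cluster members when no size is given.
    expected = len(nodes) if expected_size is None else expected_size
    return {"expected": expected, "running": running, "ok": running >= expected}
```

Deriving the size from the cluster itself means a 5-node cluster with one node stopped is flagged, while a deployer can still pin the expected size explicitly.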

`nova_service_check.py` can't list agents

See Issue: #115

File: nova_service_check.py
Arguments: --host 573962-compute14 192.168.96.10
Entity Label: 573962-compute14.qe1.iad3.rackspace.com

The host being passed as an argument for this check needs to be the FQDN of the entity, not just its hostname. When this is corrected, the check succeeds.

For example:

root@569038-infra02:/usr/lib/rackspace-monitoring-agent/plugins# ./nova_service_check.py --host 573962-compute14 192.168.96.10
status error No host(s) found in the service list

Compared to:

root@569038-infra02:/usr/lib/rackspace-monitoring-agent/plugins# ./nova_service_check.py --host 573962-compute14.qe1.iad3.rackspace.com 192.168.96.10
status okay
metric nova-compute_status uint32 1
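
An alternative to changing the argument is to make the comparison tolerant of short names. The helper below is a hypothetical sketch, not the plugin's code: it treats two names as the same host when they are equal or when one is the short form of the other's FQDN.

```python
def host_matches(service_host, wanted):
    """True if the names are equal, or one is the short form of the other."""
    a, b = service_host.lower(), wanted.lower()
    if a == b:
        return True
    # Two different FQDNs never match, even if their first labels agree.
    if "." in a and "." in b:
        return False
    return a.split(".", 1)[0] == b.split(".", 1)[0]
```

This would accept both invocations shown above, at the cost of ambiguity if two domains contain hosts with the same short name.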

os-a-d no longer enables nova's v3 API

Our nova plugins query the v3 API; however, this is now disabled in os-a-d kilo and cannot be enabled. While that functionality might be enabled, it's probably best that we use v2(.1) instead.

Swift MaaS Replication Checks Failing - Traceback - KeyError: 'replication_type'

MaaS installation from RPC 10.1.2...

Account Replication

root@lab-byron-upgrade-n07:/usr/lib/rackspace-monitoring-agent/plugins# python swift-recon.py --ring-type account replication
status error main()\n File "swift-recon.py", line 352, in main\n stats = swift_replication(args.ring)\n File "swift-recon.py", line 156, in swift_replication\n replication_statistics[rep_dict.pop('replication_type')] = rep_dict\nKeyError: 'replication_type'\n
Traceback (most recent call last):
  File "swift-recon.py", line 365, in <module>
    main()
  File "swift-recon.py", line 352, in main
    stats = swift_replication(args.ring)
  File "swift-recon.py", line 156, in swift_replication
    replication_statistics[rep_dict.pop('replication_type')] = rep_dict
KeyError: 'replication_type'

Container Replication

root@lab-byron-upgrade-n07:/usr/lib/rackspace-monitoring-agent/plugins# python swift-recon.py --ring-type container replication
status error main()\n File "swift-recon.py", line 352, in main\n stats = swift_replication(args.ring)\n File "swift-recon.py", line 156, in swift_replication\n replication_statistics[rep_dict.pop('replication_type')] = rep_dict\nKeyError: 'replication_type'\n
Traceback (most recent call last):
  File "swift-recon.py", line 365, in <module>
    main()
  File "swift-recon.py", line 352, in main
    stats = swift_replication(args.ring)
  File "swift-recon.py", line 156, in swift_replication
    replication_statistics[rep_dict.pop('replication_type')] = rep_dict
KeyError: 'replication_type'

Object Replication

root@lab-byron-upgrade-n07:/usr/lib/rackspace-monitoring-agent/plugins# python swift-recon.py --ring-type object replication
status error main()\n File "swift-recon.py", line 352, in main\n stats = swift_replication(args.ring)\n File "swift-recon.py", line 156, in swift_replication\n replication_statistics[rep_dict.pop('replication_type')] = rep_dict\nKeyError: 'replication_type'\n
Traceback (most recent call last):
  File "swift-recon.py", line 365, in <module>
    main()
  File "swift-recon.py", line 352, in main
    stats = swift_replication(args.ring)
  File "swift-recon.py", line 156, in swift_replication
    replication_statistics[rep_dict.pop('replication_type')] = rep_dict
KeyError: 'replication_type'

Alert on predictive failure

Dell script openmanage.py does not seem to be catching predictive drive failures. Unfortunately predictive failures do not "degrade" virtual disks or show up under omreport chassis. The failure predicted status may be found with an "omreport storage pdisk controller=0" under the "Failure Predicted" reading.

openmanage.py: Update to work with OMSA 7.4

OMSA 7.4 is the latest and works fine on Ubuntu 14.04. I'm having issues getting OMSA 7.1, which is what is hard-coded in openmanage.py, to work properly on Ubuntu 14.04. Changing the version string [1] to 7.4 gets me past the version check, but 7.4 adds an additional line [2] [3] that matches 'Status' and causes the check to fail.

[1] https://github.com/rcbops/rpc-maas/blob/9.0.1/openmanage.py#L68
[2] OMSA 7.1

# omreport storage vdisk | grep Status
Status                    : Ok

[3] OMSA 7.4

# omreport storage vdisk | grep Status
Status                            : Ok
T10 Protection Information Status : No
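
The extra line only matches because the check looks for 'Status' anywhere in the line. A sketch of a stricter parse, which compares the whole key instead, is shown below; the function name is hypothetical and the input is assumed to be the raw text of `omreport storage vdisk`.

```python
def vdisk_statuses(omreport_output):
    """Return values of lines whose key is exactly 'Status'."""
    statuses = []
    for line in omreport_output.splitlines():
        key, sep, value = line.partition(":")
        # Compare the whole key so 'T10 Protection Information Status' is skipped.
        if sep and key.strip() == "Status":
            statuses.append(value.strip())
    return statuses
```

Matching on the exact key gives the same result for the OMSA 7.1 and 7.4 samples above.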

Create memcache check

Currently we do not check important memcache metrics such as:

  • bytes
  • evictions
  • curr_connections

Since memcache can negatively affect the performance of OpenStack, we should have those checks in place.
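
A minimal sketch of such a check could parse the reply to memcached's `stats` command and emit the three counters listed above. The function names, the `memcache_` metric prefix and the choice of `uint64` as the metric type are assumptions made for illustration.

```python
def parse_memcached_stats(raw):
    """Parse 'STAT <name> <value>' lines from a memcached stats reply."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def memcached_metric_lines(raw, wanted=("bytes", "evictions", "curr_connections")):
    """Build plugin-style metric lines for the selected counters."""
    stats = parse_memcached_stats(raw)
    return [
        "metric memcache_%s uint64 %s" % (name, stats[name])
        for name in wanted
        if name in stats
    ]
```

Fetching the raw reply (for example over a TCP socket to the memcached port) is left out so the parsing can be tested on its own.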

swift-recon.py plugin is broken

The swift-recon.py plugin is currently failing, due to a call to a non-existent method in maas_common:

root@aio1:/usr/lib/rackspace-monitoring-agent/plugins# python swift-recon.py --ring-type object replication
status error common.py", line 459, in print_output\n    yield\n  File "swift-recon.py", line 380, in <module>\n    main()\n  File "swift-recon.py", line 371, in main\n    maas_common.status_error(str(e))\nAttributeError: 'module' object has no attribute 'status_error'\n
Traceback (most recent call last):
  File "swift-recon.py", line 380, in <module>
    main()
  File "swift-recon.py", line 371, in main
    maas_common.status_error(str(e))
AttributeError: 'module' object has no attribute 'status_error'
root@aio1:/usr/lib/rackspace-monitoring-agent/plugins#

v10.1.18: RackspaceMonitoringValidationError - alarmParseError - Failed to parse alarm

Started seeing this in 10.1.18...

PLAY [rabbit] ***************************************************************** 

GATHERING FACTS *************************************************************** 
ok: [bc08a2dd-infra1_rabbit_mq_container-1ae54342]
ok: [bc08a2dd-infra1_rabbit_mq_container-96c29a24]
ok: [bc08a2dd-infra1_rabbit_mq_container-ff443063]

TASK: [maas_local | Get entity ID for physical_host] ************************** 
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24]
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342]
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063]

TASK: [maas_local | Validate if check exists] ********************************* 
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342]
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24]
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063]

TASK: [maas_local | Create check if it does not exist] ************************ 
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24]
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063]
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342]

TASK: [maas_local | Get check ID for newly created check] ********************* 
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342]
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063]
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24]

TASK: [maas_local | Validate if alarm exists] ********************************* 
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'})
failed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "delta": "0:00:01.997031", "end": "2015-12-11 00:48:03.280964", "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.283933"}
...ignoring
failed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item={'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "delta": "0:00:01.224140", "end": "2015-12-11 00:48:04.636171", "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:03.412031"}
...ignoring
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'})
failed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "delta": "0:00:01.312386", "end": "2015-12-11 00:48:02.488470", "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.176084"}
...ignoring
failed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item={'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "delta": "0:00:02.362091", "end": "2015-12-11 00:48:05.053890", "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:02.691799"}
...ignoring
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'})
changed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'})
failed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "delta": "0:00:02.497601", "end": "2015-12-11 00:48:03.865509", "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.367908"}
...ignoring
failed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item={'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}) => {"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "delta": "0:00:01.161495", "end": "2015-12-11 00:48:05.226314", "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:04.064819"}
...ignoring

TASK: [maas_local | Create alarm if it does not exist] ************************ 
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:54.141326', u'stdout': u'<Alarm: id=almyno8aeP, label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.458702', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:47:52.682624'}, {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:55.601866', u'stdout': u'<Alarm: id=alWybK5fT1, label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.271923', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:47:54.329943'}, {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:56.962498', u'stdout': u'<Alarm: id=alzVUuAW9l, label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}, u'stderr': u'', u'delta': u'0:00:01.156999', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:47:55.805499'}, {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:58.314845', u'stdout': u'<Alarm: id=alnYrpvCD5, label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is rskipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:54.020611', u'stdout': u'<Alarm: id=alwbgmADAZ, label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.329205', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:52.691406'}, {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:55.668497', u'stdout': u'<Alarm: id=alWHhf9XDY, label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.448378', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:54.220119'}, {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:56.894737', u'stdout': u'<Alarm: id=als1pNQON1, label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}, u'stderr': u'', u'delta': u'0:00:01.019669', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:55.875068'}, {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:58.386307', u'stdout': u'<Alarm: id=alt07U3NNd, label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.296015', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:57.090292'}, {'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:53.821423', u'stdout': u'<Alarm: id=alwAoxtYGI, label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.109477', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_disk_free_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:52.711946'}, {'name': 'rabbitmq_disk_free_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_disk_free_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_disk_free_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:55.211417', u'stdout': u'<Alarm: id=aljWHZh48s, label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}, u'stderr': u'', u'delta': u'0:00:01.190727', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_mem_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:54.020690'}, {'name': 'rabbitmq_mem_alarm_status', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_mem_alarm_status"] != 1) { return new AlarmStatus(CRITICAL, "rabbitmq_mem_alarm_status triggered"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:56.448384', u'stdout': u'<Alarm: id=alWyQmKF5R, label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}, u'stderr': u'', u'delta': u'0:00:01.129503', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_max_channels_per_conn--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:55.318881'}, {'name': 'rabbitmq_max_channels_per_conn', 'criteria': u':set consecutiveCount=3 if (metric["rabbitmq_max_channels_per_conn"] > 10) { return new AlarmStatus(CRITICAL, "Detected RabbitMQ connections with > 10 channels, check RabbitMQ and all Openstack consumers"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:57.884949', u'stdout': u'<Alarm: id=alze8sZrav, label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.242282', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_fd_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:56.642667'}, {'name': 'rabbitmq_fd_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_fd_used"],metric["rabbitmq_fd_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ file descriptors is reaching configured limit"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:59.667935', u'stdout': u'<Alarm: id=alcztrQxuG, label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.198971', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:47:58.468964'}, {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:01.156448', u'stdout': u'<Alarm: id=ala5rzAHJH, label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 0, 'item': {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.289177', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:47:59.867271'}, {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}])
failed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:03.865509', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 1, 'item': {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:02.497601', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:48:01.367908'}, {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chJs9elVwM --notification-plan npTechnicalContactsEmail --label rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24 --criteria ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }'", "delta": "0:00:00.956843", "end": "2015-12-11 00:48:06.560366", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "delta": "0:00:02.497601", "end": "2015-12-11 00:48:03.865509", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "module_name": "shell"}, "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.367908", "stderr": "", "stdout": ""}, {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:05.603523"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 23, 'error_line': 1, 'error_position': 23, 'input': ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=23, pos=23\n :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }\n                       ^'}>
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:59.643110', u'stdout': u'<Alarm: id=alvTyr5Gh8, label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.049402', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:58.593708'}, {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:01.078499', u'stdout': u'<Alarm: id=aly4sAl5ot, label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 0, 'item': {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.231186', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:47:59.847313'}, {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}])
failed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:03.280964', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 1, 'item': {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:01.997031', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:48:01.283933'}, {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chfYGkCra9 --notification-plan npTechnicalContactsEmail --label rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063 --criteria ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }'", "delta": "0:00:01.096529", "end": "2015-12-11 00:48:06.738768", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "delta": "0:00:01.997031", "end": "2015-12-11 00:48:03.280964", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "module_name": "shell"}, "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.283933", "stderr": "", "stdout": ""}, {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:05.642239"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 23, 'error_line': 1, 'error_position': 23, 'input': ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=23, pos=23\n :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }\n                       ^'}>
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:47:59.504148', u'stdout': u'<Alarm: id=alpy8gceXK, label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.414871', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_proc_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:58.089277'}, {'name': 'rabbitmq_proc_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_proc_used"],metric["rabbitmq_proc_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ processes is reaching configured limit"); }'}])
skipping: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:00.966621', u'stdout': u'<Alarm: id=al6WtZDa8J, label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342 ...>', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 0, 'item': {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}, u'stderr': u'', u'delta': u'0:00:01.255064', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_socket_used_alarm_status--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:47:59.711557'}, {'name': 'rabbitmq_socket_used_alarm_status', 'criteria': u':set consecutiveCount=3 if (percentage(metric["rabbitmq_socket_used"],metric["rabbitmq_socket_total"]) >= 90 ) { return new AlarmStatus(CRITICAL, "RabbitMQ sockets is reaching configured limit"); }'}])
failed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:02.488470', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 1, 'item': {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:01.312386', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:48:01.176084'}, {'name': 'rabbitmq_msgs_excl_notifications', 'criteria': u' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chQsYSSDdI --notification-plan npTechnicalContactsEmail --label rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342 --criteria ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }'", "delta": "0:00:01.292190", "end": "2015-12-11 00:48:06.935701", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "delta": "0:00:01.312386", "end": "2015-12-11 00:48:02.488470", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_msgs_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "module_name": "shell"}, "item": {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. 
Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:01.176084", "stderr": "", "stdout": ""}, {"criteria": " :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric[\"rabbitmq_msgs_excl_notifications\"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, \"RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}\"); }", "name": "rabbitmq_msgs_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:05.643511"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 23, 'error_line': 1, 'error_position': 23, 'input': ' :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=23, pos=23\n :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }\n                       ^'}>
failed: [bc08a2dd-infra1_rabbit_mq_container-96c29a24] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:05.226314', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24"', u'rc': 1, 'item': {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:01.161495', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24"'}, u'start': u'2015-12-11 00:48:04.064819'}, {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chJs9elVwM --notification-plan npTechnicalContactsEmail --label rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24 --criteria ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. 
Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }'", "delta": "0:00:00.938386", "end": "2015-12-11 00:48:07.708600", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "delta": "0:00:01.161495", "end": "2015-12-11 00:48:05.226314", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-96c29a24\"", "module_name": "shell"}, "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:04.064819", "stderr": "", "stdout": ""}, {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:06.770214"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 22, 'error_line': 1, 'error_position': 22, 'input': ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=22, pos=22\n:set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }\n                      ^'}>
ecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=23, pos=23\n :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }\n                       ^'}>
failed: [bc08a2dd-infra1_rabbit_mq_container-ff443063] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:04.636171', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063"', u'rc': 1, 'item': {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:01.224140', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063"'}, u'start': u'2015-12-11 00:48:03.412031'}, {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chfYGkCra9 --notification-plan npTechnicalContactsEmail --label rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063 --criteria ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. 
Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }'", "delta": "0:00:00.952600", "end": "2015-12-11 00:48:07.900390", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "delta": "0:00:01.224140", "end": "2015-12-11 00:48:04.636171", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-ff443063\"", "module_name": "shell"}, "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:03.412031", "stderr": "", "stdout": ""}, {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:06.947790"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 22, 'error_line': 1, 'error_position': 22, 'input': ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=22, pos=22\n:set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }\n                      ^'}>
ecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=23, pos=23\n :set consecutiveCount={# maas_alarm_local_consecutive_count #} if (metric["rabbitmq_msgs_excl_notifications"] > {# rabbitmq_queued_messages_excluding_notifications_threshold #} ) { return new AlarmStatus(CRITICAL, "RabbitMQ sum of queued messages excluding notifications queues is reaching configured limit. Currently above {# rabbitmq_queued_messages_excluding_notifications_threshold #}"); }\n                       ^'}>
failed: [bc08a2dd-infra1_rabbit_mq_container-1ae54342] => (item=[{u'changed': True, u'end': u'2015-12-11 00:48:05.053890', u'stdout': u'', u'cmd': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342"', u'rc': 1, 'item': {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}, u'stderr': u'', u'delta': u'0:00:02.362091', 'invocation': {'module_name': 'shell', 'module_args': u'raxmon-alarms-list --entity-id enV52EnfGx | grep "label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342"'}, u'start': u'2015-12-11 00:48:02.691799'}, {'name': 'rabbitmq_qgrowth_excl_notifications', 'criteria': u':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }'}]) => {"changed": true, "cmd": "raxmon-alarms-create --entity-id enV52EnfGx --check-id chQsYSSDdI --notification-plan npTechnicalContactsEmail --label rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342 --criteria ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. 
Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }'", "delta": "0:00:01.603925", "end": "2015-12-11 00:48:08.692220", "item": [{"changed": true, "cmd": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "delta": "0:00:02.362091", "end": "2015-12-11 00:48:05.053890", "invocation": {"module_args": "raxmon-alarms-list --entity-id enV52EnfGx | grep \"label=rabbitmq_qgrowth_excl_notifications--bc08a2dd-infra1_rabbit_mq_container-1ae54342\"", "module_name": "shell"}, "item": {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}, "rc": 1, "start": "2015-12-11 00:48:02.691799", "stderr": "", "stdout": ""}, {"criteria": ":set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric[\"rabbitmq_msgs_excl_notifications\"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, \"RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}\"); }", "name": "rabbitmq_qgrowth_excl_notifications"}], "rc": 1, "start": "2015-12-11 00:48:07.088295"}
stderr: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/raxmon_cli/common.py", line 130, in run_action
    callback(instance, options, args, done)
  File "/usr/local/bin/raxmon-alarms-create", line 39, in callback
    why=options.why)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 485, in create_alarm
    data=data, coerce=self.get_alarm, headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 301, in _create
    headers=headers)
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 152, in request
    raw=raw
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/openstack.py", line 577, in request
    return super(OpenStackBaseConnection, self).request(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 683, in request
    response = responseCls(**kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libcloud/common/base.py", line 116, in __init__
    raise Exception(self.parse_error())
  File "/usr/local/lib/python2.7/dist-packages/rackspace_monitoring/drivers/rackspace.py", line 107, in parse_error
    raise error
RackspaceMonitoringValidationError: <ValidationError type=alarmParseError, message="Failed to parse alarm", details={'error_token': 'endOfLine', 'error_column': 22, 'error_line': 1, 'error_position': 22, 'input': ':set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }', 'message': 'parse error: failed to match \'endOfLine\' at line=1, col=22, pos=22\n:set consecutiveCount={# maas_alarm_local_consecutive_count #} if (rate(metric["rabbitmq_msgs_excl_notifications"]) > {#rabbitmq_queue_growth_rate_threshold / maas_check_period#}) { return new AlarmStatus(CRITICAL, "RabbitMQ Queue growth rate is above configured threshold. Currently above {# rabbitmq_queue_growth_rate_threshold #}"); }\n                      ^'}>

Wrong permission on swift-dispersion.py / swift-recon.py

It looks like rpc-extras does a chmod on scripts once this repo is checked out, which means subsequent runs of setup-maas.yml will fail due to the following:

failed: [jenk-heat-167-node2] => {"failed": true}
msg: Local modifications exist in repository (force=no).

We probably need to address this on the rpc-extras side also.

horizon_check.py error:07069041:memory buffer routines:BUF_MEM_grow_clean:malloc

RPCv9 checks are failing in repo version 9.0.1

Fri Oct 24 21:33:23 2014 INF: (plugin=horizon_check.py, id=chlmWtuPvL, iid=idHjiWAC9S) -> agent.plugin (details=args="172.29.237.162",file="horizon_check.py",id="chlmWtuPvL",period=60) scheduled for 60s
Fri Oct 24 21:34:27 2014 ERR: Connection: nil (50.57.61.13:443) -> 139741911230336:error:07069041:memory buffer routines:BUF_MEM_grow_clean:malloc failure:../base/deps/luvit/deps/openssl/openssl/crypto/buffer/buffer.c:159:

Fri Oct 24 21:34:27 2014 ERR: Connection: nil (50.57.61.13:443) -> 139741911230336:error:07069041:memory buffer routines:BUF_MEM_grow_clean:malloc failure:../base/deps/luvit/deps/openssl/openssl/crypto/buffer/buffer.c:159:

Monitor connections

Would it be possible to add a monitor that compares /proc/sys/net/ipv4/netfilter/ip_conntrack_max with /proc/sys/net/ipv4/netfilter/ip_conntrack_count and alerts when the connection count reaches 90% or more of the maximum?
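
A minimal sketch of what such a check could compute, assuming the `status` / `metric` output format that the other plugins in this tracker print. The function names, the metric names, and the `double` metric type are illustrative assumptions, not existing code in this repo.

```python
def read_int(path):
    """Read a single integer from a /proc file."""
    with open(path) as handle:
        return int(handle.read().strip())


def conntrack_usage(count, maximum):
    """Return tracked connections as a percentage of the table size."""
    if maximum <= 0:
        raise ValueError('ip_conntrack_max must be positive')
    return 100.0 * count / maximum


def over_threshold(percent, threshold=90.0):
    """True when usage is at or above the alert threshold (90% by default)."""
    return percent >= threshold


def conntrack_lines(count, maximum):
    """Build plugin-style output lines; the metric names are hypothetical."""
    percent = conntrack_usage(count, maximum)
    return ['status okay',
            'metric ip_conntrack_count uint32 %d' % count,
            'metric ip_conntrack_max uint32 %d' % maximum,
            'metric ip_conntrack_percent_used double %.2f' % percent]
```

On a host, `count` and `maximum` would come from `read_int()` on the two /proc paths named above; the 90% comparison could equally live in the alarm criteria rather than in the plugin.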

CMD Monitoring

From Russel Clark:

Metrics without units:

I have attached two spreadsheets. The first (agent.plugin11_16) is an actual data pull from [see Russel]. You can see from the raw metrics that a number of the metrics don’t have a unit. If this is something that we submit when submitting the metric, I would like to have a unit for each. I created a pivot table so you can easily see all the metrics where unit is “unknown”; the data shown is the minimum, just so you can see an example value for the metric. I guess that technically speaking, some of these don’t have a unit, but I would like to submit a unit name that at least makes some sense. In the second spreadsheet (Monitoring Metrics Units Questions) I have pulled all the metrics with an unknown unit and then in most cases written a proposed unit. I have also written a question for each (many are the same). What I would like is:

  1. If you agree with the proposed unit, go ahead and use it; if you would recommend a different unit, please let me know.
  2. There are a few where I don’t know what the metric is at all. Please give me a definition/explanation for these and recommend a unit.
  3. Please answer the questions in Column F; again, many are the same and I guess the answer is likely to be the same.

nova_api_local_check.py reports useless active/error/... metrics

Currently we report metrics from the novaclient.servers.list API call, but only for the admin tenant. This tenant usually has no active instances, or instances in any of the other reported states, so I suggest querying this information across all tenants. Pull request coming.
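
A rough sketch of the counting side of that suggestion, not the actual pull request. It assumes the servers come from a call such as `nova.servers.list(search_opts={'all_tenants': 1})` in python-novaclient; the helper name and the default status list are hypothetical.

```python
from collections import Counter


def count_server_statuses(servers, statuses=('ACTIVE', 'ERROR')):
    """Count servers per status; each item only needs a `status` attribute.

    With python-novaclient the list would typically be fetched across
    tenants with nova.servers.list(search_opts={'all_tenants': 1}).
    """
    seen = Counter(server.status for server in servers)
    # Always report every requested status, even when the count is zero.
    return dict((status, seen.get(status, 0)) for status in statuses)
```

Reporting a zero for statuses that do not occur keeps the set of metrics stable between runs, which is usually easier for alarms to consume.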

horizon.py failing on master (kilo)

horizon.py is currently failing when using requests 2.6.2 with the following error:

status error While logging in: ('Connection aborted.', ResponseNotReady())

The stack trace being caught is:

Traceback (most recent call last):
  File "./horizon_check.py", line 105, in <module>
    main(args)
  File "./horizon_check.py", line 96, in main
    check(args)
  File "./horizon_check.py", line 77, in check
    verify=False)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 508, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ResponseNotReady())

If I downgrade requests to <2.6.1 this plugin seems to work fine.

cinder_api_local_check.py broken

root@infra2:~# python cinder_api_local_check.py 127.0.0.1
status error HTTPConnectionPool(host='127.0.0.1', port=8776): Max retries exceeded with url: /v1/51cfc1899ecd49f89c0d1a1db68f9f7c/volumes (Caused by <class 'socket.error'>: [Errno 111] Connection refused)
status okay
Traceback (most recent call last):
  File "cinder_api_local_check.py", line 56, in <module>
    main(args)
  File "cinder_api_local_check.py", line 48, in main
    check(auth_ref, args)
  File "cinder_api_local_check.py", line 39, in check
    metric_bool('cinder_api_local_status', is_up)
UnboundLocalError: local variable 'is_up' referenced before assignment
root@infra2:~#
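
The traceback suggests `is_up` is only assigned on the success path. A minimal sketch of the usual fix, assuming an unreachable local API should be reported as a 0 metric rather than crash the plugin; `run_check` and `probe` are hypothetical names, not the plugin's real functions.

```python
def run_check(probe):
    """Run `probe` and return plugin-style output lines.

    `probe` is a hypothetical callable that raises IOError (which covers
    socket errors and requests' ConnectionError) when the API is down.
    """
    is_up = False  # defined up front, so no code path can skip it
    try:
        probe()
        is_up = True
    except IOError:
        pass  # API unreachable: report it through the metric below
    return ['status okay',
            'metric cinder_api_local_status uint32 %d' % int(is_up)]
```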

Add SSL Cert Expiration Check/Alarm

If the external API checks are monitoring https endpoints [instead of http], we should add a check/alarm for the SSL certificate expiration date. This will help ensure that alarms are raised when the cert is nearing its expiration date [or has expired], so the cert gets updated before it impacts functionality.

This will be important going forward once we force https for all public/admin endpoints.
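
One possible core for such a check, sketched with the standard library only. It assumes the expiry string is in the `notAfter` format that `SSLSocket.getpeercert()` returns; `days_until_expiry` is an illustrative name, and fetching the certificate itself is left out.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now):
    """Whole days from `now` (epoch seconds) until the notAfter time.

    The result is negative once the certificate has expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    return int((expires - now) // SECONDS_PER_DAY)
```

An alarm could then go to a warning state below some number of days and to critical at zero or below; those thresholds would be a deployment choice.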

`neutron_service_check.py` can't list agents

On the monitoring dashboard, this plugin is returning "No host(s) found in the agents list". From the source code, that exception is raised when the agents variable is empty.

Oddly enough, when I log into the Neutron server and run the script directly with the Neutron endpoint as an argument, it succeeds.

root@569038-infra02:/usr/lib/rackspace-monitoring-agent/plugins# ./neutron_service_check.py 192.168.96.10
status okay
metric neutron-linuxbridge-agent_123c2f04-a390-402c-9171-5ca9af8a10f3_on_host_573962-compute14.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_1c9b3a3a-2fee-4b85-8319-25d1cf57327b_on_host_573964-compute16.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_27722fec-6d91-439b-b2b7-ba8413c537b2_on_host_573965-compute17.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_2a3f586c-ce1e-4ec0-a5ab-1a5458c39eb6_on_host_573958-compute10.qe1.iad3.rackspace.com uint32 1
metric neutron-dhcp-agent_2c0d34d6-e832-4646-8c65-a6184b8cab52_on_host_569038-infra02_neutron_agents_container-0ba110fc uint32 1
metric neutron-linuxbridge-agent_311a007c-d732-42b0-aad8-eb223c95dab6_on_host_569038-infra02_neutron_agents_container-0ba110fc uint32 1
metric neutron-linuxbridge-agent_36130eac-0a4f-4b06-b71e-1b17a5b3342b_on_host_573959-compute11.qe1.iad3.rackspace.com uint32 1
metric neutron-metering-agent_4a1fffc9-6cb3-4bdf-add8-38bc567191f8_on_host_569038-infra02_neutron_agents_container-0ba110fc uint32 1
metric neutron-linuxbridge-agent_59f3d7f5-4d42-45e5-b384-069174f2a57c_on_host_573952-compute04.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_6901e4cd-9721-4625-a228-33dfb5feab03_on_host_573957-compute09.qe1.iad3.rackspace.com uint32 1
metric neutron-metadata-agent_7012ee30-52d8-484c-9fab-0fd23d7ac39f_on_host_569039-infra03_neutron_agents_container-9fd9d285 uint32 1
metric neutron-linuxbridge-agent_743d1bb3-2def-47c4-99e3-5c0169923076_on_host_573971-compute02.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_7764c472-a010-44f2-87e9-851e48bda4d9_on_host_573961-compute13.qe1.iad3.rackspace.com uint32 1
metric neutron-l3-agent_801c0873-85f6-4551-a93e-a82ad288f7b1_on_host_569039-infra03_neutron_agents_container-9fd9d285 uint32 1
metric neutron-linuxbridge-agent_8cf56dfb-31c3-4121-81f3-268bc04add91_on_host_573956-compute08.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_8d1b8b40-fdd8-45e9-8258-96b5d7a928dc_on_host_573970-compute18.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_9d1bf611-b9e5-433c-9309-400e3e06bf36_on_host_573963-compute15.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_a4870b5b-44a8-4184-8075-e777a04de7f1_on_host_573953-compute05.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_a7fec12a-e45d-47ad-bf85-b21a1acd08fb_on_host_573972-compute01.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_aa2e54de-ab6f-4ca1-9fb8-330ed8097ea1_on_host_569039-infra03_neutron_agents_container-9fd9d285 uint32 1
metric neutron-metering-agent_c8a650f0-5cf5-4f12-8392-1063e8da65ce_on_host_569039-infra03_neutron_agents_container-9fd9d285 uint32 1
metric neutron-dhcp-agent_ca36d77d-8f93-43e4-abf4-65a7dd7c8d2f_on_host_569039-infra03_neutron_agents_container-9fd9d285 uint32 1
metric neutron-linuxbridge-agent_d42be504-c264-4241-a531-63f15c919a18_on_host_573951-compute03.qe1.iad3.rackspace.com uint32 1
metric neutron-l3-agent_d9e62038-f952-4bc1-afc3-b03114533bb2_on_host_569038-infra02_neutron_agents_container-0ba110fc uint32 1
metric neutron-metadata-agent_eecf790c-736c-445f-87b1-c3d30b6ee34e_on_host_569038-infra02_neutron_agents_container-0ba110fc uint32 1
metric neutron-linuxbridge-agent_f0d91084-7603-443c-80b1-240a607ece73_on_host_573954-compute06.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_f1a74b02-3b09-4fc9-bc8b-9b716eed9a42_on_host_573960-compute12.qe1.iad3.rackspace.com uint32 1
metric neutron-linuxbridge-agent_f7cd866e-2785-44b4-976e-c22589b279c0_on_host_573955-compute07.qe1.iad3.rackspace.com uint32 1

glance_registry_local_check omits unit and format on timing

gives example output like

status okay
metric glance_registry_local_status uint32 1
metric glance_registry_local_response_time uint32 55.376

instead of using the .3f format and an ms unit on the timing as the other checks do, for example glance_api_local_check.py

status okay
metric glance_api_local_status uint32 1
metric glance_api_local_response_time uint32 235.222 ms
metric glance_active_images uint32 2
metric glance_queued_images uint32 0
metric glance_killed_images uint32 0
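
A small sketch of a formatter that would make the two outputs consistent. `format_metric` is a hypothetical helper, not the function in maas_common, and the `double` type in the usage note is an assumption (the output above reports timings as `uint32`).

```python
def format_metric(name, metric_type, value, unit=None):
    """Format one `metric` line, rendering floats with three decimals."""
    if isinstance(value, float):
        rendered = '%.3f' % value
    else:
        rendered = str(value)
    line = 'metric %s %s %s' % (name, metric_type, rendered)
    if unit:
        line += ' ' + unit  # the unit is optional and goes last
    return line
```

For example, `format_metric('glance_registry_local_response_time', 'double', 55.3761, 'ms')` yields a line ending in `55.376 ms`.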

Rewrite swift-recon.py to use python-swiftclient

All of the other checks similar to swift-recon.py use their client libraries so this should as well. That said, the swift team is moving away from actively developing python-swiftclient so we should also consider the viability of using python-openstacksdk. This will require the check registration to be rewritten in the playbooks.

memcache plugin is broken with syntax error

This is neither a list nor a dictionary:

MEMCACHE_METRICS = {'total_items': 'items',
        'get_hits': 'cache_hits',
        'get_misses': 'cache_misses',
        'total_connections': 'connections']
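
The literal opens with `{` but closes with `]`. Assuming a dictionary mapping memcached stat names to metric names was intended, the corrected form would be:

```python
# Maps memcached stat names to the metric names the plugin reports.
MEMCACHE_METRICS = {
    'total_items': 'items',
    'get_hits': 'cache_hits',
    'get_misses': 'cache_misses',
    'total_connections': 'connections',
}
```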

Create Disk Space Monitor Plugin

Currently we do not have monitoring for disk space. We have come across a couple of customers who had full /boot partitions due to apt's unattended upgrades installing new kernels and filling /boot (being addressed here: https://bugs.launchpad.net/openstack-ansible/+bug/1411897.) We need a plugin in place to check the amount of free disk space on a host that will be used by our monitoring to proactively alert when a disk approaches certain thresholds.

Desired features:

-Checks all mounted filesystems.
-Accepts list of filesystems to exclude from check.
-Prints the name and percentage of disk space remaining for each mountpoint (is this do-able with maas_common? Also, do the maas alerts support dicts or key-value pairs?)
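
The desired features above could be sketched roughly as follows. This is an illustration under stated assumptions: mount points are read from /proc/mounts-style text, one metric is emitted per mount point, and every function and metric name here is hypothetical.

```python
import os


def parse_mount_points(mounts_text, exclude=()):
    """Return unique mount points from /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        if mount_point not in exclude and mount_point not in points:
            points.append(mount_point)
    return points


def percent_free(total_blocks, available_blocks):
    """Percentage of blocks available to unprivileged users."""
    if total_blocks == 0:
        return 0.0  # pseudo filesystems such as proc report zero blocks
    return 100.0 * available_blocks / total_blocks


def disk_free_lines(mount_points, statvfs=None):
    """Build plugin-style output, one metric line per mount point."""
    statvfs = statvfs or os.statvfs
    lines = ['status okay']
    for mount_point in mount_points:
        stats = statvfs(mount_point)
        # Hypothetical naming: '/' becomes 'root', '/var/lib' becomes 'var_lib'.
        name = mount_point.strip('/').replace('/', '_') or 'root'
        free = percent_free(stats.f_blocks, stats.f_bavail)
        lines.append('metric disk_free_percent_%s double %.2f' % (name, free))
    return lines
```

Emitting one flat metric per mount point sidesteps the open question about dicts or key-value pairs, at the cost of needing one alarm per filesystem.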

Add container name to metrics

We wrote the plugins assuming they'd be running inside a container (which would have its own MaaS entity). Since we're now running the plugins from the host, and there may be several instances of the same plugin running on the same host, we need a way to associate different metrics with a particular container.

swift_quarantine_check throws status error unrecognized command "quarantaine"

Current version from the master branch:

status error out = subprocess.check_output(command)\n  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output\n    raise CalledProcessError(retcode, cmd, output=output)\nCalledProcessError: Command '['swift-recon', '-q']' returned non-zero exit status 1\n
Traceback (most recent call last):
  File "./swift-recon.py", line 380, in <module>
    main()
  File "./swift-recon.py", line 369, in main
    stats = get_stats_from(args)
  File "./swift-recon.py", line 353, in get_stats_from
    stats = swift_quarantine()
  File "./swift-recon.py", line 235, in swift_quarantine
    regexp)
  File "./swift-recon.py", line 114, in recon_stats_dicts
    recon_output(for_ring, options)))
  File "./swift-recon.py", line 60, in recon_output
    out = subprocess.check_output(command)
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['swift-recon', '-q']' returned non-zero exit status 1

Missing rabbitmq metric

rabbitmq_status.py (10.1.9) does not provide the 'rabbitmq_max_channels_per_conn--' metric, which causes the MaaS checks (installed from os-ansible-deployment/rpc_deployment/playbooks/monitoring/) to alert.
As a sample, here is the output from a recent call to this script:

status okay
metric rabbitmq_uptime int64 529309065 ms
metric rabbitmq_messages int64 6780 messages
metric rabbitmq_ack int64 1375088 messages
metric rabbitmq_deliver_get int64 1375088 messages
metric rabbitmq_deliver int64 1375088 messages
metric rabbitmq_sockets_total int64 3594 fd
metric rabbitmq_publish int64 1318613 messages
metric rabbitmq_fd_used int64 162 fd
metric rabbitmq_mem_used int64 131505584 bytes
metric rabbitmq_fd_total int64 4096 fd
metric rabbitmq_disk_free_alarm_status uint32 1
metric rabbitmq_proc_used int64 2621 processes
metric rabbitmq_mem_limit int64 40494145536 bytes
metric rabbitmq_mem_alarm_status uint32 1
metric rabbitmq_sockets_used int64 141 fd
metric rabbitmq_messages_unacknowledged int64 0 messages
metric rabbitmq_messages_ready int64 6780 messages
metric rabbitmq_proc_total int64 1048576 processes

rabbitmq_status.py from this repo needs to be modified to support this required metric.

Cheers! :)

Add support for critical filesystem thresholds

Currently we only create warning alarms, which do not trigger core.
I added configurable warning and critical thresholds. Critical thresholds will also set off core.
Additionally, we are working on getting warnings into core as well for our devices.

neutronclient version issue with 9.0.1 pip wheel repo

Downloading/unpacking python-neutronclient==2.3.5 (from -r /usr/lib/rackspace-monitoring-agent/plugins/requirements.txt (line 8))
http://rpc.cloudnull.io/python_packages/9.0.1/ uses an insecure transport scheme (http). Consider using https if rpc.cloudnull.io has it available
Could not find a version that satisfies the requirement python-neutronclient==2.3.5 (from -r /usr/lib/rackspace-monitoring-agent/plugins/requirements.txt (line 8)) (from versions: 2.3.6)
Cleaning up...
No distributions matching the version for python-neutronclient==2.3.5 (from -r /usr/lib/rackspace-monitoring-agent/plugins/requirements.txt (line 8))
Storing debug log for failure in /root/.pip/pip.log

The version should be set to 2.3.6 for python-neutronclient at the 9_0_1_omsa_hotfix tag

/root/.auth_ref.json caching can cause issues

We had disks filling up on compute nodes, which caused MaaS to store corrupt service catalog cache files (it took some time to find that). To handle such cases, I would like to parse this file before actually using it and, if parsing fails, rebuild it. Better still, it could be stored under /run, which is mounted as tmpfs, so it would not be affected by disks filling up.
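The proposal could be sketched roughly as follows. This is an illustration under stated assumptions: `load_cached_auth_ref`, `save_auth_ref` and the `rebuild` callback are hypothetical names, the cache is assumed to be a JSON object, and how a fresh auth reference is obtained is left to the caller.

```python
import json
import os
import tempfile


def save_auth_ref(path, auth_ref):
    """Write the cache to a temporary file, then rename it into place.

    A failed write (for example on a full disk) leaves any existing
    cache file untouched instead of truncating it.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(auth_ref, handle)
        os.rename(tmp_path, path)
    except Exception:
        os.unlink(tmp_path)
        raise


def load_cached_auth_ref(path, rebuild):
    """Return the cached auth reference, rebuilding it if unreadable."""
    try:
        with open(path) as handle:
            auth_ref = json.load(handle)
        if isinstance(auth_ref, dict):
            return auth_ref
    except (IOError, OSError, ValueError):
        pass  # missing, unreadable, truncated or otherwise corrupt

    auth_ref = rebuild()  # caller-supplied function that re-authenticates
    try:
        save_auth_ref(path, auth_ref)
    except OSError:
        pass  # failing to cache should not stop the check from running
    return auth_ref
```

Passing a path under /run, as suggested above, would keep the cache on tmpfs. The parse-before-use step still helps there, because a truncated file can have other causes.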

Add RabbitMQ monitor to alert when count(channel) > count(connection)

Having any RabbitMQ connection servicing more than a single channel indicates a problem with agent <-> RabbitMQ communication. We need an alert any time this occurs.

Expected behavior:

# rabbitmqctl list_connections channels | sort -n | tail -n1 
1

Indicates a problem:

# rabbitmqctl list_connections channels | sort -n | tail -n1 
186833
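A rough sketch of how a plugin might derive this value, assuming the `rabbitmqctl list_connections channels` output shown above (one channel count per line, possibly surrounded by informational lines). The metric name is borrowed from the missing-metric issue earlier on this page, the `channels` unit is an assumption, and the alert threshold itself is left to the alarm criteria.

```python
import subprocess


def parse_channel_counts(output):
    """Extract per-connection channel counts, ignoring non-numeric lines."""
    counts = []
    for line in output.splitlines():
        line = line.strip()
        if line.isdigit():
            counts.append(int(line))
    return counts


def max_channels_per_connection(output):
    """Largest channel count on any single connection (0 if none)."""
    counts = parse_channel_counts(output)
    return max(counts) if counts else 0


def main():
    """Run rabbitmqctl and print the result in plugin output format."""
    raw = subprocess.check_output(
        ["rabbitmqctl", "list_connections", "channels"])
    worst = max_channels_per_connection(raw.decode("utf-8"))
    print("status okay")
    print("metric rabbitmq_max_channels_per_conn int64 %d channels" % worst)
```

An alarm on this metric being greater than 1 would then fire in the problem case shown above and stay quiet in the expected case.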

Galera Check not working if wsrep_local_state is != 4

In my case the check did not work properly if galera is in the following state:

wsrep_local_state 2
wsrep_local_state_comment Donor/Desynced

wsrep_cluster_size 3
wsrep_cluster_status Primary

wsrep_local_state_uuid a1a1ed88-7582-11e4-a4dd-5be7e191d8a4
wsrep_cluster_state_uuid a1a1ed88-7582-11e4-a4dd-5be7e191d8a4

The check did not return any output, neither a status nor metrics.
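To illustrate the expected behavior, here is a sketch of a reporting function that always emits a status line and metrics, whatever the node state. It assumes the wsrep status variables have already been read into a dict of strings. `galera_report` is a hypothetical name, and treating every non-synced state as an error is an assumption, not the plugin's actual policy.

```python
SYNCED = 4  # wsrep_local_state value reported by a synced Galera node


def galera_report(status):
    """Build plugin output lines from a dict of wsrep status variables."""
    state = int(status.get("wsrep_local_state", -1))
    comment = status.get("wsrep_local_state_comment", "unknown")
    cluster_size = int(status.get("wsrep_cluster_size", 0))

    if state == SYNCED:
        lines = ["status okay"]
    else:
        # Report the problem instead of returning nothing at all.
        lines = ["status error wsrep_local_state is %d (%s)" % (state, comment)]

    lines.append("metric wsrep_local_state int64 %d" % state)
    lines.append("metric wsrep_cluster_size int64 %d" % cluster_size)
    return lines
```

With the values from this report (state 2, Donor/Desynced, cluster size 3) the function returns an error status line plus both metrics, so the alarm has something to evaluate.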
