plight's Introduction

Plight

An application-agnostic tool to represent node availability.

What is this nonsense?

In many of our deployment pipelines we had to gather credentials from any number of systems, such as load balancers and monitoring, to disable nodes. Each system might have different requirements for authentication, varying quality of APIs or automation libraries, etc. With stricter RBAC requirements we sometimes ran into a system that required admin access to do a simple state change. Aside from the cumbersome credentials management, access levels can be a problem in more compliance-oriented environments.

Our original goals:

  • Remove a node from being active in a load balancer pool
  • Ensure monitoring knows the node is not active
  • Determine if the node is done draining connections before continuing maintenance

Most of these external systems did have some means of internally managing the state, whether via health checks or some other function. Thus our initial approach was to work with our developers to enable some kind of health state in their applications that we could configure our systems to utilize. However, this path leaves "off the shelf" software without a path.

Once we tried to cover all of our use cases we decided the best route would be to split this functionality out into a standalone web service.

But what about that 3rd goal??

Early on we thought we could easily expose active connections on the host through this service. But really it's a separate function, and only tangentially related. If this is something you are interested in, we'd recommend checking out where we did implement it: the Ansible wait_for module, as state=drained. A standalone implementation is available, but not really encouraged as it was more a proof of concept.

Why is it called Plight?

a dangerous, difficult, or otherwise unfortunate situation

Originally it was called 'nodestatus', so we took to the thesaurus. Based on where we were with the original path to solve this problem, we landed on plight.

Installation

Fedora or EL-based:

We have a COPR that is kept current with releases. After enabling that repository install using: yum install plight

From source

make install

From puppet

  • TODO: push our puppet module to be public and publish to forge

From ansible

Configuration

States

By default Plight comes with 3 explicit states, which as of the 0.1.0 series are configurable in plight.conf.

State     Status Code  Message
Enabled   200          node is available
Disabled  404          node is unavailable
Offline   503          node is offline

Long term we are going to change the status codes to default to 200 for these three states. We maintained the codes from the previous release for compatibility purposes.
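For anything that consumes these responses, the default mapping can be sketched in a few lines of Python. This is only an illustration of the stock defaults listed above: the names DEFAULT_STATES, state_from_status and should_route_traffic are made up for this example and are not part of plight. Because the codes are configurable, and are planned to default to 200, a real check should not rely on the status code alone.

```python
# Default status codes as listed in the States table. They are configurable
# in plight.conf, so this mapping is an assumption about a stock install.
DEFAULT_STATES = {
    200: "enabled",
    404: "disabled",
    503: "offline",
}


def state_from_status(code, states=DEFAULT_STATES):
    """Translate an HTTP status code into a state name ('unknown' if unmapped)."""
    return states.get(code, "unknown")


def should_route_traffic(code, states=DEFAULT_STATES):
    """Only an enabled node should receive traffic."""
    return state_from_status(code, states) == "enabled"
```

A load balancer health check or deployment script could call should_route_traffic() with the code it received and treat anything other than enabled as "do not send traffic here".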

Enable the service

chkconfig plightd on
service plightd start

or

systemctl enable plightd
systemctl start plightd

Firewall

The default port configured for plight is 10101. In our examples directory there is a service entry for firewalld.

Usage

Changing states

Using plight --help from the cli will give you a list of all valid plight commands, which includes start, stop, and a dynamically generated list based on configured states.

Put a node into maintenance mode

plight disable

Put a node into offline mode

plight offline

Return a node to active mode

plight enable

List the configured states

plight list-states

Checking the current state of a node

plight status
or
curl http://localhost:10101 -D -
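The same check can be scripted. The sketch below uses only the Python standard library and assumes the daemon listens on the default port and returns the state message in the response body; fetch_plight_status is a hypothetical helper name, not something plight ships. Note that urllib raises HTTPError for 404 and 503, which with the default codes are ordinary answers from plight, so the helper returns them as data instead of failing.

```python
import urllib.error
import urllib.request


def fetch_plight_status(url="http://localhost:10101", timeout=5.0):
    """Return (status_code, body_text) as reported by the daemon at `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", "replace").strip()
            return response.status, body
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; for plight they simply mean
        # "disabled" or "offline" by default, so hand them back to the caller.
        body = err.read().decode("utf-8", "replace").strip()
        return err.code, body
```

Calling fetch_plight_status() on a node returns a tuple such as the status code and message for its current state. A connection failure (daemon not running, port blocked) still raises urllib.error.URLError, which is usually what a deployment script wants to know about.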

Licensing

All files contained within this distribution are licensed either under the Apache License v2.0 or the GNU General Public License v2.0. You must agree to the terms of these licenses and abide by them before viewing, utilizing, modifying, or distributing the source code contained within this distribution.

Build Notes

Tests

make test

Manual

  • Install via makefile: sudo make install

Fedora/EL

  • Generate the RPM: make or make rpms

EL5

  • Requires buildsys-macros installed

COPR (publishing RPMs)

COPR: https://copr.fedoraproject.org/coprs/xaeth/Plight/

  • Generate SRPM: make srpm
  • Publish SRPM to a publicly available HTTP or FTP repo
  • Load the build into COPR: copr-cli build Plight http://example.com/path/to/plight.src.rpm

Debian

  • Generate deb package: make debs

PPA (publishing DEBs)

  • Generate source package bits: make debsrc
  • Change to ./artifacts/debs/ and generate signed source with changes
cd artifacts/debs
debuild -S -sa
  • Push to PPA: dput ppa:gregswift/plight plight_VERSION_source.changes

plight's People

Contributors

cheord, dwalleck, gregswift, mwhahaha, nickbales, powellchristoph, rackergs, scarlettmoonbell


plight's Issues

Add ability to reload config file

When a new state is added to the config file without a restart, you can set the state. The problem is that the daemon doesn't know about this state, so the default state is used.

cherrypy in el5 missing process plugins

[root@centos5-test plight]# plight start
Traceback (most recent call last):
File "/usr/bin/plight", line 7, in ?
sys.exit(
File "/usr/lib/python2.4/site-packages/pkg_resources.py", line 236, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.4/site-packages/pkg_resources.py", line 2097, in load_entry_point
return ep.load()
File "/usr/lib/python2.4/site-packages/pkg_resources.py", line 1830, in load
entry = __import__(self.module_name, globals(),globals(), ['__name__'])
File "build/bdist.linux-x86_64/egg/plight/util.py", line 4, in ?
ImportError: No module named process.plugins

If state is listed under priorities but the section is not defined we need better error handling

Traceback (most recent call last):
  File "/bin/plight", line 9, in <module>
    load_entry_point('plight==0.1.1', 'console_scripts', 'plight')()
  File "/usr/lib/python2.7/site-packages/plight/util.py", line 220, in run
    config = get_config()
  File "/usr/lib/python2.7/site-packages/plight/util.py", line 52, in get_config
    config['states'] = process_states_from_config(parser, logger)
  File "/usr/lib/python2.7/site-packages/plight/util.py", line 105, in process_states_from_config
    states[state][option] = parser.get(state, option)
  File "/usr/lib64/python2.7/ConfigParser.py", line 607, in get
    raise NoSectionError(section)
ConfigParser.NoSectionError: No section: 'disabled'
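One way to get a friendlier failure is to validate the config before reading per-state options. The sketch below is not plight's code. It assumes a layout in which a [priorities] section lists state names as option keys and each state has its own section, which is what the traceback suggests. MissingStateSection and check_state_sections are hypothetical names, and the example uses Python 3's configparser rather than the Python 2 module in the traceback.

```python
import configparser


class MissingStateSection(Exception):
    """Raised when a prioritized state has no section of its own."""


def check_state_sections(parser, priorities_section="priorities"):
    """Return the state names under priorities, or raise a readable error.

    Assumed layout: option names under [priorities] are state names, and
    every state is expected to have a matching [state] section.
    """
    if not parser.has_section(priorities_section):
        raise MissingStateSection(
            "config has no [%s] section" % priorities_section
        )
    states = parser.options(priorities_section)
    missing = [state for state in states if not parser.has_section(state)]
    if missing:
        raise MissingStateSection(
            "states listed under [%s] but not defined: %s"
            % (priorities_section, ", ".join(missing))
        )
    return states
```

Catching MissingStateSection at startup and printing its message would replace the raw NoSectionError traceback with a single line naming the undefined states.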

Logging-related message on status change

I get the following message whenever I run plight disable or plight enable to change the state; the actual functionality isn't affected (i.e. the service still correctly changes status), so I guess it's just the logging that's affected:

# plight disable
No handlers could be found for logger "plight"

# plight enable
No handlers could be found for logger "plight"
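For context, Python 2's logging module prints this message when a logger with no handlers is asked to emit a record. Python 3 falls back to a last-resort handler instead, so the message does not appear there. A minimal sketch of a fix is to make sure the CLI path attaches a handler before logging. Where plight actually configures its logger is not shown in this report, so get_cli_logger below is a hypothetical helper and not the project's code.

```python
import logging
import sys


def get_cli_logger(name="plight", level=logging.INFO):
    """Return the named logger, attaching a stderr handler if it has none."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(name)s: %(levelname)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

The handlers check keeps repeated calls from stacking duplicate handlers. If the CLI should stay silent rather than log to stderr, attaching logging.NullHandler() instead would also suppress the warning.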

Weird behavior on el7

When installed on el7, plight would start but would not respond on port 10101. The connection was established, but nothing was returned. systemctl stop plightd.service caused a stack trace in the logs.

Once we ran "plight stop", the service script started working.

Error when calling init script with no parameters

Currently if you call the plight init script with invalid parameters or no parameters, you get the following:

[user@host ~]# /etc/init.d/plightd
                                                           [FAILED]
Traceback (most recent call last):
  File "/etc/init.d/plightd", line 232, in ?
    print >> sys.stderr, "Usage: %s [start|stop|restart|force-reload|status|test]" % (SERVICE_NAME)
NameError: name 'SERVICE_NAME' is not defined

Should be:

[user@host ~]# /etc/init.d/plightd
Usage: /etc/init.d/plightd [start|stop|restart|force-reload|status|test]
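The NameError comes from the usage line referencing a name that was never defined. A small sketch of the intended behaviour, building the message from the script's own path, might look like the following. It is written for Python 3, while the init script in the traceback is Python 2, and the function names are illustrative only. Returning exit status 2 for invalid arguments follows the LSB init script convention and is an assumption about what the script should do.

```python
import sys

VALID_ACTIONS = ("start", "stop", "restart", "force-reload", "status", "test")


def usage(script_name):
    """Build the usage line from the script path instead of an undefined name."""
    return "Usage: %s [%s]" % (script_name, "|".join(VALID_ACTIONS))


def parse_action(argv):
    """Return the requested action, or None when it is missing or invalid."""
    if len(argv) == 2 and argv[1] in VALID_ACTIONS:
        return argv[1]
    return None


def main(argv):
    action = parse_action(argv)
    if action is None:
        print(usage(argv[0]), file=sys.stderr)
        return 2  # LSB convention: invalid or excess arguments
    # Dispatching to the real start/stop handlers is out of scope here.
    return 0
```

A script would end with sys.exit(main(sys.argv)), so that calling it with no parameters prints the usage line shown above instead of a traceback.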
