
god's Introduction

God: The Ruby Framework for Process Management

  • Authors: Tom Preston-Werner, Kevin Clark, Eric Lindvall
  • Website: http://godrb.com

Description

God is an easy to configure, easy to extend monitoring framework written in Ruby.

Keeping your server processes and tasks running should be a simple part of your deployment process. God aims to be the simplest, most powerful monitoring application available.

Documentation

See in-repo documentation at REPO_ROOT/doc. See online documentation at http://godrb.com.

Community

Sign up for the god mailing list at http://groups.google.com/group/god-rb

License

See LICENSE file.

god's People

Contributors

b4hand, carsonreinke, chawco, chiriaev, cyrilpic, dacaga, damirainullin, danshultz, defunkt, eric, fcheung, geoffgarside, ice799, jamesds, jberkel, jnewland, kevinclark, lukeasrodgers, mojombo, monde, pdlug, ps2, rafmagana, raggi, robmiller, spraints, tmm1, underley, willbryant, woahdae

god's Issues

can't start task

Running "sudo god log wallpaper_pic_2" results in:

I [2011-03-21 16:55:14] INFO: wallpaper_pic_2 moved 'up' to 'up'
I [2011-03-21 16:55:14] INFO: wallpaper_pic_2 [trigger] process is not running (ProcessRunning)
I [2011-03-21 16:55:14] INFO: wallpaper_pic_2 move 'up' to 'start'
I [2011-03-21 16:55:14] INFO: wallpaper_pic_2 start: rake crawler:wallpaper:pic DIVISOR=5 REMINDER=2 RAILS_ENV=production
I [2011-03-21 16:55:24] INFO: wallpaper_pic_2 moved 'up' to 'up'
I [2011-03-21 16:55:24] INFO: wallpaper_pic_2 [trigger] process is not running (ProcessRunning)
I [2011-03-21 16:55:24] INFO: wallpaper_pic_2 move 'up' to 'start'
I [2011-03-21 16:55:24] INFO: wallpaper_pic_2 start: rake crawler:wallpaper:pic DIVISOR=5 REMINDER=2 RAILS_ENV=production
I [2011-03-21 16:55:44] INFO: wallpaper_pic_2 moved 'up' to 'up'

The system is Red Hat 4.1.2-48 with Linux kernel 2.6.18-194.el5. There is no log in /var/log/ or RAILS_ROOT/log. Thanks!

Ruby 1.9 MonitorMixin::ConditionVariable#wait timeout not supported

It seems that the timeout parameter to MonitorMixin::ConditionVariable#wait is not currently supported in Ruby 1.9. God uses it at lib/god/driver.rb:88. I'm not sure whether the timeout at this point in god is required or whether it's just adding a back-off so that the event isn't retried too quickly. It might be worth checking whether god is being invoked by Ruby 1.9 and, in that case, setting the delay to nil.
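The version check suggested above could look roughly like this. It is only a sketch: wait_timeout_for is a hypothetical helper, not something in god, and treating every 1.9.x interpreter as lacking timeout support is an assumption made for illustration.

```ruby
# Hypothetical helper, not part of god: pick the timeout to hand to
# MonitorMixin::ConditionVariable#wait depending on the interpreter version.
def wait_timeout_for(delay, ruby_version = RUBY_VERSION)
  # Assumption: any 1.9.x interpreter is treated as lacking timeout support,
  # so the caller waits without a timeout (nil) there.
  ruby_version.start_with?('1.9') ? nil : delay
end

puts wait_timeout_for(5, '1.8.7').inspect
puts wait_timeout_for(5, '1.9.1').inspect
```

Note that passing nil means the wait blocks until it is signalled, so this only helps if the timeout really is just a back-off.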

uninitialized constant God::Contacts::Jabber

I have a problem with Jabber notification. I tried with email and it works. I wonder what I did wrong.

    God::Contacts::Jabber.defaults do |j|
      j.host = 'jabber.org'
      j.port = 5222
      j.from_jid = '[someone]@jabber.org'
      j.password = '[somepassword]'
    end

    God.contact(:jabber) do |z|
      z.name = 'joe'
      z.to_jid = "[email protected]"
      z.subject = "notification"
    end

This results in the error: uninitialized constant God::Contacts::Jabber

God cannot run activities as root user (sudo)?

I have a process (a node.js instance) that I need to run as root using sudo, and the process hangs in the "starting" state when trying to run for the first time. I think this is probably because I'm not given the chance to enter the password when sudo runs.

Is there a good way around this or is this an enhancement to pursue?

Segmentation fault with Ruby Enterprise Edition 2011.12

God worked perfectly with previous REE versions; we have been running it for almost 2 years without problems.

After upgrading REE to ree-1.8.7-2011.12, we got the following error, and the god process itself disappears after starting the monitored processes.

The OS is Ubuntu 10.04.3 LTS.

/usr/local/rvm/rubies/ree-1.8.7-2011.12/lib/ruby/1.8/monitor.rb:173: [BUG] Segmentation fault
ruby 1.8.7 (2011-12-28 MBARI 8/0x6770 on patchlevel 357) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 2011.12

disk_usage doesn't work (Ubuntu 10.4)

I posted this as a comment on an old closed issue, http://github.com/mojombo/god/issues/closed#issue/10, but as that issue still looks closed and I don't see a way to re-open it, I am opening this new one.

I have the same problem that anlek reported in the issue referred to above.

My OS: Ubuntu 10.4

$ df -P | grep sda1
/dev/sda1             19610300   3112624  15700760      17% /

Actual regex:
$ df -P | grep -i " /dev/sda1$" | awk '{print $5}' | sed 's/%//'
$ (none results)

Proposed regex:
$ df -P | grep -i "^/dev/sda1 " | awk '{print $5}' | sed 's/%//'
$ 17

I don't know whether it would be a good idea to modify the valid? method so that it checks that the command returns a non-empty result.

Thanks for your work

f.
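As a rough illustration of the disk_usage report above, the same lookup can be done in plain Ruby by splitting each `df -P` line into fields instead of relying on a grep anchor. This is a sketch only: disk_usage_percent is a hypothetical helper, not god's implementation, and it assumes mount points contain no spaces. Returning nil when nothing matches gives a validity check something concrete to test.

```ruby
# Hypothetical helper: find the usage percentage for a device or mount point
# in the text produced by `df -P`. Returns nil when no line matches.
def disk_usage_percent(df_output, target)
  df_output.each_line do |line|
    fields = line.split
    next if fields.size < 6              # skip blank or malformed lines

    filesystem = fields.first            # e.g. /dev/sda1
    mount_point = fields.last            # e.g. /
    next unless filesystem == target || mount_point == target

    return fields[4].delete('%').to_i    # fifth column is the capacity
  end
  nil
end

sample = "/dev/sda1             19610300   3112624  15700760      17% /\n"
puts disk_usage_percent(sample, '/dev/sda1').inspect
puts disk_usage_percent(sample, '/missing').inspect
```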

The server is not available (or you do not have permissions to access it)

We see this error about 1/3rd of the time when trying to restart god (after, say, disabling it via god terminate); all god commands save 'god check' will report this. What are some of the causes of this error?

We also see instances where a watch with both a :process_running and a :cpu_usage condition will fail to restart a process which is no longer running. Checking the god log for that watch, we see a large list of successful status checks for cpu usage -- 0%, naturally.

Are we missing some basic permissions issue here? God is running as root, so it seems unlikely, but...

Duplicate PIDs in .god/pids

We found that god wasn't starting the correct number of workers in our beanstalk cluster. There should have been 32 workers, but there were only 31 registered in our beanstalkd and only 31 showing up in ps. When we looked at .god/pids however, there were 32 files. After poking around a bit we found the same pid was in 2 different files.
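For anyone wanting to check for the same symptom, a small script along these lines lists pids that appear in more than one file. It is a sketch under assumptions: the directory path and the *.pid file naming are guesses about the layout, and duplicate_pids is a hypothetical helper.

```ruby
# Hypothetical helper: group pid files by the pid they contain and return
# only the pids that show up in more than one file.
def duplicate_pids(pid_dir)
  files_by_pid = Hash.new { |hash, pid| hash[pid] = [] }

  # Assumption: pid files are named *.pid and hold a single pid each.
  Dir.glob(File.join(pid_dir, '*.pid')).sort.each do |path|
    pid = File.read(path).strip
    files_by_pid[pid] << File.basename(path) unless pid.empty?
  end

  files_by_pid.select { |_pid, files| files.size > 1 }
end

# Example call; the directory is an assumption about where pids are kept.
duplicate_pids(File.expand_path('~/.god/pids')).each do |pid, files|
  puts "pid #{pid} appears in: #{files.join(', ')}"
end
```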

log feature undocumented

From the Resque mailing list:

First off, in god, you can use the undocumented .log property on watches to set a log file for stdout output to get logged in (otherwise it goes to /dev/null). After I learnt that, I resolved a couple of problems

Increment/decrement processes

This is a feature request of course, but I'd like to be able to increment or decrement processes. Ideally I would like something like the following

$ god increment group1 4
# or
$ god +1 group1
# then w.start is called, another process is started and monitored.

Documentation wrong for email notifications per smtp

Hi,

on your website you have the following in your documentation:

[...]
=== SMTP Options (when delivery_method = :smtp) ===
server_host     - The String hostname of the SMTP server (default: localhost).
server_port     - The Integer port of the SMTP server (default: 25).
server_auth     - The Boolean of whether or not to use authentication
                  (default: false).

=== SMTP Auth Options (when server_auth = true) ===
[...]

So you define server_auth as a boolean.

Looking into your code in this file:
https://github.com/mojombo/god/blob/master/lib/god/contacts/email.rb#L113

you send the server_auth argument to the smtp server:

def notify_smtp(mail)
  args = [arg(:server_host), arg(:server_port)]
  if arg(:server_auth)
    args << arg(:server_domain)
    args << arg(:server_user)
    args << arg(:server_password)
    args << arg(:server_auth)
  end

  Net::SMTP.start(*args) do |smtp|
    smtp.send_message(mail, arg(:from_email), arg(:to_email))
  end
end

If I set server_auth to true, as defined in your documentation, I get the error

INFO: failed to send email to [email protected] via smtp: wrong authentication type true

because true is not a valid authentication type for smtp.

A correct value is, for example:

d.server_auth = 'plain'
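One way the mismatch described above could be smoothed over is to normalise the setting before it reaches Net::SMTP. This is a sketch, not god's code: smtp_auth_type is a hypothetical helper, mapping a bare true to :plain is my own assumption, and the list only contains the authentication types I am sure Net::SMTP accepts.

```ruby
# Authentication types known to be accepted by Net::SMTP.
KNOWN_AUTH_TYPES = [:plain, :login, :cram_md5].freeze

# Hypothetical helper: turn the configured server_auth value into a symbol
# that can be passed to Net::SMTP.start, or nil when auth is disabled.
def smtp_auth_type(server_auth)
  return nil if server_auth.nil? || server_auth == false

  # Assumption: a bare `true` (as the documentation suggests) means :plain.
  type = server_auth == true ? :plain : server_auth.to_s.to_sym

  unless KNOWN_AUTH_TYPES.include?(type)
    raise ArgumentError, "wrong authentication type #{server_auth}"
  end
  type
end

puts smtp_auth_type('plain').inspect
puts smtp_auth_type(true).inspect
```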

Multiple Processes/PIDs per Watch

Wanting to use God to watch delayed_job for a Rails app. I'd like to allow for multiple unique delayed_job workers using the -n switch available in delayed/command.rb. When this switch is used, multiple delayed_job worker processes are started, all using the same Rails environment to save memory. When these processes are started, PID files like delayed_job.0.pid, delayed_job.1.pid, etc are created in tmp/pids/.

If I specify "script/delayed_job -n 5 start" in my watch configuration for delayed_job, God seems to fail when watching the processes. I tried specifying "Dir['tmp/pids/delayed_job.*.pid']" as the pid_file in the watch, but I assume God is expecting a string here, not an array. I could just configure multiple god watches for multiple delayed_job processes, similar to how people do it for Thin configurations, but that takes away the advantage of having multiple delayed_job workers being able to use the same Rails environment to save memory.

Any ideas? By the way, the God application is freaking awesome regardless. :)

Ruby 1.9.1 and god

Attempting to run god with Ruby 1.9.1 results in the following error:

F [2009-11-27 06:14:27] FATAL: Unhandled exception in driver loop - (NoMethodError): undefined method `critical=' for Thread:Class
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:100:in `ensure in pop'
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:100:in `pop'
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:166:in `block (2 levels) in initialize'
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:164:in `loop'
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:164:in `block in initialize'
/usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:112:in `push': undefined method `critical=' for Thread:Class (NoMethodError)
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/driver.rb:192:in `message'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/task.rb:151:in `move'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/watch.rb:96:in `monitor'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god.rb:608:in `block in start'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god.rb:608:in `each'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god.rb:608:in `start'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god.rb:636:in `at_exit'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god.rb:669:in `block in <top (required)>'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/cli/run.rb:87:in `fork'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/cli/run.rb:87:in `run_daemonized'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/cli/run.rb:17:in `dispatch'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/lib/god/cli/run.rb:8:in `initialize'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/bin/god:119:in `new'
    from /usr/lib/ruby/gems/1.9.1/gems/god-0.7.18/bin/god:119:in `<top (required)>'
    from /usr/bin/god:19:in `load'
    from /usr/bin/god:19:in `<main>'
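For context, Thread.critical= no longer exists in Ruby 1.9, and the backtrace shows it being called from the driver's push and pop. The usual replacement is a Mutex around the shared queue. The class below is only a sketch of that idea with a hypothetical name; it is not god's driver code.

```ruby
require 'thread'

# Hypothetical stand-in for a driver event queue that guards its state with
# a Mutex instead of toggling Thread.critical.
class SafeEventQueue
  def initialize
    @events = []
    @lock = Mutex.new
  end

  # Append an event; only one thread can touch @events at a time.
  def push(event)
    @lock.synchronize { @events << event }
    self
  end

  # Remove and return the oldest event, or nil when the queue is empty.
  def pop
    @lock.synchronize { @events.shift }
  end

  def size
    @lock.synchronize { @events.size }
  end
end

queue = SafeEventQueue.new
queue.push(:poll).push(:restart)
puts queue.pop.inspect
```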

"god check" failure

Running "god check" results in:


using event system: netlink
starting event handler
forking off new process
forked process with pid = 3550
killing process
[fail] never received process exit event

The god log also says that events are not supported. However, "lsmod |grep cn" shows that it is loaded. It seems to me that god is incorrectly thinking that events are not supported on my system, or the documentation is incorrect.

Using Arch Linux, kernel 2.6.24-xen on Slicehost.

God appears to be leaving old god processes around indefinitely

When god forks processes, whether to do monitoring or to restart processes, it seems to leave a lot of them lying around. I seem to notice this happening after we do deploys, though not consistently. On deploys, we reload the god config (via god load) and then we restart some processes. On our staging server, we have only four processes being monitored, yet we have 17 god processes running, which seems excessive. Also, on a recent deploy, I noticed this log output after reloading the god config.

We are using ruby 1.9.

I [2011-06-22 16:54:14]  INFO: api-unicorn-master-1 Reloaded config
I [2011-06-22 16:54:14]  INFO: resque-0 unwatched
F [2011-06-22 16:54:14] FATAL: Unhandled exception in driver loop - (NoMethodError): undefined method `handle_event' for nil:NilClass
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:151:in `block (2 levels) in initialize'
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:149:in `loop'
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:149:in `block in initialize'
I [2011-06-22 16:54:14]  INFO: resque-0 Reloaded config
I [2011-06-22 16:54:14]  INFO: resque-1 unwatched
F [2011-06-22 16:54:14] FATAL: Unhandled exception in driver loop - (NoMethodError): undefined method `handle_event' for nil:NilClass
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:151:in `block (2 levels) in initialize'
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:149:in `loop'
/usr/local/rvm/gems/ruby-1.9.2-p180/gems/god-0.11.0/lib/god/driver.rb:149:in `block in initialize'
I [2011-06-22 16:54:14]  INFO: resque-1 Reloaded config
I [2011-06-22 16:54:14]  INFO: resque-scheduler Loaded config
I [2011-06-22 16:54:14]  INFO: resque-scheduler move 'unmonitored' to 'init'
D [2011-06-22 16:54:14] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x00000001cbae48> in 0 seconds
I [2011-06-22 16:54:14]  INFO: resque-scheduler moved 'unmonitored' to 'init'
I [2011-06-22 16:54:15]  INFO: resque-scheduler [trigger] process is running (ProcessRunning)
D [2011-06-22 16:54:15] DEBUG: resque-scheduler ProcessRunning [true] {true=>:up, false=>:start}
I [2011-06-22 16:54:15]  INFO: resque-scheduler move 'init' to 'up'
D [2011-06-22 16:54:15] DEBUG: driver schedule #<God::Conditions::MemoryUsage:0x00000001cbb9d8> in 0 seconds
D [2011-06-22 16:54:15] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x00000001cb9980> in 0 seconds
I [2011-06-22 16:54:15]  INFO: resque-scheduler moved 'init' to 'up'
I [2011-06-22 16:54:15]  INFO: resque-scheduler [ok] memory within bounds [73620kb] (MemoryUsage)
D [2011-06-22 16:54:15] DEBUG: resque-scheduler MemoryUsage [false] {true=>:restart}
D [2011-06-22 16:54:15] DEBUG: driver schedule #<God::Conditions::MemoryUsage:0x00000001cbb9d8> in 30 seconds
I [2011-06-22 16:54:15]  INFO: resque-scheduler [ok] process is running (ProcessRunning)
D [2011-06-22 16:54:15] DEBUG: resque-scheduler ProcessRunning [false] {true=>:start}
D [2011-06-22 16:54:15] DEBUG: driver schedule #<God::Conditions::ProcessRunning:0x00000001cb9980> in 30 seconds
I [2011-06-22 16:54:15]  INFO: resque-1 move 'unmonitored' to 'restart'
I [2011-06-22 16:54:15]  INFO: resque-scheduler move 'up' to 'restart'
I [2011-06-22 16:54:15]  INFO: resque-0 move 'unmonitored' to 'restart'
I [2011-06-22 16:54:15]  INFO: resque-scheduler stop: kill -QUIT `cat /var/www/api/current/tmp/pids/resque-scheduler.pid`
I [2011-06-22 16:54:15]  INFO: resque-0 stop: kill -QUIT `cat /var/www/api/current/tmp/pids/resque-0.pid`
I [2011-06-22 16:54:15]  INFO: resque-1 stop: kill -QUIT `cat /var/www/api/current/tmp/pids/resque-1.pid`

sugar.rb causes 5.kilobytes and 5.bytes to return the same value

If you're (for whatever reason) loading God alongside a Rails app, 5.kilobytes and 5.bytes will return the same value of 5 because God's sugar.rb uses kilobytes as the smallest size granularity while ActiveSupport uses bytes. Why not use the same conventions as Rails? This isn't a big deal because there's really no reason to include the God gem in your gemfile (as I mistakenly did), but the question remains - why not go as small as bytes?
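To make the clash concrete, here are the two unit conventions side by side, written as plain module functions rather than Numeric patches so that nothing is overridden. The module names are hypothetical, and the megabytes multiplier on the kilobyte-based side is my assumption about how such a convention would scale.

```ruby
# Bytes are the base unit (the convention the issue attributes to ActiveSupport).
module ByteBased
  def self.bytes(n)
    n
  end

  def self.kilobytes(n)
    n * 1024
  end
end

# Kilobytes are the base unit (the convention the issue attributes to sugar.rb).
module KilobyteBased
  def self.kilobytes(n)
    n
  end

  # Assumption: larger units scale up from kilobytes by 1024.
  def self.megabytes(n)
    n * 1024
  end
end

# If both conventions patch Numeric, whichever definition of kilobytes loads
# last wins, which is how 5.kilobytes can end up equal to 5.bytes.
puts ByteBased.kilobytes(5)
puts KilobyteBased.kilobytes(5)
```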

[BUG] Segmentation fault using transitions on Ubuntu 10.04.2 LTS

Backtrace:
https://gist.github.com/1002902

God watch file:
https://gist.github.com/1002912

If lines 5/6 are removed and I 'kill -QUIT' the master unicorn process, the given backtrace results. This happens on a regular basis on two machines running Ubuntu 10.04.2 LTS and Ubuntu 10.10. I can also get the same fault several other ways, but the segmentation fault always occurs in god-0.11.0/lib/god/event_handler.rb on line 62.

I am running ruby1.9.2-p180. I tried to get a gdb backtrace, but I could not get it to load correctly.

Please let me know if you need further information!

disk_usage does not work

I believe that the actual call for usage fails

Line 21 in disk_usage.rb
usage = df -P | grep -i " #{self.mount_point}$" | awk '{print $5}' | sed 's/%//'
should be
usage = df -P | grep -i #{self.mount_point} | awk '{print $5}' | sed 's/%//'

At least that's what I think it should be.

Andrew

jabber memory leak

There isn't a call to client.close in the jabber code, and thus it
makes God leak memory when you use it.

Solution:

http://github.com/woahdae/god/commit/210907fa9a9fdf765b7241666ac93682adc372db
http://github.com/woahdae/god/commit/66ec9e6f3da373df9acdf6592fb333157391a592

And "discussed":

http://groups.google.com/group/god-rb/browse_thread/thread/c8d70f36e61f4e3d/4814fb7bac15ef77?lnk=gst&q=jabber#4814fb7bac15ef77

I sent a pull request to the author back in December of last year; this isn't a feature request, I'm handing in a very reproducible bug and fix. I don't use god anymore, and I'm going to delete my fork with the fixes during 'spring cleaning' of my github account. Hope they get picked up by then.

God occasionally does not realise that a process has died and reports 0kb memory usage in log

I'm running god on ubuntu 10.04

Ruby 1.8.7 (standard debian package)
God 0.11.0

I occasionally see an issue where the process is not running (according to ps), but god log reports something like the following:

$ sudo god log my_process
Please wait...
I [2010-10-18 10:50:24]  INFO: my_process [trigger] process is running (ProcessRunning)
I [2010-10-27 16:14:17]  INFO: my_process [ok] memory within bounds [0kb, 0kb, 0kb, 0kb, 0kb] (MemoryUsage)
I [2010-10-27 16:14:37]  INFO: my_process [ok] memory within bounds [0kb, 0kb, 0kb, 0kb, 0kb] (MemoryUsage)
I [2010-10-27 16:14:57]  INFO: my_process [ok] memory within bounds [0kb, 0kb, 0kb, 0kb, 0kb] (MemoryUsage)
I [2010-10-27 16:15:17]  INFO: my_process [ok] memory within bounds [0kb, 0kb, 0kb, 0kb, 0kb] (MemoryUsage)
I [2010-10-27 16:15:37]  INFO: my_process [ok] memory within bounds [0kb, 0kb, 0kb, 0kb, 0kb] (MemoryUsage)

and god status reports that the process is up. Therefore the process does not get restarted - which is a Very Bad Thing.

The pid file in /var/run/god/my_process.pid reports a pid which doesn't correspond to any running process. Looking at the log files for the process which died, it appears that the process did not shut down cleanly.

Does anyone have any ideas what the issue could be, or how I could debug this when it happens again? Thanks.
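As a debugging aid for the next time this happens, a check like the following tells you whether the pid recorded in a pid file still belongs to a live process. It is a sketch with hypothetical helper names; it relies on Process.kill with signal 0, which only tests for existence and sends nothing. The path in the last line is the one mentioned above.

```ruby
# Hypothetical helper: true if a process with this pid currently exists.
def pid_alive?(pid)
  Process.kill(0, pid)   # signal 0 checks existence without sending a signal
  true
rescue Errno::ESRCH
  false                  # no such process
rescue Errno::EPERM
  true                   # process exists but belongs to another user
end

# Hypothetical helper: true if the pid file points at no running process.
def stale_pid_file?(path)
  pid = File.read(path).to_i
  pid <= 0 || !pid_alive?(pid)
end

path = '/var/run/god/my_process.pid'
puts "stale pid file: #{path}" if File.exist?(path) && stale_pid_file?(path)
```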

Stuck on init

I'm using god to launch camo as seen here: https://gist.github.com/675038

Camo starts up just fine but god never gets out of init. If I restart god, it shows camo as up but if I attempt to stop and start camo with god I get the same error again.

Twitter gem has changed

Instead of Twitter::Base.new(username, password) you have to build a Twitter::HTTPAuth.new('username', 'password') first. I am willing to submit a patch, but I am not sure if backward compatibility with the old twitter gem is necessary?

god load seems unreliable

We're troubleshooting an issue where god load stops working reliably after a few runs. Running the load command from the console works about 3-5 times before the next load command gets stuck, leaving multiple god processes running. We're using god to manage our Resque workers. This is on an EC2 Ubuntu 10.04 LTS (Lucid Lynx) 64-bit Server using ruby 1.9.2 via RVM. God is run as the existing ubuntu user, not as root.

We're starting god via an RVM wrapper:

/home/ubuntu/.rvm/bin/bootup_god --no-syslog --no-events -l /tmp/god.log --log-level debug

Gist of our resque.god file.

$ god -v
Version 0.11.0
$ ruby --version
ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux]

To reproduce, we run god load config/god/*.god about 2-3x and everything seems to be happy. Then on the next load, it gets stuck and only prints this:

$ god load config/god/*.god
Sending 'load' command

Checking the processes:

$ ps aux | grep god
ubuntu   16386  0.1  0.2 104904 16816 pts/0    Sl   08:22   0:00 /home/ubuntu/.rvm/rubies/ruby-1.9.2-p180/bin/ruby /home/ubuntu/.rvm/gems/ruby-1.9.2-p180/bin/god --no-syslog --no-events -l /tmp/god.log --log-level debug
ubuntu   21970 127131031  0.1 104904 13512 pts/0 Sl 08:28 17179869:11 /home/ubuntu/.rvm/rubies/ruby-1.9.2-p180/bin/ruby /home/ubuntu/.rvm/gems/ruby-1.9.2-p180/bin/god --no-syslog --no-events -l /tmp/god.log --log-level debug

Note the weird crazy 2nd process (21970).

The god log doesn't print any warnings or anything like that.

I've seen some things online that suggest God isn't 100% compatible with 1.9, so I'm not sure if it's that. I've tried all sorts of combinations (not using syslog, using syslog, running as root, simple and complicated resque.god files) and, anecdotally, nothing seems to change this. That said, the load failure is not easily reproducible, so I can't say for sure that none of those changes made any difference. I did just reproduce it with the above settings and files in order to create this issue, so I know for sure it does happen in this situation.

Happy to add any more logging or info as needed.

Cannot read ENV variables in God config

I'm trying to use God on my production server (Ubuntu 10.04) with RVM, Ruby 1.9.2p290 and Rails 3.1 as described here: http://makandra.com/notes/1431-resque-+-god-+-capistrano. (I pasted the resque.god config file here: https://gist.github.com/1187383)

The problem is that I cannot set the RAILS_ENV or RAILS_ROOT env variables (or any other for that matter) when I load the config file.
The article uses Capistrano's run helper method on destroy but even if I just execute

RAILS_ROOT=foo RAILS_ENV=bar god load config/god/resque.god

directly on the server I get

Sending 'load' command

RAILS_ENV not set.
/home/rails/apps/myapp/releases/20110901211011/config/god/resque.god:1:in `root_binding'
/home/rails/apps/myapp/shared/bundle/ruby/1.9.1/gems/god-0.11.0/lib/god.rb:549:in `eval'
/home/rails/apps/myapp/shared/bundle/ruby/1.9.1/gems/god-0.11.0/lib/god.rb:549:in `running_load'
/home/rails/apps/myapp/shared/bundle/ruby/1.9.1/gems/god-0.11.0/lib/god/socket.rb:58:in `method_missing'
/home/rails/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/drb/drb.rb:1558:in `perform_without_block'
/home/rails/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/drb/drb.rb:1518:in `perform'
/home/rails/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/drb/drb.rb:1592:in `block (2 levels) in main_loop'
/home/rails/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/drb/drb.rb:1588:in `loop'
/home/rails/.rvm/rubies/ruby-1.9.2-p290/lib/ruby/1.9.1/drb/drb.rb:1588:in `block in main_loop'

When I insert raise ENV.inspect at the top of the file, I see lots of env variables, just not the ones I try to set on the command line.
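The backtrace above goes through drb.rb and running_load, which suggests the config is evaluated inside the already running god process rather than in the shell that ran god load, so variables set on that command line would not be visible there. Until that is sorted out, a config can at least fall back to a default instead of aborting. This is a sketch: setting is a hypothetical helper, and the default values are placeholders.

```ruby
# Hypothetical helper: read a setting from the environment, falling back to
# a default when the variable is missing or empty.
def setting(name, default, env = ENV)
  value = env[name]
  value.nil? || value.empty? ? default : value
end

# Placeholder defaults; adjust to the real deployment.
rails_env = setting('RAILS_ENV', 'production')
rails_root = setting('RAILS_ROOT', Dir.pwd)
puts "using #{rails_env} in #{rails_root}"
```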

God on custom port & "The server is not available" error.

I just noticed an issue: when you start god on a custom port, you are unable to talk to the process using the god command.

god -c blah.god -p 6776

Then try

god status

which results in

The server is not available (or you do not have permissions to access it)

Leaving the default port, without specifying -p, works.

Race condition makes "god terminate" intermittently fail

Almost every time I run terminate on one of my 8-CPU daemon servers with a few jobs running and one failing, it doesn't work, because it never actually runs the stop action. It's pretty messy to debug, but I believe I have found at least one problem, if not the only one: there's a race condition in the event processing.

The God stop_all method does this:

self.watches.sort.each do |name, w|
  Thread.new do
    w.unmonitor if w.state != :unmonitored
    w.action(:stop) if w.alive?
  end
end

The problem is that that unmonitor call gets added to the driver events queue to be run asynchronously. If the driver happens to get a turn before the next line, things work. But if not - if this stop_all method continues running before driver wakes up and grabs the move(:up, :unmonitored) event from the queue - then the :stop action will get queued immediately behind it in the driver event queue.

Unfortunately, when the driver runs the Task#move(:up, :unmonitored), it does this:

    # cleanup from current state
    self.driver.clear_events

This results in the stop event being cleared from the events queue! Accordingly, the unmonitor happens but the stop doesn't, so the terminate method then rolls on, obliviously waiting for the watch to finish even though it's never been stopped, eventually giving up.

I can see a couple of ways to patch this. The most obvious is to move the unmonitor state transition and stop action into one driver event, but that seems like a bit of a hack.

Why does the code clear the events queue? Do we need to unmonitor before queueing the stop action?
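The ordering problem described above can be reproduced with a toy queue. This is a simulation only: TinyDriver is a hypothetical stand-in, not God::Driver, and the second function shows just one of the possible fixes (doing the stop inside the same event as the unmonitor).

```ruby
# Hypothetical stand-in for the driver's event queue.
class TinyDriver
  attr_reader :ran

  def initialize
    @queue = []
    @ran = []
  end

  # Queue a named event with an optional handler block.
  def schedule(name, &handler)
    @queue << [name, handler]
  end

  # Mirrors the clear_events call quoted above: drop everything still queued.
  def clear_events
    @queue.clear
  end

  # Run queued events in order and return the names of those that ran.
  def run
    until @queue.empty?
      name, handler = @queue.shift
      @ran << name
      handler.call(self) if handler
    end
    @ran
  end
end

# Racy ordering: :stop is queued before the driver has processed :unmonitor,
# so the cleanup inside :unmonitor throws it away.
def racy_terminate
  driver = TinyDriver.new
  driver.schedule(:unmonitor) { |d| d.clear_events }
  driver.schedule(:stop)
  driver.run
end

# Possible fix: perform the stop as part of the same event.
def combined_terminate
  driver = TinyDriver.new
  driver.schedule(:unmonitor) do |d|
    d.clear_events
    d.ran << :stop
  end
  driver.run
end

puts racy_terminate.inspect
puts combined_terminate.inspect
```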

syslogger tries to interpolate %

The ruby syslog driver treats % characters specially:

$ ruby -e" require 'syslog'; Syslog.open('test'); Syslog.log(Syslog::LOG_INFO, '%D') "
-e:1:in `log': malformed format string - %D (ArgumentError)
from -e:1

This causes god to break if anything being logged happens to contain a %. For example, if w.start = 'git daemon --interpolated-path=gist%D'

diff --git a/lib/god/sys_logger.rb b/lib/god/sys_logger.rb
index 5979b67..a8ec94d 100644
--- a/lib/god/sys_logger.rb
+++ b/lib/god/sys_logger.rb
@@ -35,7 +35,7 @@ begin
#
# Returns Nothing
def self.log(level, text)
- Syslog.log(SYMBOL_EQUIVALENTS[level], text)
+ Syslog.log(SYMBOL_EQUIVALENTS[level], text.gsub('%','%%'))
end
end
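The effect of the escaping in the patch above can be checked without touching syslog, since Kernel#format applies the same kind of % interpretation. This is a sketch: escape_for_syslog is a hypothetical name for the gsub in the diff. Another option, which I have not tested against god, would be to pass '%s' as the format string and the text as its argument.

```ruby
# Hypothetical name for the gsub in the patch: double every % so that a
# format-style logger prints it literally instead of interpreting it.
def escape_for_syslog(text)
  text.gsub('%', '%%')
end

command = 'git daemon --interpolated-path=gist%D'

# The unescaped text is rejected as a format string...
begin
  format(command)
rescue ArgumentError => e
  puts "unescaped: #{e.class}"
end

# ...while the escaped text round-trips to the original.
puts format(escape_for_syslog(command))
```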

segfault with Ruby 1.9.3-p0 and Ubuntu 10.04

$ sudo bundle exec god check
using event system: netlink
starting event handler
forking off new process
forked process with pid = 31245
killing process

It starts up fine if the process it's monitoring is already running, but if I kill the process, God restarts it then segfaults. So it seems that it's caused by the "start" transition.

Segfault detailed here:
https://gist.github.com/1361613

God / Bundler / RVM

I'm having an extremely difficult time figuring out how to make the combination of God / Bundler / RVM all play nicely together. I do have a setup that is working and running, though. My assumption is that the cause is that god is run with sudo as root rather than as my deploy user. I've tried following rvm's guide on using a wrapper, but it simply resulted in the same issue.

The issue I have is that anytime I now attempt to change a gem in my Gemfile it breaks god. I will receive errors that it cannot find the new gem in my log file. I have the gem installed under both the system default ruby, and for each rvm rubies & gemset combination. I've eliminated all of the rubies not being used so I am now down to just the ubuntu system default 1.8.7, and the rvm installed 1.9.2. The gems are installed under the system default ruby, the ruby 1.9.2 global gemset, and the ruby 1.9.2 custom gemset. I can't get god to find the gems though even though my start/stop commands are being run through bundle exec...

I ran into this issue once before, and after hours of tediously uninstalling and reinstalling rvm, rebooting the server, etc., I somehow got it to work without really being sure what solved it. Now I've run into this again, and after uninstalling and reinstalling rvm all over again and rebooting the server with the gem installed under every ruby / gemset combination on the system, I still run into gem not found errors.

Basically it runs fine with the current production Gemfile I have, but I cannot possibly update any gems without it causing god to break from being unable to find the new gems. This is all despite the fact that they are installed both during capistrano deployment, and then I will even manually install them under every ruby / gemset combination on the system.

Does anyone else use god with bundler & rvm who could help shed light on this issue? Is there any way to run god as my deploy user? Or could someone help me debug this issue? I can provide any relevant information that would help track down the root cause; just let me know.

Setting working directory not working with Ruby 1.9.2 ?

I've been trying to start a resque worker with god (using ruby 1.9 for the first time - I did this before with 1.8.7 and it worked). If I do the following, rake cannot find the Rakefile:

w.dir           = '/var/sites/appname/current' 
w.start         = "/usr/local/bin/rake resque:work" 

Whereas if I do this, it works (but then god stop won't stop the worker process but only the parent sh process):

w.start         = "cd /var/sites/appname/current && /usr/local/bin/rake resque:work" 

Could it be that w.dir doesn't work as expected with 1.9 ?
Anyone else seeing this behaviour ?

God misreports permissions when run as a non-root user

When I try to run god as the "deploy" user on one of our servers, I see the following output in the god log:

E [2011-05-23 18:23:43] ERROR: PID file directory '/var/www/api/shared/pids' is not writable by deploy
E [2011-05-23 18:23:43] ERROR: Log file '/dev/null' exists but is not writable by deploy
E [2011-05-23 18:23:43] ERROR: Task 'api-unicorn-master-1' is not valid (see above)

However, an ls -la on the shared/pids directory shows that the deploy user does have write permissions:

# ls -la /var/www/api/shared/pids
drwxrwxr-x 2 deploy deploy 4096 May 23 18:21 .                                                                                                                                                                
drwxrwxr-x 8 deploy deploy 4096 May 20 21:55 ..

The second line in the god log output -- related to /dev/null not being accessible -- may be another bug, but not related to permissions. I have log file locations specified both for the process god is managing, yet god is still trying to use /dev/null for logging, apparently.

God segfaults on new RE with thread patches

Like this

----------------- Register state dump ----------------------
rax = 0x00000000011c959c rbx = 0x0000000001b81bd0 rcx = 0x0000000000000000 rdx = 0x0000000000e56c00
rdi = 0x0000000001b81bd0 rsi = 0x0000000000dc0600 rbp = 0x0000000000000001 rsp = 0x00000000299158b0
r8 = 0x0000000000000001 r9 = 0x0000000000000000 r10 = 0x0000000000000001 r11 = 0x0000000000000000
r12 = 0x0000000000000030 r13 = 0x0000000000000001 r14 = 0x00000000011b9508 r15 = 0x0000000000000000
rip = 0x00000000004138f7 rflags = 0x0000000000010202 cs = 0x0000000000000033 fs = 0x0000000001b81bd0
gs = 0x0000000024fc3f60
/opt/ruby-enterprise-1.8.7-20090928/lib/ruby/1.8/timeout.rb:92: [BUG] Segmentation fault
ruby 1.8.7 (2009-06-12 patchlevel 174) [x86_64-linux], MBARI 0x6770, Ruby Enterprise Edition 20090928

I tried SystemTimer instead of timeout, and it seems to be working fine.

undefined method 'keepalive'

Hello,

When following the tutorial on http://godrb.com/ I get

$ rvmsudo god -c simple.god -D
I [2012-01-13 21:54:09]  INFO: Loading simple.god
I [2012-01-13 21:54:09]  INFO: Syslog enabled.
I [2012-01-13 21:54:09]  INFO: Using pid file directory: /var/run/god
There was an error in simple.god
    undefined method `keepalive' for #<God::Watch:0x00000001377ef8>
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/task.rb:219:in `method_missing'
    /home/ec2-user/god/simple.god:4:in `block in <top (required)>'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god.rb:269:in `task'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god.rb:253:in `watch'
    /home/ec2-user/god/simple.god:1:in `<top (required)>'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:156:in `load'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:156:in `load_god_file'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:148:in `block in load_config'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:147:in `each'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:147:in `load_config'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:72:in `default_run'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:84:in `run_in_front'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:23:in `dispatch'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/lib/god/cli/run.rb:8:in `initialize'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/bin/god:122:in `new'
    /usr/local/rvm/gems/ruby-1.9.2-p290/gems/god-0.11.0/bin/god:122:in `<top (required)>'
    /usr/local/rvm/gems/ruby-1.9.2-p290/bin/god:19:in `load'
    /usr/local/rvm/gems/ruby-1.9.2-p290/bin/god:19:in `<main>'
E [2012-01-13 21:54:09] ERROR: File 'simple.god' could not be loaded

Is the tutorial outdated?
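
If the installed god (0.11.0 in the trace above) does not provide keepalive, the same intent can be written with an explicit start condition. This is a sketch, assuming the start_if and :process_running forms from older god documentation. The watch name and script path are placeholders.

```ruby
God.watch do |w|
  w.name  = "simple"                         # placeholder name
  w.start = "ruby /full/path/to/simple.rb"   # placeholder command

  # Start the process whenever it is found not to be running.
  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.interval = 5.seconds
      c.running  = false
    end
  end
end
```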

God does nothing...

Hi,

I've been trying to set up God to monitor a process we use in a Django site that runs off-request tasks. My configuration is like this:

God.watch do |w|
  w.name = "ctrl-c-tasks"
  w.interval = 10.seconds
  w.start = "/usr/bin/python /var/www/pastebin/pastebin/manage.py tasks start"
  w.stop =  "/usr/bin/python /var/www/pastebin/pastebin/manage.py tasks stop"
  w.restart = "/usr/bin/python /var/www/pastebin/pastebin/manage.py tasks restart"
  w.start_grace = 10.seconds
  w.restart_grace = 10.seconds
  w.pid_file = "/var/run/ctrlc/tasks_scheduler.pid"
  w.behavior :clean_pid_file

  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minute
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 2.hours
    end
  end
end

When I run god, it reports:

manu@hosting:~$ sudo /var/lib/gems/1.8/bin/god -c watch-ctrl.god -D
I [2011-05-26 18:12:07]  INFO: Loading watch-ctrl.god
I [2011-05-26 18:12:07]  INFO: Syslog enabled.
I [2011-05-26 18:12:07]  INFO: Using pid file directory: /var/run/god
I [2011-05-26 18:12:07]  INFO: Started on drbunix:///tmp/god.17165.sock
I [2011-05-26 18:12:07]  INFO: ctrl-c-tasks move 'unmonitored' to 'up'
I [2011-05-26 18:12:07]  INFO: ctrl-c-tasks moved 'unmonitored' to 'up'

But when I kill the watched process, god does not notice it and does nothing at all.

My environment is Ubuntu Server 10.04, ruby 1.8.7, god 0.11.0.

Any ideas?
Best regards,
Manuel.
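
Since the watch relies on w.pid_file, one thing worth checking by hand is whether that file names a live process. A minimal sketch using only the Ruby standard library follows. pid_file_alive? is a hypothetical helper and not god's own check.

```ruby
# Hypothetical helper: true if +pid_file+ contains the PID of a process
# that currently exists, false otherwise.
def pid_file_alive?(pid_file)
  pid = Integer(File.read(pid_file).strip, 10)
  return false unless pid > 0
  Process.kill(0, pid) # signal 0 only checks that the process exists
  true
rescue Errno::EPERM
  true                 # the process exists but belongs to another user
rescue Errno::ENOENT, Errno::ESRCH, ArgumentError
  false                # no file, no such process, or not a number
end

puts pid_file_alive?("/var/run/ctrlc/tasks_scheduler.pid")
```

Running this before and after killing the watched process shows whether the PID file still points at something alive.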

Unhandled exception (StandardError): No buffer space available

I am seeing the following errors in /var/log/syslog continuously on some of my servers running in the EC2 cloud on Ubuntu 10.04.

It seems to be raised somewhere in the following C code:

https://github.com/mojombo/god/blob/master/ext/god/netlink_handler.c#L26

Sep 19 20:44:40 localhost god[4886]: Unhandled exception (StandardError): No buffer space available#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:62:in `handle_events'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:62:in `start'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:60:in `loop'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:60:in `start'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:59:in `initialize'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:59:in `new'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/event_handler.rb:59:in `start'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:70:in `default_run'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:100:in `run_daemonized'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:91:in `fork'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:91:in `run_daemonized'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:21:in `dispatch'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/../lib/god/cli/run.rb:8:in `initialize'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/god:122:in `new'#012/usr/lib/ruby/gems/1.8/gems/god-0.11.0/bin/god:122#012/usr/bin/god:19:in `load'#012/usr/bin/god:19
$ god check
using event system: netlink
starting event handler
forking off new process
forked process with pid = 31687
killing process
[ok] process exit event received
$ uname -a
Linux HOSTNAME 2.6.35-22-virtual #35-Ubuntu SMP Sat Oct 16 23:57:40 UTC 2010 i686 GNU/Linux
$ ruby --version
ruby 1.8.7 (2010-06-23 patchlevel 299) [i686-linux]
$ god --version
Version 0.11.0

Activate Github wiki

I've been having a hard time finding up-to-date documentation and examples of god configurations for the services I'm trying to monitor. After going through tons of Google results and cobbling together pieces of various posts, I've finally pretty much completed my setup. It would be nice to activate the Github wiki so people could contribute different configurations and help articles in a central location, to help ease rookies into using god.

Monitoring multiple servers

Is it possible with God to monitor multiple servers from one main alert server via SSH keys?

For example, my config file contains:
%w{80}.each do |port|
God.watch do |w|

Can it contain something like:

%w{192.168.0.1:80}.each do |port|
God.watch do |w|

I can then create an XML document with the list of servers to monitor which have SSH keys installed.
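
Whether god can act on remote hosts is left open here. The sketch below only shows the iteration side of the idea: splitting host:port entries into their parts so that each one could feed a watch name or a command. Target and parse_target are hypothetical names, not part of god.

```ruby
# Hypothetical value object for one monitored endpoint.
Target = Struct.new(:host, :port)

# Split an entry such as "192.168.0.1:80" into host and numeric port.
def parse_target(entry)
  host, port = entry.split(":", 2)
  if host.to_s.empty? || port.to_s.empty?
    raise ArgumentError, "expected host:port, got #{entry.inspect}"
  end
  Target.new(host, Integer(port, 10))
end

%w[192.168.0.1:80].each do |entry|
  target = parse_target(entry)
  puts "#{target.host} port #{target.port}"
end
```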
