kostya / eye
Process monitoring tool. Inspired by Bluepill and God.
License: MIT License
It seems the default behavior is not to reload the .eye file with the restart command. Is the only way to do it to stop monitoring and then reload the config file? It would be nice to reload the config file automatically when you need to add new environment variables and such.
I would like to be able to set uid and gid for my app that uses Ruby 1.9.3. I'm moving from Bluepill to Eye, and it would be really helpful if Eye had this functionality. Is support for 1.9.3 impossible with Eye's current structure, or could this be added? When I load eye, I get this error message: :uid not supported (use ruby >= 2.0).
When I add something like:
depend_on 'proxy'
It appears any other checks and transitions in the process are completely ignored. For example, all of the following get ignored:
trigger :transition, to: :starting, do: -> {
  month = 60 * 60 * 24 * 28
  process.wait_for_condition(month, 15) do
    info "Waiting for camera #{url} to be up"
    timeout = 5 * 1_000_000 # 5 seconds in microseconds
    system("ffmpeg -loglevel warning -y -stimeout #{timeout} " \
           "-i #{url} -c:v copy -an -vframes 1 -f rawvideo /dev/null")
  end
}

check :rtsp, every: 30.seconds, times: 2, addr: rtsp_url
Hello, I am trying to get eye to work. I am setting some variables at the top of the file:
RAILS_ENV = ENV['RAILS_ENV'] || 'development'
RAILS_ROOT = ENV['RAILS_ROOT'] || File.expand_path('.')
Eye.config do
  logger File.join(RAILS_ROOT, "log", "eye.log")
end
When I start eye, it complains that "/log/eye.log" does not exist. It seems that it is not picking up the RAILS_ROOT variable.
This only happens in daemon mode; if I run it in the foreground, it works fine. What is wrong? Is there a better way to do this?
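One likely cause (an assumption, not confirmed from the report): daemonizing typically changes the working directory, so the File.expand_path('.') fallback resolves against the daemon's cwd (often "/") rather than where you launched eye, and RAILS_ROOT ends up as "/". A minimal sketch of the effect, with a hypothetical absolute-path fallback:

```ruby
# Demonstrates why a relative fallback breaks under daemonization:
# File.expand_path('.') resolves against the *current* working directory,
# which a daemonized process may have changed to "/".
path = Dir.chdir('/') { File.expand_path('.') }
puts path  # => "/"

# A safer default is an absolute path (this path is hypothetical):
RAILS_ROOT = ENV['RAILS_ROOT'] || '/srv/app/current'
```

With an absolute fallback, the log path no longer depends on where the eye daemon happens to be running.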
I'm trying to follow the Bluepill and God format of splitting the .eye files into separate ones for each service.
Eye.application "production" do
  ENV['RAILS_ENV'] = "production"
  ENV['RAKE_ROOT'] = "/usr/local/rvm/gems/ruby-2.0.0-p247"
  ENV['RAILS_ROOT'] = "/srv/ectasio/current"
  #load ENV['RAILS_ROOT'] + "/config/eye/nginx.eye"
  #load ENV['RAILS_ROOT'] + "/config/eye/mysql.eye"
  #load ENV['RAILS_ROOT'] + "/config/eye/cumulus.eye"
  #load ENV['RAILS_ROOT'] + "/config/eye/solr_jetty.eye"
  load ENV['RAILS_ROOT'] + "/config/eye/sidekiq.eye"
  load ENV['RAILS_ROOT'] + "/config/eye/unicorn.eye"
  load ENV['RAILS_ROOT'] + "/config/eye/private_pub.eye"
end
It would be nice to set global env variables and then load each .eye file separately. What would be the best way to do this?
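One possible shape for this (a sketch, not a confirmed recipe: whether env set on the application is inherited by processes defined in separately loaded files depends on how those files re-open the application, so verify against your setup; all paths here are hypothetical):

```ruby
# Sketch: set shared environment through the DSL's `env` (which nested
# processes inherit) rather than mutating ENV, then load the per-service
# files. Eye.load is the same call shown elsewhere in this thread.
Eye.application "production" do
  env 'RAILS_ENV'  => 'production',
      'RAILS_ROOT' => '/srv/app/current'   # hypothetical path
end

Dir['/srv/app/current/config/eye/*.eye'].sort.each do |f|
  Eye.load(f)
end
```

The Dir glob avoids listing every service file by hand; sorting keeps the load order deterministic.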
Is there a way to notify, without a restart or stop, when observing a process? Also, is there a way to observe the machine's overall memory or CPU and send a notification? I've been scouring the code and docs since yesterday, as well as making some educated guesses, but haven't found anything so far.
Dear Kostya,
Last night I spent a lot of time reading the docs, sources, and old issues, and refactored my code based on your previous comments.
However, even after extensive research, I'm still having several issues getting services started with eye. I hope you can help me fix them with suggestions on how to improve.
I've got eye set up for managing my Rails and Sidekiq processes; however, I wanted to manage eye via Upstart, to start eye on boot and monitor eye in case it dies unexpectedly.
I've tried all three variations on the expect stanza:
- no expect stanza (0 forks)
- expect fork (1 fork)
- expect daemon (2 forks)
Each time, Upstart ends up tracking a different PID than the one eye is running on.
Following Upstart's documentation for establishing the fork count (http://upstart.ubuntu.com/cookbook/#how-to-establish-fork-count), the strace log returns a value of 17, which seems excessively high.
Here's the Upstart config I'm using for reference:
description 'Eye'
start on runlevel [2]
stop on runlevel [016]
expect daemon
exec su -l -c 'eye load /home/deploy/sample.eye' deploy
respawn
Process forking is a bit new to me in Ruby, so I'm hoping someone else can shed some light on this for me.
Once it's solved, I'll gladly work up a wiki page to help document this for future users.
Thanks in advance!
Amazing work on this code!
eye (>= 0) depends on
  activesupport (~> 3.2)
Could you make it support Rails 4?
When I run:
$ bundle exec eye q -s
It hangs forever (well, until the timeout) and when I look at eye.log, I see:
[2014-05-13 19:36:05 +0100] [recorder:52ab35a548616311d3360000:proxy] trigger(transition) Waiting for camera rtsp://username:[email protected]/media/video1 to be up
[2014-05-13 19:36:20 +0100] [recorder:52ab35a548616311d3360000:proxy] trigger(transition) Waiting for camera rtsp://username:[email protected]/media/video1 to be up
[2014-05-13 19:36:40 +0100] [recorder:52ab35a548616311d3360000:proxy] trigger(transition) Waiting for camera rtsp://username:[email protected]/media/video1 to be up
The process in question is:
process :proxy do
  trigger :transition, to: :starting, do: -> {
    month = 60 * 60 * 24 * 28
    process.wait_for_condition(month, 15) do
      info "Waiting for camera #{url} to be up"
      timeout = 5 * 1_000_000 # 5 seconds in microseconds
      system("ffmpeg -loglevel warning -y -stimeout #{timeout} " \
             "-i #{url} -c:v copy -an -vframes 1 -f rawvideo /dev/null")
    end
  }
  daemonize true
  start_command "live555ProxyServer -p #{port} -V #{url}"
  check :rtsp, every: 30.seconds, times: 2, addr: "rtsp://127.0.0.1:#{port}/proxyStream"
end
Hi,
What's the best way to add eye's log files to e.g. logrotate? Does it support a special signal/command to release its file handle and log to a new file?
I have a process group:
group camera_id do
  process :proxy do
    ...
  end

  process :recorder do
    trigger :transition, to: :starting, do: -> {
      p = Eye::Control.process_by_name("proxy")
      process.wait_for_condition(60, 5) do
        p ? p.state_name == :up : false
      end
    }
    ...
  end
end
However, it seems the process_by_name method just returns a random process called "proxy" instead of the process from this group. Is there any way to scope the results to the group camera_id?
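One thing worth trying, with a loud caveat: the lookup below assumes your eye version's Eye::Control exposes a full-name lookup alongside process_by_name, which I have not verified, so check the Eye::Control source first. If it exists, the full "application:group:process" name disambiguates between groups:

```ruby
# Hypothetical sketch: scope the lookup with the full name instead of the
# bare process name. The full-name lookup method and the name components
# are assumptions; verify both against your eye version.
trigger :transition, to: :starting, do: -> {
  full = "recorder:#{camera_id}:proxy"   # app name is hypothetical
  p = Eye::Control.process_by_full_name(full)
  process.wait_for_condition(60, 5) do
    p ? p.state_name == :up : false
  end
}
```

If no such lookup exists, filtering the result of an all-processes listing by group name would be the fallback.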
Every now and then we get this:
/unicorn-4.6.2/lib/unicorn/http_server.rb:263:in `syswrite': Broken pipe (Errno::EPIPE)
and unicorn fails to start. Running eye stop unicorn and then eye start unicorn fixes it.
Here's the config:
RUBY = '/usr/bin/ruby'
rails_env = 'production'
Eye.application('rails_unicorn') do
  process('unicorn') do
    working_dir '/var/www/projects/current'
    if File.exist? File.join(working_dir, 'Gemfile')
      clear_bundler_env
      env 'BUNDLE_GEMFILE' => File.join(working_dir, 'Gemfile')
    end
    env "RAILS_ENV" => rails_env
    # unicorn requires `ruby` to be in PATH (for soft restart)
    env "PATH" => "#{File.dirname(RUBY)}:#{ENV['PATH']}"
    pid_file 'tmp/pids/unicorn.pid'
    stdall 'log/eye-unicorn.log'
    start_command "/usr/local/bin/bundle exec unicorn_rails -c /var/www/projects/current/config/unicorn/#{rails_env}.rb -E #{rails_env} -D"
    stop_command 'kill -QUIT {PID}'
    restart_command 'kill -USR2 {PID}'
    # stop signals: http://unicorn.bogomips.org/SIGNALS.html
    stop_signals [:TERM, 10.seconds]
    start_timeout 30.seconds
    restart_grace 30.seconds
    monitor_children do
      stop_command "kill -QUIT {PID}"
      check :cpu, :every => 30, :below => 75, :times => 3
      check :memory, :every => 30, :below => 500.megabytes, :times => [3,5]
    end
  end
end
The user is deploy, and permissions on the unicorn socket are fine. This process worked with Bluepill without this error message. Any ideas?
I would like to be able to run eye as my deploy user (rather than root) and be able to start nginx (or any other process that needs root privileges) and monitor that process.
Right now, the Process.kill(0, pid) way of determining liveness only works for processes owned by my deploy user, not root. At the moment, I'm starting nginx via a sudo command that my deploy user can run.
I tried to search for solutions to do this in the issues and/or the wiki, but didn't find anything specifically related to this. Is there a best practice that I'm missing? Is the best practice to run as root?
Hi
We found that the Upstart config suggested on the wiki is not good enough: when you issue the stop command, the services eye was monitoring stay up while eye itself gets killed by Upstart, which in turn causes problems when trying to start services (unicorn etc.) that are already up.
To fix this and allow eye to stop all services, we made some (tested) modifications to the Upstart config file:
description "Eye Monitoring System"
start on runlevel [2345]
stop on runlevel [016]
expect fork
kill timeout 60 # when Upstart issues a stop, send SIGTERM and wait 60 sec before sending SIGKILL
setuid deploy
setgid deploy
respawn
# ensure eye's home folder is set (it stores the pid, the state history, and the eye socket file in .eye)
env EYE_HOME=/home/deploy
# log stdout and stderr to /var/log/upstart/eye
console log
# important for unicorn to create a socket folder & set permissions before run
pre-start script
mkdir -p "/var/run/unicorn"
chown -R deploy:deploy "/var/run/unicorn"
end script
# load all eye services - upstart will monitor eye, and eye will monitor its own processes
script
exec /usr/local/bin/eye load /etc/eye/*.eye
end script
# this section ensures services won't stay up while Upstart kills the actual eye process on a stop
pre-stop script
/usr/local/bin/eye stop all # stop all eye services
/bin/sleep 15s # wait 15 sec before Upstart sends SIGTERM to eye
end script
Hi!
We're monitoring/controlling haproxy instances with eye. The restart_command does a so-called 'soft restart' ('/usr/sbin/haproxy -D -f /etc/haproxy/haproxy_test.conf -sf {PID}'), which works so far, and haproxy itself updates the pid file with the new pid, as expected. But eye then ignores the new pid entry and rewrites the pid file with the old instance's pid. That's really bad, because once the old haproxy instance stops (when all connections on it are closed), eye will restart haproxy again and again.
08.04.2014 17:10:51 WARN -- [haproxy_test:haproxy_test] check_alive: pid_file(/var/run/haproxy/haproxy_test.pid) changes by itself (pid:6751) => (pid:7118), not under eye control, so ignored
08.04.2014 17:10:56 WARN -- [haproxy_test:haproxy_test] check_alive: pid_file(/var/run/haproxy/haproxy_test.pid) changes by itself (pid:6751) => (pid:7118), > 120 ago, so rewrited (even if pid_file not under eye control)
Is it possible to add an option so that eye doesn't ignore a pid change?
The corresponding eye config:
# vi: set ft=ruby :
Eye.load("/etc/eye/root/config.rb")

Eye.application "haproxy_test" do
  working_dir "/"
  process "haproxy_test" do
    start_command '/usr/sbin/haproxy -D -f /etc/haproxy/haproxy_test.conf'
    stop_command '/bin/kill -SIGUSR1 {PID}'
    restart_command '/usr/sbin/haproxy -D -f /etc/haproxy/haproxy_test.conf -sf {PID}'
    pid_file '/var/run/haproxy/haproxy_test.pid'
    stdall '/var/log/haproxy.log'
  end
end
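One knob that may help here, with a caveat: the option name below is an assumption based on eye's process defaults, so verify it against your eye version's process options before relying on it. The idea is to shorten the grace period after which a pid_file changed by the process itself is adopted, so eye accepts haproxy's new master pid instead of reverting it:

```ruby
# Hypothetical sketch: adopt a pid_file rewritten by the process itself
# after a short grace period. The option name auto_update_pidfile_grace
# is unverified -- see the caveat above.
process "haproxy_test" do
  start_command '/usr/sbin/haproxy -D -f /etc/haproxy/haproxy_test.conf'
  restart_command '/usr/sbin/haproxy -D -f /etc/haproxy/haproxy_test.conf -sf {PID}'
  pid_file '/var/run/haproxy/haproxy_test.pid'
  auto_update_pidfile_grace 5.seconds
end
```

The "> 120 ago, so rewrited" warning in the log above suggests eye already has such a threshold internally; the question is whether it is exposed per process.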
I have a problem with eye restart ending up with two unicorn master processes. My config is very similar to the unicorn.eye example in this repo.
Using eye start and eye stop seems to work well:
deploy@apps:~$ eye start dagensdoman
command :start sent to [dagensdoman]
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... starting
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... up (19:12, 0%, 75Mb, <8345>)
child-8379 .................... up (19:12, 0%, 71Mb, <8379>)
child-8382 .................... up (19:12, 0%, 71Mb, <8382>)
child-8385 .................... up (19:12, 0%, 71Mb, <8385>)
child-8388 .................... up (19:12, 0%, 71Mb, <8388>)
deploy@apps:~$ eye stop dagensdoman
command :stop sent to [dagensdoman]
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... unmonitored (stop by user at 04 Mar 19:13)
But using eye start and then eye restart seems to end up with two unicorn master processes:
deploy@apps:~$ eye start dagensdoman
command :start sent to [dagensdoman]
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... up (19:15, 0%, 75Mb, <8544>)
child-8551 .................... up (19:15, 0%, 71Mb, <8551>)
child-8554 .................... up (19:15, 0%, 71Mb, <8554>)
child-8556 .................... up (19:15, 0%, 71Mb, <8556>)
child-8560 .................... up (19:15, 0%, 71Mb, <8560>)
deploy@apps:~$ eye restart dagensdoman
command :restart sent to [dagensdoman]
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... restarting
child-8551 .................... up (19:15, 0%, 71Mb, <8551>)
child-8554 .................... up (19:15, 0%, 71Mb, <8554>)
child-8556 .................... up (19:15, 0%, 71Mb, <8556>)
child-8560 .................... up (19:15, 0%, 71Mb, <8560>)
deploy@apps:~$ eye info
dagensdoman
unicorn ......................... up (19:15, 0%, 74Mb, <8544>)
child-8551 .................... up (19:15, 0%, 71Mb, <8551>)
child-8554 .................... up (19:15, 0%, 71Mb, <8554>)
child-8556 .................... up (19:15, 0%, 71Mb, <8556>)
child-8560 .................... up (19:15, 0%, 71Mb, <8560>)
child-8626 .................... up (19:16, 0%, 76Mb, <8626>)
deploy@apps:~$ ps x | grep unicorn
8544 ? Sl 0:04 unicorn master (old) -Dc ./config/unicorn.rb -E production
8551 ? Sl 0:00 unicorn worker[0] -Dc ./config/unicorn.rb -E production
8554 ? Sl 0:00 unicorn worker[1] -Dc ./config/unicorn.rb -E production
8556 ? Sl 0:00 unicorn worker[2] -Dc ./config/unicorn.rb -E production
8560 ? Sl 0:00 unicorn worker[3] -Dc ./config/unicorn.rb -E production
8626 ? Sl 0:04 unicorn master -Dc ./config/unicorn.rb -E production
8662 ? Sl 0:00 unicorn worker[0] -Dc ./config/unicorn.rb -E production
8665 ? Sl 0:00 unicorn worker[1] -Dc ./config/unicorn.rb -E production
8667 ? Sl 0:00 unicorn worker[2] -Dc ./config/unicorn.rb -E production
8670 ? Sl 0:00 unicorn worker[3] -Dc ./config/unicorn.rb -E production
8744 pts/0 S+ 0:00 grep --color=auto unicorn
For some reason the old unicorn master is still alive, and the new unicorn master process (8626) is picked up as a child.
Any ideas?
Thanks,
Martin
I have this config
Eye.application "rails" do
  working_dir '/var/www'

  # global check for all processes
  check :cpu, :below => 90, :times => 3, :every => 30.seconds
  check :memory, :every => 30.seconds, :below => 420.megabytes, :times => 3

  notify :developers, :debug
  trigger :flapping, :times => 3, :within => 2.minute, :retry_in => 15.seconds

  #any_status = [:starting, :restarting, :up, :down, :unmonitored, :stopping]
  #trigger :transition, to: any_status, from: any_status, do: ->{ process.notify :debug, "is #{s.to_s.upcase}" }

  # Notify any transition
  trigger :transition, do: ->{ process.notify :debug, "is #{@transition.event.upcase}" }

  #start_timeout 30.seconds
  #start_grace 30.seconds
  #restart_grace 20.seconds
  #stop_grace 20.seconds

  stop_on_delete true

  process :puma do
    pid_file "/eye/pid/puma.pid"
    stdall '/eye/log/puma.log'
    start_command "bundle exec puma"
    restart_command "kill -USR2 {{PID}}"
    stop_signals [:TERM, 5.seconds, :KILL]
    daemonize true
  end

  process :swf do
    pid_file "/eye/pid/swf.pid"
    stdall '/eye/log/swf.log'
    start_command "bundle exec rake swf:start_workers"
    daemonize true
  end
end
Puma starts, and I can curl it for a few seconds and it properly serves things, but shortly after, it's down, probably because eye killed it. I don't know what could be wrong in the config. Any ideas? I've tried my best to scan the source for an answer, without much luck. Thanks.
Hi,
eye info bla returns an empty line with exit code 0. I think it would be better to output an error message with an exit code > 0. This would be much more useful for scripts or systems like Chef or Puppet to detect whether a process is loaded (at the moment you have to grep for empty lines).
Is there any way to tell one process in a group to start only when another process in the group has already started?
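The pattern shown elsewhere in this thread can be adapted for this: a :transition trigger that holds one process in :starting until the other reports :up. The process names below are placeholders:

```ruby
# Sketch: :b waits (up to 60s, polling every 5s) for :a to be :up
# before its own start proceeds.
process :b do
  trigger :transition, to: :starting, do: -> {
    a = Eye::Control.process_by_name('a')
    process.wait_for_condition(60, 5) do
      a ? a.state_name == :up : false
    end
  }
end
```

Note the caveat raised in another issue here: process_by_name matches on the bare name across groups, so keep the dependent processes uniquely named.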
I'm using eye for Adhearsion with a JRuby wrapper over ahn. Normally the app starts in 5-6 seconds, but with eye I get:
05.03.2014 11:55:13 INFO -- [Eye] client command: ping (0.007122302s)
05.03.2014 11:55:13 INFO -- [Eye] loading: ["/home/letmecallu_voip/eye/vozio.eye"]
05.03.2014 11:55:13 INFO -- [Eye] loading: /home/letmecallu_voip/eye/vozio.eye
05.03.2014 11:55:13 INFO -- [vozio:default] send_command: monitor
05.03.2014 11:55:13 INFO -- [vozio:default] schedule :monitor (reason: monitor by user)
05.03.2014 11:55:13 INFO -- [vozio:default] => monitor (reason: monitor by user)
05.03.2014 11:55:13 INFO -- [vozio:default] starting async with 0.2s chain monitor []
05.03.2014 11:55:13 INFO -- [vozio:ahn] schedule :monitor (reason: monitor by user)
05.03.2014 11:55:13 INFO -- [Eye] loaded: ["/home/letmecallu_voip/eye/vozio.eye"], selfpid <11987>
05.03.2014 11:55:13 INFO -- [vozio:ahn] => monitor (reason: monitor by user)
05.03.2014 11:55:13 INFO -- [vozio:ahn] pid_file not found, starting...
05.03.2014 11:55:13 INFO -- [vozio:default] <= monitor
05.03.2014 11:55:13 INFO -- [vozio:ahn] switch :starting [:unmonitored => :starting](reason: monitor by user)
05.03.2014 11:55:13 INFO -- [vozio:ahn] executing: jruby_ahn -
with start_timeout: 180.0s, start_grace: 2.5s, env: nil, working_dir: /home/letmecallu_voip/voip-server-side/vozio-server-rayo
05.03.2014 11:55:13 INFO -- [Eye] client command: load /home/letmecallu_voip/eye/vozio.eye (0.011918452s)
05.03.2014 11:56:33 WARN -- [Eye::System] [ahn] sending :KILL signal to <960> due to timeout (180s)
05.03.2014 11:56:33 ERROR -- [vozio:ahn] execution failed with #<Timeout::Error: execution expired>; try increasing the start_timeout value (the current value of 180s seems too short)
05.03.2014 11:56:33 ERROR -- [vozio:ahn] process <> failed to start ("#<Timeout::Error: execution expired>")
05.03.2014 11:56:33 INFO -- [vozio:ahn] switch :crashed [:starting => :down](reason: monitor by user)
05.03.2014 11:56:33 INFO -- [vozio:ahn] schedule :check_crash (reason: crashed)
05.03.2014 11:56:33 INFO -- [vozio:ahn] <= monitor
05.03.2014 11:56:33 INFO -- [vozio:ahn] => delete (reason: delete by user)
05.03.2014 11:56:33 INFO -- [vozio:ahn] <= delete
05.03.2014 11:56:33 WARN -- [celluloid] Terminating task: type=:call, meta={:method_name=>:process}, status=:callwait
It would be great to be able to 'reload' a process. Should we add a dedicated reload command, or a way to add user-defined commands via the DSL?
Hi,
reel-eye 0.5.2 is awesome, and was working fine until we recently updated its dependencies.
0.5.2 started crashing on all our machines like so:
uninitialized constant HTTP::Header
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/gems/2.0.0/gems/reel-0.4.0/lib/reel/response.rb:3:in `<class:Response>'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/gems/2.0.0/gems/reel-0.4.0/lib/reel/response.rb:2:in `<module:Reel>'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/gems/2.0.0/gems/reel-0.4.0/lib/reel/response.rb:1:in `<top (required)>'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/gems/2.0.0/gems/reel-0.4.0/lib/reel.rb:18:in `<top (required)>'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/gems/2.0.0/gems/reel-rack-0.1.0/lib/reel/rack/server.rb:3:in `<top (required)>'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:58:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:45:in `require'
/usr/local/Cellar/ruby/2.0.0-p247/lib/ruby/2.0.0/rubygems/core_ext/kernel_require.rb:45:in `require'
Any idea what's going on? The error only occurs when we try to configure with http enable: true. We tried playing with the versions of celluloid, reel, and reel-rack, but no luck.
reel-rack 0.2.0 came out recently; is this related?
Thank you so much!
I'm trying to run eye under supervisord as a user called "deploy".
I'm getting this error only when not running as root:
/home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/local.rb:10:in `join': no implicit conversion of nil into String (TypeError)
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/local.rb:10:in `dir'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/local.rb:31:in `path'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/local.rb:39:in `socket_path'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/cli/commands.rb:5:in `client'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/cli/commands.rb:9:in `_cmd'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/cli/server.rb:5:in `server_started?'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/lib/eye/cli.rb:58:in `load'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/thor-0.18.1/lib/thor/command.rb:27:in `run'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/thor-0.18.1/lib/thor/invocation.rb:120:in `invoke_command'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/thor-0.18.1/lib/thor.rb:363:in `dispatch'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/thor-0.18.1/lib/thor/base.rb:439:in `start'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/gems/eye-0.5/bin/eye:5:in `<top (required)>'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/bin/eye:23:in `load'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/bin/eye:23:in `<main>'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/bin/ruby_executable_hooks:15:in `eval'
from /home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/bin/ruby_executable_hooks:15:in `<main>'
My supervisor conf:
[program:eye]
command=/home/deploy/.rvm/bin/chefbrowser_eye load /home/deploy/apps/chef-browser/eye/puma.eye
autostart=true
autorestart=true
startsecs=5
startretries=0
stopsignal=TERM
stopwaitsecs=5
user=root
redirect_stderr=false
stdout_logfile=/var/log/apps/eye_stdout.log
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=0
stdout_events_enabled=false
stderr_logfile=/var/log/apps/eye_stderr.log
stderr_logfile_maxbytes=50MB
stderr_logfile_backups=0
directory=/home/deploy/apps/chef-browser
environment=PATH="/home/deploy/.rvm/gems/ruby-2.0.0-p353@chefbrowser/bin:/home/deploy/.rvm/gems/ruby-2.0.0-p353@global/bin:/home/deploy/.rvm/rubies/ruby-2.0.0-p353/bin:/home/deploy/.rvm/bin:/usr/local/bin:/usr/bin:/bin"
Any ideas?
Thanks in advance!
Hi there,
I'm attempting to transfer my Bluepill config over to eye, and one thing we do is use an environment variable for the Rails environment to configure a few things. However, when I try to do this with eye, I can't seem to get the environment variable to pass into the script, no matter what I try.
The way I'd use bluepill before was:
RAILS_ENV=development bluepill load
I've tried this plus a ton of other tricks, but I haven't figured out how to do something similar with eye. Do you have any advice on how to pass in a variable from the command line, so the .eye config file can do different things based on that variable?
Thanks!
I have a running ffmpeg recording segments and logging the last 10 video files in recording.csv (something like this, but just copying from the camera): http://stackoverflow.com/questions/8767727/transcode-and-segment-with-ffmpeg
I would like to make sure the recording.csv file was updated in the last 2 minutes and, if not, to kill a parent live555ProxyServer process (the first process in the process group).
Is this possible?
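The detection half looks doable with a custom checker in the style of the CustomDefer example later in this thread. This is a sketch under assumptions: the class shape mirrors that example, the param signature with a default is assumed to work the same way, and wiring a failure to restart a *different* process in the group would still need its own trigger or a group-level restart action:

```ruby
# Sketch: a checker that fails when the file hasn't been modified within
# max_age seconds. Path and defaults below are hypothetical.
class FreshFile < Eye::Checker::CustomDefer
  param :path, String, true
  param :max_age, Integer, nil, 120

  def get_value
    File.exist?(path) && (Time.now - File.mtime(path)) < max_age
  end

  def good?(value)
    value
  end
end

# usage inside a process block (check name derives from the class name):
#   check :fresh_file, every: 30.seconds, times: 2,
#         path: '/path/to/recording.csv', max_age: 120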
Hi,
Thanks a lot for the gem.
I am trying to figure out how to create the pid file when eye restarts my process, as the process does not create one by default. Is there a check available to see if the process is alive?
Thank you
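If the process can run in the foreground, one approach is to let eye do the daemonizing, in which case eye forks the process and writes the pid file itself. Names and paths below are hypothetical:

```ruby
# Sketch: with daemonize true, eye owns the pid_file, so it is recreated
# on every (re)start; no cooperation from the process is needed.
process :worker do
  start_command 'bundle exec my_worker'  # must stay in the foreground
  pid_file '/tmp/worker.pid'
  daemonize true
end
```

Once the pid_file is in place, eye's normal liveness polling against that pid applies; a separate aliveness check shouldn't be needed.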
In the Linux Standard Base, init-script status exit codes differ depending on whether the service is running. It would be nice to be able to check the exit code in wrapper scripts. http://refspecs.linuxbase.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
Hi,
eye history for child processes seems odd to me; it produces empty lines:
$ eye i sauspiel
sauspiel
unicorn ......................... up (Jan16, 0%, 149Mb, <26460>)
child-11039 ................... up (Jan16, 0%, 394Mb, <11039>)
child-28576 ................... up (Jan17, 1%, 396Mb, <28576>)
child-9926 .................... up (Jan18, 0%, 373Mb, <9926>)
child-28010 ................... up (Jan18, 0%, 386Mb, <28010>)
child-10222 ................... up (Jan18, 0%, 376Mb, <10222>)
child-4095 .................... up (Jan18, 0%, 397Mb, <4095>)
child-4268 .................... up (Jan18, 2%, 339Mb, <4268>)
child-9074 .................... up (Jan19, 0%, 369Mb, <9074>)
child-6947 .................... up (Jan19, 0%, 386Mb, <6947>)
child-9776 .................... up (Jan19, 0%, 371Mb, <9776>)
child-9961 .................... up (Jan19, 2%, 353Mb, <9961>)
child-13943 ................... up (Jan19, 0%, 390Mb, <13943>)
child-15247 ................... up (Jan19, 0%, 341Mb, <15247>)
child-15335 ................... up (Jan19, 0%, 369Mb, <15335>)
child-18407 ................... up (Jan19, 0%, 341Mb, <18407>)
child-18695 ................... up (Jan19, 0%, 344Mb, <18695>)
child-16495 ................... up (Jan19, 0%, 344Mb, <16495>)
child-16903 ................... up (Jan19, 0%, 352Mb, <16903>)
child-28104 ................... up (Jan19, 0%, 337Mb, <28104>)
child-24127 ................... up (Jan19, 0%, 329Mb, <24127>)
$ eye history sauspiel:unicorn
sauspiel:unicorn:
16 Jan 13:48 - restart (restart by user)
08 Jan 10:28 - ... 10 times (...)
08 Jan 10:28 - restart (restart by user)
08 Jan 10:28 - monitor (monitor by user)
$ eye history sauspiel:unicorn:\*
$
If the server was rebooted uncleanly, the existing ~/.eye directory causes eye to think processes are running when they are not, to kill the wrong processes (based on PID?), or just not to start at all.
A quick fix is to rm -rf ~/.eye followed by eye load recorder.eye after a server reboot, but it would be nice if eye could detect that the machine was rebooted.
I'm not sure how eye's internals work, but maybe it could check the process name as well as the PID before assuming a process is up when it is not?
When calling eye incorrectly, e.g. eye unicorn restart, the command currently returns status 0. It should indicate that it has failed with status 1.
I'm getting a lot of errors like this:
01.04.2014 20:49:13 WARN -- [Eye::System] [puma] sending :KILL signal to <12563> due to timeout (120s)
01.04.2014 20:49:13 ERROR -- [production:services:puma] execution failed with #<Timeout::Error: execution expired>; try increasing the start_timeout value (the current value of 120s seems too short)
01.04.2014 20:49:13 ERROR -- [production:services:puma] process <12541> failed to start ("#<Timeout::Error: execution expired>")
01.04.2014 20:49:13 INFO -- [production:services:puma] switch :crashed [:starting => :down] (reason: crashed)
01.04.2014 20:49:13 INFO -- [production:services:puma] schedule :check_crash (reason: crashed)
01.04.2014 20:49:13 INFO -- [production:services:puma] <= restore
01.04.2014 20:49:13 INFO -- [production:services:puma] => check_crash (reason: crashed)
01.04.2014 20:49:13 WARN -- [production:services:puma] check crashed: process is down
01.04.2014 20:49:13 INFO -- [production:services:puma] schedule :restore (reason: crashed)
01.04.2014 20:49:13 INFO -- [production:services:puma] <= check_crash
01.04.2014 20:49:13 INFO -- [production:services:puma] => restore (reason: crashed)
01.04.2014 20:49:13 INFO -- [production:services:puma] pid_file found, but process <12563> is down, starting...
01.04.2014 20:49:13 INFO -- [production:services:puma] switch :starting [:down => :starting] (reason: crashed)
The process that's being monitored seems to be running fine until eye decides it's not and kills and restarts it.
I've tried extending the start_timeout as suggested, but that's clearly not the problem. How can I troubleshoot this?
Would it be possible to implement Prowl (http://www.prowlapp.com), or, if you know of a better service that allows iPhone/Android push notifications, that one? We currently use God with Prowl notifications to get iPhone push notifications when certain services crash.
Also, on a side note, would it be possible to use the config/environments/production.rb ActionMailer config block with eye? It seems the Eye.config mail block does not allow :username and :password attributes. We use mandrillapp.com to send mail and have it configured with username + password in the ActionMailer config block.
I have this in my Rails controller:
system('bundle exec eye quit')
And this kills my running Rails app, which is not managed by eye. Any ideas what could be causing that and how to prevent it?
I have this check:
require 'rtsp/client'

class Rtsp < Eye::Checker::CustomDefer
  param :addr, String, true

  def initialize(*args)
    super
    @addr = addr
    @rtsp_client = RTSP::Client.new(@addr)
  end

  def get_value
    begin
      if @rtsp_client.describe.code == 200
        check_with_ffmpeg
      else
        false
      end
    rescue
      false
    end
  end

  def good?(value)
    value
  end

  def human_value(value)
    value == true ? 'Ok' : 'Err'
  end

  private

  def check_with_ffmpeg
    system("ffmpeg -loglevel warning -y -stimeout 5000000 " \
           "-i #{@addr} -c:v copy -an -vframes 1 -f rawvideo /dev/null")
  end
end
It works most of the time; however, sometimes I get this in eye.log and eye stops monitoring all processes:
[2014-05-29 21:55:57 +0100] [celluloid] Terminating task: type=:timer, meta=nil, status=:receiving
[2014-05-29 21:55:57 +0100] [recorder:537b68987a616e64da431e00:proxy] check:rtsp task was terminated ["/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `each'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:404:in `cleanup'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:375:in `shutdown'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:185:in `run'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/actor.rb:157:in `block in initialize'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/thread_handle.rb:13:in `block in initialize'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `call'", "/home/deployer/xanagent/shared/bundle/ruby/2.1.0/gems/celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `block in create'"]
The formatted output looks like this:
[2014-05-29 21:55:57 +0100] [celluloid] Terminating task: type=:timer, meta=nil, status=:receiving
[2014-05-29 21:55:57 +0100] [recorder:537b68e17a616e64de651e00:proxy] check:rtsp task was terminated
celluloid-0.15.2/lib/celluloid/tasks/task_fiber.rb:32:in `terminate'
celluloid-0.15.2/lib/celluloid/actor.rb:404:in `block in cleanup'
celluloid-0.15.2/lib/celluloid/actor.rb:404:in `each'
celluloid-0.15.2/lib/celluloid/actor.rb:404:in `cleanup'
celluloid-0.15.2/lib/celluloid/actor.rb:375:in `shutdown'
celluloid-0.15.2/lib/celluloid/actor.rb:185:in `run'
celluloid-0.15.2/lib/celluloid/actor.rb:157:in `block in initialize'
celluloid-0.15.2/lib/celluloid/thread_handle.rb:13:in `block in initialize'
celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `call'
celluloid-0.15.2/lib/celluloid/internal_pool.rb:100:in `block in create'
Any idea why this happens and how to prevent it?
I want to check whether an RTSP server is alive after starting. I have this check:
check :socket, every: 30.seconds, times: 2, timeout: 3.seconds,
addr: "tcp://localhost:12345",
send_data: "DESCRIBE rtsp://localhost/proxyStream RTSP/1.0\nCSeq: 1\n\n",
expect_data: /RTSP\/1\.0 200 OK/
I can see in my RTSP server that it gets a request and sends the correct response, however I see this in my eye.log:
11.03.2014 10:34:10 INFO -- [record:Test:proxy] check:socket [*ReadTimeout<2.0>, *ReadTimeout<2.0>] => Fail
11.03.2014 10:34:10 ERROR -- [record:Test:proxy] NOTIFY: Bounded socket: [*ReadTimeout<2.0>, *ReadTimeout<2.0>] send to [:restart]
...
I think this is happening because there is no way to tell the RTSP server to close the connection after the DESCRIBE command. Is there any way to get eye to force a timeout but still read the response, without counting that as a failure?
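The desired behaviour, "read whatever the server sends within the timeout, then stop, without treating the still-open connection as a failure", can be expressed outside eye with a small standalone helper. This is my own sketch for probing the server, not an eye API; I don't know whether eye's :socket checker can be configured this way.

```ruby
require 'socket'

# Hypothetical standalone helper (not part of eye): collect whatever the
# peer sends within `timeout` seconds, then return it. Hitting the
# deadline with the connection still open is not an error; you simply
# get the bytes that arrived in time.
def read_until_timeout(io, timeout)
  data = +''
  deadline = Time.now + timeout
  loop do
    remaining = deadline - Time.now
    break if remaining <= 0
    break unless IO.select([io], nil, nil, remaining)
    begin
      data << io.read_nonblock(4096)
    rescue IO::WaitReadable
      next
    rescue EOFError
      break
    end
  end
  data
end

# e.g. against the RTSP proxy from the check above:
# sock = TCPSocket.new('localhost', 12345)
# sock.write "DESCRIBE rtsp://localhost/proxyStream RTSP/1.0\nCSeq: 1\n\n"
# read_until_timeout(sock, 3) =~ /RTSP\/1\.0 200 OK/
```

If the response matches even though the server never closes the connection, the problem is purely the checker treating a read timeout as a hard failure.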
I got a config error: undefined method `logger=' for Eye:Module. Maybe it happens because I'm using version 0.3:
$ gem list eye
*** LOCAL GEMS ***
eye (0.3.beta1, 0.2.3)
$ gem uninstall eye -v 0.3.beta1
INFO: gem "eye" is not installed
How can I uninstall the gem?
Hi guys, just want to check whether eye has a configuration option to rotate all log files from code, instead of setting up some third-party process for it.
Sorry for posting a question here, let me know if you have a group or mailing list.
I have been playing with eye for a while but couldn't make email notifications work. We already have an SMTP server running, so our configuration is very simple:
Eye.config do
mail domain: 'mailer.xxx.org', host: 'mailer.xxx.org', port: 25, from_mail: '[email protected]', from_name: 'Eye'
contact :dev, :mail, '[email protected]'
end
Eye.application 'app' do
notify :dev
process :process do
notify :dev
end
end
I cannot see anything in the log. Additionally, when is notify :dev triggered?
Many thanks.
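Before digging further into eye's notifier, it may help to confirm that the SMTP endpoint from the config above is even reachable from the box running eye. A tiny standalone probe (my own helper, not eye code):

```ruby
require 'net/smtp'

# Hypothetical connectivity probe, not part of eye: returns true if an
# SMTP session can be opened (and cleanly closed) against host:port.
def smtp_reachable?(host, port)
  Net::SMTP.start(host, port) { } # opens the session, QUITs on block exit
  true
rescue StandardError
  false
end

# smtp_reachable?('mailer.xxx.org', 25)
```

If this returns false from the eye host, the problem is network or SMTP configuration rather than eye itself.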
Couldn't find anything in the docs or a quick scour of the code. Ideally I would like something like:
bundle exec eye stop resque --wait
which wouldn't return until the signaled process(es) stopped. This is because ultimately I want to be able to do:
bundle exec eye stop all --wait
bundle exec eye quit
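As a stopgap, the waiting can be scripted outside eye by polling its status output. A sketch with my own shell helper; the grep pattern for eye's status line is a guess and should be adapted to what `eye i` actually prints on your version:

```shell
#!/bin/sh
# wait_for TIMEOUT CMD...: re-run CMD every second until it succeeds or
# TIMEOUT seconds have elapsed; the exit status reports which happened.
wait_for() {
  timeout=$1; shift
  elapsed=0
  until "$@"; do
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 0
}

# hypothetical usage, assuming `eye i resque` marks running processes with " up ":
# bundle exec eye stop resque
# wait_for 30 sh -c '! bundle exec eye i resque | grep -q " up "'
```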
Hi,
I'm looking for an alternative to bluepill, and eye looks quite nice. But what I'm missing so far are custom check conditions. Are there plans to implement them?
In particular, the cpu checks wouldn't be sufficient for me. eye (and bluepill) determines the cpu usage of a process with 'ps aux'. According to the man page of ps:
CPU usage is currently expressed as the percentage of time spent running during the entire lifetime of a process. This is not ideal, and it does not conform to the standards that ps otherwise conforms to. CPU usage is unlikely to add up to exactly 100%.
So the cpu check will never trigger for a long-running process, because the value doesn't represent the process's current cpu usage (e.g. as reported by top), which could suddenly jump to 100% after many hours or days due to a bug, at which point the process should be restarted. With bluepill I wrote a custom check condition as a workaround:
module Bluepill
  module ProcessConditions
    # Custom condition: sample the instantaneous CPU usage as reported
    # by top, instead of the lifetime average reported by ps.
    class CpuUsagePercental < ProcessCondition
      def initialize(options = {})
        @below = options[:below]
      end

      # Current %CPU of the process: 9th column of top's batch output.
      def run(pid, include_children = false)
        `top -b -p #{pid} -n 1 | grep '#{pid}' | awk '{print $9}'`.chomp.to_i
      end

      def check(value)
        value < @below
      end
    end
  end
end
What do you think?
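The underlying idea, usage over a sampling window instead of over the process lifetime, boils down to differencing two cumulative CPU-tick samples (on Linux, utime + stime from /proc/&lt;pid&gt;/stat). The arithmetic alone, as a sketch rather than an eye or bluepill API:

```ruby
# Sketch of "instantaneous" CPU usage: given cumulative CPU ticks at two
# sample points, compute usage over that window only, the way top does.
# hz is the kernel clock tick rate (`getconf CLK_TCK`, usually 100).
def cpu_percent(ticks_before, ticks_after, interval_seconds, hz = 100)
  used_seconds = (ticks_after - ticks_before).to_f / hz
  (used_seconds / interval_seconds) * 100.0
end

cpu_percent(0, 50, 1.0) # 50 ticks in a 1 s window at HZ=100 -> 50.0
```

This avoids shelling out to top entirely; a checker would just keep the previous sample and compare the delta against the threshold.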
Hi,
Just want to ask how to set up eye to monitor both high and low CPU.
Currently I have it monitoring high CPU like this:
checks :cpu, every: 60, below: 90, times: 15
But I couldn't find any docs about checking for low CPU.
Thanks man,
Son.
Hi.
After testing our eye config on the staging servers (where it was fine) and moving to production, I'm not able to load the config file and start eye.
I got this:
$ bin/eye load script/eye/rp.eye
eye started!
stack level too deep
/usr/local/rvm/gems/ruby-1.9.3-p448/gems/state_machine-1.2.0/lib/state_machine/macro_methods.rb:1
There is not a single line in the log file. I can send the list of gems if necessary. Is there any way to trace which part it failed in?
Thanks
tom
Is it possible to use a custom pid and socket file? I saw the EYE_PID environment variable, but the path given by eye xinfo differs depending on whether eye is running as root or as a user. I want to be able to start eye as a user and then manage it from root :)
Hi!
On one box, eye info reports the wrong process timestamp, e.g.:
$ eye version
Eye v0.5.1 (c) 2012-2014 @kostya
$ eye i sauspiel_delayed_job
sauspiel_delayed_job
dj_rake_task .................... up (12:21, 0%, 159Mb, <24243>)
but the last restart was at 12:42
03.02.2014 12:42:56 INFO -- [sauspiel_delayed_job:dj_rake_task] switch :restarting [:up => :restarting] (reason: bounded memory(500Mb))
03.02.2014 12:42:56 INFO -- [sauspiel_delayed_job:dj_rake_task] switch :stopping [:restarting => :stopping] (reason: bounded memory(500Mb))
$ ps -ef |grep -v grep | grep 24243
deploy 24243 1 7 12:43 ? 00:01:12 ruby ./script/delayed_job start
This time shift of ~20 minutes applies to all monitored processes on this box. On other boxes it is correct. Any hints?
Hi there,
I have a problem with the output of a process managed by eye. The process prints some text to stdout every two seconds, but the log file gets no output until I restart the process via eye. It seems like the output is collected somewhere and written out with a delay.
example process:
require 'logger'
logger = Logger.new('log/hi.log', 1, 1024 * 1024)
loop do
str = (0...50).map { ('a'..'z').to_a[rand(26)] }.join
logger.info str
puts ENV['TEST'] + ENV['TESTT']
puts 123
sleep(2)
end
eye explain output of my configuration:
{:settings=>
{:logger=>["/var/log/eye/eye.log", 3, 1048576],
:mail=>{:host=>"...", :port=>..., :type=>:mail},
:contacts=>{"smefju"=>{:name=>"smefju", :type=>:mail, :contact=>"smefju@...", :opts=>{}}}},
:applications=>
{"Custom processes"=>
{:name=>"Custom processes",
:working_dir=>"/home/app/app/current/",
:stdall=>"/home/app/app/shared/log/eye.log",
:stdout=>"/home/app/app/shared/log/eye.log",
:stderr=>"/home/app/app/shared/log/eye.log",
:notify=>{"smefju"=>:warn},
:triggers=>{:flapping=>{:times=>3, :within=>60, :retry_in=>300, :type=>:flapping}},
:groups=>
{"__default__"=>
{:name=>"__default__",
:working_dir=>"/home/app/app/current/",
:stdall=>"/home/app/app/shared/log/eye.log",
:stdout=>"/home/app/app/shared/log/eye.log",
:stderr=>"/home/app/app/shared/log/eye.log",
:notify=>{"smefju"=>:warn},
:triggers=>{:flapping=>{:times=>3, :within=>60, :retry_in=>300, :type=>:flapping}},
:application=>"Custom processes",
:processes=>
{"hi"=>
{:name=>"hi",
:working_dir=>"/home/app/app/current/",
:stdall=>"/home/app/app/shared/log/hi_stdall.log",
:stdout=>"/home/app/app/shared/log/hi_stdall.log",
:stderr=>"/home/app/app/shared/log/hi_stdall.log",
:notify=>{"smefju"=>:warn},
:triggers=>{:flapping=>{:times=>3, :within=>60, :retry_in=>300, :type=>:flapping}},
:application=>"Custom processes",
:group=>"__default__",
:pid_file=>"/var/run/app/hi.pid",
:start_command=>"ruby processes/hi.rb",
:environment=>{"TEST"=>1, "TESTT"=>2},
:daemonize=>true}}}}}}}
Any idea?
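The symptom, output appearing only on restart and only in daemon mode, is consistent with Ruby buffering stdout when it is redirected to a file instead of a TTY. A minimal demonstration with a pipe standing in for the log file; if this is the cause, adding `$stdout.sync = true` at the top of processes/hi.rb should fix it:

```ruby
r, w = IO.pipe
w.sync = false                 # buffered, like stdout redirected to a file
w.write "hello from the app\n"

# Nothing is readable yet: the bytes sit in Ruby's internal write buffer.
buffered = IO.select([r], nil, nil, 0).nil?

w.flush                        # or set sync = true once at startup
flushed = !IO.select([r], nil, nil, 0).nil?
```

This assumes eye attaches the daemonized process's stdout to the log file via a plain redirect; the Logger output appears promptly because Logger flushes on each write, while bare `puts` does not.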
The code I'm using now is:
trigger :transition, to: :starting, do: -> {
month = 60 * 60 * 24 * 28
process.wait_for_condition(month, 15) do
info "Waiting for camera #{url} to be up"
timeout = 5 * 1_000_000 # 5 seconds in microseconds
system("ffmpeg -loglevel warning -y -stimeout #{timeout} " \
"-i #{url} -c:v copy -an -vframes 1 -f rawvideo /dev/null")
end
}
But I don't want to stop even after a month; I want wait_for_condition to wait endlessly. If the camera is offline, there is no reason to ever start the proxy server process; the condition just needs to be retried every 15 seconds until the camera shows up on the network (if ever).
How would I make wait_for_condition never time out? What is the rationale for wait_for_condition timing out at all? In other words, under what circumstances is it useful to time out and ignore a condition? (Why have a condition in the first place if it will be ignored after a while?)
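I don't know whether wait_for_condition accepts an infinite timeout, but the "retry every 15 seconds, forever" semantics can also be written as a plain loop inside the trigger block. A sketch with the ffmpeg probe stubbed out as a lambda:

```ruby
# Retry a probe at a fixed interval, with no deadline at all.
def wait_forever(interval, &probe)
  sleep(interval) until probe.call
end

attempts = 0
camera_up = -> { (attempts += 1) >= 3 } # stand-in for the ffmpeg check
wait_forever(0.01, &camera_up)
attempts # => 3
```

Whether blocking a trigger indefinitely is safe inside eye's Celluloid actor model is a separate question worth checking before relying on this.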
Hello,
I'm having issues with the default behaviour of creating pid/socket files in $HOME. I have a system-wide installation of eye on an Ubuntu system, so I invoke eye either via sudo from my account or by logging in directly as root. $HOME, however, differs between the two cases, leading to communication issues.
On an LTS system I would put eye's files in /var/run/eye, and I do so in my fork, but of course this path has to be hardcoded and would not suit those who run eye as a normal user. This is why I'm not proposing a pull request.
Maybe it would be appropriate to make the server's directory configurable between a "user" and a "system-wide" mode, and to make the client search for the pid/socket in the home dir and in /var/run by default, possibly with a CLI option to override this behaviour.
If you deem this approach viable, I can prepare a patch and pull request.
Bye,
Would this be possible? Instead of launching external processes I would like to fork different bits of code and monitor them.
Thanks.
When I make my command something like:
start_command 'command_1.py | command_2.rb'
It appears the first command runs and the '|' (pipe) is passed to it as an escaped argument, instead of a shell piping the first command's output into the second.
Any ideas?
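That matches how commands behave when they are exec'd directly (argv form) rather than through a shell: '|' is then just another argument. Ruby makes the difference easy to see (standalone demo, not eye code):

```ruby
# argv form: no shell involved, so the pipe character reaches echo verbatim
argv_out  = IO.popen(["echo", "hi", "|", "cat"], &:read)  # "hi | cat\n"

# string form: a shell parses the command line, so the pipe actually pipes
shell_out = IO.popen("echo hi | cat", &:read)             # "hi\n"
```

Assuming eye execs start_command the same way, a possible workaround is to run the shell yourself, e.g. `start_command "sh -c 'command_1.py | command_2.rb'"`, though this inserts a shell process between eye and your commands, which can complicate pid tracking and daemonize handling.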