Comments (15)
Is this really a problem or does it only look ugly in the logs?
from server-tools.
The cron job crashes on the client instance, then the server keeps sending "instance seems to be dead" emails until we restart the client instance (even though Odoo was still running fine). Having a stack trace in the log is definitely ok, but the cron shouldn't crash IMO.
hmm, but that's weird, because cronjobs' exceptions should be handled safely by cron anyways: https://github.com/OCA/OCB/blob/7.0/openerp/addons/base/ir/ir_cron.py#L140 - that's also the reason I didn't do any error handling in the first place.
Could you find out why this is not true on the client instance in question? Maybe a very old code base?
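For readers who don't want to open the link, here is a minimal sketch of the pattern being described: the job runner wraps each call in a try/except and logs the failure instead of letting it escape. This is not the OCB code itself; `run_job_safely` and its arguments are hypothetical names used only for illustration (written for Python 3, while the thread itself is about Python 2).

```python
import logging

_logger = logging.getLogger(__name__)


def run_job_safely(job_name, method, *args):
    """Call a job method and log any exception instead of propagating it.

    Hypothetical helper: a simplified stand-in for the kind of wrapper
    the cron is said to put around each job, not the real implementation.
    """
    try:
        method(*args)
        return True
    except Exception:
        # logger.exception records the message together with the traceback.
        _logger.exception("Call of %s failed", job_name)
        return False


def failing_job():
    raise IOError("Connection timed out")


def working_job():
    return None


results = [
    run_job_safely("failing_job", failing_job),
    run_job_safely("working_job", working_job),
]
```

Note that a wrapper like this only helps once the job raises. A call that blocks without ever returning never reaches the `except` branch.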
Thanks for this answer which makes a lot of sense. Let me check it and get back to you!
I checked the code of ir_cron.py and it's exactly the same as OCB v7
then the cron thread shouldn't crash. Do some debugging to find out what goes wrong here
The Odoo source code was pretty old anyway. I updated it to the latest OCB v7. I'll first check whether it still crashes (it shouldn't take long) and add some logging if that's the case. Otherwise I'll close the issue :)
@seb-elico does it work now?
@hbrunn Hi Holger! No, unfortunately it's still crashing. I haven't had time to investigate further so far. When I do, I'll try to add some logs or catch the exception inside alive(self) to see if that solves the problem. We have a lot of issues with other modules due to the poor quality of the internet connection, so China is a good playground to test Odoo in "extreme" conditions ;) For instance, the fetchmail module has several issues due to connection timeouts. But exceptions are handled inside the module, so they don't cause the cron to crash, hence my first suggestion: https://github.com/OCA/OCB/blob/7.0/addons/fetchmail/fetchmail.py#L259
@hbrunn I added some logs and changed the log level to debug. It starts to be clearer now (but still not completely clear).
First, about the exceptions: indeed, when an exception is raised, it's handled by the cron. And after the exception has been raised, the cron is able to start a new job. FYI, so far I have seen:
URLError: <urlopen error [Errno 110] Connection timed out>
URLError: <urlopen error [Errno 104] Connection reset by peer>
Second, regarding the cron that "crashes": after the server stopped receiving HTTP requests, I saw the following message in the log every time the cron tried to start the job:
openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
However, 18 minutes later, I got a Connection timed out and then the cron started launching jobs again. It feels like the cron was "frozen" and that it took 18 minutes to reach the timeout!
It might be possible that sometimes it's "frozen" even longer and maybe even forever...
I'm going to add more logs (including the data, so that I can use the RAM usage to make sure that the job that fails is the same as the one that was launched a while ago, 18 minutes in my previous example). I'll also try to set up a short timeout (like 30 seconds) to see if I keep having long gaps between the send and the exception. The good thing is: it happens very often thanks to the bad internet connection!
To be continued...
Please find below a log showing what I explained in my previous message (I added a "sent" log in case of success and a "could not send" log in case of failure):
09:28:36,770 8 DEBUG stable openerp.addons.base.ir.ir_cron: Starting job `Dead man's switch client`.
09:28:36,776 8 DEBUG stable openerp.addons.base.ir.ir_cron: cron.object.execute(u'stable', 1, '*', u'dead.mans.switch.client', u'alive')
09:28:36,780 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: sending {'ram': 13.310339821349043, 'user_count': 0, 'cpu': 0.0, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:28:37,324 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: sent {'ram': 13.310339821349043, 'user_count': 0, 'cpu': 0.0, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:28:37,324 8 DEBUG stable openerp.addons.base.ir.ir_cron: 0.548s (dead.mans.switch.client, alive)
09:28:43,596 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:29:37,386 8 DEBUG ? openerp.service.cron: cron0 polling for jobs
09:29:44,638 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:30:39,299 8 DEBUG ? openerp.service.cron: cron0 polling for jobs
09:30:39,303 8 DEBUG stable openerp.addons.base.ir.ir_cron: Starting job `Dead man's switch client`.
09:30:39,310 8 DEBUG stable openerp.addons.base.ir.ir_cron: cron.object.execute(u'stable', 1, '*', u'dead.mans.switch.client', u'alive')
09:30:39,315 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: sending {'ram': 13.310339821349043, 'user_count': 0, 'cpu': 0.1, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:30:45,701 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:30:45,705 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:31:46,766 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:31:46,770 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:32:47,832 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:32:47,837 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:33:48,857 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:33:48,861 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:34:49,889 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:34:49,894 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:35:52,570 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:35:52,574 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:36:53,601 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:36:53,606 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:37:54,637 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:37:54,642 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:38:55,673 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:38:55,678 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:39:56,740 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:39:56,744 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:40:59,440 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:40:59,445 8 DEBUG stable openerp.addons.base.ir.ir_cron: Another process/thread is already busy executing job `Dead man's switch client`, skipping it.
09:41:01,950 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: could not send {'ram': 13.310339821349043, 'user_count': 0, 'cpu': 0.1, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:41:01,950 8 ERROR stable openerp.addons.base.ir.ir_cron: Call of self.pool.get('dead.mans.switch.client').alive(cr, uid, *()) failed in Job 10
Traceback (most recent call last):
  File "/opt/odoo/sources/odoo/openerp/addons/base/ir/ir_cron.py", line 136, in _callback
    method(cr, uid, *args)
  File "/opt/odoo/additional_addons/dead_mans_switch_client/models/dead_mans_switch_client.py", line 72, in alive
    raise e
URLError: <urlopen error [Errno 104] Connection reset by peer>
09:42:00,504 8 DEBUG ? openerp.service.cron: cron1 polling for jobs
09:42:00,508 8 DEBUG stable openerp.addons.base.ir.ir_cron: Starting job `Dead man's switch client`.
09:42:00,514 8 DEBUG stable openerp.addons.base.ir.ir_cron: cron.object.execute(u'stable', 1, '*', u'dead.mans.switch.client', u'alive')
09:42:00,518 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: sending {'ram': 13.324073147619526, 'user_count': 0, 'cpu': 0.0, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:42:01,206 8 DEBUG stable openerp.addons.dead_mans_switch_client.models.dead_mans_switch_client: sent {'ram': 13.324073147619526, 'user_count': 0, 'cpu': 0.0, 'database_uuid': u'12345678-90ab-cdef-1234-567890abcdef'}
09:42:01,206 8 DEBUG stable openerp.addons.base.ir.ir_cron: 0.692s (dead.mans.switch.client, alive)
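As a quick check on the excerpt above, the time the job stayed blocked can be computed from the two timestamps that frame it: the 09:30:39,315 "sending" line and the 09:41:01,950 "could not send" line. The helper name `elapsed_seconds` is made up for this sketch, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(start, end, fmt="%H:%M:%S,%f"):
    """Return the seconds between two same-day log timestamps.

    Timestamps use the log's own format, e.g. '09:30:39,315'.
    """
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the "sending" and "could not send" log lines.
blocked = elapsed_seconds("09:30:39,315", "09:41:01,950")
print("request blocked for %.3f s (about %.1f min)" % (blocked, blocked / 60))
```

In this particular run the request hung for a bit over ten minutes, during which every other poll reported the job as busy.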
I added a 30-second timeout to urllib2.urlopen, and it seems much more stable now. The exceptions in the log are slightly different:
URLError: <urlopen error timed out>
URLError: <urlopen error _ssl.c:495: The handshake operation timed out>
I've never seen those exceptions before... Oh and I forgot to mention: the server is contacted on an HTTPS URL (hence the SSL exception).
I'm gonna let it run all night long to check if it "crashes/freezes" again and I'll let you know by tomorrow :)
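To illustrate why an explicit timeout changes the behaviour, here is a small self-contained sketch. It is written for Python 3 (`urllib.request` rather than the `urllib2` used in this thread), and the local "silent server", the `post_json` helper and the 0.5 second value are assumptions made for the demo only. The socket accepts connections at the TCP level but never answers, which stands in for a stalled connection.

```python
import json
import socket
import time
import urllib.request

SEND_TIMEOUT = 0.5  # seconds; demo value only, the thread uses 30


def post_json(url, payload, timeout):
    """POST payload as JSON; return True on success, False on a network error."""
    request = urllib.request.Request(
        url,
        json.dumps(payload).encode("utf-8"),
        {"Content-Type": "application/json"},
    )
    # An empty ProxyHandler keeps the local demo away from any system proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open(request, timeout=timeout).close()
        return True
    except OSError:
        # URLError and socket.timeout are both subclasses of OSError.
        return False


# A listening socket that never replies simulates the stalled peer.
silent_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
silent_server.bind(("127.0.0.1", 0))
silent_server.listen(1)
url = "http://127.0.0.1:%d/" % silent_server.getsockname()[1]

started = time.monotonic()
ok = post_json(url, {"jsonrpc": "2.0", "method": "call", "params": {}}, SEND_TIMEOUT)
waited = time.monotonic() - started
silent_server.close()

print("sent" if ok else "could not send", "after %.1f s" % waited)
```

Without the `timeout` argument the same call would wait on the socket until the operating system gives up, which would be consistent with the long gaps seen in the logs.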
@hbrunn Hi Holger! It seems that the timeout did the trick :D No crash/freeze of the cron during the night, although a lot of timeouts were raised (1 out of 10 requests on average). FYI, here's the patch I made, feel free to reuse it! The most important part is SEND_TIMEOUT.
--- a/dead_mans_switch_client/models/dead_mans_switch_client.py
+++ b/dead_mans_switch_client/models/dead_mans_switch_client.py
@@ -11,6 +11,7 @@ except ImportError:
 import urllib2
 from openerp.osv import orm
+SEND_TIMEOUT = 30
 class DeadMansSwitchClient(orm.AbstractModel):
     _name = 'dead.mans.switch.client'
@@ -53,15 +54,21 @@ class DeadMansSwitchClient(orm.AbstractModel):
             logger.error('No server configured!')
             return
         data = self._get_data(cr, uid, context=context)
-        logger.debug('sending %s', data)
-        urllib2.urlopen(
-            urllib2.Request(
-                url,
-                json.dumps({
-                    'jsonrpc': '2.0',
-                    'method': 'call',
-                    'params': data,
-                }),
-                {
-                    'Content-Type': 'application/json',
-                }))
+        logger.debug('Sending %s', data)
+        try:
+            urllib2.urlopen(
+                urllib2.Request(
+                    url,
+                    json.dumps({
+                        'jsonrpc': '2.0',
+                        'method': 'call',
+                        'params': data,
+                    }),
+                    {
+                        'Content-Type': 'application/json',
+                    }),
+                timeout=SEND_TIMEOUT)
+            logger.debug('Successfully sent %s', data)
+        except Exception, e:
+            logger.debug('Failed to send %s', data)
+            raise e
Nice! Please make a PR with that, but read the timeout from an ir.config_parameter, maybe dead_mans_switch_client.send_timeout
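One way this suggestion could look is sketched below. The parsing helper `parse_send_timeout` is a hypothetical name, and falling back to 30 seconds on a missing or invalid value is an assumption, not something decided in this thread. The Odoo call is shown only in a comment so that the snippet runs without an Odoo installation (Python 3 syntax).

```python
DEFAULT_SEND_TIMEOUT = 30  # seconds, the value used in the patch in this thread


def parse_send_timeout(raw_value, default=DEFAULT_SEND_TIMEOUT):
    """Turn a stored parameter value into a positive timeout in seconds.

    Falls back to `default` when the parameter is missing (None or False),
    not a number, or not strictly positive.
    """
    try:
        timeout = float(raw_value)
    except (TypeError, ValueError):
        return default
    return timeout if timeout > 0 else default


# Inside the model, the raw value would presumably come from something like:
#   raw = self.pool['ir.config_parameter'].get_param(
#       cr, uid, 'dead_mans_switch_client.send_timeout')
# Here a few sample values stand in for what the parameter might hold.
samples = ["45", False, "abc", "-5"]
parsed = [parse_send_timeout(value) for value in samples]
```

Keeping the parsing separate from the request makes the fallback easy to test, and a mistyped parameter cannot silently disable the timeout.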
Done! #309
I also created a PR on your 7.0 branch so that you can update your own 7.0 PR :)