Code Monkey home page Code Monkey logo

Comments (12)

blodone avatar blodone commented on July 30, 2024

after 24h the process is moving on. but its very slow. i gonna try different simultanious read and write values

from benji.

elemental-lf avatar elemental-lf commented on July 30, 2024

Sorry for reopening this, but I don't think that this is normal. Could you gather anything else from the logfile (according to your strace you were logging at level DEBUG)? What version of Python are you using?
This suggests that you're still making progress:

[pid 1354678] write(2, " DEBUG: Queued block 11171 for"..., 70 <unfinished ...>

But something was hanging or is very slow either on the read or the write side. And I'd like to know where and what...

from benji.

blodone avatar blodone commented on July 30, 2024

Hi were using python 3.6.7

So we would now debug further into this... The difference between the start and progress of the backup is really incredible:

2019-02-11 20:52:14,982 [410382] $ /backup/benji/venv/bin/benji --log-level DEBUG -m ls name == "RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add" and snapshot_name == "FREEZE_2019-02-11T13:18:43.911538+00:00"
2019-02-11 20:52:16,774 [410417] $ /backup/benji/venv/bin/benji --log-level DEBUG backup -s FREEZE_2019-02-11T19:52:00.370238+00:00 -r /backup/benji/diff/65338.diff -f 19 rbd://RBD_SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5/RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add@FREEZE_2019-02-11T19:52:00.370238+00:00 RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add
2019-02-11 20:56:47,596 [410417] Starting sanity check with 0.1% of the ignored blocks.
2019-02-11 20:56:50,299 [410417] Finished sanity check. Checked 87 blocks {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 41020, 36932, 66131, 122452, 35330, 47206, 116839, 58995, 77961, 86161, 22692, 31398, 21670, 114342, 108710, 76971, 65717, 125115, 18967, 76997, 30406, 126664, 65235, 213, 28374, 125656, 2267, 37106, 116987, 118809, 93446, 44296, 82724, 21797, 122166, 19767, 125758, 31559, 39761, 103762, 119127, 50523, 5981, 14691, 127844, 32106, 85873, 62840, 77176, 67961, 89478, 60813, 2447, 57250, 71729, 19384, 93627, 96189, 97236, 97238, 2011, 77800}.
2019-02-12 10:41:00,751 [436428] $ /backup/benji/venv/bin/benji --log-level DEBUG -m ls
2019-02-12 10:41:02,884 [436441] $ /backup/benji/venv/bin/benji --log-level DEBUG -m stats
2019-02-12 12:27:33,075 [410417] Backed up 14/2696 blocks (0.5%)
2019-02-12 12:27:33,383 [410417] Backed up 28/2696 blocks (1.0%)

Seems like there is anything blocking like throtteling or something like this.

from benji.

elemental-lf avatar elemental-lf commented on July 30, 2024

Ah, you're affected by the problem that the logfile until very recently only gets log level INFO and above. Could you either update to the latest master (or the image with tag "latest") or provide the output directly from the console? Thank you.

from benji.

blodone avatar blodone commented on July 30, 2024

ok, i've updated and this evening it's running again ;)

were testing now with following settings:

storages:
  - name: file
    storageId: 1
    module: file
    configuration:
       path: /backup/benji/data
       consistencyCheckWrites: True
       simultaneousWrites: 5
       simultaneousReads: 5

ios:
  - name: file
    module: file
    configuration:
      simultaneousReads: 5

  - name: rbd
    module: rbd
    configuration:      
       cephConfigFile: /etc/ceph/ceph.conf
       simultaneousReads: 5
       newImageFeatures:
         - RBD_FEATURE_LAYERING
         - RBD_FEATURE_EXCLUSIVE_LOCK
         - RBD_FEATURE_OBJECT_MAP
         - RBD_FEATURE_FAST_DIFF
         - RBD_FEATURE_DEEP_FLATTEN

from benji.

blodone avatar blodone commented on July 30, 2024

so, now i got plenty of logging, but the last entry in the log is:

2019-02-12 21:00:14,612 461639/MainThread benji.py:761 DEBUG Keeping block 73926

process with ps auxwww tells me

invento+  461639  0.5 10.4 2034436 854600 ?      Ssl  Feb12   3:35 benji [Backing up version V0000000023 from rbd://RBD_XenStorage-PROD-SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5/RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add@FREEZE_2019-02-12T19:55:13.469743+00:00: Queueing blocks to read (57.8%)]

so again the strace (approx. 10secs):

[pid 461759] futex(0x7f1b5407ab00, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 461758] futex(0x7f1b5400f670, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 461757] futex(0x7f1b54059d50, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 461756] futex(0x7f1b2c004b40, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 461755] futex(0x7f1b54014e70, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, 0xffffffff <unfinished ...>
[pid 461658] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 461657] futex(0x1a36d78, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461656] futex(0x1a36b88, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461655] futex(0x1a368c8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461654] futex(0x1a363f8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461653] restart_syscall(<... resuming interrupted restart_syscall ...> <unfinished ...>
[pid 461652] futex(0x1a83448, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461651] futex(0x1a83284, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461650] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 461649] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461648] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461649] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91091149}) = 0
[pid 461648] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91106849}) = 0
[pid 461649] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461648] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461649] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91163687}) = 0
[pid 461648] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91179052}) = 0
[pid 461649] epoll_wait(12,  <unfinished ...>
[pid 461647] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461648] epoll_wait(9,  <unfinished ...>
[pid 461647] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91288108}) = 0
[pid 461646] restart_syscall(<... resuming interrupted futex ...> <unfinished ...>
[pid 461647] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461645] futex(0x13c25a0, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461647] <... clock_gettime resumed> {tv_sec=326183, tv_nsec=91351773}) = 0
[pid 461647] epoll_wait(6,  <unfinished ...>
[pid 461639] write(2, "   DEBUG: Keeping block 73927\n", 30 <unfinished ...>
[pid 461658] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040900, tv_nsec=156538158}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040902, tv_nsec=156538158}, 0xffffffff <unfinished ...>
[pid 461653] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=603285489}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=603400125}) = 0
[pid 461653] futex(0x13c25a0, FUTEX_WAKE_PRIVATE, 1) = 1
[pid 461653] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid 461645] <... futex resumed> )      = 0
[pid 461653] <... clock_gettime resumed> {tv_sec=1550040901, tv_nsec=603601695}) = 0
[pid 461645] futex(0x13c24f8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 461653] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid 461645] <... futex resumed> )      = 0
[pid 461653] <... clock_gettime resumed> {tv_sec=1550040901, tv_nsec=603675665}) = 0
[pid 461645] futex(0x13c25a4, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=603766771}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=603863654}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=603950786}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=604050811}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=604135270}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=604218483}) = 0
[pid 461653] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=604302110}) = 0
[pid 461653] write(14, "c", 1)          = 1
[pid 461649] <... epoll_wait resumed> [{EPOLLIN, {u32=13, u64=13}}], 5000, 30000) = 1
[pid 461653] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid 461649] clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
[pid 461653] <... clock_gettime resumed> {tv_sec=1550040901, tv_nsec=604492755}) = 0
[pid 461649] <... clock_gettime resumed> {tv_sec=326185, tv_nsec=928921781}) = 0
[pid 461653] futex(0x1a36240, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 461649] read(13,  <unfinished ...>
[pid 461653] <... futex resumed> )      = 0
[pid 461649] <... read resumed> "c", 256) = 1
[pid 461653] futex(0x1a362c8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040911, tv_nsec=604492755}, 0xffffffff <unfinished ...>
[pid 461649] read(13, 0x7f1b6ef599d0, 256) = -1 EAGAIN (Resource temporarily unavailable)
[pid 461649] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=604731299}) = 0
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=929222918}) = 0
[pid 461649] sendmsg(15, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="\16E\277c\\\243w\v$", iov_len=9}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 9
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=929426393}) = 0
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=929503207}) = 0
[pid 461649] epoll_wait(12, [{EPOLLIN, {u32=15, u64=15}}], 5000, 29255) = 1
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=929996927}) = 0
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=930068288}) = 0
[pid 461649] read(15, "\17E\277c\\\243w\v$", 4096) = 9
[pid 461649] read(15, 0x1b1e3c0, 4096)  = -1 EAGAIN (Resource temporarily unavailable)
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=930274051}) = 0
[pid 461649] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326185, tv_nsec=930339229}) = 0
[pid 461649] epoll_wait(12,  <unfinished ...>
[pid 461646] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 461646] futex(0x186cdd0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461646] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040901, tv_nsec=706634390}) = 0
[pid 461646] futex(0x186ce48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040906, tv_nsec=706634390}, 0xffffffff <unfinished ...>
[pid 461650] <... restart_syscall resumed> ) = -1 ETIMEDOUT (Connection timed out)
[pid 461650] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040902, tv_nsec=109295100}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326186, tv_nsec=433829674}) = 0
[pid 461650] futex(0x1b0ef68, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326186, tv_nsec=434018248}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326186, tv_nsec=434200894}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326186, tv_nsec=434312897}) = 0
[pid 461650] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040902, tv_nsec=109974000}) = 0
[pid 461650] futex(0x1b0efbc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040907, tv_nsec=109861997}, 0xffffffff <unfinished ...>
[pid 461658] <... futex resumed> )      = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040902, tv_nsec=156794890}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040904, tv_nsec=156794890}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040904, tv_nsec=157178601}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040906, tv_nsec=157178601}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040906, tv_nsec=157469339}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040908, tv_nsec=157469339}, 0xffffffff <unfinished ...>
[pid 461646] <... futex resumed> )      = -1 ETIMEDOUT (Connection timed out)
[pid 461646] futex(0x186cdd0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461646] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040906, tv_nsec=707038340}) = 0
[pid 461646] futex(0x186ce48, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040911, tv_nsec=707038340}, 0xffffffff <unfinished ...>
[pid 461650] <... futex resumed> )      = -1 ETIMEDOUT (Connection timed out)
[pid 461650] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040907, tv_nsec=110185553}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326191, tv_nsec=434697152}) = 0
[pid 461650] futex(0x1b0ef68, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326191, tv_nsec=434860886}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326191, tv_nsec=434985917}) = 0
[pid 461650] clock_gettime(CLOCK_MONOTONIC, {tv_sec=326191, tv_nsec=435080394}) = 0
[pid 461650] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040907, tv_nsec=110744103}) = 0
[pid 461650] futex(0x1b0efbc, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040912, tv_nsec=110649626}, 0xffffffff <unfinished ...>
[pid 461658] <... futex resumed> )      = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040908, tv_nsec=157739777}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040910, tv_nsec=157739777}, 0xffffffff) = -1 ETIMEDOUT (Connection timed out)
[pid 461658] clock_gettime(CLOCK_REALTIME, {tv_sec=1550040910, tv_nsec=158038382}) = 0
[pid 461658] futex(0x1a370c8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 461658] futex(0x1a37140, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1550040912, tv_nsec=158038382}, 0xffffffffstrace: Process 461639 detached
strace: Process 461645 detached
strace: Process 461646 detached
strace: Process 461647 detached
strace: Process 461648 detached
strace: Process 461649 detached
strace: Process 461650 detached
strace: Process 461651 detached
strace: Process 461652 detached
strace: Process 461653 detached
strace: Process 461654 detached
strace: Process 461655 detached
strace: Process 461656 detached
strace: Process 461657 detached
strace: Process 461658 detached
 <detached ...>
strace: Process 461755 detached
strace: Process 461756 detached
strace: Process 461757 detached
strace: Process 461758 detached
strace: Process 461759 detached

log file previous to keeping blocks:

2019-02-12 20:55:28,048 461612/MainThread logging.py:50 INFO $ /backup/benji/venv/bin/benji --log-level DEBUG -m ls name == "RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add" and snapshot_name =
= "FREEZE_2019-02-11T19:52:00.370238+00:00"
2019-02-12 20:55:28,048 461612/MainThread benji.py:863 DEBUG commands.ls(**{'filter_expression': 'name == "RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add" and snapshot_name == "FREEZE_2019-02-
11T19:52:00.370238+00:00"', 'include_labels': False})
2019-02-12 20:55:28,054 461612/MainThread utils.py:103 DEBUG Using block hash BLAKE2b with kwargs {'digest_bits': 256}.
2019-02-12 20:55:28,055 461612/MainThread config.py:86 DEBUG Resolved schema for benji.io.base-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simu
ltaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:28,055 461612/MainThread config.py:86 DEBUG Resolved schema for benji.io.file-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simu
ltaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:28,077 461612/MainThread config.py:86 DEBUG Resolved schema for benji.io.base-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simu
ltaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:28,077 461612/MainThread config.py:86 DEBUG Resolved schema for benji.io.rbd-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simul
taneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'cephConfigFile': {'type': 'string', 'empty': False, 'default': '/etc/ceph/ceph.conf'}, 'clientIdentifier': {'
type': 'string', 'empty': False, 'default': 'admin'}, 'newImageFeatures': {'type': 'list', 'empty': False, 'default': ['RBD_FEATURE_LAYERING'], 'schema': {'type': 'string', 'regex': '^RB
D_FEATURE_.*'}}}, 'empty': False}}.
2019-02-12 20:55:28,084 461612/MainThread config.py:86 DEBUG Resolved schema for benji.storage.base-v1: {'configuration': {'type': 'dict', 'empty': False, 'schema': {'activeTransforms': 
{'type': 'list', 'empty': False, 'schema': {'type': 'string', 'empty': False}}, 'simultaneousWrites': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'simultaneousReads': {'
type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'bandwidthRead': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'bandwidthWrite': {'type': 'integer', 'empty': Fa
lse, 'min': 0, 'default': 0}, 'consistencyCheckWrites': {'type': 'boolean', 'empty': False, 'default': False}, 'hmac': {'type': 'dict', 'empty': False, 'schema': {'password': {'type': 's
tring', 'empty': False, 'required': True, 'minlength': 8}, 'kdfSalt': {'type': 'string', 'empty': False, 'required': True}, 'kdfIterations': {'type': 'integer', 'empty': False, 'required': True, 'min': 1000}}}}}}.
2019-02-12 20:55:28,084 461612/MainThread config.py:86 DEBUG Resolved schema for benji.storage.file-v1: {'configuration': {'type': 'dict', 'empty': False, 'schema': {'activeTransforms': {'type': 'list', 'empty': False, 'schema': {'type': 'string', 'empty': False}}, 'simultaneousWrites': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'bandwidthRead': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'bandwidthWrite': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'consistencyCheckWrites': {'type': 'boolean', 'empty': False, 'default': False}, 'hmac': {'type': 'dict', 'empty': False, 'schema': {'password': {'type': 'string', 'empty': False, 'required': True, 'minlength': 8}, 'kdfSalt': {'type': 'string', 'empty': False, 'required': True}, 'kdfIterations': {'type': 'integer', 'empty': False, 'required': True, 'min': 1000}}}, 'path': {'type': 'string', 'empty': False, 'required': True}}, 'required': True}}.
2019-02-12 20:55:28,133 461612/MainThread database.py:564 DEBUG Current database schema revision: fe79ce75cefa.
2019-02-12 20:55:28,134 461612/MainThread database.py:565 DEBUG Expected database schema revision: fe79ce75cefa.
2019-02-12 20:55:30,447 461639/MainThread logging.py:50 INFO $ /backup/benji/venv/bin/benji --log-level DEBUG backup -s FREEZE_2019-02-12T19:55:13.469743+00:00 -r /backup/benji/diff/65608.diff -f 21 rbd://RBD_XenStorage-PROD-SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5/RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add@FREEZE_2019-02-12T19:55:13.469743+00:00 RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add
2019-02-12 20:55:30,448 461639/MainThread benji.py:863 DEBUG commands.backup(**{'base_version_uid': '21', 'block_size': None, 'labels': None, 'rbd_hints': '/backup/benji/diff/65608.diff', 'snapshot_name': 'FREEZE_2019-02-12T19:55:13.469743+00:00', 'source': 'rbd://RBD_XenStorage-PROD-SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5/RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add@FREEZE_2019-02-12T19:55:13.469743+00:00', 'storage': '', 'version_name': 'RBD-bd0ed693-82b8-45b3-8cca-32ccfd140add'})
2019-02-12 20:55:30,457 461639/MainThread utils.py:103 DEBUG Using block hash BLAKE2b with kwargs {'digest_bits': 256}.
2019-02-12 20:55:30,458 461639/MainThread config.py:86 DEBUG Resolved schema for benji.io.base-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:30,458 461639/MainThread config.py:86 DEBUG Resolved schema for benji.io.file-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:30,487 461639/MainThread config.py:86 DEBUG Resolved schema for benji.io.base-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}}}}.
2019-02-12 20:55:30,488 461639/MainThread config.py:86 DEBUG Resolved schema for benji.io.rbd-v1: {'configuration': {'type': 'dict', 'nullable': True, 'required': True, 'schema': {'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'cephConfigFile': {'type': 'string', 'empty': False, 'default': '/etc/ceph/ceph.conf'}, 'clientIdentifier': {'type': 'string', 'empty': False, 'default': 'admin'}, 'newImageFeatures': {'type': 'list', 'empty': False, 'default': ['RBD_FEATURE_LAYERING'], 'schema': {'type': 'string', 'regex': '^RBD_FEATURE_.*'}}}, 'empty': False}}.
2019-02-12 20:55:30,498 461639/MainThread config.py:86 DEBUG Resolved schema for benji.storage.base-v1: {'configuration': {'type': 'dict', 'empty': False, 'schema': {'activeTransforms': {'type': 'list', 'empty': False, 'schema': {'type': 'string', 'empty': False}}, 'simultaneousWrites': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'bandwidthRead': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'bandwidthWrite': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'consistencyCheckWrites': {'type': 'boolean', 'empty': False, 'default': False}, 'hmac': {'type': 'dict', 'empty': False, 'schema': {'password': {'type': 'string', 'empty': False, 'required': True, 'minlength': 8}, 'kdfSalt': {'type': 'string', 'empty': False, 'required': True}, 'kdfIterations': {'type': 'integer', 'empty': False, 'required': True, 'min': 1000}}}}}}.
2019-02-12 20:55:30,498 461639/MainThread config.py:86 DEBUG Resolved schema for benji.storage.file-v1: {'configuration': {'type': 'dict', 'empty': False, 'schema': {'activeTransforms': {'type': 'list', 'empty': False, 'schema': {'type': 'string', 'empty': False}}, 'simultaneousWrites': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'simultaneousReads': {'type': 'integer', 'empty': False, 'min': 1, 'default': 1}, 'bandwidthRead': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'bandwidthWrite': {'type': 'integer', 'empty': False, 'min': 0, 'default': 0}, 'consistencyCheckWrites': {'type': 'boolean', 'empty': False, 'default': False}, 'hmac': {'type': 'dict', 'empty': False, 'schema': {'password': {'type': 'string', 'empty': False, 'required': True, 'minlength': 8}, 'kdfSalt': {'type': 'string', 'empty': False, 'required': True}, 'kdfIterations': {'type': 'integer', 'empty': False, 'required': True, 'min': 1000}}}, 'path': {'type': 'string', 'empty': False, 'required': True}}, 'required': True}}.
2019-02-12 20:55:30,557 461639/MainThread database.py:564 DEBUG Current database schema revision: fe79ce75cefa.
2019-02-12 20:55:30,557 461639/MainThread database.py:565 DEBUG Expected database schema revision: fe79ce75cefa.
2019-02-12 20:56:02,132 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 11.57s
2019-02-12 20:56:18,708 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 8.15s
2019-02-12 20:56:39,876 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 9.32s
2019-02-12 20:56:56,529 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 5.97s
2019-02-12 20:57:16,265 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 5.70s
2019-02-12 20:57:36,039 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 5.48s
2019-02-12 20:57:56,037 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 5.17s
2019-02-12 20:58:18,289 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 7.43s
2019-02-12 20:58:39,729 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 8.87s
2019-02-12 20:58:54,855 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 3.99s
2019-02-12 20:59:16,026 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 5.16s
2019-02-12 20:59:35,189 461639/MainThread database.py:820 DEBUG Commited metadata transaction in 4.32s
2019-02-12 21:00:05,112 461639/MainThread benji.py:702 INFO Starting sanity check with 0.1% of the ignored blocks.
2019-02-12 21:00:05,112 461639/MainThread benji.py:702 INFO Starting sanity check with 0.1% of the ignored blocks.
2019-02-12 21:00:05,212 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 32 in 0.02s
2019-02-12 21:00:05,225 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 9 in 0.03s
2019-02-12 21:00:05,253 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 34 in 0.05s
2019-02-12 21:00:05,253 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 33 in 0.06s
2019-02-12 21:00:05,254 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 1 in 0.06s
2019-02-12 21:00:05,318 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 36 in 0.04s
2019-02-12 21:00:05,332 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 35 in 0.06s
2019-02-12 21:00:05,335 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 38 in 0.04s
2019-02-12 21:00:05,365 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 39 in 0.07s
2019-02-12 21:00:05,391 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 37 in 0.10s
2019-02-12 21:00:05,411 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 40 in 0.05s
2019-02-12 21:00:05,434 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 42 in 0.07s
2019-02-12 21:00:05,441 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 41 in 0.07s
2019-02-12 21:00:05,485 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 44 in 0.07s
2019-02-12 21:00:05,495 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 43 in 0.11s
2019-02-12 21:00:05,520 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 45 in 0.08s
2019-02-12 21:00:05,520 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 47 in 0.05s
2019-02-12 21:00:05,527 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 46 in 0.06s
2019-02-12 21:00:05,555 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 48 in 0.03s
2019-02-12 21:00:05,575 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 49 in 0.04s
2019-02-12 21:00:05,591 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 50 in 0.03s
2019-02-12 21:00:05,637 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 51 in 0.08s
2019-02-12 21:00:05,653 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 53 in 0.07s
2019-02-12 21:00:05,680 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 52 in 0.11s
2019-02-12 21:00:05,716 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 55 in 0.08s
2019-02-12 21:00:05,716 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 54 in 0.09s
2019-02-12 21:00:05,728 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 56 in 0.05s
2019-02-12 21:00:05,770 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 58 in 0.05s
2019-02-12 21:00:05,770 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 57 in 0.09s
2019-02-12 21:00:05,797 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 59 in 0.03s
2019-02-12 21:00:05,802 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 1373 in 0.03s
2019-02-12 21:00:05,846 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 4712 in 0.07s
2019-02-12 21:00:05,871 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 5354 in 0.07s
2019-02-12 21:00:05,896 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 21956 in 0.06s
2019-02-12 21:00:05,896 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 11916 in 0.09s
2019-02-12 21:00:05,912 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 20734 in 0.08s
2019-02-12 21:00:05,924 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 27341 in 0.04s
2019-02-12 21:00:05,970 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 29403 in 0.04s
2019-02-12 21:00:06,037 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 36825 in 0.07s
2019-02-12 21:00:06,052 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 29839 in 0.10s
2019-02-12 21:00:06,054 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 36329 in 0.09s
2019-02-12 21:00:06,065 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 32967 in 0.12s
2019-02-12 21:00:06,113 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 37239 in 0.04s
2019-02-12 21:00:06,137 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 41403 in 0.05s
2019-02-12 21:00:06,149 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 43379 in 0.05s
2019-02-12 21:00:06,149 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 44303 in 0.05s
2019-02-12 21:00:06,170 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 45563 in 0.06s
2019-02-12 21:00:06,256 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 46048 in 0.05s
2019-02-12 21:00:06,269 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 48849 in 0.06s
2019-02-12 21:00:06,273 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 47001 in 0.06s
2019-02-12 21:00:06,295 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 45737 in 0.08s
2019-02-12 21:00:06,295 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 49206 in 0.08s
2019-02-12 21:00:06,330 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 60322 in 0.03s
2019-02-12 21:00:06,346 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 49475 in 0.05s
2019-02-12 21:00:06,357 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 57405 in 0.06s
2019-02-12 21:00:06,381 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 60381 in 0.03s
2019-02-12 21:00:06,386 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 61385 in 0.03s
2019-02-12 21:00:06,432 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 63120 in 0.04s
2019-02-12 21:00:06,459 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 75736 in 0.04s
2019-02-12 21:00:06,469 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 65272 in 0.05s
2019-02-12 21:00:06,487 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 64054 in 0.09s
2019-02-12 21:00:06,490 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 64786 in 0.08s
2019-02-12 21:00:06,533 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 77222 in 0.06s
2019-02-12 21:00:06,565 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 80106 in 0.05s
2019-02-12 21:00:06,566 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 79539 in 0.08s
2019-02-12 21:00:06,573 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 83103 in 0.04s
2019-02-12 21:00:06,604 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 90451 in 0.03s
2019-02-12 21:00:06,626 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 84863 in 0.09s
2019-02-12 21:00:06,653 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 98528 in 0.05s
2019-02-12 21:00:06,654 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 99948 in 0.03s
2019-02-12 21:00:06,689 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 99375 in 0.08s
2019-02-12 21:00:06,692 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 100626 in 0.05s
2019-02-12 21:00:06,715 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 100711 in 0.05s
2019-02-12 21:00:06,749 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 101756 in 0.05s
2019-02-12 21:00:06,754 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 101152 in 0.05s
2019-02-12 21:00:06,754 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 106387 in 0.03s
2019-02-12 21:00:06,797 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 111763 in 0.03s
2019-02-12 21:00:06,802 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 111608 in 0.07s
2019-02-12 21:00:06,822 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 113749 in 0.03s
2019-02-12 21:00:06,845 461639/IO-Reader_4 rbd.py:129 DEBUG IO-Reader_4 read block 116444 in 0.04s
2019-02-12 21:00:06,857 461639/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 115392 in 0.06s
2019-02-12 21:00:06,896 461639/IO-Reader_2 rbd.py:129 DEBUG IO-Reader_2 read block 122719 in 0.03s
2019-02-12 21:00:06,923 461639/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 117972 in 0.06s
2019-02-12 21:00:06,927 461639/IO-Reader_0 rbd.py:129 DEBUG IO-Reader_0 read block 120501 in 0.06s
2019-02-12 21:00:06,949 461639/MainThread benji.py:731 INFO Finished sanity check. Checked 84 blocks {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 99375, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 57405, 64054, 49206, 1543, 113749, 4712, 99948, 82454, 11916, 29839, 63120, 111763, 83103, 45737, 79539, 120501, 115392, 32967, 100888, 27341, 48849, 117972, 29403, 116444, 98528, 80106, 5354, 65272, 20734, 44303, 64786, 100626, 101152, 49475, 90451, 1373, 122719, 100711, 126317, 43379, 37239, 101756, 84863, 106387, 47001, 60322, 77222, 41403, 21956, 61385, 10777, 75736, 36825, 60381, 46048, 36329, 111608, 45563}.

from benji.

blodone avatar blodone commented on July 30, 2024

once again its running fast for 20secs... but suddenly it stops...

...
2019-02-13 20:57:22,817 524282/IO-Reader_3 rbd.py:129 DEBUG IO-Reader_3 read block 123072 in 0.05s
2019-02-13 20:57:22,829 524282/MainThread benji.py:731 INFO Finished sanity check. Checked 88 blocks {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 70205, 124987, 37958, 17926, 15440, 71252, 83037, 117343, 44648, 76909, 120434, 73859, 85123, 65156, 64137, 111755, 80016, 21157, 20155, 123072, 78547, 59096, 35045, 24299, 75501, 69875, 126203, 25339, 94986, 114445, 100117, 78112, 25388, 63816, 30538, 58705, 84821, 106340, 67432, 99179, 98155, 40816, 83318, 10120, 72738, 12176, 109471, 31649, 47522, 78788, 77768, 63491, 4055, 54747, 102371, 41958, 7663, 34800, 68602, 110075, 112143}.
2019-02-13 20:57:22,829 524282/MainThread benji.py:761 DEBUG Keeping block 1
...
2019-02-13 20:57:32,616 524282/MainThread benji.py:761 DEBUG Keeping block 75125
2019-02-13 20:57:32,616 524282/MainThread benji.py:761 DEBUG Keeping block 75126
2019-02-13 20:57:32,616 524282/MainThread benji.py:761 DEBUG Keeping block 75127
2019-02-14 00:12:48,801 524282/MainThread benji.py:761 DEBUG Keeping block 75128
2019-02-14 00:12:48,802 524282/MainThread benji.py:761 DEBUG Keeping block 75129
2019-02-14 00:12:48,802 524282/MainThread benji.py:761 DEBUG Keeping block 75130
...
2019-02-14 00:13:05,410 524282/MainThread benji.py:761 DEBUG Keeping block 127999
2019-02-14 00:13:05,457 524282/MainThread benji.py:817 DEBUG Queued block 39422 for write (checksum 89365d47075af28a...)
...
2019-02-14 00:15:13,974 524282/IO-Reader_1 rbd.py:129 DEBUG IO-Reader_1 read block 116770 in 0.09s
2019-02-14 00:15:13,985 524282/Storage-Writer_2 base.py:201 DEBUG Storage-Writer_2 wrote data of uid 19-1c824 in 0.02s
2019-02-14 00:15:13,997 524282/MainThread benji.py:817 DEBUG Queued block 116770 for write (checksum 79c75dc933e828c4...)
2019-02-14 00:15:14,001 524282/MainThread benji.py:845 INFO Backed up 4301/4301 blocks (100.0%)
2019-02-14 00:15:14,011 524282/Storage-Writer_0 base.py:201 DEBUG Storage-Writer_0 wrote data of uid 19-1c823 in 0.01s
2019-02-14 00:15:16,273 524282/MainThread database.py:708 INFO Set status of version V0000000025 to valid.
2019-02-14 00:15:46,649 524282/MainThread benji.py:963 INFO Backed up version V0000000025 metadata.
2019-02-14 00:15:46,650 524282/MainThread benji.py:887 DEBUG Stats: {'bytes_read': 18039701504, 'bytes_written': 17922260992, 'bytes_dedup': 117440512, 'bytes_sparse': 0, 'start_time': 1550087544.5311801}
2019-02-14 00:15:49,073 524282/MainThread benji.py:906 INFO New version V0000000025 created, backup successful.

holds on a while .... and then continues without any kind of an error

from benji.

elemental-lf avatar elemental-lf commented on July 30, 2024

2019-02-13 20:57:32,616 524282/MainThread benji.py:761 DEBUG Keeping block 75127
2019-02-14 00:12:48,801 524282/MainThread benji.py:761 DEBUG Keeping block 75128

This is interesting. Thank you for providing the information. So even though we're keeping block 75127 and 75128 there is this pause of three hours. The code in question is quite straightforward and reads (edited for brevity):

           for i, block in enumerate(blocks):
                if block.id in read_blocks or not block.valid:
                    io.read(block)
                    read_jobs += 1
                elif block.id in sparse_blocks:
                    if block.uid is not None:
                        self._database_backend.set_block(...)
                        logger.debug('Skipping block (had data, turned sparse) {}'.format(block.id))
                    else:
                        logger.debug('Skipping block (sparse) {}'.format(block.id))
                else:
                    logger.debug('Keeping block {}'.format(block.id))

We're enumerating all blocks and decide for each block:

  • If the hints indicate that it should be read or the block is invalid it is queued for reading.
  • If it is sparse the database is updated if it wasn't sparse before.
  • If the block stayed the same according to the hints, nothing is done.

According to your logs, we were taking the last else branch both times. The only operations we're doing between these messages are block.id in read_blocks or not block.valid and block.id in sparse_blocks and these are both set lookups which should be fast. I'm puzzled and I think we're missing something. Interference from another thread? Python garbage collection? I'll need to think about this.

from benji.

elemental-lf avatar elemental-lf commented on July 30, 2024

@blodone I might have an idea. I'm assuming that you attached to the process after

2019-02-12 21:00:14,612 461639/MainThread benji.py:761 DEBUG Keeping block 73926

... while it was hanging. But your strace lists:

[pid 461639] write(2, " DEBUG: Keeping block 73927\n", 30 <unfinished ...>

This is a write to stderr by the Benji process. The write() syscall is still in progress and so blocks the process. Which means that whoever consumes the stderr of Benji is blocking, probably because it is not reading it and the pipe is full! So what are doing with the stderr output of the Benji process?

from benji.

blodone avatar blodone commented on July 30, 2024

hmm, thank you, that could be some very useful information 👍
currently we trigger the backup on the benji server over SSH via paramiko.

Although the network is stable, there can be a situation where the connection is somehow broken and the stderr is not being read by the client...
I'll try to restructure the backup call to reassign stderr and stdout not to use network paths and check if the error remains.

from benji.

blodone avatar blodone commented on July 30, 2024

so with nohup BENJI_CMD > log 2> err_log the issue seems the be solved... perhaps another logging mechanism could be used to prevent such slowdowns... but thats only nice to have i think..

from benji.

elemental-lf avatar elemental-lf commented on July 30, 2024

@blodone thanks for the update. You can log to a file and disable most logging to the console with --log-level ERROR. But I also think that this is general problem with your setup as Benji just writes to stderr. What kind of logging mechanism would you like to see? I've been thinking about introducing structured logging as this would be a better fit with my customer's setup and another logging mechanism could a by-product.

from benji.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.