open-cas / open-cas-linux Goto Github PK
View Code? Open in Web Editor NEWOpen CAS Linux
Home Page: https://open-cas.com
License: BSD 3-Clause "New" or "Revised" License
Open CAS Linux
Home Page: https://open-cas.com
License: BSD 3-Clause "New" or "Revised" License
@mmichal10 @robertbaldyga we find opencas service case kernel panic in our product env,detail info as fellows,is exists some bugs in opencas?
3.10.0-693.el7.x86_64
[[email protected]:/root]
# casadm --version
+=========================+===================+
| Name | Version |
+=========================+===================+
| CAS Cache Kernel Module | 19.09.00.00001210 |
| CAS Disk Kernel Module | 19.09.00.00001210 |
| CAS CLI Utility | 19.09.00.00001210 |
+=========================+===================+
[[email protected]:/root]
# casadm -L
type id disk status write policy device
cache 1 /dev/sdc Running wt -
+core 1 /dev/sdb Active - /dev/cas1-1
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405271] ------------[ cut here ]------------
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405297] WARNING: CPU: 10 PID: 39797 at lib/list_debug.c:53 __list_del_entry+0x63/0xd0
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405299] list_del corruption, ffff885fac756ff0->next is LIST_POISON1 (dead000000000100)
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405300] Modules linked in: fuse binfmt_misc cas_cache(OE) cas_disk(OE) tcp_diag inet_diag dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt iTCO_vendor_support sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg pcspkr hpilo hpwdt ioatdma shpchp i2c_i801 lpc_ich dca ipmi_si ipmi_devintf ipmi_msghandler pcc_cpufreq wmi acpi_power_meter ip_tables xfs sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper bnx2x syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel tg3 serio_raw hpsa mdio i2c_core ptp pps_core scsi_transport_sas libcrc32c
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405335] CPU: 10 PID: 39797 Comm: cas_io_cache1_1 Tainted: G W OE ------------ 3.10.0-693.el7.x86_64 #1
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405337] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 02/17/2017
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405338] ffff882fbda7bc88 000000004fed9b41 ffff882fbda7bc38 ffffffff816a3d91
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405340] ffff882fbda7bc78 ffffffff810879c8 00000035bda7bce0 ffff885fac756ff0
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405341] ffff885fac756000 ffff885fb9c8b2e0 0000000000000282 ffff885fbbf12ab8
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405344] Call Trace:
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405352] [<ffffffff816a3d91>] dump_stack+0x19/0x1b
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405357] [<ffffffff810879c8>] __warn+0xd8/0x100
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405360] [<ffffffff81087a4f>] warn_slowpath_fmt+0x5f/0x80
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405370] [<ffffffffc05ba3de>] ? cas_rpool_try_get+0x5e/0x80 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405371] [<ffffffff8133d963>] __list_del_entry+0x63/0xd0
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405373] [<ffffffff8133d9dd>] list_del+0xd/0x30
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405377] [<ffffffffc05ba3cf>] cas_rpool_try_get+0x4f/0x80 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405382] [<ffffffffc05bf08d>] __cas_ctx_data_alloc+0xed/0x300 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405387] [<ffffffffc05bf2b0>] cas_ctx_data_alloc+0x10/0x20 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405393] [<ffffffffc05c8163>] _ocf_read_generic_do+0x1a3/0x2b0 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405397] [<ffffffffc05c8348>] ocf_read_generic+0xd8/0x100 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405405] [<ffffffffc05e0813>] ocf_io_handle+0x23/0x40 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405413] [<ffffffffc05e0875>] ocf_queue_run_single+0x45/0x50 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405419] [<ffffffffc05e08a8>] ocf_queue_run+0x28/0x50 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405424] [<ffffffffc05b7d9a>] _cas_io_queue_thread+0xfa/0x150 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405429] [<ffffffff810b1910>] ? wake_up_atomic_t+0x30/0x30
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405434] [<ffffffffc05b7ca0>] ? cas_service_ioctl_ctrl+0x1bd0/0x1bd0 [cas_cache]
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405436] [<ffffffff810b098f>] kthread+0xcf/0xe0
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405438] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405442] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405444] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
Feb 28 09:46:14 10-193-69-43 kernel: [6872898.405446] ---[ end trace 808e8663e223caae ]---
Feb 28 09:46:14 10-193-69-43 k
For example for the following cache configuration:
type id disk status write policy device
cache 1 /dev/nvme0n1p1 Running wt -
the following commands return exit code 0
casadm -X -n seq-cutoff -i 1 -j 50 -t 28
casadm -X -n seq-cutoff -i 1 -j 50 -p full
casadm -X -n seq-cutoff -i 5555 -j 50 -p full
casadm -X -n seq-cutoff -i 5555 -p full
When I add core this error still occurs (for different core ids than the one I added). As stated this behaviour occurs when setting per cache or per core.
Setup:
RHEL 7.5
[root@gklab-108-180 mnt]# casadm -L
type id disk status write policy device
cache 1 /dev/nvme5n1 Running wb -
└core 1 /dev/md125 Flushing (85.9 %) - /dev/cas1-1
/dev/md125 is a IMSM raid5 array created upon four drives.
I created xfs fisystem on cas device and wrote around 500GB of data.
Flushing is in progress, but I observe ugly call traces:
[ 1080.224573] casadm D ffff9a08b5c8eeb0 0 4601 4596 0x00000080
[ 1080.231764] Call Trace:
[ 1080.234258] [] ? default_wake_function+0x12/0x20
[ 1080.240605] [] ? autoremove_wake_function+0x2b/0x40
[ 1080.247218] [] schedule+0x29/0x70
[ 1080.255648] [] schedule_timeout+0x239/0x2c0
[ 1080.264948] [] ? __wake_up+0x44/0x50
[ 1080.273635] [] ? cas_kick_queue_thread+0x23/0x30 [cas_cache]
[ 1080.284434] [] ? _cas_queue_kick+0xe/0x10 [cas_cache]
[ 1080.294586] [] wait_for_completion+0xfd/0x140
[ 1080.304014] [] ? wake_up_state+0x20/0x20
[ 1080.313025] [] _cache_mngt_core_flush_sync.isra.0+0x7f/0xb0 [cas_cache]
[ 1080.324791] [] _cache_mngt_remove_core_prepare+0x5d/0xc0 [cas_cache]
[ 1080.336989] [] cache_mngt_remove_core_from_cache+0xba/0x220 [cas_cache]
[ 1080.348797] [] ? vmalloc+0x52/0x60
[ 1080.357411] [] cas_service_ioctl_ctrl+0x1d7/0x1930 [cas_cache]
[ 1080.368563] [] ? enqueue_task_fair+0x208/0x6c0
[ 1080.378327] [] ? sched_clock_cpu+0x85/0xc0
[ 1080.387737] [] ? enqueue_task+0x57/0x80
[ 1080.396887] [] do_vfs_ioctl+0x350/0x560
[ 1080.405955] [] ? wake_up_new_task+0x122/0x190
[ 1080.415499] [] ? do_fork+0x107/0x320
[ 1080.424188] [] SyS_ioctl+0xa1/0xc0
[ 1080.432673] [] system_call_fastpath+0x1c/0x21
[ 1200.160372] INFO: task casadm:4601 blocked for more than 120 seconds.
[ 1200.170266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Steps to reproduce:
Actual status:
Inactive core without dirty data could not be removed without "force" option
Expected:
It should be possible to remove clean inactive core without "force" option
Reproduction:
Expected:
Info that core does not exist
Result:
Error while removing core device X from cache instance 1
Unknown error 1000015
Core block stats names differ slightly depending on using or not the '--ioclass-id' parameter:
# casadm -P -i 1 -j 1 -f blk -d # casadm -P -i 1 -j 1 -f blk
Reads from core(s) Reads from core
Writes to core(s) Writes to core
Total to/from core(s) Total to/from core
Reads from exported object(s) Reads from exported object
Writes to exported object(s) Writes to exported object
Total to/from exported object(s) Total to/from exported object
4K 100% read miss IOPS (with cache occupancy close to 0%) is order of magnitude lower on kernel 4.13 vs 5.3 :
CAS 19.6 + kernel 4.19 = ~50 KIOPS [expected]
CAS 19.9 + kernel 4.19 = ~27 KIOPS [too low]
CAS 19.9 + kernel 5.3 = ~450 KIOPS [expected]
Major increase in 19.9 vs 19.6 is expected as a result of locking optimizations. However on kernel 4.19 there is actually a decrease. Looks like all CAS threads are wasting time trying to acquire spin lock inside submit_bio (cache write in backfill engine). This looks like cache device queue saturation, but actual queue depth on the cache device as reported by iostat is only ~4 reqs.
Cache device queue settings:
[root@localhost open-cas-linux]# cat /sys/block/nvme0n1/queue/scheduler
[none] mq-deadline
[root@localhost open-cas-linux]# cat /sys/block/nvme0n1/queue/nr_requests
1023
Reproduction:
# 500G cache, 100G core
casadm -S -i 1 -d /dev/nvme0n1 -c wb --force
modprobe null_blk queue_mode=2 gb=100 nr_devices=2 irqmode=0 submit_queues=16
casadm -A -i 1 -d /dev/nullb0
/root/fio/fio -ioengine=libaio -direct=1 -verify=0 -group_reporting=1 -time_based=1 -ramp_time=0 -rw=randread -bs=4k -norandommap=1 -numjobs=16 -iodepth=16 --percentile_list=50:90:99:99.9:99.99:99.999 -cpus_allowed=0-19 -cpus_allowed_policy=split -filename=/dev/cas1-1 -name=aa -runtime=30
Atached are flame graphs for CAS 19.9 + kernel 4.19.
Trying to set one of the following (for any valid value)
casadm -X -n cleaning-acp -w 50 -i 1
casadm -X -n cleaning-acp -b 100 -i 1
gives the following output:
Error during options handling
Try `casadm --help | -H' for more information.
Kernel warning during nhit promotion policy operations:
[21563.552417] WARNING: CPU: 15 PID: 0 at block/blk-core.c:3255 blk_end_bidi_request+0x68/0x70
[21563.561406] Modules linked in: cas_cache(O) cas_disk(O) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables devlink bpfilter nls_utf8 rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache sunrpc intel_rapl iosf_mbi iTCO_wdt ioatdma mxm_wmi iTCO_vendor_support sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm_intel i2c_i801 lpc_ich intel_pch_thermal ipmi_si ipmi_devintf ipmi_msghandler wmi pcc_cpufreq acpi_pad ixgbe igb mdio i2c_algo_bit ptp crc32c_intel nvme i2c_core pps_core nvme_core dca xhci_pci xhci_hcd
[21563.629400] CPU: 15 PID: 0 Comm: swapper/15 Tainted: G O 4.19.56-t2.el7.twitter.x86_64 #1
[21563.639501] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1c 10/03/2016
[21563.648390] RIP: 0010:blk_end_bidi_request+0x68/0x70
[21563.653675] Code: 43 00 89 de 48 89 ef 49 89 c5 e8 63 fe ff ff 49 8b bc 24 c0 01 00 00 4c 89 ee e8 33 38 43 00 31 d2 5b 89 d0 5d 41 5c 41 5d c3 <0f> 0b eb b0 0f 1f 40 00 0f 1f 44 00 00 48 8b 07 48 83 b8 00 01 00
[21563.673317] RSP: 0018:ffff9c95e7bc3d68 EFLAGS: 00010286
[21563.678859] RAX: 0000000000000000 RBX: ffff9c95e44892d8 RCX: 0000000000000000
[21563.686341] RDX: 0000000000001000 RSI: 0000000000000000 RDI: ffff9c95a1b417c0
[21563.693795] RBP: ffff9c95a1b417c0 R08: ffff9c95e1316a00 R09: 0000000000001000
[21563.701293] R10: ffff9c95e051ff98 R11: ffff9c95e1316450 R12: ffff9c95e3935658
[21563.708745] R13: ffff9c95e13163f8 R14: 0000000000000000 R15: 0000000000000000
[21563.716217] FS: 0000000000000000(0000) GS:ffff9c95e7bc0000(0000) knlGS:0000000000000000
[21563.724885] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[21563.730946] CR2: 00007fc2c8894630 CR3: 000000015a20a004 CR4: 00000000003606e0
[21563.738426] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[21563.745874] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[21563.753359] Call Trace:
[21563.756135]
[21563.758498] blk_end_request_all+0x1f/0x40
[21563.762919] block_dev_complete_rq+0x6f/0x90 [cas_cache]
[21563.768592] ? ocf_req_complete+0xeb/0x150 [cas_cache]
[21563.774070] ? _ocf_read_generic_miss_complete+0x75/0xd0 [cas_cache]
[21563.780744] ? ocf_submit_volume_req_cmpl+0x16/0x20 [cas_cache]
[21563.787000] ? cas_bd_io_end+0x37/0x50 [cas_cache]
[21563.792128] ? cas_bd_io_end_callback+0x5e/0xb0 [cas_cache]
[21563.798037] ? blk_update_request+0x78/0x2c0
[21563.802626] ? scsi_end_request+0x2c/0x240
[21563.807054] ? scsi_io_completion+0x88/0x610
[21563.811640] ? blk_done_softirq+0x84/0xa0
[21563.815980] ? __do_softirq+0xe2/0x2dd
[21563.820064] ? irq_exit+0xc5/0xd0
[21563.823691] ? do_IRQ+0x85/0xd0
[21563.827169] ? common_interrupt+0xf/0xf
[21563.831350]
[21563.833764] ? cpuidle_enter_state+0xa2/0x310
[21563.838468] ? cpuidle_enter_state+0x93/0x310
[21563.843163] ? do_idle+0x1b6/0x210
[21563.846885] ? cpu_startup_entry+0x5f/0x70
[21563.851339] ? start_secondary+0x182/0x1c0
[21563.855751] ? secondary_startup_64+0xa4/0xb0
[21563.860455] ---[ end trace 650fe3afc3dad17c ]---
[21563.865452] WARNING: CPU: 15 PID: 0 at block/blk-core.c:3200 blk_finish_request+0x11c/0x150
[21563.874422] Modules linked in: cas_cache(O) cas_disk(O) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables devlink bpfilter nls_utf8 rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache sunrpc intel_rapl iosf_mbi iTCO_wdt ioatdma mxm_wmi iTCO_vendor_support sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm_intel i2c_i801 lpc_ich intel_pch_thermal ipmi_si ipmi_devintf ipmi_msghandler wmi pcc_cpufreq acpi_pad ixgbe igb mdio i2c_algo_bit ptp crc32c_intel nvme i2c_core pps_core nvme_core dca xhci_pci xhci_hcd
[21563.942349] CPU: 15 PID: 0 Comm: swapper/15 Tainted: G W O 4.19.56-t2.el7.twitter.x86_64 #1
[21563.952448] Hardware name: Supermicro SYS-5018D-FN4T/X10SDV-8C-TLN4F, BIOS 1.1c 10/03/2016
[21563.961327] RIP: 0010:blk_finish_request+0x11c/0x150
[21563.966608] Code: 6a 1a 00 00 48 8b 43 40 48 8d 53 40 48 39 c2 0f 84 4a ff ff ff 0f 0b 4c 89 ee 48 89 df e8 7c eb 00 00 8b 43 18 e9 1c ff ff ff <0f> 0b e9 07 ff ff ff 0f b6 43 14 83 e8 20 83 f8 03 0f 86 2c ff ff
[21563.986225] RSP: 0018:ffff9c95e7bc3d40 EFLAGS: 00010086
[21563.991768] RAX: 0000139c70051421 RBX: ffff9c95a1b417c0 RCX: 0000000000000018
[21563.999235] RDX: 000008a5cc704076 RSI: 002327a315fb43d7 RDI: ffffffff85234040
[21564.006687] RBP: ffff9c95e3935658 R08: 0000000000000001 R09: 0000000000000000
[21564.014157] R10: ffff9c95e051ea00 R11: 0000000000006500 R12: 0000000000000000
[21564.021610] R13: 0000139c70051421 R14: 0000000000000000 R15: 0000000000000000
[21564.029081] FS: 0000000000000000(0000) GS:ffff9c95e7bc0000(0000) knlGS:0000000000000000
[21564.037776] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[21564.043837] CR2: 00007fc2c8894630 CR3: 000000015a20a004 CR4: 00000000003606e0
[21564.051319] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[21564.058773] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[21564.066244] Call Trace:
[21564.069042]
[21564.071388] blk_end_bidi_request+0x4d/0x70
[21564.075884] blk_end_request_all+0x1f/0x40
[21564.080336] block_dev_complete_rq+0x6f/0x90 [cas_cache]
[21564.085990] ? ocf_req_complete+0xeb/0x150 [cas_cache]
[21564.091464] ? _ocf_read_generic_miss_complete+0x75/0xd0 [cas_cache]
[21564.098159] ? ocf_submit_volume_req_cmpl+0x16/0x20 [cas_cache]
[21564.104427] ? cas_bd_io_end+0x37/0x50 [cas_cache]
[21564.109573] ? cas_bd_io_end_callback+0x5e/0xb0 [cas_cache]
[21564.115491] ? blk_update_request+0x78/0x2c0
[21564.120092] ? scsi_end_request+0x2c/0x240
[21564.124535] ? scsi_io_completion+0x88/0x610
[21564.129142] ? blk_done_softirq+0x84/0xa0
[21564.133479] ? __do_softirq+0xe2/0x2dd
[21564.137560] ? irq_exit+0xc5/0xd0
[21564.141212] ? do_IRQ+0x85/0xd0
[21564.144673] ? common_interrupt+0xf/0xf
[21564.148818]
[21564.151271] ? cpuidle_enter_state+0xa2/0x310
[21564.155980] ? cpuidle_enter_state+0x93/0x310
[21564.160654] ? do_idle+0x1b6/0x210
[21564.164400] ? cpu_startup_entry+0x5f/0x70
[21564.168816] ? start_secondary+0x182/0x1c0
[21564.173244] ? secondary_startup_64+0xa4/0xb0
[21564.177915] ---[ end trace 650fe3afc3dad17d ]---
In Openstack environment, dozens of VMs shares one single cache. Some VMs may occupy too much cache and other VMs does not. It is not fair for the tenant of the VMs.
It would be great if open-cas can allocate a fixed size cache for a specific VM, the cache belongs only to this VM, cannot be shared.
ps: A VM may contains one or more core device.
Files mentioned in title have old io class define format, so in consequence there is not possibility to load io classes from this files.
When having kernel-headers package installed, but missing kernel-devel package Open CAS configure script shows misleading error:
# ./configure
Error: missing kernel headers
It would be expected for configure script to either shows error like 'missing kernel headers and/or kernel devel' or even better to distinguish between this two cases.
OS: RHEL 8.1
Kernel: 4.18.0-147.el8.x86_64
After following the admin guide installation instructions exactly, I'm finding that the kernel modules are not loading on reboot.
stack@ceph01liang: ~ / open-cas-linux$ sudo make install
cd modules && make install
make[1]: Entering directory '/opt/stack/open-cas-linux/modules'
Installing Open-CAS modules
modprobe: ERROR: could not insert 'cas_cache': Unknown symbol in module, or unknown parameter (see dmesg)
make[1]: *** [Makefile:58: install] Error 1
make[1]: Leaving directory '/opt/stack/open-cas-linux/modules'
make: *** [Makefile:15: modules] Error 2
stack@ceph01liang: ~ / open-cas-linux$ uname -r
5.0.0-29-generic
stack@ceph01liang:~/open-cas-linux$ dmesg | grep cas_disk
[91506.368989] cas_disk: loading out-of-tree module taints kernel.
[91506.369108] cas_disk: module verification failed: signature and/or required key missing - tainting kernel
No reproduction so far
[ 311.242921] ------------[ cut here ]------------
[ 311.248403] WARNING: CPU: 17 PID: 43868 at mm/page_alloc.c:3189 __alloc_pages_slowpath+0x6f/0x724
[ 311.258838] Modules linked in: cas_cache(OE) cas_disk(OE) brd xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc dm_mirror dm_region_hash dm_log dm_mod sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support mei_me cryptd pcspkr sg joydev lpc_ich i2c_i801 mei ioatdma wmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic drm_kms_helper syscopyarea sysfillrect crct10dif_pclmul sysimgblt crct10dif_common fb_sys_fops ttm nvme crc32c_intel isci igb ahci drm libsas scsi_transport_sas libahci ptp nvme_core libata pps_core drm_panel_orientation_quirks dca i2c_algo_bit [last unloaded: cas_disk]
[ 311.378189] CPU: 17 PID: 43868 Comm: mkfs.ext4 Kdump: loaded Tainted: G OE ------------ 3.10.0-957.el7.x86_64 #1
[ 311.391341] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 311.404378] Call Trace:
[ 311.407394] [] dump_stack+0x19/0x1b
[ 311.413423] [] __warn+0xd8/0x100
[ 311.419164] [] warn_slowpath_null+0x1d/0x20
[ 311.425977] [] __alloc_pages_slowpath+0x6f/0x724
[ 311.433289] [] __alloc_pages_nodemask+0x405/0x420
[ 311.440692] [] alloc_pages_current+0x98/0x110
[ 311.447702] [] __get_free_pages+0xe/0x40
[ 311.454238] [] kmalloc_order_trace+0x2e/0xa0
[ 311.461160] [] ? cas_rpool_try_get+0x5e/0x80 [cas_cache]
[ 311.469242] [] __kmalloc+0x211/0x230
[ 311.475374] [] ocf_req_alloc_map+0x3e/0x60 [cas_cache]
[ 311.483251] [] ocf_req_alloc_map_discard+0x57/0x60 [cas_cache]
[ 311.492118] [] ocf_core_volume_submit_discard+0xdd/0x270 [cas_cache]
[ 311.501563] [] ? ocf_core_io_allocator_new+0x1c/0x20 [cas_cache]
[ 311.510619] [] ? ocf_io_new+0x78/0xf0 [cas_cache]
[ 311.518037] [] ocf_volume_submit_discard+0x33/0x80 [cas_cache]
[ 311.526925] +0x1a/0x20 [cas_cache]
[ 311.535186] [] _blockdev_make_request_fast+0x23c/0x370 [cas_cache]
[ 311.544445] [] ? generic_make_request_checks+0x27f/0x390
[ 311.552525] [] _casdsk_exp_obj_make_rq_fn+0xa3/0x1d0 [cas_disk]
[ 311.561478] [] generic_make_request+0x147/0x380
[ 311.568687] [] submit_bio+0x70/0x150
[ 311.574828] [] ? bio_alloc_bioset+0x115/0x310
[ 311.581849] [] blkdev_issue_discard+0x1fa/0x2d0
[ 311.589071] [] blk_ioctl_discard+0xe5/0x130
[ 311.595896] [] blkdev_ioctl+0x632/0xa20
[ 311.602313] [] block_ioctl+0x41/0x50
[ 311.608442] [] do_vfs_ioctl+0x3a0/0x5a0
[ 311.614860] [] ? vfs_write+0x168/0x1f0
[ 311.621179] [] SyS_ioctl+0xa1/0xc0
[ 311.627121] [] ? system_call_after_swapgs+0xa2/0x146
[ 311.634802] [] system_call_fastpath+0x22/0x27
[ 311.641809] [] ? system_call_after_swapgs+0xae/0x146
[ 311.649509] ---[ end trace 946d2bba3d539123 ]---
open-cas-linux commit: 3812015
The following is printed to dmesg when attempting to start cache with not enough RAM to store cache metadata:
[ 48.756416] 'OCF Core' volume operations registered
[ 48.782582] 'Block Device' volume operations registered
[ 48.784876] 'Atomic Writes NVMe' volume operations registered
[ 48.790116] [Open-CAS] Not found configuration for upgrade. Standard module initialization.
[ 48.794881] Open Cache Acceleration Software Linux Version 19.06.00.00000827 (3.10.0-693.el7.x86_64)::Module loaded successfully
[ 51.415805] Inserting cache cache1
[ 51.418145] Thread ocf_metadata_updater_cache1 started
[ 51.418149] cache1: Metadata initialized
[ 51.418323] cache1: Successfully added
[ 51.420240] cache1: Cache mode : wt
[ 51.422473] Thread cas_io_cache1_0 started
[ 51.422688] Thread cas_io_cache1_1 started
[ 51.422890] Thread cas_io_cache1_2 started
[ 51.423122] Thread cas_io_cache1_3 started
[ 51.423235] Thread cas_io_cache1_4 started
[ 51.423386] Thread cas_io_cache1_5 started
[ 51.423500] Thread cas_io_cache1_6 started
[ 51.423508] Thread cas_io_cache1_7 started
[ 51.423566] Thread cas_io_cache1_8 started
[ 51.423719] Thread cas_io_cache1_9 started
[ 51.423875] Thread cas_io_cache1_10 started
[ 51.423971] Thread cas_io_cache1_11 started
[ 51.424225] Thread cas_io_cache1_12 started
[ 51.424353] Thread cas_io_cache1_13 started
[ 51.424501] Thread cas_io_cache1_14 started
[ 51.424612] Thread cas_io_cache1_15 started
[ 51.424612] Thread cas_io_cache1_-1 started
[ 51.424612] [Open-CAS] [Classifier] Initialized IO classifier
[ 51.427412] cache1: Not enough free RAM for cache metadata to start cache
[ 51.430615] cache1: Available RAM: 32356962304 B
[ 51.432839] cache1: Needed RAM: 63837936025 B
[ 51.434943] cache1: Attaching cache device failed
[ 51.437311] BUG: unable to handle kernel paging request at ffffc90003ad9000
[ 51.438272] IP: [] ocf_volume_init+0x2b/0x110 [cas_cache]
[ 51.438272] PGD 17fc62067 PUD 17fc63067 PMD 174697067 PTE 0
[ 51.438272] Oops: 0002 [#1] SMP
[ 51.438272] Modules linked in: cas_cache(OE) null_blk ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter xfs libcrc32c cas_disk(OE) sg ppdev joydev pcspkr parport_pc parport i2c_piix4 ip_tables btrfs xor raid6_pq sd_mod crc_t10dif crct10dif_generic sr_mod cdrom crct10dif_common ata_generic virtio_blk pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix libata e1000 virtio_pci floppy
[ 51.438272] virtio_ring i2c_core serio_raw virtio [last unloaded: cas_cache]
[ 51.438272] CPU: 7 PID: 6027 Comm: casadm Tainted: G OE ------------ 3.10.0-693.el7.x86_64 #1
[ 51.438272] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[ 51.438272] task: ffff8807e92e3f40 ti: ffff8800bb614000 task.ti: ffff8800bb614000
[ 51.438272] RIP: 0010:[] [] ocf_volume_init+0x2b/0x110 [cas_cache]
[ 51.438272] RSP: 0018:ffff8800bb617c70 EFLAGS: 00010286
[ 51.438272] RAX: ffffffffc05c3200 RBX: ffffc90003ad9000 RCX: 0000000000000000
[ 51.438272] RDX: ffff8800bb617d70 RSI: ffff8807f3b5ba00 RDI: 0000000000000040
[ 51.438272] RBP: ffff8800bb617c90 R08: ffff8800bb617c90 R09: 0000000000000000
[ 51.438272] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90003cfa000
[ 51.438272] R13: ffffc90003ad9000 R14: 00007fff7c09f710 R15: ffffc90003a99020
[ 51.438272] FS: 00007f39dd745740(0000) GS:ffff88081f3c0000(0000) knlGS:0000000000000000
[ 51.438272] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 51.438272] CR2: ffffc90003ad9000 CR3: 0000000035ec4000 CR4: 00000000000006e0
[ 51.438272] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 51.438272] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 51.438272] Stack:
[ 51.438272] ffff8800bb617d70 ffffc90003cfa000 ffffc90003ad9000 00007fff7c09f710
[ 51.438272] ffff8800bb617cc8 ffffffffc05afd5f ffffc90003a98000 ffff8808147c8600
[ 51.438272] ffff8800bb617d70 00007fff7c09f710 0000000000000001 ffff8800bb617d48
[ 51.438272] Call Trace:
[ 51.438272] [] ocf_mngt_get_ram_needed+0x5f/0x100 [cas_cache]
[ 51.438272] [] cache_mngt_init_instance+0x22c/0x430 [cas_cache]
[ 51.438272] [] cas_service_ioctl_ctrl+0x16d0/0x1ad0 [cas_cache]
[ 51.438272] [] ? x2apic_send_IPI_mask+0x13/0x20
[ 51.438272] [] ? native_smp_send_reschedule+0x4c/0x70
[ 51.438272] [] ? resched_curr+0xa8/0xc0
[ 51.438272] [] do_vfs_ioctl+0x33d/0x540
[ 51.438272] [] ? file_has_perm+0x9f/0xb0
[ 51.438272] [] SyS_ioctl+0xa1/0xc0
[ 51.438272] [] ? SyS_clone+0x16/0x20
[ 51.438272] [] system_call_fastpath+0x16/0x1b
[ 51.438272] Code: 0f 1f 44 00 00 55 48 85 f6 48 89 e5 41 56 41 55 41 54 53 48 8b 06 48 89 fb 8b 78 0c 0f 84 96 00 00 00 48 85 db 0f 84 8d 00 00 00 <48> 89 33 c6 43 18 00 be d0 80 00 00 49 89 d4 41 89 cd e8 5e 1e
[ 51.438272] RIP [] ocf_volume_init+0x2b/0x110 [cas_cache]
[ 51.438272] RSP
[ 51.438272] CR2: ffffc90003ad9000
Can't stop the core flushing process with SIGINT. When running in background also can't kill the process with SIGKILL or SIGTERM although the message that the process is terminated or killed is displayed. Situation with cache flush looks similar.
Test Config:
Reproduce Steps:
80% fill up cache device with 4K seq write.
fio --filename=/dev/cas1-1 --direct=1 --thread --rw=write --bs=4k --size=4G --iodepth=128 --ioengine=libaio --numjobs=1 --group_reporting --name=4k_seq_write
Enable seq-cutoff, threshold=4KB
casadm -X -n seq-cutoff -p always -i 1 -j 1 -t 4
Reset CAS statistic counters
casadm -Z -i 1
4K seq read
fio --filename=/dev/cas1-1 --direct=1 --thread --rw=read --bs=4k --size=4G --iodepth=128 --ioengine=libaio --numjobs=1 --group_reporting --name=4k_seq_read
** Expected Result **
Similar 4K seq read performance as cache device's, because 4K seq read requests should be 100% cache hit. Even it is not 100% cache hit, it should equal to core device's performance.
Observed symptoms:
** CAS IO statistics **
casadm -P -i 1
Cache Id 1
Cache Size 1290176 [4KiB Blocks] / 4.92 [GiB]
Cache Device /dev/nvme2n1p1
Core Devices 1
Inactive Core Devices 0
Write Policy wb
Eviction Policy lru
Cleaning Policy nop
Promotion Policy always
Cache line size 4 [KiB]
Metadata Memory Footprint 84.4 [MiB]
Dirty for 13744 [s] / 3 [h] 49 [m] 4 [s]
Metadata Mode normal
Status Running
╔══════════════════╤═════════╤══════╤═════════════╗
║ Usage statistics │ Count │ % │ Units ║
╠══════════════════╪═════════╪══════╪═════════════╣
║ Occupancy │ 1048705 │ 81.3 │ 4KiB Blocks ║
║ Free │ 241471 │ 18.7 │ 4KiB Blocks ║
║ Clean │ 1048704 │ 81.3 │ 4KiB Blocks ║
║ Dirty │ 1 │ 0.0 │ 4KiB Blocks ║
╚══════════════════╧═════════╧══════╧═════════════╝
╔══════════════════════╤═════════╤═══════╤══════════╗
║ Request statistics │ Count │ % │ Units ║
╠══════════════════════╪═════════╪═══════╪══════════╣
║ Read hits │ 1 │ 0.0 │ Requests ║
║ Read partial misses │ 0 │ 0.0 │ Requests ║
║ Read full misses │ 0 │ 0.0 │ Requests ║
║ Read total │ 1 │ 0.0 │ Requests ║
╟──────────────────────┼─────────┼───────┼──────────╢
║ Write hits │ 0 │ 0.0 │ Requests ║
║ Write partial misses │ 0 │ 0.0 │ Requests ║
║ Write full misses │ 0 │ 0.0 │ Requests ║
║ Write total │ 0 │ 0.0 │ Requests ║
╟──────────────────────┼─────────┼───────┼──────────╢
║ Pass-Through reads │ 1048575 │ 100.0 │ Requests ║
║ Pass-Through writes │ 0 │ 0.0 │ Requests ║
║ Serviced requests │ 1 │ 0.0 │ Requests ║
╟──────────────────────┼─────────┼───────┼──────────╢
║ Total requests │ 1048576 │ 100.0 │ Requests ║
╚══════════════════════╧═════════╧═══════╧══════════╝
╔══════════════════════════════════╤═════════╤═══════╤═════════════╗
║ Block statistics │ Count │ % │ Units ║
╠══════════════════════════════════╪═════════╪═══════╪═════════════╣
║ Reads from core(s) │ 1048575 │ 50.0 │ 4KiB Blocks ║
║ Writes to core(s) │ 1048575 │ 50.0 │ 4KiB Blocks ║
║ Total to/from core(s) │ 2097150 │ 100.0 │ 4KiB Blocks ║
╟──────────────────────────────────┼─────────┼───────┼─────────────╢
║ Reads from cache │ 1048576 │ 100.0 │ 4KiB Blocks ║
║ Writes to cache │ 0 │ 0.0 │ 4KiB Blocks ║
║ Total to/from cache │ 1048576 │ 100.0 │ 4KiB Blocks ║
╟──────────────────────────────────┼─────────┼───────┼─────────────╢
║ Reads from exported object(s) │ 1048576 │ 100.0 │ 4KiB Blocks ║
║ Writes to exported object(s) │ 0 │ 0.0 │ 4KiB Blocks ║
║ Total to/from exported object(s) │ 1048576 │ 100.0 │ 4KiB Blocks ║
╚══════════════════════════════════╧═════════╧═══════╧═════════════╝
╔════════════════════╤═══════╤═════╤══════════╗
║ Error statistics │ Count │ % │ Units ║
╠════════════════════╪═══════╪═════╪══════════╣
║ Cache read errors │ 0 │ 0.0 │ Requests ║
║ Cache write errors │ 0 │ 0.0 │ Requests ║
║ Cache total errors │ 0 │ 0.0 │ Requests ║
╟────────────────────┼───────┼─────┼──────────╢
║ Core read errors │ 0 │ 0.0 │ Requests ║
║ Core write errors │ 0 │ 0.0 │ Requests ║
║ Core total errors │ 0 │ 0.0 │ Requests ║
╟────────────────────┼───────┼─────┼──────────╢
║ Total errors │ 0 │ 0.0 │ Requests ║
╚════════════════════╧═══════╧═════╧══════════╝
Kernel bug occured during cache stop:
[ 4799.934072] Cache cache1 successfully stopped
[ 4799.939197] [Open-CAS] [Classifier] Deinitialized IO classifier
[ 4799.946112] Thread cas_io_cache1_-1 stopped
[ 4799.951269] Thread ocf_metadata_updater_cache1 stopped
[ 4799.957307] ------------[ cut here ]------------
[ 4799.962688] kernel BUG at /root/open-cas-linux/modules/cas_cache/src/ocf/utils/utils_refcnt.c:19!
[ 4799.973045] invalid opcode: 0000 [#1] SMP
[ 4799.977873] Modules linked in: cas_cache(OE) cas_disk(OE) brd nls_utf8 isofs xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache loop sunrpc dm_mirror dm_region_hash dm_log dm_mod sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper iTCO_wdt cryptd iTCO_vendor_support mei_me joydev sg pcspkr mei lpc_ich wmi i2c_i801 ioatdma ipmi_si ipmi_devintf ipmi_msghandler ip_tables ext4 mbcache jbd2 sd_mod
[ 4800.060336] crc_t10dif crct10dif_generic drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops crct10dif_pclmul crct10dif_common ttm isci crc32c_intel igb ahci libsas nvme drm scsi_transport_sas libahci ptp nvme_core libata pps_core drm_panel_orientation_quirks dca i2c_algo_bit [last unloaded: cas_disk]
[ 4800.095851] CPU: 0 PID: 103074 Comm: casadm Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
[ 4800.108634] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 4800.121629] task: ffff9608cffa1040 ti: ffff960760804000 task.ti: ffff960760804000
[ 4800.130406] RIP: 0010:[] [] ocf_refcnt_dec+0x45/0x50 [cas_cache]
[ 4800.141184] RSP: 0018:ffff960760807c18 EFLAGS: 00010297
[ 4800.147350] RAX: 00000000ffffffff RBX: ffffa4acc49dc000 RCX: dead000000000200
[ 4800.155552] RDX: ffff960760807ca0 RSI: 00000000fff0bdbb RDI: ffffa4acc49dc5d0
[ 4800.163748] RBP: ffff960760807c30 R08: ffffa4accc420000 R09: 00000001802a0016
[ 4800.171958] R10: 00000000f6158001 R11: ffffe732c0d85600 R12: 00000000fff0bdbb
[ 4800.180156] R13: ffff960760807ca0 R14: ffffa4accc420000 R15: ffffa4acc4ae4b18
[ 4800.188374] FS: 00007f3a2e465740(0000) GS:ffff960aee000000(0000) knlGS:0000000000000000
[ 4800.197843] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4800.204475] CR2: 00007f37133dc0a0 CR3: 00000000ae0c6000 CR4: 00000000000607f0
[ 4800.212663] Call Trace:
[ 4800.215618] [] ? ocf_mngt_cache_put+0x20/0x60 [cas_cache]
[ 4800.223716] [] _ocf_mngt_cache_lock_complete+0x85/0x90 [cas_cache]
[ 4800.232887] [] _ocf_async_lock_run_waiters+0x4b/0x70 [cas_cache]
[ 4800.241865] [] ocf_async_lock_deinit+0x8d/0xb0 [cas_cache]
[ 4800.250060] [] ocf_mngt_cache_put+0x3b/0x60 [cas_cache]
[ 4800.257974] [] ocf_queue_put+0x49/0x60 [cas_cache]
[ 4800.265398] [] cache_mngt_exit_instance+0x177/0x260 [cas_cache]
[ 4800.274274] [] cas_service_ioctl_ctrl+0x1039/0x1e50 [cas_cache]
[ 4800.283166] [] ? sched_slice.isra.58+0x66/0xc0
[ 4800.290205] [] ? __enqueue_entity+0x78/0x80
[ 4800.296945] [] ? enqueue_entity+0x2ef/0xbe0
[ 4800.303690] [] ? cpumask_next_and+0x35/0x50
[ 4800.310423] [] ? enqueue_task_fair+0x208/0x6c0
[ 4800.317455] [] ? sched_clock_cpu+0x85/0xc0
[ 4800.324106] [] do_vfs_ioctl+0x3a0/0x5a0
[ 4800.330455] [] ? wake_up_new_task+0x131/0x1a0
[ 4800.337389] [] ? do_fork+0x107/0x320
[ 4800.343463] [] SyS_ioctl+0xa1/0xc0
[ 4800.349348] [] ? system_call_after_swapgs+0xa2/0x146
[ 4800.356957] [] system_call_fastpath+0x22/0x27
[ 4800.363882] [] ? system_call_after_swapgs+0xae/0x146
[ 4800.371487] Code: 75 0e b8 01 00 00 00 f0 0f b1 5f 08 85 c0 75 06 89 d8 5b 5d c3 90 48 8b 57 18 48 8b 47 10 48 89 d7 e8 d0 50 6e ca 89 d8 5b 5d c3 <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 53
[ 4800.394090] RIP [] ocf_refcnt_dec+0x45/0x50 [cas_cache]
[ 4800.402005] RSP
OS: Fedora 31
Commit: 9c7b485
LOG:
Invalid sequential cutoff policy!
Dec 18 07:44:38 localhost.localdomain kernel: WARNING: CPU: 3 PID: 2699 at /root/open-cas-linux/modules/cas_cache/src/ocf/engine/cache_engine.c:194 ocf_seq_cutoff_check+0x35/0x70 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: Modules linked in: cas_cache(OE) cas_disk(OE) xt_CHECKSUM xt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sunrpc snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm sb_edac snd_timer snd crct10dif_pclmul crc32_pclmul soundcore ghash_clmulni_intel joydev i2c_piix4 vboxguest intel_rapl_perf vmwgfx drm_kms_helper crc32c_intel ttm drm e1000 serio_raw ata_generic pata_acpi video [last unloaded: cas_disk]
Dec 18 07:44:38 localhost.localdomain kernel: CPU: 3 PID: 2699 Comm: systemd-udevd Tainted: G W OE 5.2.16-200.fc30.x86_64 #1
Dec 18 07:44:38 localhost.localdomain kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Dec 18 07:44:38 localhost.localdomain kernel: RIP: 0010:ocf_seq_cutoff_check+0x35/0x70 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: Code: 55 49 89 cd 41 54 49 89 d4 55 48 89 fd 53 89 f3 e8 b0 75 00 00 85 c0 74 1f 83 f8 02 74 0e 48 c7 c7 f8 cc 97 c0 e8 9a bb 78 db <0f> 0b 5b 44 89 f0 5d 41 5c 41 5d 41 5e c3 39 9d f0 00 00 00 75 ec
Dec 18 07:44:38 localhost.localdomain kernel: RSP: 0018:ffffb36549f4b868 EFLAGS: 00010286
Dec 18 07:44:38 localhost.localdomain kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Dec 18 07:44:38 localhost.localdomain kernel: RDX: 0000000000000001 RSI: ffffffff9dbd9641 RDI: 0000000000000246
Dec 18 07:44:38 localhost.localdomain kernel: RBP: ffffb3654a1c9c50 R08: 000002620e5cf242 R09: 0000000000000021
Dec 18 07:44:38 localhost.localdomain kernel: R10: ffffb36549f4b6b8 R11: ffffb36549f4b6bd R12: 0000004fffff9000
Dec 18 07:44:38 localhost.localdomain kernel: R13: 0000000000005000 R14: 0000000000000000 R15: 0000000000000040
Dec 18 07:44:38 localhost.localdomain kernel: FS: 00007fe68beb8940(0000) GS:ffff8fc40f780000(0000) knlGS:0000000000000000
Dec 18 07:44:38 localhost.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 18 07:44:38 localhost.localdomain kernel: CR2: 00005576758ec000 CR3: 00000003fc024001 CR4: 00000000000606e0
Dec 18 07:44:38 localhost.localdomain kernel: Call Trace:
Dec 18 07:44:38 localhost.localdomain kernel: ? ocf_get_effective_cache_mode+0x81/0xd0 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: ? ocf_core_submit_io_mode+0x453/0x490 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: ? cas_rpool_try_get+0x94/0xa0 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: ? env_allocator_new+0x49/0xa0 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: ? _block_dev_queue_request+0x836/0xad0 [cas_cache]
Dec 18 07:44:38 localhost.localdomain kernel: ? wbt_track+0x21/0x30
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_add_rq_to_plug+0x27/0x70
Dec 18 07:44:38 localhost.localdomain kernel: ? _casdsk_exp_obj_queue_rq+0x59/0x90 [cas_disk]
Dec 18 07:44:38 localhost.localdomain kernel: ? __blk_mq_try_issue_directly+0x121/0x1c0
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_mq_request_issue_directly+0x48/0xe0
Dec 18 07:44:38 localhost.localdomain kernel: ? generic_make_request+0xcf/0x320
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_mq_try_issue_list_directly+0x51/0xc0
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_mq_sched_insert_requests+0xa7/0xf0
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_mq_flush_plug_list+0x214/0x2b0
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_flush_plug_list+0xec/0x110
Dec 18 07:44:38 localhost.localdomain kernel: ? blk_finish_plug+0x21/0x2e
Dec 18 07:44:38 localhost.localdomain kernel: ? read_pages+0x80/0x1b0
Dec 18 07:44:38 localhost.localdomain kernel: ? 0xffffffff9c000000
Dec 18 07:44:38 localhost.localdomain kernel: ? __do_page_cache_readahead+0x168/0x1d0
Dec 18 07:44:38 localhost.localdomain kernel: ? force_page_cache_readahead+0x93/0x100
Dec 18 07:44:38 localhost.localdomain kernel: ? generic_file_read_iter+0x85d/0xe60
Dec 18 07:44:38 localhost.localdomain kernel: ? file_fdatawait_range+0x20/0x20
Dec 18 07:44:38 localhost.localdomain kernel: ? new_sync_read+0x12a/0x1c0
Dec 18 07:44:38 localhost.localdomain kernel: ? vfs_read+0x9d/0x150
Dec 18 07:44:38 localhost.localdomain kernel: ? ksys_read+0x5f/0xe0
Dec 18 07:44:38 localhost.localdomain kernel: ? do_syscall_64+0x5f/0x1a0
Dec 18 07:44:38 localhost.localdomain kernel: ? page_fault+0x8/0x30
Dec 18 07:44:38 localhost.localdomain kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 18 07:44:38 localhost.localdomain kernel: ---[ end trace 0fb21fbcffa4e526 ]---
When using ALRU cleaning policy and setting big values for flush-max-buffers
param completing even the simplest management commands takes very long time.
flush-max-buffers
param-P
, -L
multiple timesCommands complete instantaneously.
Commands can take very long time to complete (depending on when in relation to cleaning iteration starts you issue them)
Instead of actual CLI version, cas_disk version is printed.
When starting cache following error shows up in dmesg:
[ 264.001894] cache1: Discarding whole cache device failed
and the cache device is not discarded (not even attempted to discard), which has some performance impact.
It looks that error is thrown in the first iteration of discard loop within ocf_submit_volume_discard() function. The reason is that ocf_volume_new_io() return NULL since the 'bytes' parameter is not sector size (512 bytes) aligned. This is due to setting max_length to (uint32_t)~0, so all the 32 bits are set and the number is not divisible by 512.
I was trying opencas on rhel 7.6. I would like to build an rpm with the list of cas related binaries, configuration files. Unfortunately , I was not able to find the list of files to put them all together.Below are the list of files I was able to find:
$) ll /sbin/cas*
-rwxr-xr-x 1 root root 192112 Sep 16 14:19 /sbin/casadm
lrwxrwxrwx 1 root root 19 Sep 16 14:19 /sbin/casctl -> /lib/opencas/casctl
$) ll /lib/opencas/
total 40
-rwxr-xr-x 1 root root 4651 Sep 16 14:19 casctl
-rwxr-xr-x 1 root root 1736 Sep 16 14:19 open-cas-loader
-rwxr-xr-x 1 root root 815 Sep 16 14:19 open-cas-mount-utility
-rw-r--r-- 1 root root 23408 Sep 16 14:19 opencas.py
I'm getting the following kernel oops when loading cas_cache:
[ 76.666811] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
[ 76.676546] PGD 0 P4D 0
[ 76.679676] Oops: 0002 [#1] PREEMPT SMP [ 76.706648] Workqueue: events _cas_rpool_pre_alloc_do [cas_cache]
[ 76.714251] RIP: 0010:_cas_rpool_pre_alloc_do+0x54/0xa0 [cas_cache]
[ 76.722043] Code: 48 c1 e3 04 49 03 5 GS:ffff88a6e4300000(0000) knlGS:0000000000000000
[ 76.806438] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 76.813597] CR2: 0000000000000000 CR3: 000000002a012002 CR4: 00000000003606e0
[ 76.822470] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 76.831355] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 76.840252] Call Trace:
[ 76.843316] process_one_work+0x1d5/0x316
[ 76.848329] ? create_worker+0x1dc/0x1dc
[ 76.853214] worker_thread+0x1cd/0x2bf
[ 76.857848] ? create_worker+0x1dc/0x1dc
[ 76.862698] kthread+0x14f/0x157
I'm using open-cas-linux on v4.19.80 kernel and configured the cache device to be an NVMe optane drive.
My open-cas-linux sha1 is 06b3e69
Cache configuration:
WB | LINESIZE16KB | ACP | METADATA_LAYOUT=SEQUENTIAL | MAKE_REQUEST_FN
dmesg:
[1750126.837810] cache1: Flushing cache
[1750126.890931] cache1: Wrong number of dirty blocks (390656!=390657)
[1750216.042064] cache1: Flushing cache completed
[1750216.046847] ------------[ cut here ]------------
[1750216.051805] kernel BUG at /root/open-cas-linux/modules/cas_cache/src/ocf/mngt/ocf_mngt_flush.c:605!
[1750216.061268] invalid opcode: 0000 [#1] SMP
[1750216.065620] Modules linked in: cas_cache(OE) cas_disk(OE) xfs iotrace(OE) brd null_blk xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink nls_utf8 isofs loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc dm_mirror dm_region_hash dm_log dm_mod btusb btrtl btbcm btintel intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel ath3k aesni_intel lrw gf128mul glue_helper bluetooth joydev ablk_helper cryptd snd_hda_codec_hdmi eeepc_wmi snd_hda_codec_realtek asus_wmi snd_hda_codec_generic sparse_keymap rfkill snd_hda_intel pcspkr snd_hda_codec snd_seq iTCO_wdt iTCO_vendor_support snd_hda_core snd_hwdep sg snd_seq_device snd_pcm snd_timer snd mei_me shpchp soundcore mei lpc_ich i2c_i801 ip_tables ext4 mbcache jbd2 sr_mod sd_mod cdrom crc_t10dif crct10dif_generic nouveau video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci libahci drm e1000e crct10dif_pclmul crct10dif_common nvme crc32c_intel mxm_wmi libata ptp nvme_core serio_raw i2c_core pps_core wmi [last unloaded: cas_disk]
[1750216.185920] CPU: 11 PID: 14971 Comm: cas_io_cache1_- Kdump: loaded Tainted: G OE ------------ 3.10.0-862.el7.x86_64 #1
[1750216.198563] Hardware name: System manufacturer System Product Name/RAMPAGE IV EXTREME, BIOS 4701 11/18/2013
[1750216.208772] task: ffff9c23b6012f70 ti: ffff9c2345ee0000 task.ti: ffff9c2345ee0000
[1750216.216686] RIP: 0010:[] [] _ocf_mngt_flush_finish+0xc0/0xe0 [cas_cache]
[1750216.227071] RSP: 0018:ffff9c2345ee3dd0 EFLAGS: 00010202
[1750216.232785] RAX: 0000000000000001 RBX: ffffad9fcce60028 RCX: ffffad9ffaac8058
[1750216.240268] RDX: 0000000000000001 RSI: ffffad9fcce60028 RDI: ffffad9ffa99ebf0
[1750216.247747] RBP: ffff9c2345ee3df8 R08: ffffffffc0c4a330 R09: 0000000000000000
[1750216.255326] R10: ffff9c287fbd4400 R11: ffffdce70d114400 R12: ffffad9ffa99e000
[1750216.262892] R13: ffffad9fcce60000 R14: 0000000000000000 R15: 0000000000000000
[1750216.270376] FS: 0000000000000000(0000) GS:ffff9c28674c0000(0000) knlGS:0000000000000000
[1750216.278830] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1750216.284989] CR2: 00007fd7b21ea718 CR3: 0000000164672000 CR4: 00000000000607e0
[1750216.292475] Call Trace:
[1750216.295289] [] _ocf_pipeline_run_step+0x47/0xf0 [cas_cache]
[1750216.302959] [] ocf_io_handle+0x23/0x40 [cas_cache]
[1750216.309833] [] ocf_queue_run_single+0x45/0x50 [cas_cache]
[1750216.317325] [] ocf_queue_run+0x28/0x50 [cas_cache]
[1750216.324202] [] _cas_io_queue_thread+0xfa/0x150 [cas_cache]
[1750216.331777] [] ? wake_up_atomic_t+0x30/0x30
[1750216.337943] [] ? atomic_dev_deinit+0x40/0x40 [cas_cache]
[1750216.345342] [] kthread+0xd1/0xe0
[1750216.350622] [] ? insert_kthread_work+0x40/0x40
[1750216.357137] [] ret_from_fork_nospec_begin+0x21/0x21
[1750216.364095] [] ? insert_kthread_work+0x40/0x40
[1750216.370607] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 49 8b 86 f8 00 00 00 8b 40 04 85 c0 74 8d 0f 0b 4c 89 e7 e8 e8 fe ff ff 84 c0 0f 84 7b ff ff ff <0f> 0b 66 0f 1f 44 00 00 48 8b 73 28 48 8b 43 20 44 89 fa 4c 89
[1750216.390818] RIP [] _ocf_mngt_flush_finish+0xc0/0xe0 [cas_cache]
[1750216.398850] RSP
When define OCF_METADATA_HASH_DEBUG is set to 1 in src/metadata/metadata_hash.c file kernel panic occurs during cache start:
[ 496.322666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 496.330491] PGD 0 P4D 0
[ 496.333029] Oops: 0000 [#1] SMP PTI
[ 496.336522] CPU: 57 PID: 7506 Comm: casadm Kdump: loaded Tainted: G OE --------- - - 4.18.0-147.el8.x86_64 #1
[ 496.347642] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0009.092820190230 09/28/2019
[ 496.358079] RIP: 0010:ocf_log_raw+0x2c/0x80 [cas_cache]
[ 496.363300] Code: 44 00 00 55 48 89 e5 48 83 ec 50 48 89 4c 24 38 4c 89 44 24 40 4c 89 4c 24 48 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 <48> 8b 07 48 83 78 10 00 74 38 48 8d 4d 10 c7 04 24 18 00 00 00 48
[ 496.382045] RSP: 0018:ffffbf8983b2bb78 EFLAGS: 00010246
[ 496.387273] RAX: 0000000000000000 RBX: ffffbf8987695000 RCX: 0000000000000010
[ 496.394405] RDX: ffffffffc0e8c209 RSI: 0000000000000006 RDI: 0000000000000010
[ 496.401537] RBP: ffffbf8983b2bbc8 R08: ffffffffc0e88fb0 R09: 8000000000000063
[ 496.408668] R10: 0000000000000000 R11: 0000000000200000 R12: 0000000000000010
[ 496.415801] R13: ffffbf8983b2bcd8 R14: 00007ffd24874690 R15: ffff9a6b7691c068
[ 496.422933] FS: 00007fb9af5b9740(0000) GS:ffff9a6b7f940000(0000) knlGS:0000000000000000
[ 496.431019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 496.436767] CR2: 0000000000000010 CR3: 000000083e58e004 CR4: 00000000007606e0
[ 496.443897] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 496.451029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 496.458154] PKRU: 55555554
[ 496.460869] Call Trace:
[ 496.463329] ocf_metadata_hash_init+0x46/0x200 [cas_cache]
[ 496.468819] ocf_metadata_init+0x40/0x80 [cas_cache]
[ 496.473786] ocf_mngt_cache_start+0x2aa/0x760 [cas_cache]
[ 496.479182] ? cas_blk_identify_type_by_bdev+0xdc/0x170 [cas_cache]
[ 496.485449] cache_mngt_init_instance+0xc1/0x460 [cas_cache]
[ 496.491109] ? cache_mngt_prepare_cache_cfg+0x324/0x360 [cas_cache]
[ 496.497372] ? map_vm_area+0x32/0x40
[ 496.500954] cas_service_ioctl_ctrl+0x1695/0x18f0 [cas_cache]
[ 496.506698] ? select_task_rq_fair+0xa77/0xd00
[ 496.511141] ? selinux_file_ioctl+0x161/0x200
[ 496.515504] do_vfs_ioctl+0xa4/0x630
[ 496.519081] ksys_ioctl+0x60/0x90
[ 496.522398] __x64_sys_ioctl+0x16/0x20
[ 496.526151] do_syscall_64+0x5b/0x1b0
[ 496.529817] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 496.534869] RIP: 0033:0x7fb9aeeb8b2b
[ 496.538449] Code: 0f 1e fa 48 8b 05 5d a3 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d a3 2c 00 f7 d8 64 89 01 48
[ 496.557195] RSP: 002b:00007ffd24874508 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 496.564761] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fb9aeeb8b2b
[ 496.571893] RDX: 00007ffd24874690 RSI: ffffffffd040ba15 RDI: 0000000000000005
[ 496.579026] RBP: 00007ffd24874580 R08: 0000000000000000 R09: 00007fb9aedc5700
[ 496.586155] R10: 00007fb9aedc59d0 R11: 0000000000000246 R12: ffffffffd040ba15
[ 496.593279] R13: 00007ffd24874690 R14: 0000000000000005 R15: 00007ffd24874510
Looks like the reason is that owner field (which contains logger instance) within ocf_cache_t struct is not yet set when ocf_metadata_hash_init() is called.
Stack trace:
[164203.133681] XFS (cas1-1): Mounting V5 Filesystem
[164203.141577] XFS (cas1-1): Ending clean mount
[164205.121591] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[164205.130820] IP: [<ffffffffc081d9d2>] remove_lru_list+0x62/0x380 [cas_cache]
[164205.138906] PGD 80000001035ab067 PUD 3f0139067 PMD 0
[164205.144868] Oops: 0000 [#1] SMP
[164205.148771] Modules linked in: cas_cache(OE) cas_disk(OE) nls_utf8 isofs xfs brd xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink rpcsec_gss_krb5 auth_rpcgss nfsv4 loop dns_resolver nfs lockd grace fscache sunrpc dm_mirror dm_region_hash dm_log dm_mod iTCO_wdt iTCO_vendor_support gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev pcspkr ipmi_devintf i2c_i801 lpc_ich ioatdma ipmi_msghandler i7core_edac sg acpi_cpufreq ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_common drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel ahci drm_panel_orientation_quirks igb libahci nvme ptp libata pps_core nvme_core dca i2c_algo_bit [last unloaded: cas_disk]
[164205.246769] CPU: 4 PID: 15396 Comm: cas_io_cache1_4 Kdump: loaded Tainted: G W IOE ------------ 3.10.0-957.el7.x86_64 #1
[164205.272173] task: ffff8bb0112c6180 ti: ffff8bac045b0000 task.ti: ffff8bac045b0000
[164205.281047] RIP: 0010:[<ffffffffc081d9d2>] [<ffffffffc081d9d2>] remove_lru_list+0x62/0x380 [cas_cache]
[164205.292083] RSP: 0018:ffff8bac045b3c98 EFLAGS: 00010246
[164205.298301] RAX: 0000000000000000 RBX: ffffa1f7c1d8d000 RCX: 0000000000000000
[164205.306732] RDX: 0000000000000000 RSI: ffffa1f7c2329dd8 RDI: ffff8bac045b3ca8
[164205.315231] RBP: ffff8bac045b3ce8 R08: ffffffffc0833ee0 R09: 00000000000000ff
[164205.323665] R10: 000000000000003f R11: 00000000000167ba R12: 00000000000167ba
[164205.332131] R13: 000000000002fc30 R14: ffffa1f7c1d8e848 R15: 0000000000000000
[164205.340562] FS: 0000000000000000(0000) GS:ffff8bb01f280000(0000) knlGS:0000000000000000
[164205.350092] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[164205.356794] CR2: 0000000000000018 CR3: 0000000219866000 CR4: 00000000000007e0
[164205.365252] Call Trace:
[164205.368280] [<ffffffffc081e069>] evp_lru_rm_cline+0x89/0xb0 [cas_cache]
[164205.376092] [<ffffffffc0820fb4>] __set_cache_line_invalid+0x134/0x160 [cas_cache]
[164205.385044] [<ffffffffc0821089>] set_cache_line_invalid+0xa9/0xf0 [cas_cache]
[164205.393587] [<ffffffffc082fdf0>] ocf_write_wi_update_and_flush_metadata+0xd0/0x190 [cas_cache]
[164205.403785] [<ffffffffc083dd3c>] ocf_io_handle+0x3c/0x50 [cas_cache]
[164205.411280] [<ffffffffc083dd95>] ocf_queue_run_single+0x45/0x50 [cas_cache]
[164205.419443] [<ffffffffc083ddc8>] ocf_queue_run+0x28/0x50 [cas_cache]
[164205.426928] [<ffffffffc08080fa>] _cas_io_queue_thread+0xfa/0x150 [cas_cache]
[164205.435399] [<ffffffff8d6c2d00>] ? wake_up_atomic_t+0x30/0x30
[164205.442207] [<ffffffffc0808000>] ? 0xffffffffc0807fff
[164205.448217] [<ffffffff8d6c1c31>] kthread+0xd1/0xe0
[164205.453951] [<ffffffff8d6c1b60>] ? insert_kthread_work+0x40/0x40
[164205.461089] [<ffffffff8dd74c37>] ret_from_fork_nospec_begin+0x21/0x21
[164205.468669] [<ffffffff8d6c1b60>] ? insert_kthread_work+0x40/0x40
[164205.475766] Code: 8d 04 76 4c 8d 74 c7 60 41 39 d5 0f 86 1e 03 00 00 48 8b 87 38 04 00 00 41 89 cf 48 8d 55 b8 44 89 e6 e8 a2 8d 16 cd 49 8b 46 08 <44> 3b 60 18 0f 94 c1 44 3b 60 10 40 0f 94 c6 44 3b 60 1c 0f 94
[164205.498396] RIP [<ffffffffc081d9d2>] remove_lru_list+0x62/0x380 [cas_cache]
[164205.506577] RSP <ffff8bac045b3c98>
[164205.510749] CR2: 0000000000000018
Preceeded by multiple failures:
[164201.849619] mkfs.xfs: page allocation failure: order:10, mode:0x10c010
[164201.857240] CPU: 10 PID: 27884 Comm: mkfs.xfs Kdump: loaded Tainted: G W IOE ------------ 3.10.0-957.el7.x86_64 #1
[164201.882061] Call Trace:
[164201.885117] [<ffffffff8dd61dc1>] dump_stack+0x19/0x1b
[164201.891180] [<ffffffff8d7bc830>] warn_alloc_failed+0x110/0x180
[164201.898116] [<ffffffff8dd5d3ce>] __alloc_pages_slowpath+0x6b6/0x724
[164201.905541] [<ffffffff8d7c0e95>] __alloc_pages_nodemask+0x405/0x420
[164201.912956] [<ffffffff8d80dcf8>] alloc_pages_current+0x98/0x110
[164201.919976] [<ffffffff8d7bb5de>] __get_free_pages+0xe/0x40
[164201.926528] [<ffffffff8d81938e>] kmalloc_order_trace+0x2e/0xa0
[164201.933457] [<ffffffffc0811e1e>] ? cas_rpool_try_get+0x5e/0x80 [cas_cache]
[164201.941564] [<ffffffff8d81d3d1>] __kmalloc+0x211/0x230
[164201.947733] [<ffffffffc081937e>] ocf_req_alloc_map+0x3e/0x60 [cas_cache]
[164201.955654] [<ffffffffc08193f7>] ocf_req_alloc_map_discard+0x57/0x60 [cas_cache]
[164201.964553] [<ffffffffc082c6bd>] ocf_core_volume_submit_discard+0xdd/0x270 [cas_cache]
[164201.974005] [<ffffffffc082c1bc>] ? ocf_core_io_allocator_new+0x1c/0x20 [cas_cache]
[164201.983074] [<ffffffffc082ba78>] ? ocf_io_new+0x78/0xf0 [cas_cache]
[164201.990506] [<ffffffffc083cb53>] ocf_volume_submit_discard+0x33/0x80 [cas_cache]
[164201.999448] [<ffffffffc083ca2a>] ? ocf_volume_new_io+0x1a/0x20 [cas_cache]
[164202.007558] [<ffffffffc080defc>] _blockdev_make_request_fast+0x23c/0x370 [cas_cache]
[164202.016813] [<ffffffff8d942eaf>] ? generic_make_request_checks+0x27f/0x390
[164202.024920] [<ffffffffc0648ee3>] _casdsk_exp_obj_make_rq_fn+0xa3/0x1d0 [cas_disk]
[164202.033871] [<ffffffff8d944c87>] generic_make_request+0x147/0x380
[164202.041097] [<ffffffff8d944f30>] submit_bio+0x70/0x150
[164202.047327] [<ffffffff8d87e2e5>] ? bio_alloc_bioset+0x115/0x310
[164202.054432] [<ffffffff8d94d52a>] blkdev_issue_discard+0x1fa/0x2d0
[164202.061717] [<ffffffff8d956ac5>] blk_ioctl_discard+0xe5/0x130
[164202.068584] [<ffffffff8d957612>] blkdev_ioctl+0x632/0xa20
[164202.075033] [<ffffffff8d881401>] ? __blkdev_put+0x141/0x1a0
[164202.081673] [<ffffffff8d880761>] block_ioctl+0x41/0x50
[164202.087831] [<ffffffff8d856210>] do_vfs_ioctl+0x3a0/0x5a0
[164202.094377] [<ffffffff8d8640a4>] ? mntput+0x24/0x40
[164202.100321] [<ffffffff8d843476>] ? __fput+0x186/0x260
[164202.106424] [<ffffffff8d8564b1>] SyS_ioctl+0xa1/0xc0
[164202.112475] [<ffffffff8dd74d15>] ? system_call_after_swapgs+0xa2/0x146
[164202.120169] [<ffffffff8dd74ddb>] system_call_fastpath+0x22/0x27
[164202.127208] [<ffffffff8dd74d21>] ? system_call_after_swapgs+0xae/0x146
Cache line id is 32 bit long, so with small cache line size and big cache there may be more cache lines on the cache device than we can address.
Got metadata corrupted after system reboot.
After changing cache mode from PT to WB kernel bug appears:
[75237.190008] cache1: Changing cache mode from 'Pass Through' to 'Write Back' successful
[75237.832680] BUG: unable to handle kernel paging request at ffffa273a25d5168
[75237.840949] PGD 107d2d067 P4D 107d2d067 PUD 447c01067 PMD 42aed0067 PTE 0
[75237.848997] Oops: 0000 [#1] SMP PTI
[75237.853359] CPU: 35 PID: 31128 Comm: ocf_metadata_up Tainted: G O 4.19.56-t2.el7.twitter.x86_64 #1
[75237.865661] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.02.02.0002.122320131210 12/23/2013
[75237.878011] RIP: 0010:ocf_engine_push_req_front+0xa4/0x110 [cas_cache]
[75237.885766] Code: 8b 4c 24 08 4c 89 61 08 48 89 8d 00 01 00 00 4c 89 b5 08 01 00 00 4c 89 63 10 f0 ff 43 08 48 8b 34 24 4c 89 ff e8 7c a6 0b eb <80> bd 20 01 00 00 00 78 17 48 8b 3d ec ec a8 eb e8 d7 ef 9d ea 48
[75237.907987] RSP: 0018:ffffa273845a7e20 EFLAGS: 00010282
[75237.914281] RAX: 0000000000000001 RBX: ffff8b8e4b809a20 RCX: ffff8b8e4b809a30
[75237.922712] RDX: ffff8b8e4b809a30 RSI: 0000000000000282 RDI: ffff8b8e4b809a40
[75237.931145] RBP: ffffa273a25d5048 R08: ffff8b8e4b809a30 R09: ffff8b8b85850098
[75237.939580] R10: ffffa273845a7ca8 R11: 0000000000000000 R12: ffffa273a25d5148
[75237.948013] R13: 0000000000000001 R14: ffff8b8e4b809a30 R15: ffff8b8e4b809a40
[75237.956465] FS: 0000000000000000(0000) GS:ffff8b8e739c0000(0000) knlGS:0000000000000000
[75237.966346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[75237.973223] CR2: ffffa273a25d5168 CR3: 000000015c20a003 CR4: 00000000001606e0
[75237.981662] Call Trace:
[75237.984854] ? ocf_metadata_updater_run+0x163/0x1b0 [cas_cache]
[75237.991929] ? _cas_cleaner_complete+0x20/0x20 [cas_cache]
[75237.998520] ? _cas_metadata_updater_thread+0x51/0xd0 [cas_cache]
[75238.005796] ? finish_wait+0x80/0x80
[75238.010246] ? kthread+0x113/0x130
[75238.014498] ? kthread_create_worker_on_cpu+0x50/0x50
[75238.020601] ? ret_from_fork+0x35/0x40
[75238.025240] Modules linked in: cas_cache(O) cas_disk(O) nls_utf8 xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables bpfilter devlink rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache sunrpc intel_rapl iosf_mbi iTCO_wdt iTCO_vendor_support sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm_intel ioatdma ipmi_si ipmi_devintf ipmi_msghandler i2c_i801 lpc_ich wmi pcc_cpufreq isci igb libsas ptp crc32c_intel scsi_transport_sas pps_core i2c_algo_bit i2c_core dca nvme nvme_core [last unloaded: cas_disk]
[75238.101405] CR2: ffffa273a25d5168
[75238.105565] ---[ end trace 05612fd206589865 ]---
[75238.413095] RIP: 0010:ocf_engine_push_req_front+0xa4/0x110 [cas_cache]
[75238.420864] Code: 8b 4c 24 08 4c 89 61 08 48 89 8d 00 01 00 00 4c 89 b5 08 01 00 00 4c 89 63 10 f0 ff 43 08 48 8b 34 24 4c 89 ff e8 7c a6 0b eb <80> bd 20 01 00 00 00 78 17 48 8b 3d ec ec a8 eb e8 d7 ef 9d ea 48
[75238.443090] RSP: 0018:ffffa273845a7e20 EFLAGS: 00010282
[75238.449389] RAX: 0000000000000001 RBX: ffff8b8e4b809a20 RCX: ffff8b8e4b809a30
[75238.457831] RDX: ffff8b8e4b809a30 RSI: 0000000000000282 RDI: ffff8b8e4b809a40
[75238.466270] RBP: ffffa273a25d5048 R08: ffff8b8e4b809a30 R09: ffff8b8b85850098
[75238.474709] R10: ffffa273845a7ca8 R11: 0000000000000000 R12: ffffa273a25d5148
[75238.483154] R13: 0000000000000001 R14: ffff8b8e4b809a30 R15: ffff8b8e4b809a40
[75238.491589] FS: 0000000000000000(0000) GS:ffff8b8e739c0000(0000) knlGS:0000000000000000
[75238.501481] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[75238.508363] CR2: ffffa273a25d5168 CR3: 000000015c20a003 CR4: 00000000001606e0
Steps to reproduce:
modprobe cas_cache metadata_layout=1 unaligned_io=0 use_io_scheduler=0
casadm --start-cache --cache-device <dev_id> --cache-mode wb --cache-line-size 64
(Starting cache failed due to existing previous cache's metadata on cache device)
Dmesg after unsuccessful cache start:
[ 248.269482] cache1: Attaching cache device failed
[ 248.274921] Thread cas_io_cache1_25 stopped
[ 248.274932] Thread cas_io_cache1_19 stopped
[ 248.274938] Thread cas_io_cache1_16 stopped
[ 248.274950] Thread cas_io_cache1_9 stopped
[ 248.274956] Thread cas_io_cache1_14 stopped
[ 248.274958] Thread cas_io_cache1_30 stopped
[ 248.274961] Thread cas_io_cache1_15 stopped
[ 248.274963] Thread cas_io_cache1_31 stopped
[ 248.274965] Thread cas_io_cache1_0 stopped
[ 248.274965] Thread cas_io_cache1_3 stopped
[ 248.274979] Thread cas_io_cache1_24 stopped
[ 248.274981] Thread cas_io_cache1_8 stopped
[ 248.274995] Thread cas_io_cache1_26 stopped
[ 248.274996] Thread cas_io_cache1_10 stopped
[ 248.274997] Thread cas_io_cache1_11 stopped
[ 248.274998] Thread cas_io_cache1_27 stopped
[ 248.274999] Thread cas_io_cache1_12 stopped
[ 248.275000] Thread cas_io_cache1_28 stopped
[ 248.275001] Thread cas_io_cache1_29 stopped
[ 248.275002] Thread cas_io_cache1_13 stopped
[ 248.275003] cache1: Cache cache1 successfully stopped
[ 248.275008] [Open-CAS] [Classifier] Deinitialized IO classifier
[ 248.275010] Thread cas_io_cache1_1 stopped
[ 248.275010] Thread cas_io_cache1_17 stopped
[ 248.275012] Thread cas_io_cache1_18 stopped
[ 248.275013] Thread cas_io_cache1_2 stopped
[ 248.275014] Thread cas_io_cache1_4 stopped
[ 248.275015] Thread cas_io_cache1_20 stopped
[ 248.275016] Thread cas_io_cache1_21 stopped
[ 248.275017] Thread cas_io_cache1_5 stopped
[ 248.275018] Thread cas_io_cache1_22 stopped
[ 248.275019] Thread cas_io_cache1_6 stopped
[ 248.275020] Thread cas_io_cache1_23 stopped
[ 248.275021] Thread cas_io_cache1_7 stopped
[ 248.386529] ata2.00: Enabling discard_zeroes_data
modprobe -r cas_cache
Kernel oops detested on serial console:
[ 277.459743] BUG: unable to handle kernel paging request at ffffffffc05ba0c2
[ 277.467699] PGD 64220c067 P4D 64220c067 PUD 64220e067 PMD 827fac067 PTE 0
[ 277.475441] Oops: 0010 [#1] SMP PTI
[ 277.479486] CPU: 3 PID: 23121 Comm: cas_io_cache1_- Tainted: G O 4.19.56-t2.el7.twitter.x86_64 #1
[ 277.491109] Hardware name: Intel Corporation S2600GZ ........../S2600GZ, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[ 277.503885] RIP: 0010:0xffffffffc05ba0c2
[ 277.508415] Code: Bad RIP value.
[ 277.512165] RSP: 0018:ffffbdc284c7fec0 EFLAGS: 00010202
[ 277.518152] RAX: 0000000000000001 RBX: ffff9b19d2aa6120 RCX: 0000000000000000
[ 277.526283] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff9b19d2aa6120
[ 277.534415] RBP: 0000000000000000 R08: ffff9b15ef8e2038 R09: 00000000000002a3
[ 277.542546] R10: ffffbdc28397bda8 R11: fefefefefefefeff R12: ffff9b12c599f8c0
[ 277.550688] R13: ffff9b12c599f910 R14: ffff9b12c599f8c8 R15: ffffffffc05ba000
[ 277.558823] FS: 0000000000000000(0000) GS:ffff9b15ef8c0000(0000) knlGS:0000000000000000
[ 277.568094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 277.574665] CR2: ffffffffc05ba098 CR3: 000000064220a005 CR4: 00000000000606e0
[ 277.582786] Call Trace:
[ 277.585677] ? finish_wait+0x80/0x80
[ 277.589820] ? kthread+0x113/0x130
[ 277.593779] ? kthread_create_worker_on_cpu+0x50/0x50
[ 277.599577] ? ret_from_fork+0x35/0x40
[ 277.603912] Modules linked in: xfs nls_utf8 xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables devlink bpfilter rpcsec_gss_krb5 auth_rpcgss oid_registry nfsv4 nfs lockd grace fscache sunrpc intel_rapl iosf_mbi sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel kvm_intel iTCO_wdt iTCO_vendor_support mpt3sas ipmi_si ipmi_devintf ipmi_msghandler ioatdma raid_class i2c_i801 lpc_ich wmi pcc_cpufreq igb isci ptp crc32c_intel libsas pps_core scsi_transport_sas i2c_algo_bit nvme i2c_core nvme_core dca [last unloaded: cas_disk]
[ 277.676855] CR2: ffffffffc05ba0c2
[ 277.680705] ---[ end trace 843c9a3cb25b1d23 ]---
[ 277.710148] RIP: 0010:0xffffffffc05ba0c2
[ 277.714701] Code: Bad RIP value.
[ 277.718472] RSP: 0018:ffffbdc284c7fec0 EFLAGS: 00010202
[ 277.724466] RAX: 0000000000000001 RBX: ffff9b19d2aa6120 RCX: 0000000000000000
[ 277.732602] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff9b19d2aa6120
[ 277.740732] RBP: 0000000000000000 R08: ffff9b15ef8e2038 R09: 00000000000002a3
[ 277.748860] R10: ffffbdc28397bda8 R11: fefefefefefefeff R12: ffff9b12c599f8c0
[ 277.756987] R13: ffff9b12c599f910 R14: ffff9b12c599f8c8 R15: ffffffffc05ba000
[ 277.765113] FS: 0000000000000000(0000) GS:ffff9b15ef8c0000(0000) knlGS:0000000000000000
[ 277.774387] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 277.780960] CR2: ffffffffc05ba098 CR3: 000000064220a005 CR4: 00000000000606e0
[ 278.886770] [CAS Disk] CAS Disk Version 20.01.00.00000220 (4.19.56-t2.el7.twitter.x86_64)::Module loaded successfully
[ 279.196346] 'OCF Core' volume operations registered
[ 279.245800] 'Block_Device' volume operations registered
[ 279.251820] [Open-CAS] Not found configuration for upgrade. Standard module initialization.
[ 279.261495] Open Cache Acceleration Software Linux Version 20.01.00.00000220_
Is there any possibility that we might lose data?
Since thread names are trimmed to 16 characters, the result is missing the vital part. For example typically management thread name is "cas_io_cache1_-".
Following command:
casadm --stats --cache-id 1 --io-class-id
successfully returns statistics for all defined io classes in stdout, but the exit code is equal to 1.
After unplugging core from cache, core status is still active.
Steps to reproduce:
casadm -S -d /dev/nvme1n1p1 && casadm -A -i 1 -d /dev/sdc
echo 1 > /sys/block/sdc/device/delete
casadm -P -i 1
Expected:
Inactive Core Devices: 1
Actual status:
Inactive Core Devices: 0
When using casctl init to initialize the cache, there will be a kernel error.
The error code is as follows: kernel: NMI watchdog: bug: soft lock - CPU#0 stuck for 22s
CAS version: 19.9
System version: CentOS 7.7
Kernel version: 3.10.0-1062.9.1.el7.x86_64
Nvme SSD: Samsung 963 1.92t
Data disk: Seagate 900g * 24
How to solve this problem?
Some of the scripts which are used by open cas linux require python3
and python3
is not available everywhere by default. This should be required in order to install open cas linux.
This would fix things such as #119
Steps:
exit_code is 1 but no message is displayed
Tested on clean CAS installation.
Kernel: 3.10.0-957.el7.x86_64
OS: CENTOS76
Outputs after reboot:
casadm -L
output:
Device /dev/cas_ctrl not found
Is the kernel module loaded?
Error getting caches list
casadm -V
output:
Name | Version |
---|---|
CAS Cache Kernel Module | Not Loaded |
CAS Disk Kernel Module | Not Loaded |
CAS CLI Utility | 19.09.00.00000924 |
If package elfutils-libelf-devel is not installed, ./configure only lists not processed (all) conf files - there is no information about the actual problem.
pattern used: /dev/cas(\d)-(\d)
includes only caches and cores with one-digit ids
When modprobing cas_cache module on Fedora 31. Please find kernel log bellow.
commit: 9c7b485
Dec 18 07:33:21 localhost.localdomain kernel: [CAS Disk] CAS Disk Version (5.2.16-200.fc30.x86_64)::Module loaded successfully
Dec 18 07:33:21 localhost.localdomain kernel: 'OCF Core' volume operations registered
Dec 18 07:33:21 localhost.localdomain kernel: ------------[ cut here ]------------
Dec 18 07:33:21 localhost.localdomain kernel: WARNING: CPU: 7 PID: 2582 at mm/slab_common.c:93 kmem_cache_create_usercopy.cold+0x46/0xae
Dec 18 07:33:21 localhost.localdomain kernel: Modules linked in: cas_cache(OE+) cas_disk(OE) xt_CHECKSUM xt_MASQUERADE tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sunrpc snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm sb_edac snd_timer snd crct10dif_pclmul crc32_pclmul soundcore ghash_clmulni_intel joydev i2c_piix4 vboxguest intel_rapl_perf vmwgfx drm_kms_helper crc32c_intel ttm drm e1000 serio_raw ata_generic pata_acpi video [last unloaded: cas_disk]
Dec 18 07:33:21 localhost.localdomain kernel: CPU: 7 PID: 2582 Comm: modprobe Tainted: G W OE 5.2.16-200.fc30.x86_64 #1
Dec 18 07:33:21 localhost.localdomain kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Dec 18 07:33:21 localhost.localdomain kernel: RIP: 0010:kmem_cache_create_usercopy.cold+0x46/0xae
Dec 18 07:33:21 localhost.localdomain kernel: Code: ed ff e8 f8 b3 73 00 e9 bd f6 ff ff 44 89 f2 4c 89 e6 48 c7 c7 10 37 14 9d e8 a9 6a e7 ff 48 c7 c7 70 c0 10 9d e8 b3 d9 ed ff <0f> 0b e9 1c f6 ff ff 4c 89 e6 48 c7 c7 e0 36 14 9d 41 be ea ff ff
Dec 18 07:33:21 localhost.localdomain kernel: RSP: 0018:ffffb3654293fb50 EFLAGS: 00010246
Dec 18 07:33:21 localhost.localdomain kernel: RAX: 0000000000000024 RBX: 0000000000000000 RCX: 0000000000000000
Dec 18 07:33:21 localhost.localdomain kernel: RDX: 0000000000000000 RSI: ffff8fc40f997908 RDI: ffff8fc40f997908
Dec 18 07:33:21 localhost.localdomain kernel: RBP: 00000000000000b8 R08: ffff8fc40f997908 R09: 00000000000117dd
Dec 18 07:33:21 localhost.localdomain kernel: R10: ffffb3654293fa08 R11: ffffb3654293fa0d R12: ffff8fc4004b8280
Dec 18 07:33:21 localhost.localdomain kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Dec 18 07:33:21 localhost.localdomain kernel: FS: 00007fc48e103740(0000) GS:ffff8fc40f980000(0000) knlGS:0000000000000000
Dec 18 07:33:21 localhost.localdomain kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 18 07:33:21 localhost.localdomain kernel: CR2: 00007fe22bb206d0 CR3: 00000003ee654006 CR4: 00000000000606e0
Dec 18 07:33:21 localhost.localdomain kernel: Call Trace:
Dec 18 07:33:21 localhost.localdomain kernel: kmem_cache_create+0x12/0x20
Dec 18 07:33:21 localhost.localdomain kernel: env_allocator_create+0x79/0xd0 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? ocf_io_allocator_default_init+0x14/0x30 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? ocf_volume_type_init+0x9e/0xf0 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? _cas_ctx_data_munlock+0x10/0x10 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? ocf_ctx_register_volume_type_extended+0x11a/0x1a0 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? 0xffffffffc068f000
Dec 18 07:33:21 localhost.localdomain kernel: ? block_dev_init+0x1d/0x30 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? cas_initialize_context+0x91/0xc0 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? cas_init_module+0xf7/0x1b0 [cas_cache]
Dec 18 07:33:21 localhost.localdomain kernel: ? do_one_initcall+0x46/0x1f4
Dec 18 07:33:21 localhost.localdomain kernel: ? _cond_resched+0x15/0x30
Dec 18 07:33:21 localhost.localdomain kernel: ? kmem_cache_alloc_trace+0x154/0x1c0
Dec 18 07:33:21 localhost.localdomain kernel: ? do_init_module+0x23/0x230
Dec 18 07:33:21 localhost.localdomain kernel: ? do_init_module+0x5c/0x230
Dec 18 07:33:21 localhost.localdomain kernel: ? load_module+0x2775/0x2950
Dec 18 07:33:21 localhost.localdomain kernel: ? __do_sys_finit_module+0xaa/0x110
Dec 18 07:33:21 localhost.localdomain kernel: ? __do_sys_finit_module+0xaa/0x110
Dec 18 07:33:21 localhost.localdomain kernel: ? do_syscall_64+0x5f/0x1a0
Dec 18 07:33:21 localhost.localdomain kernel: ? page_fault+0x8/0x30
Dec 18 07:33:21 localhost.localdomain kernel: ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 18 07:33:21 localhost.localdomain kernel: ---[ end trace 0fb21fbcffa4e456 ]---
Dec 18 07:33:21 localhost.localdomain kernel: 'Block Device' volume operations registered
Dec 18 07:33:21 localhost.localdomain kernel: [Open-CAS] Not found configuration for upgrade. Standard module initialization.
Dec 18 07:33:21 localhost.localdomain kernel: Open Cache Acceleration Software Linux Version (5.2.16-200.fc30.x86_64)::Module loaded successfully
casadm --script --add-core --try-add --cache-id 1 --core-device /dev/sdX should work regardless of (non)existing cache with cache_id
CAS Configuration:
WB/acp/cache_line_size=8/seq-cutoff=never[global]
ioengine=libaio
iodepth=16
size=200G
direct=1
runtime=300
time_based
filename=/dev/cas2-1
filename=/dev/cas1-1
filename=/dev/cas3-1
filename=/dev/cas4-1
numjobs=4
group_reporting=1
#[seq-read]
#rw=read
#bs=1M
#[rand-read]
#rw=randread
#bs=8k
[seq-write]
rw=write
bs=1M
#[rand-write]
#rw=randwrite
#bs=8k
Issue 2
open-cas-linux commit: 3812015
The following warning occurs after running FIO in WB mode:
[ 969.561114] BUG: scheduling while atomic: swapper/7/0/0x10010000
[ 969.562015] Modules linked in: cas_cache(OE) ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set
nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6tabl
e_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security i
ptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter xfs libcrc32c cas_disk(OE) ppdev sg pcspkr joydev parport
_pc parport i2c_piix4 ip_tables btrfs xor raid6_pq sd_mod crc_t10dif crct10dif_generic sr_mod cdrom crct10dif_common ata_generic virti
o_blk pata_acpi cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix libata e1000 virtio_pci virtio_ri
ng floppy i2c_core virtio serio_raw [last unloaded: cas_cache]
[ 969.562015] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE ------------ 3.10.0-693.el7.x86_64 #1
[ 969.562015] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.or
g 04/01/2014
[ 969.562015] ffff88017a11c000 b91d71f1fff90dd8 ffff880424fc3ac0 ffffffff816a3d91
[ 969.562015] ffff880424fc3ad0 ffffffff8169e202 ffff880424fc3b30 ffffffff816a93f7
[ 969.562015] ffff88017a120000 ffff88017a11ffd8 ffff88017a11ffd8 ffff88017a11ffd8
[ 969.562015] Call Trace:
[ 969.562015] [] dump_stack+0x19/0x1b
[ 969.562015] [] __schedule_bug+0x64/0x72
[ 969.562015] [] __schedule+0x807/0x8b0
[ 969.562015] [] __cond_resched+0x26/0x30
[ 969.562015] [] _cond_resched+0x3a/0x50
[ 969.562015] [] mutex_lock+0x12/0x2f
[ 969.562015] [] ocf_async_unlock+0x3c/0xa0 [cas_cache]
[ 969.562015] [] ocf_mngt_cache_unlock+0x1e/0x30 [cas_cache]
[ 969.562015] [] ocf_cleaner_run_complete+0x23/0x40 [cas_cache]
[ 969.562015] [] alru_clean_complete+0x4b/0x50 [cas_cache]
[ 969.562015] [] _ocf_cleaner_complete_req+0x4f/0x60 [cas_cache]
[ 969.562015] [] _ocf_cleaner_finish_req+0x23/0x30 [cas_cache]
[ 969.562015] [] _ocf_cleaner_flush_cache_io_end+0x39/0x50 [cas_cache]
[ 969.562015] [] cas_bd_io_end+0x40/0x50 [cas_cache]
[ 969.562015] [] cas_bd_io_end_callback+0x6e/0xe0 [cas_cache]
[ 969.562015] [] bio_endio+0x64/0xa0
[ 969.562015] [] blk_update_request+0x90/0x360
[ 969.562015] [] blk_mq_end_request+0x1a/0x70
[ 969.562015] [] blk_flush_complete_seq+0xd6/0x2c0
[ 969.562015] [] flush_end_io+0x160/0x250
[ 969.562015] [] blk_mq_end_request+0x3a/0x70
[ 969.562015] [] virtblk_request_done+0x3f/0x70 [virtio_blk]
[ 969.562015] [] __blk_mq_complete_request+0x7d/0x100
[ 969.562015] [] blk_mq_complete_request+0x27/0x30
[ 969.562015] [] virtblk_done+0x7c/0x110 [virtio_blk]
[ 969.562015] [] vring_interrupt+0x38/0x90 [virtio_ring]
[ 969.562015] [] __handle_irq_event_percpu+0x3e/0x1c0
[ 969.562015] [] handle_irq_event_percpu+0x32/0x80
[ 969.562015] [] handle_irq_event+0x3c/0x60
[ 969.562015] [] handle_edge_irq+0x77/0x130
[ 969.562015] [] handle_irq+0xb8/0x150
[ 969.562015] [] ? tick_check_idle+0x8c/0xd0
[ 969.562015] [] ? atomic_notifier_call_chain+0x1a/0x20
[ 969.562015] [] do_IRQ+0x4d/0xe0
[ 969.562015] [] common_interrupt+0x6d/0x6d
[ 969.562015] [] ? native_safe_halt+0x6/0x10
[ 969.562015] [] default_idle+0x1e/0xc0
[ 969.562015] [] arch_cpu_idle+0x26/0x30
[ 969.562015] [] cpu_startup_entry+0x14a/0x1c0
[ 969.562015] [] start_secondary+0x1b6/0x230
Are there any plans to support building with clang instead of gcc?
This would be great for OSes that build packages from source with clang by default.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.