Details
Type: Bug
Resolution: Unresolved
Priority: Major
Labels: None
Affects Version/s: Lustre 2.15.5
Environment:
Lustre server 2.15.5 RoCE
Lustre MGS 2.15.5 RoCE
Lustre client 2.15.5 RoCE
Severity: 3
Description
The Lustre clients and servers are deployed in VMs; the VMs use network-card PF (physical function) pass-through mode.
【OS】
VM Version: qemu-kvm-7.0.0
OS Version: Rocky 8.10
Kernel Version: 4.18.0-553.el8_10.x86_64
【Network Card】
Client:
MLX CX6 1*100G RoCE v2
MLNX_OFED_LINUX-23.10-3.2.2.0-rhel8.10-x86_64
Server:
MLX CX6 2*100G RoCE v2 bond
MLNX_OFED_LINUX-23.10-3.2.2.0-rhel8.10-x86_64
【BUG Info】
Reproducer:
- Mount Lustre over a RoCE network
- Trigger a restart of a Lustre MDT (mdt0, 10.255.153.128@o2ib)
- A crash then occurs on the other Lustre servers
Server call trace:
crash> bt
PID: 568423 TASK: ff4787632aa5c000 CPU: 5 COMMAND: "kworker/u40:0"
#0 [ff7d728b15e6baa0] machine_kexec at ffffffff8fa6f353
#1 [ff7d728b15e6baf8] __crash_kexec at ffffffff8fbbaa7a
#2 [ff7d728b15e6bbb8] crash_kexec at ffffffff8fbbb9b1
#3 [ff7d728b15e6bbd0] oops_end at ffffffff8fa2d831
#4 [ff7d728b15e6bbf0] no_context at ffffffff8fa81cf3
#5 [ff7d728b15e6bc48] __bad_area_nosemaphore at ffffffff8fa8206c
#6 [ff7d728b15e6bc90] do_page_fault at ffffffff8fa82cf7
#7 [ff7d728b15e6bcc0] page_fault at ffffffff906011ae
[exception RIP: kiblnd_cm_callback+2653]
RIP: ffffffffc0efe00d RSP: ff7d728b15e6bd70 RFLAGS: 00010246
RAX: 0000000000000007 RBX: ff7d728b15e6be08 RCX: 0000000000000000
RDX: ff4787632aa5c000 RSI: ff7d728b15e6be08 RDI: ff47876095213c00
RBP: ff47876095213c00 R8: 0000000000000000 R9: 006d635f616d6472
R10: 8080808080808080 R11: 0000000000000000 R12: ff47876095213c00
R13: 0000000000000000 R14: 0000000000000000 R15: ff47876095213de0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ff7d728b15e6bdd8] cma_cm_event_handler at ffffffffc04729a5 [rdma_cm]
#9 [ff7d728b15e6be00] cma_netevent_work_handler at ffffffffc04786b5 [rdma_cm]
#10 [ff7d728b15e6be90] process_one_work at ffffffff8fb195e3
#11 [ff7d728b15e6bed8] worker_thread at ffffffff8fb197d0
#12 [ff7d728b15e6bf10] kthread at ffffffff8fb20e24
#13 [ff7d728b15e6bf50] ret_from_fork at ffffffff9060028f
Server kernel log:
[69106.143672] LustreError: 11-0: lustre-MDT000a-osp-MDT0007: operation mds_statfs to node 10.255.153.128@o2ib failed: rc = -107
[69106.143700] Lustre: lustre-OST0038-osc-MDT0007: Connection to lustre-OST0038 (at 10.255.153.128@o2ib) was lost; in progress operations using this service will wait for recovery to complete
[69106.145053] LustreError: Skipped 29 previous similar messages
[69106.145060] Lustre: Skipped 203 previous similar messages
[69111.263490] Lustre: lustre-OST004e-osc-MDT0007: Connection to lustre-OST004e (at 10.255.153.128@o2ib) was lost; in progress operations using this service will wait for recovery to complete
[69111.263496] Lustre: Skipped 6 previous similar messages
[69112.506974] Lustre: lustre-OST0038-osc-MDT0007: Connection restored to 10.255.153.129@o2ib (at 10.255.153.129@o2ib)
[69112.506980] Lustre: Skipped 195 previous similar messages
[69116.383952] Lustre: lustre-MDT0000-lwp-MDT0007: Connection to lustre-MDT0000 (at 10.255.153.128@o2ib) was lost; in progress operations using this service will wait for recovery to complete
[69116.383965] Lustre: Skipped 3 previous similar messages
[69122.528880] Lustre: lustre-MDT000a-lwp-OST0013: Connection restored to 10.255.153.129@o2ib (at 10.255.153.129@o2ib)
[69122.528885] Lustre: Skipped 3 previous similar messages
[69127.659951] Lustre: lustre-OST0043-osc-MDT0007: Connection restored to 10.255.153.233@o2ib (at 10.255.153.233@o2ib)
[69127.659960] Lustre: Skipped 7 previous similar messages
[69138.672269] Lustre: lustre-OST002d-osc-MDT0007: Connection restored to 10.255.153.233@o2ib (at 10.255.153.233@o2ib)
[69138.672275] Lustre: Skipped 3 previous similar messages
[69158.168201] Lustre: lustre-MDT000a-osp-MDT0007: Connection restored to 10.255.153.129@o2ib (at 10.255.153.129@o2ib)
[69158.168206] Lustre: Skipped 1 previous similar message
[69178.775546] Lustre: lustre-MDT0000-osp-MDT0007: Connection restored to 10.255.153.233@o2ib (at 10.255.153.233@o2ib)
[69178.775554] Lustre: Skipped 11 previous similar messages
[69178.805333] Lustre: lustre-OST0008: deleting orphan objects from 0x0:13854 to 0x0:13889
[69178.805386] Lustre: lustre-OST0013: deleting orphan objects from 0x0:13854 to 0x0:13889
[69178.805505] Lustre: lustre-OST001e: deleting orphan objects from 0x0:13846 to 0x0:13921
[69178.805729] Lustre: lustre-OST003f: deleting orphan objects from 0x0:13822 to 0x0:13857
[69178.806122] Lustre: lustre-OST004a: deleting orphan objects from 0x0:13855 to 0x0:13889
[69178.806130] Lustre: lustre-OST0055: deleting orphan objects from 0x0:13854 to 0x0:13889
[69178.807518] Lustre: lustre-OST0029: deleting orphan objects from 0x0:13854 to 0x0:13889
[69178.807537] Lustre: lustre-OST0034: deleting orphan objects from 0x0:13854 to 0x0:13889
[69178.838633] LustreError: 39177:0:(qsd_reint.c:635:qqi_reint_delayed()) lustre-OST0034: Delaying reintegration for qtype:2 until pending updates are flushed.
[69178.840301] LustreError: 39177:0:(qsd_reint.c:635:qqi_reint_delayed()) Skipped 1 previous similar message
[69180.358640] LustreError: 37895:0:(qsd_reint.c:635:qqi_reint_delayed()) lustre-OST0013: Delaying reintegration for qtype:2 until pending updates are flushed.
[69180.359161] LustreError: 37895:0:(qsd_reint.c:635:qqi_reint_delayed()) Skipped 2 previous similar messages
[69239.967382] BUG: unable to handle kernel NULL pointer dereference at 000000000000004c
[69239.968927] PGD 0
[69239.969144] Oops: 0000 [#1] SMP NOPTI
[69239.969327] CPU: 5 PID: 568423 Comm: kworker/u40:0 Kdump: loaded Tainted: G OE -------- - - 4.18.0-553.5.1.el8_lustre.x86_64 #1
[69239.969650] Hardware name: Red Hat KVM, BIOS 1.16.0-4.cl9 04/01/2014
[69239.969792] Workqueue: rdma_cm cma_netevent_work_handler [rdma_cm]
[69239.969995] RIP: 0010:kiblnd_cm_callback+0xa5d/0x1ea0 [ko2iblnd]
[69239.970191] Code: 48 89 05 06 7f 01 00 c7 05 04 7f 01 00 00 00 02 02 48 c7 05 01 7f 01 00 d0 5e f1 c0 e8 ac 66 ee ff e9 b5 f7 ff ff 4c 8b 6f 08 <41> 8b 6d 4c f6 05 e5 d4 f0 ff 01 0f 84 8d 00 00 00 f6 05 dc d4 f0
[69239.970483] RSP: 0018:ff7d728b15e6bd70 EFLAGS: 00010246
[69239.970621] RAX: 0000000000000007 RBX: ff7d728b15e6be08 RCX: 0000000000000000
[69239.970763] RDX: ff4787632aa5c000 RSI: ff7d728b15e6be08 RDI: ff47876095213c00
[69239.970895] RBP: ff47876095213c00 R08: 0000000000000000 R09: 006d635f616d6472
[69239.971025] R10: 8080808080808080 R11: 0000000000000000 R12: ff47876095213c00
[69239.971155] R13: 0000000000000000 R14: 0000000000000000 R15: ff47876095213de0
[69239.971289] FS: 0000000000000000(0000) GS:ff47877d7f740000(0000) knlGS:0000000000000000
[69239.971425] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[69239.971570] CR2: 000000000000004c CR3: 0000001900e10004 CR4: 0000000000771ee0
[69239.971708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[69239.971841] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[69239.971971] PKRU: 55555554
[69239.972099] Call Trace:
[69239.972267] ? __die_body+0x1a/0x60
[69239.972451] ? no_context+0x1ba/0x3f0
[69239.972583] ? __bad_area_nosemaphore+0x16c/0x1c0
[69239.972705] ? do_page_fault+0x37/0x12d
[69239.972826] ? page_fault+0x1e/0x30
[69239.972971] ? kiblnd_cm_callback+0xa5d/0x1ea0 [ko2iblnd]
[69239.973099] cma_cm_event_handler+0x25/0xd0 [rdma_cm]
[69239.973234] cma_netevent_work_handler+0x75/0xd0 [rdma_cm]
[69239.973362] process_one_work+0x1d3/0x390
[69239.973516] worker_thread+0x30/0x390
[69239.973631] ? process_one_work+0x390/0x390
[69239.973743] kthread+0x134/0x150
[69239.973863] ? set_kthread_struct+0x50/0x50
[69239.973976] ret_from_fork+0x1f/0x40
[69239.974099] Modules linked in: ofd(OE) ost(OE) osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ptlrpc(OE) obdclass(OE) ldiskfs(OE) mbcache jbd2 ko2iblnd(OE) lnet(OE) libcfs(OE) bonding uio_pci_generic uio vfio_pci vfio_virqfd vfio_iommu_type1 vfio cuse fuse rdma_ucm(OE) ib_ipoib(OE) ib_umad(OE) sunrpc intel_rapl_msr intel_rapl_common intel_uncore_frequency_common nfit libnvdimm cirrus drm_shmem_helper kvm_intel kvm irqbypass drm_kms_helper crct10dif_pclmul crc32_pclmul syscopyarea sysfillrect ghash_clmulni_intel sysimgblt rapl drm i2c_piix4 pcspkr virtio_balloon joydev knem(OE) xfs libcrc32c mlx5_ib(OE) ib_uverbs(OE) ata_generic mlx5_core(OE) mlxfw(OE) ata_piix psample pci_hyperv_intf tls crc32c_intel virtio_console virtio_blk libata serio_raw mlxdevm(OE) xpmem(OE) nvme_tcp(OE) nvme_rdma(OE) rdma_cm(OE) iw_cm(OE) nvme_fabrics(OE) nvme_core(OE) ib_cm(OE) ib_core(OE) mlx_compat(OE) t10_pi
[69239.975388] CR2: 000000000000004c
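Decoding the oops: the Code bytes around the faulting instruction (4c 8b 6f 08 <41> 8b 6d 4c) disassemble to mov 0x8(%rdi),%r13 followed by mov 0x4c(%r13),%ebp. RDI holds the cmid, and offset 0x8 in struct rdma_cm_id is the context pointer, so R13 = cmid->context = NULL and the 32-bit read at NULL + 0x4c matches CR2 = 000000000000004c. A rough reconstruction in C follows; the helper name is hypothetical and the field at offset 0x4c is our assumption (ibc_state is a plausible candidate), not something the dump confirms:

/* Reconstruction of the faulting sequence inside kiblnd_cm_callback()
 * (ko2iblnd), from the Code bytes and register dump above:
 *   4c 8b 6f 08   mov 0x8(%rdi),%r13   ; r13 = cmid->context
 *   41 8b 6d 4c   mov 0x4c(%r13),%ebp  ; faults: r13 == NULL
 * struct kib_conn comes from lnet/klnds/o2iblnd/o2iblnd.h. */
static int kiblnd_cm_callback_fault_path(struct rdma_cm_id *cmid)
{
	struct kib_conn *conn = cmid->context;	/* NULL at crash time */

	/* 32-bit load at conn + 0x4c; assumed to be conn->ibc_state.
	 * NULL + 0x4c matches CR2 = 000000000000004c. */
	return conn->ibc_state;
}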
Issue Links
- is related to: LU-18364 rdma_cm: unable to handle kernel NULL pointer dereference in process_one_work when disconnect (Open)
Hi eaujames,
While working on this issue, we came to suspect that the problem arises because the cmid can still exist after the connection it refers to has already been released during connection teardown. This might be due to an incompatibility between Lustre's connection lifecycle and the RoCE CM event path. We have made a fix for this issue, and combined with the patch you fixed previously, the problem has not reoccurred in our extensive testing. Could you please help review whether the modifications in this patch are appropriate? Thank you!
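To make the idea concrete, here is a minimal sketch of the kind of guard we have in mind, under the assumption that cmid->context is NULL (or stale) once the connection has been released. This is an illustration of the approach, not the actual patch; the placement of the check and the log message are simplified:

/* Sketch only: protect kiblnd_cm_callback() against a cmid that has
 * outlived its connection.  struct kib_conn and CWARN() come from
 * Lustre's lnet/klnds/o2iblnd and libcfs; the early return is our
 * assumption about the fix. */
static int kiblnd_cm_callback(struct rdma_cm_id *cmid,
			      struct rdma_cm_event *event)
{
	struct kib_conn *conn = cmid->context;

	/* A netevent (e.g. RDMA_CM_EVENT_ADDR_CHANGE delivered via
	 * cma_netevent_work_handler) can arrive after the connection
	 * has been released, leaving cmid->context NULL. */
	if (!conn) {
		CWARN("cm event %d on cmid %p with no connection\n",
		      event->event, cmid);
		return 0;	/* 0: do not ask rdma_cm to destroy the cmid */
	}

	/* ... normal event handling on a live connection ... */
	return 0;
}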