Details
- Bug
- Resolution: Fixed
- Critical
- Lustre 2.11.0
- Soak performance cluster, version=2.10.56_84_gd645c72, RHEL 7.4 kernel
- 3
- 9223372036854775807
Description
An LBUG occurs immediately when we attempt any I/O on the clients. Multiple clients are impacted.
Jan 4 21:57:38 soak-17 kernel: LNetError: 12570:0:(o2iblnd_cb.c:991:kiblnd_check_sends_locked()) ASSERTION( conn->ibc_nsends_posted <= conn->ibc_queue_depth ) failed:
Jan 4 21:57:38 soak-17 kernel: LNetError: 12570:0:(o2iblnd_cb.c:991:kiblnd_check_sends_locked()) LBUG
Jan 4 21:57:38 soak-17 kernel: Pid: 12570, comm: kiblnd_sd_00_00
Jan 4 21:57:38 soak-17 kernel: #012Call Trace:
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc097c7ae>] libcfs_call_trace+0x4e/0x60 [libcfs]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc097c83c>] lbug_with_loc+0x4c/0xb0 [libcfs]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c2666b>] kiblnd_check_sends_locked+0xd8b/0xd90 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0538b5c>] ? mlx4_ib_post_recv+0x1dc/0x310 [mlx4_ib]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c27f50>] kiblnd_post_rx+0x160/0x520 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c284ea>] kiblnd_recv+0x1da/0x7b0 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0a00573>] lnet_ni_recv+0xc3/0x320 [lnet]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0a02e06>] lnet_parse_local+0x4c6/0xd40 [lnet]
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810c7705>] ? sched_clock_cpu+0x85/0xc0
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0a03f4a>] lnet_parse+0x8ca/0xfc0 [lnet]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c261ac>] ? kiblnd_check_sends_locked+0x8cc/0xd90 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffff81029557>] ? __switch_to+0xd7/0x510
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c28e63>] kiblnd_handle_rx+0x213/0x6b0 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c2facf>] kiblnd_scheduler+0xf0f/0x1150 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810ce55e>] ? dequeue_task_fair+0x41e/0x660
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810c7705>] ? sched_clock_cpu+0x85/0xc0
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810c4820>] ? default_wake_function+0x0/0x20
Jan 4 21:57:38 soak-17 kernel: [<ffffffffc0c2ebc0>] ? kiblnd_scheduler+0x0/0x1150 [ko2iblnd]
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810b099f>] kthread+0xcf/0xe0
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810b08d0>] ? kthread+0x0/0xe0
Jan 4 21:57:38 soak-17 kernel: [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
Jan 4 21:57:38 soak-17 kernel: [<ffffffff810b08d0>] ? kthread+0x0/0xe0
Jan 4 21:57:38 soak-17 kernel:
Multiple crash dumps are available on Spirit.
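The failed assertion in the trace states an invariant: the number of sends posted on a connection must never exceed that connection's queue depth. The following is a minimal, standalone sketch of that invariant, not the ko2iblnd code. The struct and function names (`conn_model`, `conn_try_post_send`, `conn_send_complete`, `conn_invariant_holds`) are hypothetical; only the two counter names are borrowed from the assertion text, and nothing here is meant to describe the actual cause of this LBUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a connection. Only the two counters named in
 * the assertion are modelled; this is not the real ko2iblnd structure. */
struct conn_model {
    int nsends_posted;  /* sends posted and not yet completed */
    int queue_depth;    /* maximum number of sends allowed in flight */
};

/* The invariant expressed by the failed assertion. */
static bool conn_invariant_holds(const struct conn_model *c)
{
    return c->nsends_posted <= c->queue_depth;
}

/* Post one send only while there is room under queue_depth.
 * Returns false when the caller would have to queue the send instead. */
static bool conn_try_post_send(struct conn_model *c)
{
    if (c->nsends_posted >= c->queue_depth)
        return false;
    c->nsends_posted++;
    return true;
}

/* Account for one completed send, freeing a slot. */
static void conn_send_complete(struct conn_model *c)
{
    assert(c->nsends_posted > 0);
    c->nsends_posted--;
}
```

In this model, a post path that gates on `queue_depth` keeps the invariant true by construction. Any path that increments the posted count against a different limit, or that lowers the depth while sends are in flight, could break it; whether either applies to this issue is not established by the report.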
Issue Links
- is related to LU-10291 remove concurrent_sends tunable (Resolved)