Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.14.0
- None
- 2.13.54_44_gf3fef81
- 2
- 9223372036854775807
Description
1 x server (CentOS 7.8) and 1 x client (CentOS 8.1); both server and client have OFED-5.0 installed.
# ofed_info | head -1
MLNX_OFED_LINUX-5.0-2.1.8.0 (OFED-5.0-2.1.8):
When the client mounts Lustre, both server and client crash with the following LBUG.
Server
[482108.891327] LNetError: 14769:0:(o2iblnd.h:1003:kiblnd_queue2str()) LBUG
[482108.891395] Pid: 14769, comm: kiblnd_connd 3.10.0-1127.10.1.el7.x86_64 #1 SMP Wed Jun 3 14:28:03 UTC 2020
[482108.891397] Call Trace:
[482108.891412] [<ffffffffc146f67c>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[482108.891436] [<ffffffffc146f99c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[482108.891448] [<ffffffffc15b82cb>] kiblnd_need_noop.part.21+0x0/0x36 [ko2iblnd]
[482108.891463] [<ffffffffc15aa581>] kiblnd_check_txs_locked+0x421/0x490 [ko2iblnd]
[482108.891474] [<ffffffffc15b107b>] kiblnd_check_conns+0x3cb/0x880 [ko2iblnd]
[482108.891485] [<ffffffffc15b6273>] kiblnd_connd+0x813/0x9e0 [ko2iblnd]
[482108.891495] [<ffffffff9bec6691>] kthread+0xd1/0xe0
[482108.891506] [<ffffffff9c592d37>] ret_from_fork_nospec_end+0x0/0x39
[482108.891514] [<ffffffffffffffff>] 0xffffffffffffffff
[482108.891553] Kernel panic - not syncing: LBUG
[482108.891593] CPU: 3 PID: 14769 Comm: kiblnd_connd Kdump: loaded Tainted: P OE ------------ 3.10.0-1127.10.1.el7.x86_64 #1
[482108.891682] Hardware name: Supermicro SYS-2028U-TN24R4T+/X10DRU-i+, BIOS 3.2 06/11/2019
[482108.891742] Call Trace:
[482108.891773] [<ffffffff9c57ffa5>] dump_stack+0x19/0x1b
[482108.891817] [<ffffffff9c579541>] panic+0xe8/0x21f
[482108.891869] [<ffffffffc146f9eb>] lbug_with_loc+0x9b/0xa0 [libcfs]
[482108.891925] [<ffffffffc15b82cb>] kiblnd_queue2str.part.17+0x1a/0x1a [ko2iblnd]
[482108.891988] [<ffffffffc15aa581>] kiblnd_check_txs_locked+0x421/0x490 [ko2iblnd]
[482108.892053] [<ffffffffc15b107b>] kiblnd_check_conns+0x3cb/0x880 [ko2iblnd]
[482108.892110] [<ffffffff9beae150>] ? __internal_add_timer+0x130/0x130
[482108.892168] [<ffffffffc15b6273>] kiblnd_connd+0x813/0x9e0 [ko2iblnd]
[482108.892221] [<ffffffff9c585942>] ? __schedule+0x402/0x840
[482108.892268] [<ffffffff9bedb990>] ? wake_up_state+0x20/0x20
[482108.892321] [<ffffffffc15b5a60>] ? kiblnd_cm_callback+0x2380/0x2380 [ko2iblnd]
[482108.892380] [<ffffffff9bec6691>] kthread+0xd1/0xe0
[482108.892423] [<ffffffff9bec65c0>] ? insert_kthread_work+0x40/0x40
[482108.892473] [<ffffffff9c592d37>] ret_from_fork_nospec_begin+0x21/0x21
[482108.892527] [<ffffffff9bec65c0>] ? insert_kthread_work+0x40/0x40
Client
[487085.899074] LNetError: 32398:0:(o2iblnd.h:1003:kiblnd_queue2str()) LBUG
[487085.900509] Pid: 32398, comm: kiblnd_connd 4.18.0-147.8.1.el8_1.x86_64 #1 SMP Thu Apr 9 13:49:54 UTC 2020
[487085.900510] Call Trace:
[487085.900531] libcfs_call_trace+0x86/0xc0 [libcfs]
[487085.900537] lbug_with_loc+0x43/0x80 [libcfs]
[487085.900546] kiblnd_queue2str.part.19+0x16/0x20 [ko2iblnd]
[487085.900551] kiblnd_check_txs_locked+0x39c/0x3a0 [ko2iblnd]
[487085.900556] kiblnd_check_conns+0x58b/0x920 [ko2iblnd]
[487085.900561] kiblnd_connd+0x9c2/0xa60 [ko2iblnd]
[487085.900564] kthread+0x112/0x130
[487085.900567] ret_from_fork+0x1f/0x40
[487085.900568] 0xffffffffffffffff
[487085.900569] Kernel panic - not syncing: LBUG
[487085.901751] CPU: 4 PID: 32398 Comm: kiblnd_connd Kdump: loaded Tainted: G OE --------- -t - 4.18.0-147.8.1.el8_1.x86_64 #1
[487085.904110] Hardware name: Intel Corporation S2600BPB/S2600BPB, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[487085.905298] Call Trace:
[487085.906489] dump_stack+0x5c/0x80
[487085.907663] panic+0xe7/0x247
[487085.908837] lbug_with_loc.cold.8+0x18/0x18 [libcfs]
[487085.910002] kiblnd_queue2str.part.19+0x16/0x20 [ko2iblnd]
[487085.911147] kiblnd_check_txs_locked+0x39c/0x3a0 [ko2iblnd]
[487085.912287] kiblnd_check_conns+0x58b/0x920 [ko2iblnd]
[487085.913424] kiblnd_connd+0x9c2/0xa60 [ko2iblnd]
[487085.914557] ? wake_up_q+0x70/0x70
[487085.915677] ? kiblnd_cm_callback+0x2230/0x2230 [ko2iblnd]
[487085.916799] kthread+0x112/0x130
[487085.917912] ? kthread_flush_work_fn+0x10/0x10
[487085.919036] ret_from_fork+0x1f/0x40
Issue Links
- is related to
- LU-1742 Fix 'Timed out tx' error message
- Resolved