Details
-
Bug
-
Resolution: Done
-
Blocker
-
None
-
Lustre 2.8.0
-
lola
build: 2.8 GA + patches
-
3
Description
The error occurred during soak testing of build '20160324' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160324).
LNet runs over InfiniBand (all nodes are equipped with Mellanox 4x QDR HCAs).
Sequence of events
- The error happened after an MDS node panicked (see LU-7935) during MDT failback at 2016-03-29 14:53 (umount of the MDT). The MDT node (lola-9) was unusable (i.e. no primary or secondary resources mounted) when the error occurred on the Lustre client described below. However, this event may be unrelated.
- The Lustre client crashed with the following error message:
<0>LNetError: 4231:0:(linux-cpu.c:1081:cfs_cpu_init()) ASSERTION( !(((current_thread_info()->preempt_count) & ((((1UL << (10))-1) << ((0 + 8) + 8)) | (((1UL << (8))-1) << (0 + 8)) | (((1UL << (1))-1) << (((0 + 8) + 8) + 10))))) || (((cpumask_size())) <= (2 << 12) && ((((((gfp_t)0x10u) | ((gfp_t)0x40u)))) & (((gfp_t)0x20u)))) != 0 ) failed:
<0>LNetError: 4231:0:(linux-cpu.c:1081:cfs_cpu_init()) LBUG
<0>Kernel panic - not syncing: LBUG in interrupt.
<0>
<4>Pid: 4231, comm: modprobe Not tainted 2.6.32-504.30.3.el6.x86_64 #1
<4>Call Trace:
<4> [<ffffffff815293fc>] ? panic+0xa7/0x16f
<4> [<ffffffffa0478ebd>] ? lbug_with_loc+0x8d/0xb0 [libcfs]
<4> [<ffffffffa047dcfc>] ? cfs_cpu_init+0xc7c/0xcb0 [libcfs]
<4> [<ffffffff810a5525>] ? atomic_notifier_chain_register+0x55/0x60
<4> [<ffffffffa047875c>] ? libcfs_register_panic_notifier+0x1c/0x20 [libcfs]
<4> [<ffffffffa0482b70>] ? init_libcfs_module+0x0/0x340 [libcfs]
<4> [<ffffffffa0482b97>] ? init_libcfs_module+0x27/0x340 [libcfs]
<4> [<ffffffff8100204c>] ? do_one_initcall+0x3c/0x1d0
<4> [<ffffffff810c0181>] ? sys_init_module+0xe1/0x250
<4> [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
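For readers decoding the macro-expanded assertion in the log above, the following is a minimal sketch that evaluates its constants. The numeric values are copied from the log text. The Python names (HARDIRQ_MASK, SOFTIRQ_MASK, NMI_MASK, ALLOC_FLAGS, ATOMIC_FLAG, assertion_holds) are illustrative labels that reflect my reading of the expression, not identifiers taken from the Lustre or kernel sources. Under that reading, the allocation-flag clause can never be true, so the assertion effectively fails whenever any interrupt-context bit is set in preempt_count.

```python
# Constants copied verbatim from the expanded ASSERTION text in the log.
# The names are illustrative assumptions, not taken from the source tree.
HARDIRQ_MASK = ((1 << 10) - 1) << ((0 + 8) + 8)      # 10 bits starting at bit 16
SOFTIRQ_MASK = ((1 << 8) - 1) << (0 + 8)             # 8 bits starting at bit 8
NMI_MASK = ((1 << 1) - 1) << (((0 + 8) + 8) + 10)    # 1 bit at bit 26
IRQ_CONTEXT_MASK = HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK

SIZE_LIMIT = 2 << 12        # 8192 bytes
ALLOC_FLAGS = 0x10 | 0x40   # flags passed to the allocation (0x50)
ATOMIC_FLAG = 0x20          # flag the assertion requires in interrupt context


def assertion_holds(preempt_count: int, cpumask_size: int) -> bool:
    """Evaluate the assertion from the log for the given inputs."""
    in_irq_context = (preempt_count & IRQ_CONTEXT_MASK) != 0
    # C's (a <= b && x) != 0: true only if both operands are non-zero.
    alloc_allowed = cpumask_size <= SIZE_LIMIT and (ALLOC_FLAGS & ATOMIC_FLAG) != 0
    return (not in_irq_context) or alloc_allowed


if __name__ == "__main__":
    print(f"interrupt-context mask: {IRQ_CONTEXT_MASK:#010x}")
    print(f"ALLOC_FLAGS & ATOMIC_FLAG: {ALLOC_FLAGS & ATOMIC_FLAG:#x}")
    # Hypothetical inputs: a clean preempt_count, then one softirq bit set.
    print(assertion_holds(0x0, 64))
    print(assertion_holds(0x100, 64))
```

Because the flag intersection is zero, the size check plays no role here: if this reading is right, the LBUG indicates that preempt_count carried hardirq, softirq, or NMI bits while modprobe was running init_libcfs_module.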
Attached files:
console, messages, vmcore-dmsg.txt of affected node lola-33.
Crash dump file is available.