Details
Bug
Resolution: Won't Fix
Critical
None
Lustre 2.4.0, Lustre 2.1.5, Lustre 1.8.8
3
6072
Description
While I was repeatedly running mount/umount and "lctl --device ${devno} deactivate" on a client node, a kernel panic occurred.
Below is the stack trace of the thread that triggered the panic.
PID: 7160 TASK: ffff88060a370aa0 CPU: 2 COMMAND: "lctl"
#0 [ffff8805f8ba7b20] machine_kexec at ffffffff8103281b
#1 [ffff8805f8ba7b80] crash_kexec at ffffffff810ba792
#2 [ffff8805f8ba7c50] oops_end at ffffffff81501700
#3 [ffff8805f8ba7c80] die at ffffffff8100f26b
#4 [ffff8805f8ba7cb0] do_general_protection at ffffffff81501292
#5 [ffff8805f8ba7ce0] general_protection at ffffffff81500a65
[exception RIP: class_handle_ioctl+4926]
RIP: ffffffffa04e776e RSP: ffff8805f8ba7d98 RFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8805f4f0c800 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88062808b500
RBP: ffff8805f8ba7e38 R8: 0000000000000000 R9: ffffffff8163abc0
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000c0086815
R13: 00007fff38122d70 R14: 5a5a5a5a5a5a5a5a R15: 0000000000000240
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffff8805f8ba7d90] class_handle_ioctl at ffffffffa04e76f4 [obdclass]
#7 [ffff8805f8ba7e40] obd_class_ioctl at ffffffffa04d12ab [obdclass]
#8 [ffff8805f8ba7e60] vfs_ioctl at ffffffff8118dff2
#9 [ffff8805f8ba7ea0] do_vfs_ioctl at ffffffff8118e194
#10 [ffff8805f8ba7f30] sys_ioctl at ffffffff8118e711
#11 [ffff8805f8ba7f80] system_call_fastpath at ffffffff8100b0f2
RIP: 00000032e3adf7b7 RSP: 00007fff38122d08 RFLAGS: 00010202
RAX: 0000000000000010 RBX: ffffffff8100b0f2 RCX: 0000000000000018
RDX: 00007fff38122d70 RSI: 00000000c0086815 RDI: 0000000000000003
RBP: 0000000000000001 R8: 0000000000000000 R9: 0000000000000240
R10: 00007fff38122a90 R11: 0000000000000202 R12: 00000000c0086815
R13: 00007fff38122d70 R14: 0000000000676ae0 R15: 0000000000000003
ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b
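class_handle_ioctl() does have several if-statements that check the obd state (obd_stopping, obd_set_up and obd_attached). However, nothing protects the obd_device once class_handle_ioctl() has passed all of those checks. As a result, class_decref() can release an obd_device while class_handle_ioctl() is still referring to it. The value 5a5a5a5a5a5a5a5a in R14 in the trace above looks like the poison pattern written into freed memory, which is consistent with this kind of use-after-free.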
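To illustrate the kind of protection that is missing, here is a minimal userspace sketch in C. It is not the patch and it does not use the real Lustre structures or APIs: struct dev, table_lock, table_slot, dev_register(), dev_get_checked(), dev_put(), dev_teardown() and handle_ioctl() are all hypothetical names made up for this example. The idea it shows is that the state checks and the reference grab happen inside one critical section, so teardown cannot free the device between "checks passed" and "device used".

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for an obd_device; NOT the real Lustre struct. */
struct dev {
    int  refcount;   /* device is freed when this reaches zero */
    bool attached;
    bool set_up;
    bool stopping;   /* set once teardown has started */
};

/* One lock protects the one-entry "device table" and the device state. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct dev *table_slot;
static int free_count;           /* instrumentation for this sketch only */

/* Create a device; the table itself holds the initial reference. */
static struct dev *dev_register(void)
{
    struct dev *d = calloc(1, sizeof(*d));
    if (d == NULL)
        return NULL;
    d->refcount = 1;
    d->attached = true;
    d->set_up = true;
    pthread_mutex_lock(&table_lock);
    table_slot = d;
    pthread_mutex_unlock(&table_lock);
    return d;
}

/* Check the state AND take a reference in the same critical section. */
static struct dev *dev_get_checked(void)
{
    struct dev *d;

    pthread_mutex_lock(&table_lock);
    d = table_slot;
    if (d != NULL && d->attached && d->set_up && !d->stopping)
        d->refcount++;
    else
        d = NULL;
    pthread_mutex_unlock(&table_lock);
    return d;
}

/* Drop a reference; the last holder frees the device. */
static void dev_put(struct dev *d)
{
    bool last;

    pthread_mutex_lock(&table_lock);
    last = (--d->refcount == 0);
    if (last)
        free_count++;
    pthread_mutex_unlock(&table_lock);
    if (last)
        free(d);
}

/* Teardown path: unpublish the device, then drop the table's reference. */
static void dev_teardown(void)
{
    struct dev *d;

    pthread_mutex_lock(&table_lock);
    d = table_slot;
    table_slot = NULL;
    if (d != NULL)
        d->stopping = true;
    pthread_mutex_unlock(&table_lock);
    if (d != NULL)
        dev_put(d);
}

/* ioctl-like handler: the device stays valid between get and put. */
static int handle_ioctl(void)
{
    struct dev *d = dev_get_checked();

    if (d == NULL)
        return -1;   /* device missing or going away */
    /* ... work with d here; teardown cannot free it underneath us ... */
    dev_put(d);
    return 0;
}
```

In this sketch, if dev_teardown() runs while a handler still holds the reference returned by dev_get_checked(), the device is only freed by the handler's final dev_put(). Without the extra reference, the handler would keep using a pointer that teardown has already freed, which is the situation described above.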
I'll upload a patch for this problem soon, and I would appreciate it if someone could review it.
Thank you.