Details
Type: Bug
Resolution: Fixed
Priority: Major
Affects Version/s: Lustre 2.3.0, Lustre 2.4.0, Lustre 1.8.8, Lustre 1.8.x (1.8.0 - 1.8.5)
Environment: FEFS based on Lustre-1.8.5; MDSx1, OSSx1 (OSTx3), Clientx1
3
5407
Description
I believe I have found two possible race conditions between evict and umount on the server side.
・About The Cases
1) Touch an Already-Finalized Lustre Hash
After class_manual_cleanup() on an mds or obdfilter device, obd->obd_uuid_hash has been finalized, so I believe the hash should not be touched by anyone. However, the evict path can still touch the hash right after class_cleanup(), because nothing excludes it in this case. That is why this case ends up in an OS panic.
PID: 3361  TASK: ffff810621f21080  CPU: 13  COMMAND: "lctl"
 #0 [ffff8105e1817a30] crash_kexec at ffffffff800aeb6b
 #1 [ffff8105e1817af0] __die at ffffffff80066157
 #2 [ffff8105e1817b30] die at ffffffff8006cce5
 #3 [ffff8105e1817b60] do_general_protection at ffffffff8006659f
 #4 [ffff8105e1817ba0] error_exit at ffffffff8005ede9
    [exception RIP: lustre_hash_lookup+249]
    RIP: ffffffff887400f9  RSP: ffff8105e1817c58  RFLAGS: 00010206
    RAX: ffff810115f337c0  RBX: 00000000ffff8106  RCX: 0000000000000001
    RDX: 00000000ffff8106  RSI: ffff8105e1817ce8  RDI: ffff810628834140
    RBP: ffff810628834140  R8: 0000000000000032  R9: 0000000000000020
    R10: 000000000000001b  R11: 0000000000000000  R12: 0000000000000000
    R13: ffff8105e1817ce8  R14: ffff8102de86e078  R15: 00007fff6a0adc10
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffff8105e1817cb0] obd_export_evict_by_uuid at ffffffff8874426f
 #6 [ffff8105e1817d40] lprocfs_wr_evict_client at ffffffff88869bfd
 #7 [ffff8105e1817e00] lprocfs_mds_wr_evict_client at ffffffff889a6a39
 #8 [ffff8105e1817ee0] lprocfs_fops_write at ffffffff8875349b
 #9 [ffff8105e1817f10] vfs_write at ffffffff80016a49
#10 [ffff8105e1817f40] sys_write at ffffffff80017316
#11 [ffff8105e1817f80] system_call at ffffffff8005e116
    RIP: 000000394bac6070  RSP: 00007fff6a0ad3c0  RFLAGS: 00010287
    RAX: 0000000000000001  RBX: ffffffff8005e116  RCX: 00000000fbad2a84
    RDX: 0000000000000024  RSI: 00007fff6a0b0bc7  RDI: 0000000000000003
    RBP: 0000000000000001  R8: fefefefefefefeff  R9: 632d303038362d63
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
    R13: 00007fff6a0b0bb4  R14: 0000000000000003  R15: 0000000000000000
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
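To make the window clearer, here is a minimal userspace model (not Lustre code) of case 1: a cleanup thread finalizes the uuid hash while an evict thread may still try to look it up, and only an explicit stopping check keeps the evict side off the freed hash. The thread bodies, the obd_lock mutex, and the int array standing in for obd->obd_uuid_hash are inventions for this sketch.

/*
 * Userspace model of the case-1 window.  "uuid_hash" and "obd_stopping"
 * are stand-ins for the real obd fields; only the ordering matters.
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

static pthread_mutex_t obd_lock = PTHREAD_MUTEX_INITIALIZER;
static int obd_stopping;            /* set by cleanup before finalizing */
static int *uuid_hash;              /* stands in for obd->obd_uuid_hash */

static void *cleanup_thread(void *arg)
{
        /* class_cleanup(): mark the device stopping, then finalize the hash */
        pthread_mutex_lock(&obd_lock);
        obd_stopping = 1;
        free(uuid_hash);
        uuid_hash = NULL;
        pthread_mutex_unlock(&obd_lock);
        return NULL;
}

static void *evict_thread(void *arg)
{
        /* obd_export_evict_by_uuid(): without the check below it would
         * look up a freed hash and crash, as in the backtrace above */
        pthread_mutex_lock(&obd_lock);
        if (obd_stopping || uuid_hash == NULL)
                printf("evict: device stopping, refusing to touch hash\n");
        else
                printf("evict: lookup in hash, first bucket = %d\n",
                       uuid_hash[0]);
        pthread_mutex_unlock(&obd_lock);
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        uuid_hash = calloc(16, sizeof(*uuid_hash));
        pthread_create(&t1, NULL, cleanup_thread, NULL);
        pthread_create(&t2, NULL, evict_thread, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
}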
2) Deadlock on _lprocfs_lock
lprocfs_wr_evict_client() calls class_incref() before LPROCFS_EXIT() and calls class_decref() after LPROCFS_ENTRY(). So in the case where that class_decref() ends up calling lprocfs_remove() via osc_cleanup() while _lprocfs_lock is already held, the thread deadlocks.
PID: 4495  TASK: ffff810638e25820  CPU: 5  COMMAND: "lctl"
 #0 [ffff8106184df728] schedule at ffffffff80063f96
 #1 [ffff8106184df800] __down_write_nested at ffffffff80065613
 #2 [ffff8106184df840] lprocfs_remove at ffffffff88751fc1
 #3 [ffff8106184df8a0] lprocfs_obd_cleanup at ffffffff887524a8
 #4 [ffff8106184df8b0] osc_cleanup at ffffffff88b25e70
 #5 [ffff8106184df900] class_decref at ffffffff8875c46c
 #6 [ffff8106184df950] obd_zombie_impexp_cull at ffffffff88743ad2
 #7 [ffff8106184df970] class_detach at ffffffff8875d9dd
 #8 [ffff8106184df9b0] class_process_config at ffffffff8876183e
 #9 [ffff8106184dfa30] class_manual_cleanup at ffffffff88763747
#10 [ffff8106184dfb20] lov_putref at ffffffff88aceb60
#11 [ffff8106184dfbe0] lov_disconnect at ffffffff88ad1f2b
#12 [ffff8106184dfc40] mds_lov_clean at ffffffff88975fd4
#13 [ffff8106184dfca0] mds_precleanup at ffffffff889851a9
#14 [ffff8106184dfcf0] class_decref at ffffffff8875c07e
#15 [ffff8106184dfd40] lprocfs_wr_evict_client at ffffffff88869c14
#16 [ffff8106184dfe00] lprocfs_mds_wr_evict_client at ffffffff889a6a39
#17 [ffff8106184dfee0] lprocfs_fops_write at ffffffff887534cb
#18 [ffff8106184dff10] vfs_write at ffffffff80016a49
#19 [ffff8106184dff40] sys_write at ffffffff80017316
#20 [ffff8106184dff80] system_call at ffffffff8005e116
    RIP: 000000394bac6070  RSP: 00007fff3dfad010  RFLAGS: 00010287
    RAX: 0000000000000001  RBX: ffffffff8005e116  RCX: 00000000fbad2a84
    RDX: 0000000000000024  RSI: 00007fff3dfb1bc3  RDI: 0000000000000003
    RBP: 0000000000000001  R8: fefefefefefefeff  R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
    R13: 00007fff3dfb1bb0  R14: 0000000000000003  R15: 0000000000000000
    ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
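Here is a minimal userspace model (not Lustre code) of why case 2 deadlocks: the proc write handler already holds _lprocfs_lock for read when lprocfs_remove() asks for it for write, and the lock is not recursive. pthread_rwlock_trywrlock() is used so the demo reports the conflict instead of hanging; the fake_lprocfs_remove() helper is invented for this sketch.

/*
 * Userspace model of the case-2 self-deadlock: the same thread holds
 * the lock for read (LPROCFS_ENTRY) and then wants it for write
 * (lprocfs_remove), which can never be granted.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t lprocfs_lock = PTHREAD_RWLOCK_INITIALIZER;

/* stands in for lprocfs_remove(): needs the lock exclusively */
static void fake_lprocfs_remove(void)
{
        if (pthread_rwlock_trywrlock(&lprocfs_lock) == 0) {
                printf("lprocfs_remove: got write lock\n");
                pthread_rwlock_unlock(&lprocfs_lock);
        } else {
                printf("lprocfs_remove: write lock busy; with a blocking "
                       "down_write() this is the deadlock\n");
        }
}

int main(void)
{
        /* LPROCFS_ENTRY(): the proc write handler takes the lock for read */
        pthread_rwlock_rdlock(&lprocfs_lock);

        /* class_decref() -> osc_cleanup() -> lprocfs_obd_cleanup() ->
         * lprocfs_remove() happens before the read lock is released */
        fake_lprocfs_remove();

        /* LPROCFS_EXIT() would only run here, too late */
        pthread_rwlock_unlock(&lprocfs_lock);
        return 0;
}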
Both cases happened with FEFS, which is based on Lustre-1.8.5. From reading the source code, I have come to think that both cases could also happen with Lustre-1.8.8 and Lustre-2.3.x, but I am not sure, so I would be happy if someone could confirm that.
・About The Patches' Features
1) While someone is writing to evict_nid on procfs, the umount thread waits for that write to finish, using a new function, server_wait_for_evict_client().
2) After server_wait_for_evict_client() returns, a new flag, obd_evict_client_frozen, is set; this prohibits "evict_client" from evicting any export object.
3) Add a check of obd_stopping and obd_evict_client_frozen to the evict functions so that they do not touch a Lustre hash that class_manual_cleanup() has already finalized. (A rough userspace sketch of these three pieces is shown after this list.)
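Below is a rough userspace sketch of how I imagine the three pieces fit together; only the names server_wait_for_evict_client() and obd_evict_client_frozen come from the patches, and the mutex/condvar bookkeeping here is just a stand-in for the real kernel synchronization.

/*
 * Userspace model of the proposed fix: the umount path waits for any
 * in-flight evict_client write, then sets a "frozen" flag so later
 * evict attempts back off before touching the hash.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t obd_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t evict_done = PTHREAD_COND_INITIALIZER;
static int evict_in_progress;           /* writers currently in evict_client */
static int obd_evict_client_frozen;     /* set once umount has started */

/* features 1 + 2: wait for in-flight evicts, then freeze new ones */
static void server_wait_for_evict_client(void)
{
        pthread_mutex_lock(&obd_lock);
        while (evict_in_progress > 0)
                pthread_cond_wait(&evict_done, &obd_lock);
        obd_evict_client_frozen = 1;
        pthread_mutex_unlock(&obd_lock);
}

/* feature 3: the evict path checks the flag before touching the hash */
static void evict_client(const char *uuid)
{
        pthread_mutex_lock(&obd_lock);
        if (obd_evict_client_frozen) {
                pthread_mutex_unlock(&obd_lock);
                printf("evict %s: device is being cleaned up, skipping\n", uuid);
                return;
        }
        evict_in_progress++;
        pthread_mutex_unlock(&obd_lock);

        printf("evict %s: safe to look up the uuid hash here\n", uuid);

        pthread_mutex_lock(&obd_lock);
        evict_in_progress--;
        pthread_cond_broadcast(&evict_done);
        pthread_mutex_unlock(&obd_lock);
}

int main(void)
{
        evict_client("client-a");         /* before umount: allowed */
        server_wait_for_evict_client();   /* umount path */
        evict_client("client-b");         /* after freeze: refused */
        return 0;
}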
I would be glad if someone could review my patches, and I hope the patches for both cases help fix the problem.
Thank you.