Details
- Bug
- Resolution: Fixed
- Minor
- Lustre 2.12.7, Lustre 2.12.8
- None
- 3
- 9223372036854775807
Description
Write and truncate I/O are serialized on ll_trunc_sem::ll_trunc_{readers|waiters}. If one process exits abruptly (for example, because it is killed), the other keeps waiting for the semaphore (its task state is set to TASK_INTERRUPTIBLE):
INFO: task a.out:109684 blocked for more than 120 seconds.
      Tainted: G IOE --------- - - 4.18.0-240.15.1.el8_3.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Call Trace:
 __schedule+0x2a6/0x700
 schedule+0x38/0xa0
 trunc_sem_down_read+0xa6/0xb0 [lustre]
 vvp_io_write_start+0x107/0xb80 [lustre]
 cl_io_start+0x59/0x110 [obdclass]
 cl_io_loop+0x9a/0x1e0 [obdclass]
 ll_file_io_generic+0x380/0xb10 [lustre]
 ll_file_write_iter+0x136/0x5a0 [lustre]
 new_sync_write+0x124/0x170
 vfs_write+0xa5/0x1a0
 ksys_write+0x4f/0xb0
 do_syscall_64+0x5b/0x1a0
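To make the serialization described above concrete, here is a minimal userspace sketch of a two-counter reader/writer gate, written with C11 atomics. It is an assumption-laden model, not the Lustre code: only the names ll_trunc_readers and ll_trunc_waiters come from this report, while struct trunc_sem_model, the model_* functions, and the convention that readers == -1 means "writer holds it" are hypothetical. The functions are non-blocking "try" variants so the entry conditions can be read directly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; the real ll_trunc_sem may differ in detail. */
struct trunc_sem_model {
    atomic_int readers; /* >0: active readers, 0: free, -1: writer holds it */
    atomic_int waiters; /* writers currently trying to take the gate */
};

/* Increment *v unless it is negative; returns true if incremented. */
static bool inc_unless_negative(atomic_int *v)
{
    int cur = atomic_load(v);

    while (cur >= 0) {
        /* On failure, cur is reloaded and the sign is re-checked. */
        if (atomic_compare_exchange_weak(v, &cur, cur + 1))
            return true;
    }
    return false;
}

/* A reader (write I/O) enters only if no writer is waiting or holding. */
static bool model_try_down_read(struct trunc_sem_model *s)
{
    return atomic_load(&s->waiters) == 0 &&
           inc_unless_negative(&s->readers);
}

static void model_up_read(struct trunc_sem_model *s)
{
    int prev = atomic_fetch_sub(&s->readers, 1);

    assert(prev > 0); /* must pair with a successful down_read */
    (void)prev;
}

/* A writer (truncate) announces itself, then tries to move 0 -> -1. */
static bool model_try_down_write(struct trunc_sem_model *s)
{
    int expected = 0;
    bool ok;

    atomic_fetch_add(&s->waiters, 1);
    ok = atomic_compare_exchange_strong(&s->readers, &expected, -1);
    atomic_fetch_sub(&s->waiters, 1);
    return ok;
}

/* Release by the writer; a blocking version would wake sleepers here. */
static void model_up_write(struct trunc_sem_model *s)
{
    atomic_store(&s->readers, 0);
}
```

In this model a reader is refused only while waiters is non-zero or readers is negative, so a state with both counters at zero should always admit a reader.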
Issue Links
- is related to LU-15397 LustreError: 4585:0:(llite_mmap.c:61:our_vma()) ASSERTION( !down_write_trylock(&mm->mmap_sem) ) failed (Open)
Looking at the vmcore, "bt -FF 4718" shows:
#1 [ffffbe8540f3bb88] schedule at ffffffff8f6d38a8
ffffbe8540f3bb90: [ffff9f9b3d8c37a0:lustre_inode_cache] trunc_sem_down_read+166
#2 [ffffbe8540f3bb98] trunc_sem_down_read at ffffffffc12523c6 [lustre]
ffffbe8540f3bba0: [ffff9f9b3d8c37a0:lustre_inode_cache] 00000000ffffffff
So ffff9f9b3d8c37a0, the value stored in the stack slot at ffffbe8540f3bba0, must be the address of the ll_trunc_sem.
crash> x/2wx 0xffff9f9b3d8c37a0
0xffff9f9b3d8c37a0: 0x00000000 0x00000000
Both ll_trunc_waiters and ll_trunc_readers are zero, so trunc_sem_down_read() shouldn't block.
This suggests a missed wake-up, and it could only be the wake-up from trunc_sem_up_write(). Maybe a barrier is needed after the atomic_set, and before the atomic_add_unless_negative.
But I thought barriers like that weren't needed on x86.
I'll read up about memory ordering again.
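For reference while reading up: the suspected missed wake-up has the shape of a store-then-load race, and x86 does allow a store to be reordered after a later load from a different location, even though it preserves the other orderings. The sketch below shows that pattern in userspace C11 atomics, with a full fence on each side. It is only an illustration under assumptions, not kernel or Lustre code: cond_ready, waiter_queued, and both functions are hypothetical stand-ins for "the counters allow entry" and "someone is on the wait queue".

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int cond_ready;    /* hypothetical: state now allows entry */
static atomic_int waiter_queued; /* hypothetical: a sleeper is queued */

/*
 * Waker side: publish the state change, then check for sleepers.
 * Returns true if a wake-up has to be delivered.
 */
static bool waker_publish_and_check(void)
{
    atomic_store_explicit(&cond_ready, 1, memory_order_relaxed);
    /* Full fence: keeps the store above ahead of the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&waiter_queued,
                                memory_order_relaxed) != 0;
}

/*
 * Sleeper side: queue first, then re-check the condition.
 * Returns true if the caller still has to sleep.
 */
static bool sleeper_queue_and_check(void)
{
    atomic_store_explicit(&waiter_queued, 1, memory_order_relaxed);
    /* Full fence: keeps the store above ahead of the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&cond_ready,
                                memory_order_relaxed) == 0;
}
```

With both fences in place, the outcome "waker saw no sleeper and sleeper saw the condition unset" cannot occur. If either fence is dropped, that outcome becomes possible, and the sleeper then waits for a wake-up that was never sent. Whether this is what happens in trunc_sem_up_write() is not established here.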