Details
- Type: Bug
- Resolution: Fixed
- Priority: Major
- Affects Version: Lustre 2.10.4
- Environment: lustre-2.10.4_1.chaos-1.ch6.x86_64 servers, RHEL 7.5, DNE1 file system
- Severity: 3
Description
Servers were restarted and appeared to recover normally. They briefly appeared to be handling the same (heavy) workload from before they were powered off, then started logging the "system was overloaded" message. The kernel then reported several stacks like this:
INFO: task ll_ost00_007:108440 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ll_ost00_007 D ffff8ba4dc35bf40 0 108440 2 0x00000080
Call Trace:
[<ffffffffaad38919>] schedule_preempt_disabled+0x39/0x90
[<ffffffffaad3654f>] __mutex_lock_slowpath+0x10f/0x250
[<ffffffffaad357f2>] mutex_lock+0x32/0x42
[<ffffffffc1669afb>] ofd_create_hdl+0xdcb/0x2090 [ofd]
[<ffffffffc1322007>] ? lustre_msg_add_version+0x27/0xa0 [ptlrpc]
[<ffffffffc132235f>] ? lustre_pack_reply_v2+0x14f/0x290 [ptlrpc]
[<ffffffffc1322691>] ? lustre_pack_reply+0x11/0x20 [ptlrpc]
[<ffffffffc138653a>] tgt_request_handle+0x92a/0x1370 [ptlrpc]
[<ffffffffc132db5b>] ptlrpc_server_handle_request+0x23b/0xaa0 [ptlrpc]
[<ffffffffc132b26b>] ? ptlrpc_wait_event+0xab/0x350 [ptlrpc]
[<ffffffffaa6d5c32>] ? default_wake_function+0x12/0x20
[<ffffffffaa6cb01b>] ? __wake_up_common+0x5b/0x90
[<ffffffffc1331c70>] ptlrpc_main+0xae0/0x1e90 [ptlrpc]
[<ffffffffc1331190>] ? ptlrpc_register_service+0xe30/0xe30 [ptlrpc]
[<ffffffffaa6c0ad1>] kthread+0xd1/0xe0
[<ffffffffaa6c0a00>] ? insert_kthread_work+0x40/0x40
[<ffffffffaad44837>] ret_from_fork_nospec_begin+0x21/0x21
[<ffffffffaa6c0a00>] ? insert_kthread_work+0x40/0x40
Lustre then began reporting:
LustreError: 108448:0:(ofd_dev.c:1627:ofd_create_hdl()) lquake-OST0003:[27917288460] destroys_in_progress already cleared
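Roughly, this message comes from the orphan-cleanup (DELORPHAN) path of the OST create handler: when an MDT (re)connects, a per-sequence destroys_in_progress flag marks that an orphan cleanup pass is pending, and a DELORPHAN request that finds the flag already cleared is answered with this message instead of repeating the cleanup. The sketch below is only a simplified user-space model of that handshake, under that assumption; the type and function names (seq_state, handle_delorphan, and so on) are illustrative, not the actual OFD symbols.

/*
 * Simplified user-space model of the per-sequence orphan-destroy handshake.
 * The names (seq_state, handle_delorphan, ...) are illustrative only and are
 * not the actual Lustre/OFD symbols.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct seq_state {
    pthread_mutex_t create_lock;   /* serializes create/destroy for one FID sequence */
    bool destroys_in_progress;     /* set while an orphan cleanup pass is pending */
};

/* MDT (re)connects: an orphan cleanup pass for this sequence is now pending. */
static void mark_destroys_pending(struct seq_state *s)
{
    pthread_mutex_lock(&s->create_lock);
    s->destroys_in_progress = true;
    pthread_mutex_unlock(&s->create_lock);
}

/* OST_CREATE request carrying the "delete orphans" flag. */
static void handle_delorphan(struct seq_state *s, const char *who)
{
    pthread_mutex_lock(&s->create_lock);
    if (!s->destroys_in_progress) {
        /* A previous request already finished the cleanup for this sequence. */
        printf("%s: destroys_in_progress already cleared\n", who);
        pthread_mutex_unlock(&s->create_lock);
        return;
    }
    /* ... destroy orphan objects; this pass can take a long time ... */
    s->destroys_in_progress = false;
    pthread_mutex_unlock(&s->create_lock);
    printf("%s: orphan destroy finished\n", who);
}

int main(void)
{
    struct seq_state s = {
        .create_lock = PTHREAD_MUTEX_INITIALIZER,
        .destroys_in_progress = false,
    };

    mark_destroys_pending(&s);      /* first MDT connect */
    handle_delorphan(&s, "req1");   /* performs the cleanup */
    handle_delorphan(&s, "req2");   /* arrives after cleanup: flag already cleared */
    return 0;
}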
Attachments
Issue Links
- is related to LU-11399: use separate locks for orphan destroy and objects re-create at OFD (Open)
That message, "destroys_in_progress already cleared", relates only to MDT->OST reconnects and may happen when the MDT reconnects for some reason while the OST is still performing orphan destroys from the previous connection. Looking at the logs in the description:
It looks like the second thread [102598] was waiting for the first [102596] to complete an OST_CREATE request with orphan destroys, and that first request took quite a long time. The problem seems to be that long orphan destroy; something was blocking it for quite a while. That may simply be a result of the overall high load after the OST restart.
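That contention can be pictured with two threads and one lock: the first request holds the per-sequence create lock while it performs a slow orphan destroy, and the second create request sits in mutex_lock() until the first one finishes, which matches the shape of the ll_ost00 stack above (mutex_lock inside ofd_create_hdl). Below is a minimal pthread sketch of that pattern; the timings and names are made up for illustration.

/*
 * Two-thread model of the blocking seen in the stack trace: the orphan-destroy
 * request holds the per-sequence lock long enough that the second create
 * request sits in mutex_lock(), which is what the hung-task watchdog reports.
 * Timings and names are illustrative only.
 */
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static pthread_mutex_t create_lock = PTHREAD_MUTEX_INITIALIZER;

/* First OST_CREATE request: performs the (slow) orphan destroy. */
static void *orphan_destroy_req(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&create_lock);
    printf("req1: starting orphan destroy (holding create lock)\n");
    sleep(5);                       /* stands in for a very long destroy pass */
    printf("req1: orphan destroy done\n");
    pthread_mutex_unlock(&create_lock);
    return NULL;
}

/* Second OST_CREATE request: blocks on the same lock, like ll_ost00_007. */
static void *create_req(void *arg)
{
    (void)arg;
    time_t start = time(NULL);
    pthread_mutex_lock(&create_lock);
    printf("req2: got create lock after %ld s\n", (long)(time(NULL) - start));
    pthread_mutex_unlock(&create_lock);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, orphan_destroy_req, NULL);
    sleep(1);                       /* let req1 take the lock first */
    pthread_create(&t2, NULL, create_req, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

The linked LU-11399 tracks the longer-term change of using separate locks for orphan destroy and object re-create at the OFD, so that a create request is not stuck behind a long destroy pass.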