Details
Bug
Resolution: Fixed
Blocker
Lustre 2.11.0
None
3
9223372036854775807
Description
https://testing.hpdd.intel.com/test_sets/713fb70e-119d-11e8-a6ad-52540065bddc
It fails very often:
Error: 'Timeout occurred after 227 mins, last suite running was sanity-flr, restarting cluster to continue tests' Failure Rate: 41.18% of most recent 17 runs, 22 skipped (all branches)
On a client:
[10077.749514] Lustre: DEBUG MARKER: == sanity-flr test 43: mirror pick on write ========================================================== 12:14:55 (1518610495)
[10320.098013] INFO: task dd:23892 blocked for more than 120 seconds.
[10320.114074] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10320.116709] dd D ffff88007b96dee0 0 23892 23675 0x00000080
[10320.119330] Call Trace:
[10320.125475] [<ffffffff810c6632>] ? default_wake_function+0x12/0x20
[10320.150782] [<ffffffff810bc2d8>] ? __wake_up_common+0x58/0x90
[10320.154162] [<ffffffff816ab8a9>] schedule+0x29/0x70
[10320.170306] [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
[10320.176336] [<ffffffffc09f5e88>] ? ptlrpc_set_add_new_req+0xd8/0x150 [ptlrpc]
[10320.178829] [<ffffffffc0bd50c0>] ? osc_io_ladvise_end+0x50/0x50 [osc]
[10320.181237] [<ffffffffc0a25ffb>] ? ptlrpcd_add_req+0x22b/0x300 [ptlrpc]
[10320.183701] [<ffffffffc09fbe99>] ? ptlrpc_request_bufs_pack+0x1d9/0x480 [ptlrpc]
[10320.186106] [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
[10320.188437] [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
[10320.190651] [<ffffffffc0bd5284>] osc_io_setattr_end+0xc4/0x180 [osc]
[10320.192955] [<ffffffffc0bd63d0>] ? osc_io_setattr_start+0x260/0x700 [osc]
[10320.195231] [<ffffffffc0c28490>] ? lov_io_iter_fini_wrapper+0x50/0x50 [lov]
[10320.197659] [<ffffffffc0832e8d>] cl_io_end+0x5d/0x150 [obdclass]
[10320.199802] [<ffffffffc0c2856b>] lov_io_end_wrapper+0xdb/0xe0 [lov]
[10320.202033] [<ffffffffc0c28bc5>] lov_io_call.isra.5+0x85/0x140 [lov]
[10320.204170] [<ffffffffc0c28cb6>] lov_io_end+0x36/0xb0 [lov]
[10320.206291] [<ffffffffc0832e8d>] cl_io_end+0x5d/0x150 [obdclass]
[10320.208353] [<ffffffffc083551f>] cl_io_loop+0x13f/0xc70 [obdclass]
[10320.210509] [<ffffffffc0cd1460>] cl_setattr_ost+0x250/0x3c0 [lustre]
[10320.212550] [<ffffffffc0cab495>] ll_setattr_raw+0x1165/0x1270 [lustre]
[10320.214631] [<ffffffffc0cab60c>] ll_setattr+0x6c/0xd0 [lustre]
[10320.217542] [<ffffffff81220fc1>] notify_change+0x2c1/0x420
[10320.228621] [<ffffffff812b45b6>] ? security_inode_need_killpriv+0x16/0x20
[10320.230605] [<ffffffff81200ad5>] do_truncate+0x75/0xc0
[10320.232485] [<ffffffff81211d97>] do_last+0x627/0x12c0
[10320.234244] [<ffffffff81212af2>] path_openat+0xc2/0x490
[10320.236065] [<ffffffff811af746>] ? do_read_fault.isra.44+0xe6/0x130
[10320.237871] [<ffffffff8121508b>] do_filp_open+0x4b/0xb0
[10320.239642] [<ffffffff8122233a>] ? __alloc_fd+0x8a/0x130
[10320.241313] [<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
[10320.243068] [<ffffffff816b8945>] ? system_call_after_swapgs+0x172/0x214
[10320.244820] [<ffffffff81201cde>] SyS_open+0x1e/0x20
[10320.246469] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[10320.248096] [<ffffffff816b889d>] ? system_call_after_swapgs+0xca/0x214
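To read the trace above more easily, it can help to drop the frames marked "?", which the kernel prints for addresses it found on the stack but could not confirm as part of the live call chain. What remains suggests that dd is sleeping in wait_for_completion, called from osc_io_setattr_end, on a path that starts at open and goes through do_truncate. The following is a minimal Python sketch of that filtering, not part of the original report; the names FRAME_RE, reliable_frames, first_module_frame and SAMPLE are hypothetical, and the regular expression assumes the frame layout seen in this log.

```python
import re

# Assumed frame layout (taken from the log in this ticket):
#   [timestamp] [<address>] [? ]symbol+0xoffset/0xsize [module]
FRAME_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<unreliable>\?\s+)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>[A-Za-z_]\w*)\])?"
)


def reliable_frames(log_text):
    """Return (symbol, module) pairs for frames not marked '?', innermost first.

    Frames without a trailing [module] tag are reported as 'kernel'.
    Works on both one-frame-per-line and flattened single-line logs.
    """
    frames = []
    for match in FRAME_RE.finditer(log_text):
        if match.group("unreliable"):
            continue  # '?' frames may be stale stack contents
        frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


def first_module_frame(frames):
    """Return the innermost frame that belongs to a loadable module, or None."""
    for symbol, module in frames:
        if module != "kernel":
            return symbol, module
    return None


# A few frames copied from the trace in this ticket.
SAMPLE = """
[10320.154162] [<ffffffff816ab8a9>] schedule+0x29/0x70
[10320.170306] [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
[10320.176336] [<ffffffffc09f5e88>] ? ptlrpc_set_add_new_req+0xd8/0x150 [ptlrpc]
[10320.186106] [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
[10320.190651] [<ffffffffc0bd5284>] osc_io_setattr_end+0xc4/0x180 [osc]
[10320.202033] [<ffffffffc0c28bc5>] lov_io_call.isra.5+0x85/0x140 [lov]
"""

if __name__ == "__main__":
    for symbol, module in reliable_frames(SAMPLE):
        print(f"{module:>8}  {symbol}")
```

Applied to the full trace, first_module_frame(reliable_frames(...)) would point at the innermost Lustre function the task was executing when it blocked, which is usually the first place to look.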