Details
Type: Bug
Resolution: Unresolved
Priority: Minor
Affects Version: Lustre 2.12.6
Severity: 3
Description
sanityn test_40a fails with 'parallel operation is blocked'.
We’ve seen this failure three times:
18 AUG 2020 DNE/ZFS 2.12.5.20 for ticket/patch LU-13471/39576 - https://testing.whamcloud.com/test_sets/15350d90-9f69-4900-b036-9fc3f73140e5
07 NOV 2020 DNE/ZFS 2.12.5.83 for branch testing - https://testing.whamcloud.com/test_sets/bbd2f426-8f1b-4e0d-a27f-951e75976ab6
07 DEC 2020 DNE/ldiskfs 2.12.6 RC2 branch testing - https://testing.whamcloud.com/test_sets/bc0cbb1c-6e57-40df-9eef-47d8b7b8962d
For the RC2 failure, the suite_log shows the following output for the test:
== sanityn test 40a: pdirops: create vs others ======================================================= 17:18:20 (1607361500)
CMD: trevis-52vm5,trevis-52vm6 /usr/sbin/lctl set_param -n ldlm.namespaces.*mdt*.lru_size=clear
CMD: trevis-52vm5,trevis-52vm6 /usr/sbin/lctl get_param ldlm.namespaces.*mdt*.lock_unused_count ldlm.namespaces.*mdt*.lock_count
ldlm.namespaces.mdt-lustre-MDT0000_UUID.lock_count=57
ldlm.namespaces.mdt-lustre-MDT0001_UUID.lock_count=1
CMD: trevis-52vm5 lctl set_param fail_loc=0x80000145
fail_loc=0x80000145
No conflict
No conflict
No conflict
No conflict
No conflict
No conflict
Conflict
 sanityn test_40a: @@@@@@ FAIL: parallel operation is blocked
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:5907:error()
  = /usr/lib64/lustre/tests/sanityn.sh:1517:test_40a()
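When comparing the three failing runs, it may help to reduce each suite_log to the sequence of per-operation results and the final error. The following is a minimal sketch in Python, not part of the Lustre test framework: the `summarize` helper is hypothetical, and the sample text is an excerpt of the output quoted above.

```python
import re

# Excerpt of the suite_log output quoted in this ticket.
SUITE_LOG = """\
CMD: trevis-52vm5 lctl set_param fail_loc=0x80000145
fail_loc=0x80000145
No conflict
No conflict
No conflict
No conflict
No conflict
No conflict
Conflict
 sanityn test_40a: @@@@@@ FAIL: parallel operation is blocked
"""


def summarize(log):
    """Return (list of conflict results in order, FAIL message or None).

    Hypothetical helper for triage; it only looks at the literal
    'No conflict' / 'Conflict' lines and the '@@@@@@ FAIL:' marker.
    """
    results = [
        line.strip()
        for line in log.splitlines()
        if line.strip() in ("No conflict", "Conflict")
    ]
    fail = re.search(r"@@@@@@ FAIL: (.*)", log)
    return results, (fail.group(1).strip() if fail else None)


results, failure = summarize(SUITE_LOG)
print(results.count("No conflict"), results.count("Conflict"), failure)
```

For this excerpt the summary is six "No conflict" results followed by one "Conflict", immediately before the 'parallel operation is blocked' error.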
The only interesting messages in the console logs are the cfs_fail_timeout ‘errors’ shown below, although these also appear in the MDS console when 40a passes. The console log for MDS1/3 (vm5) shows:
[31916.082787] Lustre: DEBUG MARKER: == sanityn test 40a: pdirops: create vs others ======================================================= 17:18:20 (1607361500)
[31916.531769] Lustre: DEBUG MARKER: /usr/sbin/lctl set_param -n ldlm.namespaces.*mdt*.lru_size=clear
[31917.009765] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param ldlm.namespaces.*mdt*.lock_unused_count ldlm.namespaces.*mdt*.lock_count
[31917.449680] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x80000145
[31917.647130] LustreError: 8427:0:(fail.c:129:__cfs_fail_timeout_set()) cfs_fail_timeout id 145 sleeping for 15000ms
[31917.648962] LustreError: 8427:0:(fail.c:129:__cfs_fail_timeout_set()) Skipped 4 previous similar messages
[31932.650062] LustreError: 8427:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 145 awake
[31932.651716] LustreError: 8427:0:(fail.c:133:__cfs_fail_timeout_set()) Skipped 4 previous similar messages
[31933.827926] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanityn test_40a: @@@@@@ FAIL: parallel operation is blocked
[31934.090878] Lustre: DEBUG MARKER: sanityn test_40a: @@@@@@ FAIL: parallel operation is blocked
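As a quick sanity check on the console timestamps, the sketch below (Python, hypothetical `stamp` helper, log lines copied from this ticket) measures how long the injected delay actually lasted and how soon after wake-up the failure was marked. It also masks the low 16 bits of the fail_loc value; treating those bits as the fail id is an assumption, though it agrees with the "id 145" printed in the log.

```python
import re

# Console lines copied from the MDS1/3 (vm5) log quoted in this ticket.
CONSOLE = """\
[31917.647130] LustreError: 8427:0:(fail.c:129:__cfs_fail_timeout_set()) cfs_fail_timeout id 145 sleeping for 15000ms
[31932.650062] LustreError: 8427:0:(fail.c:133:__cfs_fail_timeout_set()) cfs_fail_timeout id 145 awake
[31933.827926] Lustre: DEBUG MARKER: /usr/sbin/lctl mark sanityn test_40a: @@@@@@ FAIL: parallel operation is blocked
"""


def stamp(pattern, log):
    """Return the kernel timestamp (seconds) of the first line matching pattern."""
    m = re.search(r"^\[\s*(\d+\.\d+)\].*" + pattern, log, re.MULTILINE)
    return float(m.group(1)) if m else None


sleep_at = stamp(r"id 145 sleeping for 15000ms", CONSOLE)
awake_at = stamp(r"id 145 awake", CONSOLE)
fail_at = stamp(r"FAIL: parallel operation is blocked", CONSOLE)

slept = awake_at - sleep_at          # duration of the injected delay
fail_gap = fail_at - awake_at        # wake-up to first FAIL marker

# Assumption: the low 16 bits of fail_loc carry the fail id.
fail_id = 0x80000145 & 0xFFFF

print(f"slept {slept:.3f}s, FAIL marked {fail_gap:.3f}s after wake, id {fail_id:x}")
```

On these lines the delay is close to the requested 15000ms, and the failure is marked a little over a second after the thread wakes, so the delay itself behaved as configured in this run.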