Details
-
Bug
-
Resolution: Fixed
-
Minor
-
Lustre 2.14.0
-
3
Description
recovery-small test_143 has been failing in interop testing since 19 April 2020 when the Lustre server version is < 2.13.53.62 and the Lustre client version is >= 2.13.53.62. The failure does not occur with Lustre 2.12.5 or 2.12.6 servers, but it does occur with 2.13.0 servers.
Looking at the suite_log for the latest failure at https://testing.whamcloud.com/test_sets/8adef6a4-82c3-4286-811b-c3600c371395, we can see that an MDD orphan cleanup thread is still running after recovery completes:
trevis-17vm4: trevis-17vm4.trevis.whamcloud.com: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475
trevis-17vm4: *.lustre-MDT0000.recovery_status status: COMPLETE
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
Waiting 90s for '0'
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
…
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
Update not seen after 90s: want '0' got '1'
 recovery-small test_143: @@@@@@ FAIL: MDD orphan cleanup thread not quit
  Trace dump:
  = /usr/lib64/lustre/tests/test-framework.sh:6273:error()
  = /usr/lib64/lustre/tests/recovery-small.sh:3030:test_143()
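
For readers unfamiliar with the check that fails here, the polling pattern visible in the log can be sketched roughly as below. This is only an illustration, not the code in test-framework.sh: the helper names poll_until and count_orphan_threads are hypothetical, and the sketch runs pgrep locally, whereas the test runs it on the MDS node (trevis-17vm4 in the log).

```shell
#!/bin/bash
# Hypothetical helper, NOT the test-framework.sh implementation:
# run a command repeatedly until its output equals the wanted value
# or the timeout (in seconds) expires.
poll_until() {
    local want=$1 timeout=$2 interval=$3
    shift 3
    local got elapsed=0

    echo "Waiting ${timeout}s for '${want}'"
    while :; do
        got=$("$@" 2>/dev/null)
        [ "$got" = "$want" ] && return 0
        [ "$elapsed" -ge "$timeout" ] && break
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "Update not seen after ${timeout}s: want '${want}' got '${got}'"
    return 1
}

# Hypothetical helper: count processes whose name matches the pattern
# used in the log for the MDD orphan cleanup thread.
count_orphan_threads() {
    pgrep 'orph_.*-MDD' | wc -l | tr -d ' '
}

# Example use, mirroring the 90s wait in the log:
# poll_until 0 90 1 count_orphan_threads ||
#     echo "MDD orphan cleanup thread not quit"
```

In the failing run the count stays at 1 for the full 90 seconds, so a check of this kind times out and the test reports that the orphan cleanup thread did not quit.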