[LU-14330] Interop: recovery-small test 143 fails with 'MDD orphan cleanup thread not quit' Created: 13/Jan/21 Updated: 22/Jan/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.14.0 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | James Nunez (Inactive) | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | interop, tests |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
recovery-small test_143 fails for interop testing starting on 19 April 2020 for Lustre server version < 2.13.53.62 and Lustre client version >= 2.13.53.62. This failure does not happen for Lustre servers 2.12.5 and 2.12.6, but we do see this failure for 2.13.0 servers.

Looking at the suite_log for the latest failure at https://testing.whamcloud.com/test_sets/8adef6a4-82c3-4286-811b-c3600c371395, we can still see MDD orphan threads:

trevis-17vm4: trevis-17vm4.trevis.whamcloud.com: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475
trevis-17vm4: *.lustre-MDT0000.recovery_status status: COMPLETE
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
Waiting 90s for '0'
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
…
CMD: trevis-17vm4 pgrep orph_.*-MDD | wc -l
Update not seen after 90s: want '0' got '1'
recovery-small test_143: @@@@@@ FAIL: MDD orphan cleanup thread not quit
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6273:error()
= /usr/lib64/lustre/tests/recovery-small.sh:3030:test_143()
|
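For anyone reproducing this by hand on the MDS, the check in the log amounts to polling a thread count until it reaches 0 or a 90s timeout expires. The following is a minimal sketch of that pattern, not the code in test-framework.sh: the helper names `poll_until` and `count_orphan_threads` are hypothetical, and only the `pgrep orph_.*-MDD | wc -l` command and the 90s / '0' values are taken from the log above.

```shell
#!/bin/bash
# Hypothetical helper (NOT part of test-framework.sh): run a command
# repeatedly until its output equals the wanted value or the timeout expires.
# Usage: poll_until WANT TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
poll_until() {
    local want=$1 timeout=$2 interval=$3
    shift 3
    local waited=0 got
    while :; do
        got=$("$@")
        if [ "$got" = "$want" ]; then
            echo "Updated after ${waited}s: want '$want' got '$got'"
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            echo "Update not seen after ${timeout}s: want '$want' got '$got'"
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}

# Hypothetical wrapper around the command shown in the log: count processes
# whose name matches the MDD orphan cleanup thread pattern.
count_orphan_threads() {
    # tr strips any padding that some wc implementations add
    pgrep 'orph_.*-MDD' | wc -l | tr -d ' '
}

# Example invocation on the MDS node (commented out so that sourcing this
# file has no side effects):
# poll_until 0 90 1 count_orphan_threads ||
#     echo "MDD orphan cleanup thread not quit"
```

Running the commented invocation on an affected server should show whether the orphan cleanup thread ever exits after recovery completes, independent of the test framework.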