[LU-12747] sanity: test 811 fail with "MDD orphan cleanup thread not quit" Created: 11/Sep/19 Updated: 13/Jan/21 Resolved: 14/Feb/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | Andreas Dilger |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
|
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Lai Siyao <lai.siyao@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/83dd0b3a-d3ea-11e9-9fc9-52540065bddc

onyx-33vm4: == rpc test complete, duration -o sec ================================================================ 16:37:21 (1568133441)
onyx-33vm4: onyx-33vm4.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
CMD: onyx-33vm4 e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: onyx-33vm4 e2label /dev/mapper/mds1_flakey 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
CMD: onyx-33vm4 e2label /dev/mapper/mds1_flakey 2>/dev/null
Started lustre-MDT0000
CMD: onyx-33vm4 pgrep orph_.*-MDD
sanity test_811: @@@@@@ FAIL: MDD orphan cleanup thread not quit
Trace dump:
= /usr/lib64/lustre/tests/test-framework.sh:6115:error()
= /usr/lib64/lustre/tests/sanity.sh:21633:test_811()
= /usr/lib64/lustre/tests/test-framework.sh:6417:run_one()
= /usr/lib64/lustre/tests/test-framework.sh:6456:run_one_logged()
= /usr/lib64/lustre/tests/test-framework.sh:6302:run_test()
= /usr/lib64/lustre/tests/sanity.sh:21635:main()
|
| Comments |
| Comment by Andreas Dilger [ 31/Jan/20 ] |
|
+1 on master https://testing.whamcloud.com/test_sets/64398a3e-4243-11ea-b083-52540065bddc |
| Comment by Andreas Dilger [ 01/Feb/20 ] |
|
This seems to fail intermittently, but could be made more robust. |
| Comment by Andreas Dilger [ 01/Feb/20 ] |
[ 8541.797642] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect
[ 8542.067572] Lustre: DEBUG MARKER: pgrep orph_.*-MDD
[ 8542.196104] Lustre: lustre-MDT0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted.
[ 8542.253891] LustreError: 27822:0:(osd_handler.c:278:osd_idc_find_or_init()) can't lookup: rc = -2
[ 8542.255510] Lustre: 27822:0:(mdd_orphans.c:340:mdd_orphan_destroy()) lustre-MDD0000: orphan 0x200006991:0xd:0x0 [0x200006991:0xd:0x0] doesn't exist
[ 8542.706917] Lustre: DEBUG MARKER: sanity test_811: @@@@@@ FAIL: MDD orphan cleanup thread not quit
The pgrep runs shortly before mdd_orphan_destroy() finishes, so a slightly longer wait would fix this. |
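The "slightly longer wait" suggested above could be a bounded polling loop rather than a single pgrep. The following is only a minimal sketch of that idea, not the change that landed in the patch: the helper name `wait_for_thread_exit` and the 20-second limit are hypothetical, and it assumes pgrep runs on the node that hosts the MDT.

```shell
# Hypothetical helper (not part of test-framework.sh): poll until no
# process matches the pattern, or give up after max_wait seconds.
wait_for_thread_exit() {
    local pattern="$1"
    local max_wait="${2:-20}"    # seconds; illustrative default
    local waited=0

    # pgrep exits 0 while at least one matching process still exists.
    while pgrep "$pattern" > /dev/null 2>&1; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1             # thread still present after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0                     # no matching process left
}

# Example use, assuming it runs on the MDS node:
#   wait_for_thread_exit 'orph_.*-MDD' 20 ||
#       echo "MDD orphan cleanup thread not quit"
```

Polling once per second keeps the common case fast, since the loop returns as soon as the thread exits, while still tolerating the short delay between the end of recovery and the completion of mdd_orphan_destroy() seen in the log.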
| Comment by Gerrit Updater [ 01/Feb/20 ] |
|
Andreas Dilger (adilger@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/37395 |
| Comment by Gerrit Updater [ 14/Feb/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37395/ |
| Comment by Peter Jones [ 14/Feb/20 ] |
|
Landed for 2.14 |