[LU-16065] replay-single test_81a: rm remote dir failed Created: 02/Aug/22 Updated: 05/Jul/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d5f0d54d-a199-4071-9e41-e86f87f40229

test_81a failed with the following error:

rm remote dir failed

env: DNE zfs

[ 9499.983850] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep, fail MDT1 ========================================================== 21:34:23 \(1658093663\)
[ 9500.318282] Lustre: DEBUG MARKER: == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep, fail MDT1 ========================================================== 21:34:23 (1658093663)
[ 9501.082644] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x1701
[ 9501.537124] Lustre: *** cfs_fail_loc=1701, val=2147483648***
[ 9501.538189] Lustre: Skipped 1 previous similar message
[ 9501.539108] LustreError: 202414:0:(ldlm_lib.c:3218:target_send_reply_msg()) @@@ dropping reply req@00000000b254ad1d x1738627440004288/t377957122055(0) o1000->lustre-MDT0001-mdtlov_UUID@10.240.26.120@tcp:421/0 lens 1488/4320 e 0 to 0 dl 1658093671 ref 1 fl Interpret:/0/0 rc 0/0 job:'osp_up0-1.0'
[ 9527.585438] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9528.133003] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9529.436192] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 9529.804116] Lustre: DEBUG MARKER: onyx-77vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 9533.839594] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9533.858943] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9534.426529] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9534.437882] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9535.718546] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm2.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9535.725722] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm1.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.131130] Lustre: DEBUG MARKER: onyx-77vm1.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.148517] Lustre: DEBUG MARKER: onyx-77vm2.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.625978] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9536.637584] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9537.052563] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9537.059777] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9538.811568] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_81a: @@@@@@ FAIL: rm remote dir failed
[ 9539.151269] Lustre: DEBUG MARKER: replay-single test_81a: @@@@@@ FAIL: rm remote dir failed

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
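For triage, the console log above can be reduced to its DEBUG MARKER lines to see how long after the fail_loc injection the FAIL marker was written. The following is only a rough sketch, not part of the Lustre test framework: the helper names `parse_markers` and `first_marker` are made up for illustration, and the sample text is a subset of the lines quoted in this ticket.

```python
import re

# Matches a dmesg-style line: "[ 9501.082644] Lustre: DEBUG MARKER: <text>"
MARKER_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+Lustre: DEBUG MARKER:\s+(.*)")


def parse_markers(log_text):
    """Return (timestamp_seconds, marker_text) for each DEBUG MARKER line."""
    markers = []
    for line in log_text.splitlines():
        m = MARKER_RE.search(line)
        if m:
            markers.append((float(m.group(1)), m.group(2).strip()))
    return markers


def first_marker(markers, needle):
    """Return the first (timestamp, text) whose text contains needle, else None."""
    for ts, text in markers:
        if needle in text:
            return ts, text
    return None


# A subset of the log lines from this ticket (hypothetical input for the sketch).
SAMPLE = """\
[ 9501.082644] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x1701
[ 9501.537124] Lustre: *** cfs_fail_loc=1701, val=2147483648***
[ 9538.811568] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_81a: @@@@@@ FAIL: rm remote dir failed
[ 9539.151269] Lustre: DEBUG MARKER: replay-single test_81a: @@@@@@ FAIL: rm remote dir failed
"""

markers = parse_markers(SAMPLE)
inject = first_marker(markers, "fail_loc=")
failure = first_marker(markers, "@@@@@@ FAIL:")

if inject and failure:
    print(f"fail_loc set at {inject[0]:.6f}, FAIL marked at {failure[0]:.6f}")
    print(f"elapsed: {failure[0] - inject[0]:.1f} s")
```

Lines that are not DEBUG MARKER entries (such as the `cfs_fail_loc` notice) are skipped by the pattern, so only the test-framework markers take part in the timing.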
| Comments |
| Comment by Andreas Dilger [ 08/Aug/22 ] |
|
It looks like this same subtest has been failing for a long time (back to 2021-01-01 at least), and was attributed to |
| Comment by James A Simmons [ 23/Jan/23 ] |
|
Is this still true? |
| Comment by Andreas Dilger [ 23/Jan/23 ] |
|
This subtest failed after a long series of other subtest failures started by test_70b failing with LU-10616. I suspect it is just fallout from that issue (incomplete cleanup, etc.).

There have only been 3 failures of this subtest in the past 6 months:

Two of them look like stand-alone failures on review patches:

rm: cannot remove '/mnt/lustre/d81a.replay-single': Directory not empty

and one is also likely fallout after test_70b failed. |