[LU-16065] replay-single test_81a: rm remote dir failed Created: 02/Aug/22  Updated: 05/Jul/23

Status: Open
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.15.1
Fix Version/s: None

Type: Bug Priority: Minor
Reporter: Maloo Assignee: WC Triage
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Related
is related to LU-10616 replay-single test_70b fails with 'ru... Open
is related to LU-6864 DNE3: Support multiple modify RPCs in... Resolved
is related to LU-16336 LFSCK should fix inconsistencies caus... Open
is related to LU-15335 replay-single test_81b: lfs mkdir failed Open
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

This issue was created by maloo for sarah <sarah@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/d5f0d54d-a199-4071-9e41-e86f87f40229

test_81a failed with the following error:

rm remote dir failed

env: DNE zfs
replay-single tests 81a, 81b, 81f, and 81g all failed with this error

[ 9499.983850] Lustre: DEBUG MARKER: /usr/sbin/lctl mark == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep,  fail MDT1 ========================================================== 21:34:23 \(1658093663\)
[ 9500.318282] Lustre: DEBUG MARKER: == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep, fail MDT1 ========================================================== 21:34:23 (1658093663)
[ 9501.082644] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x1701
[ 9501.537124] Lustre: *** cfs_fail_loc=1701, val=2147483648***
[ 9501.538189] Lustre: Skipped 1 previous similar message
[ 9501.539108] LustreError: 202414:0:(ldlm_lib.c:3218:target_send_reply_msg()) @@@ dropping reply  req@00000000b254ad1d x1738627440004288/t377957122055(0) o1000->lustre-MDT0001-mdtlov_UUID@10.240.26.120@tcp:421/0 lens 1488/4320 e 0 to 0 dl 1658093671 ref 1 fl Interpret:/0/0 rc 0/0 job:'osp_up0-1.0'
[ 9527.585438] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9528.133003] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9529.436192] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 9529.804116] Lustre: DEBUG MARKER: onyx-77vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 9533.839594] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9533.858943] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9534.426529] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9534.437882] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 9535.718546] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm2.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9535.725722] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-77vm1.onyx.whamcloud.com: executing wait_import_state_mount \(FULL\|IDLE\) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.131130] Lustre: DEBUG MARKER: onyx-77vm1.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.148517] Lustre: DEBUG MARKER: onyx-77vm2.onyx.whamcloud.com: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
[ 9536.625978] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9536.637584] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9537.052563] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9537.059777] Lustre: DEBUG MARKER: mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
[ 9538.811568] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-single test_81a: @@@@@@ FAIL: rm remote dir failed 
[ 9539.151269] Lustre: DEBUG MARKER: replay-single test_81a: @@@@@@ FAIL: rm remote dir failed
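When triaging a console log like the one above, it can help to pull out only the DEBUG MARKER lines and the FAIL tag. The following is a minimal sketch, not part of the Lustre test framework: the helper names `find_markers` and `find_failures` are hypothetical, and the line format is assumed from the excerpt in this ticket only.

```python
import re

# Assumed format, taken from the excerpt in this ticket:
# "[ 9538.811568] Lustre: DEBUG MARKER: <text>"
MARKER_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+Lustre: DEBUG MARKER:\s*(.*)$")


def find_markers(lines):
    """Return (timestamp, text) for every DEBUG MARKER line (hypothetical helper)."""
    markers = []
    for line in lines:
        m = MARKER_RE.match(line.strip())
        if m:
            markers.append((float(m.group(1)), m.group(2).strip()))
    return markers


def find_failures(markers):
    """Keep markers carrying the FAIL tag, skipping the '/usr/sbin/lctl mark' echo."""
    return [
        (ts, text)
        for ts, text in markers
        if "@@@@@@ FAIL:" in text and not text.startswith("/usr/sbin/lctl mark")
    ]


# Sample lines copied from the log in this ticket.
sample = [
    "[ 9501.082644] Lustre: DEBUG MARKER: lctl set_param fail_loc=0x1701",
    "[ 9501.537124] Lustre: *** cfs_fail_loc=1701, val=2147483648***",
    "[ 9538.811568] Lustre: DEBUG MARKER: /usr/sbin/lctl mark  replay-single test_81a: @@@@@@ FAIL: rm remote dir failed ",
    "[ 9539.151269] Lustre: DEBUG MARKER: replay-single test_81a: @@@@@@ FAIL: rm remote dir failed",
]

markers = find_markers(sample)
failures = find_failures(markers)

# Seconds between setting fail_loc and the reported failure.
fail_loc_ts = next(ts for ts, text in markers if "fail_loc=" in text)
elapsed = failures[0][0] - fail_loc_ts

for ts, text in failures:
    print(f"{ts:.6f} {text}")
print(f"elapsed since fail_loc was set: {elapsed:.3f}s")
```

On the sample lines this reports a single failure and the gap between the `fail_loc` injection and the FAIL marker, which covers the MDT1 failover and recovery wait seen in the log.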

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
replay-single test_81a - rm remote dir failed



 Comments   
Comment by Andreas Dilger [ 08/Aug/22 ]

It looks like this same subtest has been failing for a long time (back to 2021-01-01 at least), and the failures were attributed to LU-6864. However, that ticket is closed and was more of a feature patch, so it is better to use this ticket going forward.

Comment by James A Simmons [ 23/Jan/23 ]

Is this still true?

Comment by Andreas Dilger [ 23/Jan/23 ]

This subtest failed after a long series of other subtest failures started by test_70b failing with LU-10616. I suspect it is just fallout from that issue (incomplete cleanup, etc.).

There have only been 3 failures of this subtest in the past 6 months:

https://testing.whamcloud.com/search?horizon=15552000&status%5B%5D=FAIL&test_set_script_id=f6a12204-32c3-11e0-a61c-52540025f9ae&sub_test_script_id=fc2b3a3e-32c3-11e0-a61c-52540025f9ae&source=sub_tests#redirect

Two of them look like stand-alone failures on review patches:


rm: cannot remove '/mnt/lustre/d81a.replay-single': Directory not empty
replay-single test_81a: @@@@@@ FAIL: rmdir failed


and one is also likely fallout after test_70b failed.
