[LU-15335] replay-single test_81b: lfs mkdir failed
| Created: | 07/Dec/21 | Updated: | 05/Jul/23 |
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Chris Horn <hornc@cray.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/e507a0e9-992c-480a-81b1-cede4ebbbfcc

test_81b failed with the following error: lfs mkdir failed

== replay-single test 81b: DNE: unlink remote dir, drop MDT0 update reply, fail MDT0 ========================================================== 23:38:12 (1638833892)
lfs mkdir: dirstripe error on '/mnt/lustre/d81b.replay-single/remote_dir': No space left on device
lfs setdirstripe: cannot create dir '/mnt/lustre/d81b.replay-single/remote_dir': No space left on device
replay-single test_81b: @@@@@@ FAIL: lfs mkdir failed

MDS 2, 4 reports:
[ 6769.074349] Lustre: DEBUG MARKER: == replay-single test 81b: DNE: unlink remote dir, drop MDT0 update reply, fail MDT0 ========================================================== 23:38:12 (1638833892)
[ 6769.447476] Lustre: 11079:0:(llog_cat.c:101:llog_cat_new_log()) lustre-MDT0001-osd: there are no more free slots in catalog [0x1:0x40000400:0x2]:0
[ 6769.449595] LustreError: 11079:0:(update_trans.c:985:top_trans_stop()) lustre-MDT0001-osd: write updates failed: rc = -28
[ 6769.902206] Lustre: DEBUG MARKER: /usr/sbin/lctl mark replay-single test_81b: @@@@@@ FAIL: lfs mkdir failed
[ 6770.337063] Lustre: DEBUG MARKER: replay-single test_81b: @@@@@@ FAIL: lfs mkdir failed

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
|
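The failing step is the initial creation of the remote directory on the second MDT. A minimal sketch of that step follows; the MDT index (1) and the mkdir -p of the parent are assumptions based on the test name and error messages above, not taken from the test script:

# parent test directory on the default MDT (assumed)
mkdir -p /mnt/lustre/d81b.replay-single
# create the remote directory with its inode on MDT0001; this is the call
# that fails here with "No space left on device" (rc = -28)
lfs mkdir -i 1 /mnt/lustre/d81b.replay-single/remote_dir
# equivalent form via setdirstripe
lfs setdirstripe -i 1 /mnt/lustre/d81b.replay-single/remote_dir

Per the console log above, the -28 is not reported by the backend filesystem being full but by llog_cat_new_log() finding no free slots in the update-log catalog on lustre-MDT0001-osd.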
| Comments |
| Comment by Etienne Aujames [ 05/Jul/23 ] |
|
+1 on b2_15: https://testing.whamcloud.com/test_sets/4bcb1eb7-8b31-4b85-82d8-ee45432e786f

Similar type of failure for replay-single.sh test_81a:
rm: cannot remove '/mnt/lustre/d81a.replay-single': Directory not empty
replay-single test_81a: @@@@@@ FAIL: rmdir failed

On the remote MDT:
[ 6759.178532] Lustre: DEBUG MARKER: == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep, fail MDT1 ========================================================== 16:37:23 (1688488643)
[ 6760.608314] Lustre: DEBUG MARKER: sync; sync; sync
[ 6762.701358] Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0001 notransno
[ 6763.322377] Lustre: DEBUG MARKER: /usr/sbin/lctl --device lustre-MDT0001 readonly
[ 6764.014219] Lustre: DEBUG MARKER: /usr/sbin/lctl mark mds2 REPLAY BARRIER on lustre-MDT0001
[ 6764.321882] Lustre: DEBUG MARKER: mds2 REPLAY BARRIER on lustre-MDT0001
[ 6764.647328] Lustre: DEBUG MARKER: grep -c /mnt/lustre-mds2' ' /proc/mounts || true
[ 6765.261342] Lustre: DEBUG MARKER: umount -d /mnt/lustre-mds2
[ 6765.582687] Lustre: lustre-MDT0001: Not available for connect from 10.240.23.84@tcp (stopping)
[ 6766.647124] Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping)
[ 6769.844426] Lustre: lustre-MDT0001: Not available for connect from 10.240.23.86@tcp (stopping)
[ 6769.846155] Lustre: Skipped 9 previous similar messages
[ 6770.746807] LustreError: 137728:0:(client.c:1256:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@00000000d41d4268 x1770501781186432/t0(0) o1000->lustre-MDT0000-osp-MDT0001@10.240.23.87@tcp:24/4 lens 304/4320 e 0 to 0 dl 0 ref 2 fl Rpc:QU/0/ffffffff rc 0/-1 job:'umount.0'
[ 6770.751080] LustreError: 137728:0:(osp_object.c:629:osp_attr_get()) lustre-MDT0000-osp-MDT0001: osp_attr_get update error [0x200000401:0x1:0x0]: rc = -5
[ 6771.170857] Lustre: DEBUG MARKER: lsmod | grep lnet > /dev/null &&
lctl dl | grep ' ST ' || true
[ 6771.808589] Lustre: DEBUG MARKER: ! zpool list -H lustre-mdt2 >/dev/null 2>&1 ||
grep -q ^lustre-mdt2/ /proc/mounts ||
zpool export lustre-mdt2
[ 6782.485160] Lustre: DEBUG MARKER: hostname
[ 6783.183110] Lustre: DEBUG MARKER: lsmod | grep zfs >&/dev/null || modprobe zfs;
zpool list -H lustre-mdt2 >/dev/null 2>&1 ||
zpool import -f -o cachefile=none -o failmode=panic -d /dev/lvm-Role_MDS lustre-mdt2
[ 6784.139317] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt2/mdt2
[ 6784.757254] Lustre: DEBUG MARKER: mkdir -p /mnt/lustre-mds2; mount -t lustre -o localrecov lustre-mdt2/mdt2 /mnt/lustre-mds2
[ 6785.540066] LustreError: 138500:0:(llog_cat.c:418:llog_cat_id2handle()) lustre-MDT0000-osp-MDT0001: error opening log id [0x1:0x2abb1:0x2]:0: rc = -2
[ 6785.884440] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n health_check
[ 6786.503021] Lustre: DEBUG MARKER: PATH=/usr/lib64/lustre/tests:/usr/lib/lustre/tests:/usr/lib64/lustre/tests:/opt/iozone/bin:/opt/iozone/bin:/opt/iozone/bin:/usr/lib64/lustre/tests/mpi:/usr/lib64/lustre/tests/racer:/usr/lib64/lustre/../lustre-iokit/sgpdd-survey:/usr/lib64/lustre/tests:/us
[ 6789.418774] Lustre: DEBUG MARKER: /usr/sbin/lctl get_param -n version 2>/dev/null
[ 6789.891802] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-37vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 6789.910641] Lustre: DEBUG MARKER: /usr/sbin/lctl mark onyx-37vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 6790.338746] Lustre: DEBUG MARKER: onyx-37vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 6790.371724] Lustre: DEBUG MARKER: onyx-37vm5.onyx.whamcloud.com: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 4
[ 6790.699592] Lustre: DEBUG MARKER: zfs get -H -o value lustre:svname lustre-mdt2/mdt2 2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
[ 6790.824220] Lustre: 138503:0:(llog_cat.c:101:llog_cat_new_log()) lustre-MDT0000-osp-MDT0001: there are no more free slots in catalog [0x1:0x401:0x2]:0
[ 6791.032737] Lustre: 13458:0:(mdt_recovery.c:200:mdt_req_from_lrd()) @@@ restoring transno req@000000007df1805e x1770501704784256/t47244640326(0) o36->e64e0c52-8b16-4b00-83a8-f1b1780cfa46@10.240.23.84@tcp:683/0 lens 496/2888 e 0 to 0 dl 1688488723 ref 1 fl Interpret:/2/0 rc 0/0 job:'rmdir.0'
|
| Comment by Etienne Aujames [ 05/Jul/23 ] |
|
replay-single.sh test_81a "rmdir failed" (MDT0001 -> "there are no more free slots in catalog"): This issue seems to appear only for ZFS tests. |
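Since the client-visible error is ENOSPC while the logs point at the update-log catalog ("there are no more free slots in catalog") rather than real space exhaustion, a quick sanity check when triaging these failures is to confirm the MDTs still have blocks and inodes free. A minimal sketch, assuming the default client mount point /mnt/lustre:

# block and inode usage per MDT/OST as seen from the client
lfs df /mnt/lustre
lfs df -i /mnt/lustre

If both show plenty of free space and inodes on MDT0000/MDT0001, the ENOSPC is coming from the exhausted llog catalog rather than from the backend filesystem.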