== sanityn test 15: test out-of-space with multiple writers ===================================================================== 04:33:12 (1745555592)
[ 39.413547] Lustre: DEBUG MARKER: == sanityn test 15: test out-of-space with multiple writers ===================================================================== 04:33:12 (1745555592)
PATH=/mnt/build/lustre/tests/../tests/mpi:/mnt/build/lustre/tests/../tests/racer:/mnt/build/lustre/tests/../../lustre-iokit/sgpdd-survey:/mnt/build/lustre/tests/../tests:/mnt/build/lustre/tests/../utils/gss:/mnt/build/lustre/tests/../utils:/root/.local/bin:/root/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/mnt/build/lustre/utils:/mnt/build/lustre/tests::/mnt/build/lustre/scripts:/mnt/build/lustre-iokit/mds-survey:/mnt/build/lustre-iokit/obdfilter-survey:/mnt/build/lipe/src:/opt/iozone/bin:/usr/lib64/openmpi/bin:
Reading test skip list from /tmp/ltest.config
EXCEPT="$EXCEPT 14 55d 78 106"
mgs: Rocky Linux release 9.3 (Blue Onyx)
MGS_OS_VERSION_ID=9.3
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=151191552
MGS_OS_ID_LIKE=rhel centos fedora rocky
mds1: Rocky Linux release 9.3 (Blue Onyx)
MDS1_OS_ID=rocky
MDS1_OS_VERSION_CODE=151191552
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_VERSION_ID=9.3
ost1: Rocky Linux release 9.3 (Blue Onyx)
OST1_OS_VERSION_ID=9.3
OST1_OS_VERSION_CODE=151191552
OST1_OS_ID=rocky
OST1_OS_ID_LIKE=rhel centos fedora rocky
client: Rocky Linux release 9.3 (Blue Onyx)
CLIENT_OS_VERSION_ID=9.3
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=151191552
STRIPECOUNT=2 ORIGFREE=4796328 MAXFREE=2097152000
[ 61.914712] loop: Write error at byte offset 2191167488, length 4096.
[ 61.915295] loop: Write error at byte offset 2189426688, length 4096.
[ 61.915295] blk_print_req_error: 49 callbacks suppressed
[ 61.915295] I/O error, dev loop2, sector 4278784 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.915573] I/O error, dev loop2, sector 4276224 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.915759] LustreError: 3415:0:(osc_request.c:2438:osc_brw_redo_request()) @@@ redo for recoverable error -5 req@ffff9a07b47f9580 x1830347671373568/t4294967849(4294967849) o4->lustre-OST0001-osc-ffff9a0745d6c000@0@lo:6/4 lens 488/448 e 0 to 0 dl 1745555630 ref 3 fl Interpret:RQU/604/0 rc -5/-5 job:'dd.0' uid:0 gid:0 projid:0
[ 61.931445] loop: Write error at byte offset 2189164544, length 4096.
[ 61.931831] loop: Write error at byte offset 2187853824, length 4096.
[ 61.931831] I/O error, dev loop2, sector 4275712 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 2
[ 61.931875] loop: Write error at byte offset 2186543104, length 4096.
[ 61.931912] I/O error, dev loop2, sector 4273152 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.931999] loop: Write error at byte offset 2185232384, length 4096.
[ 61.932170] I/O error, dev loop2, sector 4270592 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.932249] I/O error, dev loop2, sector 4268032 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.960158] loop: Write error at byte offset 2193358848, length 4096.
[ 61.960239] loop: Write error at byte offset 2192048128, length 4096.
[ 61.960239] I/O error, dev loop1, sector 4283904 op 0x1:(WRITE) flags 0x0 phys_seg 4 prio class 2
[ 61.960279] loop: Write error at byte offset 2190737408, length 4096.
[ 61.960523] loop: Write error at byte offset 2189426688, length 4096.
[ 61.960523] I/O error, dev loop1, sector 4281344 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.960718] I/O error, dev loop1, sector 4278784 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 61.960756] I/O error, dev loop1, sector 4276224 op 0x1:(WRITE) flags 0x4000 phys_seg 20 prio class 2
[ 62.458045] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) @@@ redo for recoverable error -5 req@ffff9a079f412e00 x1830347671379840/t4294967773(4294967773) o4->lustre-OST0000-osc-ffff9a0745d6c000@0@lo:6/4 lens 488/448 e 0 to 0 dl 1745555631 ref 3 fl Interpret:RQU/604/0 rc -5/-5 job:'dd.0' uid:0 gid:0 projid:0
[ 62.458240] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) Skipped 20 previous similar messages
[ 64.144991] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) @@@ redo for recoverable error -5 req@ffff9a077f0a9700 x1830347671377152/t4294967870(4294967870) o4->lustre-OST0001-osc-ffff9a074acee000@0@lo:6/4 lens 488/448 e 0 to 0 dl 1745555633 ref 3 fl Interpret:RQU/604/0 rc -5/-5 job:'ptlrpcd_00_01.0' uid:0 gid:0 projid:0
[ 64.145104] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) Skipped 16 previous similar messages
[ 65.957220] Aborting journal on device dm-2-8.
[ 65.957235] Aborting journal on device dm-1-8.
[ 65.957317] LustreError: 5988:0:(osd_handler.c:1880:osd_trans_commit_cb()) transaction @0xffff9a079f7b3200 commit error: 2
[ 66.353992] LDISKFS-fs error (device dm-1): ldiskfs_journal_check_start:83: comm ll_ost_io00_002: Detected aborted journal
[ 66.354864] LDISKFS-fs error (device dm-2): ldiskfs_journal_check_start:83: comm ll_ost_io00_000: Detected aborted journal
[ 66.354972] LDISKFS-fs (dm-1): Remounting filesystem read-only
[ 66.355151] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) @@@ redo for recoverable error -30 req@ffff9a0877cc1200 x1830347682812288/t0(0) o4->lustre-OST0000-osc-ffff9a074acee000@0@lo:6/4 lens 568/464 e 0 to 0 dl 1745555635 ref 2 fl Interpret:RMQU/600/0 rc -30/-30 job:'dd.0' uid:0 gid:0 projid:0
[ 66.355166] LDISKFS-fs (dm-2): Remounting filesystem read-only
[ 66.355166] LustreError: 3416:0:(osc_request.c:2438:osc_brw_redo_request()) Skipped 18 previous similar messages
^C
I think this test should just be disabled on ZFS for large volumes again.
The intent was to switch the ldiskfs testing over to fallocate to speed it up, but the change also unintentionally raised the filesystem size limit from 400MB x OSTCOUNT to 1TB x OSTCOUNT. That is almost certainly going to fill up $TMP if that is where the ZFS OSTs are located, because ZFS doesn't support fallocate and will try to write the full data size, and you don't have 4TB of RAM on your test system...
I don't think there is any mystery here.
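To put rough numbers on the limit change, here is a minimal sketch using the values from the log above (STRIPECOUNT=2, ORIGFREE=4796328, MAXFREE=2097152000). It rests on assumptions: that ORIGFREE and MAXFREE are in KiB, that this run had 2 OSTs, and that the test is skipped when the free space exceeds the limit. The helper names are hypothetical and are not taken from the test script.

```python
# Sketch only: units (KiB), OST count, and the skip rule are assumptions.
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 ** 2


def limit_kib(per_ost_kib: int, ost_count: int) -> int:
    """Size limit for the whole filesystem: per-OST limit times OST count."""
    return per_ost_kib * ost_count


def would_skip(orig_free_kib: int, max_free_kib: int) -> bool:
    """Assumed rule: skip the test when free space exceeds the limit."""
    return orig_free_kib > max_free_kib


OST_COUNT = 2             # assumed from STRIPECOUNT=2 in the log
ORIG_FREE = 4_796_328     # ORIGFREE from the log
NEW_LIMIT = 2_097_152_000  # MAXFREE from the log

# Old limit described in the comment: 400MB x OSTCOUNT.
old_limit = limit_kib(400 * KIB_PER_MIB, OST_COUNT)

print("old limit (GiB):", old_limit / KIB_PER_GIB)
print("new limit (GiB):", NEW_LIMIT / KIB_PER_GIB)
print("skipped under old limit:", would_skip(ORIG_FREE, old_limit))
print("skipped under new limit:", would_skip(ORIG_FREE, NEW_LIMIT))
```

Under these assumptions the old limit would have skipped this run, since about 4.6 GiB free is more than 800 MiB, while the new limit of roughly 1TB per OST lets it proceed and write until the backing store is full.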