Details
-
Bug
-
Resolution: Unresolved
-
Minor
-
None
-
Lustre 2.16.0, Lustre 2.15.0, Lustre 2.15.4, Lustre 2.15.5, Lustre 2.15.6
-
3
-
9223372036854775807
Description
replay-vbr test_12a started failing with 'test_12a failed with 4' on August 4, 2021 for Lustre 2.14.53.7 with logs at https://testing.whamcloud.com/test_sets/17efe0ba-7e4a-4e7f-b7f5-02383e1314c5. We've seen this test fail on both ZFS and ldiskfs, but so far only in DNE configurations.
Looking at a recent failure at https://testing.whamcloud.com/test_sets/014ce4c3-c654-47f9-9333-1c58ebf545c3, the suite_log shows
    CMD: onyx-24vm7 e2label /dev/mapper/mds1_flakey 2>/dev/null
    Started lustre-MDT0000
    CMD: onyx-55vm7.onyx.whamcloud.com unlinkmany /mnt/lustre/f12a.replay-vbr- 25
     - unlinked 0 (time 1643080125 ; total 0 ; last 0)
    total: 25 unlinks in 0 seconds: inf unlinks/second
    CMD: onyx-55vm7.onyx.whamcloud.com unlinkmany /mnt/lustre/f12a.replay-vbr-3- 25
     - unlinked 0 (time 1643080125 ; total 0 ; last 0)
    total: 25 unlinks in 0 seconds: inf unlinks/second
    CMD: onyx-55vm7.onyx.whamcloud.com checkstat -v /mnt/lustre/d12a.replay-vbr/f12a.replay-vbr
     replay-vbr test_12a: @@@@@@ FAIL: test_12a failed with 4
      Trace dump:
      = /usr/lib64/lustre/tests/test-framework.sh:6391:error()
      = /usr/lib64/lustre/tests/test-framework.sh:6695:run_one()
Looking at the code for this test,
    	# All 50 files should have been replayed
    	do_node $CLIENT1 unlinkmany $DIR/$tfile- 25 || return 2
    	do_node $CLIENT1 unlinkmany $DIR/$tfile-3- 25 || return 3
    	do_node $CLIENT1 $CHECKSTAT $DIR/$tdir/$tfile && return 4

    	return 0
    }
    run_test 12a "lost data due to missed REMOTE client during replay"
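The call to checkstat is what produces this error: 'return 4' runs only when checkstat exits 0, that is, when $DIR/$tdir/$tfile can still be stat'ed even though the test expects it to be gone.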
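The inverted check can be reproduced outside the test framework. The sketch below is not the Lustre code: stat_path is a hypothetical stand-in for $CHECKSTAT, written on the assumption that checkstat with no check flags succeeds whenever the path can be stat'ed, and check_absent is a hypothetical name for the last lines of the test.

```shell
#!/bin/bash
# Hypothetical stand-in for $CHECKSTAT without check flags (assumption):
# exits 0 if the path exists, non-zero otherwise.
stat_path() {
	[ -e "$1" ]
}

# Mirrors the tail of test_12a: the path is expected to be absent,
# so a successful stat is the failure case.
check_absent() {
	stat_path "$1" && return 4	# path unexpectedly exists
	return 0
}

tmp=$(mktemp)			# file exists at this point
check_absent "$tmp"; rc_present=$?
rm -f "$tmp"			# file is now gone
check_absent "$tmp"; rc_absent=$?

echo "present: rc=$rc_present, absent: rc=$rc_absent"
```

In this sketch the function returns 4 while the file exists and 0 once it has been removed, which matches 'test_12a failed with 4' being reported when the file is still visible.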
Issue Links
- is related to LU-9096 sanity test_253: File creation failed after rm (Open)
"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/56943/
Subject: LU-15553 test: mkdir_on_mdt0 in replay-vbr.sh
Project: fs/lustre-release
Branch: b2_15
Current Patch Set:
Commit: 5f0b7d1950615fc227b0a4aa0002f00046ae01f1