Details
- Bug
- Resolution: Duplicate
- Major
- None
- Lustre 2.16.0
- 3
Description
This issue was created by maloo for jianyu <yujian@whamcloud.com>
This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/e51696de-5c00-48d7-a726-ba3b5a3cb0d6
test_17b failed with the following error:
CMD: onyx-147vm3 /usr/sbin/lctl set_param fail_loc=0xa0000520 fail_val=1
fail_loc=0xa0000520
fail_val=1
1+0 records in
1+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.250987 s, 8.4 MB/s
CMD: onyx-147vm2 dd if=/mnt/lustre/f17b.recovery-small of=/dev/null bs=1M count=1
onyx-147vm2: dd: failed to open '/mnt/lustre/f17b.recovery-small': No such file or directory
pdsh@onyx-147vm1: onyx-147vm2: ssh exited with exit code 1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0239161 s, 43.8 MB/s
service estimate dropped to 5
recovery-small test_17b: @@@@@@ FAIL: read failed
Test session details:
clients: https://build.whamcloud.com/job/lustre-master/4584 - 4.18.0-553.16.1.el8_10.x86_64
servers: https://build.whamcloud.com/job/lustre-b2_15/94 - 4.18.0-553.5.1.el8_lustre.x86_64
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
recovery-small test_17b - read failed
Whole-master vs b2_15 interop testing only started after ATM-3308, so we have no data prior to RC2 and cannot say the failure began with RC2; it may have started long before that. We need to run a series of different master checkpoints/tags against b2_15 servers to find where this started.
So far it looks like all of the failures involve a second-client mount issue.