Details
Type: Bug
Resolution: Fixed
Priority: Blocker
Description
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>
This issue relates to the following test suite run:
https://testing.whamcloud.com/test_sets/72dfd1b1-7efa-4dcc-a366-043549333756
https://testing.whamcloud.com/test_sets/1dde1cd4-24fd-4ae3-ab20-d7261788833f
test_1 failed with the following error:
CMD: onyx-106vm11 /usr/sbin/lctl set_param -P lod.*.mdt_hash=crush
comparing 520 previously copied files
Files /etc/yum.repos.d/redhat.repo and /mnt/lustre/d1.runtests//etc/yum.repos.d/redhat.repo differ
Files /etc/pki/entitlement/2519028287967039457.pem and /mnt/lustre/d1.runtests//etc/pki/entitlement/2519028287967039457.pem differ
runtests test_1: @@@@@@ FAIL: old and new files are different: rc=22
Test session details:
clients: https://build.whamcloud.com/job/lustre-master-next/784 - 5.14.0-362.18.1.el9_3.x86_64
servers: https://build.whamcloud.com/job/lustre-master-next/784 - 5.14.0-362.18.1_lustre.el9.x86_64
This failed for the first time with this error on 2024-04-24 in two separate test runs, one on an unlanded patch and one a "full" test run on master. Strangely, both failures were reported on the same two files. There don't appear to be any Lustre console errors immediately before this failure (there are a few earlier, when the filesystem is remounted in the test).
VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
runtests test_1 - old and new files are different: rc=22
Unfortunately this was hit again on another test run, with the same files causing issues:
https://testing.whamcloud.com/test_sets/abfde0a8-37fa-4801-8f76-3d75d434d243
but it was hit during interop testing with a 2.12.9 client, which rules out any changes on master clients as the cause. It appears the clients were running RHEL, so this seems to be something that RHEL itself is doing.
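One way to check whether the source files under /etc are being rewritten while the test runs is to checksum them twice and compare. This is only a rough sketch: `snapshot` and `changed_between` are hypothetical helper names, not part of runtests, and the approach assumes `sha256sum` is available and that the paths contain no whitespace.

```shell
#!/bin/bash
# Illustrative helpers (NOT part of runtests): detect files whose contents
# change between two points in time.

snapshot() {
    # Print one "checksum  path" line for each existing file argument.
    local f
    for f in "$@"; do
        [ -f "$f" ] && sha256sum "$f"
    done
    return 0
}

changed_between() {
    # $1 and $2 are snapshot files written by snapshot().
    # Lines identical in both snapshots appear twice after sorting and are
    # dropped by "uniq -u"; what is left belongs to files that changed
    # (or that exist in only one snapshot).
    sort "$1" "$2" | uniq -u | awk '{print $2}' | sort -u
}

# Possible usage (paths taken from the failure above):
#   snapshot /etc/yum.repos.d/redhat.repo /etc/pki/entitlement/*.pem > /tmp/before
#   ... run the test, or simply wait ...
#   snapshot /etc/yum.repos.d/redhat.repo /etc/pki/entitlement/*.pem > /tmp/after
#   changed_between /tmp/before /tmp/after
```

If the two files show up in the output with no Lustre activity in between, that would point at something on the client OS rather than at the filesystem.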
We can partly work around this by excluding those files from the copy list, but that will only fix new clients, not old ones hitting the issue during interop testing. It would likely be better to stop whatever process is modifying these files inside the VM.
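The exclusion workaround could look roughly like the filter below. This is a sketch under assumptions, not the actual runtests change: `filter_volatile_files` is a hypothetical name, it assumes the copy list is a newline-separated list of paths, and excluding the whole /etc/pki/entitlement/ directory (rather than the single .pem file) is a guess that the file name may differ between VMs.

```shell
#!/bin/bash
# Hypothetical helper (NOT from runtests): read a newline-separated list of
# paths on stdin and drop the files seen changing during the test.
filter_volatile_files() {
    # Assumption: everything under /etc/pki/entitlement/ is excluded,
    # since the .pem file name may not be the same on every client.
    grep -v -e '^/etc/yum\.repos\.d/redhat\.repo$' \
            -e '^/etc/pki/entitlement/'
}

# Possible usage when building the copy list:
#   find /etc -type f | filter_volatile_files > "$FILES"
```

As noted above, a filter like this only helps clients that run the updated script; older clients used in interop testing would still copy and compare the two files.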