Details
Bug
Resolution: Fixed
Minor
Lustre 2.10.0
Description
There is a defect in save_lustre_params() in test-framework.sh.
replay-vbr calls save_lustre_params() with a wildcard in the parameter name:
save_lustre_params $(get_facets MDS) "mdt.*.commit_on_sharing" > $cos_param_file
On a setup with facet_HOST != facetfailover_HOST, the following parameters can end up stored:
[root@lm0117 ~]# cat /tmp/rvbr-cos-params
mds1 mdt.lustre-MDT0000.commit_on_sharing=0
mds1 mdt.lustre-MDT0001.commit_on_sharing=0 <<< 1st parameter
mds2 mdt.lustre-MDT0000.commit_on_sharing=0
mds2 mdt.lustre-MDT0001.commit_on_sharing=0 <<< 2nd parameter
mds3 mdt.lustre-MDT0002.commit_on_sharing=0
mds4 mdt.lustre-MDT0003.commit_on_sharing=0
[root@lm0117 ~]#
After facet_failover mds1, mds1 is mounted on the other node, but mds2 is still on the same node.
restore_lustre_params() then tries to restore "mdt.lustre-MDT0001.commit_on_sharing=0" on mds1, but that parameter does not exist there: mds2 was not failed over, so MDT0001 is still in its original place.
In other words, save_lustre_params() duplicates information: the line marked "1st parameter" does not belong to mds1, it belongs to mds2.
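One way to picture the intended behaviour is a filter that keeps, for each facet, only the parameters of that facet's own target. The sketch below is illustrative only and is not the fix that was landed: facet_to_label and filter_own_params are hypothetical helpers, and the fsname "lustre" and the mdsN to MDT(N-1) numbering are assumptions read off the sample output above.

```shell
#!/bin/bash
# Hypothetical helper (not part of test-framework.sh): map a facet name such
# as "mds2" to a target label such as "lustre-MDT0001".
# Assumptions: fsname defaults to "lustre"; mdsN serves MDT index N-1.
facet_to_label() {
	local facet=$1
	local fsname=${2:-lustre}
	local num=${facet#mds}            # "mds2" -> "2"

	printf '%s-MDT%04x\n' "$fsname" $((num - 1))
}

# Hypothetical filter: read "facet param=value" lines on stdin and print only
# the lines whose parameter name refers to the facet's own target.
filter_own_params() {
	local facet param label

	while read -r facet param; do
		label=$(facet_to_label "$facet")
		case $param in
		*".$label."*) echo "$facet $param" ;;
		esac
	done
}
```

Applied to a saved file such as /tmp/rvbr-cos-params, `filter_own_params < /tmp/rvbr-cos-params` would drop the "mds1 mdt.lustre-MDT0001..." and "mds2 mdt.lustre-MDT0000..." lines, so a later restore only touches parameters that follow the facet when it fails over.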