Details
-
Bug
-
Resolution: Cannot Reproduce
-
Critical
-
Lustre 2.9.0
-
3
Description
The error occurred during soak testing of build '20160713' (see https://wiki.hpdd.intel.com/display/Releases/Soak+Testing+on+Lola#SoakTestingonLola-20160713). MDSes have been configured using ldiskfs, OSTs using zfs. The test environment consists of 4 MDSes with 1 MDT each and 6 OSSes with 4 OSTs each. MDS and OSS nodes are configured in an active-active HA configuration.
Roles:
- lola-8 MGS/MDS
- lola-[9-11] MDS
DNE has been enabled using the following command sequence (see Lustre manual, page 96):
pdsh -g mds 'lctl set_param mdt.*.enable_remote_dir=1'
pdsh -g mds 'lctl set_param mdt.*.enable_remote_dir_gid=-1'
and, respectively:
pdsh -w lola-8 'lctl set_param -P mdt.*.enable_remote_dir=1'
pdsh -w lola-8 'lctl set_param -P mdt.*.enable_remote_dir_gid=-1'
(The latter two commands work only on the MGS node.)
The problem occurs after each of the nodes lola-[9-11] has been restarted or its resources have been failed over / failed back.
While the parameter 'enable_remote_dir' is persistent on the non-MGS MDSes, the parameter 'enable_remote_dir_gid' is not.
Therefore the command
[soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1 /mnt/soaked/soaktest/hsm_rbh/
error on LL_IOC_LMV_SETSTRIPE '/mnt/soaked/soaktest/hsm_rbh/' (3): Operation not permitted
error: setdirstripe: create stripe dir '/mnt/soaked/soaktest/hsm_rbh/' failed
---------------
--> Remote dir setting:
---------------- lola-8 ----------------
Remote dir_gid setting soaked-MDT0000: -1
---------------- lola-9 ----------------
Remote dir_gid setting soaked-MDT0001: 0
---------------- lola-10 ----------------
Remote dir_gid setting soaked-MDT0002: -1
---------------- lola-11 ----------------
Remote dir_gid setting soaked-MDT0003: 0
failed. This will break all test (Slurm) jobs that rely on this functionality.
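The per-node check above could be automated before jobs are submitted. The following is a minimal sketch, not the script used in this report: the function name find_reverted_gid is hypothetical, and it assumes the input is pdsh output wrapping the usual `lctl get_param` form, i.e. lines like `<host>: mdt.<target>.enable_remote_dir_gid=<value>`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): reads pdsh-style lines
#   <host>: mdt.<fsname>-MDTxxxx.enable_remote_dir_gid=<value>
# on stdin and prints "<host> <mdt> <value>" for every MDT whose value
# differs from the expected one (default: -1).
find_reverted_gid() {
    expected="${1:--1}"
    awk -v want="$expected" '
        /enable_remote_dir_gid=/ {
            host = $1
            sub(/:$/, "", host)          # strip the trailing colon pdsh adds
            split($2, kv, "=")           # kv[1] = parameter path, kv[2] = value
            split(kv[1], parts, ".")     # parts[2] = MDT target name
            if (kv[2] != want) print host, parts[2], kv[2]
        }'
}

# Usage on the cluster (assumes the pdsh group "mds" used in this report):
#   pdsh -g mds 'lctl get_param mdt.*.enable_remote_dir_gid' | find_reverted_gid
```

Fed with the values listed above, this would report lola-9 (soaked-MDT0001) and lola-11 (soaked-MDT0003) as the MDTs whose setting is no longer -1.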
After setting the parameters on the nodes again, the commands
[soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1 /mnt/soaked/soaktest/hsm_rbh/
[soaktest@lola-16 ~]$ lfs setdirstripe -c 4 -i 1 -D /mnt/soaked/soaktest/hsm_rbh/
[soaktest@lola-16 ~]$ lfs getdirstripe /mnt/soaked/soaktest/hsm_rbh/
/mnt/soaked/soaktest/hsm_rbh/
[soaktest@lola-16 ~]$ lfs getdirstripe /mnt/soaked/soaktest/hsm_rbh/
/mnt/soaked/soaktest/hsm_rbh/
lmv_stripe_count: 4 lmv_stripe_offset: 1
mdtidx FID[seq:oid:ver]
1 [0x240007160:0x3:0x0]
2 [0x28000d714:0x3:0x0]
3 [0x2c000a810:0x1:0x0]
0 [0x20000fe01:0x3:0x0]
ended successfully.
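As a workaround sketch, the non-persistent `lctl set_param` commands from the setup sequence can be re-issued on just the nodes that were restarted or failed over. The helper below is hypothetical (the name reapply_remote_dir_cmds is an assumption, not an existing tool); it only prints the pdsh commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical helper: prints the pdsh commands that re-apply both DNE
# parameters on the given MDS hosts. It only prints the commands; pipe the
# output to `sh` to actually execute them.
reapply_remote_dir_cmds() {
    hosts=$(printf '%s,' "$@")   # join host names with commas
    hosts=${hosts%,}             # drop the trailing comma
    [ -n "$hosts" ] || return 1  # no hosts given: nothing to do
    for param in enable_remote_dir=1 enable_remote_dir_gid=-1; do
        printf "pdsh -w %s 'lctl set_param mdt.*.%s'\n" "$hosts" "$param"
    done
}

# Usage, e.g. after lola-9 and lola-11 have been failed back:
#   reapply_remote_dir_cmds lola-9 lola-11 | sh
```

Printing rather than executing keeps the step auditable during a soak run; whether the values survive the next restart is exactly what this ticket questions, so the helper would need to be re-run after each failover.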
Issue Links
- is related to: LU-7004 fix "lctl set_param -P" to allow deprecation of "lctl conf_param" (Resolved)