Details
-
Bug
-
Resolution: Fixed
-
Blocker
-
lola
build 20150527
-
3
Description
The error occurs on cluster lola during soak testing of build 20150527 (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20150527).
Creating a subdirectory inside a directory that was created as a remote, striped directory with a default stripe set fails for non-root users:
[soaktest@lola-21 test]$ pwd
/mnt/soaked/soaktest/test
[soaktest@lola-21 test]$ lfs setdirstripe -i 2 -c 5 parent
[soaktest@lola-21 test]$ lfs setdirstripe -i 2 -c 5 -D parent
[soaktest@lola-21 test]$ mkdir parent/child-0
mkdir: cannot create directory `parent/child-0': Object is remote
[soaktest@lola-21 test]$ lfs getdirstripe -r parent
parent
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx FID[seq:oid:ver]
     2 [0x680020f71:0x1:0x0]
     3 [0x6c001f030:0x1:0x0]
     4 [0x7800032e0:0x1:0x0]
     5 [0x7000032e0:0x1:0x0]
     6 [0x800000bd0:0x1:0x0]
The same sequence succeeds for user root:
[root@lola-21 ~]# cd /mnt/soaked/soaktest/test/
[root@lola-21 test]# ls
blogbench iorfpp iorssf kcompile mdtestfpp mdtestssf parent racer simul
[root@lola-21 test]# lfs setdirstripe -i 2 -c 5 parrent-root
[root@lola-21 test]# lfs setdirstripe -i 2 -c 5 -D parrent-root
[root@lola-21 test]# mkdir parrent-root/child-0
[root@lola-21 test]# lfs getdirstripe -r parrent-root
parrent-root
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx FID[seq:oid:ver]
     2 [0x680020f71:0x2:0x0]
     3 [0x6c001f030:0x2:0x0]
     4 [0x7800032e0:0x2:0x0]
     5 [0x7000032e0:0x2:0x0]
     6 [0x800000bd0:0x2:0x0]
parrent-root/child-0
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx FID[seq:oid:ver]
     2 [0x680020f71:0x3:0x0]
     3 [0x6c001f030:0x3:0x0]
     4 [0x7800032e0:0x3:0x0]
     5 [0x7000032e0:0x3:0x0]
     6 [0x800000bd0:0x3:0x0]
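The two sessions above differ only in the user running them, so the steps can be wrapped in a small reproducer and run once as root and once as a non-root user. This is a minimal sketch, not part of the soak test suite: the function name repro_striped_mkdir is hypothetical, and the stripe offset (2) and count (5) are simply the values used in the sessions above.

```shell
#!/bin/sh
# Hypothetical reproducer for the steps shown in this ticket.
# Usage: repro_striped_mkdir <parent-dir>   (run inside a Lustre mount)
# Returns 0 if the child directory could be created, 1 if mkdir failed,
# 2 if the striped parent could not be set up.
repro_striped_mkdir() {
    parent=$1

    # Create the parent as a remote striped directory:
    # 5 stripes, starting at MDT index 2 (values taken from the ticket).
    lfs setdirstripe -i 2 -c 5 "$parent" || return 2

    # Set the same layout as the default for new subdirectories.
    lfs setdirstripe -i 2 -c 5 -D "$parent" || return 2

    # This is the step that fails with "Object is remote" for non-root users.
    if mkdir "$parent/child-0"; then
        echo "PASS: created $parent/child-0 as $(id -un)"
    else
        echo "FAIL: mkdir $parent/child-0 failed as $(id -un)"
        return 1
    fi
}
```

Calling repro_striped_mkdir with a fresh directory name as soaktest and again as root should show FAIL for the former and PASS for the latter while the regression is present.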
The enable_remote_dir and enable_remote_dir_gid parameters have been set on all MDTs:
[root@lola-16 provision]# pdsh -R ssh -w lola-[8-11] 'for i in `ls -1 /proc/fs/lustre/mdt|grep -v num_refs` ; do echo -e "Remote dir setting $i: \c"; lctl get_param -n mdt.${i}.enable_remote_dir; done' | dshbak -c
lola-10: ssh: connect to host lola-10 port 22: No route to host
----------------
lola-8
----------------
Remote dir setting soaked-MDT0000: 1
Remote dir setting soaked-MDT0001: 1
----------------
lola-9
----------------
Remote dir setting soaked-MDT0002: 1
Remote dir setting soaked-MDT0003: 1
----------------
lola-10
----------------
Remote dir setting soaked-MDT0004: 1
Remote dir setting soaked-MDT0005: 1
----------------
lola-11
----------------
Remote dir setting soaked-MDT0006: 1
Remote dir setting soaked-MDT0007: 1
[root@lola-16 console]# pdsh -R ssh -w lola-[8-11] 'for i in `ls -1 /proc/fs/lustre/mdt|grep -v num_refs` ; do echo -e "Remote dir_gid setting $i: \c"; lctl get_param -n mdt.${i}.enable_remote_dir_gid; done' | dshbak -c
----------------
lola-8
----------------
Remote dir_gid setting soaked-MDT0000: -1
Remote dir_gid setting soaked-MDT0001: -1
----------------
lola-9
----------------
Remote dir_gid setting soaked-MDT0002: -1
Remote dir_gid setting soaked-MDT0003: -1
----------------
lola-10
----------------
Remote dir_gid setting soaked-MDT0004: -1
Remote dir_gid setting soaked-MDT0005: -1
----------------
lola-11
----------------
Remote dir_gid setting soaked-MDT0006: -1
Remote dir_gid setting soaked-MDT0007: -1
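With eight MDTs the output above is easy to misread, so a filter that lists only the targets with an unexpected value can help. The following is a sketch under assumptions: the helper name check_mdt_param is hypothetical, and it relies on each result line ending in "<target>: <value>" exactly as produced by the echo in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: read "... <target>: <value>" lines on stdin and print
# every MDT target whose value differs from the expected one.
# Usage: <pdsh command from above> | check_mdt_param 1
# Returns 0 if all targets match, 1 otherwise.
check_mdt_param() {
    expected=$1
    awk -v want="$expected" '
        # Only look at lines naming an MDT target, e.g. "soaked-MDT0000: 1";
        # dshbak separators and ssh error lines are skipped.
        /MDT[0-9a-fA-F]+: / {
            target = $(NF - 1)
            sub(/:$/, "", target)
            if ($NF != want) { print target; bad = 1 }
        }
        END { exit bad }
    '
}
```

Piping the first command's output through check_mdt_param 1 and the second through check_mdt_param -1 should print nothing and return 0 for the settings shown in this ticket.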
This is a regression: the problem had already been fixed by an earlier patch.
The ticket is a blocker because all test (Slurm) jobs that use remote striping fail.
Issue Links
- is related to LUDOC-306 Document new DNE features for 2.8 (Resolved)