[LUDOC-289] remote directory don't work for non-root users Created: 02/Jun/15  Updated: 20/May/22  Resolved: 20/May/22

Status: Resolved
Project: Lustre Documentation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Blocker
Reporter: Frank Heckes (Inactive) Assignee: Lustre Manual Triage
Resolution: Fixed Votes: 0
Labels: None
Environment:

lola
build 20150527


Issue Links:
Related
is related to LUDOC-306 Document new DNE features for 2.8 Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

The error occurred on cluster lola during soak testing of build 20150527 (see https://wiki.hpdd.intel.com/pages/viewpage.action?title=Soak+Testing+on+Lola&spaceKey=Releases#SoakTestingonLola-20150527).

Creating a directory inside a parent that had been created as a remote, striped directory with a default stripe set failed for non-root users:

[soaktest@lola-21 test]$ pwd
/mnt/soaked/soaktest/test
[soaktest@lola-21 test]$ lfs setdirstripe -i 2 -c 5 parent
[soaktest@lola-21 test]$ lfs setdirstripe -i 2 -c 5 -D parent
[soaktest@lola-21 test]$ mkdir parent/child-0
mkdir: cannot create directory `parent/child-0': Object is remote
[soaktest@lola-21 test]$ lfs getdirstripe -r parent
parent
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx           FID[seq:oid:ver]
     2           [0x680020f71:0x1:0x0]
     3           [0x6c001f030:0x1:0x0]
     4           [0x7800032e0:0x1:0x0]
     5           [0x7000032e0:0x1:0x0]
     6           [0x800000bd0:0x1:0x0]

The same sequence works for user root:

[root@lola-21 ~]# cd /mnt/soaked/soaktest/test/
[root@lola-21 test]# ls
blogbench  iorfpp  iorssf  kcompile  mdtestfpp  mdtestssf  parent  racer  simul
[root@lola-21 test]# lfs setdirstripe -i 2 -c 5 parrent-root
[root@lola-21 test]# lfs setdirstripe -i 2 -c 5 -D parrent-root
[root@lola-21 test]# mkdir parrent-root/child-0
[root@lola-21 test]# lfs getdirstripe -r parrent-root
parrent-root
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx           FID[seq:oid:ver]
     2           [0x680020f71:0x2:0x0]
     3           [0x6c001f030:0x2:0x0]
     4           [0x7800032e0:0x2:0x0]
     5           [0x7000032e0:0x2:0x0]
     6           [0x800000bd0:0x2:0x0]
parrent-root/child-0
lmv_stripe_count: 5 lmv_stripe_offset: 2
mdtidx           FID[seq:oid:ver]
     2           [0x680020f71:0x3:0x0]
     3           [0x6c001f030:0x3:0x0]
     4           [0x7800032e0:0x3:0x0]
     5           [0x7000032e0:0x3:0x0]
     6           [0x800000bd0:0x3:0x0]
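
The listings above are easier to compare when each directory is reduced to one line. The following is a minimal sketch, not part of the original report: summarize_dirstripe is a hypothetical helper name, and it assumes the `lfs getdirstripe -r` layout shown in this ticket (directory name, then an lmv_stripe_count line, then one mdtidx/FID row per stripe).

```shell
# Sketch: condense `lfs getdirstripe -r` output (read on stdin) into one
# line per directory. Assumes the layout shown in this ticket.
summarize_dirstripe() {
    awk '
        # Print the summary for the directory collected so far, if any.
        function emit() {
            if (name != "")
                printf "%s count=%d listed=%d mdts=%s\n", name, count, listed, mdts
        }
        # The line before "lmv_stripe_count:" is taken to be the directory name.
        /^lmv_stripe_count:/ {
            emit(); name = prev; count = $2; listed = 0; mdts = ""; next
        }
        # Stripe rows: an MDT index followed by a FID in brackets.
        /^[ \t]*[0-9]+[ \t]+\[0x/ {
            listed++; mdts = mdts (mdts == "" ? "" : ",") $1; next
        }
        NF { prev = $0 }
        END { emit() }
    '
}
```

Usage, under the same assumptions: `lfs getdirstripe -r parrent-root | summarize_dirstripe`. A directory whose "listed" value differs from "count" would deserve a closer look.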

The enable_remote_dir and enable_remote_dir_gid parameters have been set on all MDTs:

[root@lola-16 provision]# pdsh -R ssh -w lola-[8-11] 'for i in `ls -1 /proc/fs/lustre/mdt|grep -v num_refs` ; do echo -e "Remote dir setting $i: \c"; lctl get_param -n mdt.${i}.enable_remote_dir; done' | dshbak -c
lola-10: ssh: connect to host lola-10 port 22: No route to host
----------------
lola-8
----------------
Remote dir setting soaked-MDT0000: 1
Remote dir setting soaked-MDT0001: 1
----------------
lola-9
----------------
Remote dir setting soaked-MDT0002: 1
Remote dir setting soaked-MDT0003: 1
----------------
lola-10
----------------
Remote dir setting soaked-MDT0004: 1
Remote dir setting soaked-MDT0005: 1
----------------
lola-11
----------------
Remote dir setting soaked-MDT0006: 1
Remote dir setting soaked-MDT0007: 1

[root@lola-16 console]# pdsh -R ssh -w lola-[8-11] 'for i in `ls -1 /proc/fs/lustre/mdt|grep -v num_refs` ; do echo -e "Remote dir_gid setting $i: \c"; lctl get_param -n mdt.${i}.enable_remote_dir_gid; done' | dshbak -c
----------------
lola-8
----------------
Remote dir_gid setting soaked-MDT0000: -1
Remote dir_gid setting soaked-MDT0001: -1
----------------
lola-9
----------------
Remote dir_gid setting soaked-MDT0002: -1
Remote dir_gid setting soaked-MDT0003: -1
----------------
lola-10
----------------
Remote dir_gid setting soaked-MDT0004: -1
Remote dir_gid setting soaked-MDT0005: -1
----------------
lola-11
----------------
Remote dir_gid setting soaked-MDT0006: -1
Remote dir_gid setting soaked-MDT0007: -1
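
Both checks above can be folded into one pass per MDS. The following is a minimal sketch, not part of the original report: report_remote_dir_params is a hypothetical helper name, and the "ok"/"restricted" labels assume that enable_remote_dir=1 together with enable_remote_dir_gid=-1 is the combination meant to open remote and striped directories to all users, which is how the values are used in this ticket.

```shell
# Sketch: summarize remote-directory settings per MDT.
# Reads "mdt.<target>.<param>=<value>" lines on stdin, i.e. the default
# (non -n) output format of `lctl get_param`.
report_remote_dir_params() {
    awk -F'[.=]' '
        NF < 4 { next }                      # skip blank or unexpected lines
        $3 == "enable_remote_dir"     { dir[$2] = $4 }
        $3 == "enable_remote_dir_gid" { gid[$2] = $4 }
        { seen[$2] = 1 }
        END {
            for (t in seen) {
                # Assumed semantics: 1 and -1 together allow all users.
                status = (dir[t] == "1" && gid[t] == "-1") ? "ok" : "restricted"
                printf "%s enable_remote_dir=%s enable_remote_dir_gid=%s %s\n", t, dir[t], gid[t], status
            }
        }
    ' | sort
}
```

Usage on an MDS, under the same assumptions: `lctl get_param 'mdt.*.enable_remote_dir' 'mdt.*.enable_remote_dir_gid' | report_remote_dir_params`. Letting lctl expand the wildcard avoids listing /proc/fs/lustre/mdt and filtering out num_refs by hand.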

This is a regression; the problem had already been fixed by an earlier patch.
The ticket is a blocker because all test (Slurm) jobs that use remote striping fail.



 Comments   
Comment by Peter Jones [ 02/Jun/15 ]

Di

Could you please look into this issue?

Peter

Comment by Di Wang [ 02/Jun/15 ]

This is a duplicate of LU-6341. I remember there is a duplicate ticket in INTL as well.

Comment by Andreas Dilger [ 02/Jun/15 ]

Frank, there is also a setting to enable/disable remote/striped directories for root and non-root users.

lctl set_param mdt.myth-MDT0000.enable_remote_dir_gid=-1

It seems that enable_remote_dir_gid is not documented in the Manual at all.

Comment by Frank Heckes (Inactive) [ 03/Jun/15 ]

The enable_remote_dir_gid parameter had been set to -1 on all MDTs when I got the error above.
Yes, the parameter isn't documented; I learned about it from Johann.

Comment by Richard Henwood (Inactive) [ 04/Dec/15 ]

http://review.whamcloud.com/#/c/17394/
