Details
Technical task
Resolution: Fixed
Blocker
Lustre 2.5.0
9850
Description
Non-root users cannot archive their own files, and root cannot archive files owned by other users. Moreover, the lfs hsm_archive command fails silently: it exits with status 0 even though nothing is archived. To reproduce, start HSM and do the following:
# cd /mnt/lustre
# dd if=/dev/zero of=Asterix bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0633262 s, 166 MB/s
# chown sanity: Asterix
# lfs hsm_archive Asterix
# echo $?
0
Running the same operation with trace enabled shows that the failure originates from:
00000004:00000001:3.0:1377109173.525959:0:15687:0:(mdd_object.c:911:mdd_xattr_sanity_check()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
00000004:00000001:3.0:1377109173.525959:0:15687:0:(mdd_object.c:1034:mdd_xattr_set()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
00000004:00000001:3.0:1377109173.525964:0:15687:0:(mdt_hsm.c:77:mdt_hsm_attr_set()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
mdd_xattr_sanity_check() fails because the file is not owned by root and the coordinator runs with no capabilities.
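Each "Process leaving" line in the trace prints the return code three ways (unsigned, signed, hex). As a minimal sketch, the helpers below pull out the signed value and map it to an errno name, assuming the usual kernel convention of returning negative errno values, under which -1 is -EPERM on Linux. The function names rc_signed and errno_name are hypothetical, and the errno table is deliberately partial.

```sh
#!/bin/sh
# Extract the signed return code from a "Process leaving" debug-log line.
# The log prints rc as "unsigned : signed : hex"; the middle value is kept.
rc_signed() {
    printf '%s\n' "$1" |
        sed -n 's/.*(rc=[0-9][0-9]* : \(-\{0,1\}[0-9][0-9]*\) : [0-9a-f][0-9a-f]*).*/\1/p'
}

# Map a negative errno to its symbolic name (small illustrative subset only).
errno_name() {
    case "$1" in
        0)   echo OK ;;
        -1)  echo EPERM ;;    # operation not permitted
        -2)  echo ENOENT ;;   # no such file or directory
        -13) echo EACCES ;;   # permission denied
        *)   echo UNKNOWN ;;
    esac
}
```

Applied to the mdd_xattr_sanity_check() line above, rc_signed yields -1 and errno_name reports EPERM, which matches a permission failure on a file not owned by the caller.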
According to /proc, the failed request leaves an orphaned agent action:
# lfs path2fid /mnt/lustre/Asterix
[0x200000400:0x1:0x0]
# cat /proc/fs/lustre/mdt/lustre-MDT0000/hsm/agent_actions
lrh=[type=10680000 len=136 idx=73] fid=[0x200000400:0x1:0x0] dfid=[0x200000400:0x1:0x0] compound/cookie=0x52150414/0x52150414 action=ARCHIVE archive#=0 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=WAITING data=[]
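To spot such leftover entries without reading the raw records, the agent_actions output can be filtered for status=WAITING. This is only a sketch: the function name waiting_actions is hypothetical, and it assumes every record is a single line of space-separated key=value fields, as in the sample above.

```sh
#!/bin/sh
# Read agent_actions records on stdin and print "<fid> <action>" for every
# record whose status is WAITING. Assumes one record per line.
waiting_actions() {
    awk '
    {
        fid = ""; action = ""; status = ""
        for (i = 1; i <= NF; i++) {
            # "^fid=" does not match the "dfid=" field, so only fid is taken
            if ($i ~ /^fid=/)    fid    = substr($i, 5)
            if ($i ~ /^action=/) action = substr($i, 8)
            if ($i ~ /^status=/) status = substr($i, 8)
        }
        if (status == "WAITING") print fid, action
    }'
}

# Possible usage on the MDS, with the path from the report:
#   waiting_actions < /proc/fs/lustre/mdt/lustre-MDT0000/hsm/agent_actions
```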
Similarly, a non-root user cannot archive any files:
# cd /mnt/lustre
# mkdir sanity
# chown sanity: sanity
# cd sanity
# su sanity
$ dd if=/dev/zero of=Obelix bs=1M count=50
$ ls -l Obelix
-rw-rw-r-- 1 sanity sanity 52428800 Aug 21 14:03 Obelix
$ lfs hsm_archive Obelix
$ echo $?
0
$ sleep 60
$ lfs hsm_state Obelix
Obelix: (0x00000000)
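Because lfs hsm_archive returns 0 either way, a test has to inspect lfs hsm_state to notice the failure. The sketch below parses one line of that output and reports whether the flag mask is still zero, as it is for Obelix above. The function names hsm_flags and hsm_state_is_empty are hypothetical, and the parser assumes the "name: (0x...)" layout shown in this report.

```sh
#!/bin/sh
# Pull the hexadecimal flag mask out of one line of `lfs hsm_state` output,
# e.g. "Obelix: (0x00000000)" -> "0x00000000".
hsm_flags() {
    printf '%s\n' "$1" |
        sed -n 's/^.*: (\(0x[0-9a-fA-F][0-9a-fA-F]*\)).*$/\1/p'
}

# Exit status 0 when the mask is all zeros (no HSM flag set),
# 1 when at least one bit is set, 2 when the line cannot be parsed.
hsm_state_is_empty() {
    digits=$(hsm_flags "$1")
    digits=${digits#0x}
    case "$digits" in
        '')     return 2 ;;   # no mask found in the line
        *[!0]*) return 1 ;;   # some flag bit is set
        *)      return 0 ;;   # mask is zero
    esac
}

# Possible usage on a client, mirroring the reproduction steps:
#   lfs hsm_archive Obelix; sleep 60
#   hsm_state_is_empty "$(lfs hsm_state Obelix)" && echo "archive had no effect" >&2
```

A fixed 60-second wait is taken from the reproduction above; a real test would probably poll instead, since archive duration depends on the copytool and file size.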