  1. Lustre
  2. LU-3647 HSM _not only_ small fixes and to do list goes here
  3. LU-3811

non-root users cannot archive files, root cannot archive non-root users' files

Details

    • Technical task
    • Resolution: Fixed
    • Blocker
    • Lustre 2.5.0
    • Lustre 2.5.0
    • 9850

    Description

      Non-root users cannot archive their files, and root cannot archive files owned by other users. Moreover, the lfs hsm_archive command fails silently. To reproduce, start HSM and do the following:

      # cd /mnt/lustre
      # dd if=/dev/zero of=Asterix bs=1M count=10
      10+0 records in
      10+0 records out
      10485760 bytes (10 MB) copied, 0.0633262 s, 166 MB/s
      # chown sanity: Asterix 
      # lfs hsm_archive Asterix 
      # echo $?
      0
      

      Running the same operation with tracing enabled shows that the failure originates from:

      00000004:00000001:3.0:1377109173.525959:0:15687:0:(mdd_object.c:911:mdd_xattr_sanity_check()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
      00000004:00000001:3.0:1377109173.525959:0:15687:0:(mdd_object.c:1034:mdd_xattr_set()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
      00000004:00000001:3.0:1377109173.525964:0:15687:0:(mdt_hsm.c:77:mdt_hsm_attr_set()) Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)
      

      mdd_xattr_sanity_check() returns -1 (-EPERM) because the file is not owned by root and the coordinator runs with no capabilities.
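
To make the return codes in such a trace easier to read, here is a minimal Python sketch that extracts the function name and rc from a "Process leaving" line and maps a negative rc to its errno name. The regular expression is an assumption based only on the three lines quoted above, and `decode_leave` and `to_signed64` are hypothetical helper names, not part of any Lustre tool.

```python
import errno
import re

# Matches the tail of a debug-log line shaped like the ones quoted above:
#   ...:(file.c:LINE:func()) Process leaving (rc=UNSIGNED : SIGNED : HEX)
# The pattern is inferred from those samples only (assumption).
LEAVE_RE = re.compile(
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"Process leaving \(rc=(?P<unsigned>\d+) : (?P<signed>-?\d+) : (?P<hex>[0-9a-f]+)\)"
)


def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    return value - (1 << 64) if value >= (1 << 63) else value


def decode_leave(line):
    """Return file, line, function, signed rc and errno name, or None if no match."""
    m = LEAVE_RE.search(line)
    if m is None:
        return None
    rc = to_signed64(int(m.group("unsigned")))
    # Kernel-style return codes are negated errno values; only name negative ones.
    name = errno.errorcode.get(-rc) if rc < 0 else None
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "func": m.group("func"),
        "rc": rc,
        "errno": name,
    }


SAMPLE = (
    "00000004:00000001:3.0:1377109173.525959:0:15687:0:"
    "(mdd_object.c:911:mdd_xattr_sanity_check()) "
    "Process leaving (rc=18446744073709551615 : -1 : ffffffffffffffff)"
)
result = decode_leave(SAMPLE)
```

On Linux, errno 1 is EPERM, which is consistent with the ownership and capability explanation above.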

      The failed request leaves an orphaned agent action, as shown in /proc:

      # lfs path2fid /mnt/lustre/Asterix 
      [0x200000400:0x1:0x0]
      # cat /proc/fs/lustre/mdt/lustre-MDT0000/hsm/agent_actions 
      lrh=[type=10680000 len=136 idx=73] fid=[0x200000400:0x1:0x0] dfid=[0x200000400:0x1:0x0] compound/cookie=0x52150414/0x52150414 action=ARCHIVE archive#=0 flags=0x0 extent=0x0-0xffffffffffffffff gid=0x0 datalen=0 status=WAITING data=[]
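
Orphaned actions like the one above can be spotted programmatically. The following Python sketch parses an agent_actions record into key/value pairs and lists the entries for a given FID that are still in the WAITING state. The parsing rules are an assumption derived from the single record shown here, and `parse_agent_action` and `waiting_actions` are hypothetical helper names.

```python
import re

# key=value pairs, where a value is either a bracketed group or a run of
# non-space characters. Inferred from the single sample record (assumption).
FIELD_RE = re.compile(r"([\w/#]+)=(\[[^\]]*\]|\S+)")


def parse_agent_action(line):
    """Turn one agent_actions record into a dict of raw string fields."""
    return dict(FIELD_RE.findall(line))


def waiting_actions(text, fid):
    """Return the records for `fid` whose status is still WAITING."""
    records = (parse_agent_action(l) for l in text.splitlines() if l.strip())
    return [
        r for r in records
        if r.get("fid") == fid and r.get("status") == "WAITING"
    ]


SAMPLE = (
    "lrh=[type=10680000 len=136 idx=73] fid=[0x200000400:0x1:0x0] "
    "dfid=[0x200000400:0x1:0x0] compound/cookie=0x52150414/0x52150414 "
    "action=ARCHIVE archive#=0 flags=0x0 extent=0x0-0xffffffffffffffff "
    "gid=0x0 datalen=0 status=WAITING data=[]"
)
stuck = waiting_actions(SAMPLE, "[0x200000400:0x1:0x0]")
```

The FID is compared in the bracketed form printed by lfs path2fid, so its output can be passed in unchanged.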
      

      Similarly, a non-root user cannot archive any files:

      # cd /mnt/lustre
      # mkdir sanity
      # chown sanity: sanity
      # cd sanity
      # su sanity
      $ dd if=/dev/zero of=Obelix bs=1M count=50
      $ ls -l Obelix
      -rw-rw-r-- 1 sanity sanity 52428800 Aug 21 14:03 Obelix
      $ lfs hsm_archive Obelix 
      $ echo $?
      0
      $ sleep 60
      $ lfs hsm_state Obelix
      Obelix: (0x00000000)
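
Because lfs hsm_archive exits with 0 even when the request fails, a script has to inspect the state flags instead of the exit status. This Python sketch parses a line of lfs hsm_state output and tests the archived bits. The flag values are the HS_* constants as I recall them from lustre_user.h and should be verified against your Lustre tree; `parse_hsm_state` and `is_archived` are hypothetical helper names.

```python
import re

# HSM state bits (assumed values, recalled from lustre_user.h; verify locally).
HS_EXISTS = 0x01
HS_DIRTY = 0x02
HS_RELEASED = 0x04
HS_ARCHIVED = 0x08

# "<path>: (<hex flags>)", optionally followed by flag names.
STATE_RE = re.compile(r"^(?P<path>.+): \((?P<flags>0x[0-9a-fA-F]+)\)")


def parse_hsm_state(line):
    """Return (path, flags) from one line of `lfs hsm_state` output."""
    m = STATE_RE.match(line.strip())
    if m is None:
        raise ValueError("unrecognized hsm_state line: %r" % line)
    return m.group("path"), int(m.group("flags"), 16)


def is_archived(flags):
    """True when a copy exists in the archive and is marked as archived."""
    return bool(flags & HS_EXISTS) and bool(flags & HS_ARCHIVED)


path, flags = parse_hsm_state("Obelix: (0x00000000)")
archived = is_archived(flags)
```

For the output shown above, the flags are all zero, so the file was never archived even though the command reported success.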
      

          Activity

            jhammond John Hammond added a comment -

            Patch landed to master.

            jhammond John Hammond added a comment -

            I thought it best to restart the permissions debate in a new ticket. Please see LU-3866.

            jay Jinshan Xiong (Inactive) added a comment -

            > Release is not really resource consuming, it is resource freeing (it is a way for the user to do a kind of posix_fadvise(DONTNEED) with HSM meaning).

            I am thinking about the case where an archive already exists, so a malicious user can keep releasing and restoring a file in an infinite loop.

            leibovici-cea Thomas LEIBOVICI - CEA (Inactive) added a comment - edited

            > Archive and release should be a privileged operation because it will consume system resources.

            Release is not really resource consuming, it is resource freeing (it is a way for the user to do a kind of posix_fadvise(DONTNEED) with HSM meaning).

            I'd rather say restore is the most resource-consuming operation (it uses Lustre disk space, copy bandwidth, and a tape drive), followed by archive, which uses copy bandwidth.

            > Otherwise, the system may be easily DoS attack by malicious users.

            A DoS is still easy on an HSM system: just restore all readable files.
            For archive and restore, it is more of a QoS issue: preventing a single user from taking all the bandwidth.

            > A non-root user can cancel HSM requests, do hsm_remove, ...

            Indeed, this is dangerous...

            jhammond John Hammond added a comment -

            Also note that there is not enough permission checking in mdt_hsm_request() and friends. A non-root user can cancel HSM requests, do hsm_remove, ...

            jcl jacques-charles lafoucriere added a comment -

            • the volatile files created should use the uid/gid of the creating process for the new file, as with any regular file create
              So it will not be possible to restore a file in a read-only directory. There is no reason for such a restriction.

            People

              jay Jinshan Xiong (Inactive)
              jhammond John Hammond