  Lustre / LU-1778

Root Squash is not always properly enforced

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.4
    • Affects Version/s: Lustre 2.1.1, Lustre 2.1.2
    • None
    • 3
    • 8532

    Description

      On a node with root_squash enabled, if root tries to access the attributes of a file (fstat) that has not previously been accessed, the operation fails with a permission error.
      If the file's attributes have already been accessed by an authorized user, then root can read them without trouble.
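
      (For context, root squash is normally enabled from the MGS with lctl conf_param. The sketch below is not taken from this ticket: the filesystem name 'scratch' comes from the transcript, while the squash UID:GID and the nosquash_nids range are purely illustrative.)

      # on the MGS: squash root to an unprivileged identity (e.g. 99:99)
      mgs# lctl conf_param scratch.mdt.root_squash=99:99
      # optionally exempt selected client NIDs from squashing (illustrative range)
      mgs# lctl conf_param scratch.mdt.nosquash_nids="192.168.1.[200-210]@tcp"
      # verify the active settings on the MDS
      mds# lctl get_param mdt.*.root_squash mdt.*.nosquash_nids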

      as root:
      [root@clientae ~]# mount -t lustre 192.168.1.100:/scratch /scratch
      [root@clientae ~]# cd /scratch/
      [root@clientae scratch]# ls -la
      total 16
      drwxrwxrwx 4 root root 4096 Aug 21 18:03 .
      dr-xr-xr-x. 28 root root 4096 Aug 22 15:53 ..
      drwxr-xr-x 2 root root 4096 Jun 21 18:42 .lustre
      drwx------ 2 slurm users 4096 Aug 21 18:03 test_dir
      [root@clientae scratch]# cd test_dir/
      [root@clientae test_dir]# ls -la
      ls: cannot open directory .: Permission denied

      then, as user 'slurm':
      [slurm@clientae ~]$ cd /scratch/test_dir
      [slurm@clientae test_dir]$ ls -la
      total 16
      drwx------ 2 slurm users 4096 Aug 21 18:03 .
      drwxrwxrwx 4 root root 4096 Aug 22 16:47 ..
      -rw-r--r-- 1 slurm users 7007 Aug 22 15:58 afile

      now, come back as root and replay the 'ls' command:
      [root@clientae test_dir]# ls -la
      total 16
      drwx------ 2 slurm users 4096 Aug 21 18:03 .
      drwxrwxrwx 4 root root 4096 Aug 22 16:47 ..
      -rw-r--r-- 1 slurm users 7007 Aug 22 15:58 afile
      [root@clientae test_dir]# stat afile
      File: `afile'
      Size: 7007 Blocks: 16 IO Block: 2097152 regular file
      Device: d61f715ah/3592384858d Inode: 144115238826934275 Links: 1
      Access: (0644/-rw-r--r--) Uid: ( 500/ slurm) Gid: ( 100/ users)
      Access: 2012-08-22 15:59:26.000000000 +0200
      Modify: 2012-08-22 15:58:55.000000000 +0200
      Change: 2012-08-22 15:58:55.000000000 +0200

      At this point, if you try to read the file's content as root, you get a permission error:
      [root@clientae test_dir]# cat afile
      cat: afile: Permission denied
      even though the content has already been read by the authorized user.

      But if the file is held open by the user ('tail -f afile', for example), root gets access to the content of the file as well:
      [root@clientae test_dir]# tail afile
      coucou
      coucou
      coucou
      coucou
      coucou
      coucou
      coucou
      coucou
      coucou
      coucou

      As soon as the file is closed by the user, root loses access to the content (at least it can no longer open the file).

      Alex.
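
      (The sequence above can be condensed into a small reproducer. This is a sketch only: it assumes root_squash is already active for this client, and that /scratch/test_dir (mode 0700, owned by slurm) already contains a file 'afile' created earlier by user 'slurm'.)

      #!/bin/sh
      mount -t lustre 192.168.1.100:/scratch /scratch
      stat /scratch/test_dir/afile              # denied: attributes not yet cached on this client
      su slurm -c 'ls -la /scratch/test_dir'    # owner populates the client's attribute cache
      stat /scratch/test_dir/afile              # bug: squashed root can now read the attributes
      cat /scratch/test_dir/afile               # still denied: content not reachable while the file is closed
      su slurm -c 'tail -f /scratch/test_dir/afile' &   # owner holds the file open
      sleep 1
      cat /scratch/test_dir/afile               # bug: squashed root can now read the content
      kill $!                                   # stop the background reader; access is lost once the file is closed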

      Attachments

        Issue Links

          Activity


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (oleg.drokin@intel.com) merged in patch http://review.whamcloud.com/10743/
            Subject: LU-1778 libcfs: add a service that prints a nidlist
            Project: fs/lustre-release
            Branch: b2_5
            Current Patch Set:
            Commit: 57a8a6bec4dc965388b5bba48e7501f79bdab44b

            pichong Gregoire Pichon added a comment -

            The two patches above, #10743 and #10744, have been posted and ready for review since the end of June.
            Would it be possible to have them included in the next 2.5 maintenance release, 2.5.3?

            pichong Gregoire Pichon added a comment -

            I have backported the two patches to be integrated in the 2.5 maintenance release:
            http://review.whamcloud.com/10743
            http://review.whamcloud.com/10744
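
            (For anyone wanting to try these backports locally, Gerrit changes can be fetched straight into a lustre-release checkout. This is a sketch assuming the standard anonymous Gerrit fetch URL; <N> is a placeholder for the patch set number shown on the review page.)

            git clone git://git.whamcloud.com/fs/lustre-release.git
            cd lustre-release
            git checkout b2_5
            # Gerrit change refs use the last two digits of the change number
            git fetch http://review.whamcloud.com/fs/lustre-release refs/changes/43/10743/<N>
            git cherry-pick FETCH_HEAD
            git fetch http://review.whamcloud.com/fs/lustre-release refs/changes/44/10744/<N>
            git cherry-pick FETCH_HEAD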
            pjones Peter Jones added a comment -

            Now really landed for 2.6.

            pichong Gregoire Pichon added a comment -

            This ticket has not been fixed yet.
            The main patch http://review.whamcloud.com/#change,5700 is still in progress.

            jlevi Jodi Levi (Inactive) added a comment -

            Patch landed to Master.

            pichong Gregoire Pichon added a comment -

            Patch #8479 has been landed and then reverted due to a conflict with the GNILND patch.

            I have posted a new version of the patch: http://review.whamcloud.com/9221

            cliffw Cliff White (Inactive) added a comment -

            Thank you. Would it be possible for you to rebase this on current master? There are a few conflicts preventing merge.

            pichong Gregoire Pichon added a comment -

            I have posted another patch that adds a service to print a nidlist: http://review.whamcloud.com/#/c/8479/. Following the review of patch set 11 of patch #5700, this appears to be a requirement.
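
            (For context, the nidlist being printed here is presumably the one behind the nosquash_nids setting; once such a service is in place, the value should be readable on the MDS with something like the following. This is a sketch, not part of the patch itself.)

            mds# lctl get_param mdt.*.nosquash_nids
            mds# lctl get_param mdt.*.root_squash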

            pichong Gregoire Pichon added a comment -

            Client log from the Maloo test on patch set 7 (Jul 19 10:12 PM).

            pichong Gregoire Pichon added a comment -

            Tests on patch sets 7 and 8 have made the client hang after conf-sanity test_43 (the root-squash test). I have been able to reproduce the hang (after 16 successful runs) and took a dump.
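
            (For reference, that subtest can be run on its own through the test framework's ONLY variable; a sketch assuming a lustre/tests checkout and an already configured test cluster:)

            cd lustre/tests
            ONLY=43 sh conf-sanity.sh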

            The dump is available on ftp.whamcloud.com in /uploads/LU-1778:
            ftp> dir
            227 Entering Passive Mode (72,18,218,227,205,178).
            150 Here comes the directory listing.
            -rw-r--r-- 1 123 114   3387608 Jul 25 07:22 lustre-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114  45824206 Jul 25 07:23 lustre-debuginfo-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114    181316 Jul 25 07:23 lustre-ldiskfs-4.1.0-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114   1674715 Jul 25 07:23 lustre-ldiskfs-debuginfo-4.1.0-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114   3312152 Jul 25 07:23 lustre-modules-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114    165060 Jul 25 07:24 lustre-osd-ldiskfs-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114   5067172 Jul 25 07:24 lustre-source-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114   4757320 Jul 25 07:24 lustre-tests-2.4.51-2.6.32_358.el6.x86_64_g4c66dbd.x86_64.rpm
            -rw-r--r-- 1 123 114 100181834 Jul 25 07:27 vmcore

            Here is the information I have extracted from the dump.

            The unmount command appears to be hung. The upper part of the stack is due to the dump signal.
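
            (The analysis below uses the crash utility; the vmcore would have been opened with something like the following, where the debuginfo vmlinux path is illustrative and matches the 2.6.32-358.el6 kernel from the RPM names above.)

            crash /usr/lib/debug/lib/modules/2.6.32-358.el6.x86_64/vmlinux vmcore
            crash> mod -S    # load symbols for the lustre/libcfs/obdclass modules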
            
            crash> bt 2723
            PID: 2723   TASK: ffff88046ab98040  CPU: 4   COMMAND: "umount"
             #0 [ffff880028307e90] crash_nmi_callback at ffffffff8102d2c6
             #1 [ffff880028307ea0] notifier_call_chain at ffffffff815131d5
             #2 [ffff880028307ee0] atomic_notifier_call_chain at ffffffff8151323a
             #3 [ffff880028307ef0] notify_die at ffffffff8109cbfe
             #4 [ffff880028307f20] do_nmi at ffffffff81510e9b
             #5 [ffff880028307f50] nmi at ffffffff81510760
                [exception RIP: page_fault]
                RIP: ffffffff815104b0  RSP: ffff880472c13bc0  RFLAGS: 00000082
                RAX: ffffc9001dd57008  RBX: ffff880470b27e40  RCX: 000000000000000f
                RDX: ffffc9001dd1d000  RSI: ffff880472c13c08  RDI: ffff880470b27e40
                RBP: ffff880472c13c48   R8: 0000000000000000   R9: 00000000fffffffe
                R10: 0000000000000001  R11: 5a5a5a5a5a5a5a5a  R12: ffff880472c13c08 = struct cl_site *
                R13: 00000000000000c4  R14: 0000000000000000  R15: 0000000000000000
                ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
            --- <NMI exception stack> ---
             #6 [ffff880472c13bc0] page_fault at ffffffff815104b0
             #7 [ffff880472c13bc8] cfs_hash_putref at ffffffffa04305c1 [libcfs]
             #8 [ffff880472c13c50] lu_site_fini at ffffffffa0588841 [obdclass]
             #9 [ffff880472c13c70] cl_site_fini at ffffffffa0591d0e [obdclass]
            #10 [ffff880472c13c80] ccc_device_free at ffffffffa0e6c16a [lustre]
            #11 [ffff880472c13cb0] lu_stack_fini at ffffffffa058b22e [obdclass]
            #12 [ffff880472c13cf0] cl_stack_fini at ffffffffa059132e [obdclass]
            #13 [ffff880472c13d00] cl_sb_fini at ffffffffa0e703bd [lustre]
            #14 [ffff880472c13d40] client_common_put_super at ffffffffa0e353d4 [lustre]
            #15 [ffff880472c13d70] ll_put_super at ffffffffa0e35ef9 [lustre]
            #16 [ffff880472c13e30] generic_shutdown_super at ffffffff8118326b
            #17 [ffff880472c13e50] kill_anon_super at ffffffff81183356
            #18 [ffff880472c13e70] lustre_kill_super at ffffffffa057d37a [obdclass]
            #19 [ffff880472c13e90] deactivate_super at ffffffff81183af7
            #20 [ffff880472c13eb0] mntput_no_expire at ffffffff811a1b6f
            #21 [ffff880472c13ee0] sys_umount at ffffffff811a25db
            #22 [ffff880472c13f80] system_call_fastpath at ffffffff8100b072
                RIP: 00007f0e6a971717  RSP: 00007fff17919878  RFLAGS: 00010206
                RAX: 00000000000000a6  RBX: ffffffff8100b072  RCX: 0000000000000010
                RDX: 0000000000000000  RSI: 0000000000000000  RDI: 00007f0e6c3cfb90
                RBP: 00007f0e6c3cfb70   R8: 00007f0e6c3cfbb0   R9: 0000000000000000
                R10: 00007fff179196a0  R11: 0000000000000246  R12: 0000000000000000
                R13: 0000000000000000  R14: 0000000000000000  R15: 00007f0e6c3cfbf0
                ORIG_RAX: 00000000000000a6  CS: 0033  SS: 002b
            
            ccc_device_free() is called on lu_device 0xffff880475ad06c0
            
            crash> struct lu_device ffff880475ad06c0
            struct lu_device {
              ld_ref = {
                counter = 1
              }, 
              ld_type = 0xffffffffa0ea22e0, 
              ld_ops = 0xffffffffa0e787a0, 
              ld_site = 0xffff880472cf05c0, 
              ld_proc_entry = 0x0, 
              ld_obd = 0x0, 
              ld_reference = {<No data fields>}, 
              ld_linkage = {
                next = 0xffff880472cf05f0, 
                prev = 0xffff880472cf05f0
              }
            }
            
            ld_type->ldt_tags
            crash> rd -8 ffffffffa0ea22e0
            ffffffffa0ea22e0:  04 = LU_DEVICE_CL
            
            ld_type->ldt_name
            crash> rd  ffffffffa0ea22e8
            ffffffffa0ea22e8:  ffffffffa0e7d09f = "vvp"
            
            
            lu_site=ffff880472cf05c0
            crash> struct lu_site ffff880472cf05c0
            struct lu_site {
              ls_obj_hash = 0xffff880470b27e40, 
              ls_purge_start = 0, 
              ls_top_dev = 0xffff880475ad06c0, 
              ls_bottom_dev = 0x0, 
              ls_linkage = {
                next = 0xffff880472cf05e0, 
                prev = 0xffff880472cf05e0
              }, 
              ls_ld_linkage = {
                next = 0xffff880475ad06f0, 
                prev = 0xffff880475ad06f0
              }, 
              ls_ld_lock = {
                raw_lock = {
                  slock = 65537
                }
              }, 
              ls_stats = 0xffff880470b279c0, 
              ld_seq_site = 0x0
            }
            
            crash> struct cfs_hash 0xffff880470b27e40
            struct cfs_hash {
              hs_lock = {
                rw = {
                  raw_lock = {
                    lock = 0
                  }
                }, 
                spin = {
                  raw_lock = {
                    slock = 0
                  }
                }
              }, 
              hs_ops = 0xffffffffa05edee0, 
              hs_lops = 0xffffffffa044e320, 
              hs_hops = 0xffffffffa044e400,
              hs_buckets = 0xffff880471e4f000, 
              hs_count = {
                counter = 0
              }, 
              hs_flags = 6184, = 0x1828 = CFS_HASH_SPIN_BKTLOCK | CFS_HASH_NO_ITEMREF | CFS_HASH_ASSERT_EMPTY | CFS_HASH_DEPTH 
              hs_extra_bytes = 48, 
              hs_iterating = 0 '\000', 
              hs_exiting = 1 '\001', 
              hs_cur_bits = 23 '\027', 
              hs_min_bits = 23 '\027', 
              hs_max_bits = 23 '\027', 
              hs_rehash_bits = 0 '\000', 
              hs_bkt_bits = 15 '\017', 
              hs_min_theta = 0, 
              hs_max_theta = 0, 
              hs_rehash_count = 0, 
              hs_iterators = 0, 
              hs_rehash_wi = {
                wi_list = {
                  next = 0xffff880470b27e88, 
                  prev = 0xffff880470b27e88
                }, 
                wi_action = 0xffffffffa04310f0 <cfs_hash_rehash_worker>, 
                wi_data = 0xffff880470b27e40, 
                wi_running = 0, 
                wi_scheduled = 0
              }, 
              hs_refcount = {
                counter = 0
              }, 
              hs_rehash_buckets = 0x0, 
              hs_name = 0xffff880470b27ec0 "lu_site_vvp"
            }
            

            I am going to attach the log of the Maloo test that hung (Jul 19 10:12 PM).


            People

              Assignee: niu Niu Yawei (Inactive)
              Reporter: louveta Alexandre Louvet (Inactive)
              Votes: 0
              Watchers: 15

              Dates

                Created:
                Updated:
                Resolved: