Uploaded image for project: 'Lustre'
  1. Lustre
  2. LU-1518

Missing/bad operations in mdd_{obf,dot_lustre}_obj_op causing LBUGs

Details

    • 3
    • 4470

    Description

      An unprivileged user can cause an MDS LBUG by issuing "chmod +x /mnt/lustre/.lustre/fid". Similarly root can cause an LBUG by issuing

      {get,set}

      facl calls against this directory. Privileged users cannot changes the attributes of /mnt/lustre/.lustre.

      Attachments

        Issue Links

          Activity

            [LU-1518] Missing/bad operations in mdd_{obf,dot_lustre}_obj_op causing LBUGs

            refreshed version available for review at http://review.whamcloud.com/#change,3726

            I tend to agree that the approach minimizing special case code would be more elegant, but couldn't massage the existing attachment into a working form. refreshed the old commit as the quicker path.

            bogl Bob Glossman (Inactive) added a comment - refreshed version available for review at http://review.whamcloud.com/#change,3726 I tend to agree that the approach minimizing special case code would be more elegant, but couldn't massage the existing attachment into a working form. refreshed the old commit as the quicker path.
            jhammond John Hammond added a comment -

            Hi Bob,

            First off, please post your version when you get a chance.

            If you're basing on 3013 then .lustre has a special lookup() implementation. So logically no creates, links, unlinks, rmdirs, or renames (to/from/into/onto) should be allowed---since you could never access the file by that name afterwards.

            I say allow chown, chmod, getfattr, setfattr on .lustre and .lustre/fid.

            stat '.lustre/fid' should succeed. That it failed is worrisome. After your patch did you check that open-by-fid still worked as intended?

            All that said, I think the real-boy approach is much more sound in the long term. I will rebase my build setup to see about porting the patch to master.

            Thanks,

            John

            jhammond John Hammond added a comment - Hi Bob, First off, please post your version when you get a chance. If you're basing on 3013 then .lustre has a special lookup() implementation. So logically no creates, links, unlinks, rmdirs, or renames (to/from/into/onto) should be allowed---since you could never access the file by that name afterwards. I say allow chown, chmod, getfattr, setfattr on .lustre and .lustre/fid. stat '.lustre/fid' should succeed. That it failed is worrisome. After your patch did you check that open-by-fid still worked as intended? All that said, I think the real-boy approach is much more sound in the long term. I will rebase my build setup to see about porting the patch to master. Thanks, John

            John,
            I couldn't get the real inode approach from your attachment to fly at all. Instead I've been looking at rebasing the patch in http://review.whamcloud.com/#change,3103 on current master. I've gotten it to build & run, but can't get it to pass the sanity tests 154b,c from your sanity tests patch. I edited out all the echo "SKIPPING" to enable everything. I'm not seeing any oops from missing ops, but I'm not seeing passes either. I'm not clear about which ops are supposed to succeed and which are supposed to fail. results look like:

            == sanity test 154b: Test md operations on .lustre == 11:13:50 (1345486430)
            stat /mnt/lustre/.lustre
            touch /mnt/lustre/.lustre
            ln /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154/x154
            ln: `/mnt/lustre/.lustre': hard link not allowed for directory
            touch /mnt/lustre/.lustre/f154
            touch .lustre/f154 should fail.
            mknod /mnt/lustre/.lustre/p154 p
            mknod .lustre/p154 p should fail.
            mkdir /mnt/lustre/.lustre/d154
            mkdir .lustre/d154 should fail.
            ls /mnt/lustre/.lustre
            d154
            f154
            p154
            chmod ugo+rwx /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            skipping chown 0:0 /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            setfattr -n user.test_md_ops -v foo /mnt/lustre/.lustre (Oops).
            setfattr: /mnt/lustre/.lustre: Operation not permitted
            getfattr -n user.test_md_ops /mnt/lustre/.lustre
            getfattr: Removing leading '/' from absolute path names

            1. file: mnt/lustre/.lustre
              user.test_md_ops

            getfattr -d /mnt/lustre/.lustre (LBUG in mo_xattr_list()).
            setfattr -x user.test_md_ops /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            setfattr: /mnt/lustre/.lustre: Operation not permitted
            rename /mnt/lustre/d0.sanity/d154 onto /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            mv: cannot move `/mnt/lustre/d0.sanity/d154' to `/mnt/lustre/.lustre': Directory not empty
            rename /mnt/lustre/.lustre into /mnt/lustre/d0.sanity/d154
            sanity test_154b: @@@@@@ FAIL: mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail.
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/sanity.sh:7921:test_md_ops()
            = /usr/lib64/lustre/tests/sanity.sh:7931:test_154b()
            = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one()
            = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged()
            = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test()
            = /usr/lib64/lustre/tests/sanity.sh:7934:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154b.*.1345486430.log
            Dumping logs only on local client.
            FAIL 154b (0s)

            == sanity test 154c: Test md operations on .lustre/fid == 11:13:50 (1345486430)
            stat /mnt/lustre/.lustre/fid
            stat: cannot stat `/mnt/lustre/.lustre/fid': No such file or directory
            sanity test_154c: @@@@@@ FAIL: cannot stat /mnt/lustre/.lustre/fid.
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/sanity.sh:7874:test_md_ops()
            = /usr/lib64/lustre/tests/sanity.sh:7939:test_154c()
            = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one()
            = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged()
            = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test()
            = /usr/lib64/lustre/tests/sanity.sh:7942:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154c.*.1345486430.log
            Dumping logs only on local client.
            FAIL 154c (0s)
            ..................................................== sanity sanity.sh test complete, duration 1 sec == 11:13:50 (1345486430)
            /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154b mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail.
            /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154c cannot stat /mnt/lustre/.lustre/fid.
            rm: cannot remove `/mnt/lustre/d0.sanity/d154/.lustre': Operation not permitted
            sanity : @@@@@@ FAIL: remove sub-test dirs failed
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/test-framework.sh:3238:check_and_cleanup_lustre()
            = /usr/lib64/lustre/tests/sanity.sh:9648:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity..*.1345486430.log
            Dumping logs only on local client.
            sanity returned 0
            Finished at Mon Aug 20 11:13:50 PDT 2012 in 2s
            /usr/lib64/lustre/tests/auster: completed with rc 0

            bogl Bob Glossman (Inactive) added a comment - John, I couldn't get the real inode approach from your attachment to fly at all. Instead I've been looking at rebasing the patch in http://review.whamcloud.com/#change,3103 on current master. I've gotten it to build & run, but can't get it to pass the sanity tests 154b,c from your sanity tests patch. I edited out all the echo "SKIPPING" to enable everything. I'm not seeing any oops from missing ops, but I'm not seeing passes either. I'm not clear about which ops are supposed to succeed and which are supposed to fail. results look like: == sanity test 154b: Test md operations on .lustre == 11:13:50 (1345486430) stat /mnt/lustre/.lustre touch /mnt/lustre/.lustre ln /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154/x154 ln: `/mnt/lustre/.lustre': hard link not allowed for directory touch /mnt/lustre/.lustre/f154 touch .lustre/f154 should fail. mknod /mnt/lustre/.lustre/p154 p mknod .lustre/p154 p should fail. mkdir /mnt/lustre/.lustre/d154 mkdir .lustre/d154 should fail. ls /mnt/lustre/.lustre d154 f154 p154 chmod ugo+rwx /mnt/lustre/.lustre (Oops in osd_xattr_get()). skipping chown 0:0 /mnt/lustre/.lustre (Oops in osd_xattr_get()). setfattr -n user.test_md_ops -v foo /mnt/lustre/.lustre (Oops). setfattr: /mnt/lustre/.lustre: Operation not permitted getfattr -n user.test_md_ops /mnt/lustre/.lustre getfattr: Removing leading '/' from absolute path names file: mnt/lustre/.lustre user.test_md_ops getfattr -d /mnt/lustre/.lustre (LBUG in mo_xattr_list()). setfattr -x user.test_md_ops /mnt/lustre/.lustre (Oops in osd_xattr_get()). setfattr: /mnt/lustre/.lustre: Operation not permitted rename /mnt/lustre/d0.sanity/d154 onto /mnt/lustre/.lustre (Oops in osd_xattr_get()). mv: cannot move `/mnt/lustre/d0.sanity/d154' to `/mnt/lustre/.lustre': Directory not empty rename /mnt/lustre/.lustre into /mnt/lustre/d0.sanity/d154 sanity test_154b: @@@@@@ FAIL: mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail. Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit() = /usr/lib64/lustre/tests/test-framework.sh:3639:error() = /usr/lib64/lustre/tests/sanity.sh:7921:test_md_ops() = /usr/lib64/lustre/tests/sanity.sh:7931:test_154b() = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one() = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged() = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test() = /usr/lib64/lustre/tests/sanity.sh:7934:main() Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154b.*.1345486430.log Dumping logs only on local client. FAIL 154b (0s) == sanity test 154c: Test md operations on .lustre/fid == 11:13:50 (1345486430) stat /mnt/lustre/.lustre/fid stat: cannot stat `/mnt/lustre/.lustre/fid': No such file or directory sanity test_154c: @@@@@@ FAIL: cannot stat /mnt/lustre/.lustre/fid. Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit() = /usr/lib64/lustre/tests/test-framework.sh:3639:error() = /usr/lib64/lustre/tests/sanity.sh:7874:test_md_ops() = /usr/lib64/lustre/tests/sanity.sh:7939:test_154c() = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one() = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged() = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test() = /usr/lib64/lustre/tests/sanity.sh:7942:main() Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154c.*.1345486430.log Dumping logs only on local client. FAIL 154c (0s) ..................................................== sanity sanity.sh test complete, duration 1 sec == 11:13:50 (1345486430) /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154b mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail. /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154c cannot stat /mnt/lustre/.lustre/fid. rm: cannot remove `/mnt/lustre/d0.sanity/d154/.lustre': Operation not permitted sanity : @@@@@@ FAIL: remove sub-test dirs failed Trace dump: = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit() = /usr/lib64/lustre/tests/test-framework.sh:3639:error() = /usr/lib64/lustre/tests/test-framework.sh:3238:check_and_cleanup_lustre() = /usr/lib64/lustre/tests/sanity.sh:9648:main() Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity..*.1345486430.log Dumping logs only on local client. sanity returned 0 Finished at Mon Aug 20 11:13:50 PDT 2012 in 2s /usr/lib64/lustre/tests/auster: completed with rc 0
            pjones Peter Jones added a comment -

            Bob

            Could you please help out with this one?

            Thanks

            Peter

            pjones Peter Jones added a comment - Bob Could you please help out with this one? Thanks Peter

            I will work to get an inode approach patch out for review.

            keith Keith Mannthey (Inactive) added a comment - I will work to get an inode approach patch out for review.

            Making this a blocker for 2.1 and 2.3. The patch in http://review.whamcloud.com/3103 was merged and then reverted, because it was based on the wrong branch. That patch needs to be updated for master, since it otherwise presents a DOS possibility. There are a number of related bugs that may also need to be fixed in order to avoid potential problems.

            adilger Andreas Dilger added a comment - Making this a blocker for 2.1 and 2.3. The patch in http://review.whamcloud.com/3103 was merged and then reverted, because it was based on the wrong branch. That patch needs to be updated for master, since it otherwise presents a DOS possibility. There are a number of related bugs that may also need to be fixed in order to avoid potential problems.

            I am holding off until we make a direction decision.

            keith Keith Mannthey (Inactive) added a comment - I am holding off until we make a direction decision.

            Alex and Fan Yong, can you please comment on John's proposal? I'm all for reducing code complexity where possible.

            adilger Andreas Dilger added a comment - Alex and Fan Yong, can you please comment on John's proposal? I'm all for reducing code complexity where possible.
            jhammond John Hammond added a comment -

            From my point of view, giving .lustre/fid a real inode has some advantages:

            1. We address LU-1518, LU-1531, LU-1550, LU-1553, and some currently un-LU-ed issues mostly by deleting some special case code. Whereas, keeping .lustre/fid virtual while addressing these issues will likely mean adding code to the general case handlers. Watching master lately, the real inode approach seems much more durable in the face of the osd restructuring landings.

            2. We give admins a transparent way (chmod, chown) to make persistent changes to who can perform open-by-fid. Again without adding any special code or sysfs paramaters.

            There doesn't seem to be a difference in performance between the two approaches. What downsides do you see?

            jhammond John Hammond added a comment - From my point of view, giving .lustre/fid a real inode has some advantages: 1. We address LU-1518 , LU-1531 , LU-1550 , LU-1553 , and some currently un-LU-ed issues mostly by deleting some special case code. Whereas, keeping .lustre/fid virtual while addressing these issues will likely mean adding code to the general case handlers. Watching master lately, the real inode approach seems much more durable in the face of the osd restructuring landings. 2. We give admins a transparent way (chmod, chown) to make persistent changes to who can perform open-by-fid. Again without adding any special code or sysfs paramaters. There doesn't seem to be a difference in performance between the two approaches. What downsides do you see?

            John, I don't think .lustre/fid should be given a real inode on disk. This is a "virtual" interface for lookup-by-FID, and does not correspond to a real file at all.

            adilger Andreas Dilger added a comment - John, I don't think .lustre/fid should be given a real inode on disk. This is a "virtual" interface for lookup-by-FID, and does not correspond to a real file at all.
            jhammond John Hammond added a comment -

            Hi Keith,

            Cool.

            I was poking at a patch to mdd_device.c that would give .lustre/fid a real inode on disk. I don't know if you've been doing the same. It's not ready for primetime but it does address many of the issues above, so I will attach it here.

            Best,

            John

            jhammond John Hammond added a comment - Hi Keith, Cool. I was poking at a patch to mdd_device.c that would give .lustre/fid a real inode on disk. I don't know if you've been doing the same. It's not ready for primetime but it does address many of the issues above, so I will attach it here. Best, John

            People

              niu Niu Yawei (Inactive)
              jhammond John Hammond
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: