Missing/bad operations in mdd_{obf,dot_lustre}_obj_op causing LBUGs

    Description

      An unprivileged user can cause an MDS LBUG by issuing "chmod +x /mnt/lustre/.lustre/fid". Similarly, root can cause an LBUG by issuing {get,set}facl calls against this directory. Privileged users cannot change the attributes of /mnt/lustre/.lustre.
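The failure class named in the title can be sketched as follows, assuming the dispatch path asserts that an operation slot is populated before calling through it. The struct, tables, and function names are hypothetical stand-ins for illustration, not the actual mdd definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-object operations table. */
struct obj_ops {
    int (*attr_set)(int mode);
    int (*xattr_get)(const char *name);
};

/* Stubs for operations that a special directory should refuse. */
static int refuse_attr_set(int mode) { (void)mode; return -EPERM; }
static int refuse_xattr_get(const char *name) { (void)name; return -EPERM; }

/* Table with empty slots: dispatching through it trips the assertion. */
static const struct obj_ops incomplete_ops = {
    .attr_set = NULL,
    .xattr_get = NULL,
};

/* Table with every slot filled: callers get an error code, not a crash. */
static const struct obj_ops complete_ops = {
    .attr_set = refuse_attr_set,
    .xattr_get = refuse_xattr_get,
};

/* Returns 1 if every slot in the table is populated. */
static int ops_complete(const struct obj_ops *ops)
{
    return ops->attr_set != NULL && ops->xattr_get != NULL;
}

/* Dispatch wrapper: asserts that the slot exists, then calls it. */
static int do_attr_set(const struct obj_ops *ops, int mode)
{
    assert(ops->attr_set != NULL);
    return ops->attr_set(mode);
}
```

Calling `do_attr_set(&complete_ops, 0755)` returns `-EPERM`, while the same call on `incomplete_ops` aborts on the assertion. The trade-off in the sketch is that a fully populated table turns an unsupported request into an ordinary error for the caller.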

          Activity

            [LU-1518] Missing/bad operations in mdd_{obf,dot_lustre}_obj_op causing LBUGs
            jhammond John Hammond added a comment -

            Please add LU-1777 as a sub-issue, as I guess I don't have sufficient Jira clout to do so.

            bogl Bob Glossman (Inactive) added a comment - - edited

            patch with incremental special case test in mdt_reint_setattr()


            bogl Bob Glossman (Inactive) added a comment -

            I see that cut/paste didn't go in very nicely. Will add the patch as an attachment with better formatting.
            bogl Bob Glossman (Inactive) added a comment - - edited

            John,
            I think the small additional patch below will avoid the BUG you see on chown of .lustre/fid. I really hate adding even more special case code, but I don't see a way around it in the current framework. I note that many other mdt_reint ops already have special case tests for mdt_object_obf(). I'm not expert enough to spot additional places that might also need special case checks.

            --- a/lustre/mdt/mdt_reint.c
            +++ b/lustre/mdt/mdt_reint.c
            @@ -492,6 +492,9 @@ static int mdt_reint_setattr(struct mdt_thread_info *info,
                     if (IS_ERR(mo))
                             GOTO(out, rc = PTR_ERR(mo));
             
            +       if (mdt_object_obf(mo))
            +               GOTO(out_put, rc = -EPERM);
            +
                     /* start a log jounal handle if needed */
                     if (!(mdt_conn_flags(info) & OBD_CONNECT_SOM)) {
                             if ((ma->ma_attr.la_valid & LA_SIZE) ||
            
            
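The shape of the patch above (reject the special object through the cleanup label before any code that assumes an inode runs) can be sketched in isolation. This is a minimal illustration with hypothetical types and helpers. What the `out_put` label does in `mdt_reint_setattr()` is not shown in this thread, so the sketch assumes it drops the object reference:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical object: `special` marks a virtual directory with no inode. */
struct obj {
    int  refcount;
    int  special;
    int *inode_mode;   /* NULL for special objects */
};

static void obj_get(struct obj *o) { o->refcount++; }

static void obj_put(struct obj *o)
{
    assert(o->refcount > 0);
    o->refcount--;
}

/* Set the mode, rejecting special objects before the inode is touched. */
static int obj_setattr(struct obj *o, int mode)
{
    int rc = 0;

    obj_get(o);
    if (o->special) {
        rc = -EPERM;
        goto out_put;        /* bail out, but still drop the reference */
    }
    *o->inode_mode = mode;   /* only reached for objects with an inode */
out_put:
    obj_put(o);
    return rc;
}
```

For an object with `special` set and `inode_mode == NULL`, `obj_setattr()` returns `-EPERM` and leaves the reference count balanced; for a regular object it stores the mode and returns 0.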

            bogl Bob Glossman (Inactive) added a comment -

            John, given this LBUG I'm surprised you gave http://review.whamcloud.com/#change,3726 a +review.

            jhammond John Hammond added a comment -

            I think this patch is fine for 2.1.x, but it may not be the right approach for master (I tried 2.2.93 + epsilon). There are places where the osd code assumes an inode, and some of these are reached before any of the members of mdd_{obf,dot_lustre}_obj_ops are invoked.

            # uname -r
            2.6.32-279.5.1.el6.x86_64
            # cd /usr/src/lustre-release/
            # git show HEAD | head
            commit f67757d350fe010592927a45f0c99d9551165a3b
            Author: John L. Hammond <jhammond@tacc.utexas.edu>
            Date:   Wed Jun 13 11:20:12 2012 -0500
            
                LU-1518 mdd: Fixup mdd_{obf,dot_lustre}_obj_ops.
            
                Define several missing md_object ops for .lustre/fid.  Unify
                attribute handling for .lustre with that of normal md_objects.
            
                Signed-off-by: John L. Hammond <jhammond@tacc.utexas.edu>
            # ./lustre/tests/llmount.sh
            ...
            # cat /proc/fs/lustre/version
            lustre: 2.2.93
            kernel: patchless_client
            build:  2.2.93-gbaaf628-CHANGED-2.6.32-279.5.1.el6.x86_64
            # cd /mnt/lustre/
            # su sanity
            $ chown sanity: .lustre
            chown: changing ownership of `.lustre': Operation not permitted
            $ chown sanity: .lustre/fid
            
            BUG: unable to handle kernel NULL pointer dereference at 0000000000000340
            IP: [<ffffffffa0aa7c90>] osd_xattr_get+0x170/0x350 [osd_ldiskfs]
            PGD 5d941067 PUD 5d95b067 PMD 0
            Oops: 0000 [#1] SMP
            last sysfs file: /sys/devices/system/cpu/possible
            CPU 0
            Modules linked in: lustre(U) obdfilter(U) ost(U) cmm(U) mdt(U) osd_ldiskfs(U) fsfilt_ldiskfs(U) ldi\
            skfs(U) exportfs mdd(U) mds(U) mgs(U) lquota(U) jbd obdecho(U) mgc(U) lov(U) osc(U) mdc(U) lmv(U) f\
            id(U) fld(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) sha512_generic sha256_generic libcfs\
            (U) autofs4 nfs lockd fscache nfs_acl auth_rpcgss sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv\
            4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6ta\
            ble_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod microcode virtio_balloon virtio_n\
            et i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_gene\
            ric ata_piix [last unloaded: speedstep_lib]
            
            Pid: 2653, comm: mdt00_002 Not tainted 2.6.32-279.5.1.el6.x86_64 #1 Bochs Bochs
            RIP: 0010:[<ffffffffa0aa7c90>]  [<ffffffffa0aa7c90>] osd_xattr_get+0x170/0x350 [osd_ldiskfs]
            RSP: 0018:ffff8800667d1af0  EFLAGS: 00010246
            RAX: 0000000000000000 RBX: ffffffffffffff30 RCX: 0000000000000000
            RDX: ffff88006667e2c0 RSI: ffffffffa04c7a3c RDI: ffffffffa0ac872b
            RBP: ffff8800667d1b30 R08: fffffffffffffffe R09: 00000000fffffffe
            R10: 0000000000000000 R11: 0000000000000004 R12: ffff8800667d1b58
            R13: ffff880066762b80 R14: ffffffffa04c7a2c R15: 0000000000000000
            FS:  00007fdae5c05700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
            CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
            CR2: 0000000000000340 CR3: 000000005d8e6000 CR4: 00000000000006f0
            DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
            DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
            Process mdt00_002 (pid: 2653, threadinfo ffff8800667d0000, task ffff88007bc09500)
            Stack:
             fffffffffffffffe ffff8800667d8000 ffff8800667c3000 ffff88007d366780
            <d> ffff8800667c3000 ffff8800667c3250 ffff88007ce16000 ffff8800667c3010
            <d> ffff8800667d1b60 ffffffffa0497f84 ffff8800667d1b58 0000000000000008
            Call Trace:
             [<ffffffffa0497f84>] dt_version_get+0x54/0x190 [obdclass]
             [<ffffffffa0b2aa3f>] mdt_obj_version_get+0x6f/0x1f0 [mdt]
             [<ffffffffa0b2b73f>] mdt_version_get_check_save+0x2f/0xf0 [mdt]
             [<ffffffffa0b30a51>] mdt_attr_set+0x251/0x590 [mdt]
             [<ffffffffa0b310f5>] mdt_reint_setattr+0x365/0x1330 [mdt]
             [<ffffffffa062f886>] ? __req_capsule_get+0x176/0x750 [ptlrpc]
             [<ffffffffa0603646>] ? lustre_pack_reply_flags+0xb6/0x210 [ptlrpc]
             [<ffffffffa0b2a281>] mdt_reint_rec+0x41/0xe0 [mdt]
             [<ffffffffa0b23ada>] mdt_reint_internal+0x50a/0x810 [mdt]
             [<ffffffffa0b23e24>] mdt_reint+0x44/0xe0 [mdt]
             [<ffffffffa0b17932>] mdt_handle_common+0x922/0x1740 [mdt]
             [<ffffffffa0b18825>] mdt_regular_handle+0x15/0x20 [mdt]
             [<ffffffffa061382d>] ptlrpc_server_handle_request+0x40d/0xea0 [ptlrpc]
             [<ffffffffa031765e>] ? cfs_timer_arm+0xe/0x10 [libcfs]
             [<ffffffffa060acb7>] ? ptlrpc_wait_event+0xa7/0x2a0 [ptlrpc]
             [<ffffffff810533f3>] ? __wake_up+0x53/0x70
             [<ffffffffa0614e19>] ptlrpc_main+0xb59/0x1860 [ptlrpc]
             [<ffffffffa06142c0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
             [<ffffffff8100c14a>] child_rip+0xa/0x20
             [<ffffffffa06142c0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
             [<ffffffffa06142c0>] ? ptlrpc_main+0x0/0x1860 [ptlrpc]
             [<ffffffff8100c140>] ? child_rip+0x0/0x20
            Code: 33 4b 03 00 00 00 00 00 c7 05 21 4b 03 00 02 00 00 00 48 c7 c7 80 c7 ad a0 48 8b 48 40 48 8b \
            93 10 04 00 00 31 c0 e8 50 f8 87 ff <48> 8b 83 10 04 00 00 49 89 04 24 b8 08 00 00 00 48 8b 5d d8 4\
            c
            RIP  [<ffffffffa0aa7c90>] osd_xattr_get+0x170/0x350 [osd_ldiskfs]
             RSP <ffff8800667d1af0>
            CR2: 0000000000000340
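The fault address in the oops above can be cross-checked from the dump itself. The bytes starting at the marked `<48>` in the Code: line are `48 8b 83 10 04 00 00`, which reads as a 64-bit load from `0x410(%rbx)`. With RBX = ffffffffffffff30, the sum wraps around to 0x340, matching CR2. A small negative base like this is consistent with pointer arithmetic performed on a NULL pointer, although the thread does not say which structure is involved. The helper names below are hypothetical; the sketch only reproduces the arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Decode the 32-bit displacement of a `mov disp32(%reg), %reg64` load.
 * Expects REX.W (0x48), opcode 0x8b, and a ModRM byte with mod == 10b. */
static int32_t mov_disp32(const uint8_t insn[7])
{
    assert(insn[0] == 0x48 && insn[1] == 0x8b);
    assert((insn[2] >> 6) == 2);   /* mod = 10b: a disp32 follows ModRM */

    /* The displacement is stored little-endian after the ModRM byte. */
    uint32_t d = (uint32_t)insn[3]
               | ((uint32_t)insn[4] << 8)
               | ((uint32_t)insn[5] << 16)
               | ((uint32_t)insn[6] << 24);
    return (int32_t)d;
}

/* Effective address of base + sign-extended displacement, modulo 2^64. */
static uint64_t effective_address(uint64_t base, int32_t disp)
{
    return base + (uint64_t)(int64_t)disp;
}
```

Feeding the seven instruction bytes and the RBX value from the register dump into these helpers yields a displacement of 0x410 and an effective address of 0x340.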

            bogl Bob Glossman (Inactive) added a comment -

            And to answer your question, yes, the existing open-by-fid test works fine. Example results:

            == sanity test 154: Open-by-FID == 13:24:16 (1345494256)
            stat fid [0x200000400:0x3:0x0]
            touch fid [0x200000400:0x3:0x0]
            write to fid [0x200000400:0x3:0x0]
            read fid [0x200000400:0x3:0x0]
            append write to fid [0x200000400:0x3:0x0]
            rename fid [0x200000400:0x3:0x0]
            mv: cannot move `/mnt/lustre/.lustre/fid/[0x200000400:0x3:0x0]' to `/mnt/lustre/f.sanity.154.1': Operation not permitted
            mv: cannot move `/mnt/lustre/f.sanity.154.1' to `/mnt/lustre/.lustre/fid/[0x200000400:0x3:0x0]': Operation not permitted
            truncate fid [0x200000400:0x3:0x0]
            link fid [0x200000400:0x3:0x0]
            setfacl fid [0x200000400:0x3:0x0]
            getfacl fid [0x200000400:0x3:0x0]
            getfacl: Removing leading '/' from absolute path names
            unlink fid [0x200000400:0x3:0x0]
            unlink: cannot unlink `/mnt/lustre/.lustre/fid/[0x200000400:0x3:0x0]': Operation not permitted
            mknod fid [0x200000400:0x3:0x0]
            mknod: `/mnt/lustre/.lustre/fid/[0x200000400:0x3:0x0]': Operation not permitted
            stat non-exist fid [0xf00000400:0x1:0x0]
            stat: cannot stat `/mnt/lustre/.lustre/fid/[0xf00000400:0x1:0x0]': No such file or directory
            write to non-exist fid [0xf00000400:0x1:0x0]
            /usr/lib64/lustre/tests/sanity.sh: line 7846: /mnt/lustre/.lustre/fid/[0xf00000400:0x1:0x0]: Operation not permitted
            link new fid [0xf00000400:0x1:0x0]
            ln: creating hard link `/mnt/lustre/.lustre/fid/[0xf00000400:0x1:0x0]' => `/mnt/lustre/f.sanity.154': Operation not permitted
            ls [0x200000400:0xb:0x0]
            touch [0x200000400:0xb:0x0]/f.sanity.154.1
            touch /mnt/lustre/.lustre/fid/f.sanity.154
            touch: setting times of `/mnt/lustre/.lustre/fid/f.sanity.154': Invalid argument
            Open-by-FID succeeded
            Resetting fail_loc on all nodes...done.
            PASS 154 (1s)

            bogl Bob Glossman (Inactive) added a comment -

            A refreshed version is available for review at http://review.whamcloud.com/#change,3726

            I tend to agree that the approach minimizing special case code would be more elegant, but I couldn't massage the existing attachment into a working form. I refreshed the old commit as the quicker path.
            jhammond John Hammond added a comment -

            Hi Bob,

            First off, please post your version when you get a chance.

            If you're basing on 3013 then .lustre has a special lookup() implementation. So logically no creates, links, unlinks, rmdirs, or renames (to/from/into/onto) should be allowed---since you could never access the file by that name afterwards.

            I say allow chown, chmod, getfattr, setfattr on .lustre and .lustre/fid.

            stat '.lustre/fid' should succeed. That it failed is worrisome. After your patch did you check that open-by-fid still worked as intended?

            All that said, I think the real-boy approach is much more sound in the long term. I will rebase my build setup to see about porting the patch to master.

            Thanks,

            John

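The policy proposed in the comment above (refuse namespace changes under .lustre, allow attribute operations and stat) can be written down as a small lookup table. This is only a sketch of that proposal; the enum and function names are hypothetical and do not correspond to Lustre identifiers:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical operation codes for the calls discussed in the comment. */
enum md_op {
    OP_STAT, OP_CHOWN, OP_CHMOD, OP_GETFATTR, OP_SETFATTR,
    OP_CREATE, OP_LINK, OP_UNLINK, OP_RMDIR, OP_RENAME,
    OP_COUNT
};

/* 1 = allowed on .lustre and .lustre/fid, 0 = refused.
 * Namespace changes are refused because the special lookup() would
 * never resolve the resulting name afterwards. */
static const int op_allowed[OP_COUNT] = {
    [OP_STAT]     = 1,
    [OP_CHOWN]    = 1,
    [OP_CHMOD]    = 1,
    [OP_GETFATTR] = 1,
    [OP_SETFATTR] = 1,
    [OP_CREATE]   = 0,
    [OP_LINK]     = 0,
    [OP_UNLINK]   = 0,
    [OP_RMDIR]    = 0,
    [OP_RENAME]   = 0,
};

/* Returns 0 if the operation is permitted, -EPERM otherwise. */
static int special_dir_check(enum md_op op)
{
    assert((unsigned)op < OP_COUNT);
    return op_allowed[op] ? 0 : -EPERM;
}
```

Under this table `special_dir_check(OP_CHMOD)` returns 0 and `special_dir_check(OP_RENAME)` returns `-EPERM`. Keeping the decision in one table is a design choice of the sketch, which makes it easy to compare against what a test such as 154b expects to pass or fail.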

            bogl Bob Glossman (Inactive) added a comment -

            John,
            I couldn't get the real inode approach from your attachment to fly at all. Instead I've been looking at rebasing the patch in http://review.whamcloud.com/#change,3103 on current master. I've gotten it to build & run, but can't get it to pass the sanity tests 154b,c from your sanity tests patch. I edited out all the echo "SKIPPING" to enable everything. I'm not seeing any oops from missing ops, but I'm not seeing passes either. I'm not clear about which ops are supposed to succeed and which are supposed to fail. Results look like:

            == sanity test 154b: Test md operations on .lustre == 11:13:50 (1345486430)
            stat /mnt/lustre/.lustre
            touch /mnt/lustre/.lustre
            ln /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154/x154
            ln: `/mnt/lustre/.lustre': hard link not allowed for directory
            touch /mnt/lustre/.lustre/f154
            touch .lustre/f154 should fail.
            mknod /mnt/lustre/.lustre/p154 p
            mknod .lustre/p154 p should fail.
            mkdir /mnt/lustre/.lustre/d154
            mkdir .lustre/d154 should fail.
            ls /mnt/lustre/.lustre
            d154
            f154
            p154
            chmod ugo+rwx /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            skipping chown 0:0 /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            setfattr -n user.test_md_ops -v foo /mnt/lustre/.lustre (Oops).
            setfattr: /mnt/lustre/.lustre: Operation not permitted
            getfattr -n user.test_md_ops /mnt/lustre/.lustre
            getfattr: Removing leading '/' from absolute path names

            # file: mnt/lustre/.lustre
            user.test_md_ops

            getfattr -d /mnt/lustre/.lustre (LBUG in mo_xattr_list()).
            setfattr -x user.test_md_ops /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            setfattr: /mnt/lustre/.lustre: Operation not permitted
            rename /mnt/lustre/d0.sanity/d154 onto /mnt/lustre/.lustre (Oops in osd_xattr_get()).
            mv: cannot move `/mnt/lustre/d0.sanity/d154' to `/mnt/lustre/.lustre': Directory not empty
            rename /mnt/lustre/.lustre into /mnt/lustre/d0.sanity/d154
            sanity test_154b: @@@@@@ FAIL: mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail.
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/sanity.sh:7921:test_md_ops()
            = /usr/lib64/lustre/tests/sanity.sh:7931:test_154b()
            = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one()
            = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged()
            = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test()
            = /usr/lib64/lustre/tests/sanity.sh:7934:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154b.*.1345486430.log
            Dumping logs only on local client.
            FAIL 154b (0s)

            == sanity test 154c: Test md operations on .lustre/fid == 11:13:50 (1345486430)
            stat /mnt/lustre/.lustre/fid
            stat: cannot stat `/mnt/lustre/.lustre/fid': No such file or directory
            sanity test_154c: @@@@@@ FAIL: cannot stat /mnt/lustre/.lustre/fid.
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/sanity.sh:7874:test_md_ops()
            = /usr/lib64/lustre/tests/sanity.sh:7939:test_154c()
            = /usr/lib64/lustre/tests/test-framework.sh:3872:run_one()
            = /usr/lib64/lustre/tests/test-framework.sh:3901:run_one_logged()
            = /usr/lib64/lustre/tests/test-framework.sh:3721:run_test()
            = /usr/lib64/lustre/tests/sanity.sh:7942:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity.test_154c.*.1345486430.log
            Dumping logs only on local client.
            FAIL 154c (0s)
            ..................................................== sanity sanity.sh test complete, duration 1 sec == 11:13:50 (1345486430)
            /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154b mv -T /mnt/lustre/.lustre /mnt/lustre/d0.sanity/d154 should fail.
            /usr/lib64/lustre/tests/sanity.sh: FAIL: test_154c cannot stat /mnt/lustre/.lustre/fid.
            rm: cannot remove `/mnt/lustre/d0.sanity/d154/.lustre': Operation not permitted
            sanity : @@@@@@ FAIL: remove sub-test dirs failed
            Trace dump:
            = /usr/lib64/lustre/tests/test-framework.sh:3617:error_noexit()
            = /usr/lib64/lustre/tests/test-framework.sh:3639:error()
            = /usr/lib64/lustre/tests/test-framework.sh:3238:check_and_cleanup_lustre()
            = /usr/lib64/lustre/tests/sanity.sh:9648:main()
            Dumping lctl log to /tmp/test_logs/2012-08-20/111348/sanity..*.1345486430.log
            Dumping logs only on local client.
            sanity returned 0
            Finished at Mon Aug 20 11:13:50 PDT 2012 in 2s
            /usr/lib64/lustre/tests/auster: completed with rc 0
            pjones Peter Jones added a comment -

            Bob

            Could you please help out with this one?

            Thanks

            Peter


            People

              niu Niu Yawei (Inactive)
              jhammond John Hammond