
lustre-initialization crashed in common_file_perm() on SLES12

Details

    • Bug
    • Resolution: Fixed
    • Minor
    • Lustre 2.14.0

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f8038c56-3208-11ea-adca-52540065bddc

      lustre-initialization failed with the following error:

      'trevis-42vm12 crashed during lustre-initialization-1'
      

      The stack trace on the MDS looks like:

      LDISKFS-fs (dm-4): mounted filesystem with ordered data mode.
      Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
      IP: [<ffffffff812d5995>] common_file_perm+0x15/0x180
      Oops: 0000 [#1] SMP 
      Supported: No, Unsupported modules are loaded
      CPU: 0 PID: 2995 Comm: mount.lustre Tainted: 4.4.180-94.100_lustre
      Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      Call Trace:
      security_file_permission+0x3e/0xc0
      iterate_dir+0x32/0x110
      osd_ios_general_scan+0x12e/0x250 [osd_ldiskfs]
      osd_initial_OI_scrub+0x5e/0xc00 [osd_ldiskfs]
      osd_scrub_setup+0x8f5/0x960 [osd_ldiskfs]
      osd_device_alloc+0x5ac/0x8c0 [osd_ldiskfs]
      obd_setup+0xb8/0x230 [obdclass]
      class_setup+0x468/0x7c0 [obdclass]
      class_process_config+0x1890/0x27b0 [obdclass]
      do_lcfg+0x235/0x490 [obdclass]
      lustre_start_simple+0x85/0x1f0 [obdclass]
      server_fill_super+0xe81/0x1640 [obdclass]
      lustre_fill_super+0x436/0x8d0 [obdclass]
      mount_nodev+0x48/0xa0
      mount_fs+0x3a/0x170
      vfs_kern_mount+0x62/0x110
      do_mount+0x213/0xcd0
      SyS_mount+0x85/0xd0
      

      It could be that this is related to iterate_dir() taking a fake filp as an argument, and somehow filp is not filled in sufficiently for security_file_permission()->common_file_perm().
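
      The fault address in the oops (0000000000000078) fits this kind of failure: reading a member that sits at offset 0x78 inside a structure, through a NULL base pointer, faults at exactly that address. A minimal user-space sketch of the arithmetic follows. The struct model_obj layout and the helper model_fault_address() are hypothetical, chosen only to reproduce the offset; the real structure and offset depend on the kernel version and configuration.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical stand-in for a kernel structure. The padding exists only
       * so that `member` lands at offset 0x78, matching the oops address. */
      struct model_obj {
              char  other_fields[0x78];
              void *member;
      };

      /* Address that would be touched when reading base->member. */
      static uintptr_t model_fault_address(uintptr_t base)
      {
              return base + offsetof(struct model_obj, member);
      }
      ```

      Calling model_fault_address(0) models the NULL-pointer case and yields 0x78. This only narrows the search to a member access through a NULL pointer; it does not by itself identify which pointer was NULL.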

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      lustre-initialization lustre-initialization - 'trevis-42vm12 crashed during lustre-initialization-1'

          Activity

            [LU-13119] lustre-initialization crashed in common_file_perm() on SLES12
            pjones Peter Jones added a comment -

            Landed for 2.14


            gerrit Gerrit Updater added a comment -

            Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37184/
            Subject: LU-13119 osd-ldiskfs: set f_cred for app armour
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 33082e057d214793c70085a33f1d82b3915db3a9
            arshad512 Arshad Hussain added a comment -

            Seen on master: https://testing.whamcloud.com/test_sets/aedd7978-3a88-11ea-80b4-52540065bddc

            adilger Andreas Dilger added a comment -

            Sigh. I didn't commit my comment until now...

            adilger Andreas Dilger added a comment -

            It seems that the SLES kernel is trying to apply AppArmor security policies to the Lustre filesystem (not sure why), since common_file_perm() only exists in security/apparmor/lsm.c:

            static int common_file_perm(int op, struct file *file, u32 mask)
            {
                    struct aa_file_cxt *fcxt = file->f_security;
                    struct aa_profile *profile, *fprofile = aa_cred_profile(file->f_cred);
                    int error = 0;

                    BUG_ON(!fprofile);

                    if (!file->f_path.mnt ||
                        !mediated_filesystem(file->f_path.dentry))
                            return 0;

                    profile = __aa_current_profile();

                    /* revalidate access, if task is unconfined, or the cached cred
                     * doesn't match or if the request is for more permissions than
                     * was granted.
                     *
                     * Note: the test for !unconfined(fprofile) is to handle file
                     *       delegation from unconfined tasks
                     */
                    if (!unconfined(profile) && !unconfined(fprofile) &&
                        ((fprofile != profile) || (mask & ~fcxt->allow)))
                            error = aa_file_perm(op, profile, file, mask);

                    return error;
            }


            static int apparmor_file_permission(struct file *file, int mask)
            {
                    return common_file_perm(OP_FPERM, file, mask);
            }

            int iterate_dir(struct file *file, struct dir_context *ctx)
            {
                    struct inode *inode = file_inode(file);
                    bool shared = false;
                    int res = -ENOTDIR;
                    if (file->f_op->iterate_shared)
                            shared = true;
                    else if (!file->f_op->iterate)
                            goto out;

                    res = security_file_permission(file, MAY_READ);
                    :
                    :
            }

            osd_ios_general_scan(struct osd_thread_info *info, struct osd_device *dev,
                                 struct dentry *dentry, filldir_t filldir)
            {
                    struct file                  *filp  = &info->oti_file;
                    struct inode                 *inode = dentry->d_inode;
                    const struct file_operations *fops  = inode->i_fop;

                    filp->f_pos = 0;
                    filp->f_path.dentry = dentry;
                    filp->f_flags |= O_NOATIME;
                    filp->f_mode = FMODE_64BITHASH | FMODE_NONOTIFY;
                    filp->f_mapping = inode->i_mapping;
                    filp->f_op = fops;
                    filp->private_data = NULL;
                    set_file_inode(filp, inode);
                    rc = osd_security_file_alloc(filp);
                    if (rc)
                            RETURN(rc);

                    do {
                            buf.oifb_items = 0;
                            rc = iterate_dir(filp, &buf.ctx);
                    } while (rc >= 0 && buf.oifb_items > 0 &&

            gerrit Gerrit Updater added a comment -

            James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/37184
            Subject: LU-13119 osd-ldiskfs: set f_cred for app armour
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 82d5ab9c64a290791be79b83ff006c926ddc0311

            simmonsja James A Simmons added a comment -

            SUSE is using AppArmor, which has different requirements. osd-ldiskfs open-codes struct file creation instead of using alloc_file_pseudo(), so some fields are missed. Looking at common_file_perm() in the SUSE kernel code, it expects file->f_cred to be set. Eventually I would like to move to alloc_file_pseudo(), but that is a bit tricky given the way struct file data structures are handled in ldiskfs as scratch areas.
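
            The failure mode and the fix described in this comment can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every model_* type and function below is a hypothetical stand-in rather than a kernel or Lustre definition, and the permission hook returns -1 at the point where the real hook would dereference a NULL f_cred and oops.

            ```c
            #include <assert.h>
            #include <stddef.h>
            #include <string.h>

            /* Hypothetical user-space stand-ins; not the kernel definitions. */
            struct model_profile { int unconfined; };
            struct model_cred    { struct model_profile *security; };
            struct model_file    { const struct model_cred *f_cred; };

            static struct model_profile model_unconfined = { .unconfined = 1 };
            static const struct model_cred model_task_cred = {
                    .security = &model_unconfined,
            };

            /* Mimics a hook that reads file->f_cred->security. Returns -1 where
             * the real code would fault on a NULL f_cred, 0 otherwise. */
            static int model_file_perm(const struct model_file *file)
            {
                    assert(file != NULL);
                    if (file->f_cred == NULL)
                            return -1;
                    return file->f_cred->security != NULL ? 0 : -1;
            }

            /* Open-coded scratch-file setup: zeroed, so f_cred stays NULL. */
            static void model_file_init_open_coded(struct model_file *file)
            {
                    memset(file, 0, sizeof(*file));
            }

            /* Same setup plus the missing step: point f_cred at the caller's creds. */
            static void model_file_init_with_cred(struct model_file *file)
            {
                    model_file_init_open_coded(file);
                    file->f_cred = &model_task_cred;
            }
            ```

            With model_file_init_open_coded() the hook reports the failure; with model_file_init_with_cred() it succeeds. The model says nothing about how the merged patch obtains the credentials; it only shows why an unset f_cred is fatal for a hook that dereferences it unconditionally.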

            adilger Andreas Dilger added a comment -

            According to the data on the Maloo page it is SLES12.3.

            simmonsja James A Simmons added a comment -

            Which SLES is this?

            adilger Andreas Dilger added a comment -

            It looks like the use of iterate_dir() was introduced by patch https://review.whamcloud.com/34714 "LU-11832 ldiskfs: properly handle VFS parallel locking".

            People

              simmonsja James A Simmons
              maloo Maloo