[LU-13119] lustre-initialization crashed in common_file_perm() on SLES12 Created: 09/Jan/20 Updated: 23/Jan/20 Resolved: 23/Jan/20 |
|
| Status: | Resolved |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | Lustre 2.14.0 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Maloo | Assignee: | James A Simmons |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Issue Links: |
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

This issue relates to the following test suite run: https://testing.whamcloud.com/test_sets/f8038c56-3208-11ea-adca-52540065bddc

lustre-initialization failed with the following error:

'trevis-42vm12 crashed during lustre-initialization-1'

The stack trace on the MDS looks like:

LDISKFS-fs (dm-4): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
IP: [<ffffffff812d5995>] common_file_perm+0x15/0x180
Oops: 0000 [#1] SMP
Supported: No, Unsupported modules are loaded
CPU: 0 PID: 2995 Comm: mount.lustre Tainted: 4.4.180-94.100_lustre
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
 security_file_permission+0x3e/0xc0
 iterate_dir+0x32/0x110
 osd_ios_general_scan+0x12e/0x250 [osd_ldiskfs]
 osd_initial_OI_scrub+0x5e/0xc00 [osd_ldiskfs]
 osd_scrub_setup+0x8f5/0x960 [osd_ldiskfs]
 osd_device_alloc+0x5ac/0x8c0 [osd_ldiskfs]
 obd_setup+0xb8/0x230 [obdclass]
 class_setup+0x468/0x7c0 [obdclass]
 class_process_config+0x1890/0x27b0 [obdclass]
 do_lcfg+0x235/0x490 [obdclass]
 lustre_start_simple+0x85/0x1f0 [obdclass]
 server_fill_super+0xe81/0x1640 [obdclass]
 lustre_fill_super+0x436/0x8d0 [obdclass]
 mount_nodev+0x48/0xa0
 mount_fs+0x3a/0x170
 vfs_kern_mount+0x62/0x110
 do_mount+0x213/0xcd0
 SyS_mount+0x85/0xd0

It could be that this is related to iterate_dir() taking a fake filp as an argument, and somehow filp is not filled in sufficiently for security_file_permission()->common_file_perm().

VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV |
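A fault address as small as 0x78 is the usual signature of reading a structure member through a NULL base pointer: the address touched is the base (zero) plus the member's offset. The sketch below illustrates only that arithmetic. The structure `mock_obj`, its members, and the helper `member_address()` are hypothetical and do not reflect any real kernel layout; the trace alone does not establish which pointer was NULL or which member was being read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout (an assumption for illustration only):
 * 0x78 bytes of unrelated members, followed by the member that
 * the faulting code tries to read.
 */
struct mock_obj {
	unsigned char other_members[0x78];
	void *field_read_by_callee;
};

/*
 * Address the CPU would access when reading field_read_by_callee
 * through a structure located at 'base'.
 */
static uintptr_t member_address(uintptr_t base)
{
	return base + offsetof(struct mock_obj, field_read_by_callee);
}

/* With a NULL base, the access lands at the member offset itself. */
static void check_null_base_offset(void)
{
	assert(member_address(0) == 0x78);
}
```

Calling `check_null_base_offset()` passes because the array occupies exactly 0x78 bytes and the pointer member that follows needs no extra padding.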
| Comments |
| Comment by Andreas Dilger [ 09/Jan/20 ] |
|
It looks like the use of iterate_dir() was introduced by patch https://review.whamcloud.com/34714 " |
| Comment by James A Simmons [ 09/Jan/20 ] |
|
Which SLES is this? |
| Comment by Andreas Dilger [ 09/Jan/20 ] |
|
According to the data on the Maloo page it is SLES12.3. |
| Comment by James A Simmons [ 09/Jan/20 ] |
|
SUSE is using AppArmor, which has different requirements. osd-ldiskfs open-codes the creation of struct file instead of using alloc_file_pseudo(), so some fields are missed. Looking at common_file_perm() in the SUSE kernel code, it expects file->f_cred to be set. Eventually I would like to move to alloc_file_pseudo(), but that is a bit tricky given the way struct file data structures are handled in ldiskfs as scratch areas. |
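The failure mode described in this comment can be modelled in userspace. This is a minimal sketch, not the change made in the patch for this ticket, whose contents are not shown here. All `mock_` types and functions are hypothetical stand-ins for the kernel's `struct file`, `struct cred`, and the security hook. The mock hook returns `-EFAULT` where a real hook would dereference the missing credential and crash.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock stand-ins for kernel types; only the members needed here. */
struct mock_cred {
	int profile_id;
};

struct mock_file {
	const struct mock_cred *f_cred;
	unsigned int f_mode;
};

/*
 * Models a security hook that relies on f_cred. A real hook would
 * dereference the pointer; this model reports -EFAULT instead so
 * the missing field can be observed without crashing.
 */
static int mock_file_permission(const struct mock_file *file)
{
	if (file->f_cred == NULL)
		return -EFAULT;
	return 0;
}

/*
 * Open-coded setup of a scratch file structure: it fills only the
 * fields the caller knows it needs and leaves f_cred unset.
 */
static void mock_open_coded_init(struct mock_file *file)
{
	file->f_mode = 0;
	file->f_cred = NULL;
}

/* Same setup, but also attaches the caller's credentials. */
static void mock_init_with_cred(struct mock_file *file,
				const struct mock_cred *cred)
{
	mock_open_coded_init(file);
	file->f_cred = cred;
}
```

In this model, a file prepared with `mock_open_coded_init()` fails the permission check, while one prepared with `mock_init_with_cred()` passes. A helper that allocates the structure fully would avoid the problem by construction, which is the appeal of the alloc_file_pseudo() approach mentioned above.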
| Comment by Gerrit Updater [ 10/Jan/20 ] |
|
James Simmons (jsimmons@infradead.org) uploaded a new patch: https://review.whamcloud.com/37184 |
| Comment by Andreas Dilger [ 11/Jan/20 ] |
|
It seems that the SLES kernel is trying to apply AppArmor security policies to the Lustre filesystem (not sure why), since common_file_perm() only exists in security/apparmor/lsm.c:

static int common_file_perm(int op, struct file *file, u32 mask)
{
        struct aa_file_cxt *fcxt = file->f_security;
        struct aa_profile *profile, *fprofile = aa_cred_profile(file->f_cred);
        int error = 0;

        BUG_ON(!fprofile);

        if (!file->f_path.mnt ||
            !mediated_filesystem(file->f_path.dentry))
                return 0;

        profile = __aa_current_profile();

        /* revalidate access, if task is unconfined, or the cached cred
         * doesn't match or if the request is for more permissions than
         * was granted.
         *
         * Note: the test for !unconfined(fprofile) is to handle file
         * delegation from unconfined tasks
         */
        if (!unconfined(profile) && !unconfined(fprofile) &&
            ((fprofile != profile) || (mask & ~fcxt->allow)))
                error = aa_file_perm(op, profile, file, mask);

        return error;
}

static int apparmor_file_permission(struct file *file, int mask)
{
        return common_file_perm(OP_FPERM, file, mask);
}

int iterate_dir(struct file *file, struct dir_context *ctx)
{
        struct inode *inode = file_inode(file);
        bool shared = false;
        int res = -ENOTDIR;

        if (file->f_op->iterate_shared)
                shared = true;
        else if (!file->f_op->iterate)
                goto out;

        res = security_file_permission(file, MAY_READ);
        :
        :
}

osd_ios_general_scan(struct osd_thread_info *info, struct osd_device *dev,
                     struct dentry *dentry, filldir_t filldir)
{
        struct file *filp = &info->oti_file;
        struct inode *inode = dentry->d_inode;
        const struct file_operations *fops = inode->i_fop;

        filp->f_pos = 0;
        filp->f_path.dentry = dentry;
        filp->f_flags |= O_NOATIME;
        filp->f_mode = FMODE_64BITHASH | FMODE_NONOTIFY;
        filp->f_mapping = inode->i_mapping;
        filp->f_op = fops;
        filp->private_data = NULL;
        set_file_inode(filp, inode);
        rc = osd_security_file_alloc(filp);
        if (rc)
                RETURN(rc);

        do {
                buf.oifb_items = 0;
                rc = iterate_dir(filp, &buf.ctx);
        } while (rc >= 0 && buf.oifb_items > 0 &&
| Comment by Andreas Dilger [ 11/Jan/20 ] |
|
Sigh. I didn't commit my comment until now... |
| Comment by Arshad Hussain [ 19/Jan/20 ] |
|
Seen on Master. https://testing.whamcloud.com/test_sets/aedd7978-3a88-11ea-80b4-52540065bddc |
| Comment by Gerrit Updater [ 23/Jan/20 ] |
|
Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/37184/ |
| Comment by Peter Jones [ 23/Jan/20 ] |
|
Landed for 2.14 |