Details
- Bug
- Resolution: Fixed
- Major
- Lustre 2.12.0
- 3
Description
I am trying to incorporate some CentOS 7 ARM server testing into my setup, and I am hitting crashes on MDS mount.
With SELinux enabled, it oopses like this:
[ 617.809020] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 617.809487] Mem abort info:
[ 617.809701] Exception class = DABT (current EL), IL = 32 bits
[ 617.809968] SET = 0, FnV = 0
[ 617.810146] EA = 0, S1PTW = 0
[ 617.810312] Data abort info:
[ 617.810463] ISV = 0, ISS = 0x00000007
[ 617.810665] CM = 0, WnR = 0
[ 617.810864] user pgtable: 64k pages, 48-bit VAs, pgd = ffff8000c76d9200
[ 617.811221] [0000000000000000] *pgd=00000000a1370003, *pud=00000000a1370003, *pmd=00000000a1e90003, *pte=0000000000000000
[ 617.812422] Internal error: Oops: 96000007 [#1] SMP
[ 617.812842] Modules linked in: loop dm_flakey dm_mod lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache jbd2 sunrpc vfat fat crc32_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sg virtio_rng ip_tables xfs libcrc32c virtio_scsi virtio_net virtio_blk virtio_console virtio_pci virtio_mmio virtio_ring virtio
[ 617.815784] CPU: 0 PID: 5095 Comm: mount.lustre Tainted: G OE ------------ 4.14.0 #1
[ 617.816218] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ 617.816626] task: ffff8000c8f5dc00 task.stack: ffff0000211a0000
[ 617.817343] PC is at selinux_file_permission+0x68/0x154
[ 617.817631] LR is at selinux_file_permission+0x68/0x154
[ 617.817893] pc : [<ffff0000083614e0>] lr : [<ffff0000083614e0>] pstate: 60000005
[ 617.818238] sp : ffff0000211af380
[ 617.818406] x29: ffff0000211af380 x28: ffff000000b81000
[ 617.818723] x27: ffff8000dc101000 x26: ffff000000b81004
[ 617.819006] x25: ffff0000014c0440 x24: ffff000008d13c08
[ 617.819273] x23: 0000000000000893 x22: 0000000000000000
[ 617.819537] x21: ffff8000ddee1280 x20: 0000000000000004
[ 617.819811] x19: ffff8000c3229248 x18: 0000ffff917be400
[ 617.820104] x17: 0000000000000000 x16: ffff8000c8f5dc00
[ 617.820380] x15: 000000000284cc40 x14: 0000000000000000
[ 617.820647] x13: 0000000000000b88 x12: 0000000000000018
[ 617.820911] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[ 617.821189] x9 : 0000000000000000 x8 : ffff000008d13c08
[ 617.821500] x7 : 0000000000000040 x6 : 7460e9b027bc9900
[ 617.821846] x5 : 0000000000010000 x4 : 000000000000f450
[ 617.822159] x3 : ffff000000b81000 x2 : 0000000000000000
[ 617.822428] x1 : 0000000000000000 x0 : 0000000000000000
[ 617.822787] Process mount.lustre (pid: 5095, stack limit = 0xffff0000211a0000)
[ 617.823305] Call trace:
[ 617.823657] Exception stack(0xffff0000211af240 to 0xffff0000211af380)
[ 617.824175] f240: 0000000000000000 0000000000000000 0000000000000000 ffff000000b81000
[ 617.824586] f260: 000000000000f450 0000000000010000 7460e9b027bc9900 0000000000000040
[ 617.824946] f280: ffff000008d13c08 0000000000000000 7f7f7f7f7f7f7f7f 0101010101010101
[ 617.825309] f2a0: 0000000000000018 0000000000000b88 0000000000000000 000000000284cc40
[ 617.825667] f2c0: ffff8000c8f5dc00 0000000000000000 0000ffff917be400 ffff8000c3229248
[ 617.826057] f2e0: 0000000000000004 ffff8000ddee1280 0000000000000000 0000000000000893
[ 617.826497] f300: ffff000008d13c08 ffff0000014c0440 ffff000000b81004 ffff8000dc101000
[ 617.826884] f320: ffff000000b81000 ffff0000211af380 ffff0000083614e0 ffff0000211af380
[ 617.827322] f340: ffff0000083614e0 0000000060000005 ffff8000c3229248 0000000000000004
[ 617.827729] f360: 0001000000000000 0000000000000000 ffff0000211af380 ffff0000083614e0
[ 617.828421] [<ffff0000083614e0>] selinux_file_permission+0x68/0x154
[ 617.828765] [<ffff000008356848>] security_file_permission+0x58/0xf8
[ 617.829110] [<ffff0000082b1798>] iterate_dir+0x44/0x1b8
[ 617.830573] [<ffff000001e529f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]
[ 617.831760] [<ffff000001e5b8d4>] osd_initial_OI_scrub+0x9c/0x13e0 [osd_ldiskfs]
[ 617.832909] [<ffff000001e5daac>] osd_scrub_setup+0xb44/0x1118 [osd_ldiskfs]
[ 617.833977] [<ffff000001e2d4ec>] osd_device_alloc+0x544/0x950 [osd_ldiskfs]
[ 617.836078] [<ffff000000eb9d9c>] class_setup+0x7bc/0xd20 [obdclass]
[ 617.838397] [<ffff000000ec3a20>] class_process_config+0x1708/0x2e90 [obdclass]
[ 617.840457] [<ffff000000eca358>] do_lcfg+0x2b0/0x6d8 [obdclass]
[ 617.842867] [<ffff000000ecf48c>] lustre_start_simple+0x154/0x3f8 [obdclass]
[ 617.844903] [<ffff000000f04ed0>] osd_start+0x500/0xa40 [obdclass]
[ 617.847245] [<ffff000000f10a64>] server_fill_super+0x1d4/0x1848 [obdclass]
[ 617.849294] [<ffff000000ed3794>] lustre_fill_super+0x62c/0xdb0 [obdclass]
[ 617.849680] [<ffff0000082a02b4>] mount_nodev+0x5c/0xbc
[ 617.852008] [<ffff000000ecadb4>] lustre_mount+0x4c/0x80 [obdclass]
[ 617.852371] [<ffff0000082a12f8>] mount_fs+0x54/0x16c
[ 617.852627] [<ffff0000082bfb40>] vfs_kern_mount+0x58/0x154
[ 617.852886] [<ffff0000082c2fcc>] do_mount+0x1cc/0xbac
[ 617.853191] [<ffff0000082c3d34>] SyS_mount+0x88/0xd4
[ 617.853463] Exception stack(0xffff0000211afec0 to 0xffff0000211b0000)
[ 617.853771] fec0: 00000000315a0050 0000ffffd141a180 000000000040e098 0000000001000000
[ 617.854155] fee0: 00000000315a0070 0000000000000bd0 0000ffff93f0add4 0000000000000000
[ 617.854520] ff00: 0000000000000028 1999999999999999 00000000ffffffff 0000000000000005
[ 617.854871] ff20: 0000000000000005 ffffffffffffffff 000000008408bd8e 0000001ffa1dea16
[ 617.855231] ff40: 0000ffff93fa0000 00000000004301d0 0000ffffd1413860 0000ffffd1417168
[ 617.855588] ff60: 0000ffffd14171a0 0000000000000000 00000000315a0070 0000000000000000
[ 617.855980] ff80: 000000000042f000 00000000fffffff5 0000ffffd141c168 000000000042f000
[ 617.856356] ffa0: 0000ffffd1413e40 0000ffffd1413a90 0000000000404868 0000ffffd1413a90
[ 617.856746] ffc0: 0000ffff93fa0008 0000000080000000 00000000315a0050 0000000000000028
[ 617.857121] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 617.857501] [<ffff00000808359c>] __sys_trace_return+0x0/0x4
[ 617.857947] Code: d2800001 aa1503e0 52800022 97fff630 (b94002c0)
[ 617.858923] ---[ end trace 007561cc33cd3443 ]---
[ 617.859326] Kernel panic - not syncing: Fatal exception
[ 617.859686] SMP: stopping secondary CPUs
[ 617.860166] Kernel Offset: disabled
[ 617.860311] CPU features: 0x1802082
[ 617.860452] Memory Limit: none
[ 617.860660] ---[ end Kernel panic - not syncing: Fatal exception
This is not supposed to happen, since we handled SELinux in the past.
The other problem is that once I disable SELinux, the MDS mount hangs instead:
[ 243.391052] INFO: task mount.lustre:2636 blocked for more than 120 seconds.
[ 243.393134] Tainted: G OE ------------ 4.14.0 #1
[ 243.394963] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 243.399802] mount.lustre D 0 2636 2635 0x00000224
[ 243.401896] Call trace:
[ 243.403266] [<ffff000008085a8c>] __switch_to+0x8c/0xa8
[ 243.404571] [<ffff0000088154d0>] __schedule+0x328/0x860
[ 243.406277] [<ffff000008815a3c>] schedule+0x34/0x8c
[ 243.407956] [<ffff000008818f60>] rwsem_down_write_failed+0x134/0x238
[ 243.410015] [<ffff00000881839c>] down_write+0x54/0x58
[ 243.411962] [<ffff00000281b390>] osd_ios_root_fill+0xd0/0x578 [osd_ldiskfs]
[ 243.415804] [<ffff000002681798>] call_filldir+0xd8/0x148 [ldiskfs]
[ 243.419054] [<ffff000002682170>] ldiskfs_readdir+0x670/0x7b8 [ldiskfs]
[ 243.420477] [<ffff0000082b18a4>] iterate_dir+0x150/0x1b8
[ 243.421835] [<ffff0000028129f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]
[ 243.423755] [<ffff00000281b8d4>] osd_initial_OI_scrub+0x9c/0x13e0 [osd_ldiskfs]
[ 243.425517] [<ffff00000281daac>] osd_scrub_setup+0xb44/0x1118 [osd_ldiskfs]
[ 243.427166] [<ffff0000027ed4ec>] osd_device_alloc+0x544/0x950 [osd_ldiskfs]
[ 243.428993] [<ffff000001b79d9c>] class_setup+0x7bc/0xd20 [obdclass]
[ 243.430611] [<ffff000001b83a20>] class_process_config+0x1708/0x2e90 [obdclass]
[ 243.432616] [<ffff000001b8a358>] do_lcfg+0x2b0/0x6d8 [obdclass]
[ 243.434258] [<ffff000001b8f48c>] lustre_start_simple+0x154/0x3f8 [obdclass]
[ 243.436161] [<ffff000001bc4ed0>] osd_start+0x500/0xa40 [obdclass]
[ 243.438178] [<ffff000001bd0a64>] server_fill_super+0x1d4/0x1848 [obdclass]
[ 243.440867] [<ffff000001b93794>] lustre_fill_super+0x62c/0xdb0 [obdclass]
[ 243.443388] [<ffff0000082a02b4>] mount_nodev+0x5c/0xbc
[ 243.445407] [<ffff000001b8adb4>] lustre_mount+0x4c/0x80 [obdclass]
[ 243.447436] [<ffff0000082a12f8>] mount_fs+0x54/0x16c
[ 243.449257] [<ffff0000082bfb40>] vfs_kern_mount+0x58/0x154
[ 243.456371] [<ffff0000082c2fcc>] do_mount+0x1cc/0xbac
[ 243.458503] [<ffff0000082c3d34>] SyS_mount+0x88/0xd4
[ 243.460257] Exception stack(0xffff00001022fec0 to 0xffff000010230000)
[ 243.461894] fec0: 00000000057a0030 0000ffffd373c5b0 000000000040e098 0000000001000000
[ 243.464401] fee0: 00000000057a0050 0000000000000bd0 0000ffff8746add4 0000000000000000
[ 243.466183] ff00: 0000000000000028 1999999999999999 00000000ffffffff 0000000000000005
[ 243.467956] ff20: 0000000000000005 ffffffffffffffff 0000000098866d56 00000024f08e838c
[ 243.469742] ff40: 0000ffff87500000 00000000004301d0 0000ffffd3735c90 0000ffffd3739598
[ 243.471617] ff60: 0000ffffd37395d0 0000000000000000 00000000057a0050 0000000000000000
[ 243.474134] ff80: 000000000042f000 00000000fffffff5 0000ffffd373e598 000000000042f000
[ 243.476517] ffa0: 0000ffffd3736270 0000ffffd3735ec0 0000000000404868 0000ffffd3735ec0
[ 243.478290] ffc0: 0000ffff87500008 0000000080000000 00000000057a0030 0000000000000028
[ 243.480078] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 243.481970] [<ffff00000808359c>] __sys_trace_return+0x0/0x4
I am using the patch from LU-11200 to enable server-side/ldiskfs building.
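Both traces pass through the same osd_ldiskfs scrub path (osd_ios_general_scan called from osd_initial_OI_scrub) before they diverge. To compare them, the frames can be pulled out of the raw dmesg text with a small script. This is only a sketch: the names `FRAME_RE` and `extract_frames` are made up for illustration, and the pattern assumes the `[<address>] symbol+0xoff/0xsize [module]` frame format seen in the logs above.

```python
import re

# Matches one call-trace frame, for example:
#   [<ffff0000082b1798>] iterate_dir+0x44/0x1b8
#   [<ffff000001e529f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]
# The trailing "[module]" part is optional (core-kernel frames have none).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def extract_frames(dmesg_text):
    """Return (symbol, module) pairs in log order; module is None for core-kernel frames."""
    return [(m.group("sym"), m.group("mod")) for m in FRAME_RE.finditer(dmesg_text)]


if __name__ == "__main__":
    sample = (
        "[ 617.829110] [<ffff0000082b1798>] iterate_dir+0x44/0x1b8 "
        "[ 617.830573] [<ffff000001e529f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]"
    )
    for sym, mod in extract_frames(sample):
        print(sym, mod or "(kernel)")
```

Because the pattern uses `\s+` between the address and the symbol, it also works on text where a frame has been wrapped across a line break.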
Issue Links
- is related to
  - LU-12598 osd_ios_lf_fill() returns 0 on some error paths. (Resolved)
  - LU-13119 lustre-initialization crashed in common_file_perm() on SLES12 (Resolved)
  - LU-6766 add support for arm64 (Resolved)
  - LU-12977 fix i_mutex for ldiskfs_truncate() in osd_execute_truncate() (Resolved)
- is related to
  - LU-11200 Centos 8 arm64 server support (Resolved)
  - LU-12137 update client to use iterate_shared (Resolved)