[LU-11832] ARM servers crashing on MDS startup Created: 02/Jan/19  Updated: 09/Jan/20  Resolved: 14/Dec/19

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: Lustre 2.12.0
Fix Version/s: Lustre 2.14.0

Type: Bug Priority: Major
Reporter: Oleg Drokin Assignee: James A Simmons
Resolution: Fixed Votes: 0
Labels: arm

Issue Links:
Related
is related to LU-11200 Centos 8 arm64 server support Resolved
is related to LU-12137 update client to use iterate_shared Resolved
is related to LU-12598 osd_ios_lf_fill() returns 0 on some e... Resolved
is related to LU-13119 lustre-initialization crashed in comm... Resolved
is related to LU-6766 add support for arm64 Resolved
is related to LU-12977 fix i_mutex for ldiskfs_truncate() in... Resolved
Severity: 3
Rank (Obsolete): 9223372036854775807

 Description   

I am trying to incorporate some CentOS 7 ARM server testing into my setup, and I am hitting crashes on MDS mount.

If I have SELinux enabled, it oopses like this:

[  617.809020] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  617.809487] Mem abort info:
[  617.809701]   Exception class = DABT (current EL), IL = 32 bits
[  617.809968]   SET = 0, FnV = 0
[  617.810146]   EA = 0, S1PTW = 0
[  617.810312] Data abort info:
[  617.810463]   ISV = 0, ISS = 0x00000007
[  617.810665]   CM = 0, WnR = 0
[  617.810864] user pgtable: 64k pages, 48-bit VAs, pgd = ffff8000c76d9200
[  617.811221] [0000000000000000] *pgd=00000000a1370003, *pud=00000000a1370003, *pmd=00000000a1e90003, *pte=0000000000000000
[  617.812422] Internal error: Oops: 96000007 [#1] SMP
[  617.812842] Modules linked in: loop dm_flakey dm_mod lustre(OE) ofd(OE) osp(OE) lod(OE) ost(OE) mdt(OE) mdd(OE) mgs(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) lfsck(OE) obdecho(OE) mgc(OE) lov(OE) mdc(OE) osc(OE) lmv(OE) fid(OE) fld(OE) ptlrpc_gss(OE) ptlrpc(OE) obdclass(OE) ksocklnd(OE) lnet(OE) libcfs(OE) ext4 mbcache jbd2 sunrpc vfat fat crc32_ce ghash_ce sha2_ce sha256_arm64 sha1_ce sg virtio_rng ip_tables xfs libcrc32c virtio_scsi virtio_net virtio_blk virtio_console virtio_pci virtio_mmio virtio_ring virtio
[  617.815784] CPU: 0 PID: 5095 Comm: mount.lustre Tainted: G           OE  ------------   4.14.0 #1
[  617.816218] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[  617.816626] task: ffff8000c8f5dc00 task.stack: ffff0000211a0000
[  617.817343] PC is at selinux_file_permission+0x68/0x154
[  617.817631] LR is at selinux_file_permission+0x68/0x154
[  617.817893] pc : [<ffff0000083614e0>] lr : [<ffff0000083614e0>] pstate: 60000005
[  617.818238] sp : ffff0000211af380
[  617.818406] x29: ffff0000211af380 x28: ffff000000b81000 
[  617.818723] x27: ffff8000dc101000 x26: ffff000000b81004 
[  617.819006] x25: ffff0000014c0440 x24: ffff000008d13c08 
[  617.819273] x23: 0000000000000893 x22: 0000000000000000 
[  617.819537] x21: ffff8000ddee1280 x20: 0000000000000004 
[  617.819811] x19: ffff8000c3229248 x18: 0000ffff917be400 
[  617.820104] x17: 0000000000000000 x16: ffff8000c8f5dc00 
[  617.820380] x15: 000000000284cc40 x14: 0000000000000000 
[  617.820647] x13: 0000000000000b88 x12: 0000000000000018 
[  617.820911] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f 
[  617.821189] x9 : 0000000000000000 x8 : ffff000008d13c08 
[  617.821500] x7 : 0000000000000040 x6 : 7460e9b027bc9900 
[  617.821846] x5 : 0000000000010000 x4 : 000000000000f450 
[  617.822159] x3 : ffff000000b81000 x2 : 0000000000000000 
[  617.822428] x1 : 0000000000000000 x0 : 0000000000000000 
[  617.822787] Process mount.lustre (pid: 5095, stack limit = 0xffff0000211a0000)
[  617.823305] Call trace:
[  617.823657] Exception stack(0xffff0000211af240 to 0xffff0000211af380)
[  617.824175] f240: 0000000000000000 0000000000000000 0000000000000000 ffff000000b81000
[  617.824586] f260: 000000000000f450 0000000000010000 7460e9b027bc9900 0000000000000040
[  617.824946] f280: ffff000008d13c08 0000000000000000 7f7f7f7f7f7f7f7f 0101010101010101
[  617.825309] f2a0: 0000000000000018 0000000000000b88 0000000000000000 000000000284cc40
[  617.825667] f2c0: ffff8000c8f5dc00 0000000000000000 0000ffff917be400 ffff8000c3229248
[  617.826057] f2e0: 0000000000000004 ffff8000ddee1280 0000000000000000 0000000000000893
[  617.826497] f300: ffff000008d13c08 ffff0000014c0440 ffff000000b81004 ffff8000dc101000
[  617.826884] f320: ffff000000b81000 ffff0000211af380 ffff0000083614e0 ffff0000211af380
[  617.827322] f340: ffff0000083614e0 0000000060000005 ffff8000c3229248 0000000000000004
[  617.827729] f360: 0001000000000000 0000000000000000 ffff0000211af380 ffff0000083614e0
[  617.828421] [<ffff0000083614e0>] selinux_file_permission+0x68/0x154
[  617.828765] [<ffff000008356848>] security_file_permission+0x58/0xf8
[  617.829110] [<ffff0000082b1798>] iterate_dir+0x44/0x1b8
[  617.830573] [<ffff000001e529f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]
[  617.831760] [<ffff000001e5b8d4>] osd_initial_OI_scrub+0x9c/0x13e0 [osd_ldiskfs]
[  617.832909] [<ffff000001e5daac>] osd_scrub_setup+0xb44/0x1118 [osd_ldiskfs]
[  617.833977] [<ffff000001e2d4ec>] osd_device_alloc+0x544/0x950 [osd_ldiskfs]
[  617.836078] [<ffff000000eb9d9c>] class_setup+0x7bc/0xd20 [obdclass]
[  617.838397] [<ffff000000ec3a20>] class_process_config+0x1708/0x2e90 [obdclass]
[  617.840457] [<ffff000000eca358>] do_lcfg+0x2b0/0x6d8 [obdclass]
[  617.842867] [<ffff000000ecf48c>] lustre_start_simple+0x154/0x3f8 [obdclass]
[  617.844903] [<ffff000000f04ed0>] osd_start+0x500/0xa40 [obdclass]
[  617.847245] [<ffff000000f10a64>] server_fill_super+0x1d4/0x1848 [obdclass]
[  617.849294] [<ffff000000ed3794>] lustre_fill_super+0x62c/0xdb0 [obdclass]
[  617.849680] [<ffff0000082a02b4>] mount_nodev+0x5c/0xbc
[  617.852008] [<ffff000000ecadb4>] lustre_mount+0x4c/0x80 [obdclass]
[  617.852371] [<ffff0000082a12f8>] mount_fs+0x54/0x16c
[  617.852627] [<ffff0000082bfb40>] vfs_kern_mount+0x58/0x154
[  617.852886] [<ffff0000082c2fcc>] do_mount+0x1cc/0xbac
[  617.853191] [<ffff0000082c3d34>] SyS_mount+0x88/0xd4
[  617.853463] Exception stack(0xffff0000211afec0 to 0xffff0000211b0000)
[  617.853771] fec0: 00000000315a0050 0000ffffd141a180 000000000040e098 0000000001000000
[  617.854155] fee0: 00000000315a0070 0000000000000bd0 0000ffff93f0add4 0000000000000000
[  617.854520] ff00: 0000000000000028 1999999999999999 00000000ffffffff 0000000000000005
[  617.854871] ff20: 0000000000000005 ffffffffffffffff 000000008408bd8e 0000001ffa1dea16
[  617.855231] ff40: 0000ffff93fa0000 00000000004301d0 0000ffffd1413860 0000ffffd1417168
[  617.855588] ff60: 0000ffffd14171a0 0000000000000000 00000000315a0070 0000000000000000
[  617.855980] ff80: 000000000042f000 00000000fffffff5 0000ffffd141c168 000000000042f000
[  617.856356] ffa0: 0000ffffd1413e40 0000ffffd1413a90 0000000000404868 0000ffffd1413a90
[  617.856746] ffc0: 0000ffff93fa0008 0000000080000000 00000000315a0050 0000000000000028
[  617.857121] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  617.857501] [<ffff00000808359c>] __sys_trace_return+0x0/0x4
[  617.857947] Code: d2800001 aa1503e0 52800022 97fff630 (b94002c0) 
[  617.858923] ---[ end trace 007561cc33cd3443 ]---
[  617.859326] Kernel panic - not syncing: Fatal exception
[  617.859686] SMP: stopping secondary CPUs
[  617.860166] Kernel Offset: disabled
[  617.860311] CPU features: 0x1802082
[  617.860452] Memory Limit: none
[  617.860660] ---[ end Kernel panic - not syncing: Fatal exception

This is not supposed to happen, since we were handling SELinux in the past.

The other problem is that once I disable SELinux, the MDS mount hangs instead:

[  243.391052] INFO: task mount.lustre:2636 blocked for more than 120 seconds.
[  243.393134]       Tainted: G           OE  ------------   4.14.0 #1
[  243.394963] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.399802] mount.lustre    D    0  2636   2635 0x00000224
[  243.401896] Call trace:
[  243.403266] [<ffff000008085a8c>] __switch_to+0x8c/0xa8
[  243.404571] [<ffff0000088154d0>] __schedule+0x328/0x860
[  243.406277] [<ffff000008815a3c>] schedule+0x34/0x8c
[  243.407956] [<ffff000008818f60>] rwsem_down_write_failed+0x134/0x238
[  243.410015] [<ffff00000881839c>] down_write+0x54/0x58
[  243.411962] [<ffff00000281b390>] osd_ios_root_fill+0xd0/0x578 [osd_ldiskfs]
[  243.415804] [<ffff000002681798>] call_filldir+0xd8/0x148 [ldiskfs]
[  243.419054] [<ffff000002682170>] ldiskfs_readdir+0x670/0x7b8 [ldiskfs]
[  243.420477] [<ffff0000082b18a4>] iterate_dir+0x150/0x1b8
[  243.421835] [<ffff0000028129f0>] osd_ios_general_scan+0xf8/0x2b0 [osd_ldiskfs]
[  243.423755] [<ffff00000281b8d4>] osd_initial_OI_scrub+0x9c/0x13e0 [osd_ldiskfs]
[  243.425517] [<ffff00000281daac>] osd_scrub_setup+0xb44/0x1118 [osd_ldiskfs]
[  243.427166] [<ffff0000027ed4ec>] osd_device_alloc+0x544/0x950 [osd_ldiskfs]
[  243.428993] [<ffff000001b79d9c>] class_setup+0x7bc/0xd20 [obdclass]
[  243.430611] [<ffff000001b83a20>] class_process_config+0x1708/0x2e90 [obdclass]
[  243.432616] [<ffff000001b8a358>] do_lcfg+0x2b0/0x6d8 [obdclass]
[  243.434258] [<ffff000001b8f48c>] lustre_start_simple+0x154/0x3f8 [obdclass]
[  243.436161] [<ffff000001bc4ed0>] osd_start+0x500/0xa40 [obdclass]
[  243.438178] [<ffff000001bd0a64>] server_fill_super+0x1d4/0x1848 [obdclass]
[  243.440867] [<ffff000001b93794>] lustre_fill_super+0x62c/0xdb0 [obdclass]
[  243.443388] [<ffff0000082a02b4>] mount_nodev+0x5c/0xbc
[  243.445407] [<ffff000001b8adb4>] lustre_mount+0x4c/0x80 [obdclass]
[  243.447436] [<ffff0000082a12f8>] mount_fs+0x54/0x16c
[  243.449257] [<ffff0000082bfb40>] vfs_kern_mount+0x58/0x154
[  243.456371] [<ffff0000082c2fcc>] do_mount+0x1cc/0xbac
[  243.458503] [<ffff0000082c3d34>] SyS_mount+0x88/0xd4
[  243.460257] Exception stack(0xffff00001022fec0 to 0xffff000010230000)
[  243.461894] fec0: 00000000057a0030 0000ffffd373c5b0 000000000040e098 0000000001000000
[  243.464401] fee0: 00000000057a0050 0000000000000bd0 0000ffff8746add4 0000000000000000
[  243.466183] ff00: 0000000000000028 1999999999999999 00000000ffffffff 0000000000000005
[  243.467956] ff20: 0000000000000005 ffffffffffffffff 0000000098866d56 00000024f08e838c
[  243.469742] ff40: 0000ffff87500000 00000000004301d0 0000ffffd3735c90 0000ffffd3739598
[  243.471617] ff60: 0000ffffd37395d0 0000000000000000 00000000057a0050 0000000000000000
[  243.474134] ff80: 000000000042f000 00000000fffffff5 0000ffffd373e598 000000000042f000
[  243.476517] ffa0: 0000ffffd3736270 0000ffffd3735ec0 0000000000404868 0000ffffd3735ec0
[  243.478290] ffc0: 0000ffff87500008 0000000080000000 00000000057a0030 0000000000000028
[  243.480078] ffe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  243.481970] [<ffff00000808359c>] __sys_trace_return+0x0/0x4
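
Reading the trace, this looks like a self-deadlock on the directory's i_rwsem: iterate_dir() takes that lock before calling into ldiskfs_readdir(), and the filldir actor (osd_ios_root_fill() here) then tries to down_write() the same lock, which can never succeed. A minimal sketch of the pattern, with illustrative names rather than the actual osd-ldiskfs code:

/* Sketch of the deadlock in the hung-task trace above; the filldir_t
 * signature matches the 4.14 kernel shown in the trace. */
#include <linux/fs.h>

struct scan_buf {
        struct dir_context ctx;         /* what iterate_dir() receives */
        struct inode *dir;              /* directory being iterated */
};

static int deadlocking_actor(struct dir_context *ctx, const char *name,
                             int namelen, loff_t offset, u64 ino,
                             unsigned int d_type)
{
        struct scan_buf *sb = container_of(ctx, struct scan_buf, ctx);

        /* iterate_dir() already holds sb->dir->i_rwsem (exclusive for
         * ->iterate, shared for ->iterate_shared); either way this
         * down_write() blocks forever -- the rwsem_down_write_failed()
         * frame in the trace. */
        down_write(&sb->dir->i_rwsem);
        up_write(&sb->dir->i_rwsem);
        return 0;
}

static int scan_dir(struct file *filp)
{
        struct scan_buf sb = {
                .ctx.actor = deadlocking_actor,
                .dir = file_inode(filp),
        };

        return iterate_dir(filp, &sb.ctx);
}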

I am using the patch from LU-11200 to enable server-side/ldiskfs building.



 Comments   
Comment by Andreas Dilger [ 02/Jan/19 ]

I found a bug with the 32-bit ARM client at mount time, due to the last patch landed before 2.12.0. I will submit a fix, but I don't think it relates to this issue.

Comment by James A Simmons [ 20/May/19 ]

Can people try https://review.whamcloud.com/#/c/34714? It seems to resolve these problems.

Comment by James A Simmons [ 24/Jun/19 ]

After some input from Neil Brown I have updated patch https://review.whamcloud.com/#/c/34714. ARM ldiskfs servers should now mount with no new issues.
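
For context, the relevant VFS behavior (a sketch of the kernel API, not of the patch internals): since kernel 4.7 a directory's readdir can be registered as either ->iterate, where the VFS wraps the call in an exclusive i_rwsem, or ->iterate_shared, where it only takes the lock shared and readdirs may run in parallel. demo_readdir below is a stand-in for the real ldiskfs_readdir:

/* The ->iterate vs ->iterate_shared split (kernels >= 4.7). */
#include <linux/fs.h>

static int demo_readdir(struct file *file, struct dir_context *ctx)
{
        return 0;       /* a real readdir emits entries via dir_emit() */
}

/* VFS calls this under inode_lock(): exclusive i_rwsem, so an actor
 * that re-takes the lock deadlocks. */
static const struct file_operations exclusive_dir_fops = {
        .iterate        = demo_readdir,
};

/* VFS calls this under inode_lock_shared(): concurrent readdirs are
 * allowed, and the fop plus its actors must tolerate that. */
static const struct file_operations shared_dir_fops = {
        .iterate_shared = demo_readdir,
};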

Comment by Gerrit Updater [ 14/Dec/19 ]

Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/34714/
Subject: LU-11832 ldiskfs: properly handle VFS parallel locking
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 41dd393ffbd94877dad411cfd66105379258b579

Comment by Peter Jones [ 14/Dec/19 ]

Landed for 2.14

Comment by Alexander Boyko [ 31/Dec/19 ]

After this patch the Lustre build is broken for me. I don't understand why the HAVE_DIR_CONTEXT defines were removed in this way; these kinds of kernel differences were usually handled at configure time.

/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:859:21: error: field ‘ctx’ has incomplete type
  struct dir_context ctx;
                     ^
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c: In function ‘osd_check_lmv’:
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:965:3: error: field name not in record or union initializer
   .ctx.actor = osd_stripe_dir_filldir,
   ^
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:965:3: error: (near initialization for ‘oclb’)
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:1024:3: error: implicit declaration of function ‘iterate_dir’ [-Werror=implicit-function-declaration]
   rc = iterate_dir(filp, &oclb.ctx);
   ^
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c: At top level:
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:6520:21: error: field ‘ctx’ has incomplete type
  struct dir_context ctx;
                     ^
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c: In function ‘osd_ldiskfs_it_fill’:
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:6610:3: error: field name not in record or union initializer
   .ctx.actor = osd_ldiskfs_filldir,
   ^
/home/test/lustre-release/lustre/osd-ldiskfs/osd_handler.c:6610:3: error: (near initialization for ‘buf’)

@James A Simmons, could you clarify this dirty hack?

@Peter Jones, why didn't the build catch this error during the Gerrit integration process?

Comment by James A Simmons [ 31/Dec/19 ]

If you look at commit 41dd393ffbd94877dad411cfd66105379258b579, it explains in detail why this was done. What kernel are you using? Can you point me to the source? iterate_dir() / struct dir_context is used everywhere in the kernel, so this shouldn't be broken.
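
For reference, this is the pattern those errors point at, simplified from the osd_handler.c call sites (struct and helper names approximated). Kernels from 3.11 on provide both struct dir_context and iterate_dir():

#include <linux/fs.h>

struct osd_check_lmv_buf {
        struct dir_context ctx;         /* "field 'ctx' has incomplete
                                         * type" on kernels without
                                         * struct dir_context */
        int stop;                       /* illustrative private state */
};

static int osd_stripe_dir_filldir(struct dir_context *ctx, const char *name,
                                  int namelen, loff_t offset, u64 ino,
                                  unsigned int d_type)
{
        return 0;       /* the real actor inspects the entry name */
}

static int check_lmv(struct file *filp)
{
        struct osd_check_lmv_buf oclb = {
                .ctx.actor = osd_stripe_dir_filldir,
        };

        /* "implicit declaration of 'iterate_dir'" on pre-3.11 kernels */
        return iterate_dir(filp, &oclb.ctx);
}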

Comment by Peter Jones [ 31/Dec/19 ]

aboyko, you must be building in a different way than we do in the standard CI system - builds have continued to build OK over the past couple of weeks. Once the difference is understood, it would be possible for you to set up build slaves matching your desired configuration in a manner that passes results back into Gerrit - that way situations like this can be avoided.

Comment by Alexander Boyko [ 31/Dec/19 ]

Well, I have a slightly older kernel than which_patch lists, 3.10-693.21, and Lustre worked perfectly until this fix. The configure process allows supporting a wide range of kernel versions, so I'm a bit surprised that this strict dependency was introduced.
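
A minimal sketch of the configure-time compat I mean, assuming a probe that defines HAVE_DIR_CONTEXT on kernels that have the struct (this is illustrative, not the removed Lustre code):

/* Compat shim for pre-3.11 kernels, which have neither struct
 * dir_context nor iterate_dir() and instead expose
 * vfs_readdir(file, filldir, cookie); their filldir callbacks take a
 * void *cookie first argument, so a real shim also needs a small
 * trampoline, elided here. */
#include <linux/fs.h>

#ifndef HAVE_DIR_CONTEXT
struct dir_context {
        filldir_t actor;
        loff_t pos;
};
#endif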

Comment by James A Simmons [ 31/Dec/19 ]

Went to the RHEL web site to download the source, and now it redirects you to their latest RHEL7 kernels, so I suspect this is an older-kernel issue. Currently I see in lustre/Changelog that the oldest kernel officially supported is RHEL7.5. Also, the optional kernel patches only go back to RHEL7.5. I have a local patch that removes all pre-RHEL7.5 ldiskfs patches but haven't pushed it yet, since it's unclear whether we need to support all RHEL7 versions.

Personally, I'm not a fan of supporting kernels older than RHEL7.5, for several reasons. First, kernels before RHEL7.4 don't include the Spectre fixes, which matters for sites concerned about security. I also remember issues with a broken mlx stack, and another issue with rw_sem being broken, in an earlier RHEL version. It's up to Peter what is supported.

Comment by Peter Jones [ 02/Jan/20 ]

From the discussions in the LWG to date, the plan for 2.14 is for RHEL8.1 to be the primary distro, and I would expect only the latest RHEL7 kernel to also be tested. Anything older is exposed to these kinds of failures cropping up from time to time. There are definitely options as a community to expand what is being tested, but I think it would be better to continue any such discussions in the LWG forum to allow broader participation.
