Lustre / LU-13901

kernel panic on kiblnd_startup

Details

    • Bug
    • Resolution: Not a Bug
    • Minor
    • None
    • Lustre 2.12.5
    • None
    • CentOS 7.8, kernel 3.10.0-1127.18.2.el7
    • 3
    • 9223372036854775807

    Description

      We are starting work on an OS upgrade to the latest CentOS 7. As part of this we are also upgrading our Lustre 2.12.5 installation. We use DKMS to manage the install. After the build we tried to mount one of our Lustre filesystems and got:

      Aug 11 14:47:26 holyitc02 kernel: [ 1184.012839] BUG: unable to handle kernel NULL pointer dereference at 0000000000000168
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.012877] IP: [<ffffffff81c4c5e6>] dev_get_flags+0x6/0x70
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.012905] PGD 8000001fb12f0067 PUD 1ffee89067 PMD 0
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.012930] Oops: 0000 [#1] SMP
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.012946] Modules linked in: ko2iblnd(OE) ptlrpc(OE+) cdc_ether usbnet mii vfat fat uas usb_storage mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter fuse obdclass(OE) lnet(OE) nfsv3 nfs_acl nfs lockd grace fscache dell_rbu libcfs(OE) netconsole ib_isert iscsi_target_mod target_core_mod ib_umad ib_ipoib mlx4_ib ext4 mbcache jbd2 sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel rfkill kvm iTCO_wdt iTCO_vendor_support dell_wmi_descriptor irqbypass mxm_wmi dcdbas crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr joydev sg ipmi_si ipmi_devintf ipmi_msghandler mei_me wmi mei lpc_ich acpi_power_meter binfmt_misc rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm ib_core libiscsi scsi_transport_iscsi ip_tables xfs libcrc32c mlx4_en sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm mlx4_core ahci drm libahci libata tg3 crct10dif_pclmul crct10dif_common crc32c_intel megaraid_sas devlink ptp drm_panel_orientation_quirks pps_core dm_mirror dm_region_hash dm_log dm_mod
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014118] CPU: 34 PID: 12727 Comm: modprobe Tainted: G OE ------------ 3.10.0-1127.18.2.el7.x86_64 #1
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014196] Hardware name: Dell Inc. PowerEdge M630/0JXJPT, BIOS 2.0.2 03/16/2016
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014254] task: ffff892773c10000 ti: ffff8907511f4000 task.ti: ffff8907511f4000
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014311] RIP: 0010:[<ffffffff81c4c5e6>] [<ffffffff81c4c5e6>] dev_get_flags+0x6/0x70
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014380] RSP: 0018:ffff8907511f7ad8 EFLAGS: 00010286
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014422] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffffffffb0
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014477] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffffffffb0
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014531] RBP: ffff8907511f7b30 R08: 000000000000000a R09: 0000000000000001
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014585] R10: 0000000000000911 R11: ffff8907511f760e R12: 0000000000000000
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014639] R13: 0000000000000000 R14: ffffffff82315c80 R15: 0000000000000000
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014695] FS: 00007f91343cd740(0000) GS:ffff89077fc40000(0000) knlGS:0000000000000000
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014756] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014801] CR2: 0000000000000168 CR3: 0000001fd12e8000 CR4: 00000000001607e0
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014856] Call Trace:
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014901] [<ffffffffc0b84189>] ? lnet_inet_enumerate+0x59/0x2d0 [lnet]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.014957] [<ffffffff8170834e>] ? getnstimeofday64+0xe/0x30
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015010] [<ffffffffc0cafa4e>] kiblnd_startup+0x2be/0x1840 [ko2iblnd]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015078] [<ffffffffc0b7d918>] lnet_startup_lndnet+0x138/0x900 [lnet]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015145] [<ffffffffc0b87f52>] ? lnet_parse_networks+0x772/0xaf0 [lnet]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015212] [<ffffffffc0b81715>] LNetNIInit+0x6b5/0xc10 [lnet]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015262] [<ffffffff81d84022>] ? mutex_lock+0x12/0x2f
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015312] [<ffffffffc0c4d000>] ? 0xffffffffc0c4cfff
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015403] [<ffffffffc0f2143c>] ptlrpc_ni_init+0x2c/0x1a0 [ptlrpc]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015498] [<ffffffffc0f215c1>] ptlrpc_init_portals+0x11/0xf0 [ptlrpc]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015595] [<ffffffffc0c4d187>] ptlrpc_init+0x187/0x1000 [ptlrpc]
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015649] [<ffffffff8160210a>] do_one_initcall+0xba/0x240
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015699] [<ffffffff8171ee5a>] load_module+0x271a/0x2bb0
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015745] [<ffffffff819b3400>] ? ddebug_proc_write+0x100/0x100
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015797] [<ffffffff8171f3df>] SyS_init_module+0xef/0x140
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015845] [<ffffffff81d92ed2>] system_call_fastpath+0x25/0x2a
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.015892] Code: 90 a0 02 00 00 48 85 d2 74 bb 4c 89 ee e8 63 9f d4 ff 48 8b 1b 49 39 dc 75 c7 5b 41 5c 41 5d 5d c3 0f 1f 40 00 0f 1f 44 00 00 55 <0f> b7 87 b8 01 00 00 8b 97 b0 01 00 00 48 89 e5 81 e2 bf fc fc
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.016255] RIP [<ffffffff81c4c5e6>] dev_get_flags+0x6/0x70
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.016307] RSP <ffff8907511f7ad8>
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.019113] CR2: 0000000000000168
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.033701] ---[ end trace 36a69e80c691a70c ]---
      Aug 11 14:47:26 holyitc02 kernel: [ 1184.103089] Kernel panic - not syncing: Fatal exception
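
      For anyone reading the trace, here is a rough sketch of how the faulting address relates to the registers. It decodes only the single instruction marked <0f> on the "Code:" line and checks that RDI plus its displacement equals CR2. The helper name decode_movzx_disp32 is made up for illustration, and the reading of the result is an assumption, not something established in this report.

```python
# Values copied from the oops above.
RDI = 0xffffffffffffffb0   # first-argument register at the time of the fault
CR2 = 0x0000000000000168   # faulting address reported by the kernel

# Bytes of the faulting instruction, starting at the <0f> marker on the "Code:" line.
insn = bytes.fromhex("0fb787b8010000")


def decode_movzx_disp32(insn: bytes) -> int:
    """Return the disp32 of a `movzwl disp32(%rdi), %reg` instruction (hypothetical helper)."""
    # 0F B7 /r is MOVZX r32, r/m16.
    assert insn[0:2] == b"\x0f\xb7", "expected opcode 0F B7"
    modrm = insn[2]
    # mod=10 means "base register + 32-bit displacement".
    assert modrm >> 6 == 0b10, "expected mod=10"
    # rm=111 without a REX prefix selects RDI as the base register.
    assert modrm & 0b111 == 0b111, "expected rm=111 (RDI)"
    return int.from_bytes(insn[3:7], "little")


disp = decode_movzx_disp32(insn)

# Address arithmetic wraps at 64 bits.
fault_addr = (RDI + disp) & (2**64 - 1)

# RDI reinterpreted as a signed 64-bit value.
signed_rdi = RDI - 2**64

print(f"displacement  = {disp:#x}")
print(f"RDI (signed)  = {signed_rdi:#x}")
print(f"RDI + disp    = {fault_addr:#x}")
print(f"matches CR2   = {fault_addr == CR2}")
```

      If the arithmetic holds, dev_get_flags was called with a pointer slightly below zero rather than exactly NULL. That would be consistent with a member offset having been subtracted from a NULL pointer somewhere in the caller, but this is a guess from the registers alone.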

       

          People

            wc-triage WC Triage
            pauledmon Paul Edmon