Lustre / LU-17247

BUG: unable to handle kernel NULL pointer dereference in kiblnd_passive_connect

Details

    • Bug
    • Resolution: Duplicate
    • Major
    • None
    • Lustre 2.16.0
    • None
    • master, RHEL8.7
    • 3

    Description

      The server crashed with a NULL pointer dereference in kiblnd_passive_connect. The console log leading up to the crash follows.

      [14161.702631] libcfs: HW NUMA nodes: 1, HW CPU cores: 24, npartitions: 4
      [14161.705274] alg: No test for adler32 (adler32-zlib)
      [14162.456545] Key type ._llcrypt registered
      [14162.457357] Key type .llcrypt registered
      [14162.484133] Lustre: Lustre: Build Version: 2.15.58_109_g40074d3
      [14162.540341] LNet: Using FastReg for registration
      [14162.750736] LNet: Added LNI 10.0.11.209@o2ib12 [32/1024/0/180]
      [14162.950680] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
      [14162.951989] PGD 0 
      [14162.952520] Oops: 0000 [#1] SMP NOPTI
      [14162.953250] CPU: 22 PID: 201160 Comm: kworker/22:4 Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-425.13.1.el8_lustre.ddn17.x86_64 #1
      [14162.955184] Hardware name: DDN SFA400NVX2E, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      [14162.956667] Workqueue: ib_cm cm_work_handler [ib_cm]
      [14162.957565] RIP: 0010:kiblnd_passive_connect+0x1395/0x1620 [ko2iblnd]
      [14162.958644] Code: c7 05 63 81 01 00 00 01 00 00 e8 26 03 f4 ff 48 89 df ba 40 00 00 00 48 89 c6 e8 06 10 f4 ff 45 8b b4 24 24 01 00 00 49 89 c7 <48> 8b 04 25 40 00 00 00 48 8d 58 38 e8 fa 02 f4 ff 48 89 df ba 40
      [14162.961535] RSP: 0018:ff7a599b4dca79a0 EFLAGS: 00010246
      [14162.962473] RAX: ffffffffc1038f00 RBX: 0005001614010bd1 RCX: 0000000000000000
      [14162.963534] LNet: Added LNI 20.1.11.209@o2ib22 [32/1024/0/180]
      [14162.963649] RDX: ffffffffc1038f12 RSI: 0000000000000000 RDI: 0000000000000000
      [14162.965863] RBP: ff36491ca4dbcc00 R08: 0000000000000001 R09: 0000000000000000
      [14162.967015] R10: ffffffffc1038f40 R11: ffffffffc1038f12 R12: ff364925b2ba2a00
      [14162.968167] R13: ff36492daa67a5b0 R14: 0000000000000000 R15: ffffffffc1038f00
      [14162.969313] FS:  0000000000000000(0000) GS:ff36493e31b80000(0000) knlGS:0000000000000000
      [14162.970594] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [14162.971560] CR2: 0000000000000040 CR3: 0000000f8bc10003 CR4: 0000000000771ee0
      [14162.972711] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [14162.973846] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [14162.974976] PKRU: 55555554
      [14162.975553] Call Trace:
      [14162.976085]  ? xas_store+0x56/0x5a0
      [14162.976755]  kiblnd_cm_callback+0x3d7/0x1e90 [ko2iblnd]
      [14162.977639]  ? __xa_alloc_cyclic+0x49/0xe0
      [14162.978375]  cma_cm_event_handler+0x25/0xd0 [rdma_cm]
      [14162.979227]  cma_ib_req_handler+0x7d1/0x1260 [rdma_cm]
      [14162.980090]  ? update_group_capacity+0x25/0x220
      [14162.980872]  cm_process_work+0x22/0xf0 [ib_cm]
      [14162.981638]  cm_req_handler+0x7f1/0xf40 [ib_cm]
      [14162.982416]  cm_work_handler+0x79c/0xf30 [ib_cm]
      [14162.983198]  ? __switch_to+0x10c/0x450
      [14162.983872]  ? finish_task_switch+0xaf/0x2e0
      [14162.984607]  process_one_work+0x1a7/0x360
      [14162.985300]  ? create_worker+0x1a0/0x1a0
      [14162.985979]  worker_thread+0x30/0x390
      [14162.986623]  ? create_worker+0x1a0/0x1a0
      [14162.987292]  kthread+0x10b/0x130
      [14162.987874]  ? set_kthread_struct+0x50/0x50
      [14162.988577]  ret_from_fork+0x1f/0x40
      [14162.989205] Modules linked in: ko2iblnd(OE) ptlrpc(OE+) obdclass(OE) lnet(OE) libcfs(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) sunrpc intel_rapl_msr intel_rapl_common nfit libnvdimm kvm_intel kvm irqbypass iTCO_wdt ppdev iTCO_vendor_support crct10dif_pclmul crc32_pclmul bochs drm_vram_helper drm_ttm_helper ghash_clmulni_intel ttm rapl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr i2c_i801 drm joydev lpc_ich i6300esb parport_pc parport ext4 mbcache jbd2 sr_mod sd_mod cdrom t10_pi sg mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) mlx5_core(OE) mlxfw(OE) pci_hyperv_intf ahci tls libahci psample mlxdevm(OE) virtio_net libata bnxt_en crc32c_intel net_failover serio_raw virtio_blk mlx_compat(OE) virtio_scsi failover dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libcfs]
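
For triage, the key fields of the oops above can be pulled out mechanically. The following is a minimal sketch, not part of the original report: the function name `parse_oops` and the dictionary keys are hypothetical, and the three log lines are copied from the trace above. It extracts the faulting address, the RIP symbol and offset, and the instruction bytes starting at the `<..>` marker in the `Code:` line, which marks the byte RIP pointed to.

```python
import re

# Three lines copied verbatim from the console log in this report.
OOPS = """\
[14162.950680] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[14162.957565] RIP: 0010:kiblnd_passive_connect+0x1395/0x1620 [ko2iblnd]
[14162.958644] Code: c7 05 63 81 01 00 00 01 00 00 e8 26 03 f4 ff 48 89 df ba 40 00 00 00 48 89 c6 e8 06 10 f4 ff 45 8b b4 24 24 01 00 00 49 89 c7 <48> 8b 04 25 40 00 00 00 48 8d 58 38 e8 fa 02 f4 ff 48 89 df ba 40
"""


def parse_oops(text):
    """Hypothetical helper: collect fault address, RIP location and faulting bytes."""
    info = {}

    # Faulting virtual address reported on the BUG line (same value as CR2).
    m = re.search(r"NULL pointer dereference at ([0-9a-f]+)", text)
    if m:
        info["fault_addr"] = int(m.group(1), 16)

    # RIP line format: <segment>:<symbol>+0x<offset>/0x<function size> [<module>]
    m = re.search(r"RIP: [0-9a-f]+:(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+) \[(\w+)\]", text)
    if m:
        info["symbol"] = m.group(1)
        info["offset"] = int(m.group(2), 16)
        info["size"] = int(m.group(3), 16)
        info["module"] = m.group(4)

    # In the Code: line, the byte wrapped in <..> is the first byte at RIP.
    m = re.search(r"Code: (.*)", text)
    if m:
        tokens = m.group(1).split()
        marked = [i for i, t in enumerate(tokens) if t.startswith("<")]
        if marked:
            tail = [t.strip("<>") for t in tokens[marked[0]:]]
            info["fault_bytes"] = bytes(int(t, 16) for t in tail)

    return info


info = parse_oops(OOPS)
print(info["module"], info["symbol"], hex(info["offset"]), hex(info["fault_addr"]))
print(info["fault_bytes"][:8].hex(" "))
```

The first eight bytes at RIP are `48 8b 04 25 40 00 00 00`. Read as x86-64, that is a 64-bit load from an absolute 32-bit address, and the little-endian displacement `40 00 00 00` equals the faulting address 0x40 in CR2. This suggests the code read a field at offset 0x40 through a pointer that was NULL, but it does not identify which pointer. With matching debuginfo for ko2iblnd available, the offset could be mapped to a source line in gdb with `list *(kiblnd_passive_connect+0x1395)`.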
      

People

    • Assignee: WC Triage (wc-triage)
    • Reporter: Shuichi Ihara (sihara)