[LU-18415] PCC panic on CSI driver when adding/removing PCC backends

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.17.0
    • Affects Version/s: None
    • Labels: None
    • Severity: 3

    Description

      PCC panics on a client using the CSI driver when PCC backends are added or removed, as follows:

      [162411.826873] general protection fault, probably for non-canonical address 0x5a5a5a5a5a5a5a5a: 0000 [#1] SMP NOPTI
      [162411.837375] CPU: 33 PID: 529596 Comm: lctl Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-477.27.1.el8_8.x86_64 #1
      [162411.849439] Hardware name: Bull SAS H252-Z10-00/MZ12-HD0-00, BIOS M13a 11/16/2022
      [162411.857204] RIP: 0010:strlen+0x0/0x20
      [162411.861055] Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee e9 c3 00 42 00 0f 1f 00 <80> 3f 00 74 14 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 e9 a7
      [162411.880089] RSP: 0018:ffffc0d88bebfd98 EFLAGS: 00010206
      [162411.885504] RAX: ffff9eb80c7bd060 RBX: ffff9eab4bef36c0 RCX: ffff9eb80c7bd024
      [162411.892818] RDX: ffffffffc1adb368 RSI: ffff9eab4bef36c0 RDI: 5a5a5a5a5a5a5a5a
      [162411.900132] RBP: ffff9eb80c7bc008 R08: 0000000000000000 R09: ffff9eac3493d02d
      [162411.907452] R10: 0000000062343731 R11: 0000000000000004 R12: ffff9eb80c7bc000
      [162411.914776] R13: ffff9eab68e9d104 R14: ffff9eab68e9d100 R15: ffff9eab4bef36c0
      [162411.922098] FS:  00007f2c75ef3740(0000) GS:ffff9ee82e640000(0000) knlGS:0000000000000000
      [162411.930377] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [162411.936312] CR2: 00007fff233e0dd8 CR3: 00000001bebd2004 CR4: 0000000000770ee0
      [162411.943640] PKRU: 55555554
      [162411.946539] Call Trace:
      [162411.949185]  pcc_dataset_rule_init+0x27/0x1c0 [lustre]
      [162411.954539]  pcc_cmd_handle+0x74b/0x10c0 [lustre]
      [162411.959455]  ? _cond_resched+0x15/0x30
      [162411.963404]  ? ll_pcc_seq_write+0x76/0x3e0 [lustre]
      [162411.968489]  ? __kmalloc+0x113/0x250
      [162411.972263]  ll_pcc_seq_write+0x291/0x3e0 [lustre]
      [162411.977258]  full_proxy_write+0x53/0x80
      [162411.981289]  vfs_write+0xa5/0x1b0
      [162411.984799]  ksys_write+0x4f/0xb0
      [162411.988306]  do_syscall_64+0x5b/0x1b0
      [162411.992158]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
      [162411.997398] RIP: 0033:0x7f2c74e819e5
      [162412.001161] Code: 00 00 75 05 48 83 c4 58 c3 e8 27 4a ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 66 da 20 00 85 c0 75 12 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 53 c3 66 90 41 54 49 89 d4 55 48 89 f5 53 89
      [162412.020223] RSP: 002b:00007fff233e54a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [162412.027990] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2c74e819e5
      [162412.035321] RDX: 00000000000000ef RSI: 00007fff233e55b0 RDI: 0000000000000003
      [162412.042652] RBP: 00007fff233e54b0 R08: 0000000000000100 R09: 0000000000000040
      [162412.049985] R10: 000000000000000d R11: 0000000000000246 R12: 00007fff233e5500
      [162412.057315] R13: 00007fff233e55b0 R14: 0000000000000003 R15: 00007f2c74a46374
      [162412.064647] Modules linked in: mgc(OE) lustre(OE) mdc(OE) fid(OE) lov(OE) osc(OE) lmv(OE) fld(OE) ptlrpc(OE) ko2iblnd(OE) obdclass(OE) lnet(OE) libcfs(OE) veth vxlan ip6_udp_tunnel udp_tunnel ip6t_MASQUERADE nf_conntrack_netlink xt_nat ipt_MASQUERADE nft_limit ipt_REJECT nf_reject_ipv4 xt_limit xt_NFLOG nfnetlink_log xt_physdev xt_mark xt_multiport xt_addrtype xt_conntrack xt_comment nft_compat nft_chain_nat nft_counter nf_tables ip_set nfnetlink iptable_filter iptable_nat ip_tables nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c overlay br_netfilter bridge stp llc vfio_pci vfio_virqfd vfio_iommu_type1 vfio cuse fuse intel_rapl_msr intel_rapl_common ipmi_ssif wmi_bmof rdma_ucm(OE) amd64_edac_mod edac_mce_amd rdma_cm(OE) amd_energy iw_cm(OE) kvm_amd kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ib_ipoib(OE) pcspkr sp5100_tco acpi_ipmi ib_cm(OE) ccp ptdma ipmi_si i2c_piix4 k10temp wmi ipmi_devintf ipmi_msghandler ib_umad(OE) acpi_cpufreq knem(OE) ext4
      [162412.064694]  mbcache jbd2 sd_mod t10_pi sg mlx5_ib(OE) ib_uverbs(OE) ib
      

      The PCC cleanup code has some errors.
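
      One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the faulting address in RDI, 0x5a5a5a5a5a5a5a5a, is the byte 0x5a repeated eight times. That is what a pointer field looks like when it is read from memory that a debug or poisoning allocator has filled with 0x5a, so strlen() called from pcc_dataset_rule_init() may have been handed a pointer from a stale or uninitialized object. The user-space sketch below only illustrates that pattern. The names struct demo_dataset, demo_poison() and demo_ptr_is_poisoned() are hypothetical and are not taken from the Lustre source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed poison byte, chosen only to match the faulting address in the trace. */
#define DEMO_POISON_BYTE 0x5a

/* Hypothetical stand-in for an object that owns a string pointer. */
struct demo_dataset {
	char *rule_str;
	unsigned int flags;
};

/* Fill an object with the poison byte, as a poisoning allocator might. */
static void demo_poison(void *obj, size_t size)
{
	memset(obj, DEMO_POISON_BYTE, size);
}

/* Return true if every byte of the pointer value equals the poison byte. */
static bool demo_ptr_is_poisoned(const void *ptr)
{
	uintptr_t pattern;

	memset(&pattern, DEMO_POISON_BYTE, sizeof(pattern));
	return (uintptr_t)ptr == pattern;
}
```

      After demo_poison(&d, sizeof(d)) on a struct demo_dataset d, demo_ptr_is_poisoned(d.rule_str) returns true, and on a 64-bit build the pointer value is 0x5a5a5a5a5a5a5a5a, the same value seen in RDI. Passing such a pointer to strlen() dereferences a non-canonical address, which matches the general protection fault reported above.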

        Activity

          pjones Peter Jones added a comment -

          Merged for 2.17

          gerrit Gerrit Updater added a comment -

          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/56824/
          Subject: LU-18415 pcc: fix panic when add/remove PCC backends
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: 8595ae132032748bb382e70cb80bfb768aca3f23

          gerrit Gerrit Updater added a comment -

          "Qian Yingjin <qian@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/56824
          Subject: LU-18415 pcc: fix panic when add/remove PCC backends
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 7ac48bd46fcc7ef3144d4491cf2f87ed0cc60e7f

          People

            Assignee: qian_wc Qian Yingjin
            Reporter: qian_wc Qian Yingjin
            Votes: 0
            Watchers: 3
