[LU-17418] Lustre crashed immediately after load with debug kernel

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.16.0
    • Environment: RHEL 8.5 debug kernel
    • Severity: 3

    Description

      The patch "LU-9859 libcfs: refactor libcfs initialization" removed the libcfs debug initialization from the libcfs module init itself.
      This causes a panic if debug settings are applied after the module is loaded.

      crash7> bt
      PID: 11558  TASK: ffff888113890000  CPU: 1   COMMAND: "lt-lctl"
       #0 [ffff88810ab5f848] __show_regs at ffffffff8186a0c5
       #1 [ffff88810ab5f8d0] general_protection at ffffffff83a0115e
          [exception RIP: cfs_trace_lock_tcd+37]
          RIP: ffffffffc1495a95  RSP: ffff88810ab5f988  RFLAGS: 00010296
          RAX: dffffc0000000000  RBX: 00000000000000c0  RCX: ffffffffc14d3580
          RDX: 0000000000000029  RSI: 0000000000000000  RDI: 000000000000014c
          RBP: ffff88810ab5fbb0   R8: 0000000000000001   R9: 00000000130e0580
          R10: ffff88810ab5fbc8  R11: ffffed102270b633  R12: 00000000000000c0
          R13: ffffffffc14902a0  R14: 0000000000000001  R15: 0000000000000000
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #2 [ffff88810ab5f998] libcfs_debug_msg at ffffffffc1496bec [libcfs]
       #3 [ffff88810ab5fbc0] cfs_str2mask at ffffffffc149e1b5 [libcfs]
       #4 [ffff88810ab5fc30] libcfs_debug_str2mask at ffffffffc149117b [libcfs]
       #5 [ffff88810ab5fd20] proc_dobitmasks at ffffffffc1492c90 [libcfs]
       #6 [ffff88810ab5fd70] lnet_debugfs_write at ffffffffc1491d44 [libcfs]
       #7 [ffff88810ab5fe00] full_proxy_write at ffffffff8242f903
       #8 [ffff88810ab5fe48] vfs_write at ffffffff8220ca57
       #9 [ffff88810ab5fe88] ksys_write at ffffffff8220d218
      #10 [ffff88810ab5ff28] do_syscall_64 at ffffffff81809865
      #11 [ffff88810ab5ff50] entry_SYSCALL_64_after_hwframe at ffffffff83a000b2
          RIP: 00007fed5c5e5915  RSP: 00007ffdaaa3f9c8  RFLAGS: 00000246
          RAX: ffffffffffffffda  RBX: 00007ffdaaa43422  RCX: 00007fed5c5e5915
          RDX: 000000000000001c  RSI: 00007ffdaaa43422  RDI: 0000000000000003
          RBP: 0000000000000003   R8: 0000000000000800   R9: 0000000000000003
          R10: 000000000000000b  R11: 0000000000000246  R12: 00007ffdaaa41bb8
          R13: 00000000017252a0  R14: 00000000017252e0  R15: 0000000000000000
          ORIG_RAX: 0000000000000001  CS: 0033  SS: 002b
      

      Let's restore the old behavior.
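The ordering problem described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the Lustre code or the merged patch: `trace_buf`, `debug_init`, `debug_msg`, `set_debug_mask`, and the `module_*_model` functions are hypothetical names that loosely mirror the call chain in the backtrace (a settings write that logs a message, which in turn touches a per-CPU trace buffer). The NULL check in `debug_msg` is added purely so the model reports the missing initialization instead of faulting.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_TRACE_BUFS 4   /* hypothetical: one buffer per modeled CPU */

/* Hypothetical stand-in for a per-CPU trace buffer (not the Lustre struct). */
struct trace_buf {
    int locked;   /* 1 while a message is being recorded */
    int used;     /* number of messages recorded so far */
};

/* All slots stay NULL until debug_init() has run. */
static struct trace_buf *trace_bufs[NR_TRACE_BUFS];
static unsigned long debug_mask;

/* Free every buffer and reset the slots to NULL. */
void debug_cleanup(void)
{
    for (int i = 0; i < NR_TRACE_BUFS; i++) {
        free(trace_bufs[i]);
        trace_bufs[i] = NULL;
    }
}

/* Allocate the trace buffers. Returns 0 on success, -1 on failure. */
int debug_init(void)
{
    for (int i = 0; i < NR_TRACE_BUFS; i++) {
        trace_bufs[i] = calloc(1, sizeof(*trace_bufs[i]));
        if (trace_bufs[i] == NULL) {
            debug_cleanup();
            return -1;
        }
    }
    return 0;
}

/*
 * Record one message in the buffer for `cpu`. The NULL check is an
 * assumption added for this model so that a missing debug_init() is
 * reported as -1 instead of faulting on the tb->locked access.
 */
int debug_msg(int cpu)
{
    assert(cpu >= 0 && cpu < NR_TRACE_BUFS);
    struct trace_buf *tb = trace_bufs[cpu];

    if (tb == NULL)
        return -1;
    tb->locked = 1;
    tb->used++;
    tb->locked = 0;
    return 0;
}

/* Settings handler: parsing logs a message, so it needs debug_init(). */
int set_debug_mask(const char *str)
{
    if (debug_msg(0) != 0)
        return -1;
    debug_mask = strtoul(str, NULL, 0);
    return 0;
}

unsigned long get_debug_mask(void) { return debug_mask; }

/* Module init in the old ordering: debug setup is part of module load. */
int module_init_model(void) { return debug_init(); }
void module_exit_model(void) { debug_cleanup(); }
```

In this model, calling `set_debug_mask("0x10")` before `module_init_model()` fails with -1, while the same call after `module_init_model()` succeeds. Doing the debug setup inside module init means any later settings write finds the buffers already allocated, which is the property the description asks to restore.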

          Activity

            gerrit Gerrit Updater added a comment -

            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/53825/
            Subject: LU-17418 libcfs: support debug setup for libcfs modules
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: d12b18195cedd629e94b5bafecd7c9509484c89f

            gerrit Gerrit Updater added a comment -

            "jsimmons <jsimmons@infradead.org>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53825
            Subject: LU-17418 tests: test lctl set_param when only libcfs.ko loaded
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: dc82487403c6de947237764ae40a16f10f9d064e

            simmonsja James A Simmons added a comment -

            Patch https://review.whamcloud.com/c/fs/lustre-release/+/41664 does resolve this issue.

            gerrit Gerrit Updater added a comment -

            "Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53821
            Subject: LU-17418 libcfs: fix libcfs init.
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 1174e98b1d7ac4047ab3e45bb4b3f1af9414ac4d

            gerrit Gerrit Updater added a comment -

            "Alexey Lyashkov <alexey.lyashkov@hpe.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/53650
            Subject: LU-17418 libcfs: fix libcfs init.
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 49837ac0c06e559aa81917cc92add6382aa3b78d

            pjones Peter Jones added a comment -

            Sorry I am not following - can you please post a link to the patch in Gerrit? Nothing has been auto-commented in Jira and I cannot find anything searching on the LU-17418 reference...

            shadow Alexey Lyashkov added a comment -

            Peter,

            I have a patch, and it was posted today.

            From: "Mr. NeilBrown" <neilb@suse.de>
            Date: Wed, 8 Nov 2023 21:15:09 -0500
            Subject: [PATCH] LU-9859 libcfs: refactor libcfs initialization.
            Linux-commit: 64bf0b1a079d61e9e059b9dc7a58e064c7d994ae

            Change-Id: I6b5ecdba0defc6e033f78d8fc2b9be9e26c7f720
            Signed-off-by: Mr. NeilBrown <neilb@suse.de>
            Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
            Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/52700
            Tested-by: jenkins <devops@whamcloud.com>
            Tested-by: Maloo <maloo@whamcloud.com>
            Reviewed-by: Timothy Day <timday@amazon.com>
            Reviewed-by: Oleg Drokin <green@whamcloud.com>

            pjones Peter Jones added a comment -

            James

            I think that this was your change being referenced. Would this same crash still happen on a more current kernel? RHEL 9.3 is the primary kernel targeted for the 2.16 release...

            Peter

            People

              simmonsja James A Simmons
              shadow Alexey Lyashkov
              Votes: 0
              Watchers: 4
