Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.17.0, Lustre 2.15.6
    • Lustre 2.16.0, Lustre 2.15.6

    Description

      Red Hat Enterprise Linux 9.5 Beta release kernel version: 5.14.0-503.2.1.el9_5

      2024-09-06 Lucas Zampieri <lzampier@redhat.com> [5.14.0-503.2.1.el9_5]
      
          - sctp: fix association labeling in the duplicate COOKIE-ECHO case (Ondrej Mosnacek) [RHEL-48647]
          - s390/ap: Refine AP bus bindings complete processing (Cédric Le Goater) [RHEL-50373]
          - ice: Add netif_device_attach/detach into PF reset flow (Michal Schmidt) [RHEL-56084]
      2024-09-03 Lucas Zampieri <lzampier@redhat.com> [5.14.0-503.1.1.el9_5]
      
          - usb: xhci: prevent potential failure in handle_tx_event() for Transfer events without TRB (Desnes Nunes) [RHEL-52378] {CVE-2024-42226}
          - redhat: set defaults for RHEL 9.5 (Lucas Zampieri)
      2024-08-22 Lucas Zampieri <lzampier@redhat.com> [5.14.0-503.el9]
      
          - Revert "Merge: scsi: fnic: driver update" (John Meneghini) [RHEL-36420]
          - dev/parport: fix the array out-of-bounds risk (CKI Backport Bot) [RHEL-54990] {CVE-2024-42301}
          - leds: trigger: Unregister sysfs attributes before calling deactivate() (CKI Backport Bot) [RHEL-54835] {CVE-2024-43830}
          - null_blk: fix validation of block size (Ming Lei) [RHEL-51322] {CVE-2024-41077}
          - s390/fpu: Re-add exception handling in load_fpu_state() (Aristeu Rozanski) [RHEL-39346]
          - redhat: spec: add cachestat to kselftest package (Eric Chanudet) [RHEL-50302]
          - selftests: cachestat: Fix build warnings on ppc64 (Eric Chanudet) [RHEL-50302]
          - selftests/cachestat: Fix print_cachestat format (Eric Chanudet) [RHEL-50302]
          - selftests: cachestat: use proper syscall number macro (Eric Chanudet) [RHEL-50302]
          - selftests: cachestat: properly link in librt (Eric Chanudet) [RHEL-50302]
          - selftests: cachestat: catch failing fsync test on tmpfs (Eric Chanudet) [RHEL-50302]
          - selftests: cachestat: test for cachestat availability (Eric Chanudet) [RHEL-50302]
          - selftests: add selftests for cachestat (Eric Chanudet) [RHEL-50302]
      <~snip~>
      

          Activity

            [LU-18387] RHEL 9.5 support
            eaujames Etienne Aujames made changes -
            Labels New: LTS15
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18728 [ LU-18728 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18652 [ LU-18652 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18624 [ LU-18624 ]

            akaslompolo Simppa Akaslompolo added a comment -
            After migrating from RHEL 9.4 to RHEL 9.5, we have seen a failure rate of 1-2% of the compute nodes in our cluster per hour. I cannot pin it on Lustre, as I have not yet managed to get even a kernel panic out of the nodes, which freeze instantly. I'm reporting it here in the hope that someone else has had a similar experience. The behaviour reminds me of the freezes I saw without the patch for LU-17696, but it occurs much more rarely.

            I will next concentrate on getting more information on exactly how the nodes die.

            gerrit Gerrit Updater added a comment -
            "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/57870/
            Subject: LU-18387 kernel: add missing patch for RHEL 9.5
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: 7fb09a184a22e020da57e6277b3d52c29228155e
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18646 [ LU-18646 ]
            adilger Andreas Dilger made changes -
            Link New: This issue is related to LU-18676 [ LU-18676 ]

            People

              yujian Jian Yu
              yujian Jian Yu
              Votes: 0
              Watchers: 6
