[LU-15126] RHEL 8.5 support Created: 19/Oct/21  Updated: 14/Feb/22  Resolved: 30/Nov/21

Status: Resolved
Project: Lustre
Component/s: None
Affects Version/s: None
Fix Version/s: Lustre 2.12.8, Lustre 2.15.0

Type: Improvement Priority: Minor
Reporter: Jian Yu Assignee: Jian Yu
Resolution: Fixed Votes: 0
Labels: llnl

Issue Links:
Related
is related to LU-15184 LBUG: ASSERTION( buf_size == strlen(s... Resolved
is related to LU-15212 Update ZFS version to 2.0.1 Resolved
is related to LU-15222 Update ZFS version to 2.0.6 Resolved
is related to LU-15409 kernel update [RHEL8.5 4.18.0-348.7.1... Resolved

 Description   

Announcing the Beta release of Red Hat Enterprise Linux 8.5:
https://access.redhat.com/announcements/6360652

The Beta release of RHEL 8.5 offers:

  • New container management tools: Continuously increase collaboration between the development and operations teams to closely work together and power the adoption of containers with standardized, and secure container development tooling and base images.
  • Extended proactive management: Extended capabilities in the Red Hat Insights services - vulnerability, compliance, and subscriptions to enable organizations to more efficiently, and effectively manage their RHEL estates across the open hybrid cloud, including deployments in the public cloud.
  • More powerful data visualization: Features and new enhancements that can create simpler views of complex system performance data at scale powered by the RHEL web console, Grafana dashboard, and Red Hat Performance Co-Pilot.
  • Support for additional development tools: New RHEL support for OpenJDK (Java) and .NET 6 to continuously bring a solid foundation for developers to modernize and build next generation workloads and applications.


 Comments   
Comment by Gerrit Updater [ 19/Oct/21 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45285
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: 2e2ab9deb0cf130bdcb8cb07fdef5d43f47d240e

Comment by Jian Yu [ 19/Oct/21 ]

The patches in ldiskfs-4.18-rhel8.4.series apply to RHEL 8.5 kernel 4.18.0-339.el8 without conflicts. I'm working on the server support patch.

Comment by Gerrit Updater [ 20/Oct/21 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45306
Subject: LU-15126 kernel: RHEL 8.5 server support
Project: fs/lustre-release
Branch: master
Current Patch Set: 1
Commit: e0e9020d8deed0a593589fbc856909fe4a67f830

Comment by Jian Yu [ 20/Oct/21 ]

Both client and server builds passed on my local VM node. After installing the RPMs, I ran into the following kernel panic while running runtests:

Lustre: DEBUG MARKER: -----============= acceptance-small: runtests ============----- Wed Oct 20 10:15:59 PDT 2021
Lustre: DEBUG MARKER: vm85: executing check_config_client /mnt/lustre
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: Modifying parameter general.*.*.lbug_on_grant_miscount in log params
Lustre: DEBUG MARKER: == runtests test 1: All Runtests ========================= 10:16:00 (1634750160)
Lustre: DEBUG MARKER: touching /mnt/lustre at Wed Oct 20 10:16:05 PDT 2021 (@1634750165)
Lustre: DEBUG MARKER: create an empty file /mnt/lustre/hosts.7841
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.7841
Lustre: DEBUG MARKER: comparing /etc/hosts and /mnt/lustre/hosts.7841
Lustre: DEBUG MARKER: renaming /mnt/lustre/hosts.7841 to /mnt/lustre/hosts.7841.ren
Lustre: DEBUG MARKER: copying /etc/hosts to /mnt/lustre/hosts.7841 again
LustreError: 8831:0:(pack_generic.c:805:lustre_msg_string()) can't unpack non-NULL terminated string in msg 000000008f1ca60d buffer[8] len 0
LustreError: 8831:0:(layout.c:2165:__req_capsule_get()) @@@ Wrong buffer for field 'file_secctx_name' (8 of 12) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (client)  req@00000000fd326d4b x1714159751286720/t0(0) o101->lustre-MDT0000-mdc-ffff9ab7ea66e000@0@lo:12/10 lens 656/0 e 0 to 0 dl 0 ref 1 fl New:PQU/0/ffffffff rc 0/-1 job:''
LustreError: 8831:0:(mdc_lib.c:137:mdc_file_secctx_pack()) ASSERTION( buf_size == strlen(secctx_name) + 1 ) failed:
LustreError: 8831:0:(mdc_lib.c:137:mdc_file_secctx_pack()) LBUG
Pid: 8831, comm: cp 4.18.0-339.el8_lustre.x86_64 #1 SMP Tue Oct 19 12:47:57 PDT 2021
Call Trace TBD:
[<0>] libcfs_call_trace+0x6f/0x90 [libcfs]
[<0>] lbug_with_loc+0x43/0x80 [libcfs]
[<0>] mdc_file_secctx_pack.part.4+0xcb/0x100 [mdc]
[<0>] mdc_open_pack+0x22d/0x2f0 [mdc]
[<0>] mdc_intent_open_pack+0x2b6/0x900 [mdc]
[<0>] mdc_enqueue_base+0x503/0x1420 [mdc]
[<0>] mdc_intent_lock+0x219/0x560 [mdc]
[<0>] lmv_intent_open+0x29e/0xb70 [lmv]
[<0>] lmv_intent_lock+0x19c/0x390 [lmv]
[<0>] ll_lookup_it+0x6fc/0x1c90 [lustre]
[<0>] ll_atomic_open+0x256/0x1960 [lustre]
[<0>] path_openat+0xeff/0x14f0
[<0>] do_filp_open+0x93/0x100
[<0>] do_sys_open+0x184/0x220
[<0>] do_syscall_64+0x5b/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
Kernel panic - not syncing: LBUG

Comment by Jian Yu [ 20/Oct/21 ]

The sanity test hit the same issue:

Lustre: DEBUG MARKER: -----============= acceptance-small: sanity ============----- Wed Oct 20 11:00:45 PDT 2021
Lustre: DEBUG MARKER: excepting tests: 42a 42b 42c 407 312 817 411
Lustre: DEBUG MARKER: vm85: executing check_config_client /mnt/lustre
Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: Modifying parameter general.*.*.lbug_on_grant_miscount in log params
LustreError: 12549:0:(pack_generic.c:805:lustre_msg_string()) can't unpack non-NULL terminated string in msg 00000000f47f1c92 buffer[8] len 0
LustreError: 12549:0:(layout.c:2165:__req_capsule_get()) @@@ Wrong buffer for field 'file_secctx_name' (8 of 12) in format 'LDLM_INTENT_OPEN', 0 vs. 0 (client)  req@00000000c6c4e1e0 x1714162568806976/t0(0) o101->lustre-MDT0000-mdc-ffff9ea0c33ae000@0@lo:12/10 lens 648/0 e 0 to 0 dl 0 ref 1 fl New:PQU/0/ffffffff rc 0/-1 job:'' 
LustreError: 12549:0:(mdc_lib.c:137:mdc_file_secctx_pack()) ASSERTION( buf_size == strlen(secctx_name) + 1 ) failed:  
LustreError: 12549:0:(mdc_lib.c:137:mdc_file_secctx_pack()) LBUG

Comment by Jian Yu [ 27/Oct/21 ]

LustreError: 8831:0:(pack_generic.c:805:lustre_msg_string()) can't unpack non-NULL terminated string in msg 000000008f1ca60d buffer[8] len 0

In lustre_msg_string(), the value of blen returned by lustre_msg_buflen_v2(m, index) was 0, which caused the above error.
For some reason, the index passed to lustre_msg_buflen_v2() was equal to the number of buffers in lm_buflens[], so the function returned 0:

static inline __u32 lustre_msg_buflen_v2(struct lustre_msg_v2 *m, __u32 n)
{
        if (n >= m->lm_bufcount)
                return 0;       
        
        return m->lm_buflens[n];
}
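
The bounds check above can be exercised in isolation. What follows is a minimal user-space sketch, not the Lustre source: struct demo_msg, demo_buflen() and DEMO_MAX_BUFS are hypothetical stand-ins for struct lustre_msg_v2 and lustre_msg_buflen_v2(), and uint32_t replaces __u32. It illustrates that an index equal to the buffer count yields length 0, which a caller cannot tell apart from a buffer that really is empty.

```c
#include <stdint.h>

#define DEMO_MAX_BUFS 16        /* hypothetical fixed capacity for this sketch */

/* Hypothetical stand-in for struct lustre_msg_v2: only the two fields used here. */
struct demo_msg {
        uint32_t lm_bufcount;                   /* number of valid buffers */
        uint32_t lm_buflens[DEMO_MAX_BUFS];     /* length of each buffer */
};

/* Same bounds check as lustre_msg_buflen_v2(): an out-of-range index gives 0. */
static inline uint32_t demo_buflen(const struct demo_msg *m, uint32_t n)
{
        if (n >= m->lm_bufcount)
                return 0;

        return m->lm_buflens[n];
}
```

With lm_bufcount set to 8 in this sketch, demo_buflen(&m, 7) returns the stored length of the last buffer, while demo_buflen(&m, 8) returns 0, matching the "buffer[8] len 0" pattern in the log line quoted above.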

Comment by Jian Yu [ 01/Nov/21 ]

It turned out that the above error occurred with SELinux disabled, even though op_data->op_file_secctx_name was set to security.selinux in mdc_open_pack(). I created ticket LU-15184 for this.
After enabling SELinux, runtests passed:
https://testing.whamcloud.com/test_sessions/2f61f9db-5ec6-402c-866f-b986d7323178
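
The failed assertion in the SELinux-disabled case comes down to a size comparison, which can be sketched in user space. The helper demo_secctx_size_ok() below is hypothetical and only mirrors the asserted condition, buf_size == strlen(secctx_name) + 1. It assumes the buffer size seen by mdc_file_secctx_pack() was 0, as the "len 0" log line suggests, while the name was security.selinux.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the condition asserted in
 * mdc_file_secctx_pack(): the request buffer must be exactly large
 * enough for the name plus its terminating NUL byte.
 */
static bool demo_secctx_size_ok(size_t buf_size, const char *secctx_name)
{
        return buf_size == strlen(secctx_name) + 1;
}
```

For the name security.selinux (16 characters), the check holds only for a 17-byte buffer. Under the assumption above, a buffer size of 0 fails the check, which is consistent with the LBUG seen in the logs.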

Comment by Jian Yu [ 10/Nov/21 ]

The new RHEL 8.5 kernel version is 4.18.0-348.el8:
https://access.redhat.com/errata/RHSA-2021:4356

RHEL 8.5 is GA:
https://access.redhat.com/announcements/6488381

Comment by Sebastien Buisson [ 10/Nov/21 ]

Patch https://review.whamcloud.com/45501 has been pushed to address problem reported under LU-15184.

Comment by Gerrit Updater [ 10/Nov/21 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45525
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: b2_14
Current Patch Set: 1
Commit: 1c8b67036f98b0a783cab24629a848c83d0bdef3

Comment by Gerrit Updater [ 10/Nov/21 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45526
Subject: LU-15126 kernel: RHEL 8.5 server support
Project: fs/lustre-release
Branch: b2_14
Current Patch Set: 1
Commit: 0601cec02a2815ce112fb46d70d73d2a8682c8b4

Comment by Gerrit Updater [ 10/Nov/21 ]

"Jian Yu <yujian@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/45528
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: b2_12
Current Patch Set: 1
Commit: 30bf2165dfe39d26e051300ae3d38a01ff7f56b4

Comment by Jian Yu [ 15/Nov/21 ]

Kernel version 4.18.0-348.2.1.el8_5 security fixes:

  • kernel: Insufficient validation of user-supplied sizes for the MSG_CRYPTO message type (CVE-2021-43267)
  • kernel: timer tree corruption leads to missing wakeup and system freeze (CVE-2021-20317)

https://access.redhat.com/errata/RHSA-2021:4647?sc_cid=701600000006NHXAA2

Comment by Gerrit Updater [ 17/Nov/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45528/
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: b2_12
Current Patch Set:
Commit: 2151156020d4ea3995d9b8e118ebb62fc7fc339e

Comment by Gerrit Updater [ 30/Nov/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45285/
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 951f31789f76295d182f56bef1fa8d92f69e7e2a

Comment by Gerrit Updater [ 30/Nov/21 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45306/
Subject: LU-15126 kernel: RHEL 8.5 server support
Project: fs/lustre-release
Branch: master
Current Patch Set:
Commit: 605ac53b1a621afa5d94b9854a7a1783d7e24afe

Comment by Peter Jones [ 30/Nov/21 ]

Landed for 2.15

Comment by Jian Yu [ 12/Jan/22 ]

Here is the patch series for RHEL 8.5 client and server support on the Lustre b2_14 branch: https://review.whamcloud.com/46055

Comment by Gerrit Updater [ 05/Feb/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45525/
Subject: LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Project: fs/lustre-release
Branch: b2_14
Current Patch Set:
Commit: 2b0999fa71559fa7b131ec7b98dd2474606eb838

Comment by Gerrit Updater [ 05/Feb/22 ]

"Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/45526/
Subject: LU-15126 kernel: RHEL 8.5 server support
Project: fs/lustre-release
Branch: b2_14
Current Patch Set:
Commit: a81d4b984b1a4de7a309370ce3e6f461704a5f65
