[LU-17201] LNetError in o2iblnd.c with qib HCA under EL9.2 Created: 16/Oct/23 Updated: 21/Oct/23 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.15.3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Minor |
| Reporter: | Nathan Crawford | Assignee: | WC Triage |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Alma Linux 9.2 |
||
| Severity: | 3 |
| Rank (Obsolete): | 9223372036854775807 |
| Description |
|
LNet loads the tcp interface fine, but o2ib fails with an LNetError from o2iblnd.c in the kernel log. We are trying (perhaps over-hopefully) to get the Lustre client to work in EL9 on old QLogic/Intel TrueScale InfiniBand hardware. Red Hat removed the qib module back in EL8, although it remains in the mainline kernels from kernel.org. The ELRepo repository maintains a few of these RH-deprecated kernel modules compiled against the RHEL kernel. As of kmod-ib_qib-1.11-6.el9_2.elrepo, this module actually works. The closest bug report I could find is LU-10549, which suggests a mismatch between the actual and expected data fields reported by the module. I suspect no one has actually tried the EL9 kernel ib_qib with Lustre, considering it only started working last week. In the meantime, I'll try swapping the EL9.2 kernel + kmod for the ELRepo-maintained kernel-lt, which includes the standard kernel.org qib module.
|
| Comments |
| Comment by Fredrik Nyström [ 18/Oct/23 ] |
|
Quick question: did you also install infinipath-psm (provides /etc/udev/rules.d/60-ipath.rules)? We did some tests late last year with Rocky 9 + ib_qib, using the patch that is now in ELRepo. We had to install infinipath-psm from CentOS 7 (we failed to rebuild it for el9). We did not try Lustre o2ib. |
| Comment by Nathan Crawford [ 21/Oct/23 ] |
|
We haven't tried to install the psm libs, but they weren't needed for the Lustre client on Rocky 8. The kernel-lt workaround I proposed isn't going to work, as the ELRepo kernel-lt for el9 is already too new (6.1.58). We may need to dig a bit more. |