Details
-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
lustre-2.12.6_9.llnl client
kernel-4.18.0-305.0.0.1toss.t4.x86_64
RHEL84
-
3
Description
lnet_selftest fails between two nodes over Omnipath
dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103
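When triaging a failure like this, it can help to split the debug-log line into its fields and decode the return code. The following is a minimal sketch, not part of the original report. The field layout is inferred from the sample line above, and the names `DK_LINE`, `LINUX_ERRNO`, and `parse_dk_line` are hypothetical helpers introduced only for illustration. On Linux, errno 103 is ECONNABORTED.

```python
import re

# Assumed layout of one Lustre debug-log line, inferred from the sample in this
# ticket. The optional first field is the name of the dump file.
DK_LINE = re.compile(
    r"^(?:(?P<source>[^:]+):)?"          # optional dump-file prefix
    r"(?P<subsys>[0-9a-f]{8}):"          # subsystem mask (hex)
    r"(?P<mask>[0-9a-f]{8}):"            # debug mask (hex)
    r"(?P<cpu>[^:]+):"                   # CPU field, e.g. 43.0
    r"(?P<time>\d+\.\d+):"               # seconds.microseconds since the epoch
    r"(?P<stack>\d+):"
    r"(?P<pid>\d+):"
    r"(?P<extpid>\d+):"
    r"\((?P<file>[^:]+):(?P<lineno>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<msg>.*)$"
)

# Linux errno values only; other platforms number these differently.
LINUX_ERRNO = {5: "EIO", 103: "ECONNABORTED", 110: "ETIMEDOUT"}


def parse_dk_line(line):
    """Return a dict of fields for one debug-log line, or None if it does not match."""
    m = DK_LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["pid"] = int(rec["pid"])
    rec["lineno"] = int(rec["lineno"])

    # A trailing ": <number>" in the message is treated as the return code.
    rc = re.search(r":\s*(-?\d+)$", rec["msg"])
    rec["rc"] = int(rc.group(1)) if rc else None
    if rec["rc"] is not None and rec["rc"] < 0:
        rec["errno_name"] = LINUX_ERRNO.get(-rec["rc"])
    else:
        rec["errno_name"] = None

    # Peer identifier as printed after "from" in this message.
    peer = re.search(r"from (\S+?):", rec["msg"])
    rec["peer"] = peer.group(1) if peer else None
    return rec


sample = (
    "dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:"
    "(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from "
    "12345-192.168.128.126@o2ib18: -103"
)
record = parse_dk_line(sample)
print(record["func"], record["peer"], record["rc"], record["errno_name"])
# prints: brw_bulk_ready 12345-192.168.128.126@o2ib18 -103 ECONNABORTED
```

The errno table is deliberately a small hard-coded Linux subset rather than Python's `errno` module, so the decoding does not change if the log is analyzed on a different operating system.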
Bulk transfers work over InfiniBand (although in that test one of the nodes was running RHEL 7.9 with an earlier Lustre patch stack). Bulk transfers also work over TCP using ksocklnd.
lctl pings work fine between the same two nodes.
mpibench and other MPI applications also work fine over Omnipath between two nodes.
See https://github.com/LLNL/lustre/releases/tag/2.12.6_9.llnl for the patch stack
Activity
Link | Original: This issue is related to JFC-21 [ JFC-21 ]
Fix Version/s | New: Lustre 2.12.8 [ 15093 ]
Link | New: This issue is duplicated by NEC-83 [ NEC-83 ]
Fix Version/s | New: Lustre 2.15.0 [ 14791 ]
Resolution | New: Fixed [ 1 ]
Status | Original: Open [ 1 ] | New: Resolved [ 5 ]
"Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/44296/
Subject: LU-14733 o2iblnd: Avoid double posting invalidate
Project: fs/lustre-release
Branch: b2_14
Current Patch Set:
Commit: 29da7cba3e7b3461d895010c7f7284b9649aba52