Details
-
Bug
-
Resolution: Fixed
-
Critical
-
None
-
lustre-2.12.6_9.llnl client
kernel-4.18.0-305.0.0.1toss.t4.x86_64
RHEL84
-
3
Description
lnet_selftest fails between two nodes over Omni-Path:
dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103
Bulk transfers work over InfiniBand (although in that test one of the nodes was running RHEL 7.9 with an earlier Lustre patch stack). Bulk transfers also work over TCP using ksocklnd.
lctl ping works between the same two nodes.
mpibench and other MPI applications also work over Omni-Path between the same two nodes.
See https://github.com/LLNL/lustre/releases/tag/2.12.6_9.llnl for the patch stack.
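For anyone triaging similar logs, here is a minimal sketch that pulls the source location, peer, and return code out of a line like the brw_bulk_ready() message above and maps the return code to a Linux errno name. The helper name `parse_brw_failure`, the regular expression, and the small errno table are illustrative assumptions and not part of Lustre. The table is hard-coded with Linux values so the result does not depend on the platform running the script.

```python
import re

# Assumption: a small hand-written table of Linux errno values
# (asm-generic numbering); only a few network-related codes are listed.
LINUX_ERRNO = {
    5: "EIO",
    103: "ECONNABORTED",
    104: "ECONNRESET",
    107: "ENOTCONN",
    110: "ETIMEDOUT",
    113: "EHOSTUNREACH",
}

# Matches "(file:line:func()) message from PID-NID: rc" at the end of a line.
LINE_RE = re.compile(
    r"\((?P<file>[^:()]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+"
    r"(?P<msg>.*?)\s+from\s+(?P<peer>\S+):\s+(?P<rc>-?\d+)\s*$"
)


def parse_brw_failure(log_line):
    """Return a dict describing the failure, or None if the line does not match."""
    m = LINE_RE.search(log_line)
    if m is None:
        return None
    rc = int(m.group("rc"))
    # The peer is printed as "<pid>-<nid>", e.g. 12345-192.168.128.126@o2ib18.
    pid, nid = m.group("peer").split("-", 1)
    return {
        "file": m.group("file"),
        "line": int(m.group("line")),
        "function": m.group("func"),
        "message": m.group("msg"),
        "pid": int(pid),
        "nid": nid,
        "rc": rc,
        # Kernel code returns negative errno values, so look up the magnitude.
        "errno_name": LINUX_ERRNO.get(abs(rc), "unknown"),
    }


SAMPLE = (
    "dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:"
    "(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from "
    "12345-192.168.128.126@o2ib18: -103"
)

if __name__ == "__main__":
    print(parse_brw_failure(SAMPLE))
```

On Linux, errno 103 is ECONNABORTED, so the -103 in the log indicates the bulk READ was aborted at the connection level and did not fail on a data check.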
Activity
Link | Original: This issue is related to JFC-21 [ JFC-21 ] |
Fix Version/s | New: Lustre 2.12.8 [ 15093 ] |
Link | New: This issue is duplicated by NEC-83 [ NEC-83 ] |
Fix Version/s | New: Lustre 2.15.0 [ 14791 ] |
Resolution | New: Fixed [ 1 ] |
Status | Original: Open [ 1 ] | New: Resolved [ 5 ] |
Labels | Original: LTS12 llnl topllnl | New: LTS12 llnl |
Labels | Original: llnl topllnl | New: LTS12 llnl topllnl |
Attachment | New: 02-post_state.patch [ 39551 ] |
Attachment | New: 01-move_null.patch [ 39552 ] |
Attachment | Original: 02-post_state.patch [ 39550 ] |