Lustre / LU-14733

brw_bulk_ready() BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103


    • Type: Bug
    • Resolution: Fixed
    • Priority: Critical
    • Lustre 2.12.8, Lustre 2.15.0
    • None
    • lustre-2.12.6_9.llnl client
      kernel-4.18.0-305.0.0.1toss.t4.x86_64
      RHEL84
    • 3

      lnet_selftest fails between two nodes over Omnipath

      dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103 
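For triage, the debug-log line above can be split into its colon-separated fields and the trailing return code decoded. The following is a minimal sketch, not part of the original report: the field names (`subsystem`, `mask`, `stack`, `ext_pid`, and so on) are illustrative labels assumed from the usual Lustre debug-log layout, the leading `dk.opal63.llnl.gov.7:` is assumed to be a file-name prefix added when the line was extracted, and `parse_dk_line` is a hypothetical helper. The errno name is hard-coded from the Linux value, where 103 is ECONNABORTED.

```python
import re

# Linux errno value seen in this ticket (asm-generic/errno.h); hard-coded so
# the result does not depend on the platform running this script.
LINUX_ERRNO = {103: "ECONNABORTED"}

# Field names are illustrative labels (assumptions), not taken from the ticket.
DK_LINE = re.compile(
    r"(?:(?P<source>[^:\s]+):)?"                      # optional file-name prefix
    r"(?P<subsystem>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):"
    r"(?P<cpu>\d+)\.(?P<ctx>\w+):"
    r"(?P<timestamp>\d+\.\d+):"
    r"(?P<stack>\d+):(?P<pid>\d+):(?P<ext_pid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*"
    r"(?P<message>.*)"
)


def parse_dk_line(text):
    """Split one debug-log line into named fields; return None if it does not match."""
    m = DK_LINE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])
    rec["pid"] = int(rec["pid"])
    # A trailing ": -NNN" in the message is treated as a negative errno.
    tail = re.search(r":\s*(-?\d+)\s*$", rec["message"])
    rec["rc"] = int(tail.group(1)) if tail else None
    rec["rc_name"] = LINUX_ERRNO.get(-rec["rc"]) if rec["rc"] is not None else None
    # The peer is printed as "<LNet PID>-<NID>" after the word "from".
    peer = re.search(r"from (\S+):", rec["message"])
    rec["peer"] = peer.group(1) if peer else None
    return rec


sample = (
    "dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:"
    "(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from "
    "12345-192.168.128.126@o2ib18: -103 "
)
rec = parse_dk_line(sample)
print(rec["func"], rec["peer"], rec["rc"], rec["rc_name"])
```

Keeping the errno table explicit rather than calling the standard `errno` module is a deliberate choice here: errno numbering differs between operating systems, and the log was produced on Linux.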

      Bulk transfers work over InfiniBand (although in that test one of the nodes ran RHEL 7.9 and an earlier Lustre patch stack). Bulk transfers also work over TCP using ksocklnd.

      lctl pings work fine between the same two nodes.

      mpibench and other MPI applications also work fine over Omnipath between two nodes.

      See https://github.com/LLNL/lustre/releases/tag/2.12.6_9.llnl for the patch stack

        1. trace2.txt
          51 kB
          Mike Marciniszyn
        2. trace1.txt
          36 kB
          Mike Marciniszyn
        3. post_state.patch
          3 kB
          Mike Marciniszyn
        4. move_null.patch
          0.8 kB
          Mike Marciniszyn
        5. linux-kernel-test.patch
          2 kB
          Mike Marciniszyn
        6. kprobes-off.sh
          2 kB
          Mike Marciniszyn
        7. kprobes.sh
          5 kB
          Mike Marciniszyn
        8. dmesg.opal63.txt
          139 kB
          Olaf Faaland
        9. dmesg.opal188.txt
          147 kB
          Olaf Faaland
        10. dk.opal63.llnl.gov.7.txt
          757 kB
          Olaf Faaland
        11. dk.opal188.llnl.gov.7.txt
          1.03 MB
          Olaf Faaland
        12. diff.txt
          1 kB
          Serguei Smirnov
        13. build.txt
          268 kB
          Olaf Faaland
        14. 02-post_state.patch
          4 kB
          Mike Marciniszyn
        15. 01-move_null.patch
          1 kB
          Mike Marciniszyn

            Assignee: Serguei Smirnov (ssmirnov)
            Reporter: Olaf Faaland (ofaaland)