Lustre / LU-14733

brw_bulk_ready() BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103

Details

    • Bug
    • Resolution: Fixed
    • Critical
    • Lustre 2.12.8, Lustre 2.15.0
    • None
    • lustre-2.12.6_9.llnl client
      kernel-4.18.0-305.0.0.1toss.t4.x86_64
      RHEL84
    • 3
    • 9223372036854775807

    Description

      lnet_selftest fails between two nodes over Omnipath

      dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from 12345-192.168.128.126@o2ib18: -103 
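      The debug line above packs several fields into one colon-separated record. As a reading aid, here is a minimal Python sketch that splits such a line and names the trailing return code. The field names (subsys, mask, cpu, type, stack, extpid) are an assumed reading of the `lctl dk` layout, and parse_dk_line is a hypothetical helper, not part of Lustre. The errno names are the generic Linux values, which kernel code returns negated.

```python
import re

# Generic Linux errno values around 103 (kernel code returns them negated).
LINUX_ERRNO = {
    101: "ENETUNREACH",
    102: "ENETRESET",
    103: "ECONNABORTED",
    104: "ECONNRESET",
}

# Assumed layout of one debug-log record:
# subsys:mask:cpu.type:sec.usec:stack:pid:extpid:(file:line:func()) message
DK_LINE = re.compile(
    r"(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):"
    r"(?P<cpu>\d+)\.(?P<type>\w+):"
    r"(?P<time>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s+(?P<msg>.*)"
)


def parse_dk_line(text):
    """Split one debug-log line into named fields (hypothetical helper)."""
    m = DK_LINE.search(text)  # search() skips any leading file-name prefix
    if m is None:
        return None
    rec = m.groupdict()
    for key in ("cpu", "stack", "pid", "extpid", "line"):
        rec[key] = int(rec[key])
    # A trailing ": -NNN" in the message is treated as a negated errno.
    rc = re.search(r":\s*(-\d+)\s*$", rec["msg"])
    rec["rc"] = int(rc.group(1)) if rc else None
    rec["errno_name"] = LINUX_ERRNO.get(-rec["rc"]) if rc else None
    return rec


SAMPLE = (
    "dk.opal63.llnl.gov.7:00000001:00020000:43.0:1622598261.714620:0:129525:0:"
    "(brw_test.c:415:brw_bulk_ready()) BRW bulk READ failed for RPC from "
    "12345-192.168.128.126@o2ib18: -103 "
)

if __name__ == "__main__":
    rec = parse_dk_line(SAMPLE)
    print(rec["file"], rec["line"], rec["func"], rec["rc"], rec["errno_name"])
```

      Under these assumptions, the -103 in the report corresponds to ECONNABORTED, meaning the bulk transfer was aborted rather than timed out or refused.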

      Bulk transfers work over InfiniBand (although in that test one of the nodes was running RHEL 7.9 with an earlier Lustre patch stack). Bulk transfers also work over TCP using ksocklnd.

      lctl pings work fine between the same two nodes.

      mpibench and other MPI applications also work fine over Omnipath between two nodes.

      See https://github.com/LLNL/lustre/releases/tag/2.12.6_9.llnl for the patch stack

      Attachments

        1. 01-move_null.patch
          1 kB
          Mike Marciniszyn
        2. 02-post_state.patch
          4 kB
          Mike Marciniszyn
        3. build.txt
          268 kB
          Olaf Faaland
        4. diff.txt
          1 kB
          Serguei Smirnov
        5. dk.opal188.llnl.gov.7.txt
          1.03 MB
          Olaf Faaland
        6. dk.opal63.llnl.gov.7.txt
          757 kB
          Olaf Faaland
        7. dmesg.opal188.txt
          147 kB
          Olaf Faaland
        8. dmesg.opal63.txt
          139 kB
          Olaf Faaland
        9. kprobes.sh
          5 kB
          Mike Marciniszyn
        10. kprobes-off.sh
          2 kB
          Mike Marciniszyn
        11. linux-kernel-test.patch
          2 kB
          Mike Marciniszyn
        12. move_null.patch
          0.8 kB
          Mike Marciniszyn
        13. post_state.patch
          3 kB
          Mike Marciniszyn
        14. trace1.txt
          36 kB
          Mike Marciniszyn
        15. trace2.txt
          51 kB
          Mike Marciniszyn

            People

              ssmirnov Serguei Smirnov
              ofaaland Olaf Faaland