
[LU-3447] Client RDMA too fragmented: 128/255 src 128/256 dst frags

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • None
    • Affects Version/s: Lustre 2.1.5
    • Environment: Lustre servers running 2.1.5, Lustre clients with 1.8.9
    • 3
    • 8618

    Description

      During an IOR-like benchmark doing direct I/O from multiple clients (16, 64), clients get disconnected and evicted. The MPI job dies in misery and some of its processes aren't even killable.

      We've seen that there was a similar bug a while ago that was marked as solved; it was occurring on LNET routers (https://bugzilla.lustre.org/show_bug.cgi?id=13607). This one is on clients.

      What can lead to the "RDMA too fragmented" issue? Any hint or suggestion? Client log messages are in the attached file.

      Regards,
      Erich

      Attachments

        Activity


          jfc John Fuchs-Chesney (Inactive) added a comment -

          The customer was able to resolve the problem. Nothing more is required here.

          jfc John Fuchs-Chesney (Inactive) added a comment -

          Erich,
          Do you want us to keep this ticket open?
          Maybe you have had a chance to test the issue on a later installation?
          Thanks,
          ~ jfc.
          efocht Erich Focht added a comment -

          Hi Bruno,

          unfortunately we cannot use the module option there. It is a huge environment with several Lustre setups and the customer is not willing to switch that option over everywhere, which (as far as I understand) we would need to do on clients as well as on servers. So we can't switch the clients over selectively. But we will test it as soon as we can on another (upcoming) installation.

          Regards,
          Erich


          bfaccini Bruno Faccini (Inactive) added a comment -

          Hello Erich,
          Any news on your side?

          bfaccini Bruno Faccini (Inactive) added a comment -

          Hello Erich,
          Working more on this very infrequent problem, it seems highly possible that it is caused by the upper layer/application doing big, unaligned I/Os. Since you indicated that your customer got it when running an MPI application doing direct I/O, can you also check on his side whether these I/Os could be unaligned (to page boundaries), and what their size is?
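          To illustrate the alignment point above (this sketch is not from the ticket; the path and sizes are hypothetical), a minimal page-aligned O_DIRECT write on a Linux client looks like this. Transfers whose buffer address, file offset, or length are not page-aligned are the kind of direct I/O being asked about here.

              /* dio_aligned.c - hedged illustration, not part of the ticket */
              #define _GNU_SOURCE
              #include <fcntl.h>
              #include <stdio.h>
              #include <stdlib.h>
              #include <string.h>
              #include <unistd.h>

              int main(void)
              {
                  const char *path = "/lustre/scratch/dio_test";   /* hypothetical path */
                  long page = sysconf(_SC_PAGESIZE);
                  size_t len = 1 << 20;              /* 1 MiB, a multiple of the page size */
                  void *buf;

                  /* O_DIRECT requires the buffer, file offset and length to be aligned
                   * (typically to the page or logical block size). */
                  if (posix_memalign(&buf, (size_t)page, len)) {
                      perror("posix_memalign");
                      return 1;
                  }
                  memset(buf, 0xab, len);

                  int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
                  if (fd < 0) {
                      perror("open");
                      free(buf);
                      return 1;
                  }

                  /* Aligned offset and length; an unaligned call such as
                   * pwrite(fd, (char *)buf + 1, len - 3, page + 5) is the sort of
                   * direct I/O Bruno asks about in the comment above. */
                  if (pwrite(fd, buf, len, 0) < 0)
                      perror("pwrite");

                  close(fd);
                  free(buf);
                  return 0;
              }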
          efocht Erich Focht added a comment -

          Hi Bruno,

          is that option available on 1.8.9 as well as on 2.X? Thanks for pointing me to it!

          It is difficult to do that in the customer's environment if we need to set this on both clients and servers: he has 3-4 Lustre filesystems (not all from us), a mix of versions, and 3.5k clients. But I'll try to find an opportunity to do it and discuss it with the customer.

          Best regards,
          Erich

          pjones Peter Jones added a comment -

          Bruno

          Can you please advise?

          Thanks

          Peter


          bfaccini Bruno Faccini (Inactive) added a comment -

          Hello Erich,
          Thanks for the hint that solved the issue on your side.
          But to be complete on this, it would be nice to give the "map_on_demand" dynamic feature a try (an o2iblnd proc/module parameter, but it has to be set on all nodes); it may also be a way to fix such a problem.
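          For reference, one way such an o2iblnd setting might be applied is via a modprobe options file; the value below is only an example, and as noted it has to be consistent on all nodes (clients, servers and routers):

              # /etc/modprobe.d/ko2iblnd.conf -- example value only; must match on every node
              options ko2iblnd map_on_demand=32

          The LNET modules have to be reloaded (or the nodes rebooted) for the new value to take effect.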
          efocht Erich Focht added a comment (edited) -

          Increasing the MTT size on the client nodes seems to solve the problem. For instructions: http://community.mellanox.com/docs/DOC-1120
          We've set log_num_mtt to 24.

          Having a more meaningful error message would be nice.

          This bug can be closed.

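          For reference, log_num_mtt as described above is an mlx4_core module parameter on ConnectX HCAs; following the Mellanox document linked above, the change might look like this (a driver reload or reboot is required afterwards):

              # /etc/modprobe.d/mlx4_core.conf -- per the comment above: log_num_mtt=24
              options mlx4_core log_num_mtt=24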

          People

            Assignee: bfaccini Bruno Faccini (Inactive)
            Reporter: efocht Erich Focht
            Votes: 0
            Watchers: 8
