Lustre / LU-9238

Enhancement for route failure detection

Details

    • Type: New Feature
    • Resolution: Unresolved
    • Priority: Minor

    Description

      I've been thinking about ways to enhance route failure detection, since asymmetric route failure detection doesn't do much for multi-hop configurations. My idea is to extend the LNet ping info to include route up/down status. Peers could then learn the route status of their next hop and use it when selecting a next hop for future sends. In multi-hop configurations, a bad hop anywhere on the route should eventually percolate to all peers that use that route. This isn't an ideal solution because it requires a wire protocol change, but I'm opening this ticket to discuss it further, or to see whether we can come up with another option.
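
A rough sketch of the idea above, as a toy model rather than LNet code. Everything in it is an assumption made for illustration: the `Gateway` class, the `route_up` field standing in for the proposed ping info extension, the helper names, and the NID strings. It shows a peer skipping a next hop whose advertised route is down, and a down status propagating back one hop at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical view of one next-hop router, as held by a peer."""
    nid: str
    alive: bool = True  # did this gateway answer the peer's own ping?
    # Hypothetical ping info extension: remote net -> route up/down,
    # as reported by the gateway in its ping reply.
    route_up: dict = field(default_factory=dict)


def usable(gw, remote_net):
    """A gateway is usable if it is alive and does not report the route as down.

    A missing entry is treated as "up" so that gateways which do not send
    the extended status behave as they do today (an assumption of this sketch).
    """
    return gw.alive and gw.route_up.get(remote_net, True)


def select_next_hop(gateways, remote_net):
    """Return the first usable gateway for remote_net, or None if there is none."""
    for gw in gateways:
        if usable(gw, remote_net):
            return gw
    return None


def advertise(gw, downstream, remote_net):
    """Set the status gw would report: up only if some downstream hop is usable."""
    gw.route_up[remote_net] = any(usable(d, remote_net) for d in downstream)


# Two-hop example: peer -> r1 -> r2 -> o2ib1, with r1b as an alternative first hop.
r2 = Gateway("r2@tcp1", route_up={"o2ib1": False})  # last hop reports its route down
r1 = Gateway("r1@tcp0")
r1b = Gateway("r1b@tcp0")

advertise(r1, [r2], "o2ib1")              # r1 learns of the bad hop from r2's status
chosen = select_next_hop([r1, r1b], "o2ib1")
print(chosen.nid)
```

In this model the bad hop behind `r1` reaches the peer after one round of status exchange per hop, which is the "eventually percolate" behavior described above. How the status would actually be encoded on the wire is left open here, as it is in the description.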

            People

              Assignee: Amir Shehata (Inactive)
              Reporter: Chris Horn
              Votes: 0
              Watchers: 8
