LNet: change use of fatal error flag for ni selection to be a part of health feature

Details

    • Improvement
    • Resolution: Fixed
    • Minor
    • Lustre 2.16.0

    Description

      Making the use of the fatal error flag in NI selection part of the health feature lets the user control it by turning the health feature on or off. Some users may decide to turn off fatal link state detection if LNet is configured with a single NI.

        Activity

          [LU-14955] LNet: change use of fatal error flag for ni selection to be a part of health feature
          pjones Peter Jones added a comment - Landed for 2.16

          gerrit Gerrit Updater added a comment -
          "Oleg Drokin <green@whamcloud.com>" merged in patch https://review.whamcloud.com/44746/
          Subject: LU-14955 lnet: Use fatal NI if none other available
          Project: fs/lustre-release
          Branch: master
          Current Patch Set:
          Commit: ff3322fd0c77a8042558711d9f410326d2aa6375

          gerrit Gerrit Updater added a comment -
          "Serguei Smirnov <ssmirnov@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/44746
          Subject: LU-14955 lnet: make fatal ni handling part of health feature
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 2f9ae83236e0c4cda5b5a1ae04ab1f71a3cf6036

          ssmirnov Serguei Smirnov added a comment - Andreas, yes, after some debate, the patch that I'm about to push will allow the only NI to be picked regardless of its fatal state. By extension, if there are multiple NIs and all of them are in the fatal state, LNet will still be able to select one.
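
          The selection behaviour described in this comment can be sketched roughly as follows. This is a minimal illustration only, not the code from the patch: the struct sketch_ni type, the sketch_select_ni() function, and the health_enabled parameter are hypothetical names invented for this example, and real LNet NI selection weighs other criteria that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a local network interface (NI). */
struct sketch_ni {
	int  id;          /* illustrative identifier only */
	bool fatal_error; /* set when a fatal link state was detected */
};

/*
 * Pick an NI from the array.
 *
 * With health enabled, NIs flagged as fatal are skipped on the way
 * through the list. If every NI is flagged (which includes the
 * single-NI case), the first NI is returned as a fallback instead of
 * failing. With health disabled, the fatal flag is ignored entirely.
 * Returns NULL only when the list is empty.
 */
static const struct sketch_ni *
sketch_select_ni(const struct sketch_ni *nis, size_t count,
		 bool health_enabled)
{
	const struct sketch_ni *fallback = NULL;
	size_t i;

	assert(nis != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (fallback == NULL)
			fallback = &nis[i];
		if (!health_enabled || !nis[i].fatal_error)
			return &nis[i];
	}

	return fallback;
}
```

          In this sketch a node with one NI always gets that NI back, and a node whose NIs are all in the fatal state still gets a usable choice, which mirrors the merged patch subject "Use fatal NI if none other available".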

          adilger Andreas Dilger added a comment - Serguei, is there any real benefit for the only link/interface on a node to be marked unavailable? I don't think that makes sense. In general, I guess enabling LNet Health doesn't make sense for a system with only a single link/interface, since there isn't any choice but to continue using that one interface. In most such cases, the error will be transient, so retrying will fix the problem, and if the only interface on a client is permanently broken, then there isn't anything that can be done anyway.

          People

            ssmirnov Serguei Smirnov
            ssmirnov Serguei Smirnov