Details

    • New Feature
    • Resolution: Unresolved
    • Major
    • None
    • Lustre 2.7.0, Lustre 2.8.0

    Description

      In a suppress-ping environment, an evicted client cannot recover from the evicted state until its first access to the server that evicted it. That access gets -EIO and returns immediately, which can cause a user job to terminate with an error.

      We could avoid this by running "lfs df" before every single operation, but that is really troublesome and not something we can actually do.

      The eviction notifier provided by this patch is one solution to the problem. It works as follows.
      First, the target (MDT or OST) that evicted a client notifies the MGS of the eviction event.
      Then the MGS sends a request to the evicted client.
      Finally, on receiving the request, the client sends a ping to the target server and discovers that it has been evicted.
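
      The three steps above can be sketched as a toy model. This is not Lustre code: the class and method names (Target, MGS, Client, notify_eviction, on_eviction_notice) are hypothetical, and recovery is reduced to "the first request after an eviction gets -EIO, after which the client reconnects". It only illustrates why the notifier moves the -EIO from the application's first access to a ping.

```python
import errno


class Target:
    """Stands in for an MDT or OST (hypothetical model, not Lustre code)."""

    def __init__(self, name, mgs):
        self.name = name
        self.mgs = mgs
        self.evicted = set()

    def evict(self, client_id):
        self.evicted.add(client_id)
        # Step 1: the target reports the eviction event to the MGS.
        self.mgs.notify_eviction(self, client_id)

    def request(self, client_id):
        # Any request from an evicted client fails with -EIO.
        return -errno.EIO if client_id in self.evicted else 0

    def reconnect(self, client_id):
        self.evicted.discard(client_id)


class MGS:
    def __init__(self, notifier_enabled):
        self.notifier_enabled = notifier_enabled
        self.clients = {}

    def register(self, client):
        self.clients[client.client_id] = client

    def notify_eviction(self, target, client_id):
        # Step 2: the MGS forwards the event to the evicted client.
        if self.notifier_enabled and client_id in self.clients:
            self.clients[client_id].on_eviction_notice(target)


class Client:
    def __init__(self, client_id):
        self.client_id = client_id

    def _send(self, target):
        rc = target.request(self.client_id)
        if rc == -errno.EIO:
            # The client learns it was evicted only now, and reconnects.
            target.reconnect(self.client_id)
        return rc

    def on_eviction_notice(self, target):
        # Step 3: ping the target so the ping, not user I/O, absorbs the error.
        self._send(target)

    def read(self, target):
        return self._send(target)


def first_read_after_eviction(notifier_enabled):
    """Status of the first application read after an eviction, with no periodic pings."""
    mgs = MGS(notifier_enabled)
    client = Client("client-1")
    mgs.register(client)
    ost = Target("OST0000", mgs)
    ost.evict(client.client_id)
    return client.read(ost)


if __name__ == "__main__":
    print("notifier off:", first_read_after_eviction(False))
    print("notifier on: ", first_read_after_eviction(True))
```

      In this model the notification is delivered instantly and always succeeds, which a real deployment cannot assume.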

        Activity

          [LU-6657] Eviction Notifier

          morrone Christopher Morrone (Inactive) added a comment

          There are a finite number of operations that a client will do after being evicted and idle for some time. Just walk through them and figure out which work and which do not.

          For instance, I would hope that an open() call would work after eviction. Hopefully when the open() fails, the client reconnects and retries the open(), and the application is none the wiser that this occurred. Is that the case?
          nozaki Hiroya Nozaki (Inactive) added a comment - edited

          Hmm... In a suppress-ping environment, clients are left in the evicted state after an eviction event, and we don't want to leave them evicted until their first access after the event. That's why there is no particular target case. Fujitsu is expected to reduce the number of errors that end users get from evictions.

          morrone Christopher Morrone (Inactive) added a comment

          Generally speaking, no, we would not retry bulk IO after an eviction. If a client already has an open file handle when the eviction occurs, then any currently under way or future operations on that file should receive an error. If a client is able to reconnect after eviction without a reboot of the node, even though the client might seem like the same client to us, the client is a completely new client instance from the servers' perspective. All previous state was lost.

          The eviction notifier approach does not help with that particular issue.

          I think the best path forward is for you to open tickets on exactly what operations were failing for you so we work on fixing them.

          nozaki Hiroya Nozaki (Inactive) added a comment

          I cannot help answering "yes" to the question. We should think about a fundamental solution, though I think this feature is a reasonably cheap workaround, at least for now, when using suppress ping.

          Considering bulk I/O, can we simply resend a request, or should we carefully examine which operations can be resent depending on each situation?

          morrone Christopher Morrone (Inactive) added a comment

          Yes, but we avoided having eviction notifications for good reason: evictions normally occur because we are unable to talk to the client. Adding new communication for a client with which we are unable to communicate seems like a less than desirable design decision. While I can imagine situations where that would work, I can also imagine situations where the added communication causes more harm than good.

          Also, if I understand your solution, you have not really solved the underlying problem; you have merely shrunk the window in which the problem can occur. Since the notification goes sideways through the MGS, there is still a window, while that is happening, in which the client can reconnect to the server and still get an error.

          But fixing the client bugs would fix the problem completely, would it not?
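
          The remaining window described in this comment can be made concrete with a small timing model. It is a sketch under stated assumptions: the eviction happens at t=0, notify_delay is a hypothetical figure for the whole target -> MGS -> client -> ping round trip, and first_access_status is an illustrative helper, not a Lustre function.

```python
import errno


def first_access_status(access_time, notify_delay=None):
    """Status of the first application access after an eviction at t=0.

    notify_delay is the assumed time for the notification to travel
    target -> MGS -> client and for the client's ping to complete.
    None means the notifier is disabled.
    """
    notified = notify_delay is not None and access_time >= notify_delay
    # If the ping already absorbed the eviction, the access succeeds;
    # otherwise the application itself is the first to see -EIO.
    return 0 if notified else -errno.EIO
```

          With the notifier disabled, every first access fails no matter how late it comes. With it enabled, only accesses that arrive before notify_delay fail, so the window is smaller but not closed.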

          nozaki Hiroya Nozaki (Inactive) added a comment

          There is a lot of code on the client side derived from the eviction mechanism, such as if-statements checking exp_failed. So I think it would take a great deal of time to verify whether the replacement is perfectly complete. That is why I thought I shouldn't touch this now and instead built new logic on top of the eviction mechanism.

          This feature works independently of the other features, though the code depends on fsdb, and we can disable it if we like. So I think adopting this feature is more reasonable than devising and verifying non-eviction logic on the client side.

          morrone Christopher Morrone (Inactive) added a comment

          This sounds like a pretty major protocol change to Lustre, and I think we need to have more discussion about whether this is a reasonable approach.

          I believe that the current high level design dictates that when a client performs an operation and discovers that it has been evicted, it should reconnect and resend the operation. So in places where the client does not currently do that, can we not simply fix the client?
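
          The reconnect-and-resend behavior described in this comment could look roughly like the following. This is a minimal sketch, not the Lustre client: FakeServer, its instance counter, and open_with_retry are hypothetical names, and the retry is only appropriate for operations that are safe to reissue against a fresh client instance, such as the open() case raised elsewhere in this thread.

```python
import errno


class FakeServer:
    """Hypothetical stand-in for a target that may have evicted the client."""

    def __init__(self):
        self.evicted = False
        self.instance = 0  # bumps on every reconnect: a "new" client instance

    def open(self, path):
        if self.evicted:
            raise OSError(errno.EIO, "client evicted")
        return (self.instance, path)

    def reconnect(self):
        # All previous client state is gone; only the connection is restored.
        self.evicted = False
        self.instance += 1


def open_with_retry(server, path, max_retries=1):
    """Reissue open() after an eviction instead of surfacing -EIO to the caller."""
    for attempt in range(max_retries + 1):
        try:
            return server.open(path)
        except OSError as exc:
            # Give up on unrelated errors or once the retry budget is spent.
            if exc.errno != errno.EIO or attempt == max_retries:
                raise
            server.reconnect()
```

          Operations that depend on state held before the eviction, such as bulk I/O on an already open file handle, are deliberately left out of this retry path.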

          gerrit Gerrit Updater added a comment

          Hiroya Nozaki (nozaki.hiroya@jp.fujitsu.com) uploaded a new patch: http://review.whamcloud.com/14987
          Subject: LU-6657 mgs: eviction notifier
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: f16ff2e5cab2179692fdfd380a3b520df3bb71c6

          People

            nozaki Hiroya Nozaki (Inactive)
            nozaki Hiroya Nozaki (Inactive)