  1. Lustre
  2. LU-5652

client eviction if lock enqueue reply is lost

Details

    • Bug
    • Resolution: Unresolved
    • Minor
    • None
    • None
    • 3
    • 15838

    Description

      A client is evicted when all of the following happen:

      • An ldlm lock is granted while the lock enqueue reply is being sent.
      • Another thread tries to enqueue a conflicting lock, which either sets LDLM_FL_AST_SENT on this reply or sends a blocking AST.
      • The lock enqueue reply is lost.
      • The RPC deadline on the client side is longer than the waiting lock deadline.

      If all of these happen, the client is evicted even with AST resend (LU-5520).
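
      The conditions above can be written as a small predicate. This is only a toy model for reasoning about the race, not Lustre code: the EnqueueScenario class, its field names, and the use of plain seconds are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EnqueueScenario:
    """Hypothetical toy model; times are seconds measured from the lock grant."""
    lock_granted: bool          # lock was granted while the enqueue reply was being sent
    conflicting_enqueue: bool   # another thread enqueued a conflicting lock
    reply_lost: bool            # the enqueue reply never reached the client
    client_rpc_deadline: float  # when the client gives up on the reply and resends
    lock_wait_deadline: float   # when the server stops waiting on the lock


def client_evicted(s: EnqueueScenario) -> bool:
    """Return True only when all four conditions from the description hold."""
    return (
        s.lock_granted
        and s.conflicting_enqueue
        and s.reply_lost
        # The client would resend only after the server has already given up.
        and s.client_rpc_deadline > s.lock_wait_deadline
    )
```

      In this model, a scenario with a lost reply, a conflicting enqueue, a client RPC deadline of 100 and a waiting lock deadline of 60 counts as an eviction; lowering the client RPC deadline below 60 does not.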

      This patch is a workaround: it guarantees that the waiting lock deadline is longer than the server RPC deadline, which should be close to the client-side RPC deadline, so the client at least has a chance to resend the RPC.

      This patch cannot help if multiple messages are lost, for example if the resent RPC is lost as well. Likewise, with large network latency, such as router congestion on the path from client to server, the client may still be evicted even with this patch, because the server has no knowledge of network latency.
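
      The workaround and its latency limitation can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual patch: the function names, the margin parameter, and the extra_client_delay parameter (how much later than the server RPC deadline the client's resend reaches the server) are all hypothetical.

```python
def adjusted_lock_deadline(lock_deadline: float,
                           server_rpc_deadline: float,
                           margin: float = 1.0) -> float:
    """Stretch the waiting lock deadline so it exceeds the server RPC deadline.

    `margin` is a hypothetical positive slack value, not a Lustre tunable.
    """
    return max(lock_deadline, server_rpc_deadline + margin)


def resend_arrives_in_time(lock_deadline: float,
                           server_rpc_deadline: float,
                           extra_client_delay: float,
                           margin: float = 1.0) -> bool:
    """Check whether a single resend reaches the server before the lock deadline.

    The server only knows its own RPC deadline; `extra_client_delay` stands for
    network latency it cannot see.
    """
    deadline = adjusted_lock_deadline(lock_deadline, server_rpc_deadline, margin)
    resend_arrival = server_rpc_deadline + extra_client_delay
    return resend_arrival < deadline
```

      With a small delay the resend lands inside the stretched deadline; once the unseen delay exceeds the margin, the resend arrives too late and the client can still be evicted, which matches the limitation described above. A second lost message is not modeled here.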

          People

            liang Liang Zhen (Inactive)
            Votes: 0
            Watchers: 5
