Lustre / LU-12199

MDs are not detached from uncommitted messages that have a health check performed on them

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Affects Version/s: Lustre 2.12.0, Lustre 2.13.0
    • None
    • 3

    Description

      It's possible for lnet_is_health_check() to return "true" when the
      message has not hit the network. In this situation the message is freed
      without detaching the MD. As a result, requests do not receive their
      unlink events and these requests are stuck forever.
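
      The failure mode described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the LNet code or the actual fix: every name in it (toy_md, toy_msg, toy_needs_health_check, toy_finalize, toy_run) is hypothetical, and the predicate merely stands in for a health-check decision that ignores whether the message was committed to the network.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are real LNet types or functions. */
struct toy_md {
    bool attached;      /* still referenced by a message */
    bool unlink_event;  /* owner has been told the MD was unlinked */
};

struct toy_msg {
    struct toy_md *md;  /* memory descriptor attached to the message */
    bool committed;     /* true once the message has hit the network */
    bool freed;
};

/* Hypothetical predicate: decides from the status alone and never
 * looks at whether the message was committed. */
bool toy_needs_health_check(const struct toy_msg *msg, int status)
{
    (void)msg;
    return status != 0;
}

/* Detach the MD and deliver the unlink event to its owner. */
void toy_detach_md(struct toy_msg *msg)
{
    if (msg->md != NULL) {
        msg->md->attached = false;
        msg->md->unlink_event = true;
        msg->md = NULL;
    }
}

void toy_free_msg(struct toy_msg *msg)
{
    msg->freed = true;
}

/* detach_uncommitted = false models the reported behaviour,
 * detach_uncommitted = true models one possible correction. */
void toy_finalize(struct toy_msg *msg, int status, bool detach_uncommitted)
{
    assert(msg != NULL);

    if (toy_needs_health_check(msg, status) && !msg->committed) {
        /* Health-check path for a message that never reached the network. */
        if (detach_uncommitted)
            toy_detach_md(msg);
        toy_free_msg(msg);  /* without the detach, the MD is orphaned */
        return;
    }

    /* Normal path: detach the MD, then free the message. */
    toy_detach_md(msg);
    toy_free_msg(msg);
}

/* Returns true if the MD owner received its unlink event. */
bool toy_run(bool detach_uncommitted)
{
    struct toy_md md = { .attached = true, .unlink_event = false };
    struct toy_msg msg = { .md = &md, .committed = false, .freed = false };

    toy_finalize(&msg, -1, detach_uncommitted);
    return md.unlink_event;
}
```

      In this model, toy_run(false) returns false because the message is freed while its MD is still attached, so the owner never sees an unlink event. toy_run(true) returns true because the MD is detached before the message is freed.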

      This issue was discovered while testing the MR routing feature under LNet router failure conditions.

      The bug was introduced by the LNet health feature, commit 70616605dd44be37068f4e1a4745a2f8b90eb1f5 (https://review.whamcloud.com/32764).

          People

            hornc Chris Horn