[LU-12199] md's are not detached from uncommitted messages that have health check performed on them

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.13.0, Lustre 2.12.3
    • Affects Version/s: Lustre 2.12.0, Lustre 2.13.0
    • Labels: None
    • Severity: 3

    Description

      It's possible for lnet_is_health_check() to return "true" when the
      message has not hit the network. In this situation the message is freed
      without detaching the MD. As a result, requests do not receive their
      unlink events and these requests are stuck forever.
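
      The failure mode and the intent of the fix can be illustrated with a
      small stand-alone model. This is only a sketch under stated assumptions:
      every toy_* name, struct and field below is hypothetical and is not the
      LNet code or the landed patch. It shows a health check predicate that
      refuses uncommitted messages, and a finalize path that detaches the MD
      (so the unlink event is delivered) before an uncommitted message is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for LNet structures; all names here are hypothetical. */
struct toy_md {
    int  refcount;       /* messages still attached to this MD */
    bool unlink_event;   /* set once the owner was told the MD is unlinked */
};

struct toy_msg {
    bool tx_committed;   /* message was committed for sending */
    bool rx_committed;   /* message was committed for receiving */
    int  status;         /* non-zero means the operation failed */
    bool health_checked; /* records whether a health check was performed */
    bool freed;          /* marks the message as released */
    struct toy_md *md;   /* attached memory descriptor, or NULL */
};

static bool toy_msg_is_committed(const struct toy_msg *msg)
{
    return msg->tx_committed || msg->rx_committed;
}

/* A health check only applies to a failed message that reached the network. */
static bool toy_is_health_check(const struct toy_msg *msg)
{
    if (!toy_msg_is_committed(msg))
        return false;
    return msg->status != 0;
}

/* Detach the MD; deliver the unlink event when the last reference drops. */
static void toy_msg_detach_md(struct toy_msg *msg)
{
    struct toy_md *md = msg->md;

    if (md == NULL)
        return;
    msg->md = NULL;
    if (--md->refcount == 0)
        md->unlink_event = true;
}

/* Finalize a message. The MD is detached on every path before the free. */
static void toy_finalize(struct toy_msg *msg, int status)
{
    msg->status = status;
    msg->health_checked = toy_is_health_check(msg);
    toy_msg_detach_md(msg);
    msg->freed = true;
}
```

      In this model, finalizing a failed message whose tx_committed and
      rx_committed flags are both false leaves health_checked false, clears
      msg->md, and sets unlink_event on the MD, so the owner of the request is
      notified instead of waiting indefinitely.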

      This issue was discovered while testing the MR routing feature under LNet router failure conditions.

      The bug was introduced by the LNet health feature, commit 70616605dd44be37068f4e1a4745a2f8b90eb1f5 (https://review.whamcloud.com/32764).

        Activity

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36039/
          Subject: LU-12199 lnet: verify msg is commited for send/recv
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set:
          Commit: 73c8ae59cb2bd8352301d8f09ef1309adb5c8202

          Oleg Drokin (green@whamcloud.com) merged in patch https://review.whamcloud.com/36038/
          Subject: LU-12199 lnet: Ensure md is detached when msg is not committed
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set:
          Commit: d5a05a56fa29259b28dcc766af391ee0f3a357fd

          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36039
          Subject: LU-12199 lnet: verify msg is commited for send/recv
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: ae2a031220d21bf3a511457cbe091134278c0cec

          Minh Diep (mdiep@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/36038
          Subject: LU-12199 lnet: Ensure md is detached when msg is not committed
          Project: fs/lustre-release
          Branch: b2_12
          Current Patch Set: 1
          Commit: 7699683cd5779316ff7d9429df1a7428c978b2b0

          Amir Shehata (ashehata@whamcloud.com) merged in patch https://review.whamcloud.com/34797/
          Subject: LU-12199 lnet: verify msg is commited for send/recv
          Project: fs/lustre-release
          Branch: multi-rail
          Current Patch Set:
          Commit: fc6b321036f34c00d5b32b49c817dc0034fbad9e

          Amir Shehata (ashehata@whamcloud.com) merged in patch https://review.whamcloud.com/34885/
          Subject: LU-12199 lnet: Ensure md is detached when msg is not committed
          Project: fs/lustre-release
          Branch: multi-rail
          Current Patch Set:
          Commit: b65f3a1767ae82c7f629320187b33eb8670da537

          Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34971
          Subject: LU-12199 lnet: verify msg is commited for send/recv
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: 6b9f43a82e6fe8ce90cb925c9e46023cc76a196c

          Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34885
          Subject: LU-12199 lnet: Ensure md is detached when msg is not committed
          Project: fs/lustre-release
          Branch: multi-rail
          Current Patch Set: 1
          Commit: cac7eba2fe4f0852dbf416388ce6831027c1f555

          Amir Shehata (ashehata@whamcloud.com) uploaded a new patch: https://review.whamcloud.com/34797
          Subject: LU-12199 lnet: verify msg is commited for send/recv
          Project: fs/lustre-release
          Branch: multi-rail
          Current Patch Set: 1
          Commit: a4cc4392e989fe33299324a9ebb3d7fdfa45baad

          Chris Horn (hornc@cray.com) uploaded a new patch: https://review.whamcloud.com/34709
          Subject: LU-12199 lnet: Ensure md is detached when msg is not committed
          Project: fs/lustre-release
          Branch: master
          Current Patch Set: 1
          Commit: cffdb4bb4afc698bb4df6cef9d74d85cd0e2b876

          People

            Assignee: Chris Horn (hornc)
            Reporter: Chris Horn (hornc)