[LU-27] (mds_open.c:1667:mds_close()) @@@ no handle for file close ino

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 1.8.6
    • Affects Version/s: Lustre 1.8.6
    • Labels: None
    • 3
    • 10454

    Description

      We are hitting frequent MDS hangs at Titech due to an LBUG caused by "no handle for file close in mds_close()".
      It looks similar to bugs 22104 and 22528, but there is no solution or patch yet that addresses this problem.
      Could you have a look at the attachment and offer any suggestions?

      Attachments

        Activity

          pjones Peter Jones added a comment -

          OK, I will mark it as resolved. Ihara, please reopen if you feel that this is inappropriate.

          niu Niu Yawei (Inactive) added a comment -

          Yes, I think we can mark this as resolved; I am not sure whether the reporter or the assignee should mark it.
          pjones Peter Jones added a comment -

          Great! So, does any work remain or can we mark this issue as resolved?

          niu Niu Yawei (Inactive) added a comment -

          No, master doesn't have this bug.
          pjones Peter Jones added a comment -

          It looks like this patch has landed on the Oracle 1.8.6. Is the same fix needed for master?

          niu Niu Yawei (Inactive) added a comment -

          Yes, the patch has been posted on BZ and Gerrit for review.

          ihara Shuichi Ihara (Inactive) added a comment -

          Niu, just to confirm: you filed this on Bugzilla as 24360 and are now moving forward with patch review, right?

          niu Niu Yawei (Inactive) added a comment -

          Yes, there are some defects in mds_verify_child():

          • It wrongly decrefs the child lock in the "no child lock wanted" case;
          • It wrongly decrefs the parent lock in the "reget child lock successfully" case.

          This bug isn't necessarily caused by the "no handle for file close in mds_close()", so I think it's not similar to bugs 22104 and 22528.

          I will post a patch for review soon.
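
          The two mds_verify_child() defects described above are both reference-count imbalances. As a rough illustration only, the sketch below models them with a toy counter. It is not the Lustre code or the LDLM API: toy_lock, toy_addref, toy_decref and toy_verify_child are hypothetical names, and the control flow is an assumption about what balanced handling could look like, not the landed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a lock handle; NOT a Lustre/LDLM structure. */
struct toy_lock {
    int refs; /* references currently held by the caller */
};

void toy_addref(struct toy_lock *l) { l->refs++; }

void toy_decref(struct toy_lock *l)
{
    assert(l->refs > 0); /* an unbalanced decref trips here */
    l->refs--;
}

/*
 * Hypothetical re-verification step. For simplicity the old and the
 * re-acquired child lock share one counter.
 * Returns 0 with the caller's references intact, or -1 after
 * releasing everything the caller held.
 */
int toy_verify_child(struct toy_lock *parent, struct toy_lock *child,
                     bool want_child_lock, bool child_changed, bool reget_ok)
{
    if (!child_changed)
        return 0; /* nothing to redo, references untouched */

    if (!want_child_lock)
        return 0; /* no child lock was taken, so there is nothing to decref */

    toy_decref(child); /* drop the lock held on the stale child */
    if (reget_ok) {
        toy_addref(child); /* lock re-acquired on the new child */
        return 0;          /* the parent reference must be left alone */
    }

    toy_decref(parent); /* failure path: release the parent as well */
    return -1;
}
```

          In this model, calling toy_verify_child() with want_child_lock set to false leaves a child counter of 0 untouched, and a successful reget leaves the parent counter at its original value. Those are the two cases the comment says were mishandled.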

          dferber Dan Ferber (Inactive) added a comment -

          Assigned to Niu, per Liang's comments and suggestion.

          People

            Assignee: niu Niu Yawei (Inactive)
            Reporter: ihara Shuichi Ihara (Inactive)
            Votes: 0
            Watchers: 1
