  Lustre / LU-13070

mdd_orphan_destroy loop caused by compatibility issue on upgrades to 2.11 or later

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: Lustre 2.11.0, Lustre 2.12.4
    • Fix Version/s: Lustre 2.14.0, Lustre 2.12.4
    • Labels:
    • Severity: 3

      Description

      While investigating a customer issue, we found that the original trigger for the problem is a compatibility issue between Lustre 2.11 and older Lustre versions. The code introduced by LU-7787 to "clean up orphan object handling" was incomplete. The format for names of orphans in the PENDING directory was changed in Lustre 2.11. mdd_orphan_destroy() in Lustre 2.11 does not recognize old-format names, which leads to an endless loop. mdd_orphan_delete() contains a check for the old-format name, but that check was not included in mdd_orphan_destroy().

      Here is the relevant code segment from mdd_orphan_delete():

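      /* First try the name generated in the current format. */
      rc = dt_delete(env, mdd->mdd_orphans, key, th);
      if (rc == -ENOENT) {
          /* Not found: retry with the old-format name. */
          key = mdd_orphan_key_fill_20(env, mdo2fid(obj));
          rc = dt_delete(env, mdd->mdd_orphans, key, th);
      }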

      The same -ENOENT fallback should be included in mdd_orphan_destroy().
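
      To make the fallback pattern concrete outside the Lustre tree, here is a minimal, self-contained sketch. It is not the actual patch: struct fid, struct pending_dir, key_fill_new(), key_fill_old(), pending_insert(), pending_delete(), pending_count() and orphan_delete_compat() are hypothetical stand-ins, and both name formats are invented for illustration. Only the control flow (try the current name, retry with the old name on -ENOENT) mirrors the segment above.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define PENDING_MAX 8
#define KEY_LEN     64

/* Hypothetical stand-in for a FID; not the real struct lu_fid. */
struct fid {
    unsigned long long seq;
    unsigned int oid;
    unsigned int ver;
};

/* Toy model of the PENDING directory: a fixed table of entry names. */
struct pending_dir {
    char names[PENDING_MAX][KEY_LEN];
    int used[PENDING_MAX];
};

/* Illustrative "new" name format; the real 2.11 format may differ. */
static void key_fill_new(char *buf, size_t len, const struct fid *f)
{
    snprintf(buf, len, "[0x%llx:0x%x:0x%x]", f->seq, f->oid, f->ver);
}

/* Illustrative "old" name format; the real pre-2.11 format may differ. */
static void key_fill_old(char *buf, size_t len, const struct fid *f)
{
    snprintf(buf, len, "%llx-%x-%x", f->seq, f->oid, f->ver);
}

/* Add an entry; returns 0 on success or -ENOSPC if the table is full. */
static int pending_insert(struct pending_dir *dir, const char *name)
{
    for (int i = 0; i < PENDING_MAX; i++) {
        if (!dir->used[i]) {
            snprintf(dir->names[i], KEY_LEN, "%s", name);
            dir->used[i] = 1;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Remove an entry by exact name; returns -ENOENT if it is not present. */
static int pending_delete(struct pending_dir *dir, const char *name)
{
    for (int i = 0; i < PENDING_MAX; i++) {
        if (dir->used[i] && strcmp(dir->names[i], name) == 0) {
            dir->used[i] = 0;
            return 0;
        }
    }
    return -ENOENT;
}

/* Count the entries still present in the table. */
static int pending_count(const struct pending_dir *dir)
{
    int n = 0;

    for (int i = 0; i < PENDING_MAX; i++)
        n += dir->used[i] ? 1 : 0;
    return n;
}

/* Delete the entry for a FID, retrying with the old name on -ENOENT. */
static int orphan_delete_compat(struct pending_dir *dir, const struct fid *f)
{
    char key[KEY_LEN];
    int rc;

    assert(dir != NULL && f != NULL);

    key_fill_new(key, sizeof(key), f);
    rc = pending_delete(dir, key);
    if (rc == -ENOENT) {
        /* Entry may have been created before the upgrade. */
        key_fill_old(key, sizeof(key), f);
        rc = pending_delete(dir, key);
    }
    return rc;
}
```

      In this toy model, an entry inserted under the old-format name is still removed by orphan_delete_compat(), whereas a caller that only tried key_fill_new() would get -ENOENT and leave the entry in place. A cleanup pass that keeps running until the directory is empty would then find the same entry on every pass.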

      LU-11418 appears to be an attempt to solve this problem, but it removes the symptoms rather than the root cause.

              People

              • Assignee: Artem Blagodarenko (artem_blagodarenko)
              • Reporter: Artem Blagodarenko (artem_blagodarenko)
