Lustre / LU-13070

mdd_orphan_destroy loop caused by compatibility issue on upgrades to 2.11 or later


    Description

      While investigating a customer issue, we found that the original trigger for the problem is a compatibility issue between Lustre 2.11 and older Lustre versions. The code introduced by LU-7787 to "clean up orphan object handling" was incomplete. Lustre 2.11 changed the format of orphan names in the PENDING directory, and mdd_orphan_destroy() in Lustre 2.11 does not recognize the old-format names, which leads to an endless loop. mdd_orphan_delete() contains a check for the old-format name, but that check was not included in mdd_orphan_destroy().

      Here is the relevant code segment from mdd_orphan_delete():

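      /* Try the current (2.11+) key format first. */
      rc = dt_delete(env, mdd->mdd_orphans, key, th);
      if (rc == -ENOENT) {
          /* Not found: retry with the old-format orphan name. */
          key = mdd_orphan_key_fill_20(env, mdo2fid(obj));
          rc = dt_delete(env, mdd->mdd_orphans, key, th);
      }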

      This same ENOENT sequence should be included in mdd_orphan_destroy().
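
      The fallback pattern can be illustrated outside the kernel with a toy index. This is only a sketch under stated assumptions: struct orphan_index, index_insert(), index_delete(), key_fill_new(), key_fill_old(), orphan_remove() and the "new-"/"old-" name formats are all hypothetical stand-ins, not the Lustre API or the real PENDING name formats. It shows only the control flow: try the current key, and on -ENOENT retry with the old-format key.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 8
#define KEY_LEN  32

/* Toy stand-in for the PENDING directory index (hypothetical). */
struct orphan_index {
    char keys[MAX_KEYS][KEY_LEN];
    int  used[MAX_KEYS];
};

/* Hypothetical key builders; the real name formats are NOT shown here. */
static void key_fill_new(char *buf, size_t len, unsigned id)
{
    snprintf(buf, len, "new-%x", id);
}

static void key_fill_old(char *buf, size_t len, unsigned id)
{
    snprintf(buf, len, "old-%x", id);
}

/* Store a key in the first free slot; -ENOSPC if the index is full. */
static int index_insert(struct orphan_index *idx, const char *key)
{
    assert(idx != NULL);
    for (int i = 0; i < MAX_KEYS; i++) {
        if (!idx->used[i]) {
            snprintf(idx->keys[i], KEY_LEN, "%s", key);
            idx->used[i] = 1;
            return 0;
        }
    }
    return -ENOSPC;
}

/* Remove a key; -ENOENT if it is not present. */
static int index_delete(struct orphan_index *idx, const char *key)
{
    assert(idx != NULL);
    for (int i = 0; i < MAX_KEYS; i++) {
        if (idx->used[i] && strcmp(idx->keys[i], key) == 0) {
            idx->used[i] = 0;
            return 0;
        }
    }
    return -ENOENT;
}

/* Delete an orphan entry, falling back to the old-format name. */
static int orphan_remove(struct orphan_index *idx, unsigned id)
{
    char key[KEY_LEN];
    int rc;

    key_fill_new(key, sizeof(key), id);
    rc = index_delete(idx, key);
    if (rc == -ENOENT) {
        /* Entry may have been created before the upgrade. */
        key_fill_old(key, sizeof(key), id);
        rc = index_delete(idx, key);
    }
    return rc;
}
```

      In this sketch, an entry stored under the old-format name is still removed by orphan_remove(), and -ENOENT is returned only when neither name exists, so a caller that retries until the entry is gone can terminate.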

      LU-11418 appears to address this problem, but it removes the symptoms, not the root cause.


            People

              artem_blagodarenko Artem Blagodarenko (Inactive)