[LU-4293] lfs_migrate is failing with a volatile file Operation not permitted error

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.1
    • Affects Version/s: Lustre 2.4.0, Lustre 2.4.1
    • None
    • Environment: Lustre 2.4.1 RHEL6 2.6.32-358.18.1.el6_lustre.x86_64

    Description

      "lfs_migrate -y" aborts with this error

      cannot swap layouts between <filename> and a volatile file (Operation not permitted)

      This seems to happen for all files. The lfs_migrate operation aborts on the first file.

          Activity

            pjones Peter Jones added a comment -

            Landed for 2.5.1 and 2.6

            bogl Bob Glossman (Inactive) added a comment - backport to b2_5: http://review.whamcloud.com/9278
            aalba6675 Anthony Alba added a comment -

            I have also observed this on a filesystem created with 2.1.x and migrated to 2.4.2.
            Exactly the same error message but this happens only with some directories.
            Some directories lfs_migrate'd perfectly.

            bfaccini Bruno Faccini (Inactive) added a comment - - edited

            My patch at http://review.whamcloud.com/8737 now seems to address the original need for this ticket (allow legitimate IGIFs, handle the root special case, ...); I only have to answer the reviewers' comments.

            yujian Jian Yu added a comment -

            Patch http://review.whamcloud.com/8616 was cherry-picked to Lustre b2_5 branch.


            adilger Andreas Dilger added a comment -

            I wonder if the -EPERM error seen on non-IGIF files is due to some file ownership problem like LU-3826 or similar?

                    if ((attr1->la_uid != attr2->la_uid) ||
                        (attr1->la_gid != attr2->la_gid))
                            RETURN(-EPERM);

            This check will fail if lfs_migrate does not create the temporary file with the same ownership as the original file. If lfs_migrate is running as root, then it should be able to fchown() the file after it is created. The actual permissions don't matter, because the temporary file will be deleted, but matching ownership is proof that the caller of the migrate has permission to do this swap.
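
A minimal sketch of the ownership idea discussed above, not the actual Lustre or lfs_migrate code. The struct `owner_attr` and the functions `owner_check()` and `copy_owner()` are hypothetical names introduced for illustration; only `fstat()` and `fchown()` are real POSIX calls. It shows the uid/gid comparison in isolation and how a caller with sufficient privilege could give a temporary file the same owner and group as the source before a swap.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the two attribute fields the quoted check reads. */
struct owner_attr {
	uid_t la_uid;
	gid_t la_gid;
};

/* Mirrors the quoted comparison: a differing uid or gid yields -EPERM. */
int owner_check(const struct owner_attr *a1, const struct owner_attr *a2)
{
	if (a1->la_uid != a2->la_uid || a1->la_gid != a2->la_gid)
		return -EPERM;
	return 0;
}

/*
 * Give the temporary file (tmp_fd) the owner and group of the source file
 * (src_fd). Returns 0 on success or a negative errno value. An unprivileged
 * caller can generally only do this when it already owns the file.
 */
int copy_owner(int src_fd, int tmp_fd)
{
	struct stat st;

	if (fstat(src_fd, &st) < 0)
		return -errno;
	if (fchown(tmp_fd, st.st_uid, st.st_gid) < 0)
		return -errno;
	return 0;
}
```

Calling `copy_owner()` on the temporary file before requesting the swap would make a comparison like `owner_check()` pass when the caller is root or already owns both files.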

            bfaccini Bruno Faccini (Inactive) added a comment -

            Patch to allow layout swap for IGIF file is at http://review.whamcloud.com/8737.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Hmm, thanks. I understand I should have read the LU-4392 sub-task and learned more about LFSCK behavior before asking, sorry!

            So now, do we really need to detect files that LFSCK has wrongly assigned an IGIF (to be fixed in the LU-4392 sub-task)? If not, the fix for this ticket's original issue could simply be to add a fid_is_igif() test for both files having their layouts swapped, in mdd_layout_swap_allowed().
            I may be missing some special cases about files with IGIFs here, since you wrote about "internal system files by using their (internal) IGIF FIDs"?

            I will also work on the MDT volatile object leak upon layout swap failure, maybe as part of a new ticket.
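
A rough sketch of the kind of FID test proposed above, written as standalone C rather than as the actual patch to mdd_layout_swap_allowed(). The struct `sketch_fid`, the `SKETCH_SEQ_*` constants, and all function names are hypothetical stand-ins; the sequence ranges are assumptions about where IGIF and normal FIDs live and should be checked against the Lustre headers. The -EPERM return value is likewise an assumption, chosen to match the error reported in this ticket.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for a Lustre FID (illustration only). */
struct sketch_fid {
	uint64_t f_seq;	/* sequence number */
	uint32_t f_oid;	/* object id within the sequence */
	uint32_t f_ver;	/* version */
};

/* Sequence ranges assumed for this sketch; verify against the Lustre headers. */
#define SKETCH_SEQ_IGIF		12ULL
#define SKETCH_SEQ_IGIF_MAX	0x0ffffffffULL
#define SKETCH_SEQ_NORMAL	0x200000400ULL

/* IGIF FIDs (files created before FIDs existed) fall in a low sequence range. */
int sketch_fid_is_igif(const struct sketch_fid *fid)
{
	return fid->f_seq >= SKETCH_SEQ_IGIF &&
	       fid->f_seq <= SKETCH_SEQ_IGIF_MAX;
}

/* Normal FIDs start at a fixed sequence and extend upward. */
int sketch_fid_is_norm(const struct sketch_fid *fid)
{
	return fid->f_seq >= SKETCH_SEQ_NORMAL;
}

/* Accept the swap only if both FIDs are either normal or IGIF. */
int sketch_swap_fid_allowed(const struct sketch_fid *fid1,
			    const struct sketch_fid *fid2)
{
	if (!(sketch_fid_is_norm(fid1) || sketch_fid_is_igif(fid1)) ||
	    !(sketch_fid_is_norm(fid2) || sketch_fid_is_igif(fid2)))
		return -EPERM;
	return 0;
}
```

Under these assumptions, a file carrying an IGIF and a newly created volatile file with a normal FID would both pass the test, while reserved or internal sequences would still be rejected.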

            adilger Andreas Dilger added a comment -

            I created the IGIF files under 1.8 and upgraded to 2.x. You could also get the same effect by mounting a 2.4 MDT as ldiskfs, deleting the "lma" xattr, then remounting and running LFSCK to fix the OI.

            bfaccini Bruno Faccini (Inactive) added a comment -

            Sorry to be late, but I am back on this one.

            Andreas, sorry to ask, but can you explain to me how the files created in the MDT root directory get an IGIF assigned?

            I also confirm that, as part of LU-3834 and while using fault injection during layout swap to verify patch behavior, I reproduce the volatile object leak on the MDT (the inode link count is 1 and e2fsck detects "Unattached inode"). In my case, for one forced layout swap error, I see one orphan inode with a ".^L^S^T^R:VOLATILE"/LUSTRE_VOLATILE_HDR linkEA but also one with "i_am_nobody". Did you also find this?

            In any case, this clearly indicates that there is something to address and fix upon layout swap error.

            People

              bfaccini Bruno Faccini (Inactive)
              wbaudler Wolfgang Baudler
              Votes: 0
              Watchers: 11