
lustre-initialization: Operation not supported while trying to set fs label, tune2fs 1.47.0-wc1

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Minor
    • Fix Version/s: Lustre 2.16.0
    • Affects Version/s: Lustre 2.16.0
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9223372036854775807

    Description

      This issue was created by maloo for Andreas Dilger <adilger@whamcloud.com>

      This issue relates to the following test suite runs:
      https://testing.whamcloud.com/test_sets/1aec9220-17f7-4e40-831b-e742497b94e7
      https://testing.whamcloud.com/test_sets/f1b438e1-bcd1-448a-aaab-fce424899d3f
      https://testing.whamcloud.com/test_sets/b8ab97a5-4f83-4a85-918c-56f732236ba8

      lustre-initialization failed with the following error:

      2023-05-12T14:31:49 CMD: onyx-113vm10 mkdir -p /mnt/lustre-mds1; mount -t lustre -o localrecov  /dev/mapper/mds1_flakey /mnt/lustre-mds1
      2023-05-12T14:31:49    /mnt/lustre-mds1: Operation not supported while trying to set fs label
      2023-05-12T14:31:49    tune2fs 1.47.0-wc1 (28-Apr-2023)
      2023-05-12T14:31:54 CMD: onyx-113vm10 e2label /dev/mapper/mds1_flakey 				2>/dev/null | grep -E ':[a-zA-Z]{3}[0-9]{4}'
      2023-05-12T14:31:54 pdsh@onyx-91vm5: onyx-113vm10: ssh exited with exit code 1
      2023-05-12T14:31:54 Commit the device label on /dev/lvm-Role_MDS/P1
      2023-05-12T14:31:54 CMD: onyx-113vm10 sync; sleep 1; sync
      2023-05-12T14:31:59 CMD: onyx-113vm10 e2label /dev/mapper/mds1_flakey 2>/dev/null
      2023-05-12T14:31:59 pdsh@onyx-91vm5: onyx-113vm10: ssh exited with exit code 1
      2023-05-12T14:31:59 no label for /dev/mapper/mds1_flakey
      

      Test session details:
      clients: https://build.whamcloud.com/job/lustre-master-next/703 - 4.18.0-372.9.1.el8.aarch64
      servers: https://build.whamcloud.com/job/lustre-master-next/703 - 4.18.0-372.32.1.el8_lustre.x86_64

      This may be related to the update to e2fsprogs-1.47.0-wc1, which may have added support for the new ioctl() calls to get/set the filesystem label (FS_IOC_GETFSLABEL and FS_IOC_SETFSLABEL), but kernel support for these was added only in v5.16-rc4-36-gbbc605cdb1e1. It may be that the fallback for older kernels is not working?
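For illustration, an "ioctl first, fall back on any failure" label lookup could look roughly like the sketch below. This is not the tune2fs code: `try_get_label_ioctl()` is a hypothetical helper, and the choice to treat every failure (including a failed open of the mount point) as "use the old method" is an assumption made for the sketch. `FS_IOC_GETFSLABEL` and `FSLABEL_MAX` come from `<linux/fs.h>`; the guarded definitions only cover older headers.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/* Older kernel headers may lack these; values match <linux/fs.h>. */
#ifndef FSLABEL_MAX
#define FSLABEL_MAX 256
#endif
#ifndef FS_IOC_GETFSLABEL
#define FS_IOC_GETFSLABEL _IOR(0x94, 49, char[FSLABEL_MAX])
#endif

/*
 * Hypothetical helper: try to read the label of a mounted filesystem
 * through the ioctl.  Returns 0 with the label in buf on success, or 1
 * if the caller should fall back to reading the superblock directly.
 */
static int try_get_label_ioctl(const char *mntpt, char buf[FSLABEL_MAX])
{
	int fd, rc;

	assert(mntpt != NULL && buf != NULL);

	fd = open(mntpt, O_RDONLY);
	if (fd < 0)
		return 1;	/* cannot open the mount point: fall back */

	memset(buf, 0, FSLABEL_MAX);
	rc = ioctl(fd, FS_IOC_GETFSLABEL, buf);
	close(fd);

	/* ENOTTY, EOPNOTSUPP or anything else: use the old method. */
	return rc ? 1 : 0;
}
```

A caller would check the return value and, on 1, read the label from the on-disk superblock instead of reporting an error.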

      VVVVVVV DO NOT REMOVE LINES BELOW, Added by Maloo for auto-association VVVVVVV
      lustre-initialization lustre-initialization - "lustre-initialization timed out"

          Activity

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/55008
            Subject: LU-16835 tune2fs: reset ioctl error for old filesystem
            Project: tools/e2fsprogs
            Branch: master-lustre-test
            Current Patch Set: 1
            Commit: 3b78035336f0863d106fc29f008f336354c97569

            pjones Peter Jones added a comment -

            Seems to all be merged for 2.16

            "Li Dongyang <dongyangli@ddn.com>" merged in patch https://review.whamcloud.com/c/tools/e2fsprogs/+/51073/
            Subject: LU-16835 tune2fs: fall back to old get/set fs label on error
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set:
            Commit: c6af13873a5b1102126fcedb93abeda93132e3ef

            "Andreas Dilger <adilger@whamcloud.com>" merged in patch https://review.whamcloud.com/c/fs/lustre-release/+/51072/
            Subject: LU-16835 target: server_ioctl() should return ENOTTY
            Project: fs/lustre-release
            Branch: master
            Current Patch Set:
            Commit: aadc6de18ed9af8f26cc829faa5643f8b4f0bba7

            gerrit Gerrit Updater added a comment - - edited

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51081
            Subject: LU-16835 revert: "LU-137 osd-ldiskfs: pass through resize ioctl"
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 973446ff6ade4d01612bbbf9d0cfe0ab7ec3fc13

            "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51073
            Subject: LU-16835 tune2fs: fall back to old get/set fs label on error
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set: 1
            Commit: af2a97f298bdda26f9b900ae53b76e779e957872

            "Li Dongyang <dongyangli@ddn.com>" uploaded a new patch: https://review.whamcloud.com/c/fs/lustre-release/+/51072
            Subject: LU-16835 target: server_ioctl() should return ENOTTY
            Project: fs/lustre-release
            Branch: master
            Current Patch Set: 1
            Commit: 0f25e703321336f82e0a6468bce7cb08051c3fb0

            dongyang Dongyang Li added a comment - - edited

            I used bpftrace, and it confirms that ioctl(FS_IOC_GETFSLABEL) does return EOPNOTSUPP, as the error message suggested.
            I was struggling to see where the EOPNOTSUPP comes from, since at the ext4 level we return either ENOTTY or ENOIOCTLCMD, and the ioctl syscall level sets the return value to ENOTTY when it sees ENOIOCTLCMD.
            Then I used a combination of trace-cmd and bpftrace, and it turns out the ioctl lands on the mount point and goes down to server_ioctl() provided by Lustre, and server_ioctl() returns EOPNOTSUPP.

            Adding an EOPNOTSUPP check in tune2fs will work, but I doubt upstream will like it. server_ioctl() should really just return ENOTTY, and e2label will then fall back to the old way of setting/getting the label.

            Now, ATM-2790 actually shows a different issue: handle_fslabel() in tune2fs just quits if it fails to open the mount point. I think in this case we should just fall back to the old method as well, and no com_err is needed.
            I will prepare the patches and send the tune2fs patch upstream.

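The error path described above can be modelled in a few lines. This is a simplified sketch, not kernel or tune2fs code: `model_syscall_result()`, `caller_falls_back()` and `check_scenarios()` are hypothetical names, and the assumption that the caller falls back only on ENOTTY is inferred from the discussion above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * ENOIOCTLCMD is a kernel-internal code (515 in include/linux/errno.h)
 * that user space never sees; it is defined here only for this model.
 */
#ifndef ENOIOCTLCMD
#define ENOIOCTLCMD 515
#endif

/* Model of the syscall layer: only -ENOIOCTLCMD is rewritten to -ENOTTY. */
static int model_syscall_result(int handler_ret)
{
	return handler_ret == -ENOIOCTLCMD ? -ENOTTY : handler_ret;
}

/* Hypothetical strict caller: falls back only if the ioctl is unknown. */
static bool caller_falls_back(int syscall_ret)
{
	return syscall_ret == -ENOTTY;
}

/* The three handler results mentioned in the comment above. */
static void check_scenarios(void)
{
	assert(caller_falls_back(model_syscall_result(-ENOIOCTLCMD)));
	assert(caller_falls_back(model_syscall_result(-ENOTTY)));
	/* A handler returning -EOPNOTSUPP defeats the fallback. */
	assert(!caller_falls_back(model_syscall_result(-EOPNOTSUPP)));
}
```

In this model, a handler that returns -EOPNOTSUPP is passed through unchanged, so a caller that only checks for ENOTTY reports an error instead of falling back.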
            gerrit Gerrit Updater added a comment - - edited

            "Andreas Dilger <adilger@whamcloud.com>" uploaded a new patch: https://review.whamcloud.com/c/tools/e2fsprogs/+/51068
            Subject: LU-16835 tune2fs: reset ioctl error for old filesystem
            Project: tools/e2fsprogs
            Branch: master-lustre
            Current Patch Set: 1
            Commit: ea443cd29253837b82625de8bfe9147cc60c45a3

            It looks like this error might be coming from the following code in ext4_ioctl():

                    /*
                     * If any checksums (group descriptors or metadata) are being used
                     * then the checksum seed feature is required to change the UUID.
                     */
                    if (((ext4_has_feature_gdt_csum(sb) || ext4_has_metadata_csum(sb))
                                    && !ext4_has_feature_csum_seed(sb))
                            || ext4_has_feature_stable_inodes(sb))
                            return -EOPNOTSUPP;
            
            

            This check triggers because we do not enable EXT4_FEATURE_INCOMPAT_CSUM_SEED on any Lustre filesystems, as we also do not handle the METADATA_CSUM feature. We do enable GDT_CSUM, however, which seems to be tripping this check. Either way, tune2fs should just ignore any such error and fall back to modifying the superblock directly in that case.

            It looks like the bug is at:

            +
            +               ret = -1;
            +#ifdef __linux__
            +               if (fsuuid) {
            +                       fsuuid->fsu_len - UUID_SIZE;
            +                       fsuuid->fsu_flags = 0;
            +                       memcpy(&fsuuid->fsu_uuid, new_uuid, UUID_SIZE);
            +                       ret = ioctl(fd, EXT4_IOC_SETFSUUID, fsuuid);
            +               }
            +#endif
            +               ret = -1;
            +               /*
            +                * If we can't set the UUID via the ioctl, fall
            +                * back to directly modifying the superblock
            +                .*/
            +               if (ret) {
            +                       memcpy(sb->s_uuid, new_uuid, UUID_SIZE);
            +                       ext2fs_init_csum_seed(fs);
            +                       if (set_csum) {
            +                               for (i = 0; i < fs->group_desc_count; i++)
            +                                       ext2fs_group_desc_csum_set(fs, i);
            +                               fs->flags &= ~EXT2_FLAG_SUPER_ONLY;
            +                       }
            +                       ext2fs_mark_super_dirty(fs);
                            }
            

            In the fallback case (on Linux, with a newer kernel that has EXT4_IOC_SETFSUUID but a filesystem without the CSUM_SEED feature), it does not clear "ret", so the "errno = EOPNOTSUPP" returned from the kernel is still hit.

            adilger Andreas Dilger added the comment above.
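The "fall back and reset the error" idea from the analysis above might be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `struct sketch_super`, `set_uuid_fn`, `ioctl_not_supported()` and `set_uuid_with_fallback()` are hypothetical stand-ins for the real superblock, the EXT4_IOC_SETFSUUID call, and the tune2fs logic.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define UUID_SIZE 16

/* Hypothetical stand-in for the superblock fields used here. */
struct sketch_super {
	unsigned char s_uuid[UUID_SIZE];
	int dirty;
};

/* Hypothetical stand-in for ioctl(fd, EXT4_IOC_SETFSUUID, ...). */
typedef int (*set_uuid_fn)(const unsigned char *uuid);

/* Simulates a kernel that knows the ioctl but refuses it. */
static int ioctl_not_supported(const unsigned char *uuid)
{
	(void)uuid;
	errno = EOPNOTSUPP;
	return -1;
}

static int set_uuid_with_fallback(struct sketch_super *sb,
				  const unsigned char *new_uuid,
				  set_uuid_fn try_ioctl)
{
	int ret = -1;

	assert(sb != NULL && new_uuid != NULL);
	if (try_ioctl)
		ret = try_ioctl(new_uuid);

	if (ret) {
		/* Fall back to modifying the superblock directly. */
		memcpy(sb->s_uuid, new_uuid, UUID_SIZE);
		sb->dirty = 1;
		/* The fallback handled it: drop the stale ioctl error. */
		ret = 0;
		errno = 0;
	}
	return ret;
}
```

The point of the sketch is the last two assignments: once the direct superblock update has succeeded, neither the return value nor errno from the failed ioctl should reach the caller.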

            People

              Assignee: dongyang Dongyang Li
              Reporter: maloo Maloo