[LU-6724] Downgrading from 2.8 with DNE2 patches to 2.5 servers fails: unsupported read-only filesystem feature(s) 2

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.8.0
    • Lustre 2.8.0
    • Lustre 2.5.3 servers plus latest Lustre 2.8 with DNE2 patches.
    • 3

    Description

      This morning I updated to the latest vanilla master and ended up in a state where I could not mount the file system. I then tried migrating back to Lustre 2.5, and when I attempted to mount the file system I got these errors:

      [ 1025.018232] Lustre: Lustre: Build Version: 2.5.4--CHANGED-2.6.32-431.29.2.el6.atlas.x86_64
      [ 1062.767104] LDISKFS-fs (dm-7): recovery complete
      [ 1062.791599] LDISKFS-fs (dm-7): mounted filesystem with ordered data mode. quota=on. Opts:
      [ 1063.389770] LustreError: 24686:0:(ofd_fs.c:594:ofd_server_data_init()) sultan-OST0000: unsupported read-only filesystem feature(s) 2
      [ 1063.412582] LustreError: 24686:0:(obd_config.c:572:class_setup()) setup sultan-OST0000 failed (-22)
      [ 1063.421818] LustreError: 24686:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.37.248.67@o2ib1: cfg command failed: rc = -22
      [ 1063.433944] Lustre: cmd=cf003 0:sultan-OST0000 1:dev 2:0 3:f
      [ 1063.440506] LustreError: 15b-f: MGC10.37.248.67@o2ib1: The configuration from log 'sultan-OST0000'failed from the MGS (-22). Make sure this client and the MGS are running compatible versions of Lustre.
      [ 1063.458690] LustreError: 15c-8: MGC10.37.248.67@o2ib1: The configuration from log 'sultan-OST0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      [ 1063.482398] LustreError: 24600:0:(obd_mount_server.c:1254:server_start_targets()) failed to start server sultan-OST0000: -22
      [ 1063.493822] LustreError: 24600:0:(obd_mount_server.c:1737:server_fill_super()) Unable to start targets: -22
      [ 1063.503768] LustreError: 24600:0:(obd_mount_server.c:847:lustre_disconnect_lwp()) sultan-MDT0000-lwp-OST0000: Can't end config log sultan-client.
      [ 1063.516947] LustreError: 24600:0:(obd_mount_server.c:1422:server_put_super()) sultan-OST0000: failed to disconnect lwp. (rc=-2)
      [ 1063.528574] LustreError: 24600:0:(obd_config.c:619:class_cleanup()) Device 3 not setup
      [ 1063.539611] Lustre: server umount sultan-OST0000 complete
      [ 1063.545093] LustreError: 24600:0:(obd_mount.c:1330:lustre_fill_super()) Unable to mount /dev/mapper/sultan-ddn-l0 (-22)
      [ 1070.949382] LDISKFS-fs (dm-6): recovery complete
      [ 1070.956045] LDISKFS-fs (dm-6): mounted filesystem with ordered data mode. quota=on. Opts:
      [ 1071.472962] LustreError: 24982:0:(ofd_fs.c:594:ofd_server_data_init()) sultan-OST0004: unsupported read-only filesystem feature(s) 2
      [ 1071.495949] LustreError: 24982:0:(obd_config.c:572:class_setup()) setup sultan-OST0004 failed (-22)
      [ 1071.505140] LustreError: 24982:0:(obd_config.c:1629:class_config_llog_handler()) MGC10.37.248.67@o2ib1: cfg command failed: rc = -22
      

          Activity

            simmonsja James A Simmons added a comment:

            Oh, this is an old ticket. This problem doesn't exist anymore. Well, you do still see the issue with multi-slot, but there is a workaround. For those who don't know: unmount your 2.8 file system, remount it with recovery aborted, then unmount it again. After that you can reboot into your 2.5 image and remount with no problem. This clears the multi-slot support from your config logs. Other than that, we haven't seen migration issues.

            paf Patrick Farrell (Inactive) added a comment:

            A detailed description of the corruption and the patch that makes it require intervention are in LU-6050.
            https://jira.hpdd.intel.com/browse/LU-6050
            http://review.whamcloud.com/#/c/13516/

            adilger Andreas Dilger added a comment:

            The index_in_idif file will not be created in /proc if the filesystem was formatted with 2.7 or later, or if it was enabled at runtime on an older filesystem that was upgraded, since it is not possible to disable it after it is set.

            Is there a chance that this was set manually, or that the filesystem was mounted as 2.7+ the first time after formatting? The only other possibility I can think of is that the last_rcvd file was deleted and recreated on 2.7+, in which case this feature would be set when the last_rcvd file is created.

            simmonsja James A Simmons added a comment:

            I just took a look and index_in_idif does not exist in my proc tree. Looking at the code in osd_prepare(), this means either lsd_feature_rocompat is being set with the wrong flag or the OBD_OCD_VERSION macro is broken.
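For readers unfamiliar with the version comparison mentioned above, here is a minimal sketch of how a four-part version number can be packed into a single integer and compared. It assumes one byte per component (major, minor, patch, fix). `PACK_VERSION` and `is_2_7_or_later` are hypothetical names used for illustration only and are not taken from the Lustre source or from osd_prepare().

```c
#include <stdint.h>

/*
 * Hypothetical packing macro: one byte per version component.
 * This mirrors the general idea of a packed version word; the name
 * and layout here are assumptions, not the Lustre definition.
 */
#define PACK_VERSION(major, minor, patch, fix)                  \
        (((uint32_t)(major) << 24) + ((uint32_t)(minor) << 16) + \
         ((uint32_t)(patch) << 8) + (uint32_t)(fix))

/* Return 1 if the packed version is 2.7.0.0 or newer, 0 otherwise. */
static int is_2_7_or_later(uint32_t version)
{
        return version >= PACK_VERSION(2, 7, 0, 0);
}
```

With this layout, a packed 2.5.3 compares lower than a packed 2.7.0, so a check of this kind would only misfire if the packing or the comparison itself were wrong.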

            adilger Andreas Dilger added a comment:

            It might make sense for reading the index_in_idif file to print a message if the feature hasn't been enabled yet:

            0
            Writing a non-zero value to this file means you cannot downgrade below 2.7.0
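The suggestion above could be sketched roughly as follows, in plain userspace C rather than the kernel seq_file interface. `index_in_idif_show` and its parameters are hypothetical; only the warning text is taken from the comment.

```c
#include <stdio.h>

/*
 * Hypothetical formatter for reading index_in_idif.
 * If the feature is enabled, print "1". Otherwise print "0" followed
 * by the downgrade warning proposed in the comment above.
 * Returns the value returned by snprintf().
 */
static int index_in_idif_show(char *buf, size_t len, int enabled)
{
        if (enabled)
                return snprintf(buf, len, "1\n");

        return snprintf(buf, len,
                        "0\n"
                        "Writing a non-zero value to this file means "
                        "you cannot downgrade below 2.7.0\n");
}
```

A caller would pass a buffer and the current state of the flag, for example `index_in_idif_show(buf, sizeof(buf), 0)`, and then print `buf`.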

            adilger Andreas Dilger added a comment:

            Copying comment from LU-6663, where this issue was first reported.

            The unknown read-only feature "2" appears to be OBD_ROCOMPAT_IDX_IN_IDIF. That feature shouldn't be enabled automatically; it needs active participation from the administrator:

            static ssize_t
            ldiskfs_osd_index_in_idif_seq_write(struct file *file, const char *buffer,
                                                size_t count, loff_t *off)
            {
                            LCONSOLE_WARN("%s: OST-index in IDIF has been enabled, "
                                          "it cannot be reverted back.\n", osd_name(dev));
                            return -EPERM;

            James, did you set the index_in_idif feature in /proc, or is there a bug here that needs to be filed? Looking at the code, it doesn't appear that this flag could have been set automatically.
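To illustrate why an older server refuses to mount once this flag is on disk, here is a minimal sketch of a read-only-compatible feature check. The value 0x2 for the index-in-IDIF flag comes from the error message and the comment above. The macro names, the supported-feature masks, and `check_rocompat` are assumptions made for this example and are not copied from the Lustre source.

```c
#include <stdint.h>
#include <stdio.h>

/* The "2" reported in the error message (index-in-IDIF flag). */
#define ROCOMPAT_IDX_IN_IDIF 0x00000002u

/* Assumed masks: an older server knows no such flag, a newer one does. */
#define ROCOMPAT_SUPP_OLD 0x00000000u
#define ROCOMPAT_SUPP_NEW (ROCOMPAT_IDX_IN_IDIF)

/*
 * Return 0 if every flag stored on disk is in the supported mask,
 * or -22 (-EINVAL, as seen in the log) if any unknown flag is set.
 */
static int check_rocompat(uint32_t on_disk, uint32_t supported)
{
        uint32_t unknown = on_disk & ~supported;

        if (unknown != 0) {
                fprintf(stderr,
                        "unsupported read-only filesystem feature(s) %x\n",
                        unknown);
                return -22;
        }
        return 0;
}
```

Under these assumptions, `check_rocompat(0x2, ROCOMPAT_SUPP_OLD)` fails with -22 while the same on-disk value passes against `ROCOMPAT_SUPP_NEW`, which matches the behaviour reported in this ticket: the target mounts on the newer servers and is rejected on 2.5.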

            People

              laisiyao Lai Siyao
              simmonsja James A Simmons