
Hit unsupported incompat filesystem feature error after downgrade system from 2.6 to 2.5.0

Details

    • Bug
    • Resolution: Fixed
    • Blocker
    • Lustre 2.6.0, Lustre 2.5.2
    • Lustre 2.4.0, Lustre 2.5.0, Lustre 2.6.0
    • 3
    • 12019

    Description

      Before the upgrade, the server and client were running 2.5.0 (ldiskfs). Upgrading the whole system to lustre-master build #1791 passed. After downgrading the system to 2.5.0 again, mounting the OST failed with the following error:

      OST console shows:

      Lustre: DEBUG MARKER: == upgrade-downgrade Lustre version and system information == 11:31:49 (1386963109)
      Lustre: Lustre: Build Version: 2.5.0-RC1--PRISTINE-2.6.32-358.18.1.el6_lustre.x86_64
      LNet: Added LNI 10.10.19.53@tcp [8/256/0/180]
      LNet: Accept secure, port 988
      Lustre: DEBUG MARKER: == upgrade-downgrade End == 11:31:52 (1386963112)
      LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts: 
      LustreError: 33847:0:(ofd_fs.c:588:ofd_server_data_init()) lustre-OST0000: unsupported incompat filesystem feature(s) 10
      LustreError: 33847:0:(obd_config.c:572:class_setup()) setup lustre-OST0000 failed (-22)
      LustreError: 33847:0:(obd_config.c:1591:class_config_llog_handler()) MGC10.10.19.62@tcp: cfg command failed: rc = -22
      Lustre:    cmd=cf003 0:lustre-OST0000  1:dev  2:0  3:f  
      LustreError: 15b-f: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
      LustreError: 15c-8: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      LustreError: 33730:0:(obd_mount_server.c:1257:server_start_targets()) failed to start server lustre-OST0000: -22
      LustreError: 33730:0:(obd_mount_server.c:1732:server_fill_super()) Unable to start targets: -22
      LustreError: 33730:0:(obd_mount_server.c:848:lustre_disconnect_lwp()) lustre-MDT0000-lwp-OST0000: Can't end config log lustre-client.
      LustreError: 33730:0:(obd_mount_server.c:1426:server_put_super()) lustre-OST0000: failed to disconnect lwp. (rc=-2)
      LustreError: 33730:0:(obd_config.c:619:class_cleanup()) Device 3 not setup
      Lustre: server umount lustre-OST0000 complete
      LustreError: 33730:0:(obd_mount.c:1311:lustre_fill_super()) Unable to mount /dev/sdb1 (-22)
      [root@wtm-88 ~]# 
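The value in the "unsupported incompat filesystem feature(s)" message can be decoded by hand. The sketch below assumes the value is printed in hexadecimal and that the OBD_INCOMPAT_* bits have the values listed in the table. Both points are assumptions to verify against the Lustre headers for the release in question, and `decode_incompat` is a hypothetical helper, not a Lustre tool. It also maps the -22 return code to its errno name.

```python
import errno

# Assumed bit values for the OBD_INCOMPAT_* flags (illustrative only;
# confirm against the Lustre source tree before relying on them).
INCOMPAT_FLAGS = {
    0x01: "OBD_INCOMPAT_GROUPS",
    0x02: "OBD_INCOMPAT_OST",
    0x04: "OBD_INCOMPAT_MDT",
    0x08: "OBD_INCOMPAT_COMMON_LR",
    0x10: "OBD_INCOMPAT_FID",
}


def decode_incompat(hex_text):
    """Turn the mask printed in the console message into flag names.

    Assumes the message prints the mask in hexadecimal.
    """
    mask = int(hex_text, 16)
    names = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if mask & bit]
    # Report any bits that are not in the assumed table above.
    unknown = mask & ~sum(INCOMPAT_FLAGS)
    if unknown:
        names.append(f"unknown bits {unknown:#x}")
    return names


# "feature(s) 10" from the OST console log
print(decode_incompat("10"))
# rc = -22 from the class_setup() failure
print(errno.errorcode[22])
```

Under these assumptions the mask names OBD_INCOMPAT_FID, which is the flag discussed in the comments on this ticket.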
      

          Activity

            [LU-4384] Hit unsupported incompat filesystem feature error after downgrade system from 2.6 to 2.5.0
            pjones Peter Jones added a comment -

            Landed for master. Fixes for maintenance branches will be tracked separately.

            tappro Mikhail Pershin added a comment -

            Patches to set the OBD_INCOMPAT_FID bit:

            master - http://review.whamcloud.com/9375
            b2_4 - http://review.whamcloud.com/9410
            b2_5 - http://review.whamcloud.com/9411

            The checking fix is needed only in master, and it has landed.

            adilger Andreas Dilger added a comment -

            Mike, the use of OBD_INCOMPAT_FID is completely separate from any client interoperability. It is needed to prevent old OST code from mounting the filesystem after it has started to create FID_SEQ_NORMAL objects, which 2.3 and older servers do not understand.
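The behaviour described in this comment, marking the OST as incompatible only once FID_SEQ_NORMAL objects exist, can be sketched as follows. This is a toy model, not Lustre code: `OstServerData` and `note_object_created` are hypothetical names, and the numeric values of `FID_SEQ_NORMAL` and `OBD_INCOMPAT_FID` are assumptions.

```python
# Assumed values, for illustration only.
FID_SEQ_NORMAL = 0x200000400   # assumed start of the normal FID sequence range
OBD_INCOMPAT_FID = 0x10        # assumed bit value


class OstServerData:
    """Toy stand-in for the on-disk server data; not a Lustre structure."""

    def __init__(self, feature_incompat=0):
        self.feature_incompat = feature_incompat
        self.writes = 0  # how many times the record would be written back

    def note_object_created(self, seq):
        """Set the incompat bit the first time a normal-sequence object appears."""
        already_set = bool(self.feature_incompat & OBD_INCOMPAT_FID)
        if seq >= FID_SEQ_NORMAL and not already_set:
            self.feature_incompat |= OBD_INCOMPAT_FID
            self.writes += 1


lsd = OstServerData()
lsd.note_object_created(0)                   # legacy sequence: flag stays clear
lsd.note_object_created(FID_SEQ_NORMAL + 1)  # first normal-sequence object
lsd.note_object_created(FID_SEQ_NORMAL + 2)  # flag already set, no extra write
print(hex(lsd.feature_incompat), lsd.writes)
```

Setting the bit lazily keeps an OST that has never created such objects mountable by older server code, which is the downgrade case this ticket is about.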

            tappro Mikhail Pershin added a comment -

            Andreas, if the server supports FID_SEQ_NORMAL and creates such objects, does that mean old clients will be incompatible? I had the impression that the client will work anyway, with the FID just being converted to OID/SEQ format. So I wonder whether we have an incompatible case here at all.

            adilger Andreas Dilger added a comment -

            I think all three issues are still blockers for 2.6, and #1 (backporting the fix to 2.4.3 and 2.5.1) is a blocker for those releases as well, so that users can downgrade again.

            jlevi Jodi Levi (Inactive) added a comment -

            OK, is it safe to reduce this from blocker at this point? Or do we need to continue tracking it as a 2.6 blocker until the remaining tasks are completed?

            adilger Andreas Dilger added a comment -

            I think some more work is still needed to make this handling correct:

            • add a patch for b2_4, b2_5, and master to add OBD_INCOMPAT_FID to OFD_INCOMPAT_SUPP
            • fix checking of OFD_INCOMPAT_SUPP in b2_5 and master
            • set OBD_INCOMPAT_FID on OSTs when FID_SEQ_NORMAL objects are created
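The effect of adding OBD_INCOMPAT_FID to the supported mask can be illustrated with a small bitmask check. The sketch assumes the flag values shown and assumes that the pre-fix supported set is GROUPS, OST and COMMON_LR; both are assumptions to confirm against the source, and `unsupported_incompat` is a hypothetical helper rather than the actual OFD code.

```python
# Assumed flag values (illustrative; confirm against the Lustre headers).
OBD_INCOMPAT_GROUPS = 0x01
OBD_INCOMPAT_OST = 0x02
OBD_INCOMPAT_COMMON_LR = 0x08
OBD_INCOMPAT_FID = 0x10

# Assumed supported masks before and after the first item in the list above.
SUPP_BEFORE = OBD_INCOMPAT_GROUPS | OBD_INCOMPAT_OST | OBD_INCOMPAT_COMMON_LR
SUPP_AFTER = SUPP_BEFORE | OBD_INCOMPAT_FID


def unsupported_incompat(on_disk, supported):
    """Return the on-disk incompat bits that the running code does not support."""
    return on_disk & ~supported


# An OST whose on-disk data had OBD_INCOMPAT_FID set by the newer release.
on_disk = SUPP_BEFORE | OBD_INCOMPAT_FID

print(f"{unsupported_incompat(on_disk, SUPP_BEFORE):x}")  # prints 10: mount refused
print(f"{unsupported_incompat(on_disk, SUPP_AFTER):x}")   # prints 0: mount allowed
```

With the assumed pre-fix mask the leftover bits come out as 10, the same value reported in the console log; once the flag is part of the supported mask nothing is left over and setup can proceed.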

            jlevi Jodi Levi (Inactive) added a comment -

            Can this ticket be closed now that change 8810 has landed, or is more work needed here?

            tappro Mikhail Pershin added a comment -

            http://review.whamcloud.com/8810

            Patch to remove the unconditional setting of OBD_INCOMPAT_FID for all types of filesystems. It is now set for the MDT only, as before.
            sarah Sarah Liu added a comment -

            Rolling downgrade also hit this problem on the OST:

            LNet: Added LNI 10.10.19.53@tcp [8/256/0/180]
            LNet: Accept secure, port 988
            LNet: 9095:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
            LNet: 9096:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
            Lustre: 48 MB is too small for debug buffer size, setting it to 128 MB.
            LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts: 
            LustreError: 9242:0:(ofd_fs.c:588:ofd_server_data_init()) lustre-OST0000: unsupported incompat filesystem feature(s) 10
            LustreError: 9242:0:(obd_config.c:572:class_setup()) setup lustre-OST0000 failed (-22)
            LustreError: 9242:0:(obd_config.c:1591:class_config_llog_handler()) MGC10.10.19.62@tcp: cfg command failed: rc = -22
            Lustre:    cmd=cf003 0:lustre-OST0000  1:dev  2:0  3:f  
            LustreError: 15b-f: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
            LustreError: 15c-8: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
            LustreError: 9125:0:(obd_mount_server.c:1257:server_start_targets()) failed to start server lustre-OST0000: -22
            LustreError: 9125:0:(obd_mount_server.c:1732:server_fill_super()) Unable to start targets: -22
            LustreError: 9125:0:(obd_mount_server.c:848:lustre_disconnect_lwp()) lustre-MDT0000-lwp-OST0000: Can't end config log lustre-client.
            LustreError: 9125:0:(obd_mount_server.c:1426:server_put_super()) lustre-OST0000: failed to disconnect lwp. (rc=-2)
            LustreError: 9125:0:(obd_config.c:619:class_cleanup()) Device 3 not setup
            Lustre: server umount lustre-OST0000 complete
            LustreError: 9125:0:(obd_mount.c:1311:lustre_fill_super()) Unable to mount /dev/sdb1 (-22)
            [root@wtm-88 ~]# 
            

            People

              tappro Mikhail Pershin
              sarah Sarah Liu