[LU-4384] Hit unsupported incompat filesystem feature error after downgrade system from 2.6 to 2.5.0

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: Blocker
    • Fix Version/s: Lustre 2.6.0, Lustre 2.5.2
    • Affects Version/s: Lustre 2.4.0, Lustre 2.5.0, Lustre 2.6.0
    • 3
    • 12019

    Description

      Before the upgrade, the servers and clients were running 2.5.0 with ldiskfs. Upgrading the whole system to lustre-master build #1791 passed. After downgrading the system to 2.5.0 again, mounting the OST failed with the following error:

      OST console shows:

      Lustre: DEBUG MARKER: == upgrade-downgrade Lustre version and system information == 11:31:49 (1386963109)
      Lustre: Lustre: Build Version: 2.5.0-RC1--PRISTINE-2.6.32-358.18.1.el6_lustre.x86_64
      LNet: Added LNI 10.10.19.53@tcp [8/256/0/180]
      LNet: Accept secure, port 988
      Lustre: DEBUG MARKER: == upgrade-downgrade End == 11:31:52 (1386963112)
      LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts: 
      LustreError: 33847:0:(ofd_fs.c:588:ofd_server_data_init()) lustre-OST0000: unsupported incompat filesystem feature(s) 10
      LustreError: 33847:0:(obd_config.c:572:class_setup()) setup lustre-OST0000 failed (-22)
      LustreError: 33847:0:(obd_config.c:1591:class_config_llog_handler()) MGC10.10.19.62@tcp: cfg command failed: rc = -22
      Lustre:    cmd=cf003 0:lustre-OST0000  1:dev  2:0  3:f  
      LustreError: 15b-f: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
      LustreError: 15c-8: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
      LustreError: 33730:0:(obd_mount_server.c:1257:server_start_targets()) failed to start server lustre-OST0000: -22
      LustreError: 33730:0:(obd_mount_server.c:1732:server_fill_super()) Unable to start targets: -22
      LustreError: 33730:0:(obd_mount_server.c:848:lustre_disconnect_lwp()) lustre-MDT0000-lwp-OST0000: Can't end config log lustre-client.
      LustreError: 33730:0:(obd_mount_server.c:1426:server_put_super()) lustre-OST0000: failed to disconnect lwp. (rc=-2)
      LustreError: 33730:0:(obd_config.c:619:class_cleanup()) Device 3 not setup
      Lustre: server umount lustre-OST0000 complete
      LustreError: 33730:0:(obd_mount.c:1311:lustre_fill_super()) Unable to mount /dev/sdb1 (-22)
      [root@wtm-88 ~]# 
      

          Activity

            adilger Andreas Dilger added a comment - I think all three issues are still blockers for 2.6, and #1 (backporting the fix to 2.4.3 and 2.5.1) is a blocker there as well, so that users can downgrade again.

            jlevi Jodi Levi (Inactive) added a comment - Ok, are we safe to reduce this from blocker at this point? Or do we need to continue to track as a 2.6 blocker until these remaining tasks are completed?

            adilger Andreas Dilger added a comment - I think some more work is still needed to make this handling correct:

            • add a patch for b2_4, b2_5, and master to add OBD_INCOMPAT_FID to OFD_INCOMPAT_SUPP
            • fix checking of OFD_INCOMPAT_SUPP in b2_5 and master
            • set OBD_INCOMPAT_FID on OSTs when FID_SEQ_NORMAL objects are created
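
            The third item in the list above (setting the flag on an OST only once FID_SEQ_NORMAL objects actually exist) might look roughly like the sketch below. This is not the Lustre implementation: the struct, the function name, and the EXAMPLE_FID_SEQ_NORMAL value are assumptions made for illustration, and only OBD_INCOMPAT_FID = 0x10 comes from this ticket.

```c
#include <stdbool.h>
#include <stdint.h>

#define OBD_INCOMPAT_FID 0x00000010 /* value quoted in this ticket */

/* Assumed start of the normal FID sequence range; illustrative only. */
#define EXAMPLE_FID_SEQ_NORMAL 0x200000400ULL

/* Hypothetical in-memory copy of the on-disk feature flags. */
struct example_server_data {
	uint32_t incompat;
};

/*
 * Called when an object with sequence `seq` is created on an OST.
 * Returns true if the flags changed and the header must be rewritten,
 * false if nothing needs to be written.
 */
static bool example_note_object_create(struct example_server_data *sd,
				       uint64_t seq)
{
	if (seq < EXAMPLE_FID_SEQ_NORMAL)
		return false;	/* legacy object: leave the flags alone */
	if (sd->incompat & OBD_INCOMPAT_FID)
		return false;	/* flag already recorded */
	sd->incompat |= OBD_INCOMPAT_FID;
	return true;
}
```

            With this approach an OST that never stores a normal-sequence object keeps its old flags and stays mountable by an older release.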

            jlevi Jodi Levi (Inactive) added a comment - Can this ticket be closed now that Change 8810 has landed, or is more work needed in this ticket?

            tappro Mikhail Pershin added a comment - http://review.whamcloud.com/8810

            This patch removes the unconditional setting of OBD_INCOMPAT_FID for all types of filesystems. It is now set for MDTs only, as before.
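
            The behavior change described for Change 8810 can be sketched as follows. This is a minimal illustration, not the patch itself: the enum and the function name are hypothetical, and only the OBD_INCOMPAT_FID value is taken from this ticket.

```c
#include <stdint.h>

#define OBD_INCOMPAT_FID 0x00000010 /* value quoted in this ticket */

/* Hypothetical target types for this sketch. */
enum example_target_type {
	EXAMPLE_TARGET_MDT,
	EXAMPLE_TARGET_OST,
};

/*
 * Return the incompat flags to record for a target of the given type.
 * The behavior reported in this ticket corresponds to OR-ing in
 * OBD_INCOMPAT_FID unconditionally; here it is limited to MDTs.
 */
static uint32_t example_init_incompat(uint32_t flags,
				      enum example_target_type type)
{
	if (type == EXAMPLE_TARGET_MDT)
		flags |= OBD_INCOMPAT_FID;
	return flags;
}
```
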
            sarah Sarah Liu added a comment -

            Rolling downgrade also hit this problem on the OST:

            LNet: Added LNI 10.10.19.53@tcp [8/256/0/180]
            LNet: Accept secure, port 988
            LNet: 9095:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
            LNet: 9096:0:(debug.c:218:libcfs_debug_str2mask()) You are trying to use a numerical value for the mask - this will be deprecated in a future release.
            Lustre: 48 MB is too small for debug buffer size, setting it to 128 MB.
            LDISKFS-fs (sdb1): mounted filesystem with ordered data mode. quota=on. Opts: 
            LustreError: 9242:0:(ofd_fs.c:588:ofd_server_data_init()) lustre-OST0000: unsupported incompat filesystem feature(s) 10
            LustreError: 9242:0:(obd_config.c:572:class_setup()) setup lustre-OST0000 failed (-22)
            LustreError: 9242:0:(obd_config.c:1591:class_config_llog_handler()) MGC10.10.19.62@tcp: cfg command failed: rc = -22
            Lustre:    cmd=cf003 0:lustre-OST0000  1:dev  2:0  3:f  
            LustreError: 15b-f: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000'failed from the MGS (-22).  Make sure this client and the MGS are running compatible versions of Lustre.
            LustreError: 15c-8: MGC10.10.19.62@tcp: The configuration from log 'lustre-OST0000' failed (-22). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
            LustreError: 9125:0:(obd_mount_server.c:1257:server_start_targets()) failed to start server lustre-OST0000: -22
            LustreError: 9125:0:(obd_mount_server.c:1732:server_fill_super()) Unable to start targets: -22
            LustreError: 9125:0:(obd_mount_server.c:848:lustre_disconnect_lwp()) lustre-MDT0000-lwp-OST0000: Can't end config log lustre-client.
            LustreError: 9125:0:(obd_mount_server.c:1426:server_put_super()) lustre-OST0000: failed to disconnect lwp. (rc=-2)
            LustreError: 9125:0:(obd_config.c:619:class_cleanup()) Device 3 not setup
            Lustre: server umount lustre-OST0000 complete
            LustreError: 9125:0:(obd_mount.c:1311:lustre_fill_super()) Unable to mount /dev/sdb1 (-22)
            [root@wtm-88 ~]# 
            

            adilger Andreas Dilger added a comment - Actually, it looks like tgt_server_data_init() is unconditionally setting OBD_INCOMPAT_FID on all filesystems. This should be limited to MDT filesystems. What is the meaning of OBD_INCOMPAT_FID on an OST filesystem anyway? Is that for FID_SEQ_NORMAL objects being allocated there?

            adilger Andreas Dilger added a comment - Sorry, I misunderstood the problem above. This is not the EXT4_FEATURE_INCOMPAT_META_BG flag; it is actually OBD_INCOMPAT_FID being set in the last_rcvd header by 2.6.

            /** FID is enabled */
            #define OBD_INCOMPAT_FID        0x00000010

            Strangely, I don't see OBD_INCOMPAT_FID being set in OFD_INCOMPAT_SUPP on master, nor OFD_INCOMPAT_SUPP being used anywhere. It seems this checking has moved over to tgt_lastrcvd.c::tgt_scd[] as part of the unified target patches in LU-3467 (http://review.whamcloud.com/7330). I can't yet see how OBD_INCOMPAT_FID is being set, but this is a definite problem for downgrade.
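
            The mount-time check that produces "unsupported incompat filesystem feature(s) 10" can be sketched as a mask test: any on-disk incompat bit outside the supported set causes the setup to fail with -22 (-EINVAL). The "10" in the log is consistent with the leftover bits being printed in hexadecimal, i.e. 0x10 = OBD_INCOMPAT_FID. The supported mask and function names below are assumptions for illustration and are not taken from the Lustre source.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define OBD_INCOMPAT_FID 0x00000010 /* value quoted in this ticket */

/*
 * Hypothetical supported mask for an older OST build that does not know
 * about OBD_INCOMPAT_FID. The bit values are placeholders.
 */
#define EXAMPLE_OLD_OST_INCOMPAT_SUPP 0x0000000a

/* Return the on-disk incompat bits that this build cannot handle. */
static uint32_t example_unsupported_incompat(uint32_t on_disk,
					     uint32_t supported)
{
	return on_disk & ~supported;
}

/* Return 0 if the target may be set up, -EINVAL (-22) otherwise. */
static int example_check_incompat(const char *name, uint32_t on_disk,
				  uint32_t supported)
{
	uint32_t bad = example_unsupported_incompat(on_disk, supported);

	if (bad != 0) {
		fprintf(stderr,
			"%s: unsupported incompat filesystem feature(s) %x\n",
			name, bad);
		return -EINVAL;
	}
	return 0;
}
```

            Under these assumptions, adding OBD_INCOMPAT_FID to the supported mask of the older release makes the same on-disk flags pass the check.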
            sarah Sarah Liu added a comment - The kernel used in 2.6 is 2.6.32-358.23.2: http://build.whamcloud.com/job/lustre-master/1791/arch=x86_64,build_type=server,distro=el6,ib_stack=inkernel/artifact/artifacts/RPMS/x86_64/

            adilger Andreas Dilger added a comment - EXT4_FEATURE_INCOMPAT_META_BG is 0x10. I don't know why this would be set when upgrading to 2.6. This feature relates to filesystem resizing but should not be enabled by default. What kernel is used for 2.6? It looks like 2.6.32-358.18.1 is used for 2.5.

            People

              Assignee: tappro Mikhail Pershin
              Reporter: sarah Sarah Liu