Lustre / LU-3574

MDT cannot sync with OST after upgrade from 2.1 to 2.4 then downgrade

Details

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Critical
    • Fix Version/s: Lustre 2.1.6
    • Affects Version/s: Lustre 2.4.0, Lustre 2.1.3
    • Labels: None
    • Severity: 3
    • Rank (Obsolete): 9042

    Description

      Hi,

      I formatted and started a Lustre file system with Lustre 2.1.
      I then stopped the file system and upgraded to Lustre 2.4.
      I restarted the file system, and everything was fine.
      I then stopped the file system again and downgraded to Lustre 2.1.

      The MGT restarts fine, as do the OSTs.
      Mounting the MDT returns successfully, but dmesg on the MDS shows:

      [root@yosemite2 ~]# dmesg 
      LDISKFS-fs warning (device loop1): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
      LDISKFS-fs (loop1): barriers disabled
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      LDISKFS-fs warning (device loop1): ldiskfs_fill_super: extents feature not enabled on this filesystem, use tune2fs.
      LDISKFS-fs (loop1): barriers disabled
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      Lustre: Enabling ACL
      Lustre: Enabling user_xattr
      Lustre: migrate-MDT0000: used disk, loading
      LustreError: 3273:0:(llog_lvfs.c:199:llog_lvfs_read_header()) bad log / header magic: 0x2e (expected 0x10645539)
      LustreError: 3273:0:(llog_obd.c:218:llog_setup_named()) obd migrate-OST0001-osc-MDT0000 ctxt 2 lop_setup=ffffffffa0530cb0 failed -5
      LustreError: 3273:0:(osc_request.c:4198:__osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
      LustreError: 3273:0:(osc_request.c:4215:__osc_llog_init()) osc 'migrate-OST0001-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' catid ffff88003da1b8d0 rc=-5
      LustreError: 3273:0:(osc_request.c:4217:__osc_llog_init()) logid 0x2:0x0
      LustreError: 3273:0:(osc_request.c:4245:osc_llog_init()) rc: -5
      LustreError: 3273:0:(lov_log.c:248:lov_llog_init()) error osc_llog_init idx 1 osc 'migrate-OST0001-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' (rc=-5)
      LustreError: 3273:0:(llog_lvfs.c:616:llog_lvfs_create()) error looking up logfile 0x4:0x0: rc -116
      LustreError: 3273:0:(llog_obd.c:218:llog_setup_named()) obd migrate-OST0000-osc-MDT0000 ctxt 2 lop_setup=ffffffffa0530cb0 failed -116
      LustreError: 3273:0:(osc_request.c:4198:__osc_llog_init()) failed LLOG_MDS_OST_ORIG_CTXT
      LustreError: 3273:0:(osc_request.c:4215:__osc_llog_init()) osc 'migrate-OST0000-osc-MDT0000' tgt 'mdd_obd-migrate-MDT0000' catid ffff88003da1b8d0 rc=-116
      LustreError: 3302:0:(lov_log.c:160:lov_llog_origin_connect()) error osc_llog_connect tgt 0 (-107)
      LustreError: 3302:0:(mds_lov.c:870:__mds_lov_synchronize()) migrate-OST0000_UUID failed at llog_origin_connect: -107
      LustreError: 3302:0:(mds_lov.c:901:__mds_lov_synchronize()) migrate-OST0000_UUID sync failed -107, deactivating
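
The negative return codes in the log above are kernel errno values. As a reading aid, here is a minimal Python sketch that maps them to symbolic names. The table is hardcoded with the usual Linux values for the three codes seen here; the helper name `decode_return_codes` and the regular expression are illustrative assumptions, not part of any Lustre tool.

```python
import re

# Usual Linux errno values for the three codes that appear in the MDT log.
# Hardcoded (assumption: Linux numbering) so the result does not depend on
# the platform running this script.
LINUX_ERRNO = {5: "EIO", 107: "ENOTCONN", 116: "ESTALE"}

# Matches negative return codes such as "rc=-5", "rc -116", "failed -5" or
# "(-107)". The lookbehind skips hyphens inside names like migrate-OST0001.
RC_PATTERN = re.compile(r"(?<![\w.])-(\d+)\b")


def decode_return_codes(line):
    """Return symbolic names for the negative return codes in one log line."""
    names = []
    for match in RC_PATTERN.finditer(line):
        code = int(match.group(1))
        names.append(LINUX_ERRNO.get(code, "unknown(%d)" % code))
    return names


# Three lines copied from the dmesg output above.
SAMPLE_LINES = [
    "LustreError: 3273:0:(osc_request.c:4245:osc_llog_init()) rc: -5",
    "LustreError: 3273:0:(llog_lvfs.c:616:llog_lvfs_create()) "
    "error looking up logfile 0x4:0x0: rc -116",
    "LustreError: 3302:0:(lov_log.c:160:lov_llog_origin_connect()) "
    "error osc_llog_connect tgt 0 (-107)",
]

if __name__ == "__main__":
    for sample in SAMPLE_LINES:
        print(decode_return_codes(sample), "<-", sample)
```

Read this way, the llog header read for OST0001 fails with EIO, the logfile lookup for OST0000 fails with ESTALE, and the subsequent llog connect fails with ENOTCONN.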
      

      lctl dl on the MDS shows that both OSC devices are inactive (IN), so the connection to the OSTs is not operational:

      [root@yosemite2 ~]# lctl dl
        0 UP mgs MGS MGS 7
        1 UP mgc MGC10.9.0.2@tcp efd0d467-a35f-6aed-8db6-089a2d2584ed 5
        2 UP lov migrate-MDT0000-mdtlov migrate-MDT0000-mdtlov_UUID 4
        3 UP mdt migrate-MDT0000 migrate-MDT0000_UUID 3
        4 UP mds mdd_obd-migrate-MDT0000 mdd_obd_uuid-migrate-MDT0000 3
        5 IN osc migrate-OST0001-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
        6 IN osc migrate-OST0000-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
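
For anyone checking the same symptom on a larger system, the following Python sketch filters `lctl dl` output for devices whose status is not UP. It assumes the column order seen above (index, status, type, name, UUID, refcount); the function name `inactive_devices` is hypothetical, and the sample text is copied from the listing above.

```python
def inactive_devices(lctl_dl_output):
    """Return (index, status, type, name) for every device that is not UP."""
    devices = []
    for line in lctl_dl_output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that does not start with a device index.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        index, status, dev_type, name = fields[:4]
        if status != "UP":
            devices.append((int(index), status, dev_type, name))
    return devices


# Output of "lctl dl" on the MDS, copied from the listing above.
SAMPLE_OUTPUT = """\
  0 UP mgs MGS MGS 7
  1 UP mgc MGC10.9.0.2@tcp efd0d467-a35f-6aed-8db6-089a2d2584ed 5
  2 UP lov migrate-MDT0000-mdtlov migrate-MDT0000-mdtlov_UUID 4
  3 UP mdt migrate-MDT0000 migrate-MDT0000_UUID 3
  4 UP mds mdd_obd-migrate-MDT0000 mdd_obd_uuid-migrate-MDT0000 3
  5 IN osc migrate-OST0001-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
  6 IN osc migrate-OST0000-osc-MDT0000 migrate-MDT0000-mdtlov_UUID 5
"""

if __name__ == "__main__":
    for device in inactive_devices(SAMPLE_OUTPUT):
        print(device)
```

On a live node the same function could be fed the captured output of `lctl dl`, for example through `subprocess.run(["lctl", "dl"], capture_output=True, text=True).stdout`.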
      

      On the OSS, dmesg shows:

      [root@yosemite3 ~]# dmesg 
      LDISKFS-fs (loop0): barriers disabled
      LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: 
      LDISKFS-fs (loop0): barriers disabled
      LDISKFS-fs (loop0): mounted filesystem with ordered data mode. Opts: 
      Lustre: MGC10.9.0.2@tcp: Reactivating import
      Lustre: migrate-OST0001: Now serving migrate-OST0001 on /dev/loop0 with recovery enabled
      LDISKFS-fs (loop1): barriers disabled
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      LDISKFS-fs (loop1): barriers disabled
      LDISKFS-fs (loop1): mounted filesystem with ordered data mode. Opts: 
      Lustre: migrate-OST0000: Now serving migrate-OST0000 on /dev/loop1 with recovery enabled
      Lustre: 3221:0:(ldlm_lib.c:946:target_handle_connect()) migrate-OST0001: connection from migrate-MDT0000-mdtlov_UUID@10.9.0.2@tcp t0 exp (null) cur 1373508570 last 0
      Lustre: migrate-OST0000: received MDS connection from 10.9.0.2@tcp
      

      lctl dl on the OSS gives:

      [root@yosemite3 ~]# lctl dl
        0 UP mgc MGC10.9.0.2@tcp 93f173c1-3222-0f46-4337-5fe54f34d51d 5
        1 UP ost OSS OSS_uuid 3
        2 UP obdfilter migrate-OST0001 migrate-OST0001_UUID 5
        3 UP obdfilter migrate-OST0000 migrate-OST0000_UUID 5
      

      Surprisingly, I can mount Lustre on my client and read existing files, but I have no write access to the file system:

      [root@yosemite4 ~]# cat /migrate/hi
      hey
      [root@yosemite4 ~]# touch /migrate/hello
      touch: cannot touch `/migrate/hello': Input/output error
      

      As a consequence, my file system is not usable.

      Sebastien.

            People

              Assignee: bfaccini Bruno Faccini (Inactive)
              Reporter: sebastien.buisson Sebastien Buisson (Inactive)