Lustre / LU-14695

New OST not visible by MDTs. MGS problem or corrupt catalog llog?

Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Critical
    • Affects Version/s: Lustre 2.12.6
    • Environment: CentOS 7.9
    • Severity: 3

    Description

      On our Oak filesystem, running 2.12.6, we have a problem with either the MGS or a corrupt catalog somewhere.

      Active OSTs on this filesystem range from OST000c (12) to OST0137 (311). Today, we tried to add OST index 312 as oak-OST0138. The new OST is visible from the clients, but not from the MDTs (we have 6 MDTs, oak-MDT0000 to oak-MDT0005).
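
      For reference, a new OST at a given index is normally formatted roughly like this (a hedged sketch; the device path is an assumption, and the NIDs are taken from the llog output shown further below):

      # Hypothetical formatting of the new OST at index 312 (0x138); the
      # device path is made up, the NIDs match the llogs below.
      mkfs.lustre --ost --fsname=oak --index=312 \
          --mgsnode=10.0.2.51@o2ib5 \
          --servicenode=10.0.2.103@o2ib5 --servicenode=10.0.2.104@o2ib5 \
          /dev/mapper/oak-ost0138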

      Full disclosure... older OSTs 0-11 were previously removed with the experimental command lctl del_ost from LU-7668.
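
      For context, the experimental del_ost from LU-7668 is invoked roughly like this (hedged from memory of that patch; the exact syntax of the experimental version we ran may have differed):

      # Hypothetical usage of the experimental command from LU-7668, run on
      # the MGS node to cancel an old OST's configuration records:
      lctl del_ost --target oak-OST0000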

      The server logs from when I started the new OST are attached as servers-logs.txt.

      What is weird is the following:

      May 20 14:06:05 oak-md1-s2 kernel: Lustre: 108193:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
      

      and that it complains about other, already-configured OSTs (not OST0138); rc = -17 is -EEXIST:

      May 20 14:06:05 oak-md1-s2 kernel: LustreError: 108193:0:(genops.c:556:class_register_device()) oak-OST0134-osc-MDT0003: already exists, won't add
      May 20 14:06:05 oak-md1-s2 kernel: LustreError: 108193:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
      May 20 14:06:05 oak-md1-s2 kernel: Lustre:    cmd=cf001 0:oak-OST0134-osc-MDT0003  1:osp  2:oak-MDT0003-mdtlov_UUID  
      May 20 14:06:05 oak-md1-s2 kernel: LustreError: 4061:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
      May 20 14:06:05 oak-md1-s2 kernel: Lustre:    cmd=cf001 0:oak-OST0134-osc-MDT0000  1:osp  2:oak-MDT0000-mdtlov_UUID  
      May 20 14:06:07 oak-md2-s2 kernel: Lustre: 14846:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
      May 20 14:06:07 oak-md2-s2 kernel: LustreError: 14846:0:(genops.c:556:class_register_device()) oak-OST0136-osc-MDT0005: already exists, won't add
      May 20 14:06:07 oak-md2-s2 kernel: LustreError: 14846:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
      May 20 14:06:07 oak-md2-s2 kernel: Lustre:    cmd=cf001 0:oak-OST0136-osc-MDT0005  1:osp  2:oak-MDT0005-mdtlov_UUID  
      May 20 14:06:07 oak-md2-s2 kernel: LustreError: 4291:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
      

      If I check the llog catalogs on the MGS, the new OST oak-OST0138 does seem to have been added:

      Client catalog on MGS:

      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-client | grep OST0138
      - { index: 2716, event: attach, device: oak-OST0138-osc, type: osc, UUID: oak-clilov_UUID }
      - { index: 2717, event: setup, device: oak-OST0138-osc, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 2719, event: add_conn, device: oak-OST0138-osc, node: 10.0.2.104@o2ib5 }
      - { index: 2720, event: add_osc, device: oak-clilov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      

      MDS catalogs on MGS:

      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0000 | grep OST0138
      - { index: 2785, event: attach, device: oak-OST0138-osc-MDT0000, type: osc, UUID: oak-MDT0000-mdtlov_UUID }
      - { index: 2786, event: setup, device: oak-OST0138-osc-MDT0000, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 2788, event: add_conn, device: oak-OST0138-osc-MDT0000, node: 10.0.2.104@o2ib5 }
      - { index: 2789, event: add_osc, device: oak-MDT0000-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      
      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0001 | grep OST0138
      - { index: 2930, event: attach, device: oak-OST0138-osc-MDT0001, type: osc, UUID: oak-MDT0001-mdtlov_UUID }
      - { index: 2931, event: setup, device: oak-OST0138-osc-MDT0001, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 2933, event: add_conn, device: oak-OST0138-osc-MDT0001, node: 10.0.2.104@o2ib5 }
      - { index: 2934, event: add_osc, device: oak-MDT0001-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      
      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0002 | grep OST0138
      - { index: 3063, event: attach, device: oak-OST0138-osc-MDT0002, type: osc, UUID: oak-MDT0002-mdtlov_UUID }
      - { index: 3064, event: setup, device: oak-OST0138-osc-MDT0002, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 3066, event: add_conn, device: oak-OST0138-osc-MDT0002, node: 10.0.2.104@o2ib5 }
      - { index: 3067, event: add_osc, device: oak-MDT0002-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      
      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0003 | grep OST0138
      - { index: 3079, event: attach, device: oak-OST0138-osc-MDT0003, type: osc, UUID: oak-MDT0003-mdtlov_UUID }
      - { index: 3080, event: setup, device: oak-OST0138-osc-MDT0003, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 3082, event: add_conn, device: oak-OST0138-osc-MDT0003, node: 10.0.2.104@o2ib5 }
      - { index: 3083, event: add_osc, device: oak-MDT0003-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      
      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0004 | grep OST0138
      - { index: 3255, event: attach, device: oak-OST0138-osc-MDT0004, type: osc, UUID: oak-MDT0004-mdtlov_UUID }
      - { index: 3256, event: setup, device: oak-OST0138-osc-MDT0004, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 3258, event: add_conn, device: oak-OST0138-osc-MDT0004, node: 10.0.2.104@o2ib5 }
      - { index: 3259, event: add_osc, device: oak-MDT0004-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
      
      [root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0005 | grep OST0138
      - { index: 3255, event: attach, device: oak-OST0138-osc-MDT0005, type: osc, UUID: oak-MDT0005-mdtlov_UUID }
      - { index: 3256, event: setup, device: oak-OST0138-osc-MDT0005, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
      - { index: 3258, event: add_conn, device: oak-OST0138-osc-MDT0005, node: 10.0.2.104@o2ib5 }
      - { index: 3259, event: add_osc, device: oak-MDT0005-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
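
      Given the "Skip config outside markers" warnings earlier, one extra check (a hedged example) is to print the records around the OST0138 entries on the MGS and confirm they are bracketed by marker records:

      # Hedged check: show a few records around the OST0138 entries, which
      # should include the surrounding marker records (llog_print output
      # format as shown above):
      lctl --device MGS llog_print oak-MDT0000 | grep -B 3 -A 3 'OST0138'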
      

      However, this new OST is NOT visible from the MDTs:

      [root@oak-md1-s2 CONFIGS]# llog_reader /mnt/ldiskfs/mdt/0/CONFIGS/oak-MDT0000 | grep 0138
      [root@oak-md1-s2 CONFIGS]# 
      
      [root@oak-md1-s2 ~]# lctl dl | grep OST0138
      [root@oak-md1-s2 ~]# 
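
      (The ldiskfs mount used above for llog_reader is obtained roughly as follows; a hedged sketch in which the MDT device path is an assumption, while the mount point matches the one above:)

      # Inspect the MDT's local copy of the config llog from an ldiskfs
      # mount; device path hypothetical, read-only to be safe.
      mount -t ldiskfs -o ro /dev/mapper/oak-mdt0 /mnt/ldiskfs/mdt/0
      llog_reader /mnt/ldiskfs/mdt/0/CONFIGS/oak-MDT0000 | less
      umount /mnt/ldiskfs/mdt/0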
      

       

      From a client, we can see the new OST but it's not filling up, which makes sense if the MDTs are not aware of it:

      oak-OST0133_UUID     108461852548 37418203104 69949699416  35% /oak[OST:307]
      oak-OST0134_UUID     108461852548 38597230784 68770659804  36% /oak[OST:308]
      oak-OST0135_UUID     108461852548 38483562644 68884328272  36% /oak[OST:309]
      oak-OST0136_UUID     108461852548 41312045604 66055819468  39% /oak[OST:310]
      oak-OST0137_UUID     108461852548 43196874132 64170973596  41% /oak[OST:311]
      oak-OST0138_UUID     108461852548        1828 107368054308   1% /oak[OST:312]
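
      (The listing above is the kind of per-OST output lfs df gives; a minimal client-side check, assuming /oak is the client mount point:)

      # Hypothetical client-side checks, assuming /oak is the mount point:
      lfs df /oak | grep OST0138      # per-OST space usage, as shown above
      lfs osts /oak | grep OST0138    # confirm the client lists the new OST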
      
      

      Right now, we're up and running in that weird situation... not ideal.

      I'm attaching the catalogs found on the 6 MDTs as oak-MDT-CONFIGS-llog.tar, and a tarball of the CONFIGS directory on the MGS as oak-MGS-CONFIGS.tar.gz.

      Any idea what is wrong or corrupt? We would really appreciate any help that lets us avoid doing a full writeconf.
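
      (For completeness, the full writeconf we are trying to avoid would look roughly like this; a hedged outline of the standard procedure, with hypothetical device paths:)

      # Hedged outline of a full writeconf (see the Lustre manual); all
      # clients and all targets must be unmounted first. Device paths are
      # hypothetical. Run on each server for each of its targets:
      tunefs.lustre --writeconf /dev/mgs_dev    # MGS first
      tunefs.lustre --writeconf /dev/mdtN_dev   # then every MDT
      tunefs.lustre --writeconf /dev/ostN_dev   # then every OST
      # Remount in order: MGS, then MDTs, then OSTs, then clients.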

      Attachments

        1. oak-md1-s2_dk_config+info-1.log
          10.35 MB
        2. oak-md1-s2_dk_config+info-2.log
          1.64 MB
        3. oak-MDT-CONFIGS-llog.tar
          2.71 MB
        4. oak-MGS-CONFIGS.tar.gz
          465 kB
        5. servers-logs.txt
          6 kB


          Activity

            [LU-14695] New OST not visible by MDTs. MGS problem or corrupt catalog llog?

            tappro Mikhail Pershin added a comment -

            Stephane, as I can see from the config logs, the local copies on the MDTs were not updated from the main config on the MGS; I am not sure why, so it would still be valuable to get a server log during mount. It may be related to the order of servers in the log: MDT0004 and MDT0005 were added after the last OST, OST0137, so this is probably a log processing/copying bug. I am checking that.

            As for a solution, you could try removing (better: moving to another location, just in case) the local MDT log of one MDT, say MDT0003, and remounting it. The config log should then be copied from the MGS, and MDT0003 might see OST0138. I worry that the -17 error during log processing may interfere, but the config log on the MGS looks OK and contains OST0138. (A hedged sketch of this workaround follows below.)
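
            A sketch of that workaround on the MDS hosting MDT0003 (mount points and device path are assumptions, not taken from this system):

            # Hypothetical paths throughout; stop the MDT, move its local
            # copy of the config log aside, then remount so the log is
            # re-copied from the MGS:
            umount /mnt/lustre/mdt3
            mount -t ldiskfs /dev/mapper/oak-mdt3 /mnt/ldiskfs/mdt/3
            mv /mnt/ldiskfs/mdt/3/CONFIGS/oak-MDT0003 /root/oak-MDT0003.bak
            umount /mnt/ldiskfs/mdt/3
            mount -t lustre /dev/mapper/oak-mdt3 /mnt/lustre/mdt3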

            tappro Mikhail Pershin added a comment -

            Stephane, just a couple of questions: you've mentioned before that you added other OSTs previously, e.g. OST0136, and those additions went well, right? Any chance you know what Lustre version was used at that time? My proposal right now is to collect the MDT Lustre debug log at server start to see the config llog processing in more detail; is that possible? Please add the 'config' and 'info' levels to debug (example below).
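
            Enabling those levels and capturing the log would look something like this (a hedged example; the output file name is arbitrary):

            # On the MDS, before mounting the MDT:
            lctl set_param debug=+config   # log config llog processing
            lctl set_param debug=+info     # add informational messages
            lctl clear                     # empty the debug ring buffer
            # ... mount the MDT, then dump the kernel debug log:
            lctl dk > /tmp/oak-mdt_dk_config+info.log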
            sthiell Stephane Thiell added a comment - edited

            Hi Mike! Yes, Lustre 2.12.6 is used here on Oak, on all servers, including the newly added OSTs. But yes, older OSTs were added using previous versions of Lustre. We started this filesystem with 2.9 in early 2017, ran 2.10 for several years, and upgraded to 2.12.x in October 2020. Since then we have kept Oak on the latest 2.12.x.

            I've started to see random weird behaviors when adding the previous OSTs (like oak-OST0136). One other thing: we have seen a crash similar to LU-9699 "ASSERTION( osp->opd_connects == 1 ) failed" once or twice. I guess some llog corruption and/or bad llog buffer handling could be the cause, but I can't find what. I wonder if there is a way to simulate the llog config processing.

            Otherwise, a drastic solution would be to do a full writeconf and remount all targets to regenerate a clean config, but I guess we would also need to stop all clients, which means a long downtime as Oak is mounted on several clusters.


            tappro Mikhail Pershin added a comment -

            Stephane, this looks like a bug to me too, though I can't say for sure whether it's a corrupted llog or something else. I am still checking logs and existing tickets for something similar. You've said that Lustre 2.12.6 is used; is that also the case for the newly added OST? I also suppose the older servers were updated to 2.12.6 from older versions, am I right?

            sthiell Stephane Thiell added a comment -

            Also... lctl dk output from an MDS (oak-md2-s1, serving oak-MDT0004):

            00000100:02000000:4.0:1621544759.322774:0:5321:0:(import.c:1597:ptlrpc_import_recovery_state_machine()) oak-MDT0004: Connection restored to oak-MDT0004-lwp-OST0136_UUID (at 10.0.2.103@o2ib5)
            00000020:00000400:51.0:1621544769.080003:0:37195:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
            00000020:00000400:51.0:1621544769.093563:0:37195:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
            00000020:00020000:51.0:1621544769.093605:0:37195:0:(genops.c:556:class_register_device()) oak-OST0136-osc-MDT0004: already exists, won't add
            00000020:00020000:51.0:1621544769.104827:0:37195:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
            00000020:02000400:51.0:1621544769.116657:0:37195:0:(obd_config.c:2068:class_config_dump_handler())    cmd=cf001 0:oak-OST0136-osc-MDT0004  1:osp  2:oak-MDT0004-mdtlov_UUID
            
            10000000:00020000:19.0:1621544769.127093:0:4304:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
            00010000:02000400:22.0:1621544973.442080:0:4310:0:(ldlm_lib.c:816:target_handle_reconnect()) oak-MDT0004: Client 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3) reconnecting
            

            What generates this warning is the following check in class_config_llog_handler() (obd_config.c):

                            /* a non-marker record seen while no marker is
                             * active gets flagged to be skipped, producing
                             * the warning above */
                            if (!(cfg->cfg_flags & CFG_F_MARKER) &&
                                (lcfg->lcfg_command != LCFG_MARKER)) {
                                    CWARN("Skip config outside markers, (inst: %016lx, uuid: %s, flags: %#x)\n",
                                          cfg->cfg_instance,
                                          cfg->cfg_uuid.uuid, cfg->cfg_flags);
                                    cfg->cfg_flags |= CFG_F_SKIP;
                            }
            

            but cfg->cfg_instance is NULL and cfg->cfg_uuid.uuid is empty. Bug?

            pjones Peter Jones added a comment -

            Mike

            Could you please advise?

            Thanks

            Peter

            pjones Peter Jones added a comment -

            Sorry - wrong ticket

            pjones Peter Jones added a comment -

            Serguei

            Can you please advise?

            Thanks

            Peter


            People

              Assignee: tappro Mikhail Pershin
              Reporter: sthiell Stephane Thiell
              Votes: 0
              Watchers: 4
