[LU-14695] New OST not visible by MDTs. MGS problem or corrupt catalog llog? Created: 20/May/21 Updated: 27/Sep/21 |
|
| Status: | Open |
| Project: | Lustre |
| Component/s: | None |
| Affects Version/s: | Lustre 2.12.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical |
| Reporter: | Stephane Thiell | Assignee: | Mikhail Pershin |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None |
| Environment: | CentOS 7.9 |
| Severity: | 3 |
| Description |
|
On our Oak filesystem, running 2.12.6, we have a problem with either the MGS or a corrupt catalog somewhere.

Active OSTs on this filesystem are from OST000c (12) to OST0137 (311). Today, we tried to add OST index 312, oak-OST0138. The new OST is visible from clients, but not from the MDTs: we have 6 MDTs (oak-MDT0000 to oak-MDT0005).

Full disclosure... older OSTs 0-11 were previously removed with the experimental command lctl del_ost from

The server logs from when I started the new OST are available in servers-logs.txt.

What is weird is the following:
May 20 14:06:05 oak-md1-s2 kernel: Lustre: 108193:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)

and that it complains about other OSTs (not OST0138):
May 20 14:06:05 oak-md1-s2 kernel: LustreError: 108193:0:(genops.c:556:class_register_device()) oak-OST0134-osc-MDT0003: already exists, won't add
May 20 14:06:05 oak-md1-s2 kernel: LustreError: 108193:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
May 20 14:06:05 oak-md1-s2 kernel: Lustre: cmd=cf001 0:oak-OST0134-osc-MDT0003 1:osp 2:oak-MDT0003-mdtlov_UUID
May 20 14:06:05 oak-md1-s2 kernel: LustreError: 4061:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
May 20 14:06:05 oak-md1-s2 kernel: Lustre: cmd=cf001 0:oak-OST0134-osc-MDT0000 1:osp 2:oak-MDT0000-mdtlov_UUID
May 20 14:06:07 oak-md2-s2 kernel: Lustre: 14846:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
May 20 14:06:07 oak-md2-s2 kernel: LustreError: 14846:0:(genops.c:556:class_register_device()) oak-OST0136-osc-MDT0005: already exists, won't add
May 20 14:06:07 oak-md2-s2 kernel: LustreError: 14846:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
May 20 14:06:07 oak-md2-s2 kernel: Lustre: cmd=cf001 0:oak-OST0136-osc-MDT0005 1:osp 2:oak-MDT0005-mdtlov_UUID
May 20 14:06:07 oak-md2-s2 kernel: LustreError: 4291:0:(mgc_request.c:599:do_requeue()) failed processing log: -17

If I check the llog catalogs on the MGS, the new OST oak-OST0138 seems to have been added though:

Client catalog on MGS:
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-client | grep OST0138
- { index: 2716, event: attach, device: oak-OST0138-osc, type: osc, UUID: oak-clilov_UUID }
- { index: 2717, event: setup, device: oak-OST0138-osc, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 2719, event: add_conn, device: oak-OST0138-osc, node: 10.0.2.104@o2ib5 }
- { index: 2720, event: add_osc, device: oak-clilov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
MDS catalogs on MGS:
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0000 | grep OST0138
- { index: 2785, event: attach, device: oak-OST0138-osc-MDT0000, type: osc, UUID: oak-MDT0000-mdtlov_UUID }
- { index: 2786, event: setup, device: oak-OST0138-osc-MDT0000, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 2788, event: add_conn, device: oak-OST0138-osc-MDT0000, node: 10.0.2.104@o2ib5 }
- { index: 2789, event: add_osc, device: oak-MDT0000-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0001 | grep OST0138
- { index: 2930, event: attach, device: oak-OST0138-osc-MDT0001, type: osc, UUID: oak-MDT0001-mdtlov_UUID }
- { index: 2931, event: setup, device: oak-OST0138-osc-MDT0001, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 2933, event: add_conn, device: oak-OST0138-osc-MDT0001, node: 10.0.2.104@o2ib5 }
- { index: 2934, event: add_osc, device: oak-MDT0001-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0002 | grep OST0138
- { index: 3063, event: attach, device: oak-OST0138-osc-MDT0002, type: osc, UUID: oak-MDT0002-mdtlov_UUID }
- { index: 3064, event: setup, device: oak-OST0138-osc-MDT0002, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 3066, event: add_conn, device: oak-OST0138-osc-MDT0002, node: 10.0.2.104@o2ib5 }
- { index: 3067, event: add_osc, device: oak-MDT0002-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0003 | grep OST0138
- { index: 3079, event: attach, device: oak-OST0138-osc-MDT0003, type: osc, UUID: oak-MDT0003-mdtlov_UUID }
- { index: 3080, event: setup, device: oak-OST0138-osc-MDT0003, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 3082, event: add_conn, device: oak-OST0138-osc-MDT0003, node: 10.0.2.104@o2ib5 }
- { index: 3083, event: add_osc, device: oak-MDT0003-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0004 | grep OST0138
- { index: 3255, event: attach, device: oak-OST0138-osc-MDT0004, type: osc, UUID: oak-MDT0004-mdtlov_UUID }
- { index: 3256, event: setup, device: oak-OST0138-osc-MDT0004, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 3258, event: add_conn, device: oak-OST0138-osc-MDT0004, node: 10.0.2.104@o2ib5 }
- { index: 3259, event: add_osc, device: oak-MDT0004-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
[root@oak-md1-s1 ~]# lctl --device MGS llog_print oak-MDT0005 | grep OST0138
- { index: 3255, event: attach, device: oak-OST0138-osc-MDT0005, type: osc, UUID: oak-MDT0005-mdtlov_UUID }
- { index: 3256, event: setup, device: oak-OST0138-osc-MDT0005, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 3258, event: add_conn, device: oak-OST0138-osc-MDT0005, node: 10.0.2.104@o2ib5 }
- { index: 3259, event: add_osc, device: oak-MDT0005-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
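The per-catalog check above can be scripted. The following is a minimal Python sketch, not a Lustre tool: it scans text in the `llog_print` format shown above and reports which events are present for a given OST. The function names and the `EXPECTED_EVENTS` set are hypothetical; the set simply lists the four events seen for OST0138 in this ticket, and treating it as "complete" is an assumption.

```python
import re

# Sample copied from the `llog_print oak-MDT0000 | grep OST0138` output above.
SAMPLE = """\
- { index: 2785, event: attach, device: oak-OST0138-osc-MDT0000, type: osc, UUID: oak-MDT0000-mdtlov_UUID }
- { index: 2786, event: setup, device: oak-OST0138-osc-MDT0000, UUID: oak-OST0138_UUID, node: 10.0.2.103@o2ib5 }
- { index: 2788, event: add_conn, device: oak-OST0138-osc-MDT0000, node: 10.0.2.104@o2ib5 }
- { index: 2789, event: add_osc, device: oak-MDT0000-mdtlov, ost: oak-OST0138_UUID, index: 312, gen: 1 }
"""

# Events observed for OST0138 in this ticket. Assumption of this sketch:
# a catalog showing all four is considered to contain the OST.
EXPECTED_EVENTS = {"attach", "setup", "add_conn", "add_osc"}

EVENT_RE = re.compile(r"\bevent:\s*(\w+)")


def events_for(ost, text):
    """Return the set of event names found on lines that mention `ost`."""
    found = set()
    for line in text.splitlines():
        if ost not in line:
            continue
        match = EVENT_RE.search(line)
        if match:
            found.add(match.group(1))
    return found


def missing_events(ost, text):
    """Return the expected events that do not appear for `ost`."""
    return EXPECTED_EVENTS - events_for(ost, text)


if __name__ == "__main__":
    print(sorted(missing_events("OST0138", SAMPLE)))
```

Fed the saved output of each `llog_print` command, this returns an empty set for every catalog listed above. An empty input, such as the result of a `grep` that matches nothing, reports all four events as missing.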
However, this new OST is NOT visible from the MDTs:
[root@oak-md1-s2 CONFIGS]# llog_reader /mnt/ldiskfs/mdt/0/CONFIGS/oak-MDT0000 | grep 0138
[root@oak-md1-s2 CONFIGS]#
[root@oak-md1-s2 ~]# lctl dl | grep OST0138
[root@oak-md1-s2 ~]#
From a client, we can see the new OST but it's not filling up, which makes sense if the MDTs are not aware of it:
oak-OST0133_UUID 108461852548 37418203104 69949699416 35% /oak[OST:307]
oak-OST0134_UUID 108461852548 38597230784 68770659804 36% /oak[OST:308]
oak-OST0135_UUID 108461852548 38483562644 68884328272 36% /oak[OST:309]
oak-OST0136_UUID 108461852548 41312045604 66055819468 39% /oak[OST:310]
oak-OST0137_UUID 108461852548 43196874132 64170973596 41% /oak[OST:311]
oak-OST0138_UUID 108461852548 1828 107368054308 1% /oak[OST:312]

Right now, we're up and running in that weird situation... not ideal.

I'm attaching the catalogs found on the 6 MDTs as oak-MDT-CONFIGS-llog.tar.

Any idea of what is wrong or corrupt? We would really appreciate any help to avoid doing a full writeconf. |
| Comments |
| Comment by Peter Jones [ 21/May/21 ] |
|
Serguei, can you please advise? Thanks, Peter |
| Comment by Peter Jones [ 21/May/21 ] |
|
Sorry, wrong ticket. |
| Comment by Peter Jones [ 21/May/21 ] |
|
Mike, could you please advise? Thanks, Peter |
| Comment by Stephane Thiell [ 26/May/21 ] |
|
Also... lctl dk from an MDS (oak-md2-s1 serving oak-MDT0004):
00000100:02000000:4.0:1621544759.322774:0:5321:0:(import.c:1597:ptlrpc_import_recovery_state_machine()) oak-MDT0004: Connection restored to oak-MDT0004-lwp-OST0136_UUID (at 10.0.2.103@o2ib5)
00000020:00000400:51.0:1621544769.080003:0:37195:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
00000020:00000400:51.0:1621544769.093563:0:37195:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
00000020:00020000:51.0:1621544769.093605:0:37195:0:(genops.c:556:class_register_device()) oak-OST0136-osc-MDT0004: already exists, won't add
00000020:00020000:51.0:1621544769.104827:0:37195:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
00000020:02000400:51.0:1621544769.116657:0:37195:0:(obd_config.c:2068:class_config_dump_handler()) cmd=cf001 0:oak-OST0136-osc-MDT0004 1:osp 2:oak-MDT0004-mdtlov_UUID
10000000:00020000:19.0:1621544769.127093:0:4304:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
00010000:02000400:22.0:1621544973.442080:0:4310:0:(ldlm_lib.c:816:target_handle_reconnect()) oak-MDT0004: Client 9458049c-ca8d-335b-3531-2606964e11c0 (at 10.51.2.31@o2ib3) reconnecting

What generates the "Skip config outside markers" message is:
if (!(cfg->cfg_flags & CFG_F_MARKER) &&
    (lcfg->lcfg_command != LCFG_MARKER)) {
        CWARN("Skip config outside markers, (inst: %016lx, uuid: %s, flags: %#x)\n",
              cfg->cfg_instance,
              cfg->cfg_uuid.uuid, cfg->cfg_flags);
        cfg->cfg_flags |= CFG_F_SKIP;
but cfg->cfg_instance is NULL and cfg->cfg_uuid.uuid is empty. Bug? |
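To make the flag values in the warnings easier to follow, here is a small Python model of only the quoted check. It is a sketch under stated assumptions, not the Lustre handler: the numeric values of `CFG_F_MARKER` (0x02) and `CFG_F_SKIP` (0x04) are assumptions, although 0x04 is consistent with the `flags: 0x4` printed by the second warning above. `handle_record` and `replay` are hypothetical names. The model does not address why `cfg_instance` and `cfg_uuid` are empty.

```python
# Assumed flag values; only 0x04 is corroborated by the log output above.
CFG_F_MARKER = 0x02
CFG_F_SKIP = 0x04


def handle_record(flags, is_marker_record):
    """Model the quoted check for one record.

    Returns (new_flags, printed_flags). printed_flags is the value the
    warning would show, or None when no warning is emitted.
    """
    if not (flags & CFG_F_MARKER) and not is_marker_record:
        printed = flags  # the warning prints flags before SKIP is OR-ed in
        return flags | CFG_F_SKIP, printed
    return flags, None


def replay(n_records, flags=0x0):
    """Feed n non-marker records through the check, starting outside a marker."""
    printed = []
    for _ in range(n_records):
        flags, shown = handle_record(flags, is_marker_record=False)
        if shown is not None:
            printed.append(shown)
    return flags, printed


if __name__ == "__main__":
    final_flags, printed = replay(2)
    print([hex(value) for value in printed], hex(final_flags))
```

Under these assumptions, two consecutive records outside any marker produce warnings showing 0x0 and then 0x4, the same sequence as in the lctl dk output above.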
| Comment by Mikhail Pershin [ 26/May/21 ] |
|
Stephane, this looks like a bug to me too, though I can't say for sure whether it is a corrupted llog or something else. I am still checking the logs and existing tickets for something similar. You said that Lustre 2.12.6 is used; is that also the case for the newly added OST? I also suppose that the older servers were upgraded to 2.12.6 from earlier versions, am I right? |
| Comment by Stephane Thiell [ 26/May/21 ] |
|
Hi Mike!

Yes, Lustre 2.12.6 is used here on Oak, on all servers, including the newly added OSTs. But yes, older OSTs were added using previous versions of Lustre. We started this filesystem with 2.9 in early 2017, ran 2.10 for several years, and upgraded to 2.12.x in October 2020. Since then we have been upgrading Oak to the latest 2.12.x. I've started to see random weird behaviors when adding the previous OSTs (like oak-OST0136).

One other thing, we have seen a crash similar to

Otherwise, a drastic solution would be to do a full writeconf and remount all targets to regenerate a clean config, but I guess we would also need to stop all clients, which means a long downtime as Oak is mounted on several clusters. |
| Comment by Mikhail Pershin [ 31/May/21 ] |
|
Stephane, just a couple of questions. You mentioned before that you had added other OSTs previously, e.g. OST0136, and that those additions went well, right? Any chance you know which Lustre version was in use at that time?

My proposal right now is to collect the MDT Lustre debug log at server start to see the config llog processing in more detail. Is that possible? Please add the 'config' and 'info' levels to debug. |
| Comment by Mikhail Pershin [ 02/Jun/21 ] |
|
Stephane, as I see from the config logs, the local copies on the MDTs were not updated from the main config on the MGS. I am not sure why, so it would still be valuable to get the server log during mount. It may be related to the order of servers in the log: MDT0004 and MDT0005 were added after the last OST, OST0137, so this is probably a log processing/copying bug. I am checking that.

As for a solution, you could try removing the local config log of one MDT, say MDT0003 (better to move it to another location, just in case), and remounting it. The config log should then be copied from the MGS, and MDT0003 might see OST0138. I worry about the -17 error during log processing, which may interfere, but the config log on the MGS looks OK and contains OST0138. |
| Comment by Stephane Thiell [ 07/Jun/21 ] |
|
Hi Mike,

Thanks! I will try to capture the config llog processing in more detail, after disabling the local MDT config log, at the next scheduled maintenance so that I can restart the MGS and MDTs. (I think the MGS might have a problem somehow, so better to restart it too.) It might not be for another 2 weeks, though.

I can see the version of Lustre in the MGS's oak-MDT0000 config llog, for example:
#2735 (224)marker 4700 (flags=0x01, v2.12.6.0) oak-OST0136 'add osc' Thu Feb 18 11:22:03 2021- |
| Comment by Stephane Thiell [ 15/Jun/21 ] |
|
We had an opportunity to reboot the MDS in question, so both MDT0000 and MDT0003 restarted, which is a bit confusing in the log. I renamed the config for MDT0003 prior to mounting, but unfortunately I was only able to capture the config for MDT0000, I think, with an error on a duplicate OST, this time OST0135 (super weird...).

Anyway, we can see the part where the config is loaded (see oak-md1-s2_dk_config+info-1.log):
00000020:00000080:0.0:1623351596.339966:0:57227:0:(obd_config.c:1128:class_process_config()) processing cmd: cf010
00000020:00000080:0.0:1623351596.339966:0:57227:0:(obd_config.c:1198:class_process_config()) marker 4694 (0x1) oak-OST0135 add osc
00000020:00000080:0.0:1623351596.339967:0:57227:0:(obd_config.c:1128:class_process_config()) processing cmd: cf005
00000020:00000080:0.0:1623351596.339968:0:57227:0:(obd_config.c:1139:class_process_config()) adding mapping from uuid 10.0.2.104@o2ib5 to nid 0x500050a000268 (10.0.2.104@o2ib5)
00000100:00000040:0.0:1623351596.339969:0:57227:0:(lustre_peer.c:122:class_add_uuid()) found uuid 10.0.2.104@o2ib5 10.0.2.104@o2ib5 cnt=1
00000020:01000000:0.0:1623351596.339969:0:57227:0:(obd_config.c:1695:class_config_llog_handler()) For 2.x interoperability, rename obd type from osc to osp (oak-MDT0000)
00000020:00000080:0.0:1623351596.339970:0:57227:0:(obd_config.c:1128:class_process_config()) processing cmd: cf001
00000020:00000080:0.0:1623351596.339972:0:57227:0:(genops.c:451:class_newdev()) Allocate new device oak-OST0135-osc-MDT0000 (ffffa0ba056920f0)
00000020:00000040:0.0:1623351596.339972:0:57227:0:(lustre_handles.c:99:class_handle_hash()) added object ffffa0ab7c744c00 with handle 0x60ebddc04fb89991 to hash
00000020:00000040:0.0:1623351596.339973:0:57227:0:(genops.c:1018:class_export_put()) PUTting export ffffa0ab7c744c00 : new refcount 1
00000100:00000040:3.0:1623351596.339975:0:57124:0:(niobuf.c:905:ptl_send_rpc()) @@@ send flg=0 req@ffffa0ba05689b00 x1702207520042944/t0(0) o8->oak-OST0134-osc-MDT0000@10.0.2.103@o2ib5:28/4 lens 520/544 e 0 to 0 dl 1623351601 ref 2 fl Rpc:N/0/ffffffff rc 0/-1
00000100:00000040:3.0:1623351596.339978:0:57124:0:(niobuf.c:57:ptl_send_buf()) peer_id 12345-10.0.2.103@o2ib5
00000020:00000080:0.0:1623351596.339993:0:57227:0:(obd_config.c:431:class_attach()) OBD: dev 307 attached type osp with refcount 1

and later, when it tries to register a duplicate OST (this time, it was OST0135), see oak-md1-s2_dk_config+info-2.log:
00000020:00000400:14.0:1623352197.724522:0:58995:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
00000020:00000400:14.0:1623352197.739473:0:58995:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
00000020:00000400:14.0:1623352197.739474:0:58995:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
00000020:00020000:14.0:1623352197.739539:0:58995:0:(genops.c:556:class_register_device()) oak-OST0135-osc-MDT0000: already exists, won't add
00000020:00020000:14.0:1623352197.751872:0:58995:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
00000020:02000400:14.0:1623352197.764886:0:58995:0:(obd_config.c:2068:class_config_dump_handler()) cmd=cf001 0:oak-OST0135-osc-MDT0000 1:osp 2:oak-MDT0000-mdtlov_UUID
10000000:00020000:23.0:1623352197.776197:0:57194:0:(mgc_request.c:599:do_requeue()) failed processing log: -17 |
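When sifting through debug dumps like the ones above, it can help to split each line into fields and translate the return codes. The Python sketch below is a hypothetical helper, not part of Lustre. The field names (`subsys`, `mask`, `cpu`, `ts`, `pid`) are this sketch's own labels for the colon-separated prefix and are assumptions about its meaning. The errno lookup uses Python's standard `errno` table, where 17 is `EEXIST` on Linux, which matches the "already exists, won't add" message.

```python
import errno
import re

# Two lines copied from the debug log excerpt above.
SAMPLE_CFG = ("00000020:00020000:14.0:1623352197.751872:0:58995:0:"
              "(obd_config.c:1835:class_config_llog_handler()) "
              "MGC10.0.2.51@o2ib5: cfg command failed: rc = -17")
SAMPLE_MGC = ("10000000:00020000:23.0:1623352197.776197:0:57194:0:"
              "(mgc_request.c:599:do_requeue()) failed processing log: -17")

# Group names are labels chosen for this sketch (assumed field meanings).
DK_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):"
    r"(?P<cpu>\d+)\.\d+:(?P<ts>\d+\.\d+):\d+:(?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s(?P<msg>.*)$"
)
RC_RE = re.compile(r"(?:rc = |failed processing log: )(-?\d+)")


def parse_dk_line(line):
    """Split one debug log line into named fields, or return None."""
    match = DK_RE.match(line.strip())
    return match.groupdict() if match else None


def errno_name(message):
    """Return the symbolic errno for an rc found in `message`, if any."""
    match = RC_RE.search(message)
    if not match:
        return None
    return errno.errorcode.get(abs(int(match.group(1))))


if __name__ == "__main__":
    for sample in (SAMPLE_CFG, SAMPLE_MGC):
        record = parse_dk_line(sample)
        print(record["file"], record["func"], errno_name(record["msg"]))
```

Grouping parsed records by `pid` and `func` is one way to separate the config-processing thread (58995 above) from the MGC requeue thread (57194) when the two are interleaved.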