<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:14:35 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15000] MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&gt;opd_connects == 1 ) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-15000</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I previously reported some issues when adding OSTs in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14695&quot; title=&quot;New OST not visible by MDTs. MGS problem or corrupt catalog llog?&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14695&quot;&gt;LU-14695&lt;/a&gt;, and I suspect we were also hit by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13356&quot; title=&quot;lctl conf_param hung on the MGS node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13356&quot;&gt;&lt;del&gt;LU-13356&lt;/del&gt;&lt;/a&gt;, after which the MGS was unable to process new OSTs. So we recently upgraded all Oak servers to 2.12.7 + a single patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13356&quot; title=&quot;lctl conf_param hung on the MGS node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13356&quot;&gt;&lt;del&gt;LU-13356&lt;/del&gt;&lt;/a&gt;. I&apos;m opening this ticket in this new context, where the MGS seems to work better so far, but the issue now apparently lies on the MDS side.&lt;/p&gt;

&lt;p&gt;We have 300+ OSTs. After adding new OSTs, the first ones succeed, but after 3 or 4 new OSTs, all MDTs crash with the following backtrace (you can see the new OST being initialized on an OSS, followed by an MDS crash):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Sep 09 07:59:33 oak-io6-s2 kernel: md/raid:md45: raid level 6 active with 10 out of 10 devices, algorithm 2
Sep 09 07:59:33 oak-io6-s2 kernel: md45: detected capacity change from 0 to 112003075014656
Sep 09 07:59:33 oak-io6-s2 kernel: LDISKFS-fs (md45): file extents enabled, maximum tree depth=5
Sep 09 07:59:34 oak-io6-s2 kernel: LDISKFS-fs (md45): mounted filesystem with ordered data mode. Opts: errors=remount-ro
Sep 09 07:59:34 oak-io6-s2 kernel: LDISKFS-fs (md45): file extents enabled, maximum tree depth=5
Sep 09 07:59:34 oak-io6-s2 kernel: LDISKFS-fs (md45): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
Sep 09 07:59:38 oak-io6-s2 kernel: Lustre: oak-OST013d: new disk, initializing
Sep 09 07:59:38 oak-io6-s2 kernel: Lustre: srv-oak-OST013d: No data found on store. Initialize space
Sep 09 07:59:38 oak-io6-s2 kernel: Lustre: oak-OST013d: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
Sep 09 07:59:43 oak-md1-s1 kernel: LustreError: 24251:0:(osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed: 
Sep 09 07:59:43 oak-md1-s1 kernel: LustreError: 24251:0:(osp_dev.c:1404:osp_obd_connect()) LBUG
Sep 09 07:59:43 oak-md1-s1 kernel: Pid: 24251, comm: llog_process_th 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
Sep 09 07:59:43 oak-md1-s1 kernel: Call Trace:
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0cac7cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0cac87c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc1a2beb6&amp;gt;] osp_obd_connect+0x3c6/0x400 [osp]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc194b04e&amp;gt;] lod_add_device+0xa8e/0x19a0 [lod]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc1946895&amp;gt;] lod_process_config+0x13b5/0x1510 [lod]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0ef18b2&amp;gt;] class_process_config+0x2142/0x2830 [obdclass]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0ef3b79&amp;gt;] class_config_llog_handler+0x819/0x1520 [obdclass]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0eb65af&amp;gt;] llog_process_thread+0x85f/0x1a10 [obdclass]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffc0eb8174&amp;gt;] llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffa62c5c21&amp;gt;] kthread+0xd1/0xe0
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffa6994ddd&amp;gt;] ret_from_fork_nospec_begin+0x7/0x21
Sep 09 07:59:43 oak-md1-s1 kernel:  [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;Here is another example, where we can see llog errors for OSTs that actually are added (they are properly registered on the MGS) before the MDS crash:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Sep  9 08:21:17 oak-md1-s1 kernel: Lustre: oak-MDT0001: Client 59a1cc5a-bdc6-102a-1ccb-b551d993bacc (at 10.51.3.32@o2ib3) reconnecting
Sep  9 08:21:18 oak-md1-s1 kernel: Lustre: oak-MDT0001: Client ea26edda-b4c7-1625-cd1a-c65d4487d437 (at 10.51.3.68@o2ib3) reconnecting
Sep  9 08:21:18 oak-md1-s1 kernel: Lustre: Skipped 5 previous similar messages
Sep  9 08:21:20 oak-md1-s1 kernel: Lustre: oak-MDT0001: Client 38d27508-3ea0-383f-8501-005bb0842ebd (at 10.51.4.12@o2ib3) reconnecting
Sep  9 08:21:20 oak-md1-s1 kernel: Lustre: Skipped 2 previous similar messages
Sep  9 08:21:35 oak-md1-s1 kernel: Lustre: 7555:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
Sep  9 08:21:35 oak-md1-s1 kernel: LustreError: 7555:0:(genops.c:556:class_register_device()) oak-OST0144-osc-MDT0001: already exists, won&apos;t add
Sep  9 08:21:35 oak-md1-s1 kernel: LustreError: 7555:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
Sep  9 08:21:35 oak-md1-s1 kernel: Lustre:    cmd=cf001 0:oak-OST0144-osc-MDT0001  1:osp  2:oak-MDT0001-mdtlov_UUID  
Sep  9 08:21:35 oak-md1-s1 kernel: LustreError: 3775:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
Sep  9 08:21:35 oak-md1-s1 kernel: Lustre:    cmd=cf001 0:oak-OST0144-osc-MDT0002  1:osp  2:oak-MDT0002-mdtlov_UUID  
Sep  9 08:21:44 oak-md1-s1 kernel: LustreError: 7569:0:(obd_config.c:461:class_setup()) Device 635 already setup (type osp)
Sep  9 08:21:44 oak-md1-s1 kernel: LustreError: 7569:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
Sep  9 08:21:44 oak-md1-s1 kernel: LustreError: 7569:0:(obd_config.c:1835:class_config_llog_handler()) Skipped 1 previous similar message
Sep  9 08:21:44 oak-md1-s1 kernel: Lustre:    cmd=cf003 0:oak-OST0144-osc-MDT0001  1:oak-OST0144_UUID  2:10.0.2.103@o2ib5  
Sep  9 08:21:44 oak-md1-s1 kernel: LustreError: 3775:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
Sep  9 08:21:44 oak-md1-s1 kernel: LustreError: 3775:0:(mgc_request.c:599:do_requeue()) Skipped 1 previous similar message
Sep  9 08:21:48 oak-md1-s1 kernel: LustreError: 7571:0:(obd_config.c:461:class_setup()) Device 316 already setup (type osp)
Sep  9 08:21:48 oak-md1-s1 kernel: LustreError: 7571:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
Sep  9 08:21:48 oak-md1-s1 kernel: Lustre:    cmd=cf003 0:oak-OST0144-osc-MDT0002  1:oak-OST0144_UUID  2:10.0.2.103@o2ib5  
Sep  9 08:21:48 oak-md1-s1 kernel: LustreError: 3775:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
Sep  9 08:21:55 oak-md1-s1 kernel: LustreError: 7574:0:(osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed: 
Sep  9 08:21:55 oak-md1-s1 kernel: LustreError: 7574:0:(osp_dev.c:1404:osp_obd_connect()) LBUG
Sep  9 08:21:55 oak-md1-s1 kernel: Pid: 7574, comm: llog_process_th 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
Sep  9 08:21:56 oak-md1-s1 kernel: Call Trace:
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0b057cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0b0587c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc18baeb6&amp;gt;] osp_obd_connect+0x3c6/0x400 [osp]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc17da04e&amp;gt;] lod_add_device+0xa8e/0x19a0 [lod]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc17d5895&amp;gt;] lod_process_config+0x13b5/0x1510 [lod]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0d4f8b2&amp;gt;] class_process_config+0x2142/0x2830 [obdclass]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0d51b79&amp;gt;] class_config_llog_handler+0x819/0x1520 [obdclass]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0d145af&amp;gt;] llog_process_thread+0x85f/0x1a10 [obdclass]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffc0d16174&amp;gt;] llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffff9dcc5c21&amp;gt;] kthread+0xd1/0xe0
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffff9e394ddd&amp;gt;] ret_from_fork_nospec_begin+0x7/0x21
Sep  9 08:21:56 oak-md1-s1 kernel: [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In the trace below, we can see that the MDS crashes are happening at the same time:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;oak-io6-s2: Sep 09 08:37:59 oak-io6-s2 kernel: md/raid:md53: device dm-494 operational as raid disk 1
oak-io6-s2: Sep 09 08:37:59 oak-io6-s2 kernel: md/raid:md53: raid level 6 active with 10 out of 10 devices, algorithm 2
oak-io6-s2: Sep 09 08:37:59 oak-io6-s2 kernel: md53: detected capacity change from 0 to 112003075014656
oak-io6-s2: Sep 09 08:38:00 oak-io6-s2 kernel: LDISKFS-fs (md53): file extents enabled, maximum tree depth=5
oak-io6-s2: Sep 09 08:38:00 oak-io6-s2 kernel: LDISKFS-fs (md53): mounted filesystem with ordered data mode. Opts: errors=remount-ro
oak-io6-s2: Sep 09 08:38:01 oak-io6-s2 kernel: LDISKFS-fs (md53): file extents enabled, maximum tree depth=5
oak-io6-s2: Sep 09 08:38:01 oak-io6-s2 kernel: LDISKFS-fs (md53): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
oak-io6-s2: Sep 09 08:38:06 oak-io6-s2 kernel: Lustre: oak-OST0145: new disk, initializing
oak-io6-s2: Sep 09 08:38:06 oak-io6-s2 kernel: Lustre: srv-oak-OST0145: No data found on store. Initialize space
oak-io6-s2: Sep 09 08:38:06 oak-io6-s2 kernel: Lustre: oak-OST0145: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-900
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel: LustreError: 7744:0:(osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed: 
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel: LustreError: 7744:0:(osp_dev.c:1404:osp_obd_connect()) LBUG
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel: Pid: 7744, comm: llog_process_th 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel: LustreError: 5232:0:(osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed: 
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel: LustreError: 5232:0:(osp_dev.c:1404:osp_obd_connect()) LBUG
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel: Pid: 5232, comm: llog_process_th 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel: Call Trace:
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc097b7cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel: Call Trace:
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc097b87c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc06667cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc066687c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc172deb6&amp;gt;] osp_obd_connect+0x3c6/0x400 [osp]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc1311eb6&amp;gt;] osp_obd_connect+0x3c6/0x400 [osp]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc1bd404e&amp;gt;] lod_add_device+0xa8e/0x19a0 [lod]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc164d04e&amp;gt;] lod_add_device+0xa8e/0x19a0 [lod]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc1648895&amp;gt;] lod_process_config+0x13b5/0x1510 [lod]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc1bcf895&amp;gt;] lod_process_config+0x13b5/0x1510 [lod]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc13df8b2&amp;gt;] class_process_config+0x2142/0x2830 [obdclass]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc0c3a8b2&amp;gt;] class_process_config+0x2142/0x2830 [obdclass]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc0c3cb79&amp;gt;] class_config_llog_handler+0x819/0x1520 [obdclass]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc13e1b79&amp;gt;] class_config_llog_handler+0x819/0x1520 [obdclass]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc13a45af&amp;gt;] llog_process_thread+0x85f/0x1a10 [obdclass]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc0bff5af&amp;gt;] llog_process_thread+0x85f/0x1a10 [obdclass]
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffc0c01174&amp;gt;] llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffc13a6174&amp;gt;] llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffff9bec5c21&amp;gt;] kthread+0xd1/0xe0
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffbd8c5c21&amp;gt;] kthread+0xd1/0xe0
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffbdf94ddd&amp;gt;] ret_from_fork_nospec_begin+0x7/0x21
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffff9c594ddd&amp;gt;] ret_from_fork_nospec_begin+0x7/0x21
oak-md2-s2: Sep 09 08:38:11 oak-md2-s2 kernel:  [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
oak-md1-s2: Sep 09 08:38:11 oak-md1-s2 kernel:  [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel: LustreError: 5224:0:(osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed: 
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel: LustreError: 5224:0:(osp_dev.c:1404:osp_obd_connect()) LBUG
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel: Pid: 5224, comm: llog_process_th 3.10.0-1160.6.1.el7_lustre.pl1.x86_64 #1 SMP Mon Dec 14 21:25:04 PST 2020
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel: Call Trace:
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc09f07cc&amp;gt;] libcfs_call_trace+0x8c/0xc0 [libcfs]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc09f087c&amp;gt;] lbug_with_loc+0x4c/0xa0 [libcfs]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc1b38eb6&amp;gt;] osp_obd_connect+0x3c6/0x400 [osp]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc1a5804e&amp;gt;] lod_add_device+0xa8e/0x19a0 [lod]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc1a53895&amp;gt;] lod_process_config+0x13b5/0x1510 [lod]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc13228b2&amp;gt;] class_process_config+0x2142/0x2830 [obdclass]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc1324b79&amp;gt;] class_config_llog_handler+0x819/0x1520 [obdclass]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc12e75af&amp;gt;] llog_process_thread+0x85f/0x1a10 [obdclass]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffc12e9174&amp;gt;] llog_process_thread_daemonize+0xa4/0xe0 [obdclass]
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffae4c5c21&amp;gt;] kthread+0xd1/0xe0
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffaeb94ddd&amp;gt;] ret_from_fork_nospec_begin+0x7/0x21
oak-md2-s1: Sep 09 08:38:12 oak-md2-s1 kernel:  [&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After rebooting the crashed MDS, the last added OSTs are working fine (after MDS recovery, they allocate their &quot;super-sequence&quot; and are then operational). But every time I add more OSTs to the filesystem, all the MDSes crash with this backtrace. This looks similar to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9699&quot; title=&quot;osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9699&quot;&gt;&lt;del&gt;LU-9699&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I&apos;m attaching the output of&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl --device MGS llog_print &amp;lt;FILE&amp;gt;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;as &amp;lt;FILE&amp;gt;.llog.txt for FILE = oak-client and oak-MDT0000 to oak-MDT0005&lt;br/&gt;
in case anyone can see if there is a problem somewhere.&lt;/p&gt;</description>
                <environment>CentOS 7.9</environment>
        <key id="65994">LU-15000</key>
            <summary>MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&gt;opd_connects == 1 ) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="tappro">Mikhail Pershin</assignee>
                                    <reporter username="sthiell">Stephane Thiell</reporter>
                        <labels>
                    </labels>
                <created>Fri, 10 Sep 2021 05:19:44 +0000</created>
                <updated>Mon, 1 May 2023 23:12:38 +0000</updated>
                            <resolved>Mon, 30 May 2022 21:46:44 +0000</resolved>
                                    <version>Lustre 2.12.7</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                    <fixVersion>Lustre 2.15.3</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="312486" author="pjones" created="Fri, 10 Sep 2021 17:31:58 +0000"  >&lt;p&gt;Mike&lt;/p&gt;

&lt;p&gt;It looks like some work was started but not completed relating to this area under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9699&quot; title=&quot;osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9699&quot;&gt;&lt;del&gt;LU-9699&lt;/del&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="312716" author="tappro" created="Tue, 14 Sep 2021 12:22:35 +0000"  >&lt;p&gt;I&apos;ve just updated patch under &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9699&quot; title=&quot;osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9699&quot;&gt;&lt;del&gt;LU-9699&lt;/del&gt;&lt;/a&gt;, preparing its port to 2.12&#160;&lt;/p&gt;</comment>
                            <comment id="312762" author="sthiell" created="Tue, 14 Sep 2021 16:21:42 +0000"  >&lt;p&gt;Thank you!&lt;/p&gt;</comment>
                            <comment id="313815" author="sthiell" created="Thu, 23 Sep 2021 17:11:16 +0000"  >&lt;p&gt;Hi Mike... so I have some update!&lt;br/&gt;
We have applied the backported patch on MGS and all MDS, on top of 2.12.7.&lt;br/&gt;
Then this morning we tried to add an OST. The MGS propagated the config to the clients, but not to the MDTs. The local config llogs for oak-MDT* on the MGS now all seem corrupt (but I do have backups, and they are not corrupt on the MDTs). This is new.&lt;/p&gt;

&lt;p&gt;This is the log from the MGS when we started new OST index 331 (oak-OST014b):&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00000020:00000400:7.0:1632410256.389569:0:127696:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
00000020:00020000:19.0:1632410256.389727:0:127696:0:(genops.c:556:class_register_device()) oak-OST0141-osc-MDT0001: already exists, won&apos;t add
00000020:00020000:19.0:1632410256.402255:0:127696:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
00000020:02000400:19.0:1632410256.415459:0:127696:0:(obd_config.c:2068:class_config_dump_handler())    cmd=cf001 0:oak-OST0141-osc-MDT0001  1:osp  2:oak-MDT0001-mdtlov_UUID

10000000:00020000:9.0:1632410256.426947:0:3915:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
00000020:00000400:14.0F:1632410256.437628:0:127698:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
00000020:00000400:14.0:1632410256.437631:0:127698:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x4)
00000020:00020000:14.0:1632410256.437734:0:127698:0:(genops.c:556:class_register_device()) oak-OST0141-osc-MDT0002: already exists, won&apos;t add
00000020:00020000:14.0:1632410256.437736:0:127698:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.0.2.51@o2ib5: cfg command failed: rc = -17
00000020:02000400:14.0:1632410256.437739:0:127698:0:(obd_config.c:2068:class_config_dump_handler())    cmd=cf001 0:oak-OST0141-osc-MDT0002  1:osp  2:oak-MDT0002-mdtlov_UUID

10000000:00020000:21.0:1632410256.449208:0:3915:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The new OST is not visible from the MDTs (no logs) and is not shown by lctl dl:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# clush -w@mds &apos;lctl dl | grep oak-OST014b&apos;
clush: oak-md2-s2: exited with exit code 1
clush: oak-md1-s1: exited with exit code 1
clush: oak-md2-s1: exited with exit code 1
clush: oak-md1-s2: exited with exit code 1
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The new OST is visible from clients but is not filling up (which makes sense):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@oak-h01v19 ~]# lfs df /oak | grep OST:331
oak-OST014b_UUID     108461852548        1868 107368054268   1% /oak[OST:331]
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The system is still running fine (no crash), so there is that. But now the problem is that the config llogs seem corrupt, so I&apos;m worried we will need to restore them from backups:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;llog_reader oak-MDT0000
rec #1 type=10620000 len=224 offset 8192
rec #2 type=10620000 len=128 offset 8416
rec #3 type=10620000 len=176 offset 8544
...&amp;lt;skip&amp;gt;...
rec #2943 type=10620000 len=112 offset 435072
rec #2944 type=10620000 len=136 offset 435184
rec #2945 type=10620000 len=224 offset 435320
off 435544 skip 6824 to next chunk.
Previous index is 2945, current 0, offset 435544
The log is corrupt (too big at 2826)
llog_reader: Could not pack buffer.: Invalid argument (22)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The same goes for CONFIGS/oak-MDT0001 through oak-MDT0005 on the MGS.&lt;/p&gt;

&lt;p&gt;If I run llog_print on the MGS, we can see the new OST (which explains why clients can see it):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@oak-md1-s1 CONFIGS]# lctl --device MGS llog_print oak-client | grep oak-OST014b
- { index: 2879, event: attach, device: oak-OST014b-osc, type: osc, UUID: oak-clilov_UUID }
- { index: 2880, event: setup, device: oak-OST014b-osc, UUID: oak-OST014b_UUID, node: 10.0.2.104@o2ib5 }
- { index: 2882, event: add_conn, device: oak-OST014b-osc, node: 10.0.2.103@o2ib5 }
- { index: 2883, event: add_osc, device: oak-clilov, ost: oak-OST014b_UUID, index: 331, gen: 1 }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But it seems to exist in memory only, as it is not found in the MGS oak-client file:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@oak-md1-s1 CONFIGS]# llog_reader oak-client | grep 014b
[root@oak-md1-s1 CONFIGS]# 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I&apos;m attaching oak-MDT0000 before (ok but without OST 331) as  &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/40644/40644_oak-MDT0000.backup-20210923&quot; title=&quot;oak-MDT0000.backup-20210923 attached to LU-15000&quot;&gt;oak-MDT0000.backup-20210923&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt; and after (corrupted) as  &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/40645/40645_oak-MDT0000.after-ost331&quot; title=&quot;oak-MDT0000.after-ost331 attached to LU-15000&quot;&gt;oak-MDT0000.after-ost331&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;. Do you see what could have caused this (and how to fix it)? Thanks!&lt;/p&gt;</comment>
                            <comment id="313834" author="sthiell" created="Thu, 23 Sep 2021 20:26:46 +0000"  >&lt;p&gt;Hi Mike,&lt;br/&gt;
My bad. I should have used debugfs to check those config files, as properly documented in &lt;tt&gt;man llog_reader&lt;/tt&gt;! They are not corrupted when I extract them from the MGS via debugfs:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@oak-md1-s1 ~]# debugfs -c -R &apos;dump CONFIGS/oak-MDT0000 /tmp/oak-MDT0000&apos; /dev/mapper/md1-rbod1-mgt
debugfs 1.45.6.wc5 (09-Feb-2021)
/dev/mapper/md1-rbod1-mgt: catastrophic mode - not reading inode or group bitmaps
[root@oak-md1-s1 ~]# llog_reader /tmp/oak-MDT0000  | grep 014b
#2946 (224)marker 6183 (flags=0x01, v2.12.7.0) oak-OST014b     &apos;add osc&apos; Thu Sep 23 08:15:05 2021-
#2948 (128)attach    0:oak-OST014b-osc-MDT0000  1:osc  2:oak-MDT0000-mdtlov_UUID  
#2949 (144)setup     0:oak-OST014b-osc-MDT0000  1:oak-OST014b_UUID  2:10.0.2.104@o2ib5  
#2951 (112)add_conn  0:oak-OST014b-osc-MDT0000  1:10.0.2.103@o2ib5  
#2952 (136)lov_modify_tgts add 0:oak-MDT0000-mdtlov  1:oak-OST014b_UUID  2:331  3:1  
#2953 (224)END   marker 6183 (flags=0x02, v2.12.7.0) oak-OST014b     &apos;add osc&apos; Thu Sep 23 08:15:05 2021-
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And the OST has been properly added to it; here is a before/after diff for oak-MDT0000:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;--- /tmp/before 2021-09-23 12:59:43.058307733 -0700
+++ /tmp/after  2021-09-23 13:00:15.115469245 -0700
@@ -2825,9 +2825,17 @@
 rec #2943 type=10620000 len=112 offset 435072
 rec #2944 type=10620000 len=136 offset 435184
 rec #2945 type=10620000 len=224 offset 435320
+rec #2946 type=10620000 len=224 offset 435544
+rec #2947 type=10620000 len=88 offset 435768
+rec #2948 type=10620000 len=128 offset 435856
+rec #2949 type=10620000 len=144 offset 435984
+rec #2950 type=10620000 len=88 offset 436128
+rec #2951 type=10620000 len=112 offset 436216
+rec #2952 type=10620000 len=136 offset 436328
+rec #2953 type=10620000 len=224 offset 436464
 Header size : 8192  llh_size : 80
 Time : Mon Feb 13 12:37:27 2017
-Number of records: 2827    cat_idx: 0  last_idx: 2945
+Number of records: 2835    cat_idx: 0  last_idx: 2953
 Target uuid : config_uuid
 -----------------------
 #01 (224)marker   2 (flags=0x01, v2.9.0.0) oak-MDT0000-mdtlov &apos;lov setup&apos; Mon Feb 13 12:37:27 2017-
@@ -5658,3 +5666,11 @@
 #2943 (112)add_conn  0:oak-OST014a-osc-MDT0000  1:10.0.2.104@o2ib5
 #2944 (136)lov_modify_tgts add 0:oak-MDT0000-mdtlov  1:oak-OST014a_UUID  2:330  3:1
 #2945 (224)END   marker 6175 (flags=0x02, v2.12.7.0) oak-OST014a     &apos;add osc&apos; Thu Sep  9 08:54:11 2021-
+#2946 (224)marker 6183 (flags=0x01, v2.12.7.0) oak-OST014b     &apos;add osc&apos; Thu Sep 23 08:15:05 2021-
+#2947 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.0.2.104@o2ib5  
+#2948 (128)attach    0:oak-OST014b-osc-MDT0000  1:osc  2:oak-MDT0000-mdtlov_UUID  
+#2949 (144)setup     0:oak-OST014b-osc-MDT0000  1:oak-OST014b_UUID  2:10.0.2.104@o2ib5  
+#2950 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.0.2.103@o2ib5  
+#2951 (112)add_conn  0:oak-OST014b-osc-MDT0000  1:10.0.2.103@o2ib5  
+#2952 (136)lov_modify_tgts add 0:oak-MDT0000-mdtlov  1:oak-OST014b_UUID  2:331  3:1  
+#2953 (224)END   marker 6183 (flags=0x02, v2.12.7.0) oak-OST014b     &apos;add osc&apos; Thu Sep 23 08:15:05 2021-
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So the main question remains: why are the MDTs not aware of this new OST? How can I troubleshoot that? Thanks, and sorry for the confusion.&lt;/p&gt;</comment>
                            <comment id="314014" author="tappro" created="Mon, 27 Sep 2021 11:39:12 +0000"  >&lt;p&gt;Stephane, such a problem can be the result of the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13356&quot; title=&quot;lctl conf_param hung on the MGS node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13356&quot;&gt;&lt;del&gt;LU-13356&lt;/del&gt;&lt;/a&gt; issue; could you check whether the related patch is in your tree?&lt;/p&gt;</comment>
                            <comment id="314034" author="eaujames" created="Mon, 27 Sep 2021 14:52:04 +0000"  >&lt;p&gt;Hello,&lt;br/&gt;
This ticket could be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14802&quot; title=&quot;MGS configuration problems - cannot add new OST, change parameters, hanging&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14802&quot;&gt;&lt;del&gt;LU-14802&lt;/del&gt;&lt;/a&gt;; that one is also a configuration issue with &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13356&quot; title=&quot;lctl conf_param hung on the MGS node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13356&quot;&gt;&lt;del&gt;LU-13356&lt;/del&gt;&lt;/a&gt; in the background.&lt;/p&gt;</comment>
                            <comment id="314057" author="sthiell" created="Mon, 27 Sep 2021 18:17:57 +0000"  >&lt;p&gt;Thank you both! We do have the patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-13356&quot; title=&quot;lctl conf_param hung on the MGS node&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-13356&quot;&gt;&lt;del&gt;LU-13356&lt;/del&gt;&lt;/a&gt; on the Lustre servers but not on all clients yet. It is being deployed.&lt;/p&gt;</comment>
                            <comment id="315249" author="sthiell" created="Mon, 11 Oct 2021 21:42:09 +0000"  >&lt;p&gt;Hi Mike and Etienne,&lt;/p&gt;

&lt;p&gt;We still think that in our case, we have another problem when adding new OSTs. With the patch from &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-9699&quot; title=&quot;osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-9699&quot;&gt;&lt;del&gt;LU-9699&lt;/del&gt;&lt;/a&gt;, the MDSes no longer crash on this assert, but we still can&apos;t add OSTs without restarting all MDSes. I believe the following messages might give us a hint:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;---------------
oak-md1-s1
---------------
Oct 04 00:53:44 oak-md1-s1 kernel: Lustre: 6206:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
Oct 04 00:53:44 oak-md1-s1 kernel: LustreError: 6206:0:(genops.c:556:class_register_device()) oak-OST0142-osc-MDT0001: already exists, won&apos;t add
---------------
oak-md1-s2
---------------
Oct 04 00:53:42 oak-md1-s2 kernel: Lustre: 6253:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
Oct 04 00:53:42 oak-md1-s2 kernel: LustreError: 6253:0:(genops.c:556:class_register_device()) oak-OST0142-osc-MDT0003: already exists, won&apos;t add
---------------
oak-md2-s1
---------------
Oct 04 00:53:41 oak-md2-s1 kernel: Lustre: 5413:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
Oct 04 00:53:41 oak-md2-s1 kernel: LustreError: 5413:0:(genops.c:556:class_register_device()) oak-OST0145-osc-MDT0004: already exists, won&apos;t add
---------------
oak-md2-s2
---------------
Oct 04 00:33:18 oak-md2-s2 kernel: Lustre: 48951:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
Oct 04 00:33:18 oak-md2-s2 kernel: LustreError: 48951:0:(genops.c:556:class_register_device()) oak-OST0145-osc-MDT0005: already exists, won&apos;t add
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Reminder: these OSTs are not the ones being added.&lt;/p&gt;

&lt;p&gt;When a new OST is added, the MGS is updated (and that is OK) and revokes the config lock, which triggers an update on clients (this is OK too) and also on MDTs; class_config_llog_handler(), where this message is produced, is then called with cfg-&amp;gt;cfg_instance == NULL. In my understanding, cfg_instance should be defined so that new llog records can be appended and the MDTs can add the new OST at the end without re-processing the full config. My current guess is that this works up to a certain config llog length (until OST0142 / OST0145 depending on the MDT; this is reproducible - do we reach a buffer limit here?), and when the second part of the config buffer is processed we lose the instance (and thus also cfg_last_idx?), so records are processed from the beginning of the buffer, which here happens to be OST0142 / OST0145 depending on the MDT, since their config llogs are not exactly the same. I&apos;ve spent some time looking for a possible defect but haven&apos;t found anything yet.&lt;/p&gt;
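The incremental replay described above can be sketched as a toy model (plain Python, not Lustre code; the record indexes are taken from the llog dump earlier in this ticket, but the `replay` helper and its shape are purely illustrative):

```python
# Toy model (NOT Lustre code) of incremental config-llog replay:
# a target remembers the last config index it processed
# (cfg_last_idx) and should only apply records newer than it.

def replay(records, cfg_last_idx):
    """Apply only records whose index is above cfg_last_idx.

    records: list of (idx, command) tuples.
    Returns (applied commands, updated last index).
    """
    applied = []
    for idx, cmd in records:
        if idx <= cfg_last_idx:
            continue  # already processed during an earlier replay
        applied.append(cmd)
        cfg_last_idx = idx
    return applied, cfg_last_idx

log = [(2946, "marker 'add osc' oak-OST014b"),
       (2947, "add_uuid 10.0.2.104@o2ib5"),
       (2948, "attach oak-OST014b-osc-MDT0000")]

# Correct incremental update: nothing is re-applied.
done, last = replay(log, cfg_last_idx=2948)
assert done == [] and last == 2948

# Lost/underestimated last index: old records are replayed, which is
# what produces "already exists, won't add" on the MDS.
done, last = replay(log, cfg_last_idx=2945)
assert len(done) == 3
```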

&lt;p&gt;Another thing I suspect is that this could have started when we removed a few OSTs: we previously disabled 12 OSTs on this system using the experimental del_ost, which behaves like llog_cancel on specific indexes, and I noticed they are marked as EXCLUDE and not SKIP. So even an lctl clear_conf wouldn&apos;t help. This is also an issue if we want to get rid of them for good.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#28 (224)marker 323 (flags=0x01, v2.10.4.0) oak-MDT0001     &apos;add osc(copied)&apos; Thu Oct 18 11:45:30 2018-
#29 (224)EXCLUDE START marker 324 (flags=0x11, v2.10.4.0) oak-OST0001     &apos;add osc&apos; Thu Oct 18 11:45:30 2018-Fri Mar 19 15:29:22 2021
#30 (088)add_uuid  nid=10.0.2.102@o2ib5(0x500050a000266)  0:  1:10.0.2.102@o2ib5
#33 (088)add_uuid  nid=10.0.2.101@o2ib5(0x500050a000265)  0:  1:10.0.2.101@o2ib5
#36 (224)END   EXCLUDE END   marker 324 (flags=0x12, v2.10.4.0) oak-OST0001     &apos;add osc&apos; Thu Oct 18 11:45:30 2018-Fri Mar 19 15:29:22 2021
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;To come back to our &apos;Skip config outside markers&apos; issue, perhaps there is an llog buffer problem when EXCLUDE markers were previously used. I will try to reproduce this by creating a similarly large config on a test system, but I wanted to share it with you in case you have some insights. Thanks!&lt;/p&gt;</comment>
                            <comment id="326305" author="asmadeus" created="Mon, 14 Feb 2022 22:29:36 +0000"  >&lt;p&gt;Hello,&lt;/p&gt;

&lt;p&gt;it&apos;s been a while but it&apos;s still a problem when adding new OSTs, I&apos;ve taken a fresh look and after some efforts I was able to reproduce using the catalogs from the MGS on a dummy test filesystem, this should make it easy to get more infos out of it.&lt;/p&gt;


&lt;p&gt;So first, for the symptoms we were chasing, St&#233;phane was correct: the &lt;tt&gt;cld_cfg.cfg_last_idx&lt;/tt&gt; values for the oak-MDTxxxx catalogs are wrong on the MDS, so old events are being re-processed even though they had already been applied.&lt;/p&gt;

&lt;p&gt;My reproduced occurrence had the following dmesg output:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
[ 3486.702628] Lustre: 4657:0:(obd_config.c:1641:class_config_llog_handler()) Skip config outside markers, (inst: 0000000000000000, uuid: , flags: 0x0)
[ 3486.710037] LustreError: 4657:0:(genops.c:556:class_register_device()) oak-OST0147-osc-MDT0002: already exists, won&apos;t add
[ 3486.714672] LustreError: 4657:0:(obd_config.c:1835:class_config_llog_handler()) MGC10.42.17.51@tcp: cfg command failed: rc = -17
[ 3486.719065] Lustre:    cmd=cf001 0:oak-OST0147-osc-MDT0002  1:osp  2:oak-MDT0002-mdtlov_UUID  
[ 3486.722883] LustreError: 2907:0:(mgc_request.c:599:do_requeue()) failed processing log: -17
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And with the debug log levels enabled, we can see this in the debug kernel log (dk):&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;mgc_request.c:2058:mgc_process_log()) &lt;span class=&quot;code-object&quot;&gt;Process&lt;/span&gt; log oak-MDT0002-0000000000000000 from 3203&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;With the catalog as follows on the MGS:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
... last_idx: 3259
...
#3203 (144)setup     0:oak-OST0147-osc-MDT0002  1:oak-OST0147_UUID
...
#3259 (last line)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So: the last_idx here was 3202 when it should have been 3259 (to start processing from 3260), and the discrepancy made the MDS reprocess &lt;tt&gt;oak-OST0147-osc-MDT0002&lt;/tt&gt;&apos;s setup, leading to the error message; but the real problem is that last_idx doesn&apos;t match.&lt;/p&gt;


&lt;p&gt;This can also be confirmed by manual inspection with the crash utility (I assume the value got +1&apos;d while parsing that erroneous line):&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
crash&amp;gt; mod -s mgc
crash&amp;gt; list -H config_llog_list -s config_llog_data.cld_cfg,cld_logname -l config_llog_data3.cld_list_chain
...
ffff8bdcb6d3fa70
  cld_cfg = {
    cfg_instance = 0, 
    cfg_sb = 0xffff8bdc48767000, 
    cfg_uuid = {
      uuid = &lt;span class=&quot;code-quote&quot;&gt;&quot;\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000&quot;&lt;/span&gt;
    }, 
    cfg_callback = 0xffffffffc0949980 &amp;lt;class_config_llog_handler&amp;gt;, 
    cfg_last_idx = 3203, 
    cfg_flags = 2, 
    cfg_lwp_idx = 327, 
    cfg_sub_clds = 31
  }
  cld_logname = 0xffff8bdcb6d3fae5 &lt;span class=&quot;code-quote&quot;&gt;&quot;oak-MDT0002&quot;&lt;/span&gt;
...
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;


&lt;p&gt;The previous fix was also targeting side effects, but what really needs fixing is the index, so we don&apos;t reprocess events.&lt;/p&gt;

&lt;p&gt;Now, why is that index off?&lt;br/&gt;
This is just a supposition for now, but on closer inspection it turns out the &lt;tt&gt;CONFIGS/oak-MDT0002&lt;/tt&gt; catalog offsets don&apos;t stay in sync between the original on the MGT and the copy on the MDT.&lt;/p&gt;

&lt;p&gt;Here&apos;s what it looks like on the MGT:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
...
rec #3258 type=10620000 len=88 offset 508600
rec #3259 type=10620000 len=224 offset 508688
Header size : 8192       llh_size : 80
Time : Thu Sep 26 05:36:36 2019
&lt;span class=&quot;code-object&quot;&gt;Number&lt;/span&gt; of records: 3132 cat_idx: 0      last_idx: 3259
Target uuid : config_uuid
-----------------------
#01 (224)marker 2015 (flags=0x01, v2.10.8.0) oak-MDT0002-mdtlov &lt;span class=&quot;code-quote&quot;&gt;&apos;lov setup&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#02 (128)attach    0:oak-MDT0002-mdtlov  1:lov  2:oak-MDT0002-mdtlov_UUID  
#03 (176)lov_setup 0:oak-MDT0002-mdtlov  1:(struct lov_desc)
                uuid=oak-MDT0002-mdtlov_UUID  stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
...
#18 (224)marker 2018 (flags=0x01, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#19 (224)EXCLUDE START marker 2019 (flags=0x11, v2.10.8.0) oak-OST0003     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
#20 (088)add_uuid  nid=10.0.2.102@o2ib5(0x500050a000266)  0:  1:10.42.17.114@tcp  
#23 (088)add_uuid  nid=10.0.2.101@o2ib5(0x500050a000265)  0:  1:10.42.17.114@tcp  
#26 (224)END   EXCLUDE END   marker 2019 (flags=0x12, v2.10.8.0) oak-OST0003     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
#27 (224)END   marker 2019 (flags=0x02, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#28 (224)marker 2020 (flags=0x01, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#29 (224)EXCLUDE START marker 2021 (flags=0x11, v2.10.8.0) oak-OST0001     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:22 2021
...
#3172 (112)add_conn  0:oak-OST0142-osc-MDT0002  1:10.42.17.114@tcp  
#3173 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST0142_UUID  2:322  3:1  
#3174 (224)END   marker 6129 (flags=0x02, v2.12.7.0) oak-OST0142     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:37:12 2021-
#3175 (224)marker 6137 (flags=0x01, v2.12.7.0) oak-OST0143     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:37:38 2021-
#3176 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3177 (128)attach    0:oak-OST0143-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3178 (144)setup     0:oak-OST0143-osc-MDT0002  1:oak-OST0143_UUID  2:10.42.17.114@tcp  
#3179 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3180 (112)add_conn  0:oak-OST0143-osc-MDT0002  1:10.42.17.114@tcp  
#3181 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST0143_UUID  2:323  3:1  
#3182 (224)END   marker 6137 (flags=0x02, v2.12.7.0) oak-OST0143     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:37:38 2021-
#3183 (224)marker 6145 (flags=0x01, v2.12.7.0) oak-OST0145     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:38:03 2021-
#3184 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3185 (128)attach    0:oak-OST0145-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3186 (144)setup     0:oak-OST0145-osc-MDT0002  1:oak-OST0145_UUID  2:10.42.17.114@tcp  
#3187 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3188 (112)add_conn  0:oak-OST0145-osc-MDT0002  1:10.42.17.114@tcp  
#3189 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST0145_UUID  2:325  3:1  
#3190 (224)END   marker 6145 (flags=0x02, v2.12.7.0) oak-OST0145     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:38:03 2021-
#3191 (224)marker 6153 (flags=0x01, v2.12.7.0) oak-OST0146     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep  9 15:53:39 2021-
...
#3240 (224)marker 6201 (flags=0x01, v2.12.7.0) oak-OST014d     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:10:30 2021-
#3241 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3242 (128)attach    0:oak-OST014d-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3243 (144)setup     0:oak-OST014d-osc-MDT0002  1:oak-OST014d_UUID  2:10.42.17.114@tcp  
#3244 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3245 (112)add_conn  0:oak-OST014d-osc-MDT0002  1:10.42.17.114@tcp  
#3246 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST014d_UUID  2:333  3:1  
#3247 (224)END   marker 6201 (flags=0x02, v2.12.7.0) oak-OST014d     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:10:30 2021-
#3248 (224)marker 6209 (flags=0x01, v2.12.7.0) oak-OST014e     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:13:45 2021-
#3249 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3250 (128)attach    0:oak-OST014e-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3251 (144)setup     0:oak-OST014e-osc-MDT0002  1:oak-OST014e_UUID  2:10.42.17.114@tcp  
#3252 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3254 (112)add_conn  0:oak-OST014e-osc-MDT0002  1:10.42.17.114@tcp  
#3255 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST014e_UUID  2:334  3:1  
#3256 (224)END   marker 6209 (flags=0x02, v2.12.7.0) oak-OST014e     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:13:45 2021-
#3257 (224)marker 6362 (flags=0x01, v2.12.7.0) oak             &lt;span class=&quot;code-quote&quot;&gt;&apos;quota.mdt&apos;&lt;/span&gt; Thu Jan 20 21:23:29 2022-
#3258 (088)param 0:oak  1:quota.mdt=gp  
#3259 (224)END   marker 6362 (flags=0x02, v2.12.7.0) oak             &lt;span class=&quot;code-quote&quot;&gt;&apos;quota.mdt&apos;&lt;/span&gt; Thu Jan 20 21:23:29 2022-
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And here&apos;s the MDT copy:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
rec #3188 type=10620000 len=224 offset 500632
rec #3189 type=10620000 len=224 offset 500856
rec #3190 type=10620000 len=88 offset 501080
rec #3191 type=10620000 len=224 offset 501168
Header size : 8192       llh_size : 80
Time : Fri Jan 21 18:36:56 2022
&lt;span class=&quot;code-object&quot;&gt;Number&lt;/span&gt; of records: 3132 cat_idx: 0      last_idx: 3191
Target uuid : 
-----------------------
#01 (224)marker 2015 (flags=0x01, v2.10.8.0) oak-MDT0002-mdtlov &lt;span class=&quot;code-quote&quot;&gt;&apos;lov setup&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#02 (128)attach    0:oak-MDT0002-mdtlov  1:lov  2:oak-MDT0002-mdtlov_UUID  
#03 (176)lov_setup 0:oak-MDT0002-mdtlov  1:(struct lov_desc)
                uuid=oak-MDT0002-mdtlov_UUID  stripe:cnt=1 size=1048576 offset=18446744073709551615 pattern=0x1
...
#18 (224)marker 2018 (flags=0x01, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#19 (224)EXCLUDE START marker 2019 (flags=0x11, v2.10.8.0) oak-OST0003     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
#20 (088)add_uuid  nid=10.0.2.102@o2ib5(0x500050a000266)  0:  1:10.42.17.114@tcp  
#21 (088)add_uuid  nid=10.0.2.101@o2ib5(0x500050a000265)  0:  1:10.42.17.114@tcp  
#22 (224)END   EXCLUDE END   marker 2019 (flags=0x12, v2.10.8.0) oak-OST0003     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
#23 (224)END   marker 2019 (flags=0x02, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#24 (224)marker 2020 (flags=0x01, v2.10.8.0) oak-MDT0002     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc(copied)&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-
#25 (224)EXCLUDE START marker 2021 (flags=0x11, v2.10.8.0) oak-OST0001     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:22 2021
..
#3172 (224)marker 6201 (flags=0x01, v2.12.7.0) oak-OST014d     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:10:30 2021-
#3173 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3174 (128)attach    0:oak-OST014d-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3175 (144)setup     0:oak-OST014d-osc-MDT0002  1:oak-OST014d_UUID  2:10.42.17.114@tcp  
#3176 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3177 (112)add_conn  0:oak-OST014d-osc-MDT0002  1:10.42.17.114@tcp  
#3178 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST014d_UUID  2:333  3:1  
#3179 (224)END   marker 6201 (flags=0x02, v2.12.7.0) oak-OST014d     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:10:30 2021-
#3181 (224)marker 6209 (flags=0x01, v2.12.7.0) oak-OST014e     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:13:45 2021-
#3182 (088)add_uuid  nid=10.0.2.103@o2ib5(0x500050a000267)  0:  1:10.42.17.114@tcp  
#3183 (128)attach    0:oak-OST014e-osc-MDT0002  1:osc  2:oak-MDT0002-mdtlov_UUID  
#3184 (144)setup     0:oak-OST014e-osc-MDT0002  1:oak-OST014e_UUID  2:10.42.17.114@tcp  
#3185 (088)add_uuid  nid=10.0.2.104@o2ib5(0x500050a000268)  0:  1:10.42.17.114@tcp  
#3186 (112)add_conn  0:oak-OST014e-osc-MDT0002  1:10.42.17.114@tcp  
#3187 (136)lov_modify_tgts add 0:oak-MDT0002-mdtlov  1:oak-OST014e_UUID  2:334  3:1  
#3188 (224)END   marker 6209 (flags=0x02, v2.12.7.0) oak-OST014e     &lt;span class=&quot;code-quote&quot;&gt;&apos;add osc&apos;&lt;/span&gt; Mon Oct  4 08:13:45 2021-
#3189 (224)marker 6362 (flags=0x01, v2.12.7.0) oak             &lt;span class=&quot;code-quote&quot;&gt;&apos;quota.mdt&apos;&lt;/span&gt; Thu Jan 20 21:23:29 2022-
#3190 (088)param 0:oak  1:quota.mdt=gp  
#3191 (224)END   marker 6362 (flags=0x02, v2.12.7.0) oak             &lt;span class=&quot;code-quote&quot;&gt;&apos;quota.mdt&apos;&lt;/span&gt; Thu Jan 20 21:23:29 2022-
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In particular:&lt;/p&gt;
&lt;ul&gt;
	&lt;li&gt;the last_idx doesn&apos;t match&lt;/li&gt;
	&lt;li&gt;from record 20 onwards, the copied llog content is the same but the index drifts: the gap grows through the EXCLUDE blocks, then oscillates between 48/49 (some holes on the MDT that aren&apos;t on the MGS), then climbs again where records pertaining to OST&amp;#91;0000-000b&amp;#93; (which had been excluded previously) show up again later. I&apos;ve attached a post-processed catalog with diff/index on MGT/index on MDT: &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/42358/42358_llogdiff&quot; title=&quot;llogdiff attached to LU-15000&quot;&gt;llogdiff&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;From my understanding, the catalog should be identical on both nodes, as the local copy is used during MDS start, so differences there will likely cause problems. It&apos;s also safe to say the difference is probably due to the EXCLUDE blocks, as suspected from the start...&lt;/p&gt;
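The drift is easy to reproduce with a few lines of plain Python (a toy model, not Lustre code; the indexes are the ones visible in the two dumps above, where the MGS keeps records #19/#20/#23/#26/#27 around the oak-OST0003 EXCLUDE block while the MDT copy has #19..#23):

```python
# Toy model (NOT Lustre code) of why last_idx diverges: llog_process
# skips cancelled records, and an append-only copy renumbers the
# survivors densely instead of keeping the original indexes.

def naive_copy(surviving_indexes, start):
    """Append-only copy: each surviving record gets the next free slot."""
    return list(range(start, start + len(surviving_indexes)))

# oak-OST0003 EXCLUDE block as seen on the MGS: records 21-22 and
# 24-25 were cancelled (del_ost / llog_cancel), leaving holes.
mgs = [19, 20, 23, 26, 27]

mdt = naive_copy(mgs, start=19)          # what the copy produces
assert mdt == [19, 20, 21, 22, 23]

# The content is identical, but last_idx no longer matches, so
# incremental replay on the MDS starts from the wrong index.
assert mgs[-1] == 27 and mdt[-1] == 23
```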

&lt;p&gt;So we now have two ways forward:&lt;/p&gt;
&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;try to fix the copy so that excluded indexes don&apos;t shift the overall index (e.g. copy the holes too)&lt;/li&gt;
	&lt;li&gt;remove the EXCLUDE rows from the catalogs and pretend this all never happened.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Since exclude support has never been merged, I&apos;d be tempted to sweep this under the rug and try to fix the catalogs, but unfortunately I can&apos;t think of any way of fixing them cleanly without shutting down the whole filesystem (e.g. nuking the MGS and having the targets re-register is most certainly not safe with thousands of clients mounted).&lt;br/&gt;
Do you have any suggestion?&lt;/p&gt;

&lt;p&gt;Thanks!&lt;/p&gt;</comment>
                            <comment id="326526" author="eaujames" created="Wed, 16 Feb 2022 20:26:19 +0000"  >&lt;p&gt;Unmounting/remounting the MDT target should sync the local configuration with the MGS if the MDT successfully gets a lock on the llog config resource.&lt;br/&gt;
The job is done by mgc_llog_local_copy().&lt;/p&gt;

&lt;p&gt;If the MDT fails to sync the configuration, the following message should appear (with debug=+mgc):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Failed to get MGS log %s, using local copy for now, will try to update later.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can get more information from &lt;a href=&quot;https://review.whamcloud.com/40448/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/40448/&lt;/a&gt; (&quot;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-14090&quot; title=&quot;lctl replace_nids and starting target with local copy of logs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-14090&quot;&gt;&lt;del&gt;LU-14090&lt;/del&gt;&lt;/a&gt; mgs: no local logs flag&quot;)&lt;/p&gt;

&lt;p&gt;If the MDT local configuration is different from the MGS configuration after the copy, it can mean that there is a bug in llog_backup().&lt;/p&gt;</comment>
                            <comment id="326540" author="asmadeus" created="Wed, 16 Feb 2022 21:56:28 +0000"  >&lt;p&gt;Hi Etienne! Thanks for the reply.&lt;/p&gt;

&lt;p&gt;Yes, in my case the copy was successful, so there definitely is a bug in the copy as I pointed out.&lt;br/&gt;
I&apos;m not sure it&apos;s worth fixing this bug for a feature that was never released (there probably isn&apos;t anyone else with EXCLUDE catalog entries...), but it might actually be easier to fix the copy than to try to fix the catalogs &amp;#8211; I&apos;ll admit I haven&apos;t looked yet, and it&apos;s worth a shot.&lt;/p&gt;</comment>
                            <comment id="326620" author="asmadeus" created="Thu, 17 Feb 2022 14:47:11 +0000"  >&lt;p&gt;Well, it looks like we can have EXCLUDE entries, so it&apos;s probably worth a check anyway...&lt;/p&gt;

&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;first, a confirmation that this is a problem:&lt;br/&gt;
we use the same &lt;tt&gt;cld-&amp;gt;cld_cfg&lt;/tt&gt; (which stores the last index processed) whether we look at the remote catalog (&lt;tt&gt;rctxt&lt;/tt&gt; in &lt;tt&gt;mgc_process_cfg_log&lt;/tt&gt;) or the local one (&lt;tt&gt;ctxt&lt;/tt&gt;), so if the indexes don&apos;t match (at least the last index) we&apos;ll get bad behaviour as observed: the copy needs to preserve the indices.&lt;/li&gt;
&lt;/ul&gt;


&lt;ul class=&quot;alternate&quot; type=&quot;square&quot;&gt;
	&lt;li&gt;looking at &lt;tt&gt;llog_backup&lt;/tt&gt;:&lt;br/&gt;
probing &lt;tt&gt;llog_osd_write_rec&lt;/tt&gt; I can see that &lt;tt&gt;rec-&amp;gt;lrh_index&lt;/tt&gt; is correct (the MGS index#), but the write&apos;s idx argument is -1, so we&apos;re just appending: the record# is incremented by 1 every time without skipping holes like the original log does. So, as observed, the copy carries the right content but doesn&apos;t care about the index.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Unfortunately, the code doesn&apos;t let us modify an unset index, so we&apos;d need to keep track of the last written index, compare it with the one we want to write, and then create fake records and cancel them to make the indexes artificially match?&lt;/p&gt;


&lt;p&gt;FWIW I&apos;ve kludged llog_reader to take a peek at what the &quot;holes&quot; are about: they are cancelled llog records from the excluded block, as well as padding log entries (which are also present on the copy, at different offsets, where needed) &amp;#8211; unfortunately &lt;tt&gt;llog_osd_pad&lt;/tt&gt; is not exposed so we can&apos;t abuse it, but we could just write entries with the &lt;tt&gt;LLOG_PAD_MAGIC&lt;/tt&gt; type, empty or of minimal size, and immediately cancel them to unset the header bit...? Looking back I don&apos;t see any clean way of clearing the bit actually, it might not be so easy :|&lt;/p&gt;

&lt;p&gt;Anyway, ETIMEDOUT for today &amp;#8211; will keep digging. Please speak up if you have better ideas than what I&apos;m rambling about &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;</comment>
                            <comment id="326630" author="eaujames" created="Thu, 17 Feb 2022 15:21:22 +0000"  >&lt;p&gt;Hi Dominique,&lt;/p&gt;

&lt;p&gt;I think the issue is that mgc_llog_local_copy()/llog_backup() do not reproduce the gaps from the original config llog. The copy is done via llog_process(), which ignores cancelled records (per the llog header bitmap).&lt;/p&gt;

&lt;p&gt;So your original: &lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#23 (088)add_uuid  nid=10.0.2.101@o2ib5(0x500050a000265)  0:  1:10.42.17.114@tcp  
#26 (224)END   EXCLUDE END   marker 2019 (flags=0x12, v2.10.8.0) oak-OST0003     &apos;add osc&apos; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;is replaced in the local copy by:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;#21 (088)add_uuid  nid=10.0.2.101@o2ib5(0x500050a000265)  0:  1:10.42.17.114@tcp  
#22 (224)END   EXCLUDE END   marker 2019 (flags=0x12, v2.10.8.0) oak-OST0003     &apos;add osc&apos; Thu Sep 26 05:36:36 2019-Fri Mar 19 22:29:33 2021
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The copy doesn&apos;t keep the &quot;holes&quot; in the indexes created by &quot;del_ost&quot;; that&apos;s why you do not have the same cfg_last_idx.&lt;br/&gt;
This messes up the config update mechanism, because the MDT relies on the local last index to fetch new records from the MGS config.&lt;/p&gt;
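A gap-preserving copy along these lines could look roughly like this (an illustrative Python sketch only, not the actual patch, which operates on llog records in C; `copy_with_gaps` and its placeholder mechanism are hypothetical):

```python
# Illustrative sketch (NOT the actual patch): reproduce index gaps in
# the destination by writing immediately-cancelled placeholder records
# for every hole, so surviving records keep their original indexes.

def copy_with_gaps(records):
    """records: (orig_idx, payload) pairs from the source llog, with
    cancelled records already skipped (as llog_process does).
    Returns (kept_indexes, placeholder_indexes)."""
    kept, placeholders = [], []
    last = 0
    for idx, _payload in records:
        # fill the hole left by cancelled records with placeholders
        # that would be written and then cancelled right away
        placeholders.extend(range(last + 1, idx))
        kept.append(idx)  # write the record at its original index
        last = idx
    return kept, placeholders

src = [(1, "marker"), (2, "add_uuid"), (5, "add_uuid"), (6, "END")]
kept, pads = copy_with_gaps(src)
assert kept == [1, 2, 5, 6]   # original indexes preserved
assert pads == [3, 4]         # holes reproduced, so last_idx matches
```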

&lt;p&gt;So if you try to remove the EXCLUDE records, it will increase the index difference between the local copy and the original. This will not help.&lt;/p&gt;

&lt;p&gt;I am working on a patch to reproduce the index gaps in the local copy.&lt;/p&gt;</comment>
                            <comment id="326651" author="gerrit" created="Thu, 17 Feb 2022 17:57:36 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/46545&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46545&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: check for index gaps in llog_backup()&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: fa9c281196d7683e76096aa20b3f08b7ccaca0be&lt;/p&gt;</comment>
                            <comment id="326653" author="eaujames" created="Thu, 17 Feb 2022 18:44:44 +0000"  >&lt;p&gt;The patch above implements what you have described (without LLOG_PAD_MAGIC). But as you said, it is not totally accurate because of the chunk padding.&lt;br/&gt;
Maybe the easiest way is to take the local backup contents (without &quot;holes&quot;) and correct the MGS original with it.&lt;/p&gt;</comment>
                            <comment id="326660" author="sthiell" created="Thu, 17 Feb 2022 20:12:23 +0000"  >&lt;p&gt;Hi Etienne,&lt;br/&gt;
Thank you!&lt;br/&gt;
Re: your last comment, we were thinking of doing that (copying each MDT&apos;s local backup to the MGS), which could fix the immediate issue on the MDTs when adding new OSTs, but wouldn&apos;t that mess up mounted clients, especially after we add more OSTs? Is the clients&apos; config based on indexes? Our main goal is to avoid re-mounting all clients but still be able to add new OSTs, which is tricky due to this index problem.&lt;/p&gt;</comment>
                            <comment id="326666" author="asmadeus" created="Thu, 17 Feb 2022 21:33:53 +0000"  >&lt;p&gt;Hi Etienne, thanks for the patch!&lt;/p&gt;

&lt;p&gt;That&apos;s pretty much what I had in mind, yes. This came back to haunt me a bit in my sleep, and indeed it&apos;s not perfect because of padding: we can&apos;t control when each side will add its padding (because of length differences), so the last index written might be one too big, and we might then have the opposite problem (a bigger last index means we&apos;ll skip one record we should have processed).&lt;/p&gt;

&lt;p&gt;So I&apos;ve been thinking it&apos;s safer to keep the holes and just append small dummy records at the end until we reach the right last index. It&apos;s safer because the dummy records, being small, would reduce the need to add padding, but unfortunately it&apos;s still not perfect: if padding is required at the last dummy record appended, we&apos;ll overrun the last index again &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/sad.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;I can&apos;t think of any safe copy mechanism that preserves indices right now, as we don&apos;t know the size of the skipped records nor the offset within the config log of the entry immediately after a hole, and unless we can reproduce that we won&apos;t be able to match the automatic padding. I plan to look further into this over the weekend.&lt;/p&gt;


&lt;p&gt;St&#233;phane, this is not a problem: clients don&apos;t have a copy of the MDT config logs, so we can just restart the MDT and they&apos;ll get a new copy, which will fix all our problems. If we want to play it safe we can also backport the tunefs --nolocallogs patch, but even without it the servers will try to fetch a new copy and throw away the old one if the copy worked, so it should work as far as I understand. That won&apos;t impact clients more than a normal failover, so if we can get this right, this is a good solution.&lt;/p&gt;</comment>
                            <comment id="326668" author="asmadeus" created="Thu, 17 Feb 2022 22:04:11 +0000"  >&lt;p&gt;Hi again, sorry for the spam &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt; Talking with St&#233;phane gave me another idea: if we can&apos;t copy the offsets reliably, we can just get rid of them.&lt;/p&gt;

&lt;p&gt;In &lt;tt&gt;mgc_llog_local_copy&lt;/tt&gt; the MDS first makes a local copy with &lt;tt&gt;llog_backup(env, obd, lctxt, lctxt, logname, temp_log);&lt;/tt&gt; (note it passes lctxt twice, which is &lt;tt&gt;LLOG_CONFIG_ORIG_CTXT&lt;/tt&gt;).&lt;/p&gt;

&lt;p&gt;This process is identical and will get rid of any cancelled record, so if we just do the same on the MGS at startup for any log that has holes, we won&apos;t need to worry about them anymore.&lt;/p&gt;

&lt;p&gt;It&apos;s not perfect either, though: we still have the problem while the MGS is running, immediately after cancel events, so I guess we need a better time to do it... asynchronously, a while after any cancel event? Hmm... Well, just a wild idea in case we can&apos;t get the copy right. I still intend to look further this weekend.&lt;/p&gt;</comment>
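The compaction idea above (rewriting a log without its cancelled records, as the MDS local-copy path already does) might be sketched like this; the (index, cancelled) tuples are a simplification, not the real llog record format:

```python
# Hypothetical sketch: each record is (index, cancelled). Compacting
# drops cancelled records and renumbers the survivors contiguously,
# which is what a copy made through llog_backup effectively produces.
def compact(records):
    live = [rec for rec in records if not rec[1]]
    return [(new_idx, False) for new_idx, _rec in enumerate(live, start=1)]

log = [(1, False), (2, True), (3, False), (4, False)]  # index 2 cancelled
compacted = compact(log)
# After compacting the MGS original itself, a later copy of it can
# no longer diverge, since there are no holes left to lose.
```

The remaining question, as noted above, is when the MGS could safely run this while mounted clients are still tracking the old indexes.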
                            <comment id="326693" author="eaujames" created="Fri, 18 Feb 2022 09:01:40 +0000"  >&lt;p&gt;@Stephane Thiell the issue with correcting the MGS configs is that the target copies have last indexes lower than the original ones. Mounted clients/targets are notified only of &quot;new&quot; records, so they will ignore indexes between the copy and the original.&lt;/p&gt;

&lt;p&gt;One solution could be to artificially generate records after correcting the MGS (with &quot;lctl conf_param &amp;lt;param&amp;gt;&quot; and &quot;lctl conf_param -d &amp;lt;param&amp;gt;&quot;) until the corrected config&apos;s last index reaches the old config&apos;s last index. Then clients/targets should take the new records into account (e.g. when adding new OSTs).&lt;/p&gt;</comment>
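The catch-up idea could be modelled roughly as follows (a toy sketch: the real records would come from conf_param set/delete pairs, and the assumption that each artificial record bumps the last index by exactly one is illustrative only):

```python
# Count how many filler records are needed to bring the corrected
# config's last index up to the last index clients already saw.
def records_to_catch_up(corrected_last_idx, old_last_idx):
    last = corrected_last_idx
    appended = 0
    while old_last_idx > last:
        last += 1       # each artificial record advances the last index
        appended += 1
    return appended, last

# Corrected config ends at index 4, but clients last saw index 7:
assert records_to_catch_up(4, 7) == (3, 7)
```

Once the indexes line up, genuinely new records (such as those from adding an OST) land past the point clients are watching and get processed normally.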
                            <comment id="326703" author="gerrit" created="Fri, 18 Feb 2022 12:36:31 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/46552&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46552&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: read canceled records in llog_backup&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 0c5471748b18d40893309aad3943b8aed5e91f1a&lt;/p&gt;</comment>
                            <comment id="326943" author="sthiell" created="Tue, 22 Feb 2022 16:53:52 +0000"  >&lt;p&gt;Hi Etienne,&lt;br/&gt;
Today we applied your patch on the MGS/MDS (on top of 2.12.7) and were then able to add 16 OSTs without issue, so your patch worked for us. This will also allow us to consider using llog_cancel/del_ost again in the future.&lt;br/&gt;
Thank you!&lt;/p&gt;</comment>
                            <comment id="327590" author="eaujames" created="Mon, 28 Feb 2022 11:39:08 +0000"  >&lt;p&gt;Hi Stephane,&lt;br/&gt;
Glad it worked for you.&lt;br/&gt;
I will add some reviewers to land the patch.&lt;/p&gt;</comment>
                            <comment id="336329" author="gerrit" created="Mon, 30 May 2022 19:02:23 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/46552/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46552/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: read canceled records in llog_backup&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: d8e2723b4e9409954846939026c599b0b1170e6e&lt;/p&gt;</comment>
                            <comment id="336354" author="pjones" created="Mon, 30 May 2022 21:46:44 +0000"  >&lt;p&gt;Landed for 2.16&lt;/p&gt;</comment>
                            <comment id="336610" author="gerrit" created="Thu, 2 Jun 2022 14:50:07 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/47515&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/47515&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: read canceled records in llog_backup&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 57b085cb1068c1a7a7a35376ccf1db10f489c6a9&lt;/p&gt;</comment>
                            <comment id="338706" author="sthiell" created="Fri, 24 Jun 2022 17:40:54 +0000"  >&lt;p&gt;@Peter Jones could you please consider this patch for inclusion in LTS releases (at least for 2.15)?&lt;/p&gt;</comment>
                            <comment id="349932" author="gerrit" created="Mon, 17 Oct 2022 23:12:32 +0000"  >&lt;p&gt;&quot;Jian Yu &amp;lt;yujian@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/48898&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/48898&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: read canceled records in llog_backup&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_15&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 51bf31c545e2bc67b214de7ae8fd415c1776c3bf&lt;/p&gt;</comment>
                            <comment id="369842" author="gerrit" created="Wed, 19 Apr 2023 03:32:20 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/48898/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/48898/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15000&quot; title=&quot;MDS crashes with (osp_dev.c:1404:osp_obd_connect()) ASSERTION( osp-&amp;gt;opd_connects == 1 ) failed&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15000&quot;&gt;&lt;del&gt;LU-15000&lt;/del&gt;&lt;/a&gt; llog: read canceled records in llog_backup&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_15&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 7ba5dc8e895c693f68d49e7ffc46483710d67beb&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10120">
                    <name>Blocker</name>
                                            <outwardlinks description="is blocking">
                                        <issuelink>
            <issuekey id="34119">LU-7668</issuekey>
        </issuelink>
                            </outwardlinks>
                                                        </issuelinktype>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="64355">LU-14695</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="46797">LU-9699</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="42358" name="llogdiff" size="304449" author="asmadeus" created="Mon, 14 Feb 2022 22:22:32 +0000"/>
                            <attachment id="40645" name="oak-MDT0000.after-ost331" size="436688" author="sthiell" created="Thu, 23 Sep 2021 17:10:43 +0000"/>
                            <attachment id="40644" name="oak-MDT0000.backup-20210923" size="435544" author="sthiell" created="Thu, 23 Sep 2021 17:10:35 +0000"/>
                            <attachment id="40460" name="oak-MDT0000.llog.txt" size="205582" author="sthiell" created="Fri, 10 Sep 2021 05:15:43 +0000"/>
                            <attachment id="40461" name="oak-MDT0001.llog.txt" size="204129" author="sthiell" created="Fri, 10 Sep 2021 05:15:43 +0000"/>
                            <attachment id="40462" name="oak-MDT0002.llog.txt" size="203450" author="sthiell" created="Fri, 10 Sep 2021 05:15:44 +0000"/>
                            <attachment id="40463" name="oak-MDT0003.llog.txt" size="203227" author="sthiell" created="Fri, 10 Sep 2021 05:15:43 +0000"/>
                            <attachment id="40464" name="oak-MDT0004.llog.txt" size="202539" author="sthiell" created="Fri, 10 Sep 2021 05:15:43 +0000"/>
                            <attachment id="40465" name="oak-MDT0005.llog.txt" size="202539" author="sthiell" created="Fri, 10 Sep 2021 05:15:42 +0000"/>
                            <attachment id="40459" name="oak-client.llog.txt" size="191540" author="sthiell" created="Fri, 10 Sep 2021 05:15:42 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i023wn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>