<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:23:58 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16096] recovery: handle  compatibility during upgrade for new replay data format</title>
                <link>https://jira.whamcloud.com/browse/LU-16096</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Because the batched RPC protocol changes the on-disk format of the client reply data file &quot;REPLY_DATA&quot; used for recovery, we need to handle compatibility carefully when upgrading to the new reply data format.&lt;/p&gt;

&lt;p&gt;The new format is introduced in &lt;a href=&quot;https://review.whamcloud.com/#/c/46799/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/#/c/46799/&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The new format is as follows:&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
struct lsd_reply_data {
	__u64 lrd_transno;	&lt;span class=&quot;code-comment&quot;&gt;/* transaction number */&lt;/span&gt;
	__u64 lrd_xid;		&lt;span class=&quot;code-comment&quot;&gt;/* transmission id */&lt;/span&gt;
	__u64 lrd_data;		&lt;span class=&quot;code-comment&quot;&gt;/* per-operation data */&lt;/span&gt;
	__u32 lrd_result;	&lt;span class=&quot;code-comment&quot;&gt;/* request result */&lt;/span&gt;
	__u32 lrd_client_gen;	&lt;span class=&quot;code-comment&quot;&gt;/* client generation */&lt;/span&gt;
	__u32 lrd_batch_idx;	&lt;span class=&quot;code-comment&quot;&gt;/* new: sub request index in a batched RPC */&lt;/span&gt;
	__u32 lrd_padding[7];	&lt;span class=&quot;code-comment&quot;&gt;/* new: unused fields */&lt;/span&gt;
};
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
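&lt;p&gt;As a minimal sketch of what the change means for the record size (not taken from the patch): the hypothetical typedefs below stand in for __u64/__u32, and the names lsd_reply_data_v1 and lsd_reply_data_v2 are only used here to tell the two layouts apart. The old record is 32 bytes and the new one is 64 bytes, so a record with a given index sits at a different file offset in each format. The header size is left as a parameter because it is not stated here.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
```c
/* Hypothetical stand-ins for the kernel __u64 and __u32 types. */
typedef unsigned long long u64_t;
typedef unsigned int u32_t;

/* Old (V1) record layout: the fields that exist before the change. */
struct lsd_reply_data_v1 {
	u64_t lrd_transno;	/* transaction number */
	u64_t lrd_xid;		/* transmission id */
	u64_t lrd_data;		/* per-operation data */
	u32_t lrd_result;	/* request result */
	u32_t lrd_client_gen;	/* client generation */
};

/* New (V2) record layout: V1 plus the batch index and padding. */
struct lsd_reply_data_v2 {
	u64_t lrd_transno;	/* transaction number */
	u64_t lrd_xid;		/* transmission id */
	u64_t lrd_data;		/* per-operation data */
	u32_t lrd_result;	/* request result */
	u32_t lrd_client_gen;	/* client generation */
	u32_t lrd_batch_idx;	/* sub request index in a batched RPC */
	u32_t lrd_padding[7];	/* unused fields */
};

/* Compile-time checks of the two record sizes. */
_Static_assert(sizeof(struct lsd_reply_data_v1) == 32, "V1 record is 32 bytes");
_Static_assert(sizeof(struct lsd_reply_data_v2) == 64, "V2 record is 64 bytes");

/*
 * Byte offset of record number idx in a file that starts with a header
 * of header_size bytes (an assumed parameter) followed by fixed-size records.
 */
static unsigned long long reply_record_offset(unsigned long long header_size,
					      unsigned long long record_size,
					      unsigned long long idx)
{
	return header_size + idx * record_size;
}
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;For example, with a header of hdr bytes, record 3 starts at hdr + 96 in the old format and at hdr + 192 in the new one.&lt;/p&gt;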
&lt;p&gt;The proposed solution is as follows:&lt;/p&gt;

&lt;p&gt;Add several flags in the magic number field of the reply data header:&lt;/p&gt;

&lt;p&gt;LRH_MAGIC_V1: 0xbdabda01 - the magic number of the old format for client reply data.&lt;/p&gt;

&lt;p&gt;LRH_MAGIC: 0xbdabda02 - the magic number of the new format for the client reply data.&lt;/p&gt;

&lt;p&gt;LRH_FLAG_BACKUP_DONE: 0x00000004 - indicates that the target has finished backing up the old-format &quot;REPLY_DATA&quot; file.&lt;/p&gt;

&lt;p&gt;&#160;&lt;/p&gt;

&lt;p&gt;During target setup, the reply data is initialized in @tgt_init()-&amp;gt;tgt_reply_data_init():&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;If the &quot;REPLY_DATA&quot; file is found to be in the old format (according to the magic number in the reply data header, i.e. LRH_MAGIC_V1), the target backs up the &quot;REPLY_DATA&quot; file into the file &quot;REPLY_DATA_BAK&quot;.&lt;/li&gt;
	&lt;li&gt;After the backup finishes, the target sets the magic number field of the reply data header to LRH_MAGIC_V1 | LRH_FLAG_BACKUP_DONE and syncs the change to persistent storage.&lt;/li&gt;
	&lt;li&gt;The target then converts the old-format reply data from the backup file &quot;REPLY_DATA_BAK&quot; into the original reply data file &quot;REPLY_DATA&quot;.&lt;/li&gt;
	&lt;li&gt;After the conversion finishes, the target sets the magic number @lrh_magic of the reply data header to LRH_MAGIC and @lrh_reply_size to the size of the new format, and syncs the change to disk. It then deletes the backup file &quot;REPLY_DATA_BAK&quot;.&lt;/li&gt;
	&lt;li&gt;Finally, the target starts recovery processing as normal with the new-format reply data.&lt;/li&gt;
&lt;/ol&gt;
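&lt;p&gt;The steps above can be sketched as a small decision function. This is only an illustration under the assumption that the magic field alone tells the target where to resume after a reboot; the enum and function names are hypothetical, and only the three constants come from this description.&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
```c
/* Magic values and flag as listed in this description. */
#define LRH_MAGIC_V1		0xbdabda01u
#define LRH_MAGIC		0xbdabda02u
#define LRH_FLAG_BACKUP_DONE	0x00000004u

/* Hypothetical names for the step at which target setup should continue. */
enum reply_upgrade_step {
	RU_BACKUP,	/* old format, no backup yet: copy REPLY_DATA to REPLY_DATA_BAK */
	RU_CONVERT,	/* backup complete: convert REPLY_DATA_BAK into REPLY_DATA */
	RU_DONE,	/* new format: start recovery as normal */
	RU_INVALID,	/* unrecognized magic value */
};

/* Map the magic value read from the reply data header to the next step. */
static enum reply_upgrade_step reply_upgrade_next(unsigned int magic)
{
	if (magic == LRH_MAGIC)
		return RU_DONE;
	if (magic == (LRH_MAGIC_V1 | LRH_FLAG_BACKUP_DONE))
		return RU_CONVERT;
	if (magic == LRH_MAGIC_V1)
		return RU_BACKUP;
	return RU_INVALID;
}
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Since the magic change is synced to disk after the backup and again after the conversion, a target that restarts in the middle of the upgrade would re-enter at the matching step instead of starting over.&lt;/p&gt;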


&lt;p&gt;&#160;&lt;/p&gt;</description>
                <environment></environment>
        <key id="71871">LU-16096</key>
            <summary>recovery: handle  compatibility during upgrade for new replay data format</summary>
                <type id="4" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11310&amp;avatarType=issuetype">Improvement</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="qian_wc">Qian Yingjin</assignee>
                                    <reporter username="qian_wc">Qian Yingjin</reporter>
                        <labels>
                            <label>LTS15</label>
                            <label>statahead</label>
                    </labels>
                <created>Tue, 16 Aug 2022 03:42:35 +0000</created>
                <updated>Tue, 20 Jun 2023 23:42:26 +0000</updated>
                                            <version>Lustre 2.16.0</version>
                                    <fixVersion>Lustre 2.16.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="343671" author="adilger" created="Tue, 16 Aug 2022 05:55:37 +0000"  >&lt;p&gt;I don&apos;t understand why there is a need to make a backup of the reply_data file?  It would seem better to complete the replay of all records in the file (if possible), or wait until the clients are evicted, and then reset the file to the new format. &lt;/p&gt;</comment>
                            <comment id="343679" author="qian_wc" created="Tue, 16 Aug 2022 07:13:05 +0000"  >&lt;p&gt;The reasons we need to make a backup of the reply_data file are as follows:&lt;/p&gt;
&lt;ol&gt;
	&lt;li&gt;First, we want to extend the data structure @lsd_reply_data, so we cannot convert the old records to the new format within the original reply_data file. It is better to make a backup.&lt;/li&gt;
	&lt;li&gt;The target may reboot repeatedly during the recovery.&lt;/li&gt;
	&lt;li&gt;The target does not free the reply data corresponding to the highest transno of each export. This ensures that the on-disk reply data is kept and the last committed transno can be restored from disk in case of target recovery.&lt;/li&gt;
	&lt;li&gt;Even if some clients get evicted, we still need to keep the reply data for the other clients that replay successfully.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;For the above reasons, I think it is better to complete the format conversion before recovery.&lt;/p&gt;</comment>
                            <comment id="344041" author="gerrit" created="Fri, 19 Aug 2022 02:55:33 +0000"  >&lt;p&gt;&lt;del&gt;&quot;Yingjin Qian &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch:&lt;/del&gt; &lt;a href=&quot;https://review.whamcloud.com/48260&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/48260&lt;/a&gt;&lt;br/&gt;
&lt;del&gt;Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; recovery: upgrade compatibility for new reply data&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Project: fs/lustre-release&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Branch: master&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Current Patch Set: 1&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Commit: d1480f69b71fe647af47e4ec4b502e76caf96695&lt;/del&gt;&lt;/p&gt;</comment>
                            <comment id="344042" author="qian_wc" created="Fri, 19 Aug 2022 03:31:40 +0000"  >&lt;p&gt;After discussing this with Lai, he thinks that when the reply data is found to be in the old format, we can simply drop it (by truncating the file to size 0) and rewrite the reply data header in the new format.&lt;/p&gt;

&lt;p&gt;This approach is much simpler, but recovery will fail and the clients will be evicted.&lt;/p&gt;</comment>
                            <comment id="344055" author="adilger" created="Fri, 19 Aug 2022 05:04:51 +0000"  >&lt;p&gt;The &lt;tt&gt;reply_data&lt;/tt&gt; file is &lt;b&gt;not&lt;/b&gt; the primary recovery state, since clients are listed in &lt;tt&gt;last_rcvd&lt;/tt&gt; and only &quot;extra&quot; recovery records are in &lt;tt&gt;reply_data&lt;/tt&gt;.  It should be possible to complete the client recovery using the existing file (without conversion), and then truncate the file after recovery has finished and rewrite the header to use the new magic and size.&lt;/p&gt;

&lt;p&gt;The clients will still be listed in the &lt;tt&gt;last_rcvd&lt;/tt&gt; file and do not need to be evicted, and then new records will be written in the new format.&lt;/p&gt;</comment>
                            <comment id="344063" author="gerrit" created="Fri, 19 Aug 2022 08:04:45 +0000"  >&lt;p&gt;&quot;Yingjin Qian &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/48261&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/48261&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; recovery: upgrade reply data after recovery finish&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: ab15fa93d32ec1629041c1df4a773d946759f648&lt;/p&gt;</comment>
                            <comment id="354468" author="gerrit" created="Tue, 29 Nov 2022 09:49:58 +0000"  >&lt;p&gt;&lt;del&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch:&lt;/del&gt; &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/49268&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/49268&lt;/a&gt;&lt;br/&gt;
&lt;del&gt;Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; recovery: upgrade reply data after recovery finish&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Project: fs/lustre-release&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Branch: master&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Current Patch Set: 1&lt;/del&gt;&lt;br/&gt;
&lt;del&gt;Commit: 235730ade05896237d3b6fafae6d8db07fea0283&lt;/del&gt;&lt;/p&gt;</comment>
                            <comment id="360977" author="gerrit" created="Tue, 31 Jan 2023 02:33:40 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/48261/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/48261/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; recovery: upgrade reply data after recovery finish&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: bbf0017fdea52f094c190f14fd82b9f5d0902c90&lt;/p&gt;</comment>
                            <comment id="361988" author="adilger" created="Tue, 7 Feb 2023 23:00:59 +0000"  >&lt;p&gt;I&apos;ve been bitten a few times recently by the landing of patch &lt;a href=&quot;https://review.whamcloud.com/48261&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/48261&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; recovery: upgrade reply data after recovery finish&lt;/tt&gt;&quot; when switching between branches without reformatting the filesystem (with added debugging to show why the mount was failing):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[Tue Feb  7 15:26:54 2023] LustreError: 1230:0:(tgt_lastrcvd.c:2206:tgt_reply_data_init()) testfs-MDT0000: invalid reply_data header size: 64 != 32
[Tue Feb  7 15:26:54 2023] LustreError: 1230:0:(obd_config.c:776:class_setup()) setup testfs-MDT0000 failed (-22)
[Tue Feb  7 15:26:54 2023] LustreError: 1230:0:(obd_config.c:2024:class_config_llog_handler()) MGC192.168.10.99@tcp: cfg command failed: rc = -22
[Tue Feb  7 15:26:54 2023] Lustre:    cmd=cf003 0:testfs-MDT0000  1:testfs-MDT0000_UUID  2:0  3:testfs-MDT0000-mdtlov  4:f  
[Tue Feb  7 15:26:54 2023] LustreError: 15b-f: MGC192.168.10.99@tcp: Configuration from log testfs-MDT0000 failed from MGS -22. Check client and MGS are on compatible version.
[Tue Feb  7 15:26:54 2023] LustreError: 1173:0:(tgt_mount.c:1444:server_start_targets()) failed to start server testfs-MDT0000: -22
[Tue Feb  7 15:26:54 2023] LustreError: 1173:0:(tgt_mount.c:2081:server_fill_super()) Unable to start targets: -22
[Tue Feb  7 15:26:54 2023] LustreError: 1173:0:(obd_config.c:829:class_cleanup()) Device 5 not setup
[Tue Feb  7 15:26:55 2023] Lustre: server umount testfs-MDT0000 complete
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It would be useful to backport a patch to b2_15 and b_es6_0 to allow mounting the filesystem with the new &lt;tt&gt;reply_data&lt;/tt&gt; record size during recovery, if that is possible.  It should certainly be possible if &lt;tt&gt;lrd_batch_idx == 0&lt;/tt&gt; is in the records (i.e. no clients doing WBC), which should at least be true for 2.16 clients.  This would avoid lots of support problems in the field if the MDS is upgraded to 2.16+ and then downgraded because of problems with WBC or some other new feature.&lt;/p&gt;

&lt;p&gt;I&apos;m not sure whether it would be possible for a 2.15 server to finish recovery with actual WBC client records (seems unlikely), but that becomes less critical if at least the 2.15-&amp;gt;2.16-&amp;gt;2.15 upgrade/downgrade path is handled.  It may be necessary to land support into 2.16 for the MDS to properly handle WBC record recovery, even if the WBC feature is not yet implemented there, again to allow upgrade/downgrade to work.&lt;/p&gt;

&lt;p&gt;At a minimum, the 2.16 MDS should be able to ignore such records (e.g. with &quot;&lt;tt&gt;abort_recov_mdt&lt;/tt&gt;&quot;) if it doesn&apos;t understand the format of the record, so that it isn&apos;t necessary for the user to manually mount the MDT and truncate &lt;tt&gt;reply_data&lt;/tt&gt; to recover from this problem.&lt;/p&gt;</comment>
                            <comment id="361996" author="adilger" created="Wed, 8 Feb 2023 00:11:10 +0000"  >&lt;p&gt;Yingjin, I guess the separate question is whether 2.16 with batched statahead actually &lt;b&gt;needs&lt;/b&gt; the larger &lt;tt&gt;reply_data&lt;/tt&gt; format with &lt;tt&gt;lrd_batch_idx&lt;/tt&gt;?  If not, then we should strongly consider disabling the automatic update of &lt;tt&gt;reply_data&lt;/tt&gt; to the new format in 2.16, and then only enable it in 2.17 when WBC is actually using it.  That would allow 2.16 to be able to downgrade from 2.17+ (and do batch RPC recovery) without  causing unnecessary incompatibility.  I think we would still need some basic interop in 2.15.x to handle the larger record size, but if batched statahead isn&apos;t using &lt;tt&gt;lrd_batch_idx&lt;/tt&gt; it should be quite simple (allow reading the larger records, then truncating the &lt;tt&gt;reply_data&lt;/tt&gt; file and reverting to the V1 record size, or staying with the same record size if that is simpler).&lt;/p&gt;</comment>
                            <comment id="361997" author="gerrit" created="Wed, 8 Feb 2023 00:21:57 +0000"  >&lt;p&gt;&quot;Andreas Dilger &amp;lt;adilger@whamcloud.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/49939&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/49939&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; tgt: improve messages for reply_data&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: bdbc114438ae79e5e3f7fff30808c9a5f9096158&lt;/p&gt;</comment>
                            <comment id="364137" author="adilger" created="Fri, 24 Feb 2023 22:38:50 +0000"  >&lt;p&gt;Hi Yingjin, could you please look into this.&lt;/p&gt;</comment>
                            <comment id="364173" author="qian_wc" created="Mon, 27 Feb 2023 02:33:23 +0000"  >&lt;p&gt;Hi Andreas,&lt;br/&gt;
This only happens when an MDT server of a Lustre file system is downgraded from the latest master to b_es6_0 or b2_15.&lt;br/&gt;
Batched statahead does not need the larger reply_data format with lrd_batch_idx.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;we would still need some basic interop in 2.15.x to handle the larger record size&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;This requires patching 2.15.x to recognize the reply_data V2 format and then convert the new-format data into the old V1 format. (Please note that we cannot simply truncate the new-format data; we must convert it first because the record sizes are different.)&lt;br/&gt;
If this is acceptable, I will make a patch. Then the automatic update of reply_data to the new format in 2.16 can be kept, I think.&lt;/p&gt;</comment>
                            <comment id="364179" author="adilger" created="Mon, 27 Feb 2023 05:38:04 +0000"  >&lt;p&gt;The reason this is important is to allow downgrade from a newer version of Lustre to 2.15. &lt;/p&gt;

&lt;p&gt;I think if batched statahead does not require the use of the new reply data format, then it would make sense to allow the new format to be &lt;b&gt;read&lt;/b&gt; but not actually do the upgrade until the version that &lt;b&gt;requires&lt;/b&gt; it to be enabled (I guess when actual WBC is enabled).&lt;/p&gt;

&lt;p&gt;For master, I think that just means disabling the reply_data upgrade, and then re-enabling it after the statahead patches land and b2_16 is branched. This will at least allow downgrade from 2.16 to 2.15 (without reply_data upgrade), and from 2.17+ to 2.16, but downgrading from 2.17+ to 2.15.2 would not be possible.&lt;/p&gt;

&lt;p&gt;Separately, I think it would be less complex to patch the older maintenance branches to understand the new format but not the code to do the upgrade. I don&apos;t think it would be hard to read the new format and ignore the added fields. &lt;/p&gt;</comment>
                            <comment id="364186" author="qian_wc" created="Mon, 27 Feb 2023 08:23:09 +0000"  >&lt;p&gt;Please note that in the master branch (b2_16) we write reply_data in the new format (in @tgt_reply_data_write), and the record size written by tgt_reply_data_write is larger in the current master branch.&lt;br/&gt;
This means that we must also upgrade reply_data unless we patch master to write reply_data records in the old V1 format.&lt;/p&gt;

&lt;p&gt;Thus I think the better solution here may be to leave the current master unchanged, but add downgrade support for switching from master to 2.15:&lt;/p&gt;
&lt;blockquote&gt;&lt;p&gt;This requires patching 2.15.x to recognize the reply_data V2 format and then convert the new-format data into the old V1 format. (Please note that we cannot simply truncate the new-format data; we must convert it first because the record sizes are different.)&lt;/p&gt;&lt;/blockquote&gt;

</comment>
                            <comment id="364275" author="adilger" created="Mon, 27 Feb 2023 20:36:00 +0000"  >&lt;p&gt;Since master does not actually &lt;b&gt;need&lt;/b&gt; the &lt;tt&gt;lrd_batch_idx&lt;/tt&gt; field for statahead, it makes sense to me that the &lt;tt&gt;lsd_reply_data_v2&lt;/tt&gt; format only be enabled &lt;b&gt;after&lt;/b&gt; the 2.16 release is made.  That does mean that master/2.16 would be writing the &lt;tt&gt;lsd_reply_data_v1&lt;/tt&gt; format for now, but able to mount a filesystem that has &lt;tt&gt;lsd_reply_data_v2&lt;/tt&gt; records (based on the magic).&lt;/p&gt;</comment>
                            <comment id="369474" author="gerrit" created="Fri, 14 Apr 2023 08:55:03 +0000"  >&lt;p&gt;&quot;Qian Yingjin &amp;lt;qian@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/50636&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/50636&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; target: use lsd_reply_data_v1 format by default&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4b4c1ea61bcc2029b4db9e4ca106d42eac2257a4&lt;/p&gt;</comment>
                            <comment id="369711" author="gerrit" created="Tue, 18 Apr 2023 03:22:36 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/49939/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/49939/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; tgt: improve messages for reply_data&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: b2f05051c4239e845434ea9e183d889e74a5db57&lt;/p&gt;</comment>
                            <comment id="374135" author="simmonsja" created="Thu, 1 Jun 2023 14:57:03 +0000"  >&lt;p&gt;Is this work complete?&lt;/p&gt;</comment>
                            <comment id="376040" author="adilger" created="Tue, 20 Jun 2023 23:42:26 +0000"  >&lt;p&gt;James,&lt;br/&gt;
the lack of compatibility is causing the &lt;tt&gt;clean-downgrade&lt;/tt&gt; and &lt;tt&gt;clean-downgrade-zfs&lt;/tt&gt; tests to fail with &quot;&lt;tt&gt;invalid header in reply_data&lt;/tt&gt;&quot;:&lt;br/&gt;
&lt;a href=&quot;https://testing.whamcloud.com/test_sets/26ed0085-1e4b-4c6d-acc7-ca63a5775066&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sets/26ed0085-1e4b-4c6d-acc7-ca63a5775066&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 74648:0:(tgt_lastrcvd.c:2070:tgt_reply_data_init()) lustre-MDT0000: invalid header in reply_data
LustreError: 74648:0:(obd_config.c:774:class_setup()) setup lustre-MDT0000 failed (-22)
 LustreError: 74648:0:(obd_config.c:2029:class_config_llog_handler()) MGC10.240.26.9@tcp: cfg command failed: rc = -22
Lustre:    cmd=cf003 0:lustre-MDT0000  1:lustre-MDT0000_UUID  2:0  3:lustre-MDT0000-mdtlov  4:f  

LustreError: 15b-f: MGC10.240.26.9@tcp: Configuration from log lustre-MDT0000 failed from MGS -22. Check client and MGS are on compatible version.
LustreError: 74442:0:(obd_mount_server.c:1425:server_start_targets()) failed to start server lustre-MDT0000: -22
LustreError: 74442:0:(obd_mount_server.c:2027:server_fill_super()) Unable to start targets: -22
LustreError: 74442:0:(obd_config.c:827:class_cleanup()) Device 5 not setup
LustreError: 74521:0:(ldlm_lockd.c:2500:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1686822829 with bad export cookie 6862479036220380163
LustreError: 166-1: MGC10.240.26.9@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
Lustre: server umount lustre-MDT0000 complete
LustreError: 74442:0:(super25.c:176:lustre_fill_super()) llite: Unable to mount &amp;lt;unknown&amp;gt;: rc = -22
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The patch &lt;a href=&quot;https://review.whamcloud.com/50636&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/50636&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16096&quot; title=&quot;recovery: handle  compatibility during upgrade for new replay data format&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16096&quot;&gt;LU-16096&lt;/a&gt; target: use lsd_reply_data_v1 format by default&lt;/tt&gt;&quot; still needs to land so that upgrade/downgrade continues to work.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="70916">LU-15975</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="61685">LU-14139</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i02x6v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>