<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:42:03 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-11227] client process hangs when lod_sync accesses deactivated OSTs</title>
                <link>https://jira.whamcloud.com/browse/LU-11227</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hi,&lt;/p&gt;

&lt;p&gt;we have an ldiskfs Lustre install where one OST is permanently deactivated with&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;lctl conf_param lustre-OST005a.osc.active=0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;we were in IEEL and saw no issues there, but are now in 2.10.4 to try and get better compatibility for file transfers to the new cluster.&lt;/p&gt;

&lt;p&gt;the problem is that in 2.10.4 a chgrp on the client hangs forever because it retries indefinitely. MDS load is significant too.&lt;/p&gt;

&lt;p&gt;while the chgrp hangs, the MDS logs thousands of these messages per second:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Aug  7 19:45:41 metadata01 kernel: LustreError: 4502:0:(lod_dev.c:1400:lod_sync()) lustre-MDT0000-mdtlov: can&apos;t sync ost 90: -107
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;the lod_sync() code doesn&apos;t check whether an OST is deactivated. the patch below seems to work as a quick fix so that we don&apos;t have to reboot back into IEEL.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;diff --git a/lustre/lod/lod_dev.c b/lustre/lod/lod_dev.c
index d61ad2d..2194110 100644
--- a/lustre/lod/lod_dev.c
+++ b/lustre/lod/lod_dev.c
@@ -1409,6 +1409,8 @@ static int lod_sync(const struct lu_env *env, struct dt_device *dev)
        lod_foreach_ost(lod, i) {
                ost = OST_TGT(lod, i);
                LASSERT(ost &amp;amp;&amp;amp; ost-&amp;gt;ltd_ost);
+               if (!ost-&amp;gt;ltd_active)
+                       continue;
                rc = dt_sync(env, ost-&amp;gt;ltd_ost);
                if (rc) {
                        CERROR(&quot;%s: can&apos;t sync ost %u: %d\n&quot;,
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
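
&lt;p&gt;as a rough illustration of the intended behaviour only (this is not Lustre code), the loop can be modelled in a few lines of Python. the Target class, its active and connected flags and the sync_all() helper are hypothetical names made up for this sketch, and stopping at the first failing active target is an assumption.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import errno


class Target:
    """Hypothetical stand-in for one OST entry; not a Lustre structure."""

    def __init__(self, index, active, connected=True):
        self.index = index
        self.active = active
        self.connected = connected

    def sync(self):
        # Model a disconnected target by returning -ENOTCONN (-107 on Linux).
        return 0 if self.connected else -errno.ENOTCONN


def sync_all(targets):
    """Sync every active target; return (rc, list of synced indexes)."""
    synced = []
    for tgt in targets:
        if not tgt.active:
            # Mirrors the added ltd_active check: skip deactivated targets.
            continue
        rc = tgt.sync()
        if rc:
            # Assumption: give up at the first failing active target.
            return rc, synced
        synced.append(tgt.index)
    return 0, synced


# OST 90 is deactivated and unreachable, as in the report above.
targets = [Target(89, True), Target(90, False, connected=False), Target(91, True)]
print(sync_all(targets))
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;with the deactivated target skipped, the call returns (0, [89, 91]). if the same target is marked active, the model returns -ENOTCONN instead, which corresponds to the -107 in the log line above.&lt;/p&gt;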

&lt;p&gt;the same fix would also seem to be appropriate for a deactivated MDT a few lines lower down.&lt;/p&gt;

&lt;p&gt;please let me know if this is totally the wrong thing to do, or alternatively, if it&apos;s useful and you&apos;d like me to upload a patch to Gerrit.&lt;/p&gt;

&lt;p&gt;also, a dry-run of lfsck in 2.10.4 wants to correct the layout for 15% of MDT files, and the namespace (linkea_inconsistent) for about 15k files out of 225M. presumably this is because the fs is old and/or a bit damaged from bugs and hardware failures and/or was created under IEEL. OSTs all look ok in lfsck. we have not run an lfsck for real because this all looks too intrusive.&lt;/p&gt;

&lt;p&gt;cheers,&lt;br/&gt;
robin&lt;/p&gt;</description>
                <environment>x86_64, ldiskfs, not DNE</environment>
        <key id="52928">LU-11227</key>
            <summary>client process hangs when lod_sync accesses deactivated OSTs</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="rjh">Robin Humble</assignee>
                                    <reporter username="rjh">Robin Humble</reporter>
                        <labels>
                    </labels>
                <created>Wed, 8 Aug 2018 08:01:44 +0000</created>
                <updated>Tue, 11 Sep 2018 20:44:36 +0000</updated>
                            <resolved>Thu, 23 Aug 2018 13:01:51 +0000</resolved>
                                    <version>Lustre 2.11.0</version>
                    <version>Lustre 2.10.4</version>
                                    <fixVersion>Lustre 2.12.0</fixVersion>
                    <fixVersion>Lustre 2.10.6</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>6</watches>
                                                                            <comments>
                            <comment id="231622" author="rjh" created="Wed, 8 Aug 2018 08:24:19 +0000"  >&lt;p&gt;hmm, I can&apos;t figure out how to edit the above, but ignore the Lustre-1.x stuff - I&apos;ve realised that&apos;s just the mgt. mdt is dirdata.&lt;/p&gt;

&lt;p&gt;cheers,&lt;br/&gt;
robin&lt;/p&gt;</comment>
                            <comment id="231665" author="adilger" created="Wed, 8 Aug 2018 19:08:02 +0000"  >&lt;p&gt;I&apos;ve deleted the MGS dirdata part of your comment, since this is already fixed in 2.11 via patch &lt;a href=&quot;https://review.whamcloud.com/29274&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/29274&lt;/a&gt; &quot;&lt;tt&gt;&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4923&quot; title=&quot;lfsck statistics are inconsistent&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4923&quot;&gt;&lt;del&gt;LU-4923&lt;/del&gt;&lt;/a&gt; osd-ldiskfs: dirdata is not needed on MGS&lt;/tt&gt;&quot;.&lt;/p&gt;

&lt;p&gt;Could you please submit a patch to Gerrit with your changes so that we can get it reviewed and landed?&lt;/p&gt;

&lt;p&gt;The LFSCK layout and &lt;tt&gt;linkea_inconsistent&lt;/tt&gt; issues may be holdovers from older versions of Lustre if you migrated files between OSTs but the parent FID did not get updated.  If you have a list of FIDs that LFSCK would repair, you could examine some of them manually to see what kind of issues there are: use &quot;&lt;tt&gt;lfs fid2path&lt;/tt&gt;&quot; to see if the reported pathname matches the actual pathname of the file, &quot;&lt;tt&gt;lfs getstripe $MOUNT/.lustre/fid/&amp;lt;fid&amp;gt;&lt;/tt&gt;&quot; to see that the FID stored in the layout matches that in the LMA and directory, and &quot;&lt;tt&gt;debugfs -c -R &apos;stat O/d$(($objid % 32))/$objid&apos; /dev/&amp;lt;ostdev&amp;gt;&lt;/tt&gt;&quot; to see if the objects have the correct parent (MDT) FID.&lt;/p&gt;

&lt;p&gt;Whether you fix these with LFSCK or not is up to you.  You should consider making an MDT device-level backup, whether you run the LFSCK or not.  Not repairing these issues reported by LFSCK means that if you do have some kind of corruption, LFSCK may not be able to recover as much, or may recover the filesystem incorrectly based on stale data.&lt;/p&gt;</comment>
                            <comment id="231692" author="gerrit" created="Thu, 9 Aug 2018 05:52:17 +0000"  >&lt;p&gt;Robin Humble (plaguedbypenguins@gmail.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/32964&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32964&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11227&quot; title=&quot;client process hangs when lod_sync accesses deactivated OSTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11227&quot;&gt;&lt;del&gt;LU-11227&lt;/del&gt;&lt;/a&gt; lod: lod_sync: don&apos;t attempt sync to inactive targets&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 4d06197738c6a29b98082949478837041f83a14c&lt;/p&gt;</comment>
                            <comment id="231887" author="gerrit" created="Mon, 13 Aug 2018 21:17:08 +0000"  >&lt;p&gt;John L. Hammond (jhammond@whamcloud.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/32991&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32991&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11227&quot; title=&quot;client process hangs when lod_sync accesses deactivated OSTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11227&quot;&gt;&lt;del&gt;LU-11227&lt;/del&gt;&lt;/a&gt; lod: lod_sync: don&apos;t attempt sync to inactive targets&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 576f1ee0cfb6f56028b82e28cea289576d0472fc&lt;/p&gt;</comment>
                            <comment id="232483" author="gerrit" created="Thu, 23 Aug 2018 07:18:50 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/32964/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32964/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11227&quot; title=&quot;client process hangs when lod_sync accesses deactivated OSTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11227&quot;&gt;&lt;del&gt;LU-11227&lt;/del&gt;&lt;/a&gt; lod: lod_sync: don&apos;t attempt sync to inactive targets&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 7c099467eab7b64c00e6a14c9cab7f09153571c1&lt;/p&gt;</comment>
                            <comment id="232498" author="pjones" created="Thu, 23 Aug 2018 13:01:51 +0000"  >&lt;p&gt;Landed for 2.12&lt;/p&gt;</comment>
                            <comment id="233349" author="gerrit" created="Tue, 11 Sep 2018 20:16:39 +0000"  >&lt;p&gt;John L. Hammond (jhammond@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/32991/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/32991/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-11227&quot; title=&quot;client process hangs when lod_sync accesses deactivated OSTs&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-11227&quot;&gt;&lt;del&gt;LU-11227&lt;/del&gt;&lt;/a&gt; lod: lod_sync: don&apos;t attempt sync to inactive targets&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_10&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 2898c1765adfd6d966be6e5c24511519dede0188&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                        <issuelink>
            <issuekey id="25048">LU-5152</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="53159">LU-11303</issuekey>
        </issuelink>
                            </outwardlinks>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="52952">LU-11236</issuekey>
        </issuelink>
            <issuelink>
            <issuekey id="52642">LU-11119</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i000gn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>