<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:22:38 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2132] 2.1.3 client hangs on &apos;df&apos;</title>
                <link>https://jira.whamcloud.com/browse/LU-2132</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;A front-end node, pfe3, hangs on df. Nagios reports nbp1 as unmounted.&lt;/p&gt;

&lt;p&gt;The Lustre errors logged in /var/log/messages before the hang are shown below. We had to reboot pfe3 to get nbp1 mounted again.&lt;/p&gt;

&lt;p&gt;...Oct  7 15:08:25 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490266.072028&amp;#93;&lt;/span&gt; Lustre: 7366:0:(client.c:1780:ptlrpc_expire_one_request()) @@@ Request  sent has timed out for slow reply: &lt;span class=&quot;error&quot;&gt;&amp;#91;sent 1349646649/real 1349646649&amp;#93;&lt;/span&gt;  req@ffff880024adb800 x1414695477195603/t0(0) o6-&amp;gt;nbp1-OST003c-osc-ffff880073e9d400@10.151.26.31@o2ib:6/4 lens 512/400 e 2 to 1 dl 1349647705 ref 1 fl Rpc:X/0/ffffffff rc 0/-1&lt;br/&gt;
Oct  7 15:08:25 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490266.158936&amp;#93;&lt;/span&gt; Lustre: 7366:0:(client.c:1780:ptlrpc_expire_one_request()) Skipped 41 previous similar messages&lt;br/&gt;
Oct  7 15:08:25 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490266.188374&amp;#93;&lt;/span&gt; Lustre: nbp1-OST003c-osc-ffff880073e9d400: Connection to nbp1-OST003c (at 10.151.26.31@o2ib) was lost; in progress operations using this service will wait for recovery to complete&lt;br/&gt;
Oct  7 15:08:25 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490266.252863&amp;#93;&lt;/span&gt; LustreError: 11-0: an error occurred while communicating with 10.151.26.31@o2ib. The ost_connect operation failed with -16&lt;br/&gt;
Oct  7 15:08:25 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490266.289310&amp;#93;&lt;/span&gt; LustreError: Skipped 20 previous similar messages&lt;br/&gt;
Oct  7 15:10:05 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490366.252622&amp;#93;&lt;/span&gt; LustreError: 11-0: an error occurred while communicating with 10.151.26.31@o2ib. The ost_connect operation failed with -16&lt;br/&gt;
Oct  7 15:10:05 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490366.289087&amp;#93;&lt;/span&gt; LustreError: Skipped 3 previous similar messages&lt;br/&gt;
Oct  7 15:10:39 pfe3 ntpd&lt;span class=&quot;error&quot;&gt;&amp;#91;5013&amp;#93;&lt;/span&gt;: kernel time sync status change 2001&lt;br/&gt;
Oct  7 15:11:36 pfe3 envmodule: bkup load nas&lt;br/&gt;
Oct  7 15:11:37 pfe3 envmodule: bkup load nas&lt;br/&gt;
Oct  7 15:12:15 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.137877&amp;#93;&lt;/span&gt; Lustre: nbp1-OST005f-osc-ffff880073e9d400: Connection to nbp1-OST005f (at 10.151.26.34@o2ib) was lost; in progress operations using this service will wait for recovery to complete&lt;br/&gt;
Oct  7 15:12:15 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.236273&amp;#93;&lt;/span&gt; LustreError: 167-0: This client was evicted by nbp1-OST005f; in progress operations using this service will fail.&lt;br/&gt;
Oct  7 15:12:15 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.270510&amp;#93;&lt;/span&gt; LustreError: 9848:0:(osc_lock.c:809:osc_ldlm_completion_ast()) lock@ffff88005932a358&lt;span class=&quot;error&quot;&gt;&amp;#91;2 3 0 1 1 00000000&amp;#93;&lt;/span&gt; R(1):&lt;span class=&quot;error&quot;&gt;&amp;#91;0, 18446744073709551615&amp;#93;&lt;/span&gt;@&lt;span class=&quot;error&quot;&gt;&amp;#91;0x1005f0000:0x2fa3e62:0x0&amp;#93;&lt;/span&gt; {&lt;br/&gt;
Oct  7 15:12:15 pfe3 kernel: [490496.318170] LustreError: 9848:0:(osc_lock.c:809:osc_ldlm_completion_ast())     lovsub@ffff88001513c820: [0 ffff880070c99c28 R(1):[2304, 18446744073709551615]@[0x44855761059:0x128f3:0x0]] [9 ffff880069152e98 P(0):[0, 18446744073709551615]@[0x44855761059:0x128f3:0x0]]&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: [490496.389224] LustreError: 9848:0:(osc_lock.c:809:osc_ldlm_completion_ast())     osc@ffff880070f789e0: ffff880052ef3480 40160002 0x3bf5f5ecface519f 3 ffff880029da22b8 size: 859986 mtime: 1349647704 atime: 1349647702 ctime: 1349647704 blocks: 1688&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: [490496.389229] LustreError: 9848:0:(osc_lock.c:809:osc_ldlm_completion_ast()) } lock@ffff88005932a358&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.389231&amp;#93;&lt;/span&gt; LustreError: 9848:0:(osc_lock.c:809:osc_ldlm_completion_ast()) dlmlock returned -5&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.389268&amp;#93;&lt;/span&gt; LustreError: 9848:0:(ldlm_resource.c:749:ldlm_resource_complain()) Namespace nbp1-OST005f-osc-ffff880073e9d400 resource refcount nonzero (1) after lock cleanup; forcing cleanup.&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.389271&amp;#93;&lt;/span&gt; LustreError: 9848:0:(ldlm_resource.c:755:ldlm_resource_complain()) Resource: ffff8800641f5c00 (49954402/0/0/0) (rc: 1)&lt;br/&gt;
Oct  7 15:12:16 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490496.389299&amp;#93;&lt;/span&gt; Lustre: nbp1-OST005f-osc-ffff880073e9d400: Connection restored to nbp1-OST005f (at 10.151.26.34@o2ib)&lt;br/&gt;
Oct  7 15:13:00 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490541.188284&amp;#93;&lt;/span&gt; LustreError: 11-0: an error occurred while communicating with 10.151.26.31@o2ib. The ost_connect operation failed with -16&lt;br/&gt;
Oct  7 15:13:00 pfe3 kernel: &lt;span class=&quot;error&quot;&gt;&amp;#91;490541.224748&amp;#93;&lt;/span&gt; LustreError: Skipped 7 previous similar messages&lt;br/&gt;
...&lt;/p&gt;


&lt;p&gt;The messages from ldlm_resource_complain() seem to carry the same signature as ORI-735, but here the problem occurs with 2.1.3 on our production systems.&lt;/p&gt;
</description>
                <environment>Client: kernel: sles11sp1 2.6.32.54-0.3.1.20120223-nasa&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lustre-client-2.1.3-1nasC_2.6.32.54_0.3.1.20120223_nasa&lt;br/&gt;
Server kernel: centos 6.2 2.6.32-220.4.1.el6.20120607.x86_64.lustre212&lt;br/&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;lustre-2.1.2-2nasS_ofed154_2.6.32_220.4.1.el6.20120607.x86_64.lustre212.x86_64&lt;br/&gt;
&lt;br/&gt;
&lt;a href=&quot;https://github.com/jlan/lustre-nas/tree/nas-2.1.3/&quot;&gt;https://github.com/jlan/lustre-nas/tree/nas-2.1.3/&lt;/a&gt;&lt;br/&gt;
</environment>
        <key id="16305">LU-2132</key>
            <summary>2.1.3 client hangs on &apos;df&apos;</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="4">Incomplete</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="jaylan">Jay Lan</reporter>
                        <labels>
                    </labels>
                <created>Tue, 9 Oct 2012 15:06:57 +0000</created>
                <updated>Sat, 15 Mar 2014 01:09:58 +0000</updated>
                            <resolved>Sat, 15 Mar 2014 01:09:58 +0000</resolved>
                                    <version>Lustre 2.1.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="46286" author="pjones" created="Tue, 9 Oct 2012 18:09:12 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="46303" author="bobijam" created="Wed, 10 Oct 2012 00:56:50 +0000"  >&lt;p&gt;Could you please upload the logs from the affected client node (pfe3, I guess) and from the OSS node that hosts OST005f? Thank you.&lt;/p&gt;</comment>
                            <comment id="46354" author="jaylan" created="Wed, 10 Oct 2012 21:05:17 +0000"  >&lt;p&gt;Attached are the console log of pfe3 and &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2132&quot; title=&quot;2.1.3 client hangs on &amp;#39;df&amp;#39;&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2132&quot;&gt;&lt;del&gt;LU-2132&lt;/del&gt;&lt;/a&gt;.OST005f.tgz, as requested.&lt;/p&gt;</comment>
                            <comment id="78792" author="jfc" created="Sat, 8 Mar 2014 01:19:56 +0000"  >&lt;p&gt;Hi Bobijam,&lt;br/&gt;
Is this ticket going anywhere, or did we reach a dead end?&lt;br/&gt;
Should I mark it as resolved?&lt;br/&gt;
Thanks,&lt;br/&gt;
~ jfc.&lt;/p&gt;</comment>
                            <comment id="79392" author="jfc" created="Sat, 15 Mar 2014 01:09:58 +0000"  >&lt;p&gt;Looks like we will not make any further progress on this issue.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="11965" name="LU-2132.OST005f.tgz" size="16405" author="jaylan" created="Wed, 10 Oct 2012 21:05:17 +0000"/>
                            <attachment id="11964" name="console.pfe3" size="3772" author="jaylan" created="Wed, 10 Oct 2012 21:05:17 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzv9vj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>5134</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>