<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:45:14 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-4717] (rw.c:128:ll_cl_init()) husk1: [0x280000f70:0x11c59:0x0] no active IO, please file a ticket.</title>
                <link>https://jira.whamcloud.com/browse/LU-4717</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Filing a ticket as instructed. The log file for a client is filled with stack traces from the following error. All of the stack traces are identical.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 11692:0:(rw.c:128:ll_cl_init()) husk1: [0x280000f70:0x11c59:0x0] no active IO, please file a ticket.
 Pid: 11692, comm: ksh_so_hack.bin
 Trace:
 [&amp;lt;ffffffff81005eb9&amp;gt;] try_stack_unwind+0x169/0x1b0
 [&amp;lt;ffffffff81004919&amp;gt;] dump_trace+0x89/0x450
 [&amp;lt;ffffffffa02158d7&amp;gt;] libcfs_debug_dumpstack+0x57/0x80 [libcfs]
 [&amp;lt;ffffffffa07f33ae&amp;gt;] ll_cl_init+0x21e/0x320 [lustre]
 [&amp;lt;ffffffffa07f34f8&amp;gt;] ll_readpage+0x48/0x1b0 [lustre]
 [&amp;lt;ffffffff81106418&amp;gt;] __do_page_cache_readahead+0x1e8/0x260
 [&amp;lt;ffffffff81106538&amp;gt;] force_page_cache_readahead+0x78/0xa0
 [&amp;lt;ffffffff810ff30d&amp;gt;] sys_fadvise64_64+0xdd/0x230
 [&amp;lt;ffffffff810ff46e&amp;gt;] sys_fadvise64+0xe/0x10
 [&amp;lt;ffffffff8145376b&amp;gt;] system_call_fastpath+0x16/0x1b
 [&amp;lt;00002aaaaaac11bd&amp;gt;] 0x2aaaaaac11bd
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The following messages also appeared an hour before those above (in case there&apos;s a relationship):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;LustreError: 4943:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x4:0x0] at 4753935872275117037: rc -5

LustreError: 5003:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x7:0x0] at 4753935872275117037: rc -5

LustreError: 5984:0:(mdc_request.c:1580:mdc_read_page()) husk1-MDT0000-mdc-ffff88044bc53800: read cache page: [0x280000f14:0x17fe9:0x0] at 6497832999440693922: rc -5
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Attaching log file. Dump is available if you want it.&lt;/p&gt;</description>
                <environment>Master on SLES11 SP3</environment>
        <key id="23480">LU-4717</key>
            <summary>(rw.c:128:ll_cl_init()) husk1: [0x280000f70:0x11c59:0x0] no active IO, please file a ticket.</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="bobijam">Zhenyu Xu</assignee>
                                    <reporter username="amk">Ann Koehler</reporter>
                        <labels>
                    </labels>
                <created>Wed, 5 Mar 2014 20:26:27 +0000</created>
                <updated>Wed, 5 Apr 2017 17:18:40 +0000</updated>
                            <resolved>Thu, 8 May 2014 13:19:10 +0000</resolved>
                                    <version>Lustre 2.6.0</version>
                                    <fixVersion>Lustre 2.6.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>9</watches>
                                                                            <comments>
                            <comment id="79225" author="pjones" created="Thu, 13 Mar 2014 12:52:21 +0000"  >&lt;p&gt;Bobijam&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="79244" author="amk" created="Thu, 13 Mar 2014 15:34:29 +0000"  >&lt;p&gt;I&apos;ve uploaded the kdumps of 2 nodes that exhibited this bug to:&lt;/p&gt;

&lt;p&gt;ftp.whamcloud.com:/uploads/&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-4717&quot; title=&quot;(rw.c:128:ll_cl_init()) husk1: [0x280000f70:0x11c59:0x0] no active IO, please file a ticket.&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-4717&quot;&gt;&lt;del&gt;LU-4717&lt;/del&gt;&lt;/a&gt;/LU4717_no_active_io.tgz&lt;/p&gt;

&lt;p&gt;I&apos;m not sure how much they will help. The processes issuing the errors had terminated by the time the dumps were taken, but I&apos;m passing them along in case there might be cached data structures with useful info.&lt;/p&gt;</comment>
                            <comment id="79322" author="bobijam" created="Fri, 14 Mar 2014 07:53:17 +0000"  >&lt;p&gt;please try patch &lt;a href=&quot;http://review.whamcloud.com/9658&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9658&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="79514" author="amk" created="Mon, 17 Mar 2014 15:56:23 +0000"  >&lt;p&gt;The patch is scheduled for testing this week. Will let you know the results when available.&lt;/p&gt;</comment>
                            <comment id="80072" author="jlevi" created="Mon, 24 Mar 2014 15:29:22 +0000"  >&lt;p&gt;Is the testing of change 9658 still in progress, or do you have results yet?&lt;/p&gt;</comment>
                            <comment id="80095" author="amk" created="Mon, 24 Mar 2014 18:35:17 +0000"  >&lt;p&gt;Testing is still in progress.&lt;/p&gt;</comment>
                            <comment id="80249" author="mmansk" created="Tue, 25 Mar 2014 18:40:46 +0000"  >&lt;p&gt;Finished testing this afternoon.  &lt;/p&gt;

&lt;p&gt;It looks as though this patch fixed the issue: after running IOSTRESS for 5 hours I haven&apos;t seen it. Previously it occurred within an hour of running.&lt;/p&gt;
</comment>
                            <comment id="80250" author="pjones" created="Tue, 25 Mar 2014 18:51:39 +0000"  >&lt;p&gt;Thanks Mark!&lt;/p&gt;</comment>
                            <comment id="80261" author="jay" created="Tue, 25 Mar 2014 23:44:41 +0000"  >&lt;p&gt;Is it really necessary for Cray to have fadvise() support? I would like to return an error value in this case so that fadvise() will actually be disabled.&lt;/p&gt;</comment>
                            <comment id="80327" author="spitzcor" created="Wed, 26 Mar 2014 18:24:56 +0000"  >&lt;p&gt;One way or another this bug should get cleaned up. If fadvise() won&apos;t be supported in CLIO, then we should update the Ops Manual with a discussion about that in the API section. But wouldn&apos;t it be better in the long run to actually use input from fadvise() to make good decisions about what Linux should do with the page cache, even if CLIO can&apos;t (currently) make better use of the advice?&lt;/p&gt;</comment>
                            <comment id="80328" author="jay" created="Wed, 26 Mar 2014 18:47:00 +0000"  >&lt;p&gt;The major problem with fadvise() is that it doesn&apos;t have a callback for the file system, so Lustre can only provide limited support.&lt;/p&gt;

&lt;p&gt;However, Lustre can easily support POSIX_FADV_WILLNEED, and I believe this is the most frequently used option for fadvise(). We can just check whether a lock already exists on the client side and, if so, read ahead the pages requested by fadvise(). How does this sound?&lt;/p&gt;</comment>
                            <comment id="80338" author="spitzcor" created="Wed, 26 Mar 2014 21:25:29 +0000"  >&lt;p&gt;That sounds ok to me.  I guess we&apos;ll have to wait for more use and exposure to see what the application writers will want.  We can track those needs in new tickets.&lt;/p&gt;</comment>
                            <comment id="83495" author="mmansk" created="Thu, 8 May 2014 13:10:29 +0000"  >&lt;p&gt;We&apos;ve hit this error again, repeatedly, running Sanity - test 54c this time against &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3321&quot; title=&quot;2.x single thread/process throughput degraded from 1.8&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3321&quot;&gt;&lt;del&gt;LU-3321&lt;/del&gt;&lt;/a&gt; built into our 2.6 branch.  Patch &lt;a href=&quot;http://review.whamcloud.com/9658&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/9658&lt;/a&gt; was not yet in our 2.6 build.  &lt;/p&gt;

&lt;p&gt;We had not seen this error until recently. Are there changes that are bringing it to light more often?  &lt;br/&gt;
In test 54c this fails when attempting to mount the loop device that was created, with the following in dmesg:&lt;br/&gt;
Buffer I/O error on device loop3, logical block 0&lt;br/&gt;
lost page write due to I/O error on loop3&lt;/p&gt;

&lt;p&gt;in log file: &lt;br/&gt;
mount: wrong fs type, bad option, bad superblock on /tmp/dal/loop54c,&lt;br/&gt;
       missing codepage or helper program, or other error&lt;br/&gt;
       In some cases useful info is found in syslog - try&lt;br/&gt;
       dmesg | tail  or so&lt;/p&gt;</comment>
                            <comment id="83496" author="jlevi" created="Thu, 8 May 2014 13:19:10 +0000"  >&lt;p&gt;Patch landed to Master. Please reopen the ticket if more work is needed.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                            <attachment id="14227" name="console.c0-0c0s10n2" size="2357094" author="amk" created="Wed, 5 Mar 2014 20:26:27 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzwgun:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>12967</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>