<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:25:18 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16245] __osd_init_iobuf()) ASSERTION( iobuf-&gt;dr_elapsed_valid == 0 )</title>
                <link>https://jira.whamcloud.com/browse/LU-16245</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[ 8753.247529] LustreError: 37772:0:(osd_io.c:79:__osd_init_iobuf()) ASSERTION( iobuf-&amp;gt;dr_elapsed_valid == 0 ) failed: iobuf 000000006eba9531, reqs 0, rw 1, line 1633
[ 8753.262771] LustreError: 37772:0:(osd_io.c:79:__osd_init_iobuf()) LBUG
[ 8753.269970] Pid: 37772, comm: mdt_io05_022 5.10.0-60.18.0.50.aarch64 #1 SMP Wed Oct 5 10:58:08 CST 2022
[ 8753.280021] Call Trace TBD:
[ 8753.283505] Kernel panic - not syncing: LBUG
[ 8753.288454] CPU: 59 PID: 37772 Comm: mdt_io05_022 Kdump: loaded Tainted: P &#160; &#160; &#160; &#160; &#160; OE &#160; &#160; 5.10.0-60.18.0.50.aarch64 #1
[ 8753.299963] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.38 07/04/2020
[ 8753.308881] Call trace:
[ 8753.312014] &#160;dump_backtrace+0x0/0x1e0
[ 8753.316352] &#160;show_stack+0x20/0x30
[ 8753.320347] &#160;dump_stack+0xe0/0x148
[ 8753.324426] &#160;panic+0x170/0x398
[ 8753.328188] &#160;param_set_delay_minmax.isra.1+0x0/0xd0 [libcfs]
[ 8753.334552] &#160;__osd_init_iobuf+0x2e8/0x408 [osd_ldiskfs]
[ 8753.340454] &#160;osd_write_prep+0xec/0x330 [osd_ldiskfs]
[ 8753.346149] &#160;mdt_obd_preprw+0xaa0/0xc38 [mdt]
[ 8753.351294] &#160;tgt_brw_write+0x1208/0x2f30 [ptlrpc]
[ 8753.351367] &#160;tgt_handle_request0+0xd4/0x9b0 [ptlrpc]
[ 8753.362369] &#160;tgt_request_handle+0x7cc/0x1a30 [ptlrpc]
[ 8753.368148] &#160;ptlrpc_server_handle_request+0x3bc/0x1218 [ptlrpc]
[ 8753.374791] &#160;ptlrpc_main+0xdfc/0x16c8 [ptlrpc]
[ 8753.379910] &#160;kthread+0x130/0x138
[ 8753.383818] &#160;ret_from_fork+0x10/0x18
[ 8753.388121] SMP: stopping secondary CPUs
[ 8753.395179] Starting crashdump kernel...
[ 8753.399781] Bye!
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>lustre servers: &lt;br/&gt;
10 nodes, each node has kunpeng920 96core*2, memory 512GB, nvme 3.2T*4&lt;br/&gt;
centos 8.4.2105&lt;br/&gt;
kernel 5.10.0-60.18.0.50.aarch64 (openeuler 22.03 kernel)&lt;br/&gt;
lustre 0c68b13a5eeb408862bad795aaf9a24a11a14b6a&lt;br/&gt;
&lt;br/&gt;
lustre clients:&lt;br/&gt;
10 nodes intel 6266C*2, memory 372GB&lt;br/&gt;
centos 8.4.2105&lt;br/&gt;
kernel 4.18.0-372.9.1.el8.x86_64&lt;br/&gt;
&lt;br/&gt;
IO500 tag:io500-sc21&lt;br/&gt;
</environment>
        <key id="72835">LU-16245</key>
            <summary>__osd_init_iobuf()) ASSERTION( iobuf-&gt;dr_elapsed_valid == 0 )</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="fengchunsong">Jason Feng</assignee>
                                    <reporter username="fengchunsong">Jason Feng</reporter>
                        <labels>
                    </labels>
                <created>Tue, 18 Oct 2022 04:05:35 +0000</created>
                <updated>Tue, 29 Nov 2022 09:35:30 +0000</updated>
                                            <version>Lustre 2.15.1</version>
                                                        <due>Tue, 18 Oct 2022 00:00:00 +0000</due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="349958" author="JIRAUSER18423" created="Tue, 18 Oct 2022 06:31:07 +0000"  >&lt;p&gt;Do not modify dr_elapsed_valid if osd_fini_iobuf has been invoked.&lt;/p&gt;

&lt;p&gt;The initial value of dr_elapsed_valid is 0. When the I/O completes, dio_complete_routine sets dr_elapsed_valid to 1. Finally, osd_fini_iobuf clears dr_elapsed_valid. In the I/O write path, wait_event is not called, so osd_fini_iobuf can run before dio_complete_routine. When that happens, dio_complete_routine sets dr_elapsed_valid to 1 after it was cleared, so the flag is still set when the iobuf is reused and the assertion fails.&lt;br/&gt;
Proposed change: the initial value of dr_elapsed_valid stays 0, and osd_fini_iobuf changes it to 2. dio_complete_routine changes dr_elapsed_valid to 1 only if it is still 0. This prevents the flag from being modified after osd_fini_iobuf has run.&lt;/p&gt;</comment>
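<!--
Editor's sketch of the comment above: a minimal, single-threaded Python model of the
dr_elapsed_valid states, assuming the ordering described in the comment (osd_fini_iobuf
running before dio_complete_routine on the write path). The class IobufModel, the helper
reuse_after_late_completion and the constant names are hypothetical; only the function
and field names they imitate come from the ticket. How the patched __osd_init_iobuf
treats the value 2 is not stated in the ticket, so accepting it as clean is an assumption.
This is not Lustre code.

```python
# Simplified model of the dr_elapsed_valid flag; illustrative assumptions only.
IDLE, VALID, FINISHED = 0, 1, 2


class IobufModel:
    def __init__(self, guarded):
        self.guarded = guarded          # True: apply the proposed guard
        self.dr_elapsed_valid = IDLE

    def init_iobuf(self):
        # Original check: the flag must be clear before the iobuf is reused.
        # Assumption: in the guarded variant FINISHED also counts as clean.
        clean = (IDLE, FINISHED) if self.guarded else (IDLE,)
        if self.dr_elapsed_valid not in clean:
            raise AssertionError("dr_elapsed_valid == 0 failed")
        self.dr_elapsed_valid = IDLE

    def dio_complete_routine(self):
        if self.guarded:
            # Mark the elapsed time valid only if fini has not run yet.
            if self.dr_elapsed_valid == IDLE:
                self.dr_elapsed_valid = VALID
        else:
            self.dr_elapsed_valid = VALID

    def fini_iobuf(self):
        self.dr_elapsed_valid = FINISHED if self.guarded else IDLE


def reuse_after_late_completion(guarded):
    """Run fini before the completion callback, then reuse the iobuf."""
    buf = IobufModel(guarded)
    buf.init_iobuf()
    buf.fini_iobuf()            # write path did not wait for completion
    buf.dio_complete_routine()  # completion arrives late
    try:
        buf.init_iobuf()
    except AssertionError:
        return "LBUG"
    return "ok"
```

Calling reuse_after_late_completion(False) follows the unguarded sequence and ends in the
failed check, while reuse_after_late_completion(True) follows the guarded one. In kernel
code the check and the store in dio_complete_routine would presumably need to be a single
atomic step; this model does not cover concurrency.
-->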
                            <comment id="349964" author="gerrit" created="Tue, 18 Oct 2022 07:21:31 +0000"  >&lt;p&gt;&quot;fengchunsong &amp;lt;fengchunsong@huawei.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/48905&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/c/fs/lustre-release/+/48905&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-16245&quot; title=&quot;__osd_init_iobuf()) ASSERTION( iobuf-&amp;gt;dr_elapsed_valid == 0 )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-16245&quot;&gt;LU-16245&lt;/a&gt; osd-ldiskfs: prevent dr_elapsed_valid assertion&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 5bc624a9c930f5dfd38b62eb661b706c418682e0&lt;/p&gt;</comment>
                            <comment id="354467" author="xinliang" created="Tue, 29 Nov 2022 09:35:30 +0000"  >&lt;p&gt;I suspect this issue is similar to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12362&quot; title=&quot;kernel warning &amp;#39;do not call blocking ops when !TASK_RUNNING &amp;#39; in ptlrpcd&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12362&quot;&gt;&lt;del&gt;LU-12362&lt;/del&gt;&lt;/a&gt;. Nested sleeping primitives might lead to an infinite wait, so that osd_fini_iobuf() is never called, which causes this crash.&lt;/p&gt;

&lt;p&gt;For background on the problem of nested sleeping primitives, see &lt;a href=&quot;https://lwn.net/Articles/628628/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://lwn.net/Articles/628628/&lt;/a&gt;. We might need to fix this issue the same way as &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12362&quot; title=&quot;kernel warning &amp;#39;do not call blocking ops when !TASK_RUNNING &amp;#39; in ptlrpcd&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12362&quot;&gt;&lt;del&gt;LU-12362&lt;/del&gt;&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="72836">LU-16246</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                    <customfield id="customfield_10030" key="com.atlassian.jira.plugin.system.customfieldtypes:labels">
                        <customfieldname>Epic/Theme</customfieldname>
                        <customfieldvalues>
                                        <label>Performance</label>
    
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0333z:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>