<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:15:06 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-15059] Setting several tbf rules at the same time causes crashes</title>
                <link>https://jira.whamcloud.com/browse/LU-15059</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;I was trying to reproduce another TBF bug (&lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15056&quot; title=&quot;Overflow when setting a tbf rule name&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15056&quot;&gt;&lt;del&gt;LU-15056&lt;/del&gt;&lt;/a&gt;) when I triggered this one.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Reproducer&lt;/b&gt;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
start rule_name1 uid={500} rate=100 &amp;amp; \
start rule_name2 uid={1000} rate=2000 &amp;amp;
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;b&gt;Crash&lt;/b&gt;&lt;/p&gt;
&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
[ 5940.060061] BUG: unable to handle kernel paging request at 00000000deadbeef
[ 5940.060112] IP: [&amp;lt;ffffffffa978f100&amp;gt;] strlen+0x0/0x30
[ 5940.060157] PGD 80000000d084d067 PUD 0
[ 5940.060188] Oops: 0000 [#1] SMP 
[ 5940.060198] Modules linked in: osp(OE) mdd(OE) lod(OE) mdt(OE) lfsck(OE) mgs(OE) mgc(OE) osd_ldiskfs(OE) lquota(OE) lustre(OE) lmv(OE) mdc(OE) lov(OE) osc(OE) fid(OE) fld(OE) ksocklnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) ldiskfs(OE) mbcache jbd2 libcfs(OE) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc iosf_mbi crc32_pclmul ppdev ghash_clmulni_intel snd_intel8x0 snd_ac97_codec ac97_bus snd_seq aesni_intel snd_seq_device snd_pcm lrw gf128mul glue_helper ablk_helper cryptd snd_timer snd pcspkr sg soundcore parport_pc vboxguest(OE) i2c_piix4 parport video ip_tables xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci ttm libahci crct10dif_pclmul crct10dif_common
[ 5940.060571]  drm ata_piix serio_raw crc32c_intel libata e1000 drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
[ 5940.060641] CPU: 3 PID: 8460 Comm: lctl Kdump: loaded Tainted: G           OE  ------------   3.10.0-1127.8.2.el7_lustre.x86_64 #1
[ 5940.060682] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 5940.060722] task: ffff8bac04d9d230 ti: ffff8bac9a8f8000 task.ti: ffff8bac9a8f8000
[ 5940.060768] RIP: 0010:[&amp;lt;ffffffffa978f100&amp;gt;]  [&amp;lt;ffffffffa978f100&amp;gt;] strlen+0x0/0x30
[ 5940.060786] RSP: 0018:ffff8bac9a8fbe20  EFLAGS: 00010246
[ 5940.060798] RAX: 0000000000000001 RBX: ffff8baccc8dc000 RCX: 0000000000000000
[ 5940.060813] RDX: 0000000000000713 RSI: 0000000000000000 RDI: 00000000deadbeef
[ 5940.060828] RBP: ffff8bac9a8fbe38 R08: 000000000000076b R09: 0000000000000000
[ 5940.060843] R10: 0000000000000000 R11: 000000000000000f R12: 00000000deadbeef
[ 5940.060858] R13: 0000000000000000 R14: 0000000000000003 R15: ffff8bac741ea240
[ 5940.060874] FS:  00007f56cf31c740(0000) GS:ffff8bacdfd80000(0000) knlGS:0000000000000000
[ 5940.060891] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5940.060904] CR2: 00000000deadbeef CR3: 0000000009fb6000 CR4: 00000000000606e0
[ 5940.060921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5940.060960] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5940.060997] Call Trace:
[ 5940.061054]  [&amp;lt;ffffffffc0e867f3&amp;gt;] ? nrs_tbf_generic_cmd_fini+0x53/0x100 [ptlrpc]
[ 5940.061118]  [&amp;lt;ffffffffc0e86905&amp;gt;] nrs_tbf_cmd_fini.part.30+0x65/0xd0 [ptlrpc]
[ 5940.061607]  [&amp;lt;ffffffffc0e8a91d&amp;gt;] ptlrpc_lprocfs_nrs_tbf_rule_seq_write+0xbdd/0x1030 [ptlrpc]
[ 5940.062062]  [&amp;lt;ffffffffa964d1b0&amp;gt;] vfs_write+0xc0/0x1f0
[ 5940.062579]  [&amp;lt;ffffffffa9b92e15&amp;gt;] ? system_call_after_swapgs+0xa2/0x13a
[ 5940.063031]  [&amp;lt;ffffffffa964df7f&amp;gt;] SyS_write+0x7f/0xf0
[ 5940.063540]  [&amp;lt;ffffffffa9b92e15&amp;gt;] ? system_call_after_swapgs+0xa2/0x13a
[ 5940.063979]  [&amp;lt;ffffffffa9b92ed2&amp;gt;] system_call_fastpath+0x25/0x2a
[ 5940.064482]  [&amp;lt;ffffffffa9b92e15&amp;gt;] ? system_call_after_swapgs+0xa2/0x13a
[ 5940.064923] Code: 89 f8 48 89 e5 f6 82 20 ad c5 a9 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 20 ad c5 a9 20 75 f0 5d c3 66 0f 1f 44 00 00 &amp;lt;80&amp;gt; 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80
[ 5940.066420] RIP  [&amp;lt;ffffffffa978f100&amp;gt;] strlen+0x0/0x30
[ 5940.066856]  RSP &amp;lt;ffff8bac9a8fbe20&amp;gt;
[ 5940.067317] CR2: 00000000deadbeef
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;b&gt;Analysis&lt;/b&gt;&lt;br/&gt;
After some investigation, it seems that the crash is caused by a double kfree of the &quot;cmd-&amp;gt;u.tc_start.ts_conds_str&quot; pointer in nrs_tbf_generic_cmd_fini(). In the debug log below, the same address is freed by both PID 8459 and PID 8460:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
00000100:00000010:3.0:1633132460.865076:0:8460:0:(nrs_tbf.c:1770:nrs_tbf_expression_free()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;expr&apos;&lt;/span&gt;: 48 at ffff8bac99516340.
00000100:00000010:1.0:1633132460.865077:0:8459:0:(nrs_tbf.c:1808:nrs_tbf_generic_cmd_fini()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;cmd-&amp;gt;u.tc_start.ts_conds_str&apos;&lt;/span&gt;: 35 at ffff8bac99516f40.  &amp;lt;--------------------------
00000100:00000010:3.0:1633132460.865077:0:8460:0:(nrs_tbf.c:1786:nrs_tbf_conjunction_free()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;conjunction&apos;&lt;/span&gt;: 32 at ffff8bac90937840.
00000100:00000010:1.0:1633132460.865078:0:8459:0:(nrs_tbf.c:3641:ptlrpc_lprocfs_nrs_tbf_rule_seq_write()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;cmd&apos;&lt;/span&gt;: 144 at ffff8baccc8dc000.
00000100:00000010:1.0:1633132460.865078:0:8459:0:(nrs_tbf.c:3643:ptlrpc_lprocfs_nrs_tbf_rule_seq_write()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;kernbuf&apos;&lt;/span&gt;: 4096 at ffff8bacc96eb000.
00000100:00000010:3.0:1633132460.865078:0:8460:0:(nrs_tbf.c:1808:nrs_tbf_generic_cmd_fini()) kfreed &lt;span class=&quot;code-quote&quot;&gt;&apos;cmd-&amp;gt;u.tc_start.ts_conds_str&apos;&lt;/span&gt;: 35 at ffff8bac99516f40.  &amp;lt;--------------------------
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is possible because of the following code:&lt;/p&gt;

&lt;div class=&quot;code panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;codeContent panelContent&quot;&gt;
&lt;pre class=&quot;code-java&quot;&gt;
&lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; ssize_t
ptlrpc_lprocfs_nrs_tbf_rule_seq_write(struct file *file,
                                      &lt;span class=&quot;code-keyword&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;code-object&quot;&gt;char&lt;/span&gt; __user *buffer,
                                      size_t count, loff_t *off)
{
        struct seq_file           *m = file-&amp;gt;private_data;
        struct ptlrpc_service     *svc = m-&amp;gt;&lt;span class=&quot;code-keyword&quot;&gt;private&lt;/span&gt;;
        &lt;span class=&quot;code-object&quot;&gt;char&lt;/span&gt;                      *kernbuf;
        &lt;span class=&quot;code-object&quot;&gt;char&lt;/span&gt;                      *val;
        &lt;span class=&quot;code-object&quot;&gt;int&lt;/span&gt;                        rc;
        &lt;span class=&quot;code-keyword&quot;&gt;static&lt;/span&gt; struct nrs_tbf_cmd *cmd;                                         &amp;lt;----------------
        &lt;span class=&quot;code-keyword&quot;&gt;enum&lt;/span&gt; ptlrpc_nrs_queue_type queue = PTLRPC_NRS_QUEUE_BOTH;
...

        cmd = nrs_tbf_parse_cmd(val, length, nrs_tbf_type_flag(svc, queue));    &amp;lt;----------------
        &lt;span class=&quot;code-keyword&quot;&gt;if&lt;/span&gt; (IS_ERR(cmd))
                GOTO(out_free_kernbuff, rc = PTR_ERR(cmd));
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The static &quot;cmd&quot; pointer is shared by every caller of the function, so it is overwritten by the second thread, and both threads then free the same command.&lt;br/&gt;
I don&apos;t see why this pointer should be static: it is allocated and freed within the ptlrpc_lprocfs_nrs_tbf_rule_seq_write() function.&lt;/p&gt;</description>
                <environment></environment>
        <key id="66480">LU-15059</key>
            <summary>Setting several tbf rules at the same time causes crashes</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="4" iconUrl="https://jira.whamcloud.com/images/icons/priorities/minor.svg">Minor</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="eaujames">Etienne Aujames</assignee>
                                    <reporter username="eaujames">Etienne Aujames</reporter>
                        <labels>
                    </labels>
                <created>Tue, 5 Oct 2021 09:25:04 +0000</created>
                <updated>Mon, 22 Jan 2024 11:02:35 +0000</updated>
                            <resolved>Tue, 30 Nov 2021 13:56:42 +0000</resolved>
                                                    <fixVersion>Lustre 2.15.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>5</watches>
                                                                            <comments>
                            <comment id="314863" author="gerrit" created="Wed, 6 Oct 2021 20:32:46 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/45142&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45142&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15059&quot; title=&quot;Setting several tbf rules at the same time causes crashes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15059&quot;&gt;&lt;del&gt;LU-15059&lt;/del&gt;&lt;/a&gt; nrs: do not overwrite &quot;cmd&quot; in nrs_tbf_rule&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: d883ed5ab0ef385ce1fbd6391c4e4bff48ed5616&lt;/p&gt;</comment>
                            <comment id="319477" author="gerrit" created="Tue, 30 Nov 2021 03:47:52 +0000"  >&lt;p&gt;&quot;Oleg Drokin &amp;lt;green@whamcloud.com&amp;gt;&quot; merged in patch &lt;a href=&quot;https://review.whamcloud.com/45142/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/45142/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15059&quot; title=&quot;Setting several tbf rules at the same time causes crashes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15059&quot;&gt;&lt;del&gt;LU-15059&lt;/del&gt;&lt;/a&gt; nrs: do not overwrite &quot;cmd&quot; in nrs_tbf_rule&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: ebef4989e39ef8cae29edcf26fa2ee16b6106ad6&lt;/p&gt;</comment>
                            <comment id="319548" author="pjones" created="Tue, 30 Nov 2021 13:56:42 +0000"  >&lt;p&gt;Landed for 2.15&lt;/p&gt;</comment>
                            <comment id="322066" author="gerrit" created="Fri, 7 Jan 2022 16:55:09 +0000"  >&lt;p&gt;&quot;Etienne AUJAMES &amp;lt;eaujames@ddn.com&amp;gt;&quot; uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/46002&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/46002&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15059&quot; title=&quot;Setting several tbf rules at the same time causes crashes&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15059&quot;&gt;&lt;del&gt;LU-15059&lt;/del&gt;&lt;/a&gt; nrs: do not overwrite &quot;cmd&quot; in nrs_tbf_rule&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_12&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: c1b8794782bd33bf7d6cfc7fabf04c0702f8f49c&lt;/p&gt;</comment>
                            <comment id="400429" author="adilger" created="Sat, 20 Jan 2024 04:29:44 +0000"  >&lt;p&gt;Etienne, I was running an interop test between master (2.15.60) and 2.15.4 running sanityn, and test_77q failed:&lt;br/&gt;
&lt;a href=&quot;https://testing.whamcloud.com/test_sessions/41cf8a97-7a9b-4423-ab3a-9654579f02e7&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://testing.whamcloud.com/test_sessions/41cf8a97-7a9b-4423-ab3a-9654579f02e7&lt;/a&gt;&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;== sanityn test 77q: Parallel TBF rule definitions should not panic ==== 19:09:35 (1705691375)
CMD: trevis-101vm6 /usr/sbin/lctl set_param mds.MDS.mdt.nrs_policies=tbf
mds.MDS.mdt.nrs_policies=tbf
CMD: trevis-101vm6 /usr/sbin/lctl set_param mds.MDS.mdt.nrs_tbf_rule=&apos;start rule77q_1 uid={ 500  11 3}&amp;amp;gid={500 10 33   100 } rate=100&apos;
CMD: trevis-101vm6 /usr/sbin/lctl set_param mds.MDS.mdt.nrs_tbf_rule=&apos;start rule77q_2 uid={1000}&amp;amp;gid={1000} rate=100&apos;
trevis-101vm6: error: set_param: setting /sys/kernel/debug/lustre/mds/MDS/mdt/nrs_tbf_rule=start rule77q_1 uid={ 500  11 3}&amp;amp;gid={500 10 33   100 } rate=100: Invalid argument
mds.MDS.mdt.nrs_tbf_rule=start rule77q_2 uid={1000}&amp;amp;gid={1000} rate=100
pdsh@trevis-101vm1: trevis-101vm6: ssh exited with exit code 22
 sanityn test_77q: @@@@@@ FAIL: 1: Fail to start TBF rule &apos;rule77q_1&apos; 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Could you please look into this? Does the version check need to be updated? The patch landed in commit v2_14_55-153-gebef4989e3 and included a version check for this.&lt;/p&gt;</comment>
                            <comment id="400447" author="adilger" created="Sat, 20 Jan 2024 16:38:41 +0000"  >&lt;p&gt;It looks like test_77r is also failing in sanityn interop testing and needs to be excluded one way or another.&lt;/p&gt;</comment>
                            <comment id="400471" author="eaujames" created="Mon, 22 Jan 2024 09:01:47 +0000"  >&lt;p&gt;Yes, I will push a fix.&lt;/p&gt;</comment>
                            <comment id="400486" author="eaujames" created="Mon, 22 Jan 2024 11:02:35 +0000"  >&lt;p&gt;I pushed a fix in &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-17452&quot; title=&quot;fix interop sanityn tests with b2_15 &quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-17452&quot;&gt;LU-17452&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                        <issuelink>
            <issuekey id="80155">LU-17452</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i0268v:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>