<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 02:50:41 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-12221] LBUG: (statahead.c:477:ll_sai_put()) ASSERTION( !sa_has_callback(sai) )</title>
                <link>https://jira.whamcloud.com/browse/LU-12221</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash_arm64&amp;gt; dmesg | grep Build
[  409.666886] Lustre: Lustre: Build Version: 2.11.0.200_cray_79_g4c42971

 crash_arm64&amp;gt; dmesg | grep &apos;: 102742:&apos;
[63861.735486] LustreError: 102742:0:(statahead.c:478:ll_sai_put()) ASSERTION( !sa_has_callback(sai) ) failed:
[63861.744009] LustreError: 102742:0:(statahead.c:478:ll_sai_put()) LBUG

crash_arm64&amp;gt; bt
PID: 102742  TASK: ffff80be57ca8000  CPU: 96  COMMAND: &quot;ll_sa_102181&quot;
 #0 [ffff00001fceba80] __crash_kexec at ffff000008138910
 #1 [ffff00001fcebbb0] (null) at ffffffc4
 #2 [ffff00001fcebbe0] __const_udelay at ffff00000837dba0
 #3 [ffff00001fcebbf0] panic at ffff0000081c6d6c
 #4 [ffff00001fcebcc0] lbug_with_loc at ffff000000a7c9e4 [libcfs]
 #5 [ffff00001fcebce0] ll_sai_put at ffff0000013d5658 [lustre]   @ llite/statahead.c: 476
 #6 [ffff00001fcebd50] ll_statahead_thread at ffff0000013da360 [lustre]    @ llite/statahead.c: 1204
 #7 [ffff00001fcebe70] kthread at ffff0000080cb3a4
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This LBUG was first seen with a Lustre 2.7 client. It can be easily reproduced with a heavy load of LTP statahead tests on a system with ~300 ARM nodes with 64 cores per node.&lt;/p&gt;

&lt;p&gt;The situation at the time of the LBUG: the ll_statahead_thread ll_sa_102181 has been stopped (sai_thread.t_flags == SVC_STOPPED) and is making its final call to ll_sai_put before returning through the out: code block. The LBUG is issued because a new entry has been placed on the sai_interim_entries list: a reply from the MDS was received and handled by ll_statahead_interpret. The ll_statahead_thread did not wait for the RPC to be completely processed before trying to exit.&lt;/p&gt;

&lt;p&gt;Dump available upon request.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Values from bt -FF:
ll_inode_info ffff809d0d6d99c0
inode ffff809d0d6d9a48
ll_statahead_info ffff809d04566800

crash_arm64&amp;gt; epython sb_inodes -l ffff80be10087800 | grep ffff809d0d6d9a48
0xffff80be10087800 lustre       0xffff809d0d6d9a48 144119544061917665         4096      1 0xffff809d80728cc0 /lus/snx11260/ostest.vers/alsorun.20190125183527.25496.ranger/CL_LLTPreadlinkat02.3.N2EP2O.1548522513/reaIck9AT (deleted)
 
struct ll_statahead_info {
  sai_dentry = 0xffff809d80728cc0,
  sai_refcount = {
    counter = 0
  },
  sai_max = 2,
  sai_sent = 1,
  sai_replied = 1,
  sai_index = 2,
  sai_index_wait = 0,
  sai_hit = 0,
  sai_miss = 0,
  sai_consecutive_miss = 0,
  sai_miss_hidden = 0,
  sai_skip_hidden = 0,
  sai_ls_all = 0,
  sai_agl_valid = 0,
  sai_in_readpage = 0,
....
  sai_thread = {
    t_link = {
      next = 0x0,
      prev = 0x0
    },
    t_data = 0x0,
    t_flags = 1,
    t_id = 0,
    t_pid = 102742,
....
  sai_interim_entries = {
    next = 0xffff80be5aae9480,
    prev = 0xffff80be5aae9480
  },
....
struct sa_entry {
  se_list = {
    next = 0xffff809d04566958,
    prev = 0xffff809d04566958
  },
  se_hash = {
    next = 0xffff809d04566988,
    prev = 0xffff809d04566988
  },
  se_index = 1,
  se_handle = 5636371587488280727,
  se_state = SA_ENTRY_INIT,
  se_size = 128,
  se_minfo = 0xffff80be5b17c400,
  se_req = 0xffff80be345d1340,
  se_inode = 0x0,
  se_qstr = {
    {
      {
        hash = 3580636640,
        len = 12
      },
      hash_len = 55120244192
    },
    name = 0xffff80be5aae94f0 &quot;symlink_file&quot;
  },
  se_fid = {
    f_seq = 8590194229,
    f_oid = 31203,
    f_ver = 0
  }

crash_arm64&amp;gt; ll_statahead_info.sai_thread ffff809d04566800 | grep t_flags
    t_flags = 1,     == SVC_STOPPED

crash_arm64&amp;gt; epython dumprpc 0xffff80be345d1340
thread         pid    ptlrpc_request      xid                nid                 opc  phase:flags    R:W  sent/deadline          ptlrpc_body
==================================================================================================================================================
               7495   0xffff80be345d1340  1623677299024992   snx11260-MDT0000    COMP:R         0:0  1548522747/1548522842  0xffff80be1295d838
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I think the likely cause of the bug is a tricky timing issue on ARM processors, which use a weak memory consistency model. There&apos;s a race between a ptlrpcd daemon handling a statahead reply and the ll_statahead_thread trying to clear entries from its replied list when the thread tries to terminate. The ll_statahead_thread sees the updates made by the RPC handler out of order.&lt;/p&gt;

&lt;p&gt;While terminating, the ll_statahead_thread waits for replies to all RPCs it has already sent. It then clears the entries from the reply list created by the RPC handler by calling sa_handle_callback. If the list is empty, sa_handle_callback does nothing.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;1184         while (sai-&amp;gt;sai_sent != sai-&amp;gt;sai_replied) {
1185                 /* in case we&apos;re not woken up, timeout wait */
1186                 lwi = LWI_TIMEOUT(msecs_to_jiffies(MSEC_PER_SEC &amp;gt;&amp;gt; 3),
1187                                   NULL, NULL);
1188                 l_wait_event(sa_thread-&amp;gt;t_ctl_waitq,
1189                         sai-&amp;gt;sai_sent == sai-&amp;gt;sai_replied, &amp;amp;lwi);
1190         }
1191
1192         /* release resources held by statahead RPCs */
1193         sa_handle_callback(sai);    
.....
1204         ll_sai_put(sai);              &amp;lt;---- LBUG 
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Meanwhile, a reply is received and handled by a ptlrpcd daemon. It executes the following code from ll_statahead_interpret:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;700         spin_lock(&amp;amp;lli-&amp;gt;lli_sa_lock);
701         if (rc != 0) {
702                 if (__sa_make_ready(sai, entry, rc))
703                         waitq = &amp;amp;sai-&amp;gt;sai_waitq;
704         } else {
705                 entry-&amp;gt;se_minfo = minfo;
706                 entry-&amp;gt;se_req = ptlrpc_request_addref(req);
....
711                 entry-&amp;gt;se_handle = handle;
712                 if (!sa_has_callback(sai))
713                         waitq = &amp;amp;sai-&amp;gt;sai_thread.t_ctl_waitq;
714
715                 list_add_tail(&amp;amp;entry-&amp;gt;se_list, &amp;amp;sai-&amp;gt;sai_interim_entries);       &amp;lt;--- list of rpc replies
716         }
717         sai-&amp;gt;sai_replied++;
718
719         smp_mb();
720         if (waitq != NULL)
721                 wake_up(waitq);
722         spin_unlock(&amp;amp;lli-&amp;gt;lli_sa_lock);
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;The LBUG occurs because sai_sent equals sai_replied but the sai_interim_entries list is not empty. The likely explanation for this state is that ll_statahead_thread sees the increment of sai_replied made by the ptlrpcd daemon but doesn&apos;t see the update to the sai_interim_entries list. ll_statahead_thread exits the wait event loop and calls sa_handle_callback. sa_handle_callback then sees an empty list and returns without processing anything. But by the time ll_sai_put is called, the update to sai_interim_entries is visible and triggers the assertion.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt; 645 static void sa_handle_callback(struct ll_statahead_info *sai)
 646 {
 647         struct ll_inode_info *lli;
 648
 649         lli = ll_i2info(sai-&amp;gt;sai_dentry-&amp;gt;d_inode);
 650
 651         while (sa_has_callback(sai)) {         &amp;lt;--- list update not yet visible
 652                 struct sa_entry *entry;
 653
 654                 spin_lock(&amp;amp;lli-&amp;gt;lli_sa_lock);
 655                 if (unlikely(!sa_has_callback(sai))) {
 656                         spin_unlock(&amp;amp;lli-&amp;gt;lli_sa_lock);
 657                         break;
 658                 }
 659                 entry = list_entry(sai-&amp;gt;sai_interim_entries.next,
 660                                    struct sa_entry, se_list);
 661                 list_del_init(&amp;amp;entry-&amp;gt;se_list);
 662                 spin_unlock(&amp;amp;lli-&amp;gt;lli_sa_lock);
 663
 664                 sa_instantiate(sai, entry);
 665         }
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;There are two possible fixes:&lt;br/&gt;
1. Add a memory barrier in ll_statahead_interpret to enforce the expected update ordering.&lt;br/&gt;
2. Hold the lli_sa_lock spinlock whenever sa_has_callback() is called in sa_handle_callback().&lt;/p&gt;

&lt;p&gt;I think option 2 is the better solution: it is easier to understand and maintain. Patch is on the way.&lt;/p&gt;
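&lt;p&gt;A minimal user-space sketch of the locking discipline behind fix 2, written in Python purely for illustration. It is not the patch and not Lustre code: StataheadModel, interpret_reply, wait_for_replies, handle_callback and run_model are hypothetical names, a threading.Condition stands in for lli_sa_lock plus the wait queue, and a deque stands in for sai_interim_entries. Python cannot reproduce the weak-memory reordering itself; the sketch only shows the pattern of testing the list for emptiness while the lock is held.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
import threading
from collections import deque


class StataheadModel:
    """Toy stand-in for ll_statahead_info (hypothetical, not Lustre code)."""

    def __init__(self, sent):
        # One Condition plays the role of lli_sa_lock plus the wait queue.
        self.cond = threading.Condition(threading.Lock())
        self.interim_entries = deque()  # stands in for sai_interim_entries
        self.sent = sent                # stands in for sai_sent
        self.replied = 0                # stands in for sai_replied
        self.instantiated = []

    def interpret_reply(self, entry):
        """Reply handler: queue the entry and count it under the lock."""
        with self.cond:
            self.interim_entries.append(entry)
            self.replied += 1
            self.cond.notify_all()

    def wait_for_replies(self):
        """Terminating thread: block until every sent RPC has a reply."""
        with self.cond:
            self.cond.wait_for(lambda: self.replied == self.sent)

    def handle_callback(self):
        """Drain the list; emptiness is only ever tested with the lock held."""
        while True:
            with self.cond:
                if not self.interim_entries:
                    return
                entry = self.interim_entries.popleft()
            # Process outside the lock, mirroring sa_instantiate() being
            # called after spin_unlock() in the original listing.
            self.instantiated.append(entry)


def run_model(n_replies):
    """Run one reply-handler thread against one terminating thread."""
    sai = StataheadModel(sent=n_replies)

    def reply_handler():
        for i in range(n_replies):
            sai.interpret_reply(i)

    handler = threading.Thread(target=reply_handler)
    handler.start()
    sai.wait_for_replies()
    sai.handle_callback()
    # Counterpart of the ll_sai_put() assertion: nothing may be left queued.
    assert not sai.interim_entries
    handler.join()
    return sai
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In this model, run_model(200) returns with all 200 entries processed and the queue empty, because the emptiness check and the list update made by the reply handler are serialized by the same lock. The same reasoning is why option 2 needs no explicit barrier: acquiring the lock orders the reader after everything the handler did before releasing it.&lt;/p&gt;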
</description>
                <environment></environment>
        <key id="55483">LU-12221</key>
            <summary>LBUG: (statahead.c:477:ll_sai_put()) ASSERTION( !sa_has_callback(sai) )</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="3" iconUrl="https://jira.whamcloud.com/images/icons/priorities/major.svg">Major</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="amk">Ann Koehler</assignee>
                                    <reporter username="amk">Ann Koehler</reporter>
                        <labels>
                    </labels>
                <created>Wed, 24 Apr 2019 22:13:59 +0000</created>
                <updated>Thu, 16 Sep 2021 07:48:25 +0000</updated>
                            <resolved>Sat, 25 May 2019 05:33:39 +0000</resolved>
                                    <version>Lustre 2.11.0</version>
                                    <fixVersion>Lustre 2.13.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="246370" author="gerrit" created="Thu, 25 Apr 2019 19:25:01 +0000"  >&lt;p&gt;Ann Koehler (amk@cray.com) uploaded a new patch: &lt;a href=&quot;https://review.whamcloud.com/34760&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34760&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12221&quot; title=&quot;LBUG: (statahead.c:477:ll_sai_put()) ASSERTION( !sa_has_callback(sai) )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12221&quot;&gt;&lt;del&gt;LU-12221&lt;/del&gt;&lt;/a&gt; statahead: sa_handle_callback get lli_sa_lock earlier&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 702009c4b4f74ad77faf3b6743a21223faeca862&lt;/p&gt;</comment>
                            <comment id="246380" author="adilger" created="Thu, 25 Apr 2019 22:27:00 +0000"  >&lt;p&gt;I had to laugh a bit at your comment:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;It can be easily reproduced with a heavy load of LTP statahead tests on a system with ~300 ARM nodes with 64 cores per node.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Only in HPC/Cray is a 300 clients x 24 cores test an &quot;easy reproducer&quot;. &lt;img class=&quot;emoticon&quot; src=&quot;https://jira.whamcloud.com/images/icons/emoticons/smile.png&quot; height=&quot;16&quot; width=&quot;16&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/p&gt;

&lt;p&gt;Thanks as always for your excellent analysis. I agree that using the spinlock is more easily understood and maintained compared to a memory barrier. &lt;/p&gt;</comment>
                            <comment id="246563" author="amk" created="Tue, 30 Apr 2019 18:45:30 +0000"  >&lt;p&gt;Andreas, Glad you caught the humor. I chuckled a bit myself when I wrote that. But in the Cray world, being able to consistently reproduce a bug with a small test suite on a specific configuration does constitute easy even if the config is not one many people have.&lt;/p&gt;</comment>
                            <comment id="247685" author="gerrit" created="Sat, 25 May 2019 04:53:42 +0000"  >&lt;p&gt;Oleg Drokin (green@whamcloud.com) merged in patch &lt;a href=&quot;https://review.whamcloud.com/34760/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://review.whamcloud.com/34760/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-12221&quot; title=&quot;LBUG: (statahead.c:477:ll_sai_put()) ASSERTION( !sa_has_callback(sai) )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-12221&quot;&gt;&lt;del&gt;LU-12221&lt;/del&gt;&lt;/a&gt; statahead: sa_handle_callback get lli_sa_lock earlier&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 31ef093c219798696e25535941f48636af662802&lt;/p&gt;</comment>
                            <comment id="247700" author="pjones" created="Sat, 25 May 2019 05:33:39 +0000"  >&lt;p&gt;Landed for 2.13&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i00fbj:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>