<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 03:31:16 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-16941] MDT deadlock involving lquota_wb</title>
                <link>https://jira.whamcloud.com/browse/LU-16941</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Hello! Two days ago, we upgraded the Lustre servers of Sherlock&apos;s scratch (Fir) from 2.12.9 to 2.15.3. The clients were already running 2.15.x. The server upgrade went smoothly, but I noticed one issue today: we got an alert that a Robinhood changelog reader (also running 2.15.3) was stuck, and indeed:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Jul  4 11:39:40 fir-rbh01 kernel: INFO: task robinhood:8863 blocked for more than 120 seconds.
Jul  4 11:39:40 fir-rbh01 kernel: &quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
Jul  4 11:39:40 fir-rbh01 kernel: robinhood       D ffff9bacf6375280     0  8863      1 0x00000080
Jul  4 11:39:40 fir-rbh01 kernel: Call Trace:
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac7b8bf9&amp;gt;] schedule_preempt_disabled+0x29/0x70
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac7b6ca7&amp;gt;] __mutex_lock_slowpath+0xc7/0x1e0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac7b602f&amp;gt;] mutex_lock+0x1f/0x33
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac7ae6ed&amp;gt;] lookup_slow+0x33/0xab
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac26bf8e&amp;gt;] path_lookupat+0x89e/0x8d0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac235c6b&amp;gt;] ? kmem_cache_alloc+0x19b/0x1f0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac26e16f&amp;gt;] ? getname_flags+0x4f/0x1a0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac26bfeb&amp;gt;] filename_lookup+0x2b/0xd0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac26f337&amp;gt;] user_path_at_empty+0x67/0xc0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac25e493&amp;gt;] ? fput+0x13/0x20
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac26f3a1&amp;gt;] user_path_at+0x11/0x20
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac2618a3&amp;gt;] vfs_fstatat+0x63/0xd0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac261d01&amp;gt;] SYSC_newlstat+0x31/0x70
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac1468f6&amp;gt;] ? __audit_syscall_exit+0x1f6/0x2b0
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac2621ce&amp;gt;] SyS_newlstat+0xe/0x20
Jul  4 11:39:40 fir-rbh01 kernel: [&amp;lt;ffffffffac7c539a&amp;gt;] system_call_fastpath+0x25/0x2a
Jul  4 11:39:40 fir-rbh01 kernel: INFO: task robinhood:8864 blocked for more than 120 seconds.
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I tried restarting the client and Robinhood multiple times without success; the issue persists. Looking at the MDT (fir-MDT0001), which is currently still running, I notice a constant CPU load from this process:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                         
 12875 root      20   0       0      0      0 S  13.6  0.0  39:41.42 lquota_wb_fir-M   
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Looking further with crash, there seem to be two lquota_wb threads (12871 and 12875):&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; bt 12871
PID: 12871  TASK: ffff8ccd6c9b0000  CPU: 6   COMMAND: &quot;lquota_wb_fir-M&quot;
 #0 [ffff8ccd6ca93cf0] __schedule at ffffffffb07b78d8
 #1 [ffff8ccd6ca93d58] schedule at ffffffffb07b7ca9
 #2 [ffff8ccd6ca93d68] schedule_timeout at ffffffffb07b5778
 #3 [ffff8ccd6ca93e10] qsd_upd_thread at ffffffffc175a665 [lquota]
 #4 [ffff8ccd6ca93ec8] kthread at ffffffffb00cb621
 #5 [ffff8ccd6ca93f50] ret_from_fork_nospec_begin at ffffffffb07c51dd
crash&amp;gt; bt 12875
PID: 12875  TASK: ffff8ccd6cba1080  CPU: 2   COMMAND: &quot;lquota_wb_fir-M&quot;
 #0 [ffff8ccd6cbabcf0] __schedule at ffffffffb07b78d8
 #1 [ffff8ccd6cbabd58] schedule at ffffffffb07b7ca9
 #2 [ffff8ccd6cbabd68] schedule_timeout at ffffffffb07b5778
 #3 [ffff8ccd6cbabe10] qsd_upd_thread at ffffffffc175a665 [lquota]
 #4 [ffff8ccd6cbabec8] kthread at ffffffffb00cb621
 #5 [ffff8ccd6cbabf50] ret_from_fork_nospec_begin at ffffffffb07c51dd
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;At first glance, as this involves lquota_wb, this looks similar to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15283&quot; title=&quot;The quota reint thread maybe dead lock with lquota_wb thread&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15283&quot;&gt;&lt;del&gt;LU-15283&lt;/del&gt;&lt;/a&gt;, but I can&apos;t find the same backtrace signature, and that issue should already be fixed in 2.15.3.&lt;/p&gt;

&lt;p&gt;However, the following thread looks suspicious to me, as it involves lquota and seems stuck:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;crash&amp;gt; bt 32263
PID: 32263  TASK: ffff8cdd78a21080  CPU: 5   COMMAND: &quot;mdt01_011&quot;
 #0 [ffff8ccd6b5a3640] __schedule at ffffffffb07b78d8
 #1 [ffff8ccd6b5a36a8] schedule at ffffffffb07b7ca9
 #2 [ffff8ccd6b5a36b8] schedule_timeout at ffffffffb07b5778
 #3 [ffff8ccd6b5a3760] schedule_timeout_interruptible at ffffffffb07b58fe
 #4 [ffff8ccd6b5a3770] qsd_acquire at ffffffffc175e915 [lquota]
 #5 [ffff8ccd6b5a3810] qsd_op_begin0 at ffffffffc175f6bf [lquota]
 #6 [ffff8ccd6b5a38b0] qsd_op_begin at ffffffffc1760142 [lquota]
 #7 [ffff8ccd6b5a38f8] osd_declare_qid at ffffffffc17ffcb3 [osd_ldiskfs]
 #8 [ffff8ccd6b5a3950] osd_declare_attr_qid at ffffffffc17bbdfc [osd_ldiskfs]
 #9 [ffff8ccd6b5a39a8] osd_declare_attr_set at ffffffffc17bffbe [osd_ldiskfs]
#10 [ffff8ccd6b5a39f8] lod_sub_declare_attr_set at ffffffffc1a7752c [lod]
#11 [ffff8ccd6b5a3a48] lod_declare_attr_set at ffffffffc1a5854b [lod]
#12 [ffff8ccd6b5a3ac8] mdd_attr_set at ffffffffc1d1ab32 [mdd]
#13 [ffff8ccd6b5a3b48] mdt_attr_set at ffffffffc193ac36 [mdt]
#14 [ffff8ccd6b5a3b98] mdt_reint_setattr at ffffffffc193bd3a [mdt]
#15 [ffff8ccd6b5a3c10] mdt_reint_rec at ffffffffc193e69a [mdt]
#16 [ffff8ccd6b5a3c38] mdt_reint_internal at ffffffffc1913f3c [mdt]
#17 [ffff8ccd6b5a3c78] mdt_reint at ffffffffc191e807 [mdt]
#18 [ffff8ccd6b5a3ca8] tgt_request_handle at ffffffffc13b725f [ptlrpc]
#19 [ffff8ccd6b5a3d38] ptlrpc_server_handle_request at ffffffffc1360aa3 [ptlrpc]
#20 [ffff8ccd6b5a3df0] ptlrpc_main at ffffffffc1362734 [ptlrpc]
#21 [ffff8ccd6b5a3ec8] kthread at ffffffffb00cb621
#22 [ffff8ccd6b5a3f50] ret_from_fork_nospec_begin at ffffffffb07c51dd
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;For now, I am attaching a live &quot;foreach bt&quot; taken from this MDS as &lt;span class=&quot;nobr&quot;&gt;&lt;a href=&quot;https://jira.whamcloud.com/secure/attachment/49607/49607_fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt&quot; title=&quot;fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt attached to LU-16941&quot;&gt;fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt&lt;sup&gt;&lt;img class=&quot;rendericon&quot; src=&quot;https://jira.whamcloud.com/images/icons/link_attachment_7.gif&quot; height=&quot;7&quot; width=&quot;7&quot; align=&quot;absmiddle&quot; alt=&quot;&quot; border=&quot;0&quot;/&gt;&lt;/sup&gt;&lt;/a&gt;&lt;/span&gt;. Let me know if you have any ideas about how to fix this. Thanks very much.&lt;/p&gt;</description>
                <environment></environment>
        <key id="76829">LU-16941</key>
            <summary>MDT deadlock involving lquota_wb</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="1" iconUrl="https://jira.whamcloud.com/images/icons/statuses/open.png" description="The issue is open and ready for the assignee to start work on it.">Open</status>
                    <statusCategory id="2" key="new" colorName="default"/>
                                    <resolution id="-1">Unresolved</resolution>
                                        <assignee username="wc-triage">WC Triage</assignee>
                                    <reporter username="sthiell">Stephane Thiell</reporter>
                        <labels>
                    </labels>
                <created>Tue, 4 Jul 2023 19:49:06 +0000</created>
                <updated>Thu, 6 Jul 2023 05:58:13 +0000</updated>
                                            <version>Lustre 2.15.3</version>
                                                        <due></due>
                            <votes>0</votes>
                                    <watches>4</watches>
                                                                            <comments>
                            <comment id="377511" author="sthiell" created="Wed, 5 Jul 2023 16:09:48 +0000"  >&lt;p&gt;So it looks like we have a user doing parallel chgrp operations on the same set of files, which might be triggering this issue.&lt;/p&gt;

&lt;p&gt;From fir-MDT0001:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:00020000:8.0:1688572070.004396:0:11568:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1688571770, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0001_UUID lock: ffff891c70dc2400/0x7f8123703e0ca85 lrc: 3/0,1 mode: --/PW res: [0x24007e29d:0xb9ce:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT gid 0 flags: 0x40210400000020 nid: local remote: 0x0 expref: -99 pid: 11568 timeout: 0 lvb_type: 0
00010000:00020000:19.0:1688572076.226395:0:11683:0:(ldlm_request.c:124:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1688571776, 300s ago); not entering recovery in server code, just going back to sleep ns: mdt-fir-MDT0001_UUID lock: ffff8942c4abb600/0x7f81237041bc52a lrc: 3/1,0 mode: --/PR res: [0x24007e29d:0xb9ce:0x0].0x0 bits 0x13/0x48 rrc: 5 type: IBT gid 0 flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 11683 timeout: 0 lvb_type: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;[root@fir-rbh01 ~]# lfs fid2path /fir 0x24007e29d:0xb9ce:0x0
/fir/users/sschulz/porteus_all_cells/01_Combined_Germline_Sigprofiler.pdf

# it&apos;s on MDT0001:
[root@fir-rbh01 ~]# lfs getdirstripe /fir/users/sschulz/porteus_all_cells
lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Looking at the cluster, at least three different jobs doing &lt;b&gt;chgrp&lt;/b&gt; are stuck on the same file.&lt;/p&gt;

&lt;p&gt;If I restart fir-MDT0000, the situation resolves for a few minutes after recovery completes. Debug logs from when fir-MDT0000 was restarted:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00010000:02000000:0.0:1688571753.052615:0:16567:0:(ldlm_lib.c:1760:target_finish_recovery()) fir-MDT0000: Recovery over after 3:38, of 1691 clients 1690 recovered and 1 was evicted.
...
00040000:00000400:17.0:1688571753.142247:0:16540:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 42000 seconds
00040000:00000400:9.0:1688571753.146880:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 42000 seconds
00040000:00000400:33.0:1688571753.186435:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 84000 seconds
00040000:00000400:33.0:1688571753.512182:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 126000 seconds
00040000:00000400:17.0:1688571753.613086:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 168000 seconds
00040000:00000400:41.0:1688571753.697687:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 210000 seconds
00040000:00000400:33.0:1688571753.929210:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 252000 seconds
00040000:00000400:25.0:1688571754.073319:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 294000 seconds
00040000:00000400:33.0:1688571754.161063:0:16544:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 336000 seconds
00040000:00000400:25.0:1688571754.161428:0:16540:0:(qsd_writeback.c:539:qsd_upd_thread()) fir-MDT0000: The reintegration thread [2] blocked more than 84000 seconds
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;but then the lquota_wb thread quickly starts loading the CPU again on fir-MDT0001, and Robinhood gets stuck again, this time on another of this user&apos;s files involved in a chgrp.&lt;/p&gt;</comment>
                            <comment id="377514" author="sthiell" created="Wed, 5 Jul 2023 16:20:02 +0000"  >&lt;p&gt;I confirmed that the user is simply doing a chgrp at the end of their job script (using a glob, which is not great, but nothing really crazy). We also verified that operations on one of the files in this glob are hanging.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;...
rm -rf $RESULTS_DIR/tmp*
echo &quot;results directory is ${RESULTS_DIR}&quot;
echo &quot;final directory is ${FINAL_DIR}&quot;

rm -rf $RESULTS_DIR/*output
mkdir -p $FINAL_DIR


chgrp oak_cgawad $RESULTS_DIR/*                # &amp;lt;&amp;lt;&amp;lt;
chmod g+rwx $RESULTS_DIR/*
rsync -a $RESULTS_DIR/* $FINAL_DIR

echo &quot;current directory is `pwd`&quot;
#rm -rf $RESULTS_DIR

echo &quot;### Manta SV done  ###&quot;  &lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;This was running fine with Lustre 2.12.9 servers before the upgrade to 2.15.3.&lt;/p&gt;</comment>
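The access pattern described in the two comments above can be sketched as a minimal, hypothetical reproducer: several jobs finishing at about the same time, each running chgrp over the same glob of result files. Everything here is illustrative, not from the ticket: the directory comes from mktemp, the file names and counts are made up, and the caller's own primary group is used instead of the ticket's oak_cgawad so the sketch runs anywhere (on a plain local filesystem it simply completes; the hang reported here was only observed against Lustre 2.15.3 servers).

```shell
#!/bin/sh
# Hypothetical sketch of the parallel-chgrp pattern from the job epilogues:
# concurrent `chgrp` runs over the same glob of files. Paths, file counts,
# and the group are placeholders, not taken from the ticket.
RESULTS_DIR=$(mktemp -d)
GROUP=$(id -gn)   # caller's own group, so no special group membership is needed

# populate a shared "results" directory
i=1
while [ "$i" -le 64 ]; do
    : > "$RESULTS_DIR/result_$i.out"
    i=$((i + 1))
done

# three concurrent chgrp runs over the same glob, mimicking the three stuck jobs
for job in 1 2 3; do
    chgrp "$GROUP" "$RESULTS_DIR"/* &
done
wait

echo "chgrp done on $(ls "$RESULTS_DIR" | wc -l) files"
rm -rf "$RESULTS_DIR"
```

Each chgrp is a setattr on the MDT with a quota transfer between group IDs, which is presumably why the concurrency shows up in qsd_acquire/qsd_op_begin on the server side.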
                            <comment id="377658" author="sthiell" created="Thu, 6 Jul 2023 05:58:13 +0000"  >&lt;p&gt;This might be related to &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-15880&quot; title=&quot;ASSERTION( lqe-&amp;gt;u.se.lse_pending_write == 0 )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-15880&quot;&gt;&lt;del&gt;LU-15880&lt;/del&gt;&lt;/a&gt;. I backported a patch (&lt;a href=&quot;https://review.whamcloud.com/c/fs/lustre-release/+/51588&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;quota: fix issues in reserving quota&lt;/a&gt;) that is missing from 2.15.3 and that seems to fix quota issues with chgrp. We&apos;re trying it in production now.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                            <attachment id="49607" name="fir-md1-s2_fir-MDT0001_foreach_bt_lquota_wb_20230704.txt" size="615432" author="sthiell" created="Tue, 4 Jul 2023 19:46:16 +0000"/>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|i03pkn:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9223372036854775807</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>