<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:38:10 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3931] mdt_hsm_release() calls ldlm_lock_cancel() but does not reprocess resource</title>
                <link>https://jira.whamcloud.com/browse/LU-3931</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;To reproduce, set up HSM and run:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
# cd /mnt/lustre
# touch f0
# lfs hsm_archive f0
# # Wait for the archive to complete.
# echo +hsm &amp;gt; /proc/sys/lnet/debug
# lctl clear; while true; do lfs hsm_release f0; done
...
Cannot send HSM request (use of f0): Device or resource busy
...

## In another shell:
# cd /mnt/lustre
# while true; do sys_open f0 r; done ## calls open(&quot;f0&quot;, O_RDONLY)
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;Soon the MDT handler for open() will wedge trying to get a CR OPEN lock on f0.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;# stack lfs
21597 lfs
[&amp;lt;ffffffffa03029bd&amp;gt;] mdc_enqueue+0x22d/0x1a10 [mdc]
[&amp;lt;ffffffffa030439d&amp;gt;] mdc_intent_lock+0x1fd/0x64a [mdc]
[&amp;lt;ffffffffa02c01b3&amp;gt;] lmv_intent_open+0x213/0x8d0 [lmv]
[&amp;lt;ffffffffa02c0b2b&amp;gt;] lmv_intent_lock+0x2bb/0x380 [lmv]
[&amp;lt;ffffffffa0923b25&amp;gt;] ll_revalidate_it+0x275/0x1b20 [lustre]
[&amp;lt;ffffffffa0925503&amp;gt;] ll_revalidate_nd+0x133/0x3e0 [lustre]
[&amp;lt;ffffffff81191cf6&amp;gt;] do_lookup+0x66/0x230
[&amp;lt;ffffffff811925f4&amp;gt;] __link_path_walk+0x734/0x1030
[&amp;lt;ffffffff8119317a&amp;gt;] path_walk+0x6a/0xe0
[&amp;lt;ffffffff8119334b&amp;gt;] do_path_lookup+0x5b/0xa0
[&amp;lt;ffffffff8119428b&amp;gt;] do_filp_open+0xfb/0xdc0
[&amp;lt;ffffffff8117f849&amp;gt;] do_sys_open+0x69/0x140
[&amp;lt;ffffffff8117f960&amp;gt;] sys_open+0x20/0x30
[&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff

# stack sys_open
21596 sys_open
[&amp;lt;ffffffffa0de0641&amp;gt;] cfs_waitq_timedwait+0x11/0x20 [libcfs]
[&amp;lt;ffffffffa1080730&amp;gt;] ptlrpc_set_wait+0x2f0/0x8c0 [ptlrpc]
[&amp;lt;ffffffffa1080d87&amp;gt;] ptlrpc_queue_wait+0x87/0x220 [ptlrpc]
[&amp;lt;ffffffffa105cee5&amp;gt;] ldlm_cli_enqueue+0x365/0x790 [ptlrpc]
[&amp;lt;ffffffffa0302a4e&amp;gt;] mdc_enqueue+0x2be/0x1a10 [mdc]
[&amp;lt;ffffffffa030439d&amp;gt;] mdc_intent_lock+0x1fd/0x64a [mdc]
[&amp;lt;ffffffffa02c01b3&amp;gt;] lmv_intent_open+0x213/0x8d0 [lmv]
[&amp;lt;ffffffffa02c0b2b&amp;gt;] lmv_intent_lock+0x2bb/0x380 [lmv]
[&amp;lt;ffffffffa0923b25&amp;gt;] ll_revalidate_it+0x275/0x1b20 [lustre]
[&amp;lt;ffffffffa0925503&amp;gt;] ll_revalidate_nd+0x133/0x3e0 [lustre]
[&amp;lt;ffffffff81191cf6&amp;gt;] do_lookup+0x66/0x230
[&amp;lt;ffffffff811925f4&amp;gt;] __link_path_walk+0x734/0x1030
[&amp;lt;ffffffff8119317a&amp;gt;] path_walk+0x6a/0xe0
[&amp;lt;ffffffff8119334b&amp;gt;] do_path_lookup+0x5b/0xa0
[&amp;lt;ffffffff8119428b&amp;gt;] do_filp_open+0xfb/0xdc0
[&amp;lt;ffffffff8117f849&amp;gt;] do_sys_open+0x69/0x140
[&amp;lt;ffffffff8117f960&amp;gt;] sys_open+0x20/0x30
[&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b
[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff

# stack mdt00_002
20391 mdt00_002
[&amp;lt;ffffffffa0de0641&amp;gt;] cfs_waitq_timedwait+0x11/0x20 [libcfs]
[&amp;lt;ffffffffa10620ad&amp;gt;] ldlm_completion_ast+0x4ed/0x960 [ptlrpc]
[&amp;lt;ffffffffa10617d0&amp;gt;] ldlm_cli_enqueue_local+0x1f0/0x5e0 [ptlrpc]
[&amp;lt;ffffffffa0726c9b&amp;gt;] mdt_object_lock0+0x33b/0xaf0 [mdt]
[&amp;lt;ffffffffa0727514&amp;gt;] mdt_object_lock+0x14/0x20 [mdt]
[&amp;lt;ffffffffa0750084&amp;gt;] mdt_object_open_lock+0x744/0x990 [mdt]
[&amp;lt;ffffffffa0757a3f&amp;gt;] mdt_reint_open+0xf8f/0x20a0 [mdt]
[&amp;lt;ffffffffa0740e71&amp;gt;] mdt_reint_rec+0x41/0xe0 [mdt]
[&amp;lt;ffffffffa0728c63&amp;gt;] mdt_reint_internal+0x4c3/0x780 [mdt]
[&amp;lt;ffffffffa07291ed&amp;gt;] mdt_intent_reint+0x1ed/0x520 [mdt]
[&amp;lt;ffffffffa07248ce&amp;gt;] mdt_intent_policy+0x3ae/0x770 [mdt]
[&amp;lt;ffffffffa1042441&amp;gt;] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
[&amp;lt;ffffffffa106b0ef&amp;gt;] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc]
[&amp;lt;ffffffffa0724d96&amp;gt;] mdt_enqueue+0x46/0xe0 [mdt]
[&amp;lt;ffffffffa072ba5a&amp;gt;] mdt_handle_common+0x52a/0x1470 [mdt]
[&amp;lt;ffffffffa0765775&amp;gt;] mds_regular_handle+0x15/0x20 [mdt]
[&amp;lt;ffffffffa109aaa5&amp;gt;] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
[&amp;lt;ffffffffa109bded&amp;gt;] ptlrpc_main+0xacd/0x1710 [ptlrpc]
[&amp;lt;ffffffff81096a36&amp;gt;] kthread+0x96/0xa0
[&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
[&amp;lt;ffffffffffffffff&amp;gt;] 0xffffffffffffffff
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;I think the close handler (for HSM release) cancels the EX OPEN lock before the LDLM_BL_CALLBACK can respond. Because the lock is already cancelled, the resource is never reprocessed, so the normal open lock is never granted.&lt;br/&gt;
Dumping the locks on the MDT-side resource for f0 confirms this: there is a waiting CR lock that does not conflict with any of the granted locks.&lt;/p&gt;
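&lt;p&gt;The suspected sequence can be sketched with a toy model. This is not Lustre code: the Resource class, its methods, and the conflict rule (EX conflicts with every other mode) are assumptions made up for illustration. It only shows why cancelling a granted lock without reprocessing the resource can leave a compatible waiter queued.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
```python
# Toy model only, not Lustre code. All names here are hypothetical.
class Resource:
    def __init__(self):
        self.granted = []   # lock modes currently granted
        self.waiting = []   # lock modes queued behind a conflicting lock

    @staticmethod
    def conflicts(a, b):
        # Simplified rule assumed for this sketch: EX conflicts with everything.
        return "EX" in (a, b)

    def enqueue(self, mode):
        # Grant immediately if nothing granted conflicts, otherwise queue.
        if any(self.conflicts(mode, g) for g in self.granted):
            self.waiting.append(mode)
            return False
        self.granted.append(mode)
        return True

    def cancel(self, mode):
        # Drops a granted lock but never looks at the waiting queue.
        self.granted.remove(mode)

    def reprocess(self):
        # Grant every waiter that no longer conflicts with a granted lock.
        still_waiting = []
        for mode in self.waiting:
            if any(self.conflicts(mode, g) for g in self.granted):
                still_waiting.append(mode)
            else:
                self.granted.append(mode)
        self.waiting = still_waiting


def demo():
    res = Resource()
    res.enqueue("EX")            # stands in for the EX OPEN lock taken for release
    res.enqueue("CR")            # the open() request queues behind EX
    res.cancel("EX")             # cancel without reprocessing
    stuck = list(res.waiting)    # the CR waiter is still queued here
    res.reprocess()              # reprocessing grants the waiter
    return stuck, res.waiting, res.granted
```
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;
&lt;p&gt;In this sketch the CR request stays in the waiting list after cancel() and is granted only once reprocess() runs. The MDT lock dump that follows shows the corresponding state on the real resource: a CR lock that has been requested but not granted.&lt;/p&gt;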
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;--- Resource: [0x200000400:0x2:0x0].0 (ffff880198cc9800) refcount = 4
### ### ns: mdt-lustre-MDT0000_UUID lock: ffff88021a008dc0/0x9af58bd61d165d82 lrc: 2/0,0 mode: CR/CR res: [0x200000400:0x2:0x0].0 bits 0x9 rrc: 4 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x9af58bd61d165d66 expref: 10 pid: 20722 timeout: 0 lvb_type: 0
### ### ns: mdt-lustre-MDT0000_UUID lock: ffff880198c29940/0x9af58bd61d165d5f lrc: 2/0,0 mode: PR/PR res: [0x200000400:0x2:0x0].0 bits 0x1b rrc: 4 type: IBT flags: 0x40200000000000 nid: 0@lo remote: 0x9af58bd61d165d43 expref: 10 pid: 20722 timeout: 0 lvb_type: 0
### ### ns: mdt-lustre-MDT0000_UUID lock: ffff8801a6497980/0x9af58bd61d16ee59 lrc: 3/1,0 mode: --/CR res: [0x200000400:0x2:0x0].0 bits 0x4 rrc: 4 type: IBT flags: 0x40210000000000 nid: local remote: 0x0 expref: -99 pid: 20391 timeout: 0 lvb_type: 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment></environment>
        <key id="20906">LU-3931</key>
            <summary>mdt_hsm_release() calls ldlm_lock_cancel() but does not reprocess resource</summary>
                <type id="7" iconUrl="https://jira.whamcloud.com/images/icons/issuetypes/task_agile.png">Technical task</type>
                            <parent id="20020">LU-3647</parent>
                                    <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="6" iconUrl="https://jira.whamcloud.com/images/icons/statuses/closed.png" description="The issue is considered finished, the resolution is correct. Issues which are closed can be reopened.">Closed</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="jhammond">John Hammond</assignee>
                                    <reporter username="jhammond">John Hammond</reporter>
                        <labels>
                            <label>HSM</label>
                    </labels>
                <created>Wed, 11 Sep 2013 16:53:10 +0000</created>
                <updated>Tue, 24 Sep 2013 18:24:05 +0000</updated>
                            <resolved>Tue, 24 Sep 2013 18:23:57 +0000</resolved>
                                    <version>Lustre 2.5.0</version>
                                    <fixVersion>Lustre 2.5.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>3</watches>
                                                                            <comments>
                            <comment id="66377" author="jay" created="Wed, 11 Sep 2013 18:10:06 +0000"  >&lt;p&gt;Indeed. Can you please work out a patch for this?&lt;/p&gt;</comment>
                            <comment id="66425" author="jhammond" created="Wed, 11 Sep 2013 21:58:31 +0000"  >&lt;p&gt;Please see &lt;a href=&quot;http://review.whamcloud.com/7621&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/7621&lt;/a&gt;.&lt;/p&gt;</comment>
                    </comments>
                    <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzw1vz:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>10387</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                                                                                </customfields>
    </item>
</channel>
</rss>