<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:36:18 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92">
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-3716] mds-survey ASSERTION( lu_device_is_mdt(o-&gt;lo_dev) )</title>
                <link>https://jira.whamcloud.com/browse/LU-3716</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;Observed fairly repeatably when running mds-survey.sh after the first few million creates.  It appears unlikely to occur when creating fewer than 100,000 objects (the default).&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-08-06 15:05:51 LustreError: 4379:0:(mdt_handler.c:2352:mdt_obj()) ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) ) failed:
2013-08-06 15:05:51 LustreError: 4379:0:(mdt_handler.c:2352:mdt_obj()) LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The failures were observed with code post 2.4.0 which includes the &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-2758&quot; title=&quot;LBUG in mdt_obj() when accessing objects by FID&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-2758&quot;&gt;&lt;del&gt;LU-2758&lt;/del&gt;&lt;/a&gt; fix.&lt;/p&gt;</description>
                <environment></environment>
        <key id="20217">LU-3716</key>
            <summary>mds-survey ASSERTION( lu_device_is_mdt(o-&gt;lo_dev) )</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="2" iconUrl="https://jira.whamcloud.com/images/icons/priorities/critical.svg">Critical</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="yujian">Jian Yu</assignee>
                                    <reporter username="behlendorf">Brian Behlendorf</reporter>
                        <labels>
                            <label>HB</label>
                            <label>llnl</label>
                            <label>server</label>
                            <label>zfs</label>
                    </labels>
                <created>Tue, 6 Aug 2013 22:19:32 +0000</created>
                <updated>Mon, 30 Jan 2017 16:04:30 +0000</updated>
                            <resolved>Fri, 23 Jan 2015 05:21:45 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                    <version>Lustre 2.5.1</version>
                                    <fixVersion>Lustre 2.7.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>10</watches>
                                                                            <comments>
                            <comment id="63737" author="pjones" created="Tue, 6 Aug 2013 22:37:59 +0000"  >&lt;p&gt;Yu Jian&lt;/p&gt;

&lt;p&gt;Could you please look into this one?&lt;/p&gt;

&lt;p&gt;Thanks&lt;/p&gt;

&lt;p&gt;Peter&lt;/p&gt;</comment>
                            <comment id="63752" author="bzzz" created="Wed, 7 Aug 2013 04:45:57 +0000"  >&lt;p&gt;Brian, can you paste the backtrace please?&lt;/p&gt;</comment>
                            <comment id="63770" author="yujian" created="Wed, 7 Aug 2013 13:03:43 +0000"  >&lt;p&gt;Hi Brian,&lt;/p&gt;

&lt;p&gt;Could you please tell me what &quot;file_count&quot; value you specified while running mds-survey.sh to reproduce the failure?&lt;/p&gt;

&lt;p&gt;I tested the following numbers separately on the latest Lustre b2_4 branch (build &lt;a href=&quot;http://build.whamcloud.com/job/lustre-b2_4/27/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://build.whamcloud.com/job/lustre-b2_4/27/&lt;/a&gt;), and all the test runs passed.&lt;/p&gt;

&lt;p&gt;file_count=150000&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/test_sets/b6c02f58-ff4a-11e2-a3fb-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/b6c02f58-ff4a-11e2-a3fb-52540035b04c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;file_count=1000000&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/test_sets/a45000fe-ff5f-11e2-a3fb-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/a45000fe-ff5f-11e2-a3fb-52540035b04c&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;file_count=1610021&lt;br/&gt;
&lt;a href=&quot;https://maloo.whamcloud.com/test_sets/b50fc3b6-ff5f-11e2-a3fb-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/b50fc3b6-ff5f-11e2-a3fb-52540035b04c&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="63778" author="behlendorf" created="Wed, 7 Aug 2013 16:03:12 +0000"  >&lt;p&gt;Sorry, I meant to add the full backtrace but was distracted.  Here it is.&lt;/p&gt;

&lt;p&gt;I&apos;ve hit this failure a few more times now, and while I haven&apos;t really investigated, I&apos;m now less inclined to think it has anything to do with the number of files.  I haven&apos;t been able to reproduce the problem when the filesystem is idle aside from mds-survey.  This makes a lot of sense looking at the stack, since it appears the offending call path occurred while handling a request from a real client.  When there&apos;s a user workload on the system as well, I can trigger this fairly easily.  We have full crash dumps available if needed, and since this is a test system it&apos;s easy to add a debugging patch.&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;2013-08-06 15:22:10 Lustre: lcz-MDT0000: Recovery over after 1:15, of 146 clients 146 recovered and 0 were evicted.
2013-08-06 15:23:34 Lustre: Echo OBD driver; http://www.lustre.org/
2013-08-06 15:23:34 LustreError: 4487:0:(echo_client.c:1959:echo_md_destroy_internal()) Can not unlink child tests: rc = -39
2013-08-06 15:23:34 LustreError: 4489:0:(echo_client.c:1959:echo_md_destroy_internal()) Can not unlink child tests1: rc = -39
2013-08-06 15:24:13 LustreError: 4557:0:(echo_client.c:1744:echo_md_lookup()) lookup tests: rc = -2
2013-08-06 15:24:13 LustreError: 4557:0:(echo_client.c:1943:echo_md_destroy_internal()) Can&apos;t find child tests: rc = -2
2013-08-06 15:24:13 LustreError: 4559:0:(echo_client.c:1744:echo_md_lookup()) lookup tests1: rc = -2
2013-08-06 15:24:13 LustreError: 4559:0:(echo_client.c:1943:echo_md_destroy_internal()) Can&apos;t find child tests1: rc = -2
2013-08-06 15:25:58 LustreError: 5279:0:(echo_client.c:1744:echo_md_lookup()) lookup tests: rc = -2
2013-08-06 15:25:58 LustreError: 5279:0:(echo_client.c:1744:echo_md_lookup()) Skipped 2 previous similar messages
2013-08-06 15:25:58 LustreError: 5279:0:(echo_client.c:1943:echo_md_destroy_internal()) Can&apos;t find child tests: rc = -2
2013-08-06 15:25:58 LustreError: 5279:0:(echo_client.c:1943:echo_md_destroy_internal()) Skipped 2 previous similar messages
2013-08-06 15:27:38 LustreError: 4181:0:(mdt_handler.c:2352:mdt_obj()) ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) ) failed:
2013-08-06 15:27:38 LustreError: 4181:0:(mdt_handler.c:2352:mdt_obj()) LBUG
2013-08-06 15:27:38 Pid: 4181, comm: mdt01_010
2013-08-06 15:27:38
2013-08-06 15:27:38 Call Trace:
2013-08-06 15:27:38  [&amp;lt;ffffffffa071f8f5&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
2013-08-06 15:27:38  [&amp;lt;ffffffffa071fef7&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
2013-08-06 15:27:38  [&amp;lt;ffffffffa0f9f365&amp;gt;] mdt_obj+0x55/0x80 [mdt]
2013-08-06 15:27:38  [&amp;lt;ffffffffa0fa2db6&amp;gt;] mdt_object_find+0x66/0x170 [mdt]
2013-08-06 15:27:38  [&amp;lt;ffffffffa0fa317c&amp;gt;] mdt_unpack_req_pack_rep+0x2bc/0x4d0 [mdt]
2013-08-06 15:27:38  [&amp;lt;ffffffffa0fa5ef3&amp;gt;] mdt_intent_policy+0x353/0x720 [mdt]
2013-08-06 15:27:38  [&amp;lt;ffffffffa09d559e&amp;gt;] ldlm_lock_enqueue+0x11e/0x960 [ptlrpc]
2013-08-06 15:27:38  [&amp;lt;ffffffffa09fc2bf&amp;gt;] ldlm_handle_enqueue0+0x4ef/0x10b0 [ptlrpc]
2013-08-06 15:27:38  [&amp;lt;ffffffffa0fa63c6&amp;gt;] mdt_enqueue+0x46/0xe0 [mdt]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0facab8&amp;gt;] mdt_handle_common+0x648/0x1660 [mdt]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0fe6145&amp;gt;] mds_regular_handle+0x15/0x20 [mdt]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a2e738&amp;gt;] ptlrpc_server_handle_request+0x398/0xc60 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffffa072063e&amp;gt;] ? cfs_timer_arm+0xe/0x10 [libcfs]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0731def&amp;gt;] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a25a99&amp;gt;] ? ptlrpc_wait_event+0xa9/0x290 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffff81055ab3&amp;gt;] ? __wake_up+0x53/0x70
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a2face&amp;gt;] ptlrpc_main+0xace/0x1700 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a2f000&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a2f000&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffffa0a2f000&amp;gt;] ? ptlrpc_main+0x0/0x1700 [ptlrpc]
2013-08-06 15:27:39  [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="63883" author="bzzz" created="Thu, 8 Aug 2013 16:08:03 +0000"  >&lt;p&gt;Is it the only filesystem, or do you have a few running on the same node?&lt;/p&gt;</comment>
                            <comment id="63890" author="behlendorf" created="Thu, 8 Aug 2013 16:46:35 +0000"  >&lt;p&gt;We have the MDT and MGS for a single Lustre filesystem running in different ZFS datasets on the same node.&lt;/p&gt;</comment>
                            <comment id="67141" author="di.wang" created="Fri, 20 Sep 2013 17:22:06 +0000"  >&lt;p&gt;Usually this LBUG means that mds-survey and other clients (probably normal Lustre clients) are accessing the same object at the same time, or that the object was accessed by normal clients previously and its object stack is still in the server cache.&lt;/p&gt;

&lt;p&gt;Were any normal Lustre clients attached while you were running the test? I would suggest you unmount all of the clients and remount the server (to clear the cache) before running mds-survey.&lt;/p&gt;</comment>
                            <comment id="67382" author="behlendorf" created="Tue, 24 Sep 2013 16:12:20 +0000"  >&lt;p&gt;Yes, we do have other Lustre clients mounted during the test.  It&apos;s very inconvenient to have to unmount them.  It would be best if it were possible to run mds-survey safely with mounted clients.&lt;/p&gt;</comment>
                            <comment id="79251" author="jamesanunez" created="Thu, 13 Mar 2014 16:51:45 +0000"  >&lt;p&gt;I ran into this assertion when running sanity test 225a: &lt;a href=&quot;https://maloo.whamcloud.com/test_sessions/f6c5b8e6-aac8-11e3-a41c-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sessions/f6c5b8e6-aac8-11e3-a41c-52540035b04c&lt;/a&gt;.&lt;br/&gt;
Unfortunately, there are no logs on the MDS. I&apos;ll try to run it again and see what I can collect.&lt;/p&gt;</comment>
                            <comment id="79367" author="jamesanunez" created="Fri, 14 Mar 2014 19:47:12 +0000"  >&lt;p&gt;Running 2.5.1-RC4, using ldiskfs, I ran into this assertion when running sanity test_225a. The test results are here &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/13e6472a-aba9-11e3-a696-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/13e6472a-aba9-11e3-a696-52540035b04c&lt;/a&gt;, but there are no MDS logs. I was able to capture some information from the crash dump.&lt;/p&gt;

&lt;p&gt;From the MDS console:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Message from syslogd@c16 at Mar 13 16:10:16 ...
 kernel:LustreError: 32557:0:(mdt_handler.c:2414:mdt_obj()) ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) ) failed: 

Message from syslogd@c16 at Mar 13 16:10:16 ...
 kernel:LustreError: 32557:0:(mdt_handler.c:2414:mdt_obj()) LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;From the dmesg saved by crash:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;&amp;lt;4&amp;gt;Lustre: DEBUG MARKER: == sanity test 225a: Metadata survey sanity with zero-stripe == 16:10:12 (1394752212)
&amp;lt;6&amp;gt;Lustre: Echo OBD driver; http://www.lustre.org/
&amp;lt;3&amp;gt;LustreError: 2019:0:(echo_client.c:1743:echo_md_lookup()) lookup MDT0000-tests: rc = -2
&amp;lt;3&amp;gt;LustreError: 2019:0:(echo_client.c:1942:echo_md_destroy_internal()) Can&apos;t find child MDT0000-tests: rc = -2
&amp;lt;3&amp;gt;LustreError: 2041:0:(echo_client.c:1743:echo_md_lookup()) lookup MDT0000-tests1: rc = -2
&amp;lt;3&amp;gt;LustreError: 2041:0:(echo_client.c:1942:echo_md_destroy_internal()) Can&apos;t find child MDT0000-tests1: rc = -2
&amp;lt;0&amp;gt;LustreError: 32557:0:(mdt_handler.c:2414:mdt_obj()) ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) ) failed: 
&amp;lt;0&amp;gt;LustreError: 32557:0:(mdt_handler.c:2414:mdt_obj()) LBUG
&amp;lt;4&amp;gt;Pid: 32557, comm: mdt00_004
&amp;lt;4&amp;gt;
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa052e895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa052ee97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e37365&amp;gt;] mdt_obj+0x55/0x80 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3bc46&amp;gt;] mdt_object_find+0x66/0x170 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e4ef84&amp;gt;] mdt_getattr_name_lock+0x7f4/0x1aa0 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0821ba5&amp;gt;] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0848bb6&amp;gt;] ? __req_capsule_get+0x166/0x710 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0823e34&amp;gt;] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e504c9&amp;gt;] mdt_intent_getattr+0x299/0x480 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3ea0e&amp;gt;] mdt_intent_policy+0x3ae/0x770 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa07d9511&amp;gt;] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0802abf&amp;gt;] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3eed6&amp;gt;] mdt_enqueue+0x46/0xe0 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e45bca&amp;gt;] mdt_handle_common+0x52a/0x1470 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e80545&amp;gt;] mds_regular_handle+0x15/0x20 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0832a45&amp;gt;] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa05403df&amp;gt;] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa082a0e9&amp;gt;] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0833dad&amp;gt;] ptlrpc_main+0xaed/0x1740 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa08332c0&amp;gt;] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109aee6&amp;gt;] kthread+0x96/0xa0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c20a&amp;gt;] child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109ae50&amp;gt;] ? kthread+0x0/0xa0
&amp;lt;3&amp;gt;LustreError: 2242:0:(echo_client.c:1016:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
&amp;lt;4&amp;gt;
&amp;lt;0&amp;gt;Kernel panic - not syncing: LBUG
&amp;lt;4&amp;gt;Pid: 32557, comm: mdt00_004 Not tainted 2.6.32-431.5.1.el6_lustre.x86_64 #1
&amp;lt;4&amp;gt;Call Trace:
&amp;lt;4&amp;gt; [&amp;lt;ffffffff81527983&amp;gt;] ? panic+0xa7/0x16f
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa052eeeb&amp;gt;] ? lbug_with_loc+0x9b/0xb0 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e37365&amp;gt;] ? mdt_obj+0x55/0x80 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3bc46&amp;gt;] ? mdt_object_find+0x66/0x170 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e4ef84&amp;gt;] ? mdt_getattr_name_lock+0x7f4/0x1aa0 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0821ba5&amp;gt;] ? lustre_msg_buf+0x55/0x60 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0848bb6&amp;gt;] ? __req_capsule_get+0x166/0x710 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0823e34&amp;gt;] ? lustre_msg_get_flags+0x34/0xb0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e504c9&amp;gt;] ? mdt_intent_getattr+0x299/0x480 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3ea0e&amp;gt;] ? mdt_intent_policy+0x3ae/0x770 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa07d9511&amp;gt;] ? ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
&amp;lt;3&amp;gt;LustreError: 2242:0:(echo_client.c:1016:echo_device_free()) echo_client still has objects at cleanup time, wait for 1 second
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0802abf&amp;gt;] ? ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e3eed6&amp;gt;] ? mdt_enqueue+0x46/0xe0 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e45bca&amp;gt;] ? mdt_handle_common+0x52a/0x1470 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0e80545&amp;gt;] ? mds_regular_handle+0x15/0x20 [mdt]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0832a45&amp;gt;] ? ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa05403df&amp;gt;] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa082a0e9&amp;gt;] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa0833dad&amp;gt;] ? ptlrpc_main+0xaed/0x1740 [ptlrpc]
&amp;lt;4&amp;gt; [&amp;lt;ffffffffa08332c0&amp;gt;] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
&amp;lt;4&amp;gt;Lustre: DEBUG MARKER: sanity test_225a: @@@@@@ FAIL: LBUG/LASSERT detected
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109aee6&amp;gt;] ? kthread+0x96/0xa0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c20a&amp;gt;] ? child_rip+0xa/0x20
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8109ae50&amp;gt;] ? kthread+0x0/0xa0
&amp;lt;4&amp;gt; [&amp;lt;ffffffff8100c200&amp;gt;] ? child_rip+0x0/0x20
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I experienced this crash both times I ran sanity test 225a.&lt;/p&gt;</comment>
                            <comment id="79859" author="yujian" created="Thu, 20 Mar 2014 14:58:14 +0000"  >&lt;p&gt;Both sanity tests 225a and 225b run mds-survey.&lt;/p&gt;

&lt;p&gt;I saw the following error in the test output of sanity test 225a in &lt;a href=&quot;https://maloo.whamcloud.com/test_sets/f9ad31f6-aac8-11e3-a41c-52540035b04c&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;https://maloo.whamcloud.com/test_sets/f9ad31f6-aac8-11e3-a41c-52540035b04c&lt;/a&gt; :&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;ERROR: Module obdecho is in use
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;So, this is the same situation that Wang Di described above:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Usually this LBUG means mds-survey and other clients(probably normal lustre client) access the same object at the same time, or the object has been accessed by normal clients previously, and object stack is still in the server cache.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Wang Di, should I improve mds-survey to check whether obdecho is in use or not so as to avoid the LBUG?&lt;/p&gt;</comment>
                            <comment id="79912" author="di.wang" created="Thu, 20 Mar 2014 18:13:51 +0000"  >&lt;p&gt;Hmm, the right fix might be to disconnect echo objects from the namespace, i.e. these objects can only be accessed by the echo client; then the echo client and normal clients can be attached at the same time. Probably the echo test should be done in a special directory.&lt;/p&gt;</comment>
                            <comment id="81942" author="yujian" created="Fri, 18 Apr 2014 14:00:02 +0000"  >&lt;p&gt;For the echo client, the echo MD objects are created by echo_md_create_internal()-&amp;gt;mdd_create():&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00008000:00100000:1.0:1397818339.647451:0:3611:0:(echo_client.c:1602:echo_md_create_internal()) Start creating object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0
00000004:00000001:1.0:1397818339.647453:0:3611:0:(mdd_dir.c:2164:mdd_create()) Process entered
......
00000004:00000001:1.0:1397818339.647535:0:3611:0:(mdd_object.c:375:mdd_object_create_internal()) Process entered
......
00000004:00000001:1.0:1397818339.647564:0:3611:0:(mdd_object.c:383:mdd_object_create_internal()) Process leaving (rc=0 : 0 : 0)
......
00000004:00000001:1.0:1397818339.647606:0:3611:0:(mdd_dir.c:2272:mdd_create()) Process leaving
00008000:00100000:1.0:1397818339.647614:0:3611:0:(echo_client.c:1614:echo_md_create_internal()) End creating object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0 rc  = 0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And looked up by echo_lookup_object()-&amp;gt;mdd_lookup():&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00008000:00100000:1.0:1397818505.932739:0:3626:0:(echo_client.c:1897:echo_lookup_object()) Start lookup object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0
00000004:00000001:1.0:1397818505.932741:0:3626:0:(mdd_dir.c:115:mdd_lookup()) Process entered
00000004:00000001:1.0:1397818505.932741:0:3626:0:(mdd_dir.c:72:__mdd_lookup()) Process entered
......
00000004:00000001:1.0:1397818505.932760:0:3626:0:(mdd_dir.c:106:__mdd_lookup()) Process leaving (rc=0 : 0 : 0)
00000004:00000001:1.0:1397818505.932761:0:3626:0:(mdd_dir.c:122:mdd_lookup()) Process leaving (rc=0 : 0 : 0)
00008000:00100000:1.0:1397818505.932762:0:3626:0:(echo_client.c:1905:echo_lookup_object()) End lookup object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And destroyed by echo_md_destroy_internal()-&amp;gt;mdd_unlink():&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;00008000:00100000:9.0:1397818817.663055:0:3650:0:(echo_client.c:1944:echo_md_destroy_internal()) Start destroy object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0
00000004:00000001:9.0:1397818817.663057:0:3650:0:(mdd_dir.c:1499:mdd_unlink()) Process entered
......
00000004:00000001:9.0:1397818817.663166:0:3650:0:(mdd_dir.c:1603:mdd_unlink()) Process leaving
00008000:00100000:9.0:1397818817.663182:0:3650:0:(echo_client.c:1953:echo_md_destroy_internal()) End destroy object [0x200000400:0x1:0x0] 3373 ffff88031cd5a5a0
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Hi Di,&lt;/p&gt;

&lt;p&gt;If we allow the echo client and normal clients to be attached and operating at the same time, we should distinguish echo MD objects from normal MD objects. Do I understand correctly?&lt;/p&gt;</comment>
                            <comment id="81950" author="di.wang" created="Fri, 18 Apr 2014 15:53:35 +0000"  >&lt;p&gt;Yes, that is what I mean, i.e. we should separate echo objects from the normal namespace so that normal clients will not access these echo objects.&lt;/p&gt;</comment>
                            <comment id="82039" author="yujian" created="Mon, 21 Apr 2014 12:33:57 +0000"  >&lt;p&gt;Currently, in mdd_create(), both echo objects and normal objects are created via:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mdd_object_create_internal()-&amp;gt;...-&amp;gt;osd_object_ea_create()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and inserted into the normal namespace via:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;__mdd_index_insert()-&amp;gt;...-&amp;gt;osd_index_ea_insert()-&amp;gt;...-&amp;gt;__osd_ea_add_rec()-&amp;gt;osd_ldiskfs_add_entry()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The objects are looked up via:&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;mdd_lookup()-&amp;gt;...-&amp;gt;osd_index_ea_lookup()-&amp;gt;osd_ea_lookup_rec()-&amp;gt;osd_ldiskfs_find_entry()
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To separate echo objects from the normal namespace, it seems we can check for the &quot;FID_SEQ_ECHO&quot; sequence number after mdd_object_create() and skip __mdd_index_insert() for echo objects. However, this would make the echo objects impossible to look up.&lt;/p&gt;

&lt;p&gt;A possible way is to make sure the &quot;parent&quot; object for echo objects is always different from the one for normal objects, as Di mentioned in a previous comment: &quot;Probably echo test should be done in a special directory&quot;.&lt;/p&gt;

&lt;p&gt;Hi Di, is this a proper way to resolve the issue? If yes, could you please suggest what special directory I should create in jt_obd_md_common()? Thanks.&lt;/p&gt;</comment>
                            <comment id="82237" author="di.wang" created="Wed, 23 Apr 2014 04:12:24 +0000"  >&lt;p&gt;IMHO, I do not think you need (or should) change MDD. I think you could create a separate root object for echo access, so in echo_resolve_path it will always start from that root object and form its own tree, instead of using the filesystem ROOT. Normal clients will never see the tree under this special root.&lt;/p&gt;

&lt;p&gt;You can see how MDD create the root object in mdd_prepare.&lt;/p&gt;</comment>
                            <comment id="82626" author="yujian" created="Mon, 28 Apr 2014 16:12:12 +0000"  >&lt;p&gt;Patch for master branch: &lt;a href=&quot;http://review.whamcloud.com/10130&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10130&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="100239" author="doug" created="Fri, 28 Nov 2014 18:54:17 +0000"  >&lt;p&gt;Can this patch be rebased?&lt;/p&gt;</comment>
                            <comment id="100247" author="yujian" created="Sat, 29 Nov 2014 05:02:49 +0000"  >&lt;blockquote&gt;&lt;p&gt;Can this patch be rebased?&lt;/p&gt;&lt;/blockquote&gt;
&lt;p&gt;Sure, will do.&lt;/p&gt;</comment>
                            <comment id="104458" author="gerrit" created="Fri, 23 Jan 2015 02:01:29 +0000"  >&lt;p&gt;Oleg Drokin (oleg.drokin@intel.com) merged in patch &lt;a href=&quot;http://review.whamcloud.com/10130/&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/10130/&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3716&quot; title=&quot;mds-survey ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3716&quot;&gt;&lt;del&gt;LU-3716&lt;/del&gt;&lt;/a&gt; obdecho: create a separate root object for echo access&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: master&lt;br/&gt;
Current Patch Set: &lt;br/&gt;
Commit: 897580eb1562e0509bbe8ea72a273ed71f878eaa&lt;/p&gt;</comment>
                            <comment id="104475" author="pjones" created="Fri, 23 Jan 2015 05:21:45 +0000"  >&lt;p&gt;Landed for 2.7&lt;/p&gt;</comment>
                            <comment id="104822" author="gerrit" created="Tue, 27 Jan 2015 05:40:53 +0000"  >&lt;p&gt;Jian Yu (jian.yu@intel.com) uploaded a new patch: &lt;a href=&quot;http://review.whamcloud.com/13530&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/13530&lt;/a&gt;&lt;br/&gt;
Subject: &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-3716&quot; title=&quot;mds-survey ASSERTION( lu_device_is_mdt(o-&amp;gt;lo_dev) )&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-3716&quot;&gt;&lt;del&gt;LU-3716&lt;/del&gt;&lt;/a&gt; obdecho: create a separate root object for echo access&lt;br/&gt;
Project: fs/lustre-release&lt;br/&gt;
Branch: b2_5&lt;br/&gt;
Current Patch Set: 1&lt;br/&gt;
Commit: 9a208e3e205a1a31b2773893f8333dbc275e0e6e&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10011">
                    <name>Related</name>
                                            <outwardlinks description="is related to ">
                                                        </outwardlinks>
                                                                <inwardlinks description="is related to">
                                                        </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvxbb:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>9570</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>