<!-- 
RSS generated by JIRA (9.4.14#940014-sha1:734e6822bbf0d45eff9af51f82432957f73aa32c) at Sat Feb 10 01:29:18 UTC 2024

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary append 'field=key&field=summary' to the URL of your request.
-->
<rss version="0.92" >
<channel>
    <title>Whamcloud Community JIRA</title>
    <link>https://jira.whamcloud.com</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>9.4.14</version>
        <build-number>940014</build-number>
        <build-date>05-12-2023</build-date>
    </build-info>


<item>
            <title>[LU-2911] After upgrading from 1.8.9 to master, hit ASSERTION( seq != ((void *)0) ) failed</title>
                <link>https://jira.whamcloud.com/browse/LU-2911</link>
                <project id="10000" key="LU">Lustre</project>
                    <description>&lt;p&gt;After upgrading the system from 1.8.9 to 2.4, one of the clients hit an LBUG:&lt;/p&gt;

&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Lustre: DEBUG MARKER: Using TIMEOUT=20
Lustre: DEBUG MARKER: ===== Check the directories/files in the OST pool
Lustre: DEBUG MARKER: ===== Pass
Lustre: DEBUG MARKER: ===== Check Lustre quotas usage/limits
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: cancel_lru_locks mdc start
Lustre: DEBUG MARKER: cancel_lru_locks osc start
Lustre: DEBUG MARKER: ===== Verify the data
LustreError: 9327:0:(fid_request.c:329:seq_client_alloc_fid()) ASSERTION( seq != ((void *)0) ) failed: 
LustreError: 9327:0:(fid_request.c:329:seq_client_alloc_fid()) LBUG
Pid: 9327, comm: mkdir

Call Trace:
 [&amp;lt;ffffffffa0371895&amp;gt;] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
 [&amp;lt;ffffffffa0371e97&amp;gt;] lbug_with_loc+0x47/0xb0 [libcfs]
 [&amp;lt;ffffffffa080bea9&amp;gt;] seq_client_alloc_fid+0x379/0x440 [fid]
 [&amp;lt;ffffffffa03822e1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [&amp;lt;ffffffffa082470b&amp;gt;] mdc_fid_alloc+0xbb/0xf0 [mdc]
 [&amp;lt;ffffffffa0832b1c&amp;gt;] mdc_create+0xcc/0x780 [mdc]
 [&amp;lt;ffffffffa09c487b&amp;gt;] ll_new_node+0x19b/0x6a0 [lustre]
 [&amp;lt;ffffffffa09c50a7&amp;gt;] ll_mkdir+0x97/0x1f0 [lustre]
 [&amp;lt;ffffffff8120db8f&amp;gt;] ? security_inode_permission+0x1f/0x30
 [&amp;lt;ffffffff811833f7&amp;gt;] vfs_mkdir+0xa7/0x100
 [&amp;lt;ffffffff8118656e&amp;gt;] sys_mkdirat+0xfe/0x120
 [&amp;lt;ffffffff810d3d27&amp;gt;] ? audit_syscall_entry+0x1d7/0x200
 [&amp;lt;ffffffff811865a8&amp;gt;] sys_mkdir+0x18/0x20
 [&amp;lt;ffffffff8100b072&amp;gt;] system_call_fastpath+0x16/0x1b

Kernel panic - not syncing: LBUG
Pid: 9327, comm: mkdir Not tainted 2.6.32-279.19.1.el6.x86_64 #1
Call Trace:
 [&amp;lt;ffffffff814e9541&amp;gt;] ? panic+0xa0/0x168
 [&amp;lt;ffffffffa0371eeb&amp;gt;] ? lbug_with_loc+0x9b/0xb0 [libcfs]
 [&amp;lt;ffffffffa080bea9&amp;gt;] ? seq_client_alloc_fid+0x379/0x440 [fid]
 [&amp;lt;ffffffffa03822e1&amp;gt;] ? libcfs_debug_msg+0x41/0x50 [libcfs]
 [&amp;lt;ffffffffa082470b&amp;gt;] ? mdc_fid_alloc+0xbb/0xf0 [mdc]
 [&amp;lt;ffffffffa0832b1c&amp;gt;] ? mdc_create+0xcc/0x780 [mdc]
 [&amp;lt;ffffffffa09c487b&amp;gt;] ? ll_new_node+0x19b/0x6a0 [lustre]
 [&amp;lt;ffffffffa09c50a7&amp;gt;] ? ll_mkdir+0x97/0x1f0 [lustre]
 [&amp;lt;ffffffff8120db8f&amp;gt;] ? security_inode_permission+0x1f/0x30
 [&amp;lt;ffffffff811833f7&amp;gt;] ? vfs_mkdir+0xa7/0x100
 [&amp;lt;ffffffff8118656e&amp;gt;] ? sys_mkdirat+0xfe/0x120
 [&amp;lt;ffffffff810d3d27&amp;gt;] ? audit_syscall_entry+0x1d7/0x200
 [&amp;lt;ffffffff811865a8&amp;gt;] ? sys_mkdir+0x18/0x20
 [&amp;lt;ffffffff8100b072&amp;gt;] ? system_call_fastpath+0x16/0x1b
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</description>
                <environment>Before the upgrade, the servers are running 1.8.9 on RHEL5; two clients are running 1.8.9 on RHEL6; one client is running 1.8.9 on RHEL5&lt;br/&gt;
&lt;br/&gt;
After the upgrade, both servers and clients are running lustre-master build# 1280 on RHEL6&lt;br/&gt;
</environment>
        <key id="17768">LU-2911</key>
            <summary>After upgrading from 1.8.9 to master, hit ASSERTION( seq != ((void *)0) ) failed</summary>
                <type id="1" iconUrl="https://jira.whamcloud.com/secure/viewavatar?size=xsmall&amp;avatarId=11303&amp;avatarType=issuetype">Bug</type>
                                            <priority id="1" iconUrl="https://jira.whamcloud.com/images/icons/priorities/blocker.svg">Blocker</priority>
                        <status id="5" iconUrl="https://jira.whamcloud.com/images/icons/statuses/resolved.png" description="A resolution has been taken, and it is awaiting verification by reporter. From here issues are either reopened, or are closed.">Resolved</status>
                    <statusCategory id="3" key="done" colorName="success"/>
                                    <resolution id="1">Fixed</resolution>
                                        <assignee username="emoly.liu">Emoly Liu</assignee>
                                    <reporter username="sarah">Sarah Liu</reporter>
                        <labels>
                            <label>LB</label>
                    </labels>
                <created>Tue, 5 Mar 2013 15:26:33 +0000</created>
                <updated>Thu, 21 Mar 2013 04:37:57 +0000</updated>
                            <resolved>Thu, 21 Mar 2013 04:37:57 +0000</resolved>
                                    <version>Lustre 2.4.0</version>
                                    <fixVersion>Lustre 2.4.0</fixVersion>
                                        <due></due>
                            <votes>0</votes>
                                    <watches>8</watches>
                                                                            <comments>
                            <comment id="53389" author="adilger" created="Tue, 5 Mar 2013 15:45:54 +0000"  >&lt;p&gt;Sarah, what test is being run here?  Is this conf-sanity.sh test_32 or something else?&lt;/p&gt;</comment>
                            <comment id="53397" author="sarah" created="Tue, 5 Mar 2013 18:23:25 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;

&lt;p&gt;Not conf-sanity; it occurred while extracting a kernel tarball.&lt;/p&gt;</comment>
                            <comment id="53676" author="emoly.liu" created="Mon, 11 Mar 2013 04:25:00 +0000"  >&lt;p&gt;I did this upgrade test on my three local VMs, following the wiki page &lt;a href=&quot;http://wiki.whamcloud.com/display/ENG/Upgrade+and+Downgrade+Testing&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://wiki.whamcloud.com/display/ENG/Upgrade+and+Downgrade+Testing&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Clean upgrade test steps:&lt;/p&gt;

&lt;p&gt;1. setup and start Lustre 1.8.x filesystem&lt;br/&gt;
2. extract a kernel tarball into a Lustre test directory&lt;br/&gt;
3. shutdown the entire Lustre filesystem&lt;br/&gt;
4. upgrade all Lustre servers and clients at once to Lustre 2.x&lt;br/&gt;
5. start the entire Lustre filesystem&lt;br/&gt;
6. extract the same kernel tarball into a new Lustre test directory&lt;br/&gt;
7. compare the old and new test directories (no difference)&lt;br/&gt;
8. run the auster test suite&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;This LBUG happened in step 6, after I created another directory to extract the same kernel tarball into.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;
[root@centos6-1 lustre]# ls
kernel1
[root@centos6-1 lustre]# mkdir kernel2

Message from syslogd@localhost at Mar 11 12:47:37 ...
 kernel:LustreError: 3380:0:(fid_request.c:329:seq_client_alloc_fid()) ASSERTION( seq != ((void *)0) ) failed: 

Message from syslogd@localhost at Mar 11 12:47:37 ...
 kernel:LustreError: 3380:0:(fid_request.c:329:seq_client_alloc_fid()) LBUG

Message from syslogd@localhost at Mar 11 12:47:37 ...
 kernel:Kernel panic - not syncing: LBUG
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;</comment>
                            <comment id="53887" author="emoly.liu" created="Wed, 13 Mar 2013 06:07:39 +0000"  >&lt;p&gt;I want to add some logging to see whether seq is allocated correctly in client_fid_init(), but this upgrade process from centos5 to centos6 is a little complicated on my 3 VMs.&lt;/p&gt;

&lt;p&gt;I also tried to use two VMs instead, one for master and the other for b18, sharing the Lustre devices. During the test, I used tunefs.lustre to reset mgsnode, which erased all config logs and params. Probably because of this operation, I failed to reproduce the issue.&lt;/p&gt;

&lt;p&gt;Next, I will try to use two VMs with the same IP address. If that fails again, I will create a test-only patch and reproduce the issue on Toro nodes.&lt;/p&gt;

&lt;p&gt;Btw, this LBUG still happens in our new build. &lt;/p&gt;</comment>
                            <comment id="54025" author="emoly.liu" created="Thu, 14 Mar 2013 12:37:07 +0000"  >&lt;p&gt;Fortunately, I can reproduce this LBUG on two VMs with the same IP address.&lt;/p&gt;

&lt;p&gt;/var/log/messages showed that seq was allocated during MDT upgrade loading.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Mar 14 19:42:55 centos5-3 kernel: LDISKFS-fs (loop0): mounted filesystem with ordered data mode. quota=off. Opts: 
Mar 14 19:42:55 centos5-3 kernel: Lustre: 4799:0:(mgs_llog.c:238:mgs_fsdb_handler()) MDT using 1.8 OSC name scheme
Mar 14 19:42:55 centos5-3 kernel: Lustre: 4800:0:(obd_config.c:1428:class_config_llog_handler()) For 1.8 interoperability, rename obd type from mds to mdt
Mar 14 19:42:55 centos5-3 kernel: Lustre: lustre-MDT0000: used disk, loading

Mar 14 19:42:55 centos5-3 kernel: client_fid_init: cli-cli-lustre-OST0000-osc: Allocated sequence [0x0]
Mar 14 19:42:55 centos5-3 kernel: client_fid_init: cli-cli-lustre-OST0001-osc: Allocated sequence [0x0]

Mar 14 19:42:55 centos5-3 kernel: LustreError: 11-0: lustre-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
Mar 14 19:42:55 centos5-3 kernel: LDISKFS-fs (loop1): mounted filesystem with ordered data mode. quota=off. Opts: 
Mar 14 19:42:56 centos5-3 kernel: LDISKFS-fs (loop2): mounted filesystem with ordered data mode. quota=off. Opts: 
Mar 14 19:42:56 centos5-3 kernel: Lustre: lustre-MDT0000: Will be in recovery for at least 5:00, or until 1 client reconnects
Mar 14 19:42:56 centos5-3 kernel: Lustre: lustre-MDT0000: Denying connection for new client a1ea11db-09a3-2aca-f450-b0505ca2899c (at 0@lo), waiting for all 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 5:00
Mar 14 19:42:56 centos5-3 kernel: LustreError: 11-0: lustre-MDT0000-mdc-ffff88003d6bb800: Communicating with 0@lo, operation mds_connect failed with -16.
Mar 14 19:43:00 centos5-3 kernel: Lustre: 4440:0:(client.c:1868:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1363261375/real 1363261375]  req@ffff8800118fac00 x1429483048402984/t0(0) o8-&amp;gt;lustre-OST0001-osc@0@lo:28/4 lens 400/544 e 0 to 1 dl 1363261380 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
Mar 14 19:43:21 centos5-3 kernel: Lustre: lustre-OST0000: Will be in recovery for at least 5:00, or until 1 client reconnects
Mar 14 19:43:21 centos5-3 kernel: Lustre: lustre-OST0000: Denying connection for new client lustre-mdtlov_UUID (at 0@lo), waiting for all 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 5:00
Mar 14 19:43:21 centos5-3 kernel: LustreError: 11-0: lustre-OST0000-osc: Communicating with 0@lo, operation ost_connect failed with -16.
Mar 14 19:43:46 centos5-3 kernel: Lustre: lustre-OST0000: Denying connection for new client lustre-mdtlov_UUID (at 0@lo), waiting for all 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 4:34
Mar 14 19:43:46 centos5-3 kernel: Lustre: Skipped 2 previous similar messages
Mar 14 19:43:46 centos5-3 kernel: LustreError: 11-0: lustre-OST0000-osc: Communicating with 0@lo, operation ost_connect failed with -16.
...
Mar 14 19:46:16 centos5-3 kernel: LustreError: Skipped 5 previous similar messages
Mar 14 19:46:32 centos5-3 kernel: INFO: task tgt_recov:4809 blocked for more than 120 seconds.
Mar 14 19:46:32 centos5-3 kernel: &quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
Mar 14 19:46:32 centos5-3 kernel: tgt_recov     D 0000000000000000     0  4809      2 0x00000080
Mar 14 19:46:32 centos5-3 kernel: ffff880011b27e00 0000000000000046 0000000000000000 0000000000000001
Mar 14 19:46:32 centos5-3 kernel: ffff880011b27d70 ffffffff8102a259 ffff880011b27d80 ffffffff8104e048
Mar 14 19:46:32 centos5-3 kernel: ffff880011b25af8 ffff880011b27fd8 000000000000fb88 ffff880011b25af8
Mar 14 19:46:32 centos5-3 kernel: Call Trace:
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8102a259&amp;gt;] ? native_smp_send_reschedule+0x49/0x60
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8104e048&amp;gt;] ? resched_task+0x68/0x80
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa08410c0&amp;gt;] ? check_for_clients+0x0/0x70 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0842a1d&amp;gt;] target_recovery_overseer+0x9d/0x230 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0840ec0&amp;gt;] ? exp_connect_healthy+0x0/0x20 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff81090990&amp;gt;] ? autoremove_wake_function+0x0/0x40
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa084970e&amp;gt;] target_recovery_thread+0x58e/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
Mar 14 19:46:32 centos5-3 kernel: INFO: task tgt_recov:4853 blocked for more than 120 seconds.
Mar 14 19:46:32 centos5-3 kernel: &quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
Mar 14 19:46:32 centos5-3 kernel: tgt_recov     D 0000000000000001     0  4853      2 0x00000080
Mar 14 19:46:32 centos5-3 kernel: ffff880010245e00 0000000000000046 0000000000000001 0000000000000000
Mar 14 19:46:32 centos5-3 kernel: ffff880010245d70 ffffffff8102a259 ffff880010245d80 ffffffff8104e048
Mar 14 19:46:32 centos5-3 kernel: ffff88001023d058 ffff880010245fd8 000000000000fb88 ffff88001023d058
Mar 14 19:46:32 centos5-3 kernel: Call Trace:
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8102a259&amp;gt;] ? native_smp_send_reschedule+0x49/0x60
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8104e048&amp;gt;] ? resched_task+0x68/0x80
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff81090c7e&amp;gt;] ? prepare_to_wait+0x4e/0x80
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa08410c0&amp;gt;] ? check_for_clients+0x0/0x70 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0842a1d&amp;gt;] target_recovery_overseer+0x9d/0x230 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0840ec0&amp;gt;] ? exp_connect_healthy+0x0/0x20 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff81090990&amp;gt;] ? autoremove_wake_function+0x0/0x40
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa084970e&amp;gt;] target_recovery_thread+0x58e/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
Mar 14 19:46:32 centos5-3 kernel: INFO: task tgt_recov:4873 blocked for more than 120 seconds.
Mar 14 19:46:32 centos5-3 kernel: &quot;echo 0 &amp;gt; /proc/sys/kernel/hung_task_timeout_secs&quot; disables this message.
Mar 14 19:46:32 centos5-3 kernel: tgt_recov     D 0000000000000000     0  4873      2 0x00000080
Mar 14 19:46:32 centos5-3 kernel: ffff88000fbcbe00 0000000000000046 0000000000000000 0000000000000000
Mar 14 19:46:32 centos5-3 kernel: ffff88000fbcbd70 ffffffff8102a259 ffff88000fbcbd80 ffffffff8104e048
Mar 14 19:46:32 centos5-3 kernel: ffff88001024b098 ffff88000fbcbfd8 000000000000fb88 ffff88001024b098
Mar 14 19:46:32 centos5-3 kernel: Call Trace:
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8102a259&amp;gt;] ? native_smp_send_reschedule+0x49/0x60
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8104e048&amp;gt;] ? resched_task+0x68/0x80
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa08410c0&amp;gt;] ? check_for_clients+0x0/0x70 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0842a1d&amp;gt;] target_recovery_overseer+0x9d/0x230 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0840ec0&amp;gt;] ? exp_connect_healthy+0x0/0x20 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff81090990&amp;gt;] ? autoremove_wake_function+0x0/0x40
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa084970e&amp;gt;] target_recovery_thread+0x58e/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0ca&amp;gt;] child_rip+0xa/0x20
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffffa0849180&amp;gt;] ? target_recovery_thread+0x0/0x1970 [ptlrpc]
Mar 14 19:46:32 centos5-3 kernel: [&amp;lt;ffffffff8100c0c0&amp;gt;] ? child_rip+0x0/0x20
Mar 14 19:46:41 centos5-3 kernel: Lustre: lustre-OST0000: Denying connection for new client lustre-mdtlov_UUID (at 0@lo), waiting for all 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 1:39
Mar 14 19:46:41 centos5-3 kernel: Lustre: Skipped 5 previous similar messages
Mar 14 19:47:31 centos5-3 kernel: LustreError: 11-0: lustre-OST0000-osc: Communicating with 0@lo, operation ost_connect failed with -16.
Mar 14 19:47:31 centos5-3 kernel: LustreError: Skipped 8 previous similar messages
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;After the five-minute recovery window, the servers evicted the stale clients and exports, and deleted orphan objects from 0x0:11018 to 11233. Then, when mkdir was run, seq was null.&lt;/p&gt;
&lt;div class=&quot;preformatted panel&quot; style=&quot;border-width: 1px;&quot;&gt;&lt;div class=&quot;preformattedContent panelContent&quot;&gt;
&lt;pre&gt;Mar 14 19:47:56 centos5-3 kernel: Lustre: lustre-MDT0000: recovery is timed out, evict stale exports
Mar 14 19:47:56 centos5-3 kernel: Lustre: lustre-MDT0000: disconnecting 1 stale clients
Mar 14 19:47:56 centos5-3 kernel: Lustre: lustre-MDT0000: Recovery over after 5:00, of 1 clients 0 recovered and 1 was evicted.
Mar 14 19:47:56 centos5-3 kernel: Lustre: lustre-OST0000: Denying connection for new client lustre-mdtlov_UUID (at 0@lo), waiting for all 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:24
Mar 14 19:47:56 centos5-3 kernel: Lustre: Skipped 8 previous similar messages
Mar 14 19:48:21 centos5-3 kernel: Lustre: lustre-OST0000: recovery is timed out, evict stale exports
Mar 14 19:48:21 centos5-3 kernel: Lustre: lustre-OST0000: disconnecting 1 stale clients
Mar 14 19:48:21 centos5-3 kernel: Lustre: lustre-OST0001: Recovery over after 5:00, of 1 clients 0 recovered and 1 was evicted.
Mar 14 19:48:21 centos5-3 kernel: Lustre: lustre-OST0001: deleting orphan objects from 0x0:11018 to 11233
Mar 14 19:48:21 centos5-3 kernel: LustreError: 4833:0:(ldlm_resource.c:1161:ldlm_resource_get()) lvbo_init failed for resource 11169: rc -2
Mar 14 19:48:21 centos5-3 kernel: LustreError: 4833:0:(ldlm_resource.c:1161:ldlm_resource_get()) lvbo_init failed for resource 11168: rc -2
Mar 14 19:48:46 centos5-3 kernel: Lustre: Layout lock feature supported.
Mar 14 19:48:46 centos5-3 kernel: Lustre: Mounted lustre-client
Mar 14 19:49:11 centos5-3 kernel: seq_client_alloc_fid: seq is null
&lt;/pre&gt;
&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;I am looking at the dk logs and will update if I find anything suspicious.&lt;/p&gt;</comment>
                            <comment id="54096" author="emoly.liu" created="Fri, 15 Mar 2013 07:00:36 +0000"  >&lt;p&gt;patch tracking at &lt;a href=&quot;http://review.whamcloud.com/5733&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/5733&lt;/a&gt;&lt;/p&gt;</comment>
                            <comment id="54246" author="emoly.liu" created="Mon, 18 Mar 2013 09:31:26 +0000"  >&lt;p&gt;Andreas,&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Unfortunately, we cannot go back and fix all clients that are going to work with 2.4 servers, so we need to get a server-side fix for this.&lt;br/&gt;
I&apos;m not against fixing old clients anyway (in case there is a new release), but there must have been something landed to master recently that is causing this problem, and the server-side needs to be fixed.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;This LBUG was probably caused by &lt;a href=&quot;https://jira.whamcloud.com/browse/LU-1445&quot; title=&quot;fid on OST landing&quot; class=&quot;issue-link&quot; data-issue-key=&quot;LU-1445&quot;&gt;&lt;del&gt;LU-1445&lt;/del&gt;&lt;/a&gt; &lt;a href=&quot;http://review.whamcloud.com/4787&quot; class=&quot;external-link&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener&quot;&gt;http://review.whamcloud.com/4787&lt;/a&gt;, which removed the obsolete obd_fid_init/fini in llite. So I reverted that part to make the code compatible with b18.&lt;/p&gt;</comment>
                            <comment id="54532" author="pjones" created="Thu, 21 Mar 2013 04:37:57 +0000"  >&lt;p&gt;Landed for 2.4&lt;/p&gt;</comment>
                    </comments>
                <issuelinks>
                            <issuelinktype id="10010">
                    <name>Duplicate</name>
                                                                <inwardlinks description="is duplicated by">
                                        <issuelink>
            <issuekey id="17883">LU-2958</issuekey>
        </issuelink>
                            </inwardlinks>
                                    </issuelinktype>
                    </issuelinks>
                <attachments>
                    </attachments>
                <subtasks>
                    </subtasks>
                <customfields>
                                                                                                                                                                                            <customfield id="customfield_10890" key="com.atlassian.jira.plugins.jira-development-integration-plugin:devsummary">
                        <customfieldname>Development</customfieldname>
                        <customfieldvalues>
                            
                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        <customfield id="customfield_10390" key="com.pyxis.greenhopper.jira:gh-lexo-rank">
                        <customfieldname>Rank</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>1|hzvk8f:</customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                <customfield id="customfield_10090" key="com.pyxis.greenhopper.jira:gh-global-rank">
                        <customfieldname>Rank (Obsolete)</customfieldname>
                        <customfieldvalues>
                            <customfieldvalue>7004</customfieldvalue>
                        </customfieldvalues>
                    </customfield>
                                                                                            <customfield id="customfield_10060" key="com.atlassian.jira.plugin.system.customfieldtypes:select">
                        <customfieldname>Severity</customfieldname>
                        <customfieldvalues>
                                <customfieldvalue key="10022"><![CDATA[3]]></customfieldvalue>

                        </customfieldvalues>
                    </customfield>
                                                                                                                                                                                                                                                                                                                                                        </customfields>
    </item>
</channel>
</rss>